LSSTApplications  16.0-10-g0ee56ad+5,16.0-11-ga33d1f2+5,16.0-12-g3ef5c14+3,16.0-12-g71e5ef5+18,16.0-12-gbdf3636+3,16.0-13-g118c103+3,16.0-13-g8f68b0a+3,16.0-15-gbf5c1cb+4,16.0-16-gfd17674+3,16.0-17-g7c01f5c+3,16.0-18-g0a50484+1,16.0-20-ga20f992+8,16.0-21-g0e05fd4+6,16.0-21-g15e2d33+4,16.0-22-g62d8060+4,16.0-22-g847a80f+4,16.0-25-gf00d9b8+1,16.0-28-g3990c221+4,16.0-3-gf928089+3,16.0-32-g88a4f23+5,16.0-34-gd7987ad+3,16.0-37-gc7333cb+2,16.0-4-g10fc685+2,16.0-4-g18f3627+26,16.0-4-g5f3a788+26,16.0-5-gaf5c3d7+4,16.0-5-gcc1f4bb+1,16.0-6-g3b92700+4,16.0-6-g4412fcd+3,16.0-6-g7235603+4,16.0-69-g2562ce1b+2,16.0-8-g14ebd58+4,16.0-8-g2df868b+1,16.0-8-g4cec79c+6,16.0-8-gadf6c7a+1,16.0-8-gfc7ad86,16.0-82-g59ec2a54a+1,16.0-9-g5400cdc+2,16.0-9-ge6233d7+5,master-g2880f2d8cf+3,v17.0.rc1
LSSTDataManagementBasePackage
lsst.daf.persistence.butler.Butler Class Reference
Inheritance diagram for lsst.daf.persistence.butler.Butler:

Public Member Functions

def __init__ (self, root=None, mapper=None, inputs=None, outputs=None, **mapperArgs)
 
def __repr__ (self)
 
def defineAlias (self, alias, datasetType)
 
def getKeys (self, datasetType=None, level=None, tag=None)
 
def queryMetadata (self, datasetType, format, dataId={}, **rest)
 
def datasetExists (self, datasetType, dataId={}, write=False, **rest)
 
def get (self, datasetType, dataId=None, immediate=True, **rest)
 
def put (self, obj, datasetType, dataId={}, doBackup=False, **rest)
 
def subset (self, datasetType, level=None, dataId={}, **rest)
 
def dataRef (self, datasetType, level=None, dataId={}, **rest)
 
def getUri (self, datasetType, dataId=None, write=False, **rest)
 
def __reduce__ (self)
 

Static Public Member Functions

def getMapperClass (root)
 

Public Attributes

 log
 
 datasetTypeAliasDict
 
 storage
 

Static Public Attributes

int GENERATION = 2
 

Detailed Description

Butler provides a generic mechanism for persisting and retrieving data using mappers.

A Butler manages a collection of datasets known as a repository. Each dataset has a type representing its
intended usage and a location. Note that the dataset type is not the same as the C++ or Python type of the
object containing the data. For example, an ExposureF object might be used to hold the data for a raw
image, a post-ISR image, a calibrated science image, or a difference image. These would all be different
dataset types.

A Butler can produce a collection of possible values for a key (or tuples of values for multiple keys) if
given a partial data identifier. It can check for the existence of a file containing a dataset given its
type and data identifier. The Butler can then retrieve the dataset. Similarly, it can persist an object to
an appropriate location when given its associated data identifier.

Note that the Butler has two more advanced features when retrieving a dataset. First, the retrieval is
lazy. Input does not occur until the dataset is actually accessed. This allows datasets to be retrieved
and placed on a clipboard prospectively with little cost, even if the algorithm of a stage ends up not
using them. Second, the Butler will call a standardization hook upon retrieval of the dataset. This
function, contained in the input mapper object, must perform any necessary manipulations to force the
retrieved object to conform to standards, including translating metadata.
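The lazy-retrieval behavior can be sketched with a minimal standalone proxy. `LazyProxy` and `expensive_read` below are hypothetical stand-ins used only to illustrate the pattern (the stack's actual class is `ReadProxy`, visible later in the `get()` listing); this is not LSST code:

```python
class LazyProxy:
    """Defer an expensive read until the value is first accessed."""

    def __init__(self, callback):
        self._callback = callback
        self._value = None
        self._loaded = False

    def get(self):
        if not self._loaded:
            self._value = self._callback()  # input happens here, not at construction
            self._loaded = True
        return self._value


reads = []

def expensive_read():
    reads.append("read")                # stands in for file I/O
    return "pixel data"

proxy = LazyProxy(expensive_read)       # cheap: nothing is read yet
assert reads == []
assert proxy.get() == "pixel data"      # first access triggers the read
assert reads == ["read"]
proxy.get()                             # cached: no second read
assert reads == ["read"]
```

If the algorithm never calls `get()`, the read never happens, which is why prospective retrieval is cheap.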

Public methods:

__init__(self, root=None, mapper=None, inputs=None, outputs=None, **mapperArgs)

defineAlias(self, alias, datasetType)

getKeys(self, datasetType=None, level=None, tag=None)

queryMetadata(self, datasetType, format, dataId={}, **rest)

datasetExists(self, datasetType, dataId={}, write=False, **rest)

get(self, datasetType, dataId=None, immediate=True, **rest)

put(self, obj, datasetType, dataId={}, doBackup=False, **rest)

subset(self, datasetType, level=None, dataId={}, **rest)

dataRef(self, datasetType, level=None, dataId={}, **rest)

getUri(self, datasetType, dataId=None, write=False, **rest)

Initialization:

The preferred method of initialization is to use the `inputs` and `outputs` __init__ parameters. These
are described in the parameters section, below.

For backward compatibility: this initialization method signature can take a posix root path, and
optionally a mapper class instance or class type that will be instantiated using the mapperArgs input
argument. However, for this to work in a backward compatible way it creates a single repository that is
used as both an input and an output repository. This is NOT preferred, and will likely break any
provenance system we have in place.

Parameters
----------
root : string
    .. note:: Deprecated in 12_0
              `root` will be removed in TBD, it is replaced by `inputs` and `outputs` for
              multiple-repository support.
    A file system path. Will only work with a PosixRepository.
mapper : string or instance
    .. note:: Deprecated in 12_0
              `mapper` will be removed in TBD, it is replaced by `inputs` and `outputs` for
              multiple-repository support.
    Provides a mapper to be used with Butler.
mapperArgs : dict
    .. note:: Deprecated in 12_0
              `mapperArgs` will be removed in TBD, it is replaced by `inputs` and `outputs` for
              multiple-repository support.
    Provides arguments to be passed to the mapper if the mapper input argument is a class type to be
    instantiated by Butler.
inputs : RepositoryArgs, dict, or string
    Can be a single item or a list. Provides arguments to load an existing repository (or repositories).
    String is assumed to be a URI and is used as the cfgRoot (URI to the location of the cfg file). (Local
    file system URI does not have to start with 'file://' and in this way can be a relative path). The
    `RepositoryArgs` class can be used to provide more parameters with which to initialize a repository
    (such as `mapper`, `mapperArgs`, `tags`, etc. See the `RepositoryArgs` documentation for more
    details). A dict may be used as shorthand for a `RepositoryArgs` class instance. The dict keys must
    match parameters to the `RepositoryArgs.__init__` function.
outputs : RepositoryArgs, dict, or string
    Provides arguments to load one or more existing repositories or create new ones. The different types
    are handled the same as for `inputs`.

The Butler init sequence loads all of the input and output repositories.
This creates the object hierarchy to read from and write to them. Each
repository can have 0 or more parents, which also get loaded as inputs.
This becomes a DAG of repositories. Ultimately, Butler creates a list of
these Repositories in the order that they are used.

Initialization Sequence
=======================

During initialization Butler creates a Repository class instance & support structure for each object
passed to `inputs` and `outputs` as well as the parent repositories recorded in the `RepositoryCfg` of
each existing readable repository.

This process is complex. It is explained below to shed some light on the intent of each step.

1. Input Argument Standardization
---------------------------------

In `Butler._processInputArguments` the input arguments are verified to be legal (and a RuntimeError is
raised if not), and they are converted into an expected format that is used for the rest of the Butler
init sequence. See the docstring for `_processInputArguments`.

2. Create RepoData Objects
--------------------------

Butler uses an object, called `RepoData`, to keep track of information about each repository; each
repository is contained in a single `RepoData`. The attributes are explained in its docstring.

After `_processInputArguments`, a RepoData is instantiated and put in a list for each repository in
`outputs` and `inputs`. This list of RepoData, the `repoDataList`, now represents all the output and input
repositories (but not parent repositories) that this Butler instance will use.

3. Get `RepositoryCfg`s
-----------------------

`Butler._getCfgs` gets the `RepositoryCfg` for each repository in the `repoDataList`. The behavior is
described in the docstring.

4. Add Parents
--------------

`Butler._addParents` then considers the parents list in the `RepositoryCfg` of each `RepoData` in the
`repoDataList` and inserts new `RepoData` objects for each parent not represented in the proper location
in the `repoDataList`. Ultimately a flat list is built to represent the DAG of readable repositories
represented in depth-first order.
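The depth-first flattening this step describes can be sketched standalone. `flatten_depth_first` and its inputs are hypothetical illustrations, not the actual `Butler._addParents` implementation:

```python
def flatten_depth_first(roots, parents_of):
    """Return repositories in depth-first order, inserting each repo's
    parents immediately after it, recursively, with duplicates skipped."""
    ordered = []

    def visit(repo):
        if repo in ordered:             # a parent shared by two repos appears once
            return
        ordered.append(repo)
        for parent in parents_of.get(repo, []):
            visit(parent)

    for repo in roots:
        visit(repo)
    return ordered


# output repo "out" has inputs "a" and "b"; "a" itself has parent "base"
parents = {"out": ["a", "b"], "a": ["base"]}
assert flatten_depth_first(["out"], parents) == ["out", "a", "base", "b"]
```

The most-dependent repository ("out") comes first and the least-dependent parents follow, matching the ordering step 9 relies on.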

5. Set and Verify Parents of Outputs
------------------------------------

To be able to load parent repositories when output repositories are used as inputs, the input repositories
are recorded as parents in the `RepositoryCfg` file of new output repositories. When an output repository
already exists, for consistency the Butler's inputs must match the list of parents specified in the
already-existing output repository's `RepositoryCfg` file.

In `Butler._setAndVerifyParentsLists`, the list of parents is recorded in the `RepositoryCfg` of new
repositories. For existing repositories the list of parents is compared with the `RepositoryCfg`'s parents
list, and if they do not match a `RuntimeError` is raised.

6. Set the Default Mapper
-------------------------

If all the input repositories use the same mapper then we can assume that mapper to be the
"default mapper". If there are new output repositories whose `RepositoryArgs` do not specify a mapper and
there is a default mapper then the new output repository will be set to use that default mapper.

This is handled in `Butler._setDefaultMapper`.

7. Cache References to Parent RepoDatas
---------------------------------------

In `Butler._connectParentRepoDatas`, in each `RepoData` in `repoDataList`, a list of `RepoData` object
references is built that matches the parents specified in that `RepoData`'s `RepositoryCfg`.

This list is used later to find things in that repository's parents, without considering peer repositories'
parents (e.g. finding the registry of a parent).

8. Set Tags
-----------

Tags are described at https://ldm-463.lsst.io/v/draft/#tagging

In `Butler._setRepoDataTags`, for each `RepoData`, the tags specified by its `RepositoryArgs` are recorded
in a set, and added to the tags set in each of its parents, for ease of lookup when mapping.

9. Find Parent Registry and Instantiate RepoData
------------------------------------------------

At this point there is enough information to instantiate the `Repository` instances. There is one final
step before instantiating the Repository, which is to try to get a parent registry that can be used by the
child repository. The criteria for "can be used" is spelled out in `Butler._setParentRegistry`. However,
to get the registry from the parent, the parent must be instantiated. The `repoDataList`, in depth-first
search order, is built so that the most-dependent repositories are first, and the least dependent
repositories are last. So the `repoDataList` is reversed and the Repositories are instantiated in that
order; for each RepoData a parent registry is searched for, and then the Repository is instantiated with
whatever registry could be found.
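A rough standalone sketch of this reversed-order instantiation, with hypothetical names (`instantiate_in_order` is an illustration of the idea, not the actual Butler code):

```python
def instantiate_in_order(repo_names, parents_of, registries):
    """Instantiate from the end of the depth-first list (least dependent
    first), so each child can look up an already-instantiated parent's
    registry."""
    instantiated = {}
    for name in reversed(repo_names):
        parent_registry = None
        for parent in parents_of.get(name, []):
            parent_registry = instantiated.get(parent)
            if parent_registry is not None:
                break
        # use the repo's own registry if it has one, else fall back to a parent's
        instantiated[name] = registries.get(name) or parent_registry
    return instantiated


order = ["out", "in", "base"]            # depth-first: most dependent first
parents = {"out": ["in"], "in": ["base"]}
regs = {"base": "base-registry"}         # only the base repo has its own registry
result = instantiate_in_order(order, parents, regs)
assert result["in"] == "base-registry"   # inherited from its parent
assert result["out"] == "base-registry"  # inherited transitively
```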

Definition at line 325 of file butler.py.

Constructor & Destructor Documentation

◆ __init__()

def lsst.daf.persistence.butler.Butler.__init__ (   self,
  root = None,
  mapper = None,
  inputs = None,
  outputs = None,
  **mapperArgs
)

Definition at line 507 of file butler.py.

def __init__(self, root=None, mapper=None, inputs=None, outputs=None, **mapperArgs):
    self._initArgs = {'root': root, 'mapper': mapper, 'inputs': inputs, 'outputs': outputs,
                      'mapperArgs': mapperArgs}

    self.log = Log.getLogger("daf.persistence.butler")

    inputs, outputs = self._processInputArguments(
        root=root, mapper=mapper, inputs=inputs, outputs=outputs, **mapperArgs)

    # convert the RepoArgs into RepoData
    inputs = [RepoData(args, 'input') for args in inputs]
    outputs = [RepoData(args, 'output') for args in outputs]
    repoDataList = outputs + inputs

    self._getCfgs(repoDataList)

    self._addParents(repoDataList)

    self._setAndVerifyParentsLists(repoDataList)

    self._setDefaultMapper(repoDataList)

    self._connectParentRepoDatas(repoDataList)

    self._repos = RepoDataContainer(repoDataList)

    self._setRepoDataTags()

    for repoData in repoDataList:
        self._initRepo(repoData)

Member Function Documentation

◆ __reduce__()

def lsst.daf.persistence.butler.Butler.__reduce__ (   self)

Definition at line 1596 of file butler.py.

def __reduce__(self):
    ret = (_unreduce, (self._initArgs, self.datasetTypeAliasDict))
    return ret

◆ __repr__()

def lsst.daf.persistence.butler.Butler.__repr__ (   self)

Definition at line 1035 of file butler.py.

def __repr__(self):
    return 'Butler(datasetTypeAliasDict=%s, repos=%s)' % (
        self.datasetTypeAliasDict, self._repos)

◆ dataRef()

def lsst.daf.persistence.butler.Butler.dataRef (   self,
  datasetType,
  level = None,
  dataId = {},
  **rest
)
Returns a single ButlerDataRef.

Given a complete dataId specified in dataId and **rest, find the unique dataset at the given level
specified by a dataId key (e.g. visit or sensor or amp for a camera) and return a ButlerDataRef.

Parameters
----------
datasetType - string
    The type of dataset collection to reference
level - string
    The level of dataId at which to reference
dataId - dict
    The data id.
**rest
    Keyword arguments for the data id.

Returns
-------
dataRef - ButlerDataRef
    ButlerDataRef for dataset matching the data id

Definition at line 1481 of file butler.py.

def dataRef(self, datasetType, level=None, dataId={}, **rest):
    """Returns a single ButlerDataRef.

    Given a complete dataId specified in dataId and **rest, find the unique dataset at the given level
    specified by a dataId key (e.g. visit or sensor or amp for a camera) and return a ButlerDataRef.

    Parameters
    ----------
    datasetType - string
        The type of dataset collection to reference
    level - string
        The level of dataId at which to reference
    dataId - dict
        The data id.
    **rest
        Keyword arguments for the data id.

    Returns
    -------
    dataRef - ButlerDataRef
        ButlerDataRef for dataset matching the data id
    """

    datasetType = self._resolveDatasetTypeAlias(datasetType)
    dataId = DataId(dataId)
    subset = self.subset(datasetType, level, dataId, **rest)
    if len(subset) != 1:
        raise RuntimeError("No unique dataset for: Dataset type:%s Level:%s Data ID:%s Keywords:%s" %
                           (str(datasetType), str(level), str(dataId), str(rest)))
    return ButlerDataRef(subset, subset.cache[0])

◆ datasetExists()

def lsst.daf.persistence.butler.Butler.datasetExists (   self,
  datasetType,
  dataId = {},
  write = False,
  **rest
)
Determines if a dataset file exists.

Parameters
----------
datasetType - string
    The type of dataset to inquire about.
dataId - DataId, dict
    The data id of the dataset.
write - bool
    If True, look only in locations where the dataset could be written,
    and return True only if it is present in all of them.
**rest keyword arguments for the data id.

Returns
-------
exists - bool
    True if the dataset exists or is non-file-based.

Definition at line 1218 of file butler.py.

def datasetExists(self, datasetType, dataId={}, write=False, **rest):
    """Determines if a dataset file exists.

    Parameters
    ----------
    datasetType - string
        The type of dataset to inquire about.
    dataId - DataId, dict
        The data id of the dataset.
    write - bool
        If True, look only in locations where the dataset could be written,
        and return True only if it is present in all of them.
    **rest keyword arguments for the data id.

    Returns
    -------
    exists - bool
        True if the dataset exists or is non-file-based.
    """
    datasetType = self._resolveDatasetTypeAlias(datasetType)
    dataId = DataId(dataId)
    dataId.update(**rest)
    locations = self._locate(datasetType, dataId, write=write)
    if not write:  # when write=False, locations is not a sequence
        if locations is None:
            return False
        locations = [locations]

    if not locations:  # empty list
        return False

    for location in locations:
        # If the location is a ButlerComposite (as opposed to a ButlerLocation),
        # verify the component objects exist.
        if isinstance(location, ButlerComposite):
            for name, componentInfo in location.componentInfo.items():
                if componentInfo.subset:
                    subset = self.subset(datasetType=componentInfo.datasetType, dataId=location.dataId)
                    exists = all([obj.datasetExists() for obj in subset])
                else:
                    exists = self.datasetExists(componentInfo.datasetType, location.dataId)
                if exists is False:
                    return False
        else:
            if not location.repository.exists(location):
                return False
    return True

◆ defineAlias()

def lsst.daf.persistence.butler.Butler.defineAlias (   self,
  alias,
  datasetType 
)
Register an alias that will be substituted in datasetTypes.

Parameters
----------
alias - string
    The alias keyword. It may start with @ or not. It may not contain @ except as the first character.
datasetType - string
    The string that will be substituted when @alias is passed into datasetType. It may not contain '@'

Definition at line 1105 of file butler.py.

def defineAlias(self, alias, datasetType):
    """Register an alias that will be substituted in datasetTypes.

    Parameters
    ----------
    alias - string
        The alias keyword. It may start with @ or not. It may not contain @ except as the first character.
    datasetType - string
        The string that will be substituted when @alias is passed into datasetType. It may not contain '@'
    """
    # verify formatting of alias:
    # it can have '@' as the first character (if not it's okay, we will add it) or not at all.
    atLoc = alias.rfind('@')
    if atLoc == -1:
        alias = "@" + str(alias)
    elif atLoc > 0:
        raise RuntimeError("Badly formatted alias string: %s" % (alias,))

    # verify that datasetType does not contain '@'
    if datasetType.count('@') != 0:
        raise RuntimeError("Badly formatted type string: %s" % (datasetType))

    # verify that the alias keyword does not start with another alias keyword,
    # and vice versa
    for key in self.datasetTypeAliasDict:
        if key.startswith(alias) or alias.startswith(key):
            raise RuntimeError("Alias: %s overlaps with existing alias: %s" % (alias, key))

    self.datasetTypeAliasDict[alias] = datasetType
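The validation rules above can be restated as a standalone function mirroring the checks in the listing. `define_alias` is a hypothetical stand-alone version for illustration, not the Butler method itself:

```python
def define_alias(alias_dict, alias, dataset_type):
    """Mirror the defineAlias checks against a plain dict of aliases."""
    # '@' may appear only as the first character; add it if missing
    at_loc = alias.rfind('@')
    if at_loc == -1:
        alias = '@' + str(alias)
    elif at_loc > 0:
        raise RuntimeError("Badly formatted alias string: %s" % (alias,))

    # the substituted dataset type may not contain '@' at all
    if dataset_type.count('@') != 0:
        raise RuntimeError("Badly formatted type string: %s" % (dataset_type,))

    # no alias may be a prefix of another
    for key in alias_dict:
        if key.startswith(alias) or alias.startswith(key):
            raise RuntimeError("Alias: %s overlaps with existing alias: %s" % (alias, key))

    alias_dict[alias] = dataset_type


aliases = {}
define_alias(aliases, 'calexp', 'calexpDeep')   # '@' prefix is added automatically
assert aliases == {'@calexp': 'calexpDeep'}
```

The prefix check is why `@calexp` and `@calexpX` cannot coexist: a dataset type string containing either alias would otherwise be ambiguous.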

◆ get()

def lsst.daf.persistence.butler.Butler.get (   self,
  datasetType,
  dataId = None,
  immediate = True,
  **rest
)
Retrieves a dataset given an input collection data id.

Parameters
----------
datasetType - string
    The type of dataset to retrieve.
dataId - dict
    The data id.
immediate - bool
    If False use a proxy for delayed loading.
**rest
    keyword arguments for the data id.

Returns
-------
    An object retrieved from the dataset (or a proxy for one).

Definition at line 1356 of file butler.py.

def get(self, datasetType, dataId=None, immediate=True, **rest):
    """Retrieves a dataset given an input collection data id.

    Parameters
    ----------
    datasetType - string
        The type of dataset to retrieve.
    dataId - dict
        The data id.
    immediate - bool
        If False use a proxy for delayed loading.
    **rest
        keyword arguments for the data id.

    Returns
    -------
    An object retrieved from the dataset (or a proxy for one).
    """
    datasetType = self._resolveDatasetTypeAlias(datasetType)
    dataId = DataId(dataId)
    dataId.update(**rest)

    location = self._locate(datasetType, dataId, write=False)
    if location is None:
        raise NoResults("No locations for get:", datasetType, dataId)
    self.log.debug("Get type=%s keys=%s from %s", datasetType, dataId, str(location))

    if hasattr(location, 'bypass'):
        # this type loader block should get moved into a helper someplace, and duplications removed.
        def callback():
            return location.bypass
    else:
        def callback():
            return self._read(location)
    if location.mapper.canStandardize(location.datasetType):
        innerCallback = callback

        def callback():
            return location.mapper.standardize(location.datasetType, innerCallback(), dataId)
    if immediate:
        return callback()
    return ReadProxy(callback)

◆ getKeys()

def lsst.daf.persistence.butler.Butler.getKeys (   self,
  datasetType = None,
  level = None,
  tag = None 
)
Get the valid data id keys at or above the given level of hierarchy for the dataset type or the
entire collection if None. The dict values are the basic Python types corresponding to the keys (int,
float, string).

Parameters
----------
datasetType - string
    The type of dataset to get keys for, entire collection if None.
level - string
    The hierarchy level to descend to. None if it should not be restricted. Use an empty string if the
    mapper should look up the default level.
tag - any, or list of any
    Any object that can be tested to be the same as the tag in a dataId passed into butler input
    functions. Applies only to input repositories: if a tag is specified by the dataId then the repo
    will only be read from if the tag in the dataId matches a tag used for that repository.

Returns
-------
Returns a dict. The dict keys are the valid data id keys at or above the given level of hierarchy for
the dataset type or the entire collection if None. The dict values are the basic Python types
corresponding to the keys (int, float, string).

Definition at line 1135 of file butler.py.

def getKeys(self, datasetType=None, level=None, tag=None):
    """Get the valid data id keys at or above the given level of hierarchy for the dataset type or the
    entire collection if None. The dict values are the basic Python types corresponding to the keys (int,
    float, string).

    Parameters
    ----------
    datasetType - string
        The type of dataset to get keys for, entire collection if None.
    level - string
        The hierarchy level to descend to. None if it should not be restricted. Use an empty string if the
        mapper should look up the default level.
    tag - any, or list of any
        Any object that can be tested to be the same as the tag in a dataId passed into butler input
        functions. Applies only to input repositories: if a tag is specified by the dataId then the repo
        will only be read from if the tag in the dataId matches a tag used for that repository.

    Returns
    -------
    Returns a dict. The dict keys are the valid data id keys at or above the given level of hierarchy for
    the dataset type or the entire collection if None. The dict values are the basic Python types
    corresponding to the keys (int, float, string).
    """
    datasetType = self._resolveDatasetTypeAlias(datasetType)

    keys = None
    tag = setify(tag)
    for repoData in self._repos.inputs():
        if not tag or len(tag.intersection(repoData.tags)) > 0:
            keys = repoData.repo.getKeys(datasetType, level)
            # An empty dict is a valid "found" condition for keys. The only value for keys that should
            # cause the search to continue is None
            if keys is not None:
                break
    return keys
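The search loop at the bottom of getKeys can be sketched standalone. `first_keys` is a hypothetical illustration of the "first matching input repository wins" behavior, not LSST code:

```python
def first_keys(repos, tag=None):
    """Walk (tags, keys) pairs in order; return the first non-None keys from
    a repository whose tags intersect the requested tags (empty = match all)."""
    tags = set(tag) if tag else set()
    for repo_tags, keys in repos:
        if not tags or tags & set(repo_tags):
            if keys is not None:        # an empty dict still counts as "found"
                return keys
    return None


repos = [
    (['calib'], None),                  # matching tag but no keys: keep looking
    (['raw'],   {'visit': int}),
]
assert first_keys(repos) == {'visit': int}
assert first_keys(repos, tag=['calib']) is None
```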

◆ getMapperClass()

def lsst.daf.persistence.butler.Butler.getMapperClass (   root)
static
posix-only; gets the mapper class at the path specified by root (if a file _mapper can be found at
that location or in a parent location).

As we abstract the storage and support different types of storage locations this method will be
moved entirely into Butler Access, or made more dynamic, and the API will very likely change.

Definition at line 1097 of file butler.py.

def getMapperClass(root):
    """posix-only; gets the mapper class at the path specified by root (if a file _mapper can be found at
    that location or in a parent location).

    As we abstract the storage and support different types of storage locations this method will be
    moved entirely into Butler Access, or made more dynamic, and the API will very likely change."""
    return Storage.getMapperClass(root)

◆ getUri()

def lsst.daf.persistence.butler.Butler.getUri (   self,
  datasetType,
  dataId = None,
  write = False,
  **rest
)
Return the URI for a dataset

.. warning:: This is intended only for debugging. The URI should
never be used for anything other than printing.

.. note:: In the event there are multiple URIs for read, we return only
the first.

.. note:: getUri() does not currently support composite datasets.

Parameters
----------
datasetType : `str`
   The dataset type of interest.
dataId : `dict`, optional
   The data identifier.
write : `bool`, optional
   Return the URI for writing?
rest : `dict`, optional
   Keyword arguments for the data id.

Returns
-------
uri : `str`
   URI for dataset.

Definition at line 1512 of file butler.py.

def getUri(self, datasetType, dataId=None, write=False, **rest):
    """Return the URI for a dataset

    .. warning:: This is intended only for debugging. The URI should
        never be used for anything other than printing.

    .. note:: In the event there are multiple URIs for read, we return only
        the first.

    .. note:: getUri() does not currently support composite datasets.

    Parameters
    ----------
    datasetType : `str`
        The dataset type of interest.
    dataId : `dict`, optional
        The data identifier.
    write : `bool`, optional
        Return the URI for writing?
    rest : `dict`, optional
        Keyword arguments for the data id.

    Returns
    -------
    uri : `str`
        URI for dataset.
    """
    datasetType = self._resolveDatasetTypeAlias(datasetType)
    dataId = DataId(dataId)
    dataId.update(**rest)
    locations = self._locate(datasetType, dataId, write=write)
    if locations is None:
        raise NoResults("No locations for getUri: ", datasetType, dataId)

    if write:
        # Follow the write path
        # Return the first valid write location.
        for location in locations:
            if isinstance(location, ButlerComposite):
                for name, info in location.componentInfo.items():
                    if not info.inputOnly:
                        return self.getUri(info.datasetType, location.dataId, write=True)
            else:
                return location.getLocationsWithRoot()[0]
        # fall back to raise
        raise NoResults("No locations for getUri(write=True): ", datasetType, dataId)
    else:
        # Follow the read path, only return the first valid read
        return locations.getLocationsWithRoot()[0]

◆ put()

def lsst.daf.persistence.butler.Butler.put (   self,
  obj,
  datasetType,
  dataId = {},
  doBackup = False,
  **rest
)
Persists a dataset given an output collection data id.

Parameters
----------
obj -
    The object to persist.
datasetType - string
    The type of dataset to persist.
dataId - dict
    The data id.
doBackup - bool
    If True, rename existing instead of overwriting.
    WARNING: Setting doBackup=True is not safe for parallel processing, as it may be subject to race
    conditions.
**rest
    Keyword arguments for the data id.

Definition at line 1399 of file butler.py.

def put(self, obj, datasetType, dataId={}, doBackup=False, **rest):
    """Persists a dataset given an output collection data id.

    Parameters
    ----------
    obj -
        The object to persist.
    datasetType - string
        The type of dataset to persist.
    dataId - dict
        The data id.
    doBackup - bool
        If True, rename existing instead of overwriting.
        WARNING: Setting doBackup=True is not safe for parallel processing, as it may be subject to race
        conditions.
    **rest
        Keyword arguments for the data id.
    """
    datasetType = self._resolveDatasetTypeAlias(datasetType)
    dataId = DataId(dataId)
    dataId.update(**rest)

    locations = self._locate(datasetType, dataId, write=True)
    if not locations:
        raise NoResults("No locations for put:", datasetType, dataId)
    for location in locations:
        if isinstance(location, ButlerComposite):
            disassembler = location.disassembler if location.disassembler else genericDisassembler
            disassembler(obj=obj, dataId=location.dataId, componentInfo=location.componentInfo)
            for name, info in location.componentInfo.items():
                if not info.inputOnly:
                    self.put(info.obj, info.datasetType, location.dataId, doBackup=doBackup)
        else:
            if doBackup:
                location.getRepository().backup(location.datasetType, dataId)
            location.getRepository().write(location, obj)

◆ queryMetadata()

def lsst.daf.persistence.butler.Butler.queryMetadata (   self,
  datasetType,
  format,
  dataId = {},
  **rest
)
Returns the valid values for one or more keys when given a partial
input collection data id.

Parameters
----------
datasetType - string
    The type of dataset to inquire about.
format - str, tuple
    Key or tuple of keys to be returned.
dataId - DataId, dict
    The partial data id.
**rest -
    Keyword arguments for the partial data id.

Returns
-------
A list of valid values or tuples of valid values as specified by the
format.

Definition at line 1171 of file butler.py.

def queryMetadata(self, datasetType, format, dataId={}, **rest):
    """Returns the valid values for one or more keys when given a partial
    input collection data id.

    Parameters
    ----------
    datasetType - string
        The type of dataset to inquire about.
    format - str, tuple
        Key or tuple of keys to be returned.
    dataId - DataId, dict
        The partial data id.
    **rest -
        Keyword arguments for the partial data id.

    Returns
    -------
    A list of valid values or tuples of valid values as specified by the
    format.
    """

    datasetType = self._resolveDatasetTypeAlias(datasetType)
    dataId = DataId(dataId)
    dataId.update(**rest)
    format = sequencify(format)

    tuples = None
    for repoData in self._repos.inputs():
        if not dataId.tag or len(dataId.tag.intersection(repoData.tags)) > 0:
            tuples = repoData.repo.queryMetadata(datasetType, format, dataId)
            if tuples:
                break

    if not tuples:
        return []

    if len(format) == 1:
        ret = []
        for x in tuples:
            try:
                ret.append(x[0])
            except TypeError:
                ret.append(x)
        return ret

    return tuples
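The single-key unwrapping at the end of queryMetadata can be shown standalone. `unwrap_single` is a hypothetical extraction of that post-processing step for illustration:

```python
def unwrap_single(format_keys, tuples):
    """When exactly one format key was requested, unwrap one-element tuples
    into bare values; otherwise return the tuples unchanged."""
    if len(format_keys) == 1:
        out = []
        for x in tuples:
            try:
                out.append(x[0])
            except TypeError:
                out.append(x)           # already a bare value, not a tuple
        return out
    return tuples


assert unwrap_single(('visit',), [(1,), (2,), 3]) == [1, 2, 3]
assert unwrap_single(('visit', 'ccd'), [(1, 0)]) == [(1, 0)]
```

This is why `queryMetadata('raw', 'visit')` returns a flat list of visit numbers rather than a list of one-element tuples.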

◆ subset()

def lsst.daf.persistence.butler.Butler.subset (   self,
  datasetType,
  level = None,
  dataId = {},
  **rest
)
Return complete dataIds for a dataset type that match a partial (or empty) dataId.

Given a partial (or empty) dataId specified in dataId and **rest, find all datasets that match the
dataId.  Optionally restrict the results to a given level specified by a dataId key (e.g. visit or
sensor or amp for a camera).  Return an iterable collection of complete dataIds as ButlerDataRefs.
Datasets with the resulting dataIds may not exist; that needs to be tested with datasetExists().

Parameters
----------
datasetType - string
    The type of dataset collection to subset
level - string
    The level of dataId at which to subset. Use an empty string if the mapper should look up the
    default level.
dataId - dict
    The data id.
**rest
    Keyword arguments for the data id.

Returns
-------
subset - ButlerSubset
    Collection of ButlerDataRefs for datasets matching the data id.

Examples
--------
To print the full dataIds for all r-band measurements in a source catalog
(note that the subset call is equivalent to: `butler.subset('src', dataId={'filter':'r'})`):

>>> subset = butler.subset('src', filter='r')
>>> for data_ref in subset: print(data_ref.dataId)

Definition at line 1436 of file butler.py.

def subset(self, datasetType, level=None, dataId={}, **rest):
    """Return complete dataIds for a dataset type that match a partial (or empty) dataId.

    Given a partial (or empty) dataId specified in dataId and **rest, find all datasets that match the
    dataId. Optionally restrict the results to a given level specified by a dataId key (e.g. visit or
    sensor or amp for a camera). Return an iterable collection of complete dataIds as ButlerDataRefs.
    Datasets with the resulting dataIds may not exist; that needs to be tested with datasetExists().

    Parameters
    ----------
    datasetType - string
        The type of dataset collection to subset
    level - string
        The level of dataId at which to subset. Use an empty string if the mapper should look up the
        default level.
    dataId - dict
        The data id.
    **rest
        Keyword arguments for the data id.

    Returns
    -------
    subset - ButlerSubset
        Collection of ButlerDataRefs for datasets matching the data id.

    Examples
    --------
    To print the full dataIds for all r-band measurements in a source catalog
    (note that the subset call is equivalent to: `butler.subset('src', dataId={'filter':'r'})`):

    >>> subset = butler.subset('src', filter='r')
    >>> for data_ref in subset: print(data_ref.dataId)
    """
    datasetType = self._resolveDatasetTypeAlias(datasetType)

    # Currently expected behavior of subset is that if specified level is None then the mapper's default
    # level should be used. Convention for level within Butler is that an empty string is used to indicate
    # 'get default'.
    if level is None:
        level = ''

    dataId = DataId(dataId)
    dataId.update(**rest)
    return ButlerSubset(self, datasetType, level, dataId)

Member Data Documentation

◆ datasetTypeAliasDict

lsst.daf.persistence.butler.Butler.datasetTypeAliasDict

Definition at line 601 of file butler.py.

◆ GENERATION

int lsst.daf.persistence.butler.Butler.GENERATION = 2
static

Definition at line 503 of file butler.py.

◆ log

lsst.daf.persistence.butler.Butler.log

Definition at line 511 of file butler.py.

◆ storage

lsst.daf.persistence.butler.Butler.storage

Definition at line 603 of file butler.py.


The documentation for this class was generated from the following file: