LSSTApplications  18.0.0+106,18.0.0+50,19.0.0,19.0.0+1,19.0.0+10,19.0.0+11,19.0.0+13,19.0.0+17,19.0.0+2,19.0.0-1-g20d9b18+6,19.0.0-1-g425ff20,19.0.0-1-g5549ca4,19.0.0-1-g580fafe+6,19.0.0-1-g6fe20d0+1,19.0.0-1-g7011481+9,19.0.0-1-g8c57eb9+6,19.0.0-1-gb5175dc+11,19.0.0-1-gdc0e4a7+9,19.0.0-1-ge272bc4+6,19.0.0-1-ge3aa853,19.0.0-10-g448f008b,19.0.0-12-g6990b2c,19.0.0-2-g0d9f9cd+11,19.0.0-2-g3d9e4fb2+11,19.0.0-2-g5037de4,19.0.0-2-gb96a1c4+3,19.0.0-2-gd955cfd+15,19.0.0-3-g2d13df8,19.0.0-3-g6f3c7dc,19.0.0-4-g725f80e+11,19.0.0-4-ga671dab3b+1,19.0.0-4-gad373c5+3,19.0.0-5-ga2acb9c+2,19.0.0-5-gfe96e6c+2,w.2020.01
LSSTDataManagementBasePackage
lsst.daf.persistence.butler.Butler Class Reference

Public Member Functions

def __init__ (self, root=None, mapper=None, inputs=None, outputs=None, **mapperArgs)
 
def __repr__ (self)
 
def defineAlias (self, alias, datasetType)
 
def getKeys (self, datasetType=None, level=None, tag=None)
 
def queryMetadata (self, datasetType, format, dataId={}, **rest)

def datasetExists (self, datasetType, dataId={}, write=False, **rest)

def get (self, datasetType, dataId=None, immediate=True, **rest)

def put (self, obj, datasetType, dataId={}, doBackup=False, **rest)

def subset (self, datasetType, level=None, dataId={}, **rest)

def dataRef (self, datasetType, level=None, dataId={}, **rest)

def getUri (self, datasetType, dataId=None, write=False, **rest)
 
def __reduce__ (self)
 

Static Public Member Functions

def getMapperClass (root)
 

Public Attributes

 log
 
 datasetTypeAliasDict
 
 storage
 

Static Public Attributes

int GENERATION = 2
 

Detailed Description

Butler provides a generic mechanism for persisting and retrieving data using mappers.

A Butler manages a collection of datasets known as a repository. Each dataset has a type representing its
intended usage and a location. Note that the dataset type is not the same as the C++ or Python type of the
object containing the data. For example, an ExposureF object might be used to hold the data for a raw
image, a post-ISR image, a calibrated science image, or a difference image. These would all be different
dataset types.

A Butler can produce a collection of possible values for a key (or tuples of values for multiple keys) if
given a partial data identifier. It can check for the existence of a file containing a dataset given its
type and data identifier. The Butler can then retrieve the dataset. Similarly, it can persist an object to
an appropriate location when given its associated data identifier.

Note that the Butler has two more advanced features when retrieving a dataset. First, the retrieval can be
lazy: when `get` is called with `immediate=False` it returns a proxy, and input does not occur until the
dataset is actually accessed. This allows datasets to be retrieved and placed on a clipboard prospectively
with little cost, even if the algorithm of a stage ends up not using them. Second, the Butler will call a
standardization hook upon retrieval of the dataset. This function, contained in the input mapper object,
must perform any necessary manipulations to force the retrieved object to conform to standards, including
translating metadata.
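
The two retrieval features can be pictured with a small stand-alone sketch. It does not use `lsst.daf.persistence`; `LazyProxy`, `load_raw`, and `standardize` are hypothetical names chosen for illustration, and the metadata keys are invented.

```python
class LazyProxy:
    """Hypothetical stand-in for a delayed-loading proxy (not the real Butler proxy)."""

    _UNSET = object()

    def __init__(self, callback):
        self._callback = callback
        self._value = LazyProxy._UNSET

    def resolve(self):
        # Run the callback on first access only, then reuse the result.
        if self._value is LazyProxy._UNSET:
            self._value = self._callback()
        return self._value


reads = []  # records each simulated read so that laziness is observable


def load_raw():
    # Hypothetical reader; a real one would perform I/O here.
    reads.append("read")
    return {"EXPTIME": "30.0"}


def standardize(metadata):
    # Hypothetical standardization hook: translate metadata to a common form.
    return {"exposure_time": float(metadata["EXPTIME"])}


proxy = LazyProxy(lambda: standardize(load_raw()))
reads_before_access = len(reads)  # nothing has been read yet
value = proxy.resolve()           # first access triggers the read
value_again = proxy.resolve()     # cached result, no second read
```

Creating the proxy costs nothing until `resolve()` is called, which is the property that makes prospective retrieval cheap.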

Public methods:

__init__(self, root=None, mapper=None, inputs=None, outputs=None, **mapperArgs)

defineAlias(self, alias, datasetType)

getKeys(self, datasetType=None, level=None, tag=None)

queryMetadata(self, datasetType, format, dataId={}, **rest)

datasetExists(self, datasetType, dataId={}, write=False, **rest)

get(self, datasetType, dataId=None, immediate=True, **rest)

put(self, obj, datasetType, dataId={}, doBackup=False, **rest)

subset(self, datasetType, level=None, dataId={}, **rest)

dataRef(self, datasetType, level=None, dataId={}, **rest)

Initialization:

The preferred method of initialization is to use the `inputs` and `outputs` __init__ parameters. These
are described in the parameters section, below.

For backward compatibility: this initialization method signature can take a posix root path, and
optionally a mapper class instance or class type that will be instantiated using the mapperArgs input
argument. However, for this to work in a backward compatible way it creates a single repository that is
used as both an input and an output repository. This is NOT preferred, and will likely break any
provenance system we have in place.

Parameters
----------
root : string
    .. note:: Deprecated in 12_0
              `root` will be removed in TBD, it is replaced by `inputs` and `outputs` for
              multiple-repository support.
    A file system path. Will only work with a PosixRepository.
mapper : string or instance
    .. note:: Deprecated in 12_0
              `mapper` will be removed in TBD, it is replaced by `inputs` and `outputs` for
              multiple-repository support.
    Provides a mapper to be used with Butler.
mapperArgs : dict
    .. note:: Deprecated in 12_0
              `mapperArgs` will be removed in TBD, it is replaced by `inputs` and `outputs` for
              multiple-repository support.
    Provides arguments to be passed to the mapper if the mapper input argument is a class type to be
    instantiated by Butler.
inputs : RepositoryArgs, dict, or string
    Can be a single item or a list. Provides arguments to load an existing repository (or repositories).
    String is assumed to be a URI and is used as the cfgRoot (URI to the location of the cfg file). (Local
    file system URI does not have to start with 'file://' and in this way can be a relative path). The
    `RepositoryArgs` class can be used to provide more parameters with which to initialize a repository
    (such as `mapper`, `mapperArgs`, `tags`, etc. See the `RepositoryArgs` documentation for more
    details). A dict may be used as shorthand for a `RepositoryArgs` class instance. The dict keys must
    match parameters to the `RepositoryArgs.__init__` function.
outputs : RepositoryArgs, dict, or string
    Provides arguments to load one or more existing repositories or create new ones. The different types
    are handled the same as for `inputs`.
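
As a rough illustration of the accepted argument shapes (a single item or a list, each a string or a dict), the sketch below normalizes them into a list of dicts. `normalize_repo_args` is a hypothetical helper, not the Butler's own `_processInputArguments`; only the `cfgRoot` key is taken from the description above, and `RepositoryArgs` instances are left out to keep the example self-contained.

```python
def normalize_repo_args(arg):
    """Return a list of dicts from a single item or a list of str/dict items.

    Hypothetical helper for illustration only.
    """
    if arg is None:
        return []
    items = arg if isinstance(arg, (list, tuple)) else [arg]
    normalized = []
    for item in items:
        if isinstance(item, str):
            # A string is treated as the URI of the repository configuration.
            normalized.append({"cfgRoot": item})
        elif isinstance(item, dict):
            # A dict is shorthand for keyword arguments; copy it defensively.
            normalized.append(dict(item))
        else:
            raise RuntimeError("Unsupported repository argument: %r" % (item,))
    return normalized


single = normalize_repo_args("data/input")
mixed = normalize_repo_args(["data/input", {"cfgRoot": "data/calib", "tags": "calib"}])
```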

The Butler init sequence loads all of the input and output repositories.
This creates the object hierarchy to read from and write to them. Each
repository can have 0 or more parents, which also get loaded as inputs.
This becomes a DAG of repositories. Ultimately, Butler creates a list of
these Repositories in the order that they are used.

Initialization Sequence
=======================

During initialization Butler creates a Repository class instance & support structure for each object
passed to `inputs` and `outputs` as well as the parent repositories recorded in the `RepositoryCfg` of
each existing readable repository.

This process is complex. It is explained below to shed some light on the intent of each step.

1. Input Argument Standardization
---------------------------------

In `Butler._processInputArguments` the input arguments are verified to be legal (and a RuntimeError is
raised if not), and they are converted into an expected format that is used for the rest of the Butler
init sequence. See the docstring for `_processInputArguments`.

2. Create RepoData Objects
--------------------------

Butler uses an object, called `RepoData`, to keep track of information about each repository; each
repository is contained in a single `RepoData`. The attributes are explained in its docstring.

After `_processInputArguments`, a RepoData is instantiated and put in a list for each repository in
`outputs` and `inputs`. This list of RepoData, the `repoDataList`, now represents all the output and input
repositories (but not parent repositories) that this Butler instance will use.

3. Get `RepositoryCfg`s
-----------------------

`Butler._getCfgs` gets the `RepositoryCfg` for each repository in the `repoDataList`. The behavior is
described in the docstring.

4. Add Parents
--------------

`Butler._addParents` then considers the parents list in the `RepositoryCfg` of each `RepoData` in the
`repoDataList` and inserts new `RepoData` objects for each parent not represented in the proper location
in the `repoDataList`. Ultimately a flat list is built to represent the DAG of readable repositories
represented in depth-first order.
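
A minimal sketch of flattening a repository DAG in depth-first order follows. It works on plain names instead of `RepoData` objects, and the rule of listing each repository only once is an assumption; the exact placement rules of `Butler._addParents` are not reproduced here.

```python
def flatten_depth_first(top_level, parents):
    """Flatten a repository DAG depth-first, listing each repository once.

    `parents` maps a repository name to the ordered list of its parents.
    """
    ordered = []

    def visit(name):
        if name in ordered:
            return  # already placed by an earlier branch
        ordered.append(name)
        for parent in parents.get(name, []):
            visit(parent)

    for name in top_level:
        visit(name)
    return ordered


parents = {"out": ["in"], "in": ["calib", "raw"], "calib": ["raw"]}
order = flatten_depth_first(["out", "in"], parents)
```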

5. Set and Verify Parents of Outputs
------------------------------------

To be able to load parent repositories when output repositories are used as inputs, the input repositories
are recorded as parents in the `RepositoryCfg` file of new output repositories. When an output repository
already exists, for consistency the Butler's inputs must match the list of parents specified in the
already-existing output repository's `RepositoryCfg` file.

In `Butler._setAndVerifyParentsLists`, the list of parents is recorded in the `RepositoryCfg` of new
repositories. For existing repositories the list of parents is compared with the `RepositoryCfg`'s parents
list, and if they do not match a `RuntimeError` is raised.
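
The set-or-verify rule can be sketched as follows. The configuration is modeled as a plain dict with a `parents` key, which is an assumption made for illustration; the real `RepositoryCfg` class is not used.

```python
def set_or_verify_parents(cfg, input_roots, is_new):
    """Record parents in a new cfg, or check them against an existing cfg."""
    if is_new:
        cfg["parents"] = list(input_roots)
    elif cfg.get("parents", []) != list(input_roots):
        raise RuntimeError(
            "Inputs %r do not match recorded parents %r" % (list(input_roots), cfg.get("parents", []))
        )
    return cfg


new_cfg = set_or_verify_parents({}, ["repoA", "repoB"], is_new=True)

try:
    set_or_verify_parents({"parents": ["repoA"]}, ["repoA", "repoB"], is_new=False)
    mismatch_detected = False
except RuntimeError:
    mismatch_detected = True
```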

6. Set the Default Mapper
-------------------------

If all the input repositories use the same mapper then we can assume that mapper to be the
"default mapper". If there are new output repositories whose `RepositoryArgs` do not specify a mapper and
there is a default mapper then the new output repository will be set to use that default mapper.

This is handled in `Butler._setDefaultMapper`.
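
The default-mapper rule can be sketched in a few lines. Mappers are represented by plain strings and repository arguments by dicts; both are simplifications for illustration and the function names are hypothetical.

```python
def find_default_mapper(input_mappers):
    """Return the shared mapper if every input uses the same one, else None."""
    unique = set(input_mappers)
    return unique.pop() if len(unique) == 1 else None


def apply_default_mapper(output_args, default_mapper):
    """Fill in the mapper of outputs that do not specify one."""
    if default_mapper is None:
        return output_args
    for args in output_args:
        if args.get("mapper") is None:
            args["mapper"] = default_mapper
    return output_args


default = find_default_mapper(["CameraMapper", "CameraMapper"])
outputs = apply_default_mapper(
    [{"root": "out1"}, {"root": "out2", "mapper": "OtherMapper"}], default
)
no_default = find_default_mapper(["CameraMapper", "OtherMapper"])
```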

7. Cache References to Parent RepoDatas
---------------------------------------

In `Butler._connectParentRepoDatas`, in each `RepoData` in `repoDataList`, a list of `RepoData` object
references is built that matches the parents specified in that `RepoData`'s `RepositoryCfg`.

This list is used later to find things in that repository's parents, without considering peer repositories'
parents (e.g. finding the registry of a parent).

8. Set Tags
-----------

Tags are described at https://ldm-463.lsst.io/v/draft/#tagging

In `Butler._setRepoDataTags`, for each `RepoData`, the tags specified by its `RepositoryArgs` are recorded
in a set, and added to the tags set in each of its parents, for ease of lookup when mapping.
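
A small sketch of the tag bookkeeping, following the sentence above literally: each repository's own tags are copied into the tag sets of its direct parents. Whether the real method also propagates tags further up the ancestry is not stated here, so this sketch stops at direct parents. The function name and the data layout are hypothetical.

```python
def propagate_tags(own_tags, parents):
    """Return the tag set of every repository after adding child tags to parents."""
    tags = {name: set(values) for name, values in own_tags.items()}
    for name, values in own_tags.items():
        for parent in parents.get(name, []):
            tags.setdefault(parent, set()).update(values)
    return tags


own_tags = {"out": {"run1"}, "in": set(), "calib": {"calib"}}
parents = {"out": ["in", "calib"]}
tags = propagate_tags(own_tags, parents)
```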

9. Find Parent Registry and Instantiate RepoData
------------------------------------------------

At this point there is enough information to instantiate the `Repository` instances. There is one final
step before instantiating the Repository, which is to try to get a parent registry that can be used by the
child repository. The criteria for "can be used" are spelled out in `Butler._setParentRegistry`. However,
to get the registry from the parent, the parent must be instantiated. The `repoDataList`, in depth-first
search order, is built so that the most-dependent repositories are first, and the least dependent
repositories are last. So the `repoDataList` is reversed and the Repositories are instantiated in that
order; for each RepoData a parent registry is searched for, and then the Repository is instantiated with
whatever registry could be found.
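
The ordering argument can be sketched as below. Repositories are plain names, a registry is any non-None value, and "the first parent that already has a registry" stands in for the real criteria in `Butler._setParentRegistry`, which are not reproduced here.

```python
def instantiate_all(repo_data_list, parents, own_registry):
    """Instantiate repositories from least to most dependent.

    Returns a dict (in creation order) mapping name -> {"registry": ...}.
    """
    repos = {}
    for name in reversed(repo_data_list):
        registry = own_registry.get(name)
        if registry is None:
            # Borrow a registry from a parent that has already been instantiated.
            for parent in parents.get(name, []):
                parent_repo = repos.get(parent)
                if parent_repo is not None and parent_repo["registry"] is not None:
                    registry = parent_repo["registry"]
                    break
        repos[name] = {"registry": registry}
    return repos


repo_data_list = ["out", "in", "raw"]  # depth-first: most dependent first
parents = {"out": ["in"], "in": ["raw"]}
repos = instantiate_all(repo_data_list, parents, {"raw": "raw-registry"})
creation_order = list(repos)
```

Because the list is reversed first, `raw` exists before `in` looks for a parent registry, and `in` exists before `out` does.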

Definition at line 321 of file butler.py.

Constructor & Destructor Documentation

◆ __init__()

def lsst.daf.persistence.butler.Butler.__init__ (   self,
  root = None,
  mapper = None,
  inputs = None,
  outputs = None,
  **mapperArgs
)

Definition at line 503 of file butler.py.

    def __init__(self, root=None, mapper=None, inputs=None, outputs=None, **mapperArgs):
        self._initArgs = {'root': root, 'mapper': mapper, 'inputs': inputs, 'outputs': outputs,
                          'mapperArgs': mapperArgs}

        self.log = Log.getLogger("daf.persistence.butler")

        inputs, outputs = self._processInputArguments(
            root=root, mapper=mapper, inputs=inputs, outputs=outputs, **mapperArgs)

        # convert the RepoArgs into RepoData
        inputs = [RepoData(args, 'input') for args in inputs]
        outputs = [RepoData(args, 'output') for args in outputs]
        repoDataList = outputs + inputs

        self._getCfgs(repoDataList)

        self._addParents(repoDataList)

        self._setAndVerifyParentsLists(repoDataList)

        self._setDefaultMapper(repoDataList)

        self._connectParentRepoDatas(repoDataList)

        self._repos = RepoDataContainer(repoDataList)

        self._setRepoDataTags()

        for repoData in repoDataList:
            self._initRepo(repoData)

Member Function Documentation

◆ __reduce__()

def lsst.daf.persistence.butler.Butler.__reduce__ (   self)

Definition at line 1592 of file butler.py.

    def __reduce__(self):
        ret = (_unreduce, (self._initArgs, self.datasetTypeAliasDict))
        return ret

◆ __repr__()

def lsst.daf.persistence.butler.Butler.__repr__ (   self)

Definition at line 1031 of file butler.py.

    def __repr__(self):
        return 'Butler(datasetTypeAliasDict=%s, repos=%s)' % (
            self.datasetTypeAliasDict, self._repos)

◆ dataRef()

def lsst.daf.persistence.butler.Butler.dataRef (   self,
  datasetType,
  level = None,
  dataId = {},
  **rest
)
Returns a single ButlerDataRef.

Given a complete dataId specified in dataId and **rest, find the unique dataset at the given level
specified by a dataId key (e.g. visit or sensor or amp for a camera) and return a ButlerDataRef.

Parameters
----------
datasetType - string
    The type of dataset collection to reference
level - string
    The level of dataId at which to reference
dataId - dict
    The data id.
**rest
    Keyword arguments for the data id.

Returns
-------
dataRef - ButlerDataRef
    ButlerDataRef for dataset matching the data id

Definition at line 1477 of file butler.py.

    def dataRef(self, datasetType, level=None, dataId={}, **rest):
        """Returns a single ButlerDataRef.

        Given a complete dataId specified in dataId and **rest, find the unique dataset at the given level
        specified by a dataId key (e.g. visit or sensor or amp for a camera) and return a ButlerDataRef.

        Parameters
        ----------
        datasetType - string
            The type of dataset collection to reference
        level - string
            The level of dataId at which to reference
        dataId - dict
            The data id.
        **rest
            Keyword arguments for the data id.

        Returns
        -------
        dataRef - ButlerDataRef
            ButlerDataRef for dataset matching the data id
        """

        datasetType = self._resolveDatasetTypeAlias(datasetType)
        dataId = DataId(dataId)
        subset = self.subset(datasetType, level, dataId, **rest)
        if len(subset) != 1:
            raise RuntimeError("No unique dataset for: Dataset type:%s Level:%s Data ID:%s Keywords:%s" %
                               (str(datasetType), str(level), str(dataId), str(rest)))
        return ButlerDataRef(subset, subset.cache[0])

◆ datasetExists()

def lsst.daf.persistence.butler.Butler.datasetExists (   self,
  datasetType,
  dataId = {},
  write = False,
  **rest
)
Determines if a dataset file exists.

Parameters
----------
datasetType - string
    The type of dataset to inquire about.
dataId - DataId, dict
    The data id of the dataset.
write - bool
    If True, look only in locations where the dataset could be written,
    and return True only if it is present in all of them.
**rest keyword arguments for the data id.

Returns
-------
exists - bool
    True if the dataset exists or is non-file-based.

Definition at line 1214 of file butler.py.

    def datasetExists(self, datasetType, dataId={}, write=False, **rest):
        """Determines if a dataset file exists.

        Parameters
        ----------
        datasetType - string
            The type of dataset to inquire about.
        dataId - DataId, dict
            The data id of the dataset.
        write - bool
            If True, look only in locations where the dataset could be written,
            and return True only if it is present in all of them.
        **rest keyword arguments for the data id.

        Returns
        -------
        exists - bool
            True if the dataset exists or is non-file-based.
        """
        datasetType = self._resolveDatasetTypeAlias(datasetType)
        dataId = DataId(dataId)
        dataId.update(**rest)
        locations = self._locate(datasetType, dataId, write=write)
        if not write:  # when write=False, locations is not a sequence
            if locations is None:
                return False
            locations = [locations]

        if not locations:  # empty list
            return False

        for location in locations:
            # If the location is a ButlerComposite (as opposed to a ButlerLocation),
            # verify the component objects exist.
            if isinstance(location, ButlerComposite):
                for name, componentInfo in location.componentInfo.items():
                    if componentInfo.subset:
                        subset = self.subset(datasetType=componentInfo.datasetType, dataId=location.dataId)
                        exists = all([obj.datasetExists() for obj in subset])
                    else:
                        exists = self.datasetExists(componentInfo.datasetType, location.dataId)
                    if exists is False:
                        return False
            else:
                if not location.repository.exists(location):
                    return False
        return True

◆ defineAlias()

def lsst.daf.persistence.butler.Butler.defineAlias (   self,
  alias,
  datasetType 
)
Register an alias that will be substituted in datasetTypes.

Parameters
----------
alias - string
    The alias keyword. It may start with @ or not. It may not contain @ except as the first character.
datasetType - string
    The string that will be substituted when @alias is passed into datasetType. It may not contain '@'

Definition at line 1101 of file butler.py.

    def defineAlias(self, alias, datasetType):
        """Register an alias that will be substituted in datasetTypes.

        Parameters
        ----------
        alias - string
            The alias keyword. It may start with @ or not. It may not contain @ except as the first character.
        datasetType - string
            The string that will be substituted when @alias is passed into datasetType. It may not contain '@'
        """
        # verify formatting of alias:
        # it can have '@' as the first character (if not it's okay, we will add it) or not at all.
        atLoc = alias.rfind('@')
        if atLoc == -1:
            alias = "@" + str(alias)
        elif atLoc > 0:
            raise RuntimeError("Badly formatted alias string: %s" % (alias,))

        # verify that datasetType does not contain '@'
        if datasetType.count('@') != 0:
            raise RuntimeError("Badly formatted type string: %s" % (datasetType))

        # verify that the alias keyword does not start with another alias keyword,
        # and vice versa
        for key in self.datasetTypeAliasDict:
            if key.startswith(alias) or alias.startswith(key):
                raise RuntimeError("Alias: %s overlaps with existing alias: %s" % (alias, key))

        self.datasetTypeAliasDict[alias] = datasetType
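
The alias rules can be exercised without a Butler instance. The sketch below restates the same three checks as a free function over a plain dict; `define_alias` and the dataset type names are illustrative, not part of the library.

```python
def define_alias(alias_dict, alias, dataset_type):
    """Validate and register an alias in `alias_dict`; return the stored key."""
    at_loc = alias.rfind("@")
    if at_loc == -1:
        alias = "@" + alias  # add the leading '@' when it is missing
    elif at_loc > 0:
        raise RuntimeError("Badly formatted alias string: %s" % (alias,))

    if "@" in dataset_type:
        raise RuntimeError("Badly formatted type string: %s" % (dataset_type,))

    # Reject aliases where one is a prefix of the other.
    for key in alias_dict:
        if key.startswith(alias) or alias.startswith(key):
            raise RuntimeError("Alias: %s overlaps with existing alias: %s" % (alias, key))

    alias_dict[alias] = dataset_type
    return alias


aliases = {}
stored_key = define_alias(aliases, "calexp", "deepCoadd_calexp")

try:
    define_alias(aliases, "@cal", "raw")  # "@calexp" starts with "@cal"
    overlap_rejected = False
except RuntimeError:
    overlap_rejected = True
```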

◆ get()

def lsst.daf.persistence.butler.Butler.get (   self,
  datasetType,
  dataId = None,
  immediate = True,
  **rest
)
Retrieves a dataset given an input collection data id.

Parameters
----------
datasetType - string
    The type of dataset to retrieve.
dataId - dict
    The data id.
immediate - bool
    If False use a proxy for delayed loading.
**rest
    keyword arguments for the data id.

Returns
-------
    An object retrieved from the dataset (or a proxy for one).

Definition at line 1352 of file butler.py.

    def get(self, datasetType, dataId=None, immediate=True, **rest):
        """Retrieves a dataset given an input collection data id.

        Parameters
        ----------
        datasetType - string
            The type of dataset to retrieve.
        dataId - dict
            The data id.
        immediate - bool
            If False use a proxy for delayed loading.
        **rest
            keyword arguments for the data id.

        Returns
        -------
            An object retrieved from the dataset (or a proxy for one).
        """
        datasetType = self._resolveDatasetTypeAlias(datasetType)
        dataId = DataId(dataId)
        dataId.update(**rest)

        location = self._locate(datasetType, dataId, write=False)
        if location is None:
            raise NoResults("No locations for get:", datasetType, dataId)
        self.log.debug("Get type=%s keys=%s from %s", datasetType, dataId, str(location))

        if hasattr(location, 'bypass'):
            # this type loader block should get moved into a helper someplace, and duplications removed.
            def callback():
                return location.bypass
        else:
            def callback():
                return self._read(location)
        if location.mapper.canStandardize(location.datasetType):
            innerCallback = callback

            def callback():
                return location.mapper.standardize(location.datasetType, innerCallback(), dataId)
        if immediate:
            return callback()
        return ReadProxy(callback)

◆ getKeys()

def lsst.daf.persistence.butler.Butler.getKeys (   self,
  datasetType = None,
  level = None,
  tag = None 
)
Get the valid data id keys at or above the given level of hierarchy for the dataset type or the
entire collection if None. The dict values are the basic Python types corresponding to the keys (int,
float, string).

Parameters
----------
datasetType - string
    The type of dataset to get keys for, entire collection if None.
level - string
    The hierarchy level to descend to. None if it should not be restricted. Use an empty string if the
    mapper should lookup the default level.
tag - any, or list of any
    Any object that can be tested to be the same as the tag in a dataId passed into butler input
    functions. Applies only to input repositories: If tag is specified by the dataId then the repo
    will only be read from if the tag in the dataId matches a tag used for that repository.

Returns
-------
Returns a dict. The dict keys are the valid data id keys at or above the given level of hierarchy for
the dataset type or the entire collection if None. The dict values are the basic Python types
corresponding to the keys (int, float, string).

Definition at line 1131 of file butler.py.

    def getKeys(self, datasetType=None, level=None, tag=None):
        """Get the valid data id keys at or above the given level of hierarchy for the dataset type or the
        entire collection if None. The dict values are the basic Python types corresponding to the keys (int,
        float, string).

        Parameters
        ----------
        datasetType - string
            The type of dataset to get keys for, entire collection if None.
        level - string
            The hierarchy level to descend to. None if it should not be restricted. Use an empty string if the
            mapper should lookup the default level.
        tag - any, or list of any
            Any object that can be tested to be the same as the tag in a dataId passed into butler input
            functions. Applies only to input repositories: If tag is specified by the dataId then the repo
            will only be read from if the tag in the dataId matches a tag used for that repository.

        Returns
        -------
        Returns a dict. The dict keys are the valid data id keys at or above the given level of hierarchy for
        the dataset type or the entire collection if None. The dict values are the basic Python types
        corresponding to the keys (int, float, string).
        """
        datasetType = self._resolveDatasetTypeAlias(datasetType)

        keys = None
        tag = setify(tag)
        for repoData in self._repos.inputs():
            if not tag or len(tag.intersection(repoData.tags)) > 0:
                keys = repoData.repo.getKeys(datasetType, level)
                # An empty dict is a valid "found" condition for keys. The only value for keys that should
                # cause the search to continue is None
                if keys is not None:
                    break
        return keys
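
The search rule in `getKeys` (skip repositories whose tags do not match, stop at the first result that is not None, and treat an empty dict as a valid result) can be sketched on plain data. `first_keys` and the repository dicts are hypothetical; the tag handling is a simplified stand-in for `setify`.

```python
def first_keys(repos, tag=None):
    """Return the keys of the first matching repository that reports any (even empty)."""
    if tag is None:
        wanted = set()
    elif isinstance(tag, (list, tuple, set)):
        wanted = set(tag)
    else:
        wanted = {tag}

    for repo in repos:
        if not wanted or wanted & repo["tags"]:
            keys = repo["keys"]
            # An empty dict counts as found; only None continues the search.
            if keys is not None:
                return keys
    return None


repos = [
    {"tags": {"a"}, "keys": None},
    {"tags": {"b"}, "keys": {}},
    {"tags": {"b"}, "keys": {"visit": int}},
]
untagged_result = first_keys(repos)
tag_a_result = first_keys(repos, tag="a")
tag_b_result = first_keys(repos, tag=["b"])
```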

◆ getMapperClass()

def lsst.daf.persistence.butler.Butler.getMapperClass (   root)
static
posix-only; gets the mapper class at the path specified by root (if a file _mapper can be found at
that location or in a parent location).

As we abstract the storage and support different types of storage locations this method will be
moved entirely into Butler Access, or made more dynamic, and the API will very likely change.

Definition at line 1093 of file butler.py.

    def getMapperClass(root):
        """posix-only; gets the mapper class at the path specified by root (if a file _mapper can be found at
        that location or in a parent location).

        As we abstract the storage and support different types of storage locations this method will be
        moved entirely into Butler Access, or made more dynamic, and the API will very likely change."""
        return Storage.getMapperClass(root)

◆ getUri()

def lsst.daf.persistence.butler.Butler.getUri (   self,
  datasetType,
  dataId = None,
  write = False,
  **rest
)
Return the URI for a dataset

.. warning:: This is intended only for debugging. The URI should
never be used for anything other than printing.

.. note:: In the event there are multiple URIs for read, we return only
the first.

.. note:: getUri() does not currently support composite datasets.

Parameters
----------
datasetType : `str`
   The dataset type of interest.
dataId : `dict`, optional
   The data identifier.
write : `bool`, optional
   Return the URI for writing?
rest : `dict`, optional
   Keyword arguments for the data id.

Returns
-------
uri : `str`
   URI for dataset.

Definition at line 1508 of file butler.py.

    def getUri(self, datasetType, dataId=None, write=False, **rest):
        """Return the URI for a dataset

        .. warning:: This is intended only for debugging. The URI should
        never be used for anything other than printing.

        .. note:: In the event there are multiple URIs for read, we return only
        the first.

        .. note:: getUri() does not currently support composite datasets.

        Parameters
        ----------
        datasetType : `str`
           The dataset type of interest.
        dataId : `dict`, optional
           The data identifier.
        write : `bool`, optional
           Return the URI for writing?
        rest : `dict`, optional
           Keyword arguments for the data id.

        Returns
        -------
        uri : `str`
           URI for dataset.
        """
        datasetType = self._resolveDatasetTypeAlias(datasetType)
        dataId = DataId(dataId)
        dataId.update(**rest)
        locations = self._locate(datasetType, dataId, write=write)
        if locations is None:
            raise NoResults("No locations for getUri: ", datasetType, dataId)

        if write:
            # Follow the write path
            # Return the first valid write location.
            for location in locations:
                if isinstance(location, ButlerComposite):
                    for name, info in location.componentInfo.items():
                        if not info.inputOnly:
                            return self.getUri(info.datasetType, location.dataId, write=True)
                else:
                    return location.getLocationsWithRoot()[0]
            # fall back to raise
            raise NoResults("No locations for getUri(write=True): ", datasetType, dataId)
        else:
            # Follow the read path, only return the first valid read
            return locations.getLocationsWithRoot()[0]

◆ put()

def lsst.daf.persistence.butler.Butler.put (   self,
  obj,
  datasetType,
  dataId = {},
  doBackup = False,
  **rest
)
Persists a dataset given an output collection data id.

Parameters
----------
obj -
    The object to persist.
datasetType - string
    The type of dataset to persist.
dataId - dict
    The data id.
doBackup - bool
    If True, rename existing instead of overwriting.
    WARNING: Setting doBackup=True is not safe for parallel processing, as it may be subject to race
    conditions.
**rest
    Keyword arguments for the data id.

Definition at line 1395 of file butler.py.

    def put(self, obj, datasetType, dataId={}, doBackup=False, **rest):
        """Persists a dataset given an output collection data id.

        Parameters
        ----------
        obj -
            The object to persist.
        datasetType - string
            The type of dataset to persist.
        dataId - dict
            The data id.
        doBackup - bool
            If True, rename existing instead of overwriting.
            WARNING: Setting doBackup=True is not safe for parallel processing, as it may be subject to race
            conditions.
        **rest
            Keyword arguments for the data id.
        """
        datasetType = self._resolveDatasetTypeAlias(datasetType)
        dataId = DataId(dataId)
        dataId.update(**rest)

        locations = self._locate(datasetType, dataId, write=True)
        if not locations:
            raise NoResults("No locations for put:", datasetType, dataId)
        for location in locations:
            if isinstance(location, ButlerComposite):
                disassembler = location.disassembler if location.disassembler else genericDisassembler
                disassembler(obj=obj, dataId=location.dataId, componentInfo=location.componentInfo)
                for name, info in location.componentInfo.items():
                    if not info.inputOnly:
                        self.put(info.obj, info.datasetType, location.dataId, doBackup=doBackup)
            else:
                if doBackup:
                    location.getRepository().backup(location.datasetType, dataId)
                location.getRepository().write(location, obj)

◆ queryMetadata()

def lsst.daf.persistence.butler.Butler.queryMetadata (   self,
  datasetType,
  format,
  dataId = {},
  **rest
)
Returns the valid values for one or more keys when given a partial
input collection data id.

Parameters
----------
datasetType - string
    The type of dataset to inquire about.
format - str, tuple
    Key or tuple of keys to be returned.
dataId - DataId, dict
    The partial data id.
**rest -
    Keyword arguments for the partial data id.

Returns
-------
A list of valid values or tuples of valid values as specified by the
format.

Definition at line 1167 of file butler.py.

    def queryMetadata(self, datasetType, format, dataId={}, **rest):
        """Returns the valid values for one or more keys when given a partial
        input collection data id.

        Parameters
        ----------
        datasetType - string
            The type of dataset to inquire about.
        format - str, tuple
            Key or tuple of keys to be returned.
        dataId - DataId, dict
            The partial data id.
        **rest -
            Keyword arguments for the partial data id.

        Returns
        -------
        A list of valid values or tuples of valid values as specified by the
        format.
        """

        datasetType = self._resolveDatasetTypeAlias(datasetType)
        dataId = DataId(dataId)
        dataId.update(**rest)
        format = sequencify(format)

        tuples = None
        for repoData in self._repos.inputs():
            if not dataId.tag or len(dataId.tag.intersection(repoData.tags)) > 0:
                tuples = repoData.repo.queryMetadata(datasetType, format, dataId)
                if tuples:
                    break

        if not tuples:
            return []

        if len(format) == 1:
            ret = []
            for x in tuples:
                try:
                    ret.append(x[0])
                except TypeError:
                    ret.append(x)
            return ret

        return tuples
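
The return-shape rule at the end of `queryMetadata` (a flat list of values for a single key, a list of tuples for several keys, an empty list when nothing matched) can be sketched on plain data. `unwrap_results` is a hypothetical name and the sample rows are invented.

```python
def unwrap_results(rows, n_keys):
    """Shape query rows the way the single-key and multi-key cases above do."""
    if not rows:
        return []
    if n_keys == 1:
        values = []
        for row in rows:
            try:
                values.append(row[0])  # row is a 1-tuple (or other sequence)
            except TypeError:
                values.append(row)     # row is already a bare value, e.g. an int
        return values
    return rows


single_key = unwrap_results([(903334,), (903336,)], n_keys=1)
bare_values = unwrap_results([903334, 903336], n_keys=1)
two_keys = unwrap_results([(903334, "r"), (903336, "i")], n_keys=2)
nothing = unwrap_results(None, n_keys=1)
```

One consequence of the `row[0]` fallback is that a bare string value would be reduced to its first character, so this shaping relies on rows being tuples whenever the values are strings.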

◆ subset()

def lsst.daf.persistence.butler.Butler.subset (   self,
  datasetType,
  level = None,
  dataId = {},
  **rest
)
Return complete dataIds for a dataset type that match a partial (or empty) dataId.

Given a partial (or empty) dataId specified in dataId and **rest, find all datasets that match the
dataId.  Optionally restrict the results to a given level specified by a dataId key (e.g. visit or
sensor or amp for a camera).  Return an iterable collection of complete dataIds as ButlerDataRefs.
Datasets with the resulting dataIds may not exist; that needs to be tested with datasetExists().

Parameters
----------
datasetType - string
    The type of dataset collection to subset
level - string
    The level of dataId at which to subset. Use an empty string if the mapper should look up the
    default level.
dataId - dict
    The data id.
**rest
    Keyword arguments for the data id.

Returns
-------
subset - ButlerSubset
    Collection of ButlerDataRefs for datasets matching the data id.

Examples
-----------
To print the full dataIds for all r-band measurements in a source catalog
(note that the subset call is equivalent to: `butler.subset('src', dataId={'filter':'r'})`):

>>> subset = butler.subset('src', filter='r')
>>> for data_ref in subset: print(data_ref.dataId)

Definition at line 1432 of file butler.py.

    def subset(self, datasetType, level=None, dataId={}, **rest):
        """Return complete dataIds for a dataset type that match a partial (or empty) dataId.

        Given a partial (or empty) dataId specified in dataId and **rest, find all datasets that match the
        dataId. Optionally restrict the results to a given level specified by a dataId key (e.g. visit or
        sensor or amp for a camera). Return an iterable collection of complete dataIds as ButlerDataRefs.
        Datasets with the resulting dataIds may not exist; that needs to be tested with datasetExists().

        Parameters
        ----------
        datasetType - string
            The type of dataset collection to subset
        level - string
            The level of dataId at which to subset. Use an empty string if the mapper should look up the
            default level.
        dataId - dict
            The data id.
        **rest
            Keyword arguments for the data id.

        Returns
        -------
        subset - ButlerSubset
            Collection of ButlerDataRefs for datasets matching the data id.

        Examples
        -----------
        To print the full dataIds for all r-band measurements in a source catalog
        (note that the subset call is equivalent to: `butler.subset('src', dataId={'filter':'r'})`):
        >>> subset = butler.subset('src', filter='r')
        >>> for data_ref in subset: print(data_ref.dataId)
        """
        datasetType = self._resolveDatasetTypeAlias(datasetType)

        # Currently expected behavior of subset is that if specified level is None then the mapper's default
        # level should be used. Convention for level within Butler is that an empty string is used to indicate
        # 'get default'.
        if level is None:
            level = ''

        dataId = DataId(dataId)
        dataId.update(**rest)
        return ButlerSubset(self, datasetType, level, dataId)

Member Data Documentation

◆ datasetTypeAliasDict

lsst.daf.persistence.butler.Butler.datasetTypeAliasDict

Definition at line 597 of file butler.py.

◆ GENERATION

int lsst.daf.persistence.butler.Butler.GENERATION = 2
static

Definition at line 499 of file butler.py.

◆ log

lsst.daf.persistence.butler.Butler.log

Definition at line 507 of file butler.py.

◆ storage

lsst.daf.persistence.butler.Butler.storage

Definition at line 599 of file butler.py.


The documentation for this class was generated from the following file: