LSST Applications  21.0.0+75b29a8a7f,21.0.0+e70536a077,21.0.0-1-ga51b5d4+62c747d40b,21.0.0-10-gbfb87ad6+3307648ee3,21.0.0-15-gedb9d5423+47cba9fc36,21.0.0-2-g103fe59+fdf0863a2a,21.0.0-2-g1367e85+d38a93257c,21.0.0-2-g45278ab+e70536a077,21.0.0-2-g5242d73+d38a93257c,21.0.0-2-g7f82c8f+e682ffb718,21.0.0-2-g8dde007+d179fbfa6a,21.0.0-2-g8f08a60+9402881886,21.0.0-2-ga326454+e682ffb718,21.0.0-2-ga63a54e+08647d4b1b,21.0.0-2-gde069b7+26c92b3210,21.0.0-2-gecfae73+0445ed2f95,21.0.0-2-gfc62afb+d38a93257c,21.0.0-27-gbbd0d29+ae871e0f33,21.0.0-28-g5fc5e037+feb0e9397b,21.0.0-3-g21c7a62+f4b9c0ff5c,21.0.0-3-g357aad2+57b0bddf0b,21.0.0-3-g4be5c26+d38a93257c,21.0.0-3-g65f322c+3f454acf5d,21.0.0-3-g7d9da8d+75b29a8a7f,21.0.0-3-gaa929c8+9e4ef6332c,21.0.0-3-ge02ed75+4b120a55c4,21.0.0-4-g3300ddd+e70536a077,21.0.0-4-g591bb35+4b120a55c4,21.0.0-4-gc004bbf+4911b9cd27,21.0.0-4-gccdca77+f94adcd104,21.0.0-4-ge8fba5a+2b3a696ff9,21.0.0-5-gb155db7+2c5429117a,21.0.0-5-gdf36809+637e4641ee,21.0.0-6-g00874e7+c9fd7f7160,21.0.0-6-g4e60332+4b120a55c4,21.0.0-7-gc8ca178+40eb9cf840,21.0.0-8-gfbe0b4b+9e4ef6332c,21.0.0-9-g2fd488a+d83b7cd606,w.2021.05
LSST Data Management Base Package
lsst.obs.base.gen2to3.calibRepoConverter.CalibRepoConverter Class Reference
Inheritance diagram for lsst.obs.base.gen2to3.calibRepoConverter.CalibRepoConverter:
lsst.obs.base.gen2to3.repoConverter.RepoConverter

Public Member Functions

def __init__ (self, *, CameraMapper mapper, Sequence[str] labels=(), **kwargs)
 
bool isDatasetTypeSpecial (self, str datasetTypeName)
 
Iterator[Tuple[str, CameraMapperMapping]] iterMappings (self)
 
RepoWalker.Target makeRepoWalkerTarget (self, str datasetTypeName, str template, Dict[str, type] keys, StorageClass storageClass, FormatterParameter formatter=None, Optional[PathElementHandler] targetHandler=None)
 
str getRun (self, str datasetTypeName, Optional[str] calibDate=None)
 
List[str] getSpecialDirectories (self)
 
def prep (self)
 
Iterator[FileDataset] iterDatasets (self)
 
def findDatasets (self)
 
def expandDataIds (self)
 
def ingest (self)
 
None finish (self)
 

Public Attributes

 mapper
 
 collection
 
 task
 
 root
 
 instrument
 
 subset
 

Detailed Description

A specialization of `RepoConverter` for calibration repositories.

Parameters
----------
mapper : `CameraMapper`
    Gen2 mapper for the data repository.  The root associated with the
    mapper is ignored and need not match the root of the repository.
labels : `Sequence` [ `str` ]
    Strings injected into the names of the collections that calibration
    datasets are written and certified into (forwarded as the ``extra``
    argument to `Instrument` methods that generate collection names and
    write curated calibrations).
**kwargs
    Additional keyword arguments are forwarded to (and required by)
    `RepoConverter`.

Definition at line 44 of file calibRepoConverter.py.
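
As a rough illustration of how ``labels`` shapes the calibration collection name, consider the hypothetical stub below (real instruments are `lsst.obs.base.Instrument` subclasses with their own naming logic; the exact name format here is an assumption for illustration only):

```python
class FakeInstrument:
    """Hypothetical stand-in for an Instrument's collection-naming logic."""
    name = "TestCam"

    def makeCalibrationCollectionName(self, *extra):
        # Assumed format: "<instrument>/calib[/<extra terms>...]"; the
        # labels passed to CalibRepoConverter are forwarded as ``extra``.
        return "/".join([self.name, "calib", *extra])

instrument = FakeInstrument()
print(instrument.makeCalibrationCollectionName())                  # TestCam/calib
print(instrument.makeCalibrationCollectionName("DM-12345", "v1"))  # TestCam/calib/DM-12345/v1
```

With no labels the default calibration collection name is used; labels are injected as additional path terms.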

Constructor & Destructor Documentation

◆ __init__()

def lsst.obs.base.gen2to3.calibRepoConverter.CalibRepoConverter.__init__ (   self,
*,
CameraMapper  mapper,
Sequence[str]   labels = (),
**  kwargs
)

Definition at line 62 of file calibRepoConverter.py.

def __init__(self, *, mapper: CameraMapper, labels: Sequence[str] = (), **kwargs):
    super().__init__(run=None, **kwargs)
    self.mapper = mapper
    self.collection = self.task.instrument.makeCalibrationCollectionName(*labels)
    self._labels = tuple(labels)
    self._datasetTypes = set()

Member Function Documentation

◆ expandDataIds()

def lsst.obs.base.gen2to3.repoConverter.RepoConverter.expandDataIds (   self)
inherited
Expand the data IDs for all datasets to be inserted.

Subclasses may override this method, but must delegate to the base
class implementation if they do.

This involves queries to the registry, but not writes.  It is
guaranteed to be called between `findDatasets` and `ingest`.

Definition at line 441 of file repoConverter.py.

def expandDataIds(self):
    """Expand the data IDs for all datasets to be inserted.

    Subclasses may override this method, but must delegate to the base
    class implementation if they do.

    This involves queries to the registry, but not writes.  It is
    guaranteed to be called between `findDatasets` and `ingest`.
    """
    import itertools
    for datasetType, datasetsByCalibDate in self._fileDatasets.items():
        for calibDate, datasetsForCalibDate in datasetsByCalibDate.items():
            nDatasets = len(datasetsForCalibDate)
            suffix = "" if nDatasets == 1 else "s"
            if calibDate is not None:
                self.task.log.info("Expanding data IDs for %s %s dataset%s at calibDate %s.",
                                   nDatasets, datasetType.name, suffix, calibDate)
            else:
                self.task.log.info("Expanding data IDs for %s %s non-calibration dataset%s.",
                                   nDatasets, datasetType.name, suffix)
            expanded = []
            for dataset in datasetsForCalibDate:
                for i, ref in enumerate(dataset.refs):
                    self.task.log.debug("Expanding data ID %s.", ref.dataId)
                    try:
                        dataId = self.task.registry.expandDataId(ref.dataId)
                        dataset.refs[i] = ref.expanded(dataId)
                    except LookupError as err:
                        self.task.log.warn("Skipping ingestion for '%s': %s", dataset.path, err)
                        # Remove skipped datasets from multi-extension FileDatasets.
                        dataset.refs[i] = None  # We will strip off the `None`s after the loop.
                dataset.refs[:] = itertools.filterfalse(lambda x: x is None, dataset.refs)
                if dataset.refs:
                    expanded.append(dataset)
            datasetsForCalibDate[:] = expanded
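
The skip-and-strip pattern used in `expandDataIds` (mark failed refs as `None` during iteration, then filter them out in place afterwards) can be exercised in isolation:

```python
import itertools

# Stand-in for dataset.refs after some expansions failed and were
# replaced with None inside the loop.
refs = ["ref0", None, "ref2", None]

# In-place slice assignment keeps the same list object, mirroring how
# expandDataIds rewrites dataset.refs rather than rebinding the name.
refs[:] = itertools.filterfalse(lambda x: x is None, refs)
print(refs)  # ['ref0', 'ref2']
```

Deferring the removal avoids mutating the list while `enumerate` is still iterating over it.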

◆ findDatasets()

def lsst.obs.base.gen2to3.repoConverter.RepoConverter.findDatasets (   self)
inherited

Definition at line 424 of file repoConverter.py.

def findDatasets(self):
    assert self._repoWalker, "prep() must be called before findDatasets."
    self.task.log.info("Adding special datasets in repo %s.", self.root)
    for dataset in self.iterDatasets():
        assert len(dataset.refs) == 1
        # None index below is for calibDate, which is only relevant for
        # CalibRepoConverter.
        self._fileDatasets[dataset.refs[0].datasetType][None].append(dataset)
    self.task.log.info("Finding datasets from files in repo %s.", self.root)
    datasetsByTypeAndCalibDate = self._repoWalker.walk(
        self.root,
        predicate=(self.subset.isRelated if self.subset is not None else None)
    )
    for datasetType, datasetsByCalibDate in datasetsByTypeAndCalibDate.items():
        for calibDate, datasets in datasetsByCalibDate.items():
            self._fileDatasets[datasetType][calibDate].extend(datasets)

◆ finish()

None lsst.obs.base.gen2to3.repoConverter.RepoConverter.finish (   self)
inherited
Finish conversion of a repository.

This is run after ``ingest``, and delegates to `_finish`, which should
be overridden by derived classes instead of this method.

Definition at line 511 of file repoConverter.py.

def finish(self) -> None:
    """Finish conversion of a repository.

    This is run after ``ingest``, and delegates to `_finish`, which should
    be overridden by derived classes instead of this method.
    """
    self._finish(self._fileDatasets)

◆ getRun()

str lsst.obs.base.gen2to3.calibRepoConverter.CalibRepoConverter.getRun (   self,
str  datasetTypeName,
Optional[str]   calibDate = None 
)
Return the name of the run within this collection into which instances of
the given dataset type should be inserted.

Parameters
----------
datasetTypeName : `str`
    Name of the dataset type.
calibDate : `str`, optional
    If not `None`, the "CALIBDATE" associated with this (calibration)
    dataset in the Gen2 data repository.

Returns
-------
run : `str`
    Name of the `~lsst.daf.butler.CollectionType.RUN` collection.

Reimplemented from lsst.obs.base.gen2to3.repoConverter.RepoConverter.

Definition at line 294 of file calibRepoConverter.py.

def getRun(self, datasetTypeName: str, calibDate: Optional[str] = None) -> str:
    # Docstring inherited from RepoConverter.
    if calibDate is None:
        return super().getRun(datasetTypeName)
    else:
        return self.instrument.makeCalibrationCollectionName(
            *self._labels,
            self.instrument.formatCollectionTimestamp(calibDate),
        )
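
The branching in `getRun` can be sketched standalone. The instrument below is a hypothetical stub: the real `makeCalibrationCollectionName` and `formatCollectionTimestamp` are instrument-defined, and the default-run fallback here stands in for the inherited `RepoConverter.getRun`:

```python
class FakeInstrument:
    # Hypothetical stand-ins for the Instrument naming helpers.
    def makeCalibrationCollectionName(self, *terms):
        return "/".join(["TestCam", "calib", *terms])

    def formatCollectionTimestamp(self, calibDate):
        # Assumed normalization of the Gen2 CALIBDATE string.
        return calibDate.replace("-", "")

def getRun(instrument, labels, calibDate=None):
    # Mirrors the decision in CalibRepoConverter.getRun: datasets with no
    # CALIBDATE fall back to the default run; calibrations get a
    # timestamped calibration collection built from the labels.
    if calibDate is None:
        return "default/run"  # stand-in for super().getRun(datasetTypeName)
    return instrument.makeCalibrationCollectionName(
        *labels, instrument.formatCollectionTimestamp(calibDate)
    )

print(getRun(FakeInstrument(), ("DM-12345",), "2020-01-01"))
# TestCam/calib/DM-12345/20200101
```

The labels captured at construction time thus reappear in every per-date run name.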

◆ getSpecialDirectories()

List[str] lsst.obs.base.gen2to3.repoConverter.RepoConverter.getSpecialDirectories (   self)
inherited
Return a list of directory paths that should not be searched for
files.

These may be directories that simply do not contain datasets (or
contain datasets in another repository), or directories whose datasets
are handled specially by a subclass.

Returns
-------
directories : `list` [`str`]
    The full paths of directories to skip, relative to the repository
    root.

Reimplemented in lsst.obs.base.gen2to3.rootRepoConverter.RootRepoConverter.

Definition at line 292 of file repoConverter.py.

def getSpecialDirectories(self) -> List[str]:
    """Return a list of directory paths that should not be searched for
    files.

    These may be directories that simply do not contain datasets (or
    contain datasets in another repository), or directories whose datasets
    are handled specially by a subclass.

    Returns
    -------
    directories : `list` [`str`]
        The full paths of directories to skip, relative to the repository
        root.
    """
    return []

◆ ingest()

def lsst.obs.base.gen2to3.repoConverter.RepoConverter.ingest (   self)
inherited
Insert converted datasets into the Gen3 repository.

Subclasses may override this method, but must delegate to the base
class implementation at some point in their own logic.

This method is guaranteed to be called after `expandDataIds`.

Definition at line 483 of file repoConverter.py.

def ingest(self):
    """Insert converted datasets into the Gen3 repository.

    Subclasses may override this method, but must delegate to the base
    class implementation at some point in their own logic.

    This method is guaranteed to be called after `expandDataIds`.
    """
    for datasetType, datasetsByCalibDate in self._fileDatasets.items():
        self.task.registry.registerDatasetType(datasetType)
        for calibDate, datasetsForCalibDate in datasetsByCalibDate.items():
            try:
                run = self.getRun(datasetType.name, calibDate)
            except LookupError:
                self.task.log.warn(f"No run configured for dataset type {datasetType.name}.")
                continue
            nDatasets = len(datasetsForCalibDate)
            self.task.log.info("Ingesting %s %s dataset%s into run %s.", nDatasets,
                               datasetType.name, "" if nDatasets == 1 else "s", run)
            try:
                self.task.registry.registerRun(run)
                self.task.butler3.ingest(*datasetsForCalibDate, transfer=self.task.config.transfer,
                                         run=run)
            except LookupError as err:
                raise LookupError(
                    f"Error expanding data ID for dataset type {datasetType.name}."
                ) from err

◆ isDatasetTypeSpecial()

bool lsst.obs.base.gen2to3.calibRepoConverter.CalibRepoConverter.isDatasetTypeSpecial (   self,
str  datasetTypeName 
)
Test whether the given dataset is handled specially by this
converter and hence should be ignored by generic base-class logic that
searches for dataset types to convert.

Parameters
----------
datasetTypeName : `str`
    Name of the dataset type to test.

Returns
-------
special : `bool`
    `True` if the dataset type is special.

Reimplemented from lsst.obs.base.gen2to3.repoConverter.RepoConverter.

Definition at line 69 of file calibRepoConverter.py.

def isDatasetTypeSpecial(self, datasetTypeName: str) -> bool:
    # Docstring inherited from RepoConverter.
    return datasetTypeName in self.instrument.getCuratedCalibrationNames()

◆ iterDatasets()

Iterator[FileDataset] lsst.obs.base.gen2to3.repoConverter.RepoConverter.iterDatasets (   self)
inherited
Iterate over datasets in the repository that should be ingested into
the Gen3 repository.

The base class implementation yields nothing; the datasets handled by
the `RepoConverter` base class itself are read directly in
`findDatasets`.

Subclasses should override this method if they support additional
datasets that are handled some other way.

Yields
------
dataset : `FileDataset`
    Structures representing datasets to be ingested.  Paths should be
    absolute.

Reimplemented in lsst.obs.base.gen2to3.standardRepoConverter.StandardRepoConverter, and lsst.obs.base.gen2to3.rootRepoConverter.RootRepoConverter.

Definition at line 405 of file repoConverter.py.

def iterDatasets(self) -> Iterator[FileDataset]:
    """Iterate over datasets in the repository that should be ingested into
    the Gen3 repository.

    The base class implementation yields nothing; the datasets handled by
    the `RepoConverter` base class itself are read directly in
    `findDatasets`.

    Subclasses should override this method if they support additional
    datasets that are handled some other way.

    Yields
    ------
    dataset : `FileDataset`
        Structures representing datasets to be ingested.  Paths should be
        absolute.
    """
    yield from ()

◆ iterMappings()

Iterator[Tuple[str, CameraMapperMapping]] lsst.obs.base.gen2to3.calibRepoConverter.CalibRepoConverter.iterMappings (   self)
Iterate over all `CameraMapper` `Mapping` objects that should be
considered for conversion by this repository.

This should include any datasets that may appear in the
repository, including those that are special (see
`isDatasetTypeSpecial`) and those that are being ignored (see
`ConvertRepoTask.isDatasetTypeIncluded`); this allows the converter
to identify and hence skip these datasets quietly instead of warning
about them as unrecognized.

Yields
------
datasetTypeName: `str`
    Name of the dataset type.
mapping : `lsst.obs.base.mapping.Mapping`
    Mapping object used by the Gen2 `CameraMapper` to describe the
    dataset type.

Reimplemented from lsst.obs.base.gen2to3.repoConverter.RepoConverter.

Definition at line 73 of file calibRepoConverter.py.

def iterMappings(self) -> Iterator[Tuple[str, CameraMapperMapping]]:
    # Docstring inherited from RepoConverter.
    yield from self.mapper.calibrations.items()

◆ makeRepoWalkerTarget()

RepoWalker.Target lsst.obs.base.gen2to3.calibRepoConverter.CalibRepoConverter.makeRepoWalkerTarget (   self,
str  datasetTypeName,
str  template,
Dict[str, type]  keys,
StorageClass  storageClass,
FormatterParameter   formatter = None,
Optional[PathElementHandler]   targetHandler = None 
)
Make a struct that identifies a dataset type to be extracted by
walking the repo directory structure.

Parameters
----------
datasetTypeName : `str`
    Name of the dataset type (the same in both Gen2 and Gen3).
template : `str`
    The full Gen2 filename template.
keys : `dict` [`str`, `type`]
    A dictionary mapping Gen2 data ID key to the type of its value.
storageClass : `lsst.daf.butler.StorageClass`
    Gen3 storage class for this dataset type.
formatter : `lsst.daf.butler.Formatter` or `str`, optional
    A Gen 3 formatter class or fully-qualified name.
targetHandler : `PathElementHandler`, optional
    Specialist target handler to use for this dataset type.

Returns
-------
target : `RepoWalker.Target`
    A struct containing information about the target dataset (much of
    it simply forwarded from the arguments).

Reimplemented from lsst.obs.base.gen2to3.repoConverter.RepoConverter.

Definition at line 77 of file calibRepoConverter.py.

def makeRepoWalkerTarget(self, datasetTypeName: str, template: str,
                         keys: Dict[str, type], storageClass: StorageClass,
                         formatter: FormatterParameter = None,
                         targetHandler: Optional[PathElementHandler] = None,
                         ) -> RepoWalker.Target:
    # Docstring inherited from RepoConverter.
    target = RepoWalker.Target(
        datasetTypeName=datasetTypeName,
        storageClass=storageClass,
        template=template,
        keys=keys,
        instrument=self.task.instrument.getName(),
        universe=self.task.registry.dimensions,
        formatter=formatter,
        targetHandler=targetHandler,
        translatorFactory=self.task.translatorFactory,
    )
    self._datasetTypes.add(target.datasetType)
    return target

◆ prep()

def lsst.obs.base.gen2to3.repoConverter.RepoConverter.prep (   self)
inherited
Perform preparatory work associated with the dataset types to be
converted from this repository (but not the datasets themselves).

Notes
-----
This should be a relatively fast operation that should not depend on
the size of the repository.

Subclasses may override this method, but must delegate to the base
class implementation at some point in their own logic.
More often, subclasses will specialize the behavior of `prep` by
overriding other methods to which the base class implementation
delegates.  These include:
 - `iterMappings`
 - `isDatasetTypeSpecial`
 - `getSpecialDirectories`
 - `makeRepoWalkerTarget`

This should not perform any write operations to the Gen3 repository.
It is guaranteed to be called before `ingest`.

Reimplemented in lsst.obs.base.gen2to3.standardRepoConverter.StandardRepoConverter, and lsst.obs.base.gen2to3.rootRepoConverter.RootRepoConverter.

Definition at line 308 of file repoConverter.py.

def prep(self):
    """Perform preparatory work associated with the dataset types to be
    converted from this repository (but not the datasets themselves).

    Notes
    -----
    This should be a relatively fast operation that should not depend on
    the size of the repository.

    Subclasses may override this method, but must delegate to the base
    class implementation at some point in their own logic.
    More often, subclasses will specialize the behavior of `prep` by
    overriding other methods to which the base class implementation
    delegates.  These include:
     - `iterMappings`
     - `isDatasetTypeSpecial`
     - `getSpecialDirectories`
     - `makeRepoWalkerTarget`

    This should not perform any write operations to the Gen3 repository.
    It is guaranteed to be called before `ingest`.
    """
    self.task.log.info(f"Preparing other dataset types from root {self.root}.")
    walkerInputs: List[Union[RepoWalker.Target, RepoWalker.Skip]] = []
    for datasetTypeName, mapping in self.iterMappings():
        try:
            template = mapping.template
        except RuntimeError:
            # No template for this dataset in this mapper, so there's no
            # way there should be instances of this dataset in this repo.
            continue
        extensions = [""]
        skip = False
        message = None
        storageClass = None
        if (not self.task.isDatasetTypeIncluded(datasetTypeName)
                or self.isDatasetTypeSpecial(datasetTypeName)):
            # User indicated not to include this data, but we still want
            # to recognize files of that type to avoid warning about them.
            skip = True
        else:
            storageClass = self._guessStorageClass(datasetTypeName, mapping)
            if storageClass is None:
                # This may be a problem, but only if we actually encounter
                # any files corresponding to this dataset.  Of course, we
                # need to be able to parse those files in order to
                # recognize that situation.
                message = f"no storage class found for {datasetTypeName}"
                skip = True
        # Handle files that are compressed on disk, but the gen2 template
        # is just `.fits`
        if template.endswith(".fits"):
            extensions.extend((".gz", ".fz"))
        for extension in extensions:
            if skip:
                walkerInput = RepoWalker.Skip(
                    template=template+extension,
                    keys=mapping.keys(),
                    message=message,
                )
                self.task.log.debug("Skipping template in walker: %s", template)
            else:
                assert message is None
                targetHandler = self.task.config.targetHandlerClasses.get(datasetTypeName)
                if targetHandler is not None:
                    targetHandler = doImport(targetHandler)
                walkerInput = self.makeRepoWalkerTarget(
                    datasetTypeName=datasetTypeName,
                    template=template+extension,
                    keys=mapping.keys(),
                    storageClass=storageClass,
                    formatter=self.task.config.formatterClasses.get(datasetTypeName),
                    targetHandler=targetHandler,
                )
                self.task.log.debug("Adding template to walker: %s + %s, for %s", template, extension,
                                    walkerInput.datasetType)
            walkerInputs.append(walkerInput)

    for dirPath in self.getSpecialDirectories():
        walkerInputs.append(
            RepoWalker.Skip(
                template=dirPath,  # not really a template, but that's fine; it's relative to root.
                keys={},
                message=None,
                isForFiles=True,
            )
        )
    fileIgnoreRegExTerms = []
    for pattern in self.task.config.fileIgnorePatterns:
        fileIgnoreRegExTerms.append(fnmatch.translate(pattern))
    if fileIgnoreRegExTerms:
        fileIgnoreRegEx = re.compile("|".join(fileIgnoreRegExTerms))
    else:
        fileIgnoreRegEx = None
    self._repoWalker = RepoWalker(walkerInputs, fileIgnoreRegEx=fileIgnoreRegEx,
                                  log=self.task.log.getChild("repoWalker"))
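
The ignore-pattern handling at the end of `prep` (glob patterns compiled into a single alternation regex via `fnmatch.translate`) works like this in isolation; the example patterns are illustrative, not taken from any real `fileIgnorePatterns` config:

```python
import fnmatch
import re

# Glob-style patterns, as they would appear in config.fileIgnorePatterns.
patterns = ["*.log", "README*"]

# Each glob becomes an anchored regex; joining with "|" yields one
# compiled expression that matches any of them.
terms = [fnmatch.translate(p) for p in patterns]
ignoreRegEx = re.compile("|".join(terms)) if terms else None

assert ignoreRegEx.match("convert.log")
assert ignoreRegEx.match("README.txt")
assert ignoreRegEx.match("bias.fits") is None
```

Compiling once up front keeps the per-file check during the repository walk cheap.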

Member Data Documentation

◆ collection

lsst.obs.base.gen2to3.calibRepoConverter.CalibRepoConverter.collection

Definition at line 65 of file calibRepoConverter.py.

◆ instrument

lsst.obs.base.gen2to3.repoConverter.RepoConverter.instrument
inherited

Definition at line 213 of file repoConverter.py.

◆ mapper

lsst.obs.base.gen2to3.calibRepoConverter.CalibRepoConverter.mapper

Definition at line 64 of file calibRepoConverter.py.

◆ root

lsst.obs.base.gen2to3.repoConverter.RepoConverter.root
inherited

Definition at line 212 of file repoConverter.py.

◆ subset

lsst.obs.base.gen2to3.repoConverter.RepoConverter.subset
inherited

Definition at line 214 of file repoConverter.py.

◆ task

lsst.obs.base.gen2to3.repoConverter.RepoConverter.task
inherited

Definition at line 211 of file repoConverter.py.


The documentation for this class was generated from the following file: