Inheritance diagram for lsst.pipe.base.connections.PipelineTaskConnections:

Public Member Functions
def	__init__ (self, *, 'PipelineTaskConfig' config=None)

typing.Tuple[InputQuantizedConnection, OutputQuantizedConnection]	buildDatasetRefs (self, Quantum quantum)

NamedKeyDict[DatasetType, typing.Set[DatasetRef]]	adjustQuantum (self, NamedKeyDict[DatasetType, typing.Set[DatasetRef]] datasetRefMap)

def	__prepare__ (name, bases, **kwargs)

def	__new__ (cls, name, bases, dct, **kwargs)

Public Attributes
	inputs

	prerequisiteInputs

	outputs

	initInputs

	initOutputs

	allConnections

	config

Detailed Description

PipelineTaskConnections is a class used to declare desired IO when a
PipelineTask is run by an activator.

Parameters
----------
config : `PipelineTaskConfig`
    A `PipelineTaskConfig` class instance whose class has been configured
    to use this `PipelineTaskConnections` class.

Notes
-----
``PipelineTaskConnections`` classes are created by declaring class
attributes of types defined in `lsst.pipe.base.connectionTypes` and are
listed as follows:

* ``InitInput`` - Defines connections in a quantum graph which are used as
  inputs to the ``__init__`` function of the `PipelineTask` corresponding
  to this class.
* ``InitOutput`` - Defines connections in a quantum graph which are to be
  persisted using a butler at the end of the ``__init__`` function of the
  `PipelineTask` corresponding to this class. The variable name used to
  define this connection should be the same as an attribute name on the
  `PipelineTask` instance. E.g. if an ``InitOutput`` is declared with
  the name ``outputSchema`` in a ``PipelineTaskConnections`` class, then
  a `PipelineTask` instance should have an attribute
  ``self.outputSchema`` defined. Its value is what will be saved by the
  activator framework.
* ``PrerequisiteInput`` - An input connection type that defines a
  `lsst.daf.butler.DatasetType` that must be present at execution time,
  but that will not be used during the course of creating the quantum
  graph to be executed. These most often are things produced outside the
  processing pipeline, such as reference catalogs.
* ``Input`` - Input `lsst.daf.butler.DatasetType` objects that will be used
  in the ``run`` method of a `PipelineTask`. The name used to declare the
  class attribute must match a function argument name in the ``run``
  method of a `PipelineTask`. E.g. if the ``PipelineTaskConnections``
  defines an ``Input`` with the name ``calexp``, then the corresponding
  signature should be ``PipelineTask.run(calexp, ...)``.
* ``Output`` - A `lsst.daf.butler.DatasetType` that will be produced by an
  execution of a `PipelineTask`. The name used to declare the connection
  must correspond to an attribute of a `Struct` that is returned by a
  `PipelineTask` ``run`` method.  E.g. if an output connection is
  defined with the name ``measCat``, then the corresponding
  ``PipelineTask.run`` method must return ``Struct(measCat=X,..)`` where
  X matches the ``storageClass`` type defined on the output connection.
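
The naming rules for ``Input`` and ``Output`` connections can be checked mechanically. The following is a minimal sketch using only the standard library; ``FakeTask`` and ``check_run_signature`` are hypothetical names, and `types.SimpleNamespace` stands in for `Struct`. None of them are part of `lsst.pipe.base`.

```python
import inspect
from types import SimpleNamespace


class FakeTask:
    """Hypothetical stand-in for a PipelineTask; not an lsst class."""

    def run(self, calexp):
        # The returned namespace plays the role of a Struct whose
        # attribute names match the declared Output connections.
        return SimpleNamespace(measCat=[calexp, "measured"])


def check_run_signature(task, input_names, output_names):
    """Return (missing inputs, missing outputs) for the naming convention."""
    params = set(inspect.signature(task.run).parameters)
    missing_inputs = set(input_names) - params
    # Call run with placeholder values for the inputs it accepts.
    result = task.run(**{name: name for name in input_names if name in params})
    missing_outputs = {name for name in output_names if not hasattr(result, name)}
    return missing_inputs, missing_outputs


missing = check_run_signature(FakeTask(), {"calexp"}, {"measCat"})
```

Both sets in ``missing`` are empty when every ``Input`` name is a ``run`` argument and every ``Output`` name is an attribute of the returned object.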

The process of declaring a ``PipelineTaskConnections`` class involves
parameters passed in the declaration statement.

The first parameter is ``dimensions`` which is an iterable of strings which
defines the unit of processing the run method of a corresponding
`PipelineTask` will operate on. These dimensions must match dimensions that
exist in the butler registry which will be used in executing the
corresponding `PipelineTask`.

The second parameter is labeled ``defaultTemplates`` and is conditionally
optional. The name attributes of connections can be specified as python
format strings, with named format arguments. If any of the name parameters
on connections defined in a `PipelineTaskConnections` class contain a
template, then a default template value must be specified in the
``defaultTemplates`` argument. This is done by passing a dictionary with
keys corresponding to a template identifier, and values corresponding to
the value to use as a default when formatting the string. For example, if
``ConnectionClass.calexp.name = '{input}Coadd_calexp'`` then
``defaultTemplates = {'input': 'deep'}``.
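
The template mechanism described above relies on ordinary Python format strings. The sketch below illustrates the idea with the standard library only; ``template_fields`` and ``resolve_name`` are hypothetical helpers written for this example, not functions provided by `lsst.pipe.base`.

```python
import string


def template_fields(name):
    """Return the set of named format fields used in a connection name."""
    return {field for _, field, _, _ in string.Formatter().parse(name)
            if field is not None}


def resolve_name(name, default_templates, overrides=None):
    """Format a templated name, letting overrides replace defaults."""
    values = dict(default_templates)
    values.update(overrides or {})
    missing = template_fields(name) - set(values)
    if missing:
        raise TypeError(f"No default template provided for {sorted(missing)}")
    return name.format(**values)


default_name = resolve_name("{input}Coadd_calexp", {"input": "deep"})
custom_name = resolve_name("{input}Coadd_calexp", {"input": "deep"},
                           {"input": "goodSeeing"})
```

Here ``default_name`` uses the declared default, while ``custom_name`` shows how a user-supplied value would replace it.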

Once a `PipelineTaskConnections` class is created, it is used in the
creation of a `PipelineTaskConfig`. This is further documented in the
documentation of `PipelineTaskConfig`. For the purposes of this
documentation, the relevant information is that the config class allows
configuration of connection names by users when running a pipeline.

Instances of a `PipelineTaskConnections` class are used by the pipeline
task execution framework to introspect what a corresponding `PipelineTask`
will require, and what it will produce.

Examples
--------
>>> from lsst.pipe.base import connectionTypes as cT
>>> from lsst.pipe.base import PipelineTaskConnections
>>> from lsst.pipe.base import PipelineTaskConfig
>>> class ExampleConnections(PipelineTaskConnections,
...                          dimensions=("A", "B"),
...                          defaultTemplates={"foo": "Example"}):
...     inputConnection = cT.Input(doc="Example input",
...                                dimensions=("A", "B"),
...                                storageClass="Exposure",
...                                name="{foo}Dataset")
...     outputConnection = cT.Output(doc="Example output",
...                                  dimensions=("A", "B"),
...                                  storageClass="Exposure",
...                                  name="{foo}output")
>>> class ExampleConfig(PipelineTaskConfig,
...                     pipelineConnections=ExampleConnections):
...     pass
>>> config = ExampleConfig()
>>> config.connections.foo = "Modified"
>>> config.connections.outputConnection = "TotallyDifferent"
>>> connections = ExampleConnections(config=config)
>>> assert(connections.inputConnection.name == "ModifiedDataset")
>>> assert(connections.outputConnection.name == "TotallyDifferent")

Definition at line 260 of file connections.py.

Constructor & Destructor Documentation

◆ __init__()

def lsst.pipe.base.connections.PipelineTaskConnections.__init__	(		self,
		*,
		'PipelineTaskConfig'	config = `None`
	)

Definition at line 364 of file connections.py.

     def __init__(self, *, config: 'PipelineTaskConfig' = None):
         self.inputs = set(self.inputs)
         self.prerequisiteInputs = set(self.prerequisiteInputs)
         self.outputs = set(self.outputs)
         self.initInputs = set(self.initInputs)
         self.initOutputs = set(self.initOutputs)
         self.allConnections = dict(self.allConnections)
  
         if config is None or not isinstance(config, configMod.PipelineTaskConfig):
             raise ValueError("PipelineTaskConnections must be instantiated with"
                              " a PipelineTaskConfig instance")
         self.config = config
         # Extract the template names that were defined in the config instance
         # by looping over the keys of the defaultTemplates dict specified at
         # class declaration time
         templateValues = {name: getattr(config.connections, name) for name in getattr(self,
                           'defaultTemplates').keys()}
         # Extract the configured value corresponding to each connection
         # variable. I.e. for each connection identifier, populate an override
         # for the connection.name attribute
         self._nameOverrides = {name: getattr(config.connections, name).format(**templateValues)
                                for name in self.allConnections.keys()}
  
         # connections.name corresponds to a dataset type name, create a reverse
         # mapping that goes from dataset type name to attribute identifier name
         # (variable name) on the connection class
         self._typeNameToVarName = {v: k for k, v in self._nameOverrides.items()}
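
The name-override logic in the listing above can be traced with plain Python objects. This is a minimal sketch: the `types.SimpleNamespace` below is a hypothetical stand-in for ``config.connections``, and the values mirror the class-level example earlier on this page.

```python
from types import SimpleNamespace

# Hypothetical stand-in for ``config.connections``: one attribute per
# template key and one per connection variable name.
connections_config = SimpleNamespace(
    foo="Modified",
    inputConnection="{foo}Dataset",
    outputConnection="TotallyDifferent",
)
default_templates = {"foo": "Example"}
connection_names = ["inputConnection", "outputConnection"]

# Read the configured value of each template key.
template_values = {key: getattr(connections_config, key) for key in default_templates}

# Format each configured connection name with the template values.
name_overrides = {
    name: getattr(connections_config, name).format(**template_values)
    for name in connection_names
}

# Reverse mapping from dataset type name to connection variable name.
type_name_to_var_name = {v: k for k, v in name_overrides.items()}
```

A name without template fields, such as ``"TotallyDifferent"``, passes through ``format`` unchanged.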
  

Member Function Documentation

◆ __new__()

def lsst.pipe.base.connections.PipelineTaskConnectionsMetaclass.__new__	(		cls,
			name,
			bases,
			dct,
		**	kwargs
	)

inherited

Definition at line 110 of file connections.py.

     def __new__(cls, name, bases, dct, **kwargs):
         dimensionsValueError = TypeError("PipelineTaskConnections class must be created with a dimensions "
                                          "attribute which is an iterable of dimension names")
  
         if name != 'PipelineTaskConnections':
             # Verify that dimensions are passed as a keyword in class
             # declaration
             if 'dimensions' not in kwargs:
                 for base in bases:
                     if hasattr(base, 'dimensions'):
                         kwargs['dimensions'] = base.dimensions
                         break
                 if 'dimensions' not in kwargs:
                     raise dimensionsValueError
             try:
                 if isinstance(kwargs['dimensions'], str):
                     raise TypeError("Dimensions must be iterable of dimensions, got str, "
                                     "possibly omitted trailing comma")
                 if not isinstance(kwargs['dimensions'], typing.Iterable):
                     raise TypeError("Dimensions must be iterable of dimensions")
                 dct['dimensions'] = set(kwargs['dimensions'])
             except TypeError as exc:
                 raise dimensionsValueError from exc
             # Lookup any python string templates that may have been used in the
             # declaration of the name field of a class connection attribute
             allTemplates = set()
             stringFormatter = string.Formatter()
             # Loop over all connections
             for obj in dct['allConnections'].values():
                 nameValue = obj.name
                 # add all the parameters to the set of templates
                 for param in stringFormatter.parse(nameValue):
                     if param[1] is not None:
                         allTemplates.add(param[1])
  
             # look up any template from base classes and merge them all
             # together
             mergeDict = {}
             for base in bases[::-1]:
                 if hasattr(base, 'defaultTemplates'):
                     mergeDict.update(base.defaultTemplates)
             if 'defaultTemplates' in kwargs:
                 mergeDict.update(kwargs['defaultTemplates'])
  
             if len(mergeDict) > 0:
                 kwargs['defaultTemplates'] = mergeDict
  
             # Verify that if templated strings were used, defaults were
             # supplied as an argument in the declaration of the connection
             # class
             if len(allTemplates) > 0 and 'defaultTemplates' not in kwargs:
                 raise TypeError("PipelineTaskConnection class contains templated attribute names, but no "
                                 "default templates were provided, add a dictionary attribute named "
                                 "defaultTemplates which contains the mapping between template key and value")
             if len(allTemplates) > 0:
                 # Verify all templates have a default, and throw if they do not
                 defaultTemplateKeys = set(kwargs['defaultTemplates'].keys())
                 templateDifference = allTemplates.difference(defaultTemplateKeys)
                 if templateDifference:
                     raise TypeError(f"Default template keys were not provided for {templateDifference}")
                 # Verify that templates do not share names with variable names
                 # used for a connection, this is needed because of how
                 # templates are specified in an associated config class.
                 nameTemplateIntersection = allTemplates.intersection(set(dct['allConnections'].keys()))
                 if len(nameTemplateIntersection) > 0:
                     raise TypeError(f"Template parameters cannot share names with Class attributes"
                                     f" (conflicts are {nameTemplateIntersection}).")
             dct['defaultTemplates'] = kwargs.get('defaultTemplates', {})
  
         # Convert all the connection containers into frozensets so they cannot
         # be modified at the class scope
         for connectionName in ("inputs", "prerequisiteInputs", "outputs", "initInputs", "initOutputs"):
             dct[connectionName] = frozenset(dct[connectionName])
         # our custom dict type must be turned into an actual dict to be used in
         # type.__new__
         return super().__new__(cls, name, bases, dict(dct))
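
The listing above depends on Python passing class-declaration keywords to the metaclass. The following is a simplified sketch of that pattern, restricted to the ``dimensions`` check; ``DimensionsMeta``, ``Root``, ``Good``, and ``Child`` are hypothetical names and the logic is a reduced stand-in for the real metaclass.

```python
from collections.abc import Iterable


class DimensionsMeta(type):
    """Hypothetical metaclass; a simplified stand-in, not the lsst one."""

    def __new__(mcls, name, bases, namespace, **kwargs):
        if bases:  # skip validation for the root class itself
            dims = kwargs.get("dimensions")
            if dims is None:
                # Fall back to dimensions inherited from a base class.
                dims = next((b.dimensions for b in bases if hasattr(b, "dimensions")), None)
            if dims is None or isinstance(dims, str) or not isinstance(dims, Iterable):
                raise TypeError("dimensions must be an iterable of dimension names")
            namespace["dimensions"] = set(dims)
        # Keywords are consumed here and not forwarded to type.__new__.
        return super().__new__(mcls, name, bases, namespace)

    def __init__(cls, name, bases, namespace, **kwargs):
        # Drop the keywords before delegating to type.__init__.
        super().__init__(name, bases, namespace)


class Root(metaclass=DimensionsMeta):
    pass


class Good(Root, dimensions=("visit", "detector")):
    pass


class Child(Good):  # inherits dimensions from Good
    pass
```

Rejecting a bare ``str`` guards against writing ``dimensions=("visit")`` when ``dimensions=("visit",)`` was intended.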
  

◆ __prepare__()

def lsst.pipe.base.connections.PipelineTaskConnectionsMetaclass.__prepare__	(		name,
			bases,
		**	kwargs
	)

inherited

Definition at line 99 of file connections.py.

     def __prepare__(name, bases, **kwargs):  # noqa: 805
         # Create an instance of our special dict to catch and track all
         # variables that are instances of connectionTypes.BaseConnection
         # Copy any existing connections from a parent class
         dct = PipelineTaskConnectionDict()
         for base in bases:
             if isinstance(base, PipelineTaskConnectionsMetaclass):
                 for name, value in base.allConnections.items():
                     dct[name] = value
         return dct
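
The role of ``__prepare__`` is to supply the namespace in which the class body executes. The sketch below shows, under simplifying assumptions, how a dict subclass can record connection-like attributes as they are assigned; ``Connection``, ``TrackingDict``, and ``TrackingMeta`` are hypothetical stand-ins rather than the classes used by `lsst.pipe.base`.

```python
class Connection:
    """Hypothetical stand-in for a connection type declared on a class."""

    def __init__(self, name):
        self.name = name


class TrackingDict(dict):
    """Class namespace that records every Connection assigned in the body."""

    def __init__(self):
        super().__init__()
        self["allConnections"] = {}

    def __setitem__(self, key, value):
        if isinstance(value, Connection):
            self["allConnections"][key] = value
        super().__setitem__(key, value)


class TrackingMeta(type):
    @classmethod
    def __prepare__(mcls, name, bases, **kwargs):
        namespace = TrackingDict()
        # Copy connections already collected on parent classes.
        for base in bases:
            if isinstance(base, TrackingMeta):
                for key, value in base.allConnections.items():
                    namespace[key] = value
        return namespace

    def __new__(mcls, name, bases, namespace, **kwargs):
        # Convert the tracking namespace to a plain dict before creating
        # the class, as the __new__ listing on this page does.
        return super().__new__(mcls, name, bases, dict(namespace))


class Base(metaclass=TrackingMeta):
    calexp = Connection("calexp")


class Derived(Base):
    measCat = Connection("measCat")
```

``Derived.allConnections`` contains both ``calexp`` and ``measCat``, while ``Base.allConnections`` is left untouched because each class gets its own namespace.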
  

◆ adjustQuantum()

NamedKeyDict[DatasetType, typing.Set[DatasetRef]] lsst.pipe.base.connections.PipelineTaskConnections.adjustQuantum	(		self,
		NamedKeyDict[DatasetType, typing.Set[DatasetRef]]	datasetRefMap
	)

Override to make adjustments to `lsst.daf.butler.DatasetRef` objects
in the `lsst.daf.butler.core.Quantum` during the graph generation stage
of the activator.

The base class implementation simply checks that input connections with
``multiple`` set to `False` have no more than one dataset.

Parameters
----------
datasetRefMap : `NamedKeyDict`
    Mapping from dataset type to a `set` of
    `lsst.daf.butler.DatasetRef` objects

Returns
-------
datasetRefMap : `NamedKeyDict`
    Modified mapping of input with possibly adjusted
    `lsst.daf.butler.DatasetRef` objects.

Raises
------
ScalarError
    Raised if any `Input` or `PrerequisiteInput` connection has
    ``multiple`` set to `False`, but multiple datasets.
Exception
    Overrides of this function have the option of raising an Exception
    if a field in the input does not satisfy a need for a corresponding
    `PipelineTask`, e.g. no reference catalogs are found.

Definition at line 459 of file connections.py.

     def adjustQuantum(self, datasetRefMap: NamedKeyDict[DatasetType, typing.Set[DatasetRef]]
                       ) -> NamedKeyDict[DatasetType, typing.Set[DatasetRef]]:
         """Override to make adjustments to `lsst.daf.butler.DatasetRef` objects
         in the `lsst.daf.butler.core.Quantum` during the graph generation stage
         of the activator.
  
         The base class implementation simply checks that input connections with
         ``multiple`` set to `False` have no more than one dataset.
  
         Parameters
         ----------
         datasetRefMap : `NamedKeyDict`
             Mapping from dataset type to a `set` of
             `lsst.daf.butler.DatasetRef` objects
  
         Returns
         -------
         datasetRefMap : `NamedKeyDict`
             Modified mapping of input with possibly adjusted
             `lsst.daf.butler.DatasetRef` objects.
  
         Raises
         ------
         ScalarError
             Raised if any `Input` or `PrerequisiteInput` connection has
             ``multiple`` set to `False`, but multiple datasets.
         Exception
             Overrides of this function have the option of raising an Exception
             if a field in the input does not satisfy a need for a corresponding
             `PipelineTask`, e.g. no reference catalogs are found.
         """
         for connection in itertools.chain(iterConnections(self, "inputs"),
                                           iterConnections(self, "prerequisiteInputs")):
             refs = datasetRefMap[connection.name]
             if not connection.multiple and len(refs) > 1:
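                 # refs is documented as a set, which cannot be indexed, so
                 # take an arbitrary element to report the dataset type name
                 raise ScalarError(
                     f"Found multiple datasets {', '.join(str(r.dataId) for r in refs)} "
                     f"for scalar connection {connection.name} ({next(iter(refs)).datasetType.name})."
                 )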
         return datasetRefMap
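
An override of ``adjustQuantum`` would typically run the base-class check and then add its own requirement. The sketch below illustrates that flow with plain dictionaries; ``Conn``, ``check_scalars``, ``require_ref_cats``, the ``refCat`` key, and the local ``ScalarError`` class are all hypothetical stand-ins, and the choice of `FileNotFoundError` is an assumption made for this example.

```python
class ScalarError(Exception):
    """Local stand-in for the exception of the same name in the listing."""


class Conn:
    """Hypothetical connection record with only the fields used here."""

    def __init__(self, name, multiple=False):
        self.name = name
        self.multiple = multiple


def check_scalars(connections, dataset_ref_map):
    """Mimic the base-class check on a plain dict of sets."""
    for connection in connections:
        refs = dataset_ref_map[connection.name]
        if not connection.multiple and len(refs) > 1:
            raise ScalarError(
                f"Found {len(refs)} datasets for scalar connection {connection.name}"
            )
    return dataset_ref_map


def require_ref_cats(connections, dataset_ref_map):
    """Sketch of an override: run the base check, then add a requirement."""
    dataset_ref_map = check_scalars(connections, dataset_ref_map)
    if not dataset_ref_map["refCat"]:
        raise FileNotFoundError("No reference catalogs found")
    return dataset_ref_map


connections = [Conn("calexp"), Conn("refCat", multiple=True)]
ok = require_ref_cats(connections, {"calexp": {"ref1"}, "refCat": {"cat1", "cat2"}})
```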
  
  

◆ buildDatasetRefs()

typing.Tuple[InputQuantizedConnection, OutputQuantizedConnection] lsst.pipe.base.connections.PipelineTaskConnections.buildDatasetRefs	(		self,
		Quantum	quantum
	)

Builds QuantizedConnections corresponding to input Quantum.

Parameters
----------
quantum : `lsst.daf.butler.Quantum`
    Quantum object which defines the inputs and outputs for a given
    unit of processing

Returns
-------
retVal : `tuple` of (`InputQuantizedConnection`, `OutputQuantizedConnection`)
    Namespaces mapping attribute names (identifiers of connections) to
    butler references defined in the input `lsst.daf.butler.Quantum`.

Definition at line 392 of file connections.py.

     def buildDatasetRefs(self, quantum: Quantum) -> typing.Tuple[InputQuantizedConnection,
                                                                  OutputQuantizedConnection]:
         """Builds QuantizedConnections corresponding to input Quantum
  
         Parameters
         ----------
         quantum : `lsst.daf.butler.Quantum`
             Quantum object which defines the inputs and outputs for a given
             unit of processing
  
         Returns
         -------
         retVal : `tuple` of (`InputQuantizedConnection`, `OutputQuantizedConnection`)
             Namespaces mapping attribute names (identifiers of connections) to
             butler references defined in the input `lsst.daf.butler.Quantum`.
         """
         inputDatasetRefs = InputQuantizedConnection()
         outputDatasetRefs = OutputQuantizedConnection()
         # operate on a reference object and an iterable of names of class
         # connection attributes
         for refs, names in zip((inputDatasetRefs, outputDatasetRefs),
                                (itertools.chain(self.inputs, self.prerequisiteInputs), self.outputs)):
             # get a name of a class connection attribute
             for attributeName in names:
                 # get the attribute identified by name
                 attribute = getattr(self, attributeName)
                 # Branch if the attribute dataset type is an input
                 if attribute.name in quantum.inputs:
                     # Get the DatasetRefs
                     quantumInputRefs = quantum.inputs[attribute.name]
                     # if the dataset is marked to load deferred, wrap it in a
                     # DeferredDatasetRef
                     if attribute.deferLoad:
                         quantumInputRefs = [DeferredDatasetRef(datasetRef=ref) for ref in quantumInputRefs]
                     # Unpack arguments that are not marked multiples (list of
                     # length one)
                     if not attribute.multiple:
                         if len(quantumInputRefs) > 1:
                             raise ScalarError(
                                 f"Received multiple datasets "
                                 f"{', '.join(str(r.dataId) for r in quantumInputRefs)} "
                                 f"for scalar connection {attributeName} "
                                 f"({quantumInputRefs[0].datasetType.name}) "
                                 f"of quantum for {quantum.taskName} with data ID {quantum.dataId}."
                             )
                         if len(quantumInputRefs) == 0:
                             continue
                         quantumInputRefs = quantumInputRefs[0]
                     # Add to the QuantizedConnection identifier
                     setattr(refs, attributeName, quantumInputRefs)
                 # Branch if the attribute dataset type is an output
                 elif attribute.name in quantum.outputs:
                     value = quantum.outputs[attribute.name]
                     # Unpack arguments that are not marked multiples (list of
                     # length one)
                     if not attribute.multiple:
                         value = value[0]
                     # Add to the QuantizedConnection identifier
                     setattr(refs, attributeName, value)
                 # Specified attribute is not in inputs or outputs, don't know
                 # how to handle, throw
                 else:
                     raise ValueError(f"Attribute with name {attributeName} has no counterpart "
                                      "in input quantum")
         return inputDatasetRefs, outputDatasetRefs
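
The scalar-versus-multiple handling in the listing above can be isolated in a few lines. This is a minimal sketch with plain dictionaries and lists; ``unpack_refs`` and its argument layout are hypothetical, and strings stand in for `lsst.daf.butler.DatasetRef` objects.

```python
from types import SimpleNamespace


def unpack_refs(connections, quantum_inputs):
    """Attach refs to a namespace, unwrapping scalar connections.

    ``connections`` maps attribute name -> (dataset type name, multiple);
    both arguments are plain-dict stand-ins for the real objects.
    """
    refs = SimpleNamespace()
    for attribute_name, (type_name, multiple) in connections.items():
        values = quantum_inputs[type_name]
        if not multiple:
            if len(values) > 1:
                raise ValueError(f"Multiple datasets for scalar connection {attribute_name}")
            if len(values) == 0:
                continue  # leave the attribute unset, as in the listing
            values = values[0]
        setattr(refs, attribute_name, values)
    return refs


refs = unpack_refs(
    {"calexp": ("deepCoadd_calexp", False), "catalogs": ("src", True), "sky": ("skyMap", False)},
    {"deepCoadd_calexp": ["ref-a"], "src": ["ref-b", "ref-c"], "skyMap": []},
)
```

Scalar connections end up holding a single reference, multiple connections keep their list, and a scalar connection with no datasets is simply absent from the namespace.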
  

Member Data Documentation

◆ allConnections

lsst.pipe.base.connections.PipelineTaskConnections.allConnections

Definition at line 370 of file connections.py.

◆ config

lsst.pipe.base.connections.PipelineTaskConnections.config

Definition at line 375 of file connections.py.

◆ initInputs

lsst.pipe.base.connections.PipelineTaskConnections.initInputs

Definition at line 368 of file connections.py.

◆ initOutputs

lsst.pipe.base.connections.PipelineTaskConnections.initOutputs

Definition at line 369 of file connections.py.

◆ inputs

lsst.pipe.base.connections.PipelineTaskConnections.inputs

Definition at line 365 of file connections.py.

◆ outputs

lsst.pipe.base.connections.PipelineTaskConnections.outputs

Definition at line 367 of file connections.py.

◆ prerequisiteInputs

lsst.pipe.base.connections.PipelineTaskConnections.prerequisiteInputs

Definition at line 366 of file connections.py.

The documentation for this class was generated from the following file:

/j/snowflake/release/lsstsw/stack/lsst-scipipe-0.4.3/Linux64/pipe_base/22.0.1+94e66cc9ed/python/lsst/pipe/base/connections.py
