lsst.pipe.base.pipelineTask.PipelineTask Class Reference
Inheritance diagram for lsst.pipe.base.pipelineTask.PipelineTask:

- lsst.pipe.base.task.Task (base class)
- lsst.pipe.tasks.deblendCoaddSourcesPipeline.DeblendCoaddSourcesBaseTask (derived)
- lsst.pipe.tasks.deblendCoaddSourcesPipeline.DeblendCoaddSourcesMultiTask (derived)
- lsst.pipe.tasks.deblendCoaddSourcesPipeline.DeblendCoaddSourcesSingleTask (derived)

Public Member Functions

def __init__ (self, *, config=None, log=None, initInputs=None, **kwargs)
 
def run (self, **kwargs)
 
def runQuantum (self, ButlerQuantumContext butlerQC, InputQuantizedConnection inputRefs, OutputQuantizedConnection outputRefs)
 
def getResourceConfig (self)
 
def emptyMetadata (self)
 
def getSchemaCatalogs (self)
 
def getAllSchemaCatalogs (self)
 
def getFullMetadata (self)
 
def getFullName (self)
 
def getName (self)
 
def getTaskDict (self)
 
def makeSubtask (self, name, **keyArgs)
 
def timer (self, name, logLevel=Log.DEBUG)
 
def makeField (cls, doc)
 
def __reduce__ (self)
 

Public Attributes

 metadata
 
 log
 
 config
 

Static Public Attributes

bool canMultiprocess = True
 

Detailed Description

Base class for all pipeline tasks.

This is an abstract base class for PipelineTasks, which represent
algorithms executed by the framework on data retrieved from the data
butler; the resulting data are stored back in the data butler as well.

PipelineTask inherits from `pipe.base.Task` and uses the same
configuration mechanism based on `pex.config`. `PipelineTask` classes
also have a `PipelineTaskConnections` class associated with their
config, which defines all of the I/O a `PipelineTask` will need to do.
A `PipelineTask` subclass typically implements a `run()` method, which
receives Python-domain data objects and returns a `pipe.base.Struct`
object with the resulting data. The `run()` method is not supposed to
perform any I/O; it operates entirely on in-memory objects.
`runQuantum()` is the method (which can be re-implemented in a
subclass) where all necessary I/O is performed: it reads all input data
from the data butler into memory, calls the `run()` method with that
data, examines the returned `Struct` object, and saves some or all of
that data back to the data butler. The `runQuantum()` method receives a
`ButlerQuantumContext` instance to facilitate I/O, an
`InputQuantizedConnection` instance that defines all input
`lsst.daf.butler.DatasetRef`, and an `OutputQuantizedConnection`
instance that defines all the output `lsst.daf.butler.DatasetRef` for a
single invocation of the PipelineTask.

Subclasses must be constructable with exactly the arguments taken by the
PipelineTask base class constructor, but may support other signatures as
well.

Attributes
----------
canMultiprocess : bool, True by default (class attribute)
    This class attribute is checked by the execution framework;
    subclasses can set it to ``False`` if the task does not support
    multiprocessing.

Parameters
----------
config : `pex.config.Config`, optional
    Configuration for this task (an instance of ``self.ConfigClass``,
    which is a task-specific subclass of `PipelineTaskConfig`).
    If not specified then it defaults to `self.ConfigClass()`.
log : `lsst.log.Log`, optional
    Logger instance whose name is used as a log name prefix, or ``None``
    for no prefix.
initInputs : `dict`, optional
    A dictionary of objects needed to construct this PipelineTask, with
    keys matching the keys of the dictionary returned by
    `getInitInputDatasetTypes` and values equivalent to what would be
    obtained by calling `Butler.get` with those DatasetTypes and no data
    IDs.  While it is optional for the base class, subclasses are
    permitted to require this argument.
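
For concreteness, a minimal sketch of a complete `PipelineTask` subclass
is shown below. The connection names, dataset type names (``calexp``,
``exampleCatalog``), dimensions, and the ``measure`` helper are
hypothetical, chosen only to illustrate how the pieces fit together:

.. code-block:: python

    import lsst.pipe.base as pipeBase
    import lsst.pipe.base.connectionTypes as cT

    class ExampleConnections(pipeBase.PipelineTaskConnections,
                             dimensions=("instrument", "visit", "detector")):
        inputExposure = cT.Input(
            doc="Input exposure to process",
            name="calexp",
            storageClass="ExposureF",
            dimensions=("instrument", "visit", "detector"),
        )
        outputCatalog = cT.Output(
            doc="Catalog produced from the exposure",
            name="exampleCatalog",
            storageClass="SourceCatalog",
            dimensions=("instrument", "visit", "detector"),
        )

    class ExampleConfig(pipeBase.PipelineTaskConfig,
                        pipelineConnections=ExampleConnections):
        pass

    class ExampleTask(pipeBase.PipelineTask):
        ConfigClass = ExampleConfig
        _DefaultName = "example"

        def run(self, inputExposure):
            # Purely in-memory work; all butler I/O happens in runQuantum().
            catalog = self.measure(inputExposure)  # hypothetical helper
            return pipeBase.Struct(outputCatalog=catalog)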

Definition at line 32 of file pipelineTask.py.

Constructor & Destructor Documentation

◆ __init__()

def lsst.pipe.base.pipelineTask.PipelineTask.__init__ (self, *, config=None, log=None, initInputs=None, **kwargs)

Definition at line 85 of file pipelineTask.py.

def __init__(self, *, config=None, log=None, initInputs=None, **kwargs):
    super().__init__(config=config, log=log, **kwargs)
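
A hedged sketch of a subclass constructor that consumes ``initInputs``,
as permitted by the constructor contract above; the ``inputSchema`` key
is hypothetical and would correspond to an init-input connection defined
by the task:

.. code-block:: python

    def __init__(self, *, config=None, log=None, initInputs=None, **kwargs):
        super().__init__(config=config, log=log, initInputs=initInputs,
                         **kwargs)
        if initInputs is not None:
            # "inputSchema" is a hypothetical init-input connection name.
            self.schema = initInputs["inputSchema"].schema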

Member Function Documentation

◆ __reduce__()

def lsst.pipe.base.task.Task.__reduce__ (   self)
inherited
Pickler.

Reimplemented in lsst.pipe.drivers.multiBandDriver.MultiBandDriverTask, and lsst.pipe.drivers.coaddDriver.CoaddDriverTask.

Definition at line 432 of file task.py.

def __reduce__(self):
    """Pickler."""
    return self._unpickle_via_factory, (self.__class__, [], self._reduce_kwargs())

◆ emptyMetadata()

def lsst.pipe.base.task.Task.emptyMetadata (   self)
inherited
Empty (clear) the metadata for this Task and all sub-Tasks.

Definition at line 166 of file task.py.

def emptyMetadata(self):
    for subtask in self._taskDict.values():
        subtask.metadata = dafBase.PropertyList()

◆ getAllSchemaCatalogs()

def lsst.pipe.base.task.Task.getAllSchemaCatalogs (   self)
inherited
Get schema catalogs for all tasks in the hierarchy, combining the
results into a single dict.

Returns
-------
schemacatalogs : `dict`
    Keys are butler dataset type, values are an empty catalog (an
    instance of the appropriate `lsst.afw.table` Catalog type) for all
    tasks in the hierarchy, from the top-level task down
    through all subtasks.

Notes
-----
This method may be called on any task in the hierarchy; it will return
the same answer, regardless.

The default implementation should always suffice. If your subtask uses
schemas, then override `Task.getSchemaCatalogs`, not this method.

Definition at line 204 of file task.py.

def getAllSchemaCatalogs(self):
    schemaDict = self.getSchemaCatalogs()
    for subtask in self._taskDict.values():
        schemaDict.update(subtask.getSchemaCatalogs())
    return schemaDict

◆ getFullMetadata()

def lsst.pipe.base.task.Task.getFullMetadata (   self)
inherited
Get metadata for all tasks.

Returns
-------
metadata : `lsst.daf.base.PropertySet`
    The `~lsst.daf.base.PropertySet` keys are the full task name.
    Values are metadata for the top-level task and all subtasks,
    sub-subtasks, etc.

Notes
-----
The returned metadata includes timing information (if
``@timer.timeMethod`` is used) and any metadata set by the task. The
name of each item consists of the full task name with ``.`` replaced
by ``:``, followed by ``.`` and the name of the item, e.g.::

    topLevelTaskName:subtaskName:subsubtaskName.itemName

Using ``:`` in the full task name disambiguates the rare situation in
which a task has a subtask and a metadata item with the same name.

Definition at line 229 of file task.py.

def getFullMetadata(self):
    fullMetadata = dafBase.PropertySet()
    for fullName, task in self.getTaskDict().items():
        fullMetadata.set(fullName.replace(".", ":"), task.metadata)
    return fullMetadata
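
A small usage sketch, assuming ``task`` is a task instance that has
already run:

.. code-block:: python

    md = task.getFullMetadata()
    # Task levels are separated by ":", item names by ".", e.g.
    # "topLevelTaskName:subtaskName.itemName".
    for name in md.names(topLevelOnly=False):
        print(name)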

◆ getFullName()

def lsst.pipe.base.task.Task.getFullName (   self)
inherited
Get the task name as a hierarchical name including parent task
names.

Returns
-------
fullName : `str`
    The full name consists of the name of the parent task and each
    subtask separated by periods. For example:

    - The full name of top-level task "top" is simply "top".
    - The full name of subtask "sub" of top-level task "top" is
      "top.sub".
    - The full name of subtask "sub2" of subtask "sub" of top-level
      task "top" is "top.sub.sub2".

Definition at line 256 of file task.py.

def getFullName(self):
    return self._fullName

◆ getName()

def lsst.pipe.base.task.Task.getName (   self)
inherited
Get the name of the task.

Returns
-------
taskName : `str`
    Name of the task.

See also
--------
getFullName

Definition at line 274 of file task.py.

def getName(self):
    return self._name

◆ getResourceConfig()

def lsst.pipe.base.pipelineTask.PipelineTask.getResourceConfig (   self)
Return resource configuration for this task.

Returns
-------
Object of type `~config.ResourceConfig` or ``None`` if resource
configuration is not defined for this task.

Definition at line 158 of file pipelineTask.py.

def getResourceConfig(self):
    return getattr(self.config, "resources", None)

◆ getSchemaCatalogs()

def lsst.pipe.base.task.Task.getSchemaCatalogs (   self)
inherited
Get the schemas generated by this task.

Returns
-------
schemaCatalogs : `dict`
    Keys are butler dataset type, values are an empty catalog (an
    instance of the appropriate `lsst.afw.table` Catalog type) for
    this task.

Notes
-----

.. warning::

   Subclasses that use schemas must override this method. The default
   implementation returns an empty dict.

This method may be called at any time after the Task is constructed,
which means that all task schemas should be computed at construction
time, *not* when data is actually processed. This reflects the
philosophy that the schema should not depend on the data.

Returning catalogs rather than just schemas allows us to save e.g.
slots for SourceCatalog as well.

See also
--------
Task.getAllSchemaCatalogs

Definition at line 172 of file task.py.

def getSchemaCatalogs(self):
    return {}
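
A sketch of an override for a subclass that writes a source catalog; the
dataset type key ``"src"`` and the ``self.schema`` attribute are
illustrative assumptions, not part of the base API:

.. code-block:: python

    import lsst.afw.table as afwTable

    def getSchemaCatalogs(self):
        # An empty catalog carrying this task's output schema, keyed by
        # the butler dataset type it will be saved as.
        catalog = afwTable.SourceCatalog(self.schema)
        return {"src": catalog}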

◆ getTaskDict()

def lsst.pipe.base.task.Task.getTaskDict (   self)
inherited
Get a dictionary of all tasks as a shallow copy.

Returns
-------
taskDict : `dict`
    Dictionary containing full task name: task object for the top-level
    task and all subtasks, sub-subtasks, etc.

Definition at line 288 of file task.py.

def getTaskDict(self):
    return self._taskDict.copy()

◆ makeField()

def lsst.pipe.base.task.Task.makeField (   cls,
  doc 
)
inherited
Make a `lsst.pex.config.ConfigurableField` for this task.

Parameters
----------
doc : `str`
    Help text for the field.

Returns
-------
configurableField : `lsst.pex.config.ConfigurableField`
    A `~ConfigurableField` for this task.

Examples
--------
Provides a convenient way to specify this task is a subtask of another
task.

Here is an example of use:

.. code-block:: python

    class OtherTaskConfig(lsst.pex.config.Config):
        aSubtask = ATaskClass.makeField("brief description of task")

Definition at line 359 of file task.py.

def makeField(cls, doc):
    return ConfigurableField(doc=doc, target=cls)

◆ makeSubtask()

def lsst.pipe.base.task.Task.makeSubtask (   self,
  name,
**  keyArgs 
)
inherited
Create a subtask as a new instance, assigned as the ``name`` attribute
of this task.

Parameters
----------
name : `str`
    Brief name of the subtask.
keyArgs
    Extra keyword arguments used to construct the task. The following
    arguments are automatically provided and cannot be overridden:

    - "config".
    - "parentTask".

Notes
-----
The subtask must be defined by ``Task.config.name``, an instance of
`~lsst.pex.config.ConfigurableField` or
`~lsst.pex.config.RegistryField`.

Definition at line 299 of file task.py.

def makeSubtask(self, name, **keyArgs):
    taskField = getattr(self.config, name, None)
    if taskField is None:
        raise KeyError(f"{self.getFullName()}'s config does not have field {name!r}")
    subtask = taskField.apply(name=name, parentTask=self, **keyArgs)
    setattr(self, name, subtask)
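
The usual pattern, sketched with hypothetical names (``ChildTask`` is
assumed to be a `Task` subclass defined elsewhere): the parent task's
config declares the subtask with `makeField`, and the parent constructor
instantiates it with `makeSubtask`:

.. code-block:: python

    import lsst.pex.config as pexConfig
    from lsst.pipe.base import Task

    class ParentConfig(pexConfig.Config):
        child = ChildTask.makeField("subtask that performs one step")

    class ParentTask(Task):
        ConfigClass = ParentConfig
        _DefaultName = "parent"

        def __init__(self, **kwargs):
            super().__init__(**kwargs)
            self.makeSubtask("child")
            # self.child is now a constructed ChildTask instance.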

◆ run()

def lsst.pipe.base.pipelineTask.PipelineTask.run (   self,
**  kwargs 
)
Run the task algorithm on in-memory data.

This method should be implemented in a subclass. It receives keyword
arguments whose names match the names of the connection fields
describing input dataset types. Argument values are data objects
retrieved from the data butler. If a dataset type is configured with
the ``multiple`` field set to ``True``, then the argument value will be
a list of objects; otherwise it will be a single object.

If the task needs to know its input or output DataIds, then it must
override the `runQuantum` method instead.

This method should return a `Struct` whose attributes share the same
names as the connection fields describing output dataset types.

Returns
-------
struct : `Struct`
    Struct with attribute names corresponding to the output connection
    fields.

Examples
--------
A typical implementation of this method may look like this:

.. code-block:: python

    def run(self, input, calib):
        # "input", "calib", and "output" are the names of the config
        # fields

        # Assuming that input/calib datasets are `scalar` they are
        # simple objects, do something with inputs and calibs, produce
        # output image.
        image = self.makeImage(input, calib)

        # If output dataset is `scalar` then return object, not list
        return Struct(output=image)

Definition at line 88 of file pipelineTask.py.

def run(self, **kwargs):
    raise NotImplementedError("run() is not implemented")

◆ runQuantum()

def lsst.pipe.base.pipelineTask.PipelineTask.runQuantum (self, ButlerQuantumContext butlerQC, InputQuantizedConnection inputRefs, OutputQuantizedConnection outputRefs)
Do butler I/O, and any needed transforms, to provide in-memory objects
for the task's `run` method.

Parameters
----------
butlerQC : `ButlerQuantumContext`
    A butler which is specialized to operate in the context of a
    `lsst.daf.butler.Quantum`.
inputRefs : `InputQuantizedConnection`
    Data structure whose attribute names are the names that identify
    connections defined in the corresponding `PipelineTaskConnections`
    class. The values of these attributes are the
    `lsst.daf.butler.DatasetRef` objects associated with the defined
    input/prerequisite connections.
outputRefs : `OutputQuantizedConnection`
    Data structure whose attribute names are the names that identify
    connections defined in the corresponding `PipelineTaskConnections`
    class. The values of these attributes are the
    `lsst.daf.butler.DatasetRef` objects associated with the defined
    output connections.

Definition at line 131 of file pipelineTask.py.

def runQuantum(self, butlerQC: ButlerQuantumContext,
               inputRefs: InputQuantizedConnection,
               outputRefs: OutputQuantizedConnection):
    inputs = butlerQC.get(inputRefs)
    outputs = self.run(**inputs)
    butlerQC.put(outputs, outputRefs)
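
The default implementation above simply fetches all inputs, calls `run`,
and stores the outputs. A subclass that needs its data IDs (see the
`run` documentation) can override it along these lines; the
``inputExposure`` connection name and the ``visit`` keyword are
hypothetical:

.. code-block:: python

    def runQuantum(self, butlerQC, inputRefs, outputRefs):
        inputs = butlerQC.get(inputRefs)
        # Each DatasetRef carries the data ID for its connection.
        visit = inputRefs.inputExposure.dataId["visit"]
        outputs = self.run(**inputs, visit=visit)
        butlerQC.put(outputs, outputRefs)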

◆ timer()

def lsst.pipe.base.task.Task.timer (   self,
  name,
  logLevel = Log.DEBUG 
)
inherited
Context manager to log performance data for an arbitrary block of
code.

Parameters
----------
name : `str`
    Name of code being timed; data will be logged using item name:
    ``Start`` and ``End``.
logLevel
    A `lsst.log` level constant.

Examples
--------
Creating a timer context:

.. code-block:: python

    with self.timer("someCodeToTime"):
        pass  # code to time

See also
--------
timer.logInfo

Definition at line 327 of file task.py.

@contextmanager
def timer(self, name, logLevel=Log.DEBUG):
    logInfo(obj=self, prefix=name + "Start", logLevel=logLevel)
    try:
        yield
    finally:
        logInfo(obj=self, prefix=name + "End", logLevel=logLevel)

Member Data Documentation

◆ canMultiprocess

bool lsst.pipe.base.pipelineTask.PipelineTask.canMultiprocess = True
static

Definition at line 83 of file pipelineTask.py.
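
A subclass that does not support multiprocessing overrides the
attribute; a minimal sketch:

.. code-block:: python

    class ForkUnsafeTask(PipelineTask):
        # Tell the execution framework not to run this task with
        # multiprocessing (e.g. it holds fork-unsafe resources).
        canMultiprocess = False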

◆ config

lsst.pipe.base.task.Task.config
inherited

Definition at line 162 of file task.py.

◆ log

lsst.pipe.base.task.Task.log
inherited

Definition at line 161 of file task.py.

◆ metadata

lsst.pipe.base.task.Task.metadata
inherited

Definition at line 134 of file task.py.


The documentation for this class was generated from the following file:

- pipelineTask.py