LSSTApplications  10.0-2-g4f67435,11.0.rc2+1,11.0.rc2+12,11.0.rc2+3,11.0.rc2+4,11.0.rc2+5,11.0.rc2+6,11.0.rc2+7,11.0.rc2+8
LSSTDataManagementBasePackage
task.py
from __future__ import absolute_import, division
#
# LSST Data Management System
# Copyright 2008, 2009, 2010, 2011 LSST Corporation.
#
# This product includes software developed by the
# LSST Project (http://www.lsst.org/).
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the LSST License Statement and
# the GNU General Public License along with this program. If not,
# see <http://www.lsstcorp.org/LegalNotices/>.
#
import contextlib

import lsstDebug
from lsst.pex.config import ConfigurableField

try:
    import lsst.afw.display.ds9 as ds9
except ImportError:
    # afw is above pipe_base in the package dependency hierarchy, so we have to cope without it.
    # We'll warn on first use that it's unavailable, and then quietly swallow all
    # references to it.
    class Ds9Warning(object):
        """A null-object pattern that warns once that ds9 is not available"""
        def __init__(self):
            super(Ds9Warning, self).__setattr__("_warned", False)
        def __getattr__(self, name):
            if name in ("GREEN", "YELLOW", "RED", "BLUE"):
                # These are used in the Task.display definition, so don't warn when we use them
                return self
            if not super(Ds9Warning, self).__getattribute__("_warned"):
                print "WARNING: afw's ds9 is not available"
                super(Ds9Warning, self).__setattr__("_warned", True)
            return self
        def __setattr__(self, name, value):
            return self
        def __call__(self, *args, **kwargs):
            return self
    ds9 = Ds9Warning()

import lsst.pex.logging as pexLog
import lsst.daf.base as dafBase
from .timer import logInfo

__all__ = ["Task", "TaskError"]

## default ds9 colors for Task.display's ctypes argument
_DefaultDS9CTypes = (ds9.GREEN, ds9.YELLOW, ds9.RED, ds9.BLUE)

## default ds9 point types for Task.display's ptypes argument
_DefaultDS9PTypes = ("o", "+", "x", "*")

class TaskError(Exception):
    """!Use to report errors for which a traceback is not useful.

    Examples of such errors:
    - processCcd is asked to run detection, but not calibration, and no calexp is found.
    - coadd finds no valid images in the specified patch.
    """
    pass

class Task(object):
    """!Base class for data processing tasks

    See \ref pipeBase_introduction "pipe_base introduction" to learn what tasks are,
    and \ref pipeTasks_writeTask "how to write a task" for more information about writing tasks.
    If the second link is broken (as it will be before the documentation is cross-linked)
    then look at the main page of pipe_tasks documentation for a link.

    Useful attributes include:
    * log: an lsst.pex.logging.Log
    * config: task-specific configuration; an instance of ConfigClass (see below)
    * metadata: an lsst.daf.base.PropertyList for collecting task-specific metadata,
        e.g. data quality and performance metrics. This is data that is only meant to be
        persisted, never to be used by the task.

    Subclasses typically have a method named "run" to perform the main data processing. Details:
    * run should process the minimum reasonable amount of data, typically a single CCD.
      Iteration, if desired, is performed by a caller of the run method. This is good design and allows
      multiprocessing without the run method having to support it directly.
    * If "run" can persist or unpersist data:
        * "run" should accept a butler data reference (or a collection of data references, if appropriate,
            e.g. coaddition).
        * There should be a way to run the task without persisting data. Typically the run method returns all
            data, even if it is persisted, and the task's config offers a flag to disable persistence.

    \deprecated Tasks other than cmdLineTask.CmdLineTask%s should \em not accept a blob such as a butler data
    reference. How we will handle data references is still TBD, so don't make changes yet! RHL 2014-06-27

    Subclasses must also have an attribute ConfigClass that is a subclass of lsst.pex.config.Config
    which configures the task. Subclasses should also have an attribute _DefaultName:
    the default name if there is no parent task. _DefaultName is required for subclasses of
    \ref cmdLineTask.CmdLineTask "CmdLineTask" and recommended for subclasses of Task because it simplifies
    construction (e.g. for unit tests).

    Tasks intended to be run from the command line should be subclasses of \ref cmdLineTask.CmdLineTask
    "CmdLineTask", not Task.
    """
    def __init__(self, config=None, name=None, parentTask=None, log=None):
        """!Create a Task

        @param[in] config configuration for this task (an instance of self.ConfigClass,
            which is a task-specific subclass of lsst.pex.config.Config), or None. If None:
            - If parentTask specified then defaults to parentTask.config.<name>
            - If parentTask is None then defaults to self.ConfigClass()
        @param[in] name brief name of task, or None; if None then defaults to self._DefaultName
        @param[in] parentTask the parent task of this subtask, if any.
            - If None (a top-level task) then name defaults to self._DefaultName
              and config defaults to self.ConfigClass().
            - If not None (a subtask) then you must specify name
        @param[in] log pexLog log; if None then the default is used;
            in either case a copy is made using the full task name.

        @throw RuntimeError if parentTask is not None and name is None.
        @throw RuntimeError if parentTask is None, name is None and _DefaultName does not exist.
        """
        self.metadata = dafBase.PropertyList()

        if parentTask is not None:
            if name is None:
                raise RuntimeError("name is required for a subtask")
            self._name = name
            self._fullName = parentTask._computeFullName(name)
            if config is None:
                config = getattr(parentTask.config, name)
            self._taskDict = parentTask._taskDict
        else:
            if name is None:
                name = getattr(self, "_DefaultName", None)
                if name is None:
                    raise RuntimeError("name is required for a task unless it has attribute _DefaultName")
            self._name = name
            self._fullName = self._name
            if config is None:
                config = self.ConfigClass()
            self._taskDict = dict()

        self.config = config
        if log is None:
            log = pexLog.getDefaultLog()
        self.log = pexLog.Log(log, self._fullName)
        self._display = lsstDebug.Info(self.__module__).display
        self._taskDict[self._fullName] = self

    def emptyMetadata(self):
        """!Empty (clear) the metadata for this Task and all sub-Tasks."""
        for subtask in self._taskDict.itervalues():
            subtask.metadata = dafBase.PropertyList()

    def getSchemaCatalogs(self):
        """!Return the schemas generated by this task

        @warning Subclasses that use schemas must override this method. The default implementation
        returns an empty dict.

        @return a dict of butler dataset type: empty catalog (an instance of the appropriate
            lsst.afw.table Catalog type) for this task

        This method may be called at any time after the Task is constructed, which means that
        all task schemas should be computed at construction time, __not__ when data is actually
        processed. This reflects the philosophy that the schema should not depend on the data.

        Returning catalogs rather than just schemas allows us to save e.g. slots for SourceCatalog as well.

        See also Task.getAllSchemaCatalogs
        """
        return {}

    def getAllSchemaCatalogs(self):
        """!Call getSchemaCatalogs() on all tasks in the hierarchy, combining the results into a single dict.

        @return a dict of butler dataset type: empty catalog (an instance of the appropriate
            lsst.afw.table Catalog type) for all tasks in the hierarchy, from the top-level task down
            through all subtasks

        This method may be called on any task in the hierarchy; it will return the same answer, regardless.

        The default implementation should always suffice. If your subtask uses schemas then override
        Task.getSchemaCatalogs, not this method.
        """
        schemaDict = self.getSchemaCatalogs()
        for subtask in self._taskDict.itervalues():
            schemaDict.update(subtask.getSchemaCatalogs())
        return schemaDict

    def getFullMetadata(self):
        """!Get metadata for all tasks

        The returned metadata includes timing information (if \@timer.timeMethod is used)
        and any metadata set by the task. The name of each item consists of the full task name
        with "." replaced by ":", followed by "." and the name of the item, e.g.:
            topLevelTaskName:subtaskName:subsubtaskName.itemName
        Using ":" in the full task name disambiguates the rare situation that a task has a subtask
        and a metadata item with the same name.

        @return metadata: an lsst.daf.base.PropertySet containing full task name: metadata
            for the top-level task and all subtasks, sub-subtasks, etc.
        """
        fullMetadata = dafBase.PropertySet()
        for fullName, task in self.getTaskDict().iteritems():
            fullMetadata.set(fullName.replace(".", ":"), task.metadata)
        return fullMetadata

    def getFullName(self):
        """!Return the task name as a hierarchical name including parent task names

        The full name consists of the name of the parent task and each subtask separated by periods.
        For example:
        - The full name of top-level task "top" is simply "top"
        - The full name of subtask "sub" of top-level task "top" is "top.sub"
        - The full name of subtask "sub2" of subtask "sub" of top-level task "top" is "top.sub.sub2".
        """
        return self._fullName

    def getName(self):
        """!Return the name of the task

        See getFullName to get a hierarchical name including parent task names
        """
        return self._name

    def getTaskDict(self):
        """!Return a dictionary of all tasks as a shallow copy.

        @return taskDict: a dict containing full task name: task object
            for the top-level task and all subtasks, sub-subtasks, etc.
        """
        return self._taskDict.copy()

    def makeSubtask(self, name, **keyArgs):
        """!Create a subtask as a new instance self.<name>

        The subtask must be defined by self.config.<name>, an instance of pex_config ConfigurableField.

        @param name brief name of subtask
        @param **keyArgs extra keyword arguments used to construct the task.
            The following arguments are automatically provided and cannot be overridden:
            "config" and "parentTask".
        """
        configurableField = getattr(self.config, name, None)
        if configurableField is None:
            # Call getFullName() so the message shows the task name rather than a bound-method repr.
            raise KeyError("%s's config does not have field %r" % (self.getFullName(), name))
        subtask = configurableField.apply(name=name, parentTask=self, **keyArgs)
        setattr(self, name, subtask)

    @contextlib.contextmanager
    def timer(self, name, logLevel = pexLog.Log.DEBUG):
        """!Context manager to log performance data for an arbitrary block of code

        @param[in] name name of code being timed;
            data will be logged using item name: <name>Start<item> and <name>End<item>
        @param[in] logLevel one of the lsst.pex.logging.Log level constants

        Example of use:
        \code
        with self.timer("someCodeToTime"):
            ...code to time...
        \endcode

        See timer.logInfo for the information logged
        """
        logInfo(obj = self, prefix = name + "Start", logLevel = logLevel)
        try:
            yield
        finally:
            logInfo(obj = self, prefix = name + "End", logLevel = logLevel)

    def display(self, name, exposure=None, sources=(), matches=None,
                ctypes=_DefaultDS9CTypes, ptypes=_DefaultDS9PTypes,
                sizes=(4,),
                pause=None, prompt=None):
        """!Display an exposure and/or sources

        @warning This method is deprecated. New code should call lsst.afw.display.ds9 directly.

        @param[in] name name of product to display
        @param[in] exposure exposure to display (instance of lsst::afw::image::Exposure), or None
        @param[in] sources list of Sources to display, as a single lsst.afw.table.SourceCatalog
            or a list of lsst.afw.table.SourceCatalog,
            or an empty list to not display sources
        @param[in] matches list of source matches to display (instances of
            lsst.afw.table.ReferenceMatch), or None;
            if any matches are specified then exposure must be provided and have a lsst.afw.image.Wcs.
        @param[in] ctypes array of colors to use on ds9 for displaying sources and matches
            (in that order).
            ctypes is indexed as follows, where ctypes is repeatedly cycled through, if necessary:
            - ctypes[i] is used to display sources[i]
            - ctypes[len(sources)] is used to display the first element of every match
            - ctypes[len(sources) + 1] is used to display the second element of every match
        @param[in] ptypes array of ptypes to use on ds9 for displaying sources and matches;
            indexed like ctypes
        @param[in] sizes array of sizes to use on ds9 for displaying sources and matches;
            indexed like ctypes; sizes are doubled for matches
        @param[in] pause pause execution?
        @param[in] prompt prompt for user while paused (ignored if pause is False)

        @warning if matches are specified and exposure has no lsst.afw.image.Wcs then the matches are
        silently not shown.

        @throw Exception if matches specified and exposure is None
        """
        # N.b. doxygen will complain about parameters like ds9 and RED not being documented. Bug ID 732356
        if not self._display or self._display < 0:
            return
        if isinstance(self._display, dict):
            if (name not in self._display) or not self._display[name] or self._display[name] < 0:
                return

        if isinstance(self._display, int):
            frame = self._display
        elif isinstance(self._display, dict):
            frame = self._display[name]
        else:
            frame = 1

        if exposure:
            if isinstance(exposure, list):
                raise RuntimeError("exposure may not be a list")
            mi = exposure.getMaskedImage()
            ds9.mtv(exposure, frame=frame, title=name)
            x0, y0 = mi.getX0(), mi.getY0()
        else:
            x0, y0 = 0, 0

        try:
            sources[0][0]
        except IndexError: # empty list
            pass
        except (TypeError, NotImplementedError): # not a list of sets of sources
            sources = [sources]

        with ds9.Buffering():
            i = 0
            for i, ss in enumerate(sources):
                ctype = ctypes[i%len(ctypes)]
                ptype = ptypes[i%len(ptypes)]
                size = sizes[i%len(sizes)]

                for source in ss:
                    xc, yc = source.getX() - x0, source.getY() - y0
                    ds9.dot(ptype, xc, yc, size=size, frame=frame, ctype=ctype)
                    #try:
                    #    mag = 25-2.5*math.log10(source.getPsfFlux())
                    #    if mag > 15: continue
                    #except: continue
                    #ds9.dot("%.1f" % mag, xc, yc, frame=frame, ctype="red")

        if matches and exposure.getWcs() is not None:
            wcs = exposure.getWcs()
            with ds9.Buffering():
                for first, second, d in matches:
                    i = len(sources) # counter for ptypes/ctypes, starting one after number of source lists
                    catPos = wcs.skyToPixel(first.getCoord())
                    x1, y1 = catPos.getX() - x0, catPos.getY() - y0

                    ctype = ctypes[i%len(ctypes)]
                    ptype = ptypes[i%len(ptypes)]
                    size = 2*sizes[i%len(sizes)]
                    ds9.dot(ptype, x1, y1, size=size, frame=frame, ctype=ctype)
                    i += 1

                    ctype = ctypes[i%len(ctypes)]
                    ptype = ptypes[i%len(ptypes)]
                    size = 2*sizes[i%len(sizes)]
                    x2, y2 = second.getX() - x0, second.getY() - y0
                    ds9.dot(ptype, x2, y2, size=size, frame=frame, ctype=ctype)
                    i += 1

        if pause:
            if prompt is None:
                prompt = "%s: Enter or c to continue [chp]: " % name
            while True:
                ans = raw_input(prompt).lower()
                if ans in ("", "c",):
                    break
                if ans in ("p",):
                    import pdb; pdb.set_trace()
                elif ans in ("h", ):
                    print "h[elp] c[ontinue] p[db]"

    @classmethod
    def makeField(cls, doc):
        """!Make an lsst.pex.config.ConfigurableField for this task

        Provides a convenient way to specify that this task is a subtask of another task.
        Here is an example of use:
        \code
        class OtherTaskConfig(lsst.pex.config.Config):
            aSubtask = ATaskClass.makeField("a brief description of what this task does")
        \endcode

        @param[in] cls this class
        @param[in] doc help text for the field
        @return a lsst.pex.config.ConfigurableField for this task
        """
        return ConfigurableField(doc=doc, target=cls)

    def _computeFullName(self, name):
        """!Compute the full name of a subtask or metadata item, given its brief name

        For example: if the full name of this task is "top.sub.sub2"
        then _computeFullName("subname") returns "top.sub.sub2.subname".

        @param[in] name brief name of subtask or metadata item
        @return the full name: the "name" argument prefixed by the full task name and a period.
        """
        return "%s.%s" % (self._fullName, name)
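
The naming and registry scheme used by Task.__init__, getTaskDict and getFullMetadata can be tried out without the LSST stack. The following is a minimal sketch, not the library's implementation. MiniTask is a hypothetical stand-in class, and a plain dict replaces lsst.daf.base.PropertyList and PropertySet. It shows only how subtasks extend the parent's full name, share one task dictionary, and have "." replaced by ":" in metadata keys.

```python
class MiniTask(object):
    """Hypothetical stand-in for Task: keeps only the naming and the shared registry."""

    def __init__(self, name, parentTask=None):
        self._name = name
        if parentTask is not None:
            # A subtask extends its parent's full name and shares its registry.
            self._fullName = "%s.%s" % (parentTask._fullName, name)
            self._taskDict = parentTask._taskDict
        else:
            # A top-level task starts a new registry.
            self._fullName = name
            self._taskDict = dict()
        self.metadata = {}  # assumption: a plain dict instead of a PropertyList
        self._taskDict[self._fullName] = self

    def getFullMetadata(self):
        # Replace "." with ":" in each full task name, as Task.getFullMetadata does.
        return dict((fullName.replace(".", ":"), task.metadata)
                    for fullName, task in self._taskDict.items())


top = MiniTask("top")
sub = MiniTask("sub", parentTask=top)
sub2 = MiniTask("sub2", parentTask=sub)
sub2.metadata["nSources"] = 42

print(sorted(top._taskDict))            # ['top', 'top.sub', 'top.sub.sub2']
print(sorted(sub2.getFullMetadata()))   # ['top', 'top:sub', 'top:sub:sub2']
```

Because every task in a hierarchy holds a reference to the same dictionary, any task returns the same full metadata, which matches the note in getAllSchemaCatalogs that the result does not depend on which task is asked.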
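
The Task.timer method relies on contextlib.contextmanager with a try/finally around the yield, so the "End" record is written even if the timed block raises. The sketch below illustrates that pattern on its own. The function timed and the events list are hypothetical names introduced for illustration, and time.time() stands in for whatever timer.logInfo actually records.

```python
import contextlib
import time


@contextlib.contextmanager
def timed(events, name):
    """Append (label, timestamp) records before and after the managed block."""
    events.append((name + "Start", time.time()))
    try:
        yield
    finally:
        # Runs on normal exit and when the block raises.
        events.append((name + "End", time.time()))


events = []
try:
    with timed(events, "someCodeToTime"):
        raise ValueError("failure inside the timed block")
except ValueError:
    pass

print([label for label, _ in events])  # ['someCodeToTimeStart', 'someCodeToTimeEnd']
```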