LSSTApplications
10.0+286,10.0+36,10.0+46,10.0-2-g4f67435,10.1+152,10.1+37,11.0,11.0+1,11.0-1-g47edd16,11.0-1-g60db491,11.0-1-g7418c06,11.0-2-g04d2804,11.0-2-g68503cd,11.0-2-g818369d,11.0-2-gb8b8ce7
LSSTDataManagementBasePackage
Run a command-line task, using multiprocessing if requested.
Public Member Functions
def __init__ | Construct a TaskRunner.
def prepareForMultiProcessing | Prepare this instance for multiprocessing by removing optional non-picklable elements.
def run | Run the task on all targets.
def makeTask | Create a Task instance.
def precall | Hook for code that should run exactly once, before multiprocessing is invoked.
def __call__ | Run the Task on a single target.
Static Public Member Functions
def getTargetList | Return a list of (dataRef, kwargs) to be used as arguments for TaskRunner.__call__.
Public Attributes
TaskClass
doReturnResults
config
log
doRaise
clobberConfig
numProcesses
timeout
Static Public Attributes
int TIMEOUT = 9999
Run a command-line task, using multiprocessing if requested.
Each command-line task (subclass of CmdLineTask) has a task runner. By default it is this class, but some tasks require a subclass. See the manual "how to write a command-line task" in the pipe_tasks documentation for more information. See CmdLineTask.parseAndRun to see how a task runner is used.
You may use this task runner for your command-line task if your task has a run method that takes exactly one argument: a butler data reference. Otherwise you must provide a task-specific subclass of this runner for your task's RunnerClass that overrides TaskRunner.getTargetList and possibly TaskRunner.__call__. See TaskRunner.getTargetList for details.
This design matches the common pattern for command-line tasks: the run method takes a single data reference, under whatever name suits the task. Additional arguments are rare and, if present, require a subclass of TaskRunner that passes these additional arguments by name.
Instances of this class must be picklable in order to be compatible with multiprocessing. If multiprocessing is requested (parsedCmd.numProcesses > 1) then run() calls prepareForMultiProcessing to jettison optional non-picklable elements. If your task runner is not compatible with multiprocessing then indicate this in your task by setting class variable canMultiprocess=False.
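The picklability requirement can be illustrated with a minimal sketch. SketchRunner below is a hypothetical stand-in, not the real TaskRunner, and it assumes that the log is the optional non-picklable element; a threading.Lock plays that role here only because it reliably refuses to be pickled.

```python
import pickle
import threading


class SketchRunner:
    """Hypothetical stand-in for a task runner (not the LSST class)."""

    def __init__(self, config, numProcesses):
        self.config = config
        self.numProcesses = numProcesses
        # Assumption: the log is the optional non-picklable element.
        # A lock is used as a placeholder because it cannot be pickled.
        self.log = threading.Lock()

    def prepareForMultiProcessing(self):
        # Drop the element that cannot be pickled.
        self.log = None


def is_picklable(obj):
    """Return True if pickle.dumps(obj) succeeds, False otherwise."""
    try:
        pickle.dumps(obj)
    except Exception:
        return False
    return True


runner = SketchRunner(config={"doWrite": True}, numProcesses=2)
before = is_picklable(runner)      # the lock blocks pickling
if runner.numProcesses > 1:
    runner.prepareForMultiProcessing()
after = is_picklable(runner)       # picklable once the lock is gone
```

A check like is_picklable can be a quick way to find out whether a custom runner needs canMultiprocess=False.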
Due to a Python bug [1], handling a KeyboardInterrupt properly requires specifying a timeout [2]. This timeout (in seconds) can be specified as the "timeout" element in the output from ArgumentParser (the "parsedCmd"), if available; otherwise TaskRunner.TIMEOUT is used.
[1] http://bugs.python.org/issue8296
[2] http://stackoverflow.com/questions/1408356/keyboard-interrupts-with-pythons-multiprocessing-pool
Definition at line 99 of file cmdLineTask.py.
def lsst.pipe.base.cmdLineTask.TaskRunner.__init__(self, TaskClass, parsedCmd, doReturnResults = False)
Construct a TaskRunner.
TaskClass | The class of the task to run |
parsedCmd | The parsed command-line arguments, as returned by the task's argument parser's parse_args method. |
doReturnResults | Should run return the collected result from each invocation of the task? This is only intended for unit tests and similar use. It can easily exhaust memory (if the task returns enough data and you call it enough times) and it will fail when using multiprocessing if the returned data cannot be pickled. |
Raises ImportError if multiprocessing is requested (and the task supports it) but the multiprocessing library cannot be imported.
Definition at line 130 of file cmdLineTask.py.
def lsst.pipe.base.cmdLineTask.TaskRunner.__call__(self, args)
Run the Task on a single target.
This default implementation assumes that 'args' is a tuple containing a data reference and a dict of keyword arguments.
args | Arguments for Task.run() |
Definition at line 293 of file cmdLineTask.py.
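As a rough illustration of the unpacking described for __call__, the sketch below uses hypothetical stand-ins (EchoTask, SketchRunner). The handling of doRaise and doReturnResults is an assumption about what those attributes mean, not a copy of the real method.

```python
class EchoTask:
    """Hypothetical task whose run method takes a data reference plus keywords."""

    def run(self, dataRef, **kwargs):
        if dataRef is None:
            raise ValueError("no data reference")
        return (dataRef, sorted(kwargs))


class SketchRunner:
    """Hypothetical stand-in showing how __call__ might unpack its argument."""

    def __init__(self, TaskClass, doRaise=False, doReturnResults=False):
        self.TaskClass = TaskClass
        self.doRaise = doRaise
        self.doReturnResults = doReturnResults

    def makeTask(self, parsedCmd=None, args=None):
        return self.TaskClass()

    def __call__(self, args):
        dataRef, kwargs = args          # one element of the target list
        task = self.makeTask(args=args)
        result = None
        try:
            result = task.run(dataRef, **kwargs)
        except Exception:
            if self.doRaise:            # assumed meaning: propagate failures
                raise
            # otherwise swallow the error so remaining targets still run
        if self.doReturnResults:
            return result
        return None
```

Returning None unless doReturnResults is set keeps the amount of data shipped back from worker processes small.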
def lsst.pipe.base.cmdLineTask.TaskRunner.getTargetList(parsedCmd, **kwargs) | static
Return a list of (dataRef, kwargs) to be used as arguments for TaskRunner.__call__.
parsedCmd | the parsed command object (an argparse.Namespace) returned by ArgumentParser.parse_args. |
**kwargs | any additional keyword arguments. In the default TaskRunner this is an empty dict, but having it simplifies overriding TaskRunner for tasks whose run method takes additional arguments (see case (1) below). |
The default implementation of TaskRunner.getTargetList and TaskRunner.__call__ works for any command-line task whose run method takes exactly one argument: a data reference. Otherwise you must provide a variant of TaskRunner that overrides TaskRunner.getTargetList and possibly TaskRunner.__call__. There are two cases:
(1) If your command-line task has a run method that takes one data reference followed by additional arguments, then you need only override TaskRunner.getTargetList to return the additional arguments as an argument dict. To make this easier, your overridden version of getTargetList may call TaskRunner.getTargetList with the extra arguments as keyword arguments. For example, the following adds an argument dict containing a single key: "calExpList", whose value is the list of data IDs for the calexp ID argument:
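A minimal sketch of such an override, under stated assumptions: BaseRunner is a hypothetical stand-in for TaskRunner, and the attribute names parsedCmd.id.refList and parsedCmd.calexp.idList are assumed for illustration rather than confirmed by this page.

```python
from types import SimpleNamespace


class BaseRunner:
    """Hypothetical stand-in for TaskRunner, reduced to getTargetList."""

    @staticmethod
    def getTargetList(parsedCmd, **kwargs):
        # Assumed default: pair every data reference with the same kwargs dict.
        return [(ref, kwargs) for ref in parsedCmd.id.refList]


class CalexpRunner(BaseRunner):
    @staticmethod
    def getTargetList(parsedCmd, **kwargs):
        # Forward the extra argument by keyword; the base class packs the dict.
        return BaseRunner.getTargetList(
            parsedCmd, calExpList=parsedCmd.calexp.idList, **kwargs)


# Fake parsed command; the attribute names are assumptions.
parsedCmd = SimpleNamespace(
    id=SimpleNamespace(refList=["ref0", "ref1"]),
    calexp=SimpleNamespace(idList=[{"visit": 1}, {"visit": 2}]),
)
targets = CalexpRunner.getTargetList(parsedCmd)
```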
It is equivalent to this slightly longer version:
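A sketch of the longer form, which builds the argument dict by hand instead of delegating to the base class. ExplicitCalexpRunner is a hypothetical name, and parsedCmd.id.refList and parsedCmd.calexp.idList are assumed attribute names.

```python
from types import SimpleNamespace


class ExplicitCalexpRunner:
    """Hypothetical runner; only getTargetList is sketched."""

    @staticmethod
    def getTargetList(parsedCmd, **kwargs):
        # Build the argument dict explicitly, then pair it with each reference.
        argDict = dict(calExpList=parsedCmd.calexp.idList, **kwargs)
        return [(ref, argDict) for ref in parsedCmd.id.refList]


# Fake parsed command; the attribute names are assumptions.
explicitCmd = SimpleNamespace(
    id=SimpleNamespace(refList=["ref0", "ref1"]),
    calexp=SimpleNamespace(idList=[{"visit": 1}, {"visit": 2}]),
)
explicitTargets = ExplicitCalexpRunner.getTargetList(explicitCmd)
```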
(2) If your task does not meet condition (1) then you must override both TaskRunner.getTargetList and TaskRunner.__call__. You may do this however you see fit, so long as TaskRunner.getTargetList returns a list, each of whose elements is sent to TaskRunner.__call__, which runs your task.
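Case (2) might look like the following sketch, where each target is a 3-tuple rather than (dataRef, kwargs). PairTask and PairRunner are hypothetical, and the parsed command is represented by a plain dict purely to keep the example self-contained.

```python
class PairTask:
    """Hypothetical task whose run method takes two data references."""

    def run(self, refA, refB, scale=1.0):
        return "%s+%s@%s" % (refA, refB, scale)


class PairRunner:
    """Hypothetical runner for case (2): both methods are overridden."""

    def __init__(self, TaskClass):
        self.TaskClass = TaskClass

    @staticmethod
    def getTargetList(parsedCmd, **kwargs):
        # Each element is whatever __call__ expects; here a 3-tuple.
        refs = parsedCmd["refList"]
        return [(a, b, kwargs) for a, b in zip(refs[::2], refs[1::2])]

    def __call__(self, args):
        refA, refB, kwargs = args
        return self.TaskClass().run(refA, refB, **kwargs)


runner = PairRunner(PairTask)
targets = PairRunner.getTargetList({"refList": ["r0", "r1", "r2", "r3"]}, scale=2.0)
results = [runner(target) for target in targets]
```

The only contract between the two methods is that every element produced by getTargetList can be consumed by __call__.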
Definition at line 210 of file cmdLineTask.py.
def lsst.pipe.base.cmdLineTask.TaskRunner.makeTask(self, parsedCmd = None, args = None)
Create a Task instance.
[in] | parsedCmd | parsed command-line options (used for extra task args by some task runners) |
[in] | args | args tuple passed to TaskRunner.__call__ (used for extra task arguments by some task runners) |
makeTask() can be called with either the 'parsedCmd' argument or 'args' argument set to None, but it must construct identical Task instances in either case.
Subclasses may ignore this method entirely if they reimplement both TaskRunner.precall and TaskRunner.__call__.
Definition at line 252 of file cmdLineTask.py.
def lsst.pipe.base.cmdLineTask.TaskRunner.precall(self, parsedCmd)
Hook for code that should run exactly once, before multiprocessing is invoked.
Must return True if TaskRunner.__call__ should subsequently be called.
The default implementation writes schemas and configs, or compares them to existing files on disk if present.
Definition at line 267 of file cmdLineTask.py.
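The role of precall as a gate can be sketched as follows. HookRunner is a hypothetical stand-in, the "dryRun" and "targets" keys are invented for illustration, and the sketch assumes that run returns an empty list when precall returns False.

```python
class HookRunner:
    """Hypothetical runner showing precall as a run-once gate."""

    def __init__(self):
        self.calls = []

    def precall(self, parsedCmd):
        # Runs once in the parent; returning False skips every __call__.
        self.calls.append("precall")
        return not parsedCmd.get("dryRun", False)

    def __call__(self, target):
        self.calls.append(target)
        return target

    def run(self, parsedCmd):
        results = []
        if self.precall(parsedCmd):
            results = [self(target) for target in parsedCmd["targets"]]
        return results


hookRunner = HookRunner()
normalResults = hookRunner.run({"targets": ["a", "b"]})
skippedResults = HookRunner().run({"targets": ["a", "b"], "dryRun": True})
```

Because precall runs before any worker processes are started, it is a natural place for work that must not be repeated per target, such as writing configs.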
def lsst.pipe.base.cmdLineTask.TaskRunner.prepareForMultiProcessing(self)
Prepare this instance for multiprocessing by removing optional non-picklable elements.
This is only called if the task is run under multiprocessing.
Definition at line 165 of file cmdLineTask.py.
def lsst.pipe.base.cmdLineTask.TaskRunner.run(self, parsedCmd)
Run the task on all targets.
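The task is run under multiprocessing if numProcesses > 1; otherwise processing is serial.
Returns a list of results returned by TaskRunner.__call__, or an empty list if TaskRunner.__call__ is not called (e.g. if TaskRunner.precall returns False). See TaskRunner.__call__ for details.
Definition at line 172 of file cmdLineTask.py.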
lsst.pipe.base.cmdLineTask.TaskRunner.clobberConfig |
Definition at line 153 of file cmdLineTask.py.
lsst.pipe.base.cmdLineTask.TaskRunner.config |
Definition at line 150 of file cmdLineTask.py.
lsst.pipe.base.cmdLineTask.TaskRunner.doRaise |
Definition at line 152 of file cmdLineTask.py.
lsst.pipe.base.cmdLineTask.TaskRunner.doReturnResults |
Definition at line 149 of file cmdLineTask.py.
lsst.pipe.base.cmdLineTask.TaskRunner.log |
Definition at line 151 of file cmdLineTask.py.
lsst.pipe.base.cmdLineTask.TaskRunner.numProcesses |
Definition at line 154 of file cmdLineTask.py.
lsst.pipe.base.cmdLineTask.TaskRunner.TaskClass |
Definition at line 148 of file cmdLineTask.py.
int lsst.pipe.base.cmdLineTask.TaskRunner.TIMEOUT = 9999 | static
Definition at line 129 of file cmdLineTask.py.
lsst.pipe.base.cmdLineTask.TaskRunner.timeout |
Definition at line 156 of file cmdLineTask.py.