Introduction to meas_base

The meas_base package is the new home of the source measurement framework, which was formerly part of meas_algorithms.

The source measurement framework is a set of Python modules which allow measurements to be performed on calibrated exposures. The framework assumes that a detection catalog has been prepared to identify the objects to be measured. The detection catalog may have been produced with a detection pass on the exposure itself, but it might also be produced from other exposures, stacks, or even multiple exposures. Deblending may or may not have been performed during the creation of the detection catalog.

The framework steps through the detection catalog and performs a set of measurements, object by object, supplying each measurement plugin with the exposure and catalog information needed for its measurement. The measurement results (values, errors, and flags) are then placed in an output catalog.

The measurement framework includes the following features:

SingleFrameMeasurementTask, a subtask that measures after detection and deblending on the same frame (see Single-Frame Measurement).
Several tasks for forced photometry (see Forced Photometry).
Python Plugin base classes (SingleFramePlugin, ForcedPlugin) for single-frame and forced measurement.
Helper code to reduce boilerplate, standardize outputs, and make algorithm code easy to reuse when implementing new measurement algorithms.

Single-Frame Measurement

The single-frame measurement framework is used when all the information about the sources to be measured comes from a single image, and hence those sources are detected (and possibly deblended) on that image before measurement. This image may be a coadd of other images, or even a difference of images; from the perspective of the measurement framework there is essentially no difference between these cases (though there may be important differences for particular measurement algorithms).

The high-level algorithm for single-frame measurement is:

The SingleFrameMeasurementTask is initialized (see also __init__). This initializes all the configured algorithms, creating a schema for the outputs in the process. After this stage, neither the schema nor the algorithm configuration can be modified.
The run() method is called on each image to be processed, with a SourceCatalog containing all the sources to be measured. These sources must have Footprints (generated by SourceDetectionTask), and a schema that matches that constructed by the previous step. The fields added to the schema during initialization will then be filled in by the measurement framework.

Before measuring any sources, the measurement framework replaces all sources in the catalog with noise (see NoiseReplacer), using the Footprints attached to the SourceCatalog to define their boundaries.

We then loop over all "parent" sources in the catalog - both those that were not blended, and those that represent the pre-deblend combined state of blends. For each parent, we loop again over all its children (if any), and for each of these, we re-insert the child source into the image (which, recall, currently contains only noise), call measure() on each of the plugins, and then replace the child source with noise again. We then insert the parent source, and call measure() on all of the plugins. Before replacing the parent with noise again, we call measureN() twice for each plugin: once with the list of all children, and once with a single-element list containing just the parent. This ensures that each source (parent or child) is measured with both measure() and measureN(), with the former preceding the latter.
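The call order described above can be sketched in plain Python. This is only an illustration under simplifying assumptions: it models a single plugin, it does not model noise replacement itself, and the function name measurement_order and the integer source ids are hypothetical rather than part of meas_base.

```python
def measurement_order(parents, children_of):
    """Return the sequence of (method, source ids) calls for one plugin.

    parents: list of parent source ids.
    children_of: dict mapping a parent id to the list of its child ids.
    """
    calls = []
    for parent in parents:
        children = children_of.get(parent, [])
        for child in children:
            # The child is re-inserted, measured, then replaced by noise again.
            calls.append(("measure", (child,)))
        # The parent is re-inserted and measured on its own.
        calls.append(("measure", (parent,)))
        # measureN() runs once on all children (an empty list for an
        # unblended source), then once on the parent alone.
        calls.append(("measureN", tuple(children)))
        calls.append(("measureN", (parent,)))
    return calls


# Source 1 is a blend with children 3 and 4; source 2 is not blended.
calls = measurement_order(parents=[1, 2], children_of={1: [3, 4]})
for call in calls:
    print(call)
```

Every source id appears in a measure() call before it appears in a measureN() call, which is the ordering guarantee stated above.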

Because measurement plugin algorithms are often dependent on each other (in particular, most measurements require a centroid as an input), they must be run in a particular order, and we need a mechanism for passing information between them. The order is defined by the 'executionOrder' config parameter, which is defined in the BasePluginConfig class, and hence present for every plugin. Generally, these will remain at their default values; it is the responsibility of a plugin implementor to ensure the default for that plugin is appropriate relative to any plugins it depends on. See BasePluginConfig.executionOrder for some guidelines (it may be easiest to read the Python docstring directly; Doxygen garbles config documentation).
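As a rough illustration of how an executionOrder value determines the run order, the following sketch sorts a few plugin configurations. The ToyPluginConfig class, the plugin names, the numeric values, and the tie-break on name are all assumptions made for this example and are not taken from meas_base.

```python
from dataclasses import dataclass


@dataclass
class ToyPluginConfig:
    """Hypothetical stand-in for a plugin config with an executionOrder."""
    name: str
    executionOrder: float


# Listed in arbitrary order, as they might appear in a configuration.
configs = [
    ToyPluginConfig("toy_Flux", 2.0),      # needs a centroid and a shape
    ToyPluginConfig("toy_Centroid", 0.0),  # needs nothing, so runs first
    ToyPluginConfig("toy_Shape", 1.0),     # needs a centroid
]

# Lower executionOrder runs earlier; ties are broken by name here (an assumption).
ordered = sorted(configs, key=lambda c: (c.executionOrder, c.name))
run_order = [c.name for c in ordered]
print(run_order)
```

A plugin that depends on another only needs a default executionOrder larger than that of the plugin it depends on.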

The mechanism for passing information between plugins is SourceTable's slot system (see lsst::afw::table::SlotDefinition), in which particular measurements are given one of several predefined aliases (e.g. "slot_Centroid" -> "base_SdssCentroid"), which are used to implement getters on SourceRecord (e.g. getCentroid()). The measurement framework's configuration defines which measurements are assigned to each slot, and these slot measurements are available to other plugins as soon as the plugin whose outputs are assigned to the slot is run.

All this means that algorithms that need a centroid as input should simply call getCentroid() on the SourceRecord they're provided, and ensure that their executionOrder is higher than that of centroid algorithms. Similarly, algorithms that want a shape should simply call getShape(). Things are a bit trickier for centroid algorithms, which often need to be given an approximate centroid as an input; these should be prepared to look at the Peaks attached to the SourceRecord's Footprint as an initial value, as the slot centroid may not yet be valid. For wrapped C++ Algorithms (see measBaseAlgorithmConcept), this is handled automatically.
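The alias idea behind slots can be sketched with a small dictionary-backed record. ToyRecord and its resolve(), set(), and get() methods are hypothetical and only mimic the behaviour described above; the real implementation lives in lsst::afw::table and differs in detail.

```python
class ToyRecord:
    """Hypothetical record whose field names can be reached through aliases."""

    def __init__(self, aliases):
        self._aliases = aliases  # e.g. {"slot_Centroid": "base_SdssCentroid"}
        self._values = {}

    def resolve(self, field):
        # Replace a leading alias with the name it points to.
        for alias, target in self._aliases.items():
            if field == alias or field.startswith(alias + "_"):
                return target + field[len(alias):]
        return field

    def set(self, field, value):
        self._values[self.resolve(field)] = value

    def get(self, field):
        return self._values[self.resolve(field)]

    def getCentroid(self):
        # The getter only knows about the slot, not the plugin behind it.
        return (self.get("slot_Centroid_x"), self.get("slot_Centroid_y"))


record = ToyRecord({"slot_Centroid": "base_SdssCentroid"})
# The centroid plugin writes to its own fields...
record.set("base_SdssCentroid_x", 10.5)
record.set("base_SdssCentroid_y", 20.25)
# ...and a later plugin reads them through the slot.
centroid = record.getCentroid()
print(centroid)
```

Because the later plugin only refers to the slot, the configuration can assign a different centroid measurement to the slot without any change to that plugin.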

Forced Photometry

In forced photometry, an external "reference" catalog is used to constrain measurements on an image. While parts of the forced photometry framework could be used with a reference catalog from virtually any source, a complete system for loading the reference catalogs that correspond to the region of sky being measured is only available when measurements from a coadd are used as the reference.

While essentially any measurement plugin can be run in forced mode, typically only photometric measurements are scientifically useful (though centroids and shapes may be useful for quality metrics). In fact, in forced mode we typically configure pseudo-measurements to provide the shape and centroid slots, and it is these, rather than anything special about the forced measurement framework, that constrain measurements. In particular, we generally use the ForcedTransformedCentroidPlugin and ForcedTransformedShapePlugin to provide the centroid and shape slots. Rather than measure the centroid and shape on the image, these simply transform the centroid and shape slots from the reference catalog to the appropriate coordinate system. This ensures that measurements that use these slots to obtain positions and ellipses use the same quantities used in generating the reference catalog.
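The transformation performed by such a pseudo-measurement can be sketched as a round trip through sky coordinates. ToyWcs below is a deliberately simple, hypothetical linear mapping used only for illustration; it is not the afw Wcs class, and the sketch covers only the centroid, not the shape.

```python
class ToyWcs:
    """Hypothetical linear WCS: sky = origin + scale * pixel."""

    def __init__(self, origin, scale):
        self.origin = origin
        self.scale = scale

    def pixelToSky(self, x, y):
        return (self.origin[0] + self.scale * x, self.origin[1] + self.scale * y)

    def skyToPixel(self, ra, dec):
        return ((ra - self.origin[0]) / self.scale, (dec - self.origin[1]) / self.scale)


def transform_centroid(ref_xy, ref_wcs, meas_wcs):
    """Map a reference-frame pixel position into the measurement frame."""
    sky = ref_wcs.pixelToSky(*ref_xy)
    return meas_wcs.skyToPixel(*sky)


ref_wcs = ToyWcs(origin=(10.0, 20.0), scale=0.5)    # e.g. the coadd
meas_wcs = ToyWcs(origin=(12.0, 21.0), scale=0.25)  # e.g. a single CCD
meas_xy = transform_centroid((8.0, 6.0), ref_wcs, meas_wcs)
print(meas_xy)
```

No pixel data from the measurement image is used, which is why the resulting position is fixed by the reference catalog.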

The core of the forced measurement framework is ForcedMeasurementTask and ForcedPlugin, which broadly parallel SingleFrameMeasurementTask and SingleFramePlugin. The high-level algorithm is essentially the same, but with the SourceCatalog to be measured generated by ForcedMeasurementTask.generateSources() from the reference catalog, rather than provided by the user after running detection. The corresponding reference source and the Wcs objects that define the mapping between reference and measurement coordinate systems are also provided to each plugin.

The fact that the sources to be measured are generated from the reference catalog means that the Footprints attached to these sources must be transformed from the reference coordinate system to the measurement coordinate system, and at present that operation turns HeavyFootprints into regular Footprints. HeavyFootprints for child sources are necessary in order to correctly replace neighboring children of the same parent with noise prior to measurement (see NoiseReplacer), and the lack of these means that deblended measurement in forced photometry is essentially broken, except for plugins that implement measureN() and can hence correctly measure all children simultaneously without having to replace them with noise individually.

In addition to the ForcedMeasurementTask subtask and its plugins, the forced measurement framework also contains a pair of command-line driver tasks, ForcedPhotCcdTask and ForcedPhotCoaddTask. These run forced measurement on CCD-level images and coadd patch images, respectively, using the outputs of a previous single-frame measurement run on coadds as the reference catalog in both cases. These delegate the work of loading (and as necessary, filtering and merging) the appropriate reference catalog for the measurement image to a "references" subtask. The interface for the reference subtask is defined by BaseReferencesTask, with the concrete implementation that utilizes coadd processing outputs in CoaddSrcReferencesTask. In general, to use a reference catalog from another source, one should implement a new references subtask, and reuse ForcedPhotCcdTask and/or ForcedPhotCoaddTask. It should only be necessary to replace these and use ForcedMeasurementTask directly if you need to run forced photometry on data that isn't organized by the Butler or doesn't correspond to CCD- or patch-level images.

Implementing New Plugins and Algorithms

The "Plugin" interfaces used directly by the measurement tasks are defined completely in Python, and are rooted in the abstract base classes sfm.SingleFramePlugin and forcedMeasurement.ForcedPlugin. There are also analogous C++ base classes, SingleFrameAlgorithm and ForcedAlgorithm, for plugins implemented in C++, as well as SimpleAlgorithm, a C++ base class for simple algorithms in which the same implementation can be used for both single-frame and forced measurement.

For a SingleFramePlugin/SingleFrameAlgorithm:

Subclass sfm.SingleFramePlugin or SingleFrameAlgorithm
Implement an __init__ method with the same signature as the base class, in which fields saved by the Plugin should be added to the schema passed to __init__, with keys saved as instance attributes for future use. In C++, implement a constructor with one of the signatures supported by wrappers.wrapSingleFrameAlgorithm.
Reimplement measure() to perform the actual measurement and save the result in the measRecord argument.
Reimplement fail() unless the Plugin cannot fail (except for environment errors).
Reimplement measureN() if the Plugin supports measuring multiple sources simultaneously.
Register the Plugin with the config mechanism by calling e.g. SingleFramePlugin.registry.register() at module scope (so the registration happens at import-time). Or, in C++, Swig the algorithm as you would any normal C++ class, and call wrappers.wrapSingleFrameAlgorithm to wrap and register the algorithm simultaneously.
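The Python steps above can be sketched without the LSST stack by using small stand-ins. ToySchema, ToyPluginBase, and ToySumFluxPlugin are hypothetical names invented for this example; the real SingleFramePlugin, its registry, and the afw schema have richer interfaces, so treat this only as an outline of the pattern.

```python
class ToySchema:
    """Hypothetical schema: addField() returns a key (here, the field name)."""

    def __init__(self):
        self.fields = []

    def addField(self, name, doc=""):
        self.fields.append(name)
        return name


class ToyPluginBase:
    """Hypothetical stand-in for the plugin base class and its registry."""

    registry = {}

    def __init__(self, config, name, schema, metadata):
        self.config = config
        self.name = name

    @classmethod
    def register(cls, name):
        ToyPluginBase.registry[name] = cls


class ToySumFluxPlugin(ToyPluginBase):
    def __init__(self, config, name, schema, metadata):
        super().__init__(config, name, schema, metadata)
        # Add output fields and keep the keys for use in measure() and fail().
        self.fluxKey = schema.addField(name + "_flux", doc="sum of pixel values")
        self.flagKey = schema.addField(name + "_flag", doc="general failure flag")

    def measure(self, measRecord, exposure):
        # Save the result in the record passed in by the task.
        measRecord[self.fluxKey] = sum(sum(row) for row in exposure)

    def fail(self, measRecord, error=None):
        measRecord[self.flagKey] = True


# Registration at module scope, so it happens at import time.
ToySumFluxPlugin.register("toy_SumFlux")

schema = ToySchema()
plugin = ToyPluginBase.registry["toy_SumFlux"](None, "toy_SumFlux", schema, None)
record = {}
plugin.measure(record, [[1.0, 2.0], [3.0, 4.0]])
print(record)
```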

For a ForcedPlugin/ForcedAlgorithm:

Subclass forcedMeasurement.ForcedPlugin or ForcedAlgorithm
Implement an __init__ method with the same signature as the base class, in which fields saved by the Plugin should be added to the outputSchema of the schemaMapper passed to __init__, with keys saved as instance attributes for future use. In C++, implement a constructor with one of the signatures supported by wrappers.wrapForcedAlgorithm.
Reimplement measure() (in Python) or measureForced (C++) to perform the actual measurement and save the result in the measRecord argument. Note that the refRecord and refWcs are available during the measurement if needed.
Reimplement fail() unless the Plugin cannot fail (except for environment errors).
Reimplement measureN() (Python) or measureNForced() (C++) if the Plugin supports measuring multiple sources simultaneously.
Register the Plugin with the config mechanism by calling e.g. ForcedPlugin.registry.register() at module scope (so the registration happens at import-time). Or, in C++, Swig the algorithm as you would any normal C++ class, and call wrappers.wrapForcedAlgorithm to wrap and register the algorithm simultaneously.

In C++, one can also implement both interfaces at the same time using SimpleAlgorithm (see that class for more information).

Error Handling

When a Plugin (or the Algorithm it delegates to) raises any exception, the Task calling it will catch the exception, and call the fail() method of the Plugin, which should cause the plugin to set one or more flags in the output record. If the exception is MeasurementError, the Task will pass this exception back to the fail() method, as MeasurementError contains additional Plugin-specific information indicating the kind of failure. For most other exceptions, the Task will log the exception message as a warning, and pass None as the exception to fail(). In this case, the Plugin should just set the primary failure flag. This is handled automatically by the FlagHandler in Algorithm-based Plugins. Certain exceptions (in particular, out-of-memory errors) are considered fatal and will always be propagated up out of the Task.
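The task-side behaviour described above can be sketched as follows. The MeasurementError class here is a toy stand-in with a hypothetical flagBit attribute, call_measure is an invented helper, and treating only MemoryError as fatal is an assumption; the actual task code in meas_base may differ.

```python
import logging


class MeasurementError(Exception):
    """Toy stand-in: carries the index of the plugin-specific flag to set."""

    def __init__(self, message, flagBit):
        super().__init__(message)
        self.flagBit = flagBit


FATAL_EXCEPTIONS = (MemoryError,)  # assumption: always propagated


def call_measure(plugin, measRecord, exposure, log):
    try:
        plugin.measure(measRecord, exposure)
    except FATAL_EXCEPTIONS:
        raise
    except MeasurementError as error:
        # Known failure mode: hand the error to fail() so it can set a specific flag.
        plugin.fail(measRecord, error)
    except Exception as error:
        # Unexpected failure: warn, then let fail() set only the general flag.
        log.warning("Error in %s.measure: %s", plugin.name, error)
        plugin.fail(measRecord, None)


class ToyPlugin:
    """Hypothetical plugin whose measure() always raises the given exception."""

    name = "toy_Plugin"

    def __init__(self, exc):
        self.exc = exc
        self.failures = []

    def measure(self, measRecord, exposure):
        raise self.exc

    def fail(self, measRecord, error=None):
        self.failures.append(error)


log = logging.getLogger("toy.measurement")
known = ToyPlugin(MeasurementError("too close to edge", 2))
unexpected = ToyPlugin(ValueError("probably a bug"))
call_measure(known, {}, None, log)
call_measure(unexpected, {}, None, log)

fatal_propagated = False
try:
    call_measure(ToyPlugin(MemoryError()), {}, None, log)
except MemoryError:
    fatal_propagated = True
print(known.failures[0].flagBit, unexpected.failures, fatal_propagated)
```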

Plugin/Algorithm code should endeavor to only throw MeasurementError for known failure modes, unless the problem is in the data and can always be fixed there before the measurement framework is invoked. In other words, we want warnings to appear in the logs only when there's a bug, whether that's in the processing before the measurement framework or in a particular Plugin/Algorithm not knowing and documenting its own failure modes. This means that Plugin/Algorithm implementations should generally have a global try/catch block that re-throws lower-level exceptions corresponding to known failure modes as MeasurementErrors.
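A minimal sketch of such a global try/except block in Python follows. The MeasurementError class is a toy stand-in with a hypothetical flagBit attribute, and toy_mean_flux and measure_mean_flux are invented for illustration; the point is only the pattern of translating a known low-level exception.

```python
class MeasurementError(Exception):
    """Toy stand-in: carries the index of the plugin-specific flag to set."""

    def __init__(self, message, flagBit):
        super().__init__(message)
        self.flagBit = flagBit


NO_PIXELS = 1  # hypothetical index of a "no pixels" flag


def toy_mean_flux(pixels):
    # Low-level code: raises ZeroDivisionError when there are no pixels.
    return sum(pixels) / len(pixels)


def measure_mean_flux(pixels):
    try:
        return toy_mean_flux(pixels)
    except ZeroDivisionError as exc:
        # Known failure mode: re-raise it as a MeasurementError.
        raise MeasurementError("no pixels in measurement area", NO_PIXELS) from exc


mean = measure_mean_flux([1.0, 3.0])

caught = None
try:
    measure_mean_flux([])
except MeasurementError as error:
    caught = error
print(mean, caught, caught.flagBit)
```

Exceptions that are not translated this way still reach the task, where they are logged as warnings, so they remain visible as potential bugs.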

How Plugin Errors are Logged

A Plugin is usually not run by itself, but as a component of a measurement task. The measurement task may also be a component or "subTask" of another task, and so on. When a Plugin is run, the measurement task which is running the plugin logs any error which the plugin throws to a log location within the task hierarchy. For example, when the PsfFlux plugin from meas_base is run within processCcd, its errors are logged to:

    processCcd.charImage.measurement.base_PsfFlux

This log hierarchy allows the log for the PsfFlux plugin to be controlled independently of the other plugins, and also independently of the measurement task log. Measurement errors are typically logged at the DEBUG level. When processCcd is launched, you may selectively modify the log level at any level of the hierarchy. For example:

    processCcd.py -L processCcd.charImage.measurement=WARN

will set the logging level of the MeasurementTask and all the plugins under the MeasurementTask to WARN, whereas:

    processCcd.py -L processCcd.charImage.measurement.base_PsfFlux=DEBUG

will selectively set just the PsfFlux algorithm running under the MeasurementTask to DEBUG, leaving the task and the other plugins at their default log levels.
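The way a level set on a parent applies to its children can be illustrated with Python's standard logging module, which also uses dotted logger names. This is an analogy only: it does not use lsst.log or the -L option, and it assumes that the measurement logs inherit levels in the same parent-to-child way.

```python
import logging

task_log = logging.getLogger("processCcd.charImage.measurement")
psf_log = logging.getLogger("processCcd.charImage.measurement.base_PsfFlux")
centroid_log = logging.getLogger("processCcd.charImage.measurement.base_SdssCentroid")

# Set the level of the measurement task: children without their own level inherit it.
task_log.setLevel(logging.WARNING)
inherited = psf_log.getEffectiveLevel()

# Set a level on one plugin only: the task and the other plugins are unaffected.
psf_log.setLevel(logging.DEBUG)

print(logging.getLevelName(inherited))
print(logging.getLevelName(psf_log.getEffectiveLevel()))
print(logging.getLevelName(centroid_log.getEffectiveLevel()))
```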

How a Plugin can get its Logname

Plugins which do not log internally do not need to know the name of their log. However, if you are writing a plugin and wish to have the plugin log messages to the log location described in the previous section, your plugin class (the class, not the instance) must satisfy the following conditions:

The plugin class must have a class attribute named "hasLogName".
The class attribute hasLogName must be set to True.
The class initializer must have a logName parameter.

When all of these conditions are satisfied, the measurement task will initialize the plugin with the logName it uses to log error messages, to allow the plugin to log to the same location. The plugin may then use one of its base class methods, getLogName(), to get the name of its log. The plugin may then get the logger:

    logger = lsst.log.Log.getLogger(self.getLogName())

Setup for Python Plugins which log:

Here is an example of how a Python plugin which logs internally can be constructed. Note the class attribute "hasLogName" and the initialization with an optional logName parameter.

class SingleFrameTestPlugin(SingleFramePlugin):

    ConfigClass = SingleFrameTestConfig
    hasLogName = True

    def __init__(self, config, name, schema, metadata, logName=None):
        SingleFramePlugin.__init__(self, config, name, schema, metadata, logName=logName)

With this configuration, the running task will set the logName parameter when the plugin is initialized, and the plugin getLogName() method may subsequently be used to fetch it.
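How a task might decide whether to pass logName can be sketched as below. The helper make_plugin, the ToyBase class, and the way the log name is built from the task log name plus the plugin name are assumptions made for illustration, not the actual meas_base implementation.

```python
def make_plugin(PluginClass, config, name, schema, metadata, taskLogName):
    """Construct a plugin, passing logName only if the class asks for it."""
    if getattr(PluginClass, "hasLogName", False):
        logName = taskLogName + "." + name
        return PluginClass(config, name, schema, metadata, logName=logName)
    return PluginClass(config, name, schema, metadata)


class ToyBase:
    """Hypothetical base class that stores the log name."""

    def __init__(self, config, name, schema, metadata, logName=None):
        self.name = name
        self._logName = logName

    def getLogName(self):
        return self._logName


class LoggingPlugin(ToyBase):
    hasLogName = True  # class attribute, not an instance attribute

    def __init__(self, config, name, schema, metadata, logName=None):
        super().__init__(config, name, schema, metadata, logName=logName)


class QuietPlugin(ToyBase):
    # No hasLogName attribute and no logName parameter.
    def __init__(self, config, name, schema, metadata):
        super().__init__(config, name, schema, metadata)


task_log_name = "processCcd.charImage.measurement"
logging_plugin = make_plugin(LoggingPlugin, None, "base_PsfFlux", None, None, task_log_name)
quiet_plugin = make_plugin(QuietPlugin, None, "toy_Quiet", None, None, task_log_name)
print(logging_plugin.getLogName(), quiet_plugin.getLogName())
```

The sketch also shows why the initializer needs the logName parameter: a class that set hasLogName without accepting logName would fail at construction.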

Though it might be overly verbose, a plugin could log at the INFO level each time its measure method is run, using the same logger as the MeasurementTask:

    lsst.log.Log.getLogger(self.getLogName()).info("Starting a measurement.")

Setup for C++ Algorithms which log:

C++ Algorithms which are called from Python tasks can also get the logName. To do so, they must have an optional logName argument in their constructor. Here is an example from PsfFluxAlgorithm:

    PsfFluxAlgorithm(Control const & ctrl, std::string const & name, afw::table::Schema & schema,
                     std::string const & logName = "");

The constructor should include the line:

    _logName = logName.size() ? logName : name;

which sets the name to be used for the logger of this plugin to either logName, or to the base name of the plugin if the optional logName argument has not been specified.

The following line in psfFlux.cc must also be added to allow this constructor to be accessed from Python.

    cls.def(py::init<PsfFluxAlgorithm::Control const &, std::string const &, afw::table::Schema &,
        std::string const &>(),
        "ctrl"_a, "name"_a, "schema"_a, "logName"_a);

Finally, hasLogName=True must be added to the Python wrapper:

    wrapSimpleAlgorithm(PsfFluxAlgorithm, Control=PsfFluxControl,
                TransformClass=PsfFluxTransform, executionOrder=BasePlugin.FLUX_ORDER,
                shouldApCorr=True, hasLogName=True)

This constructor allows the Algorithm to receive the logName as an optional string, which can later be accessed by its getLogName method. In C++, an Algorithm may then log to this logName as follows:

    LOGL_INFO(logger, message...)

where logger can either be the logName string itself, or a Log object returned from

    auto logger = lsst::log::Log::getLogger(getLogName());

Using a FlagHandler with Python Plugins

Review the SingleFramePlugin requirements for the measure() and fail() methods, which a plugin must implement. When the plugin detects an error, it should raise a MeasurementError exception, which triggers a call to fail(). The fail() method should set the appropriate failure flags in the output catalog.

A FlagHandler is a convenient way for a plugin to define flags for different error conditions, and to automatically set the correct flags when they occur.

How to Define a FlagHandler in Python:

The meas_base plugins implemented in C++ use lsst::meas::base::FlagHandler to handle measurement exceptions. In Python, measurement plugins may use a FlagDefinitionList to create an instance of this class.

First examine the code in testFlagHandler.py in the meas_base tests directory. This unit test defines a PythonPlugin which illustrates the use of the FlagHandler. The __init__ method shown below defines a list of 3 failure flags. As each flag is added, a FlagDefinition is returned which can be used to identify the error later.

When the list is complete, a FlagHandler is created, which initializes the flag fields in the output catalog:

flagDefs = FlagDefinitionList()
self.FAILURE = flagDefs.add("flag", "General Failure error")
self.CONTAINS_NAN = flagDefs.add("flag_containsNan", "Measurement area contains a nan")
self.EDGE = flagDefs.add("flag_edge", "Measurement area over edge")
self.flagHandler = FlagHandler.addFields(schema, name, flagDefs)

This code defines the following error flags for your plugin:

a general failure flag which indicates that something has gone wrong during measure().
a specific failure flag which indicates that the source contains NaNs.
a specific failure flag which indicates that the source is too close to the edge to be measured.

The FlagHandler is also used to implement your fail() method in Python. Recall that you must implement this method in your plugin class if your plugin can fail.

The following addition to your class will correctly implement the error handling:

def fail(self, measRecord, error=None):
    if error is None:
        self.flagHandler.handleFailure(measRecord)
    else:
        self.flagHandler.handleFailure(measRecord, error.cpp)

When the error is one which your plugin code expects, your code will raise a MeasurementError exception. The error is sent to the fail() method in the "error" argument. The error will indicate which flag should be set in addition to the general failure flag. If the error argument is not supplied, as when the failure is not expected, only the general failure flag will be set.

The PythonPlugin has example code. It is a toy plugin which measures flux inside a square box drawn about the center of a detected source. The plugin raises a MeasurementError when the box lies outside the exposure being measured.

if not exposure.getBBox().contains(bbox):
    raise MeasurementError(self.EDGE.doc, self.EDGE.number)

When the MeasurementError is raised, both the general flag and the flag_edge flag will be set to True.
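The flag-setting behaviour can be sketched with a toy handler that does not depend on the LSST stack. ToyFlagHandler and ToyError are hypothetical, and the assumption that the general failure flag is the first entry in the list is made only for this example; the real FlagHandler works on catalog records and FlagDefinition objects.

```python
class ToyError(Exception):
    """Toy stand-in for MeasurementError: carries the index of a specific flag."""

    def __init__(self, message, flagBit):
        super().__init__(message)
        self.flagBit = flagBit


class ToyFlagHandler:
    """Hypothetical handler: the general failure flag is the first in the list."""

    def __init__(self, prefix, flagNames):
        self.fieldNames = [prefix + "_" + flagName for flagName in flagNames]

    def handleFailure(self, record, error=None):
        record[self.fieldNames[0]] = True  # always set the general flag
        if error is not None:
            record[self.fieldNames[error.flagBit]] = True  # plus the specific one


handler = ToyFlagHandler("toy_Box", ["flag", "flag_containsNan", "flag_edge"])

# Expected failure: the plugin raised an error naming the edge flag (index 2).
edge_record = dict.fromkeys(handler.fieldNames, False)
handler.handleFailure(edge_record, ToyError("Measurement area over edge", 2))

# Unexpected failure: no error object is available.
unknown_record = dict.fromkeys(handler.fieldNames, False)
handler.handleFailure(unknown_record)

print(edge_record)
print(unknown_record)
```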

Using the SafeCentroidExtractor:

The PythonPlugin example also demonstrates the use of the SafeCentroidExtractor from Python. The SafeCentroidExtractor is used by many of the C++ algorithms to fetch a value for the centroid of the source, even if the centroid algorithm has failed on this source. Since the source was detected, it will always have a detection Footprint which can be used as a fallback to locate the center of the object.

Define the SafeCentroidExtractor in the initialization of your plugin, like this:

def __init__(self, config, name, schema, metadata):
    SingleFramePlugin.__init__(self, config, name, schema, metadata)
    self.centroidExtractor = lsst.meas.base.SafeCentroidExtractor(schema, name)

Then use centroidExtractor to fetch the centroid:

center = self.centroidExtractor(measRecord, self.flagHandler)

The SafeCentroidExtractor will first try to read the centroid from the centroid slot. But if the failure flag on the centroid slot has been set to "True", it will try to use the detection footprint to determine the centroid. This might potentially allow the plugin to complete its measure() method if the centroid provided is "good enough".

To indicate at the same time that something has gone wrong, the general flag will automatically get set on this source. The SafeCentroidExtractor will also create a flag called "flag_badCentroid" which points to the centroid slot failure flag, and can be used to distinguish records where the failure flag has been set because the centroid slot measurement was bad.
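The fallback logic can be sketched in plain Python. The function safe_centroid and the dictionary keys used here are hypothetical, the fallback to the first Footprint peak is an assumption about what "use the detection footprint" means, and the real SafeCentroidExtractor handles more cases than this.

```python
def safe_centroid(record, flags):
    """Return a usable (x, y) position, falling back to the first peak."""
    # flag_badCentroid mirrors the failure flag of the centroid slot.
    flags["flag_badCentroid"] = record["slot_Centroid_flag"]
    if not record["slot_Centroid_flag"]:
        return record["slot_Centroid"]
    # The slot centroid failed: mark the general failure flag, then fall back.
    flags["flag"] = True
    return record["peaks"][0]


good = {"slot_Centroid": (10.2, 20.7), "slot_Centroid_flag": False, "peaks": [(10, 21)]}
bad = {"slot_Centroid": (float("nan"), float("nan")), "slot_Centroid_flag": True, "peaks": [(33, 44)]}

good_flags = {"flag": False}
bad_flags = {"flag": False}
good_center = safe_centroid(good, good_flags)
bad_center = safe_centroid(bad, bad_flags)
print(good_center, good_flags)
print(bad_center, bad_flags)
```

In the second case the plugin can still complete its measurement at the peak position, while the two flags record that the result rests on a fallback centroid.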