Inheritance diagram for lsst.pipe.tasks.functors.Functor:

Public Member Functions
	__init__ (self, filt=None, dataset=None, noDup=None)

	noDup (self)

	columns (self)

	multilevelColumns (self, data, columnIndex=None, returnTuple=False)

	__call__ (self, data, dropna=False)

	difference (self, data1, data2, **kwargs)

	fail (self, df)

	name (self)

	shortname (self)

Public Attributes
	filt

	dataset

	log

	name

Protected Member Functions
	_get_data_columnLevels (self, data, columnIndex=None)

	_get_data_columnLevelNames (self, data, columnIndex=None)

	_colsFromDict (self, colDict, columnIndex=None)

	_func (self, df, dropna=True)

	_get_columnIndex (self, data)

	_get_data (self, data)

	_setLevels (self, df)

	_dropna (self, vals)

Protected Attributes
	_noDup

Static Protected Attributes
str	_defaultDataset = 'ref'

tuple	_dfLevels = ('column',)

bool	_defaultNoDup = False

Detailed Description

Define and execute a calculation on a DataFrame or Handle holding a
DataFrame.

The `__call__` method accepts a `~pandas.DataFrame`, a
`~lsst.daf.butler.DeferredDatasetHandle`, or an
`~lsst.pipe.base.InMemoryDatasetHandle`, and returns the
result of the calculation as a single column.
Each functor defines what columns are needed for the calculation, and only
these columns are read from the dataset handle.

The action of `__call__` consists of two steps: first, loading the
necessary columns from disk into memory as a `~pandas.DataFrame` object;
and second, performing the computation on this DataFrame and returning the
result.

To define a new `Functor`, a subclass must define a `_func` method
that takes a `~pandas.DataFrame` and returns the result as a
`~pandas.Series`.
In addition, it must define the following attributes:

* `_columns`: The columns necessary to perform the calculation
* `name`: A name appropriate for a figure axis label
* `shortname`: A name appropriate for use as a dictionary key
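
The subclassing contract above can be sketched without the LSST stack. This is a minimal stand-in, not the real `Functor`: `MiniFunctor`, `FluxRatio`, and the column names are hypothetical, and the input is assumed to be a plain single-level `~pandas.DataFrame`.

```python
import pandas as pd


class MiniFunctor:
    """Hypothetical stand-in for the base class; not the LSST implementation."""

    def __call__(self, df):
        # Keep only the declared columns, then run the calculation.
        return self._func(df[list(self._columns)])

    def _func(self, df):
        raise NotImplementedError("Must define calculation on DataFrame")


class FluxRatio(MiniFunctor):
    """Hypothetical functor: ratio of two (invented) flux columns."""

    _columns = ("flux_a", "flux_b")   # columns needed for the calculation
    name = "Flux ratio (a / b)"       # suitable for a figure axis label
    shortname = "fluxRatio"           # suitable for use as a dictionary key

    def _func(self, df):
        return df["flux_a"] / df["flux_b"]


table = pd.DataFrame({"flux_a": [2.0, 9.0], "flux_b": [1.0, 3.0], "unused": [0, 0]})
ratio = FluxRatio()(table)
```

Only `_func` and the three attributes differ between functors in this sketch; column selection and invocation stay in the base class.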

On initialization, a `Functor` should declare what band (``filt`` kwarg)
and dataset (e.g. ``'ref'``, ``'meas'``, ``'forced_src'``) it is intended
to be applied to.
This enables the `_get_data` method to extract the proper columns from the
underlying data.
If not specified, the dataset will fall back on the `_defaultDataset`
attribute.
If the band is not specified, ``dataset`` is anything other than ``'ref'``,
and the column index has a ``band`` level, then a `ValueError` will be raised
when trying to perform the calculation.

Originally, `Functor` was set up to expect datasets formatted like the
``deepCoadd_obj`` dataset; that is, a DataFrame with a multi-level column
index, with the levels of the column index being ``band``, ``dataset``, and
``column``.
It has since been generalized to apply to DataFrames without multi-level
indices and multi-level indices with just ``dataset`` and ``column``
levels.
In addition, the `_get_data` method that reads the columns from the
underlying data will return a DataFrame with column index levels defined by
the `_dfLevels` attribute; by default, this is ``column``.

The `_dfLevels` attribute should generally not need to be changed, unless
`_func` needs columns from multiple filters or datasets to do the
calculation.
An example of this is the `~lsst.pipe.tasks.functors.Color` functor, for
which `_dfLevels = ('band', 'column')`, and `_func` expects the DataFrame
it gets to have those levels in the column index.
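
To illustrate why a two-band calculation needs the ``band`` level kept, here is a small sketch of a `_func`-style function acting on a DataFrame whose column index has ``band`` and ``column`` levels. The function name, the ``mag`` column, and the values are hypothetical; the actual calculation performed by `Color` is not shown on this page.

```python
import pandas as pd

# Column index with the two levels that _dfLevels = ('band', 'column') would keep.
columns = pd.MultiIndex.from_product([["g", "r"], ["mag"]], names=("band", "column"))
df = pd.DataFrame([[21.0, 20.5], [19.0, 18.0]], columns=columns)


def two_band_func(df, band1="g", band2="r", col="mag"):
    """Hypothetical calculation that selects the same column in two bands."""
    return df[band1][col] - df[band2][col]


colour = two_band_func(df)
```

With the default `_dfLevels = ('column',)` the ``band`` level would already have been dropped, so `df["g"]` and `df["r"]` could not be told apart.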

Parameters
----------
filt : str
    Band upon which to do the calculation.

dataset : str
    Dataset upon which to do the calculation (e.g., 'ref', 'meas',
    'forced_src').

Definition at line 97 of file functors.py.

Constructor & Destructor Documentation

◆ __init__()

lsst.pipe.tasks.functors.Functor.__init__	(	self,
		filt = None,
		dataset = None,
		noDup = None )

Definition at line 163 of file functors.py.

    def __init__(self, filt=None, dataset=None, noDup=None):
        self.filt = filt
        self.dataset = dataset if dataset is not None else self._defaultDataset
        self._noDup = noDup
        self.log = logging.getLogger(type(self).__name__)
 

Member Function Documentation

◆ __call__()

lsst.pipe.tasks.functors.Functor.__call__	(	self,
		data,
		dropna = False )

Reimplemented in lsst.pipe.tasks.functors.RAColumn, lsst.pipe.tasks.functors.DecColumn, and lsst.pipe.tasks.functors.CompositeFunctor.

Definition at line 348 of file functors.py.

    def __call__(self, data, dropna=False):
        df = self._get_data(data)
        try:
            vals = self._func(df)
        except Exception as e:
            self.log.error("Exception in %s call: %s: %s", self.name, type(e).__name__, e)
            vals = self.fail(df)
        if dropna:
            vals = self._dropna(vals)
 
        return vals
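
The error handling in `__call__` can be reproduced in isolation. The sketch below is a free function rather than a method, and `call_with_fallback` and the logger name are hypothetical; it assumes, as in the listing, that a failed calculation is replaced by an all-NaN Series sharing the input index.

```python
import logging

import numpy as np
import pandas as pd

log = logging.getLogger("functorSketch")


def call_with_fallback(func, df, dropna=False):
    """Run func(df); on any exception, log it and return an all-NaN Series."""
    try:
        vals = func(df)
    except Exception as e:
        log.error("Exception in call: %s: %s", type(e).__name__, e)
        vals = pd.Series(np.full(len(df), np.nan), index=df.index)
    if dropna:
        vals = vals.dropna()
    return vals


df = pd.DataFrame({"x": [1.0, 2.0, 3.0]}, index=[10, 11, 12])
# Selecting a missing column raises KeyError, which is caught and logged.
bad = call_with_fallback(lambda d: d["missing"], df)
empty = call_with_fallback(lambda d: d["missing"], df, dropna=True)
```

A consequence of this pattern is that a failing functor combined with `dropna=True` yields an empty result rather than an exception.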
 

◆ _colsFromDict()

lsst.pipe.tasks.functors.Functor._colsFromDict	(	self,
		colDict,
		columnIndex = None )

protected

Converts dictionary column specification to a list of columns.

Definition at line 218 of file functors.py.

    def _colsFromDict(self, colDict, columnIndex=None):
        """Converts dictionary column specification to a list of columns."""
        new_colDict = {}
        columnLevels = self._get_data_columnLevels(None, columnIndex=columnIndex)
 
        for i, lev in enumerate(columnLevels):
            if lev in colDict:
                if isinstance(colDict[lev], str):
                    new_colDict[lev] = [colDict[lev]]
                else:
                    new_colDict[lev] = colDict[lev]
            else:
                new_colDict[lev] = columnIndex.levels[i]
 
        levelCols = [new_colDict[lev] for lev in columnLevels]
        cols = list(product(*levelCols))
        colsAvailable = [col for col in cols if col in columnIndex]
        return colsAvailable
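
The expansion performed here is a Cartesian product over the per-level values, filtered to the columns that actually exist. The following standard-library sketch uses plain tuples in place of a `~pandas.MultiIndex`; `cols_from_dict`, the level names, and the column names are illustrative assumptions.

```python
from itertools import product

# Hypothetical column index, written as plain (band, dataset, column) tuples.
level_names = ("band", "dataset", "column")
available = [
    ("g", "meas", "flux"),
    ("g", "ref", "coord_ra"),
    ("r", "meas", "flux"),
]


def cols_from_dict(col_dict):
    """Expand a {level: value or list of values} dict into available tuples."""
    per_level = []
    for i, lev in enumerate(level_names):
        if lev in col_dict:
            val = col_dict[lev]
            # A bare string is treated as a one-element list.
            per_level.append([val] if isinstance(val, str) else list(val))
        else:
            # Level not constrained: allow every value seen at that position.
            per_level.append(sorted({col[i] for col in available}))
    return [col for col in product(*per_level) if col in available]


cols = cols_from_dict({"dataset": "meas", "column": ["flux", "flux_err"]})
```

Here ``band`` is unconstrained, so both bands are tried, and the requested ``flux_err`` column is silently dropped because no such tuple exists in the index.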
 

◆ _dropna()

lsst.pipe.tasks.functors.Functor._dropna	(		self,
			vals )

protected

Definition at line 345 of file functors.py.

    def _dropna(self, vals):
        return vals.dropna()
 

◆ _func()

lsst.pipe.tasks.functors.Functor._func	(	self,
		df,
		dropna = True )

protected

Definition at line 291 of file functors.py.

    def _func(self, df, dropna=True):
        raise NotImplementedError('Must define calculation on DataFrame')
 

◆ _get_columnIndex()

lsst.pipe.tasks.functors.Functor._get_columnIndex	(		self,
			data )

protected

Return columnIndex.

Definition at line 294 of file functors.py.

    def _get_columnIndex(self, data):
        """Return columnIndex."""
 
        if isinstance(data, (DeferredDatasetHandle, InMemoryDatasetHandle)):
            return data.get(component="columns")
        else:
            return None
 

◆ _get_data()

lsst.pipe.tasks.functors.Functor._get_data	(		self,
			data )

protected

Retrieve DataFrame necessary for calculation.

The data argument can be a `~pandas.DataFrame`, a
`~lsst.daf.butler.DeferredDatasetHandle`, or
an `~lsst.pipe.base.InMemoryDatasetHandle`.

Returns a DataFrame upon which `self._func` can act.

Definition at line 302 of file functors.py.

    def _get_data(self, data):
        """Retrieve DataFrame necessary for calculation.
 
        The data argument can be a `~pandas.DataFrame`, a
        `~lsst.daf.butler.DeferredDatasetHandle`, or
        an `~lsst.pipe.base.InMemoryDatasetHandle`.
 
        Returns a DataFrame upon which `self._func` can act.
        """
        # We wrap a DataFrame in a handle here to take advantage of the
        # DataFrame delegate DataFrame column wrangling abilities.
        if isinstance(data, pd.DataFrame):
            _data = InMemoryDatasetHandle(data, storageClass="DataFrame")
        elif isinstance(data, (DeferredDatasetHandle, InMemoryDatasetHandle)):
            _data = data
        else:
            raise RuntimeError(f"Unexpected type provided for data. Got {get_full_type_name(data)}.")
 
        # First thing to do: check to see if the data source has a multilevel
        # column index or not.
        columnIndex = self._get_columnIndex(_data)
        is_multiLevel = isinstance(columnIndex, pd.MultiIndex)
 
        # Get proper columns specification for this functor.
        if is_multiLevel:
            columns = self.multilevelColumns(_data, columnIndex=columnIndex)
        else:
            columns = self.columns
 
        # Load in-memory DataFrame with appropriate columns the gen3 way.
        df = _data.get(parameters={"columns": columns})
 
        # Drop unnecessary column levels.
        if is_multiLevel:
            df = self._setLevels(df)
 
        return df
 

◆ _get_data_columnLevelNames()

lsst.pipe.tasks.functors.Functor._get_data_columnLevelNames	(	self,
		data,
		columnIndex = None )

protected

Gets the content of each of the column levels for a multilevel
table.

Definition at line 204 of file functors.py.

    def _get_data_columnLevelNames(self, data, columnIndex=None):
        """Gets the content of each of the column levels for a multilevel
        table.
        """
        if columnIndex is None:
            columnIndex = data.get(component="columns")
 
        columnLevels = columnIndex.names
        columnLevelNames = {
            level: list(np.unique(np.array([c for c in columnIndex])[:, i]))
            for i, level in enumerate(columnLevels)
        }
        return columnLevelNames
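
The same computation can be run on a hand-built `~pandas.MultiIndex` to see what the returned dictionary looks like. The index contents below are invented for illustration.

```python
import numpy as np
import pandas as pd

column_index = pd.MultiIndex.from_tuples(
    [("r", "meas", "flux"), ("g", "meas", "flux"), ("g", "ref", "coord_ra")],
    names=("band", "dataset", "column"),
)

# One row per column tuple, one array column per index level.
as_array = np.array([c for c in column_index])
level_values = {
    level: list(np.unique(as_array[:, i]))
    for i, level in enumerate(column_index.names)
}
```

Because `numpy.unique` returns sorted values, the first entry for each level is the alphabetically first one, regardless of the order in which columns appear in the table.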
 

◆ _get_data_columnLevels()

lsst.pipe.tasks.functors.Functor._get_data_columnLevels	(	self,
		data,
		columnIndex = None )

protected

Gets the names of the column index levels.

This should only be called in the context of a multilevel table.

Parameters
----------
data : various
    The data to be read, can be a
    `~lsst.daf.butler.DeferredDatasetHandle` or
    `~lsst.pipe.base.InMemoryDatasetHandle`.
columnIndex (optional): pandas `~pandas.Index` object
    If not passed, then it is read from the
    `~lsst.daf.butler.DeferredDatasetHandle`
    or `~lsst.pipe.base.InMemoryDatasetHandle`.

Definition at line 184 of file functors.py.

    def _get_data_columnLevels(self, data, columnIndex=None):
        """Gets the names of the column index levels.
 
        This should only be called in the context of a multilevel table.
 
        Parameters
        ----------
        data : various
            The data to be read, can be a
            `~lsst.daf.butler.DeferredDatasetHandle` or
            `~lsst.pipe.base.InMemoryDatasetHandle`.
        columnIndex (optional): pandas `~pandas.Index` object
            If not passed, then it is read from the
            `~lsst.daf.butler.DeferredDatasetHandle`
            or `~lsst.pipe.base.InMemoryDatasetHandle`.
        """
        if columnIndex is None:
            columnIndex = data.get(component="columns")
        return columnIndex.names
 

◆ _setLevels()

lsst.pipe.tasks.functors.Functor._setLevels	(		self,
			df )

protected

Definition at line 340 of file functors.py.

    def _setLevels(self, df):
        levelsToDrop = [n for n in df.columns.names if n not in self._dfLevels]
        df.columns = df.columns.droplevel(levelsToDrop)
        return df
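
The effect of the level dropping can be checked on a small table. This sketch takes the kept levels as an argument and works on a copy, whereas the method above reads `self._dfLevels` and modifies the DataFrame in place; `set_levels` and the column names are hypothetical.

```python
import pandas as pd

columns = pd.MultiIndex.from_tuples(
    [("g", "meas", "flux"), ("r", "meas", "flux")],
    names=("band", "dataset", "column"),
)
df = pd.DataFrame([[1.0, 2.0], [3.0, 4.0]], columns=columns)


def set_levels(df, df_levels):
    """Drop every column level whose name is not in df_levels (on a copy)."""
    levels_to_drop = [n for n in df.columns.names if n not in df_levels]
    out = df.copy()
    out.columns = out.columns.droplevel(levels_to_drop)
    return out


# Keeping two levels leaves a MultiIndex of (band, column).
by_band = set_levels(df, ("band", "column"))

# Keeping only 'column' on a single-band selection leaves a flat index.
one_band = df.loc[:, [("g", "meas", "flux")]]
flat = set_levels(one_band, ("column",))
```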
 

◆ columns()

lsst.pipe.tasks.functors.Functor.columns ( self )

Columns required to perform calculation.

Definition at line 178 of file functors.py.

    def columns(self):
        """Columns required to perform calculation."""
        if not hasattr(self, '_columns'):
            raise NotImplementedError('Must define columns property or _columns attribute')
        return self._columns
 

◆ difference()

lsst.pipe.tasks.functors.Functor.difference	(		self,
			data1,
			data2,
		**	kwargs )

Computes difference between functor called on two different
DataFrame/Handle objects.

Definition at line 360 of file functors.py.

    def difference(self, data1, data2, **kwargs):
        """Computes difference between functor called on two different
        DataFrame/Handle objects.
        """
        return self(data1, **kwargs) - self(data2, **kwargs)
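
Assuming both calls return a `~pandas.Series`, the subtraction aligns on the index rather than on position. A short sketch with invented values:

```python
import pandas as pd

# Stand-ins for the results of calling the same functor on two tables.
mag1 = pd.Series([20.0, 21.0, 22.0], index=[1, 2, 3])
mag2 = pd.Series([19.5, 21.5], index=[2, 3])

# Rows are matched by index label; labels present in only one input give NaN.
diff = mag1 - mag2
```

Any `dropna` keyword passed through `**kwargs` is applied to each call separately, so NaN values produced by the alignment itself remain in the difference.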
 

◆ fail()

lsst.pipe.tasks.functors.Functor.fail	(		self,
			df )

Definition at line 366 of file functors.py.

    def fail(self, df):
        return pd.Series(np.full(len(df), np.nan), index=df.index)
 

◆ multilevelColumns()

lsst.pipe.tasks.functors.Functor.multilevelColumns	(	self,
		data,
		columnIndex = None,
		returnTuple = False )

Returns columns needed by functor from multilevel dataset.

To access tables with multilevel column structure, the
`~lsst.daf.butler.DeferredDatasetHandle` or
`~lsst.pipe.base.InMemoryDatasetHandle` needs to be passed
either a list of tuples or a dictionary.

Parameters
----------
data : various
    The data as either `~lsst.daf.butler.DeferredDatasetHandle`, or
    `~lsst.pipe.base.InMemoryDatasetHandle`.
columnIndex (optional): pandas `~pandas.Index` object
    Either passed or read in from
    `~lsst.daf.butler.DeferredDatasetHandle`.
`returnTuple` : `bool`
    If true, then return a list of tuples rather than the column
    dictionary specification.
    This is set to `True` by `CompositeFunctor` in order to be able to
    combine columns from the various component functors.

Reimplemented in lsst.pipe.tasks.functors.CompositeFunctor, and lsst.pipe.tasks.functors.Color.

Definition at line 237 of file functors.py.

    def multilevelColumns(self, data, columnIndex=None, returnTuple=False):
        """Returns columns needed by functor from multilevel dataset.
 
        To access tables with multilevel column structure, the
        `~lsst.daf.butler.DeferredDatasetHandle` or
        `~lsst.pipe.base.InMemoryDatasetHandle` needs to be passed
        either a list of tuples or a dictionary.
 
        Parameters
        ----------
        data : various
            The data as either `~lsst.daf.butler.DeferredDatasetHandle`, or
            `~lsst.pipe.base.InMemoryDatasetHandle`.
        columnIndex (optional): pandas `~pandas.Index` object
            Either passed or read in from
            `~lsst.daf.butler.DeferredDatasetHandle`.
        `returnTuple` : `bool`
            If true, then return a list of tuples rather than the column
            dictionary specification.
            This is set to `True` by `CompositeFunctor` in order to be able to
            combine columns from the various component functors.
 
        """
        if not isinstance(data, (DeferredDatasetHandle, InMemoryDatasetHandle)):
            raise RuntimeError(f"Unexpected data type. Got {get_full_type_name(data)}.")
 
        if columnIndex is None:
            columnIndex = data.get(component="columns")
 
        # Confirm that the dataset has the column levels the functor is
        # expecting it to have.
        columnLevels = self._get_data_columnLevels(data, columnIndex)
 
        columnDict = {'column': self.columns,
                      'dataset': self.dataset}
        if self.filt is None:
            columnLevelNames = self._get_data_columnLevelNames(data, columnIndex)
            if "band" in columnLevels:
                if self.dataset == "ref":
                    columnDict["band"] = columnLevelNames["band"][0]
                else:
                    raise ValueError(f"'filt' not set for functor {self.name}"
                                     f"(dataset {self.dataset}) "
                                     "and DataFrame "
                                     "contains multiple filters in column index. "
                                     "Set 'filt' or set 'dataset' to 'ref'.")
        else:
            columnDict['band'] = self.filt
 
        if returnTuple:
            return self._colsFromDict(columnDict, columnIndex=columnIndex)
        else:
            return columnDict
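
The band-selection rules can be isolated from the handle machinery. In the standard-library sketch below, `column_dict` is a hypothetical free function and `bands_in_index` stands in for the sorted band names read from the column index (or `None` when the index has no ``band`` level).

```python
def column_dict(columns, dataset, filt, bands_in_index):
    """Build the column-dictionary specification for a multilevel table."""
    spec = {"column": columns, "dataset": dataset}
    if filt is not None:
        # An explicit band always wins.
        spec["band"] = filt
    elif bands_in_index is not None:
        if dataset == "ref":
            # Reference columns: fall back to the first band in the index.
            spec["band"] = bands_in_index[0]
        else:
            raise ValueError("'filt' not set; set 'filt' or set 'dataset' to 'ref'.")
    return spec


ref_spec = column_dict(("coord_ra",), "ref", None, ["g", "r"])
meas_spec = column_dict(("flux",), "meas", "r", ["g", "r"])
no_band_spec = column_dict(("flux",), "meas", None, None)

try:
    column_dict(("flux",), "meas", None, ["g", "r"])
    raised = False
except ValueError:
    raised = True
```

The error is raised only when a ``band`` level is present, no band was given, and the dataset is not ``'ref'``; a table without a ``band`` level never needs `filt`.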
 

◆ name()

lsst.pipe.tasks.functors.Functor.name ( self )

Full name of functor (suitable for figure labels).

Definition at line 370 of file functors.py.

    def name(self):
        """Full name of functor (suitable for figure labels)."""
        return NotImplementedError
 

◆ noDup()

lsst.pipe.tasks.functors.Functor.noDup ( self )

Do not explode by band if used on object table.

Definition at line 170 of file functors.py.

    def noDup(self):
        """Do not explode by band if used on object table."""
        if self._noDup is not None:
            return self._noDup
        else:
            return self._defaultNoDup
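
The lookup order (instance value first, class default second) can be sketched as follows. The class names are hypothetical, and exposing `noDup` as a property is an assumption, since the listing above does not show decorators.

```python
class Base:
    _defaultNoDup = False            # class-level default

    def __init__(self, noDup=None):
        self._noDup = noDup          # None means "use the class default"

    @property
    def noDup(self):
        return self._noDup if self._noDup is not None else self._defaultNoDup


class AlwaysNoDup(Base):
    _defaultNoDup = True             # a subclass can change the default


default_value = Base().noDup
subclass_value = AlwaysNoDup().noDup
override_value = AlwaysNoDup(noDup=False).noDup
```

Using `None` as the sentinel lets a caller pass an explicit `False` to override a subclass whose default is `True`.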
 

◆ shortname()

lsst.pipe.tasks.functors.Functor.shortname ( self )

Short name of functor (suitable for column name/dict key).

Reimplemented in lsst.pipe.tasks.functors.MagDiff, and lsst.pipe.tasks.functors.Color.

Definition at line 375 of file functors.py.

    def shortname(self):
        """Short name of functor (suitable for column name/dict key)."""
        return self.name
 
 

Member Data Documentation

◆ _defaultDataset

str lsst.pipe.tasks.functors.Functor._defaultDataset = 'ref'

staticprotected

Definition at line 159 of file functors.py.

◆ _defaultNoDup

bool lsst.pipe.tasks.functors.Functor._defaultNoDup = False

staticprotected

Definition at line 161 of file functors.py.

◆ _dfLevels

tuple lsst.pipe.tasks.functors.Functor._dfLevels = ('column',)

staticprotected

Definition at line 160 of file functors.py.

◆ _noDup

lsst.pipe.tasks.functors.Functor._noDup

protected

Definition at line 166 of file functors.py.

◆ dataset

lsst.pipe.tasks.functors.Functor.dataset

Definition at line 165 of file functors.py.

◆ filt

lsst.pipe.tasks.functors.Functor.filt

Reimplemented in lsst.pipe.tasks.functors.CompositeFunctor, and lsst.pipe.tasks.functors.Color.

Definition at line 164 of file functors.py.

◆ log

lsst.pipe.tasks.functors.Functor.log

Definition at line 167 of file functors.py.

◆ name

lsst.pipe.tasks.functors.Functor.name

Definition at line 353 of file functors.py.

The documentation for this class was generated from the following file:

/j/snowflake/release/lsstsw/stack/lsst-scipipe-8.0.0/Linux64/pipe_tasks/ge6cb8fbbf7+d119aed356/python/lsst/pipe/tasks/functors.py
