Inheritance diagram for lsst.pipe.tasks.functors.Functor:

Public Member Functions
def	__init__ (self, filt=None, dataset=None, noDup=None)

def	noDup (self)

def	columns (self)

def	multilevelColumns (self, data, columnIndex=None, returnTuple=False)

def	__call__ (self, data, dropna=False)

def	difference (self, data1, data2, **kwargs)

def	fail (self, df)

def	name (self)

def	shortname (self)

Public Attributes
	filt

	dataset

Detailed Description

Define and execute a calculation on a `ParquetTable`.

The `__call__` method accepts either a `ParquetTable` object or a
`DeferredDatasetHandle`, and returns the
result of the calculation as a single column.  Each functor defines what
columns are needed for the calculation, and only these columns are read
from the `ParquetTable`.

The action of `__call__` consists of two steps: first, loading the
necessary columns from disk into memory as a `pandas.DataFrame` object;
and second, performing the computation on this dataframe and returning the
result.


To define a new `Functor`, a subclass must define a `_func` method
that takes a `pandas.DataFrame` and returns the result as a `pandas.Series`.
In addition, it must define the following attributes:

* `_columns`: The columns necessary to perform the calculation
* `name`: A name appropriate for a figure axis label
* `shortname`: A name appropriate for use as a dictionary key
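
The requirements listed above can be sketched with a small stand-in. This is not the LSST implementation: `FunctorSketch` is a hypothetical base class that only mimics the select-columns and compute steps on an in-memory `pandas.DataFrame`, and `FluxRatio` and its column names are invented for illustration.

```python
import pandas as pd


class FunctorSketch:
    """Hypothetical stand-in for the base class; not the LSST implementation."""

    @property
    def columns(self):
        # Mirrors the documented behaviour: subclasses must supply `_columns`.
        if not hasattr(self, "_columns"):
            raise NotImplementedError("Must define columns property or _columns attribute")
        return self._columns

    def __call__(self, df):
        # Step 1 (stand-in for reading from disk): keep only the needed columns.
        subset = df[list(self.columns)]
        # Step 2: run the calculation and return a single column.
        return self._func(subset)


class FluxRatio(FunctorSketch):
    """Illustrative functor; the column names are assumptions."""

    _columns = ("psfFlux", "modelFlux")
    name = "PSF flux / model flux"   # suitable for a figure axis label
    shortname = "fluxRatio"          # suitable for a dictionary key

    def _func(self, df):
        return df["psfFlux"] / df["modelFlux"]


table = pd.DataFrame({
    "psfFlux": [10.0, 20.0],
    "modelFlux": [5.0, 40.0],
    "unused": [1, 2],  # never selected, because it is not in `_columns`
})
ratio = FluxRatio()(table)
```

In this sketch `ratio` is a `pandas.Series` with one value per input row; the `unused` column never reaches `_func`.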

On initialization, a `Functor` should declare what band (`filt` kwarg)
and dataset (e.g. `'ref'`, `'meas'`, `'forced_src'`) it is intended to be
applied to. This enables the `_get_data` method to extract the proper
columns from the parquet file. If not specified, the dataset will fall back
on the `_defaultDataset` attribute. If band is not specified and `dataset`
is anything other than `'ref'`, then an error will be raised when trying to
perform the calculation.
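
The dataset fallback can be illustrated in isolation. The class below is a hypothetical stand-in that copies only the constructor logic; the value `"ref"` for `_defaultDataset` is an assumption made for the example.

```python
class DatasetDefaultSketch:
    """Hypothetical stand-in that mirrors only the constructor logic."""

    # Assumed default for illustration; the real class attribute may differ.
    _defaultDataset = "ref"

    def __init__(self, filt=None, dataset=None):
        self.filt = filt
        # Fall back on the class-level default when no dataset is given.
        self.dataset = dataset if dataset is not None else self._defaultDataset


implicit = DatasetDefaultSketch()
explicit = DatasetDefaultSketch(filt="r", dataset="meas")
```

Here `implicit.dataset` takes the class default, while `explicit.dataset` keeps the value passed by the caller.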

Originally, `Functor` was set up to expect
datasets formatted like the `deepCoadd_obj` dataset; that is, a
dataframe with a multi-level column index, with the levels of the
column index being `band`, `dataset`, and `column`.
It has since been generalized to apply to dataframes without multi-level
indices and to multi-level indices with just `dataset` and `column` levels.
In addition, the `_get_data` method that reads
the dataframe from the `ParquetTable` will return a dataframe with column
index levels defined by the `_dfLevels` attribute; by default, this is
`column`.

The `_dfLevels` attribute should generally not need to
be changed, unless `_func` needs columns from multiple filters or datasets
to do the calculation.
An example of this is the `lsst.pipe.tasks.functors.Color` functor, for
which `_dfLevels = ('band', 'column')`, and `_func` expects the dataframe
it gets to have those levels in the column index.
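
As a rough illustration of what a dataframe with `('band', 'column')` levels looks like, the sketch below builds one by hand and takes a difference across bands. The helper `color_func`, the band names, and the `mag` column are assumptions for the example and are not taken from the `Color` functor itself.

```python
import pandas as pd

# Column index with the two levels that a Color-like `_func` would expect.
columns = pd.MultiIndex.from_product(
    [["g", "r"], ["mag"]], names=["band", "column"]
)
df = pd.DataFrame([[21.0, 20.5], [19.0, 19.25]], columns=columns)


def color_func(df, band1="g", band2="r", column="mag"):
    """Hypothetical `_func`-style helper: difference of one column across bands."""
    # Selecting on the first level yields a frame indexed by `column` only.
    return df[band1][column] - df[band2][column]


color = color_func(df)
```

A functor whose `_func` only needs a single band would instead receive a frame with just the `column` level, which is why the default for `_dfLevels` is usually sufficient.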

Parameters
----------
filt : str
    Filter upon which to do the calculation

dataset : str
    Dataset upon which to do the calculation
    (e.g., 'ref', 'meas', 'forced_src').

Definition at line 78 of file functors.py.

Constructor & Destructor Documentation

◆ __init__()

def lsst.pipe.tasks.functors.Functor.__init__	(	self,
		filt = `None`,
		dataset = `None`,
		noDup = `None`
	)

Definition at line 142 of file functors.py.

     def __init__(self, filt=None, dataset=None, noDup=None):
         self.filt = filt
         self.dataset = dataset if dataset is not None else self._defaultDataset
         self._noDup = noDup
  

Member Function Documentation

◆ __call__()

def lsst.pipe.tasks.functors.Functor.__call__	(	self,
		data,
		dropna = `False`
	)

Definition at line 340 of file functors.py.

     def __call__(self, data, dropna=False):
         try:
             df = self._get_data(data)
             vals = self._func(df)
         except Exception:
             vals = self.fail(df)
         if dropna:
             vals = self._dropna(vals)
  
         return vals
  

◆ columns()

def lsst.pipe.tasks.functors.Functor.columns ( self )

Columns required to perform calculation

Definition at line 155 of file functors.py.

     def columns(self):
         """Columns required to perform calculation
         """
         if not hasattr(self, '_columns'):
             raise NotImplementedError('Must define columns property or _columns attribute')
         return self._columns
  

◆ difference()

def lsst.pipe.tasks.functors.Functor.difference	(		self,
			data1,
			data2,
		**	kwargs
	)

Computes difference between functor called on two different ParquetTable objects

Definition at line 351 of file functors.py.

     def difference(self, data1, data2, **kwargs):
         """Computes difference between functor called on two different ParquetTable objects
         """
         return self(data1, **kwargs) - self(data2, **kwargs)
  

◆ fail()

def lsst.pipe.tasks.functors.Functor.fail	(	self,
		df
	)

Definition at line 356 of file functors.py.

     def fail(self, df):
         return pd.Series(np.full(len(df), np.nan), index=df.index)
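
To make the fallback value concrete, the sketch below applies the same expression as `fail` to a small, made-up dataframe. The column name and index values are assumptions for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"flux": [1.0, 2.0, 3.0]}, index=[10, 11, 12])

# Same construction as `fail`: one NaN per row, sharing the input's index.
failed = pd.Series(np.full(len(df), np.nan), index=df.index)

# Because the index is preserved, the result still aligns row by row
# with other columns derived from the same table.
combined = df["flux"] - failed
```

Keeping the input's index means a failed calculation can still be placed alongside successful ones without reindexing.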
  

◆ multilevelColumns()

def lsst.pipe.tasks.functors.Functor.multilevelColumns	(	self,
		data,
		columnIndex = `None`,
		returnTuple = `False`
	)

Returns columns needed by functor from multilevel dataset

To access tables with multilevel column structure, the `MultilevelParquetTable`
or `DeferredDatasetHandle` need to be passed either a list of tuples or a
dictionary.

Parameters
----------
data : `MultilevelParquetTable` or `DeferredDatasetHandle`

columnIndex : `pandas.Index`, optional
    Either passed or read in from `DeferredDatasetHandle`.

returnTuple : bool
    If true, then return a list of tuples rather than the column dictionary
    specification.  This is set to `True` by `CompositeFunctor` in order to be able to
    combine columns from the various component functors.

Definition at line 229 of file functors.py.

     def multilevelColumns(self, data, columnIndex=None, returnTuple=False):
         """Returns columns needed by functor from multilevel dataset
  
         To access tables with multilevel column structure, the `MultilevelParquetTable`
         or `DeferredDatasetHandle` need to be passed either a list of tuples or a
         dictionary.
  
         Parameters
         ----------
         data : `MultilevelParquetTable` or `DeferredDatasetHandle`
  
         columnIndex (optional): pandas `Index` object
             either passed or read in from `DeferredDatasetHandle`.
  
         `returnTuple` : bool
             If true, then return a list of tuples rather than the column dictionary
             specification.  This is set to `True` by `CompositeFunctor` in order to be able to
             combine columns from the various component functors.
  
         """
         if isinstance(data, DeferredDatasetHandle) and columnIndex is None:
             columnIndex = data.get(component="columns")
  
         # Confirm that the dataset has the column levels the functor is expecting it to have.
         columnLevels = self._get_data_columnLevels(data, columnIndex)
  
         columnDict = {'column': self.columns,
                       'dataset': self.dataset}
         if self.filt is None:
             columnLevelNames = self._get_data_columnLevelNames(data, columnIndex)
             if "band" in columnLevels:
                 if self.dataset == "ref":
                     columnDict["band"] = columnLevelNames["band"][0]
                 else:
                     raise ValueError(f"'filt' not set for functor {self.name}"
                                      f"(dataset {self.dataset}) "
                                      "and ParquetTable "
                                      "contains multiple filters in column index. "
                                      "Set 'filt' or set 'dataset' to 'ref'.")
         else:
             columnDict['band'] = self.filt
  
         if isinstance(data, MultilevelParquetTable):
             return data._colsFromDict(columnDict)
         elif isinstance(data, DeferredDatasetHandle):
             if returnTuple:
                 return self._colsFromDict(columnDict, columnIndex=columnIndex)
             else:
                 return columnDict
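
The branching that assembles the column dictionary can be followed more easily without the table classes. The function below is a hypothetical, simplified helper, not part of the library: `band_level` stands in for the band values found in the table's column index, and the error message is shortened.

```python
def build_column_dict(columns, dataset, filt=None, band_level=None):
    """Hypothetical helper mirroring the branching in `multilevelColumns`.

    `band_level` stands in for the band values found in the table's column
    index; pass None when the index has no "band" level.
    """
    column_dict = {"column": columns, "dataset": dataset}
    if filt is not None:
        # An explicit band always wins.
        column_dict["band"] = filt
    elif band_level is not None:
        if dataset == "ref":
            # No band requested for the reference dataset: use the first one.
            column_dict["band"] = band_level[0]
        else:
            raise ValueError("'filt' not set; set 'filt' or set 'dataset' to 'ref'.")
    return column_dict


ref_spec = build_column_dict(["coord_ra"], "ref", band_level=["g", "r"])
meas_spec = build_column_dict(["flux"], "meas", filt="r", band_level=["g", "r"])
```

Calling the helper with `dataset="meas"`, no `filt`, and a non-empty `band_level` raises `ValueError`, matching the error path in the listing above.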
  

◆ name()

def lsst.pipe.tasks.functors.Functor.name ( self )

Full name of functor (suitable for figure labels)

Definition at line 360 of file functors.py.

     def name(self):
         """Full name of functor (suitable for figure labels)
         """
         return NotImplementedError
  

◆ noDup()

def lsst.pipe.tasks.functors.Functor.noDup ( self )

Definition at line 148 of file functors.py.

     def noDup(self):
         if self._noDup is not None:
             return self._noDup
         else:
             return self._defaultNoDup
  

◆ shortname()

def lsst.pipe.tasks.functors.Functor.shortname ( self )

Short name of functor (suitable for column name/dict key)

Reimplemented in lsst.pipe.tasks.functors.Color, and lsst.pipe.tasks.functors.MagDiff.

Definition at line 366 of file functors.py.

     def shortname(self):
         """Short name of functor (suitable for column name/dict key)
         """
         return self.name
  
  

Member Data Documentation

◆ dataset

lsst.pipe.tasks.functors.Functor.dataset

Definition at line 144 of file functors.py.

◆ filt

lsst.pipe.tasks.functors.Functor.filt

Definition at line 143 of file functors.py.

The documentation for this class was generated from the following file:

/j/snowflake/release/lsstsw/stack/lsst-scipipe-0.7.0/Linux64/pipe_tasks/21.0.0-147-g0e635eb1+1acddb5be5/python/lsst/pipe/tasks/functors.py
