LSST Applications g0265f82a02+093ff98f47,g02d81e74bb+10352d6f39,g1f3e9fa97e+40b0fc677d,g2079a07aa2+14824f138e,g2bbee38e9b+093ff98f47,g337abbeb29+093ff98f47,g3ddfee87b4+064c02c7ee,g487adcacf7+7e842ebf4b,g50ff169b8f+5929b3527e,g52b1c1532d+a6fc98d2e7,g568d43a26c+1d7ac31cb0,g591dd9f2cf+fb1f69e2ea,g858d7b2824+10352d6f39,g8a8a8dda67+a6fc98d2e7,g8cdfe0ae6a+66d966b544,g99cad8db69+7ce8a7c20a,g9ddcbc5298+d4bad12328,ga1e77700b3+246acaaf9c,ga2e4dd1c03+064c02c7ee,ga8c6da7877+04f6ba86dc,gae46bcf261+093ff98f47,gb0e22166c9+3863383f4c,gba4ed39666+9664299f35,gbb8dafda3b+db40f59a7d,gbeb006f7da+e6a448e96d,gbf5cecdb8a+10352d6f39,gc0f3af6251+10a3fd39cd,gc120e1dc64+5839e50a77,gc28159a63d+093ff98f47,gcf0d15dbbd+064c02c7ee,gd2a12a3803+0c2c227a2b,gdaeeff99f8+a38ce5ea23,ge79ae78c31+093ff98f47,gee10cc3b42+a6fc98d2e7,gf1cff7945b+10352d6f39,w.2024.15
LSST Data Management Base Package
lsst.pipe.tasks.functors.Functor Class Reference
Inheritance diagram for lsst.pipe.tasks.functors.Functor:
lsst.pipe.tasks.functors.Color
lsst.pipe.tasks.functors.Column
lsst.pipe.tasks.functors.CompositeFunctor
lsst.pipe.tasks.functors.CustomFunctor
lsst.pipe.tasks.functors.DeconvolvedMoments
lsst.pipe.tasks.functors.E1
lsst.pipe.tasks.functors.E2
lsst.pipe.tasks.functors.Ebv
lsst.pipe.tasks.functors.HsmFwhm
lsst.pipe.tasks.functors.HsmTraceSize
lsst.pipe.tasks.functors.HtmIndex20
lsst.pipe.tasks.functors.Index
lsst.pipe.tasks.functors.LocalPhotometry
lsst.pipe.tasks.functors.LocalWcs
lsst.pipe.tasks.functors.Mag
lsst.pipe.tasks.functors.MagDiff
lsst.pipe.tasks.functors.Photometry
lsst.pipe.tasks.functors.PsfHsmTraceSizeDiff
lsst.pipe.tasks.functors.PsfSdssTraceSizeDiff
lsst.pipe.tasks.functors.RadiusFromQuadrupole
lsst.pipe.tasks.functors.ReferenceBand
lsst.pipe.tasks.functors.SdssTraceSize

Public Member Functions

 __init__ (self, filt=None, dataset=None, noDup=None)
 
 noDup (self)
 
 columns (self)
 
 multilevelColumns (self, data, columnIndex=None, returnTuple=False)
 
 __call__ (self, data, dropna=False)
 
 difference (self, data1, data2, **kwargs)
 
 fail (self, df)
 
 name (self)
 
 shortname (self)
 

Public Attributes

 filt
 
 dataset
 
 log
 
 name
 

Protected Member Functions

 _get_data_columnLevels (self, data, columnIndex=None)
 
 _get_data_columnLevelNames (self, data, columnIndex=None)
 
 _colsFromDict (self, colDict, columnIndex=None)
 
 _func (self, df, dropna=True)
 
 _get_columnIndex (self, data)
 
 _get_data (self, data)
 
 _setLevels (self, df)
 
 _dropna (self, vals)
 

Protected Attributes

 _noDup
 

Static Protected Attributes

str _defaultDataset = 'ref'
 
tuple _dfLevels = ('column',)
 
bool _defaultNoDup = False
 

Detailed Description

Define and execute a calculation on a DataFrame or Handle holding a
DataFrame.

The `__call__` method accepts either a `~pandas.DataFrame` object or a
`~lsst.daf.butler.DeferredDatasetHandle` or
`~lsst.pipe.base.InMemoryDatasetHandle`, and returns the
result of the calculation as a single column.
Each functor defines what columns are needed for the calculation, and only
these columns are read from the dataset handle.

The action of `__call__` consists of two steps: first, loading the
necessary columns from disk into memory as a `~pandas.DataFrame` object;
and second, performing the computation on this DataFrame and returning the
result.

To define a new `Functor`, a subclass must define a `_func` method
that takes a `~pandas.DataFrame` and returns the result as a `~pandas.Series`.
In addition, it must define the following attributes:

* `_columns`: The columns necessary to perform the calculation
* `name`: A name appropriate for a figure axis label
* `shortname`: A name appropriate for use as a dictionary key
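
As a minimal sketch of these requirements, the block below uses a stand-in base class (`MiniFunctor`, hypothetical and far simpler than the real `Functor`) and a subclass that supplies `_func`, `_columns`, `name`, and `shortname`. The column names `a` and `b` are illustrative assumptions, not columns from any LSST schema, and only flat (single-level) DataFrames are handled.

```python
import pandas as pd


class MiniFunctor:
    """Hypothetical stand-in for Functor; handles flat DataFrames only."""

    _columns = ()

    @property
    def columns(self):
        return list(self._columns)

    def __call__(self, df):
        # Step 1: keep only the columns the functor declares.
        subset = df[self.columns]
        # Step 2: run the calculation and return a single column.
        return self._func(subset)


class SumAB(MiniFunctor):
    """Illustrative subclass: adds two assumed columns, 'a' and 'b'."""

    _columns = ("a", "b")
    name = "a + b"       # suitable for a figure axis label
    shortname = "sumAB"  # suitable for a dictionary key

    def _func(self, df):
        return df["a"] + df["b"]


df = pd.DataFrame({"a": [1.0, 2.0], "b": [10.0, 20.0], "unused": [0, 0]})
result = SumAB()(df)
```

Here `result` is a `pandas.Series` with one value per input row; the `unused` column never reaches `_func`.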

On initialization, a `Functor` should declare what band (``filt`` kwarg)
and dataset (e.g. ``'ref'``, ``'meas'``, ``'forced_src'``) it is intended
to be applied to.
This enables the `_get_data` method to extract the proper columns from the
underlying data.
If not specified, the dataset will fall back on the `_defaultDataset`
attribute.
If band is not specified and ``dataset`` is anything other than ``'ref'``,
then an error will be raised when trying to perform the calculation.
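
The band-selection rule can be sketched in isolation. The helper below (`resolve_band`, a hypothetical name) follows the branching shown in the `multilevelColumns` listing on this page: an explicit `filt` wins, `'ref'` falls back to the first available band, and any other dataset without a band raises `ValueError`.

```python
def resolve_band(filt, dataset, column_levels, available_bands):
    """Hypothetical helper mirroring the band-selection rules above."""
    if filt is not None:
        return filt
    if "band" not in column_levels:
        return None  # no band level in the column index to select on
    if dataset == "ref":
        # As in multilevelColumns: take the first available band.
        return available_bands[0]
    raise ValueError("'filt' not set and dataset is not 'ref'")


levels = ("band", "dataset", "column")
picked = resolve_band(None, "ref", levels, ["g", "r"])
```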

Originally, `Functor` was set up to expect datasets formatted like the
``deepCoadd_obj`` dataset; that is, a DataFrame with a multi-level column
index, with the levels of the column index being ``band``, ``dataset``, and
``column``.
It has since been generalized to apply to DataFrames without multi-level
indices and multi-level indices with just ``dataset`` and ``column``
levels.
In addition, the `_get_data` method that reads the columns from the
underlying data will return a DataFrame with column index levels defined by
the `_dfLevels` attribute; by default, this is ``column``.

The `_dfLevels` attribute should generally not need to be changed, unless
`_func` needs columns from multiple filters or datasets to do the
calculation.
An example of this is the `~lsst.pipe.tasks.functors.Color` functor, for
which `_dfLevels = ('band', 'column')`, and `_func` expects the DataFrame
it gets to have those levels in the column index.
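
To illustrate what the choice of `_dfLevels` means for the DataFrame that `_func` receives, the sketch below applies the same level-dropping idea as `_setLevels` to a small made-up table. The band, dataset, and column values are assumptions for illustration, and `set_levels` is a hypothetical free function that copies its input rather than modifying it in place.

```python
import pandas as pd

columns = pd.MultiIndex.from_tuples(
    [("g", "meas", "flux"), ("r", "meas", "flux")],
    names=("band", "dataset", "column"),
)
df = pd.DataFrame([[1.0, 2.0], [3.0, 4.0]], columns=columns)


def set_levels(df, df_levels):
    """Drop every column-index level not named in df_levels."""
    levels_to_drop = [n for n in df.columns.names if n not in df_levels]
    out = df.copy()
    out.columns = out.columns.droplevel(levels_to_drop)
    return out


# A Color-like functor keeps both 'band' and 'column'.
by_band = set_levels(df, ("band", "column"))
# The default ('column',) leaves a flat index, so select one band first.
flat = set_levels(df.loc[:, [("g", "meas", "flux")]], ("column",))
```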

Parameters
----------
filt : str
    Band upon which to do the calculation.

dataset : str
    Dataset upon which to do the calculation (e.g., 'ref', 'meas',
    'forced_src').

Definition at line 94 of file functors.py.

Constructor & Destructor Documentation

◆ __init__()

lsst.pipe.tasks.functors.Functor.__init__ ( self,
filt = None,
dataset = None,
noDup = None )

Member Function Documentation

◆ __call__()

lsst.pipe.tasks.functors.Functor.__call__ ( self,
data,
dropna = False )

Reimplemented in lsst.pipe.tasks.functors.RAColumn, lsst.pipe.tasks.functors.DecColumn, and lsst.pipe.tasks.functors.CompositeFunctor.

Definition at line 345 of file functors.py.

345     def __call__(self, data, dropna=False):
346         df = self._get_data(data)
347         try:
348             vals = self._func(df)
349         except Exception as e:
350             self.log.error("Exception in %s call: %s: %s", self.name, type(e).__name__, e)
351             vals = self.fail(df)
352         if dropna:
353             vals = self._dropna(vals)
354
355         return vals
356
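
The error handling in the listing above means a failing calculation yields a column of NaN values instead of propagating the exception. A minimal sketch of that pattern, using a hypothetical `safe_apply` helper and made-up data in place of a real functor (logging is omitted):

```python
import numpy as np
import pandas as pd


def safe_apply(func, df):
    """Hypothetical helper: apply func, or return NaNs if it raises."""
    try:
        return func(df)
    except Exception:
        # Same shape of fallback as Functor.fail: one NaN per input row.
        return pd.Series(np.full(len(df), np.nan), index=df.index)


df = pd.DataFrame({"flux": [1.0, 2.0, 3.0]}, index=[10, 11, 12])
ok = safe_apply(lambda d: d["flux"] * 2, df)
bad = safe_apply(lambda d: d["missing"], df)  # KeyError becomes a NaN column
```

Because the fallback reuses the input index, the failed result still aligns row by row with the original DataFrame.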

◆ _colsFromDict()

lsst.pipe.tasks.functors.Functor._colsFromDict ( self,
colDict,
columnIndex = None )
protected
Converts dictionary column specification to a list of columns.

Definition at line 215 of file functors.py.

215     def _colsFromDict(self, colDict, columnIndex=None):
216         """Converts dictionary column specification to a list of columns."""
217         new_colDict = {}
218         columnLevels = self._get_data_columnLevels(None, columnIndex=columnIndex)
219
220         for i, lev in enumerate(columnLevels):
221             if lev in colDict:
222                 if isinstance(colDict[lev], str):
223                     new_colDict[lev] = [colDict[lev]]
224                 else:
225                     new_colDict[lev] = colDict[lev]
226             else:
227                 new_colDict[lev] = columnIndex.levels[i]
228
229         levelCols = [new_colDict[lev] for lev in columnLevels]
230         cols = list(product(*levelCols))
231         colsAvailable = [col for col in cols if col in columnIndex]
232         return colsAvailable
233
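
The expansion performed here is a Cartesian product over the per-level values, filtered to the tuples that actually exist. The sketch below reproduces that idea with plain Python containers instead of a pandas index; the column tuples and the `cols_from_dict` name are assumptions made for illustration.

```python
from itertools import product

# Assumed example data: column tuples as (band, dataset, column).
available = {
    ("g", "meas", "flux"), ("g", "meas", "flag"),
    ("r", "meas", "flux"),
}
level_names = ("band", "dataset", "column")
# All values present at each level, in sorted order.
level_values = {
    name: sorted({col[i] for col in available})
    for i, name in enumerate(level_names)
}


def cols_from_dict(col_dict):
    """Expand a per-level specification into the tuples that exist."""
    per_level = []
    for name in level_names:
        # A level missing from the specification means "all values".
        value = col_dict.get(name, level_values[name])
        per_level.append([value] if isinstance(value, str) else list(value))
    return [col for col in product(*per_level) if col in available]


cols = cols_from_dict({"dataset": "meas", "column": ["flux", "flag"]})
```

No `band` is given, so both bands are tried; the combination `("r", "meas", "flag")` is dropped because it is not among the available columns.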

◆ _dropna()

lsst.pipe.tasks.functors.Functor._dropna ( self,
vals )
protected

Definition at line 342 of file functors.py.

342     def _dropna(self, vals):
343         return vals.dropna()
344

◆ _func()

lsst.pipe.tasks.functors.Functor._func ( self,
df,
dropna = True )
protected

◆ _get_columnIndex()

lsst.pipe.tasks.functors.Functor._get_columnIndex ( self,
data )
protected
Return columnIndex.

Definition at line 291 of file functors.py.

291     def _get_columnIndex(self, data):
292         """Return columnIndex."""
293
294         if isinstance(data, (DeferredDatasetHandle, InMemoryDatasetHandle)):
295             return data.get(component="columns")
296         else:
297             return None
298

◆ _get_data()

lsst.pipe.tasks.functors.Functor._get_data ( self,
data )
protected
Retrieve DataFrame necessary for calculation.

The data argument can be a `~pandas.DataFrame`, a
`~lsst.daf.butler.DeferredDatasetHandle`, or
an `~lsst.pipe.base.InMemoryDatasetHandle`.

Returns a DataFrame upon which `self._func` can act.

Definition at line 299 of file functors.py.

299     def _get_data(self, data):
300         """Retrieve DataFrame necessary for calculation.
301
302         The data argument can be a `~pandas.DataFrame`, a
303         `~lsst.daf.butler.DeferredDatasetHandle`, or
304         an `~lsst.pipe.base.InMemoryDatasetHandle`.
305
306         Returns a DataFrame upon which `self._func` can act.
307         """
308         # We wrap a DataFrame in a handle here to take advantage of the
309         # DataFrame delegate DataFrame column wrangling abilities.
310         if isinstance(data, pd.DataFrame):
311             _data = InMemoryDatasetHandle(data, storageClass="DataFrame")
312         elif isinstance(data, (DeferredDatasetHandle, InMemoryDatasetHandle)):
313             _data = data
314         else:
315             raise RuntimeError(f"Unexpected type provided for data. Got {get_full_type_name(data)}.")
316
317         # First thing to do: check to see if the data source has a multilevel
318         # column index or not.
319         columnIndex = self._get_columnIndex(_data)
320         is_multiLevel = isinstance(columnIndex, pd.MultiIndex)
321
322         # Get proper columns specification for this functor.
323         if is_multiLevel:
324             columns = self.multilevelColumns(_data, columnIndex=columnIndex)
325         else:
326             columns = self.columns
327
328         # Load in-memory DataFrame with appropriate columns the gen3 way.
329         df = _data.get(parameters={"columns": columns})
330
331         # Drop unnecessary column levels.
332         if is_multiLevel:
333             df = self._setLevels(df)
334
335         return df
336

◆ _get_data_columnLevelNames()

lsst.pipe.tasks.functors.Functor._get_data_columnLevelNames ( self,
data,
columnIndex = None )
protected
Gets the content of each of the column levels for a multilevel
table.

Definition at line 201 of file functors.py.

201     def _get_data_columnLevelNames(self, data, columnIndex=None):
202         """Gets the content of each of the column levels for a multilevel
203         table.
204         """
205         if columnIndex is None:
206             columnIndex = data.get(component="columns")
207
208         columnLevels = columnIndex.names
209         columnLevelNames = {
210             level: list(np.unique(np.array([c for c in columnIndex])[:, i]))
211             for i, level in enumerate(columnLevels)
212         }
213         return columnLevelNames
214
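
For a sense of what this mapping looks like, the sketch below builds the same kind of dictionary for a small made-up column index. It uses `MultiIndex.get_level_values` with a sorted set rather than the `np.unique` call in the listing, which is a substitution made for brevity; the index contents are assumptions.

```python
import pandas as pd

column_index = pd.MultiIndex.from_tuples(
    [("g", "meas", "flux"), ("r", "meas", "flux"), ("r", "ref", "id")],
    names=("band", "dataset", "column"),
)

# Map each level name to the sorted distinct values found at that level.
level_contents = {
    name: sorted(set(column_index.get_level_values(name)))
    for name in column_index.names
}
```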

◆ _get_data_columnLevels()

lsst.pipe.tasks.functors.Functor._get_data_columnLevels ( self,
data,
columnIndex = None )
protected
Gets the names of the column index levels.

This should only be called in the context of a multilevel table.

Parameters
----------
data : various
    The data to be read, can be a
    `~lsst.daf.butler.DeferredDatasetHandle` or
    `~lsst.pipe.base.InMemoryDatasetHandle`.
columnIndex (optional): pandas `~pandas.Index` object
    If not passed, then it is read from the
    `~lsst.daf.butler.DeferredDatasetHandle`
    or `~lsst.pipe.base.InMemoryDatasetHandle`.

Definition at line 181 of file functors.py.

181     def _get_data_columnLevels(self, data, columnIndex=None):
182         """Gets the names of the column index levels.
183
184         This should only be called in the context of a multilevel table.
185
186         Parameters
187         ----------
188         data : various
189             The data to be read, can be a
190             `~lsst.daf.butler.DeferredDatasetHandle` or
191             `~lsst.pipe.base.InMemoryDatasetHandle`.
192         columnIndex (optional): pandas `~pandas.Index` object
193             If not passed, then it is read from the
194             `~lsst.daf.butler.DeferredDatasetHandle`
195             or `~lsst.pipe.base.InMemoryDatasetHandle`.
196         """
197         if columnIndex is None:
198             columnIndex = data.get(component="columns")
199         return columnIndex.names
200

◆ _setLevels()

lsst.pipe.tasks.functors.Functor._setLevels ( self,
df )
protected

Definition at line 337 of file functors.py.

337     def _setLevels(self, df):
338         levelsToDrop = [n for n in df.columns.names if n not in self._dfLevels]
339         df.columns = df.columns.droplevel(levelsToDrop)
340         return df
341

◆ columns()

lsst.pipe.tasks.functors.Functor.columns ( self)

◆ difference()

lsst.pipe.tasks.functors.Functor.difference ( self,
data1,
data2,
** kwargs )
Computes difference between functor called on two different
DataFrame/Handle objects.

Definition at line 357 of file functors.py.

357     def difference(self, data1, data2, **kwargs):
358         """Computes difference between functor called on two different
359         DataFrame/Handle objects.
360         """
361         return self(data1, **kwargs) - self(data2, **kwargs)
362

◆ fail()

lsst.pipe.tasks.functors.Functor.fail ( self,
df )

Definition at line 363 of file functors.py.

363     def fail(self, df):
364         return pd.Series(np.full(len(df), np.nan), index=df.index)
365

◆ multilevelColumns()

lsst.pipe.tasks.functors.Functor.multilevelColumns ( self,
data,
columnIndex = None,
returnTuple = False )
Returns columns needed by functor from multilevel dataset.

To access tables with multilevel column structure, the
`~lsst.daf.butler.DeferredDatasetHandle` or
`~lsst.pipe.base.InMemoryDatasetHandle` needs to be passed
either a list of tuples or a dictionary.

Parameters
----------
data : various
    The data as either `~lsst.daf.butler.DeferredDatasetHandle`, or
    `~lsst.pipe.base.InMemoryDatasetHandle`.
columnIndex (optional): pandas `~pandas.Index` object
    Either passed or read in from
    `~lsst.daf.butler.DeferredDatasetHandle`.
`returnTuple` : `bool`
    If true, then return a list of tuples rather than the column
    dictionary specification.
    This is set to `True` by `CompositeFunctor` in order to be able to
    combine columns from the various component functors.

Reimplemented in lsst.pipe.tasks.functors.CompositeFunctor, and lsst.pipe.tasks.functors.Color.

Definition at line 234 of file functors.py.

234     def multilevelColumns(self, data, columnIndex=None, returnTuple=False):
235         """Returns columns needed by functor from multilevel dataset.
236
237         To access tables with multilevel column structure, the
238         `~lsst.daf.butler.DeferredDatasetHandle` or
239         `~lsst.pipe.base.InMemoryDatasetHandle` needs to be passed
240         either a list of tuples or a dictionary.
241
242         Parameters
243         ----------
244         data : various
245             The data as either `~lsst.daf.butler.DeferredDatasetHandle`, or
246             `~lsst.pipe.base.InMemoryDatasetHandle`.
247         columnIndex (optional): pandas `~pandas.Index` object
248             Either passed or read in from
249             `~lsst.daf.butler.DeferredDatasetHandle`.
250         `returnTuple` : `bool`
251             If true, then return a list of tuples rather than the column
252             dictionary specification.
253             This is set to `True` by `CompositeFunctor` in order to be able to
254             combine columns from the various component functors.
255
256         """
257         if not isinstance(data, (DeferredDatasetHandle, InMemoryDatasetHandle)):
258             raise RuntimeError(f"Unexpected data type. Got {get_full_type_name(data)}.")
259
260         if columnIndex is None:
261             columnIndex = data.get(component="columns")
262
263         # Confirm that the dataset has the column levels the functor is
264         # expecting it to have.
265         columnLevels = self._get_data_columnLevels(data, columnIndex)
266
267         columnDict = {'column': self.columns,
268                       'dataset': self.dataset}
269         if self.filt is None:
270             columnLevelNames = self._get_data_columnLevelNames(data, columnIndex)
271             if "band" in columnLevels:
272                 if self.dataset == "ref":
273                     columnDict["band"] = columnLevelNames["band"][0]
274                 else:
275                     raise ValueError(f"'filt' not set for functor {self.name} "
276                                      f"(dataset {self.dataset}) "
277                                      "and DataFrame "
278                                      "contains multiple filters in column index. "
279                                      "Set 'filt' or set 'dataset' to 'ref'.")
280         else:
281             columnDict['band'] = self.filt
282
283         if returnTuple:
284             return self._colsFromDict(columnDict, columnIndex=columnIndex)
285         else:
286             return columnDict
287

◆ name()

lsst.pipe.tasks.functors.Functor.name ( self)

◆ noDup()

lsst.pipe.tasks.functors.Functor.noDup ( self)
Do not explode by band if used on object table.

Definition at line 167 of file functors.py.

167     def noDup(self):
168         """Do not explode by band if used on object table."""
169         if self._noDup is not None:
170             return self._noDup
171         else:
172             return self._defaultNoDup
173
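
The fallback between `_noDup` and `_defaultNoDup` can be sketched with two small stand-in classes. The class names are hypothetical, and two details are assumptions not visible in the listing: that `noDup` is exposed as a property and that the constructor stores its `noDup` argument in `_noDup`.

```python
class Base:
    _defaultNoDup = False

    def __init__(self, noDup=None):
        self._noDup = noDup  # assumed: the constructor stores the argument

    @property
    def noDup(self):
        # An explicit per-instance value wins; otherwise use the class default.
        if self._noDup is not None:
            return self._noDup
        return self._defaultNoDup


class AlwaysNoDup(Base):
    _defaultNoDup = True  # hypothetical subclass overriding the default


default_value = Base().noDup
overridden = AlwaysNoDup(noDup=False).noDup
```

Testing against `None` rather than truthiness is what lets an explicit `noDup=False` override a subclass default of `True`.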

◆ shortname()

lsst.pipe.tasks.functors.Functor.shortname ( self)
Short name of functor (suitable for column name/dict key).

Reimplemented in lsst.pipe.tasks.functors.MagDiff, and lsst.pipe.tasks.functors.Color.

Definition at line 372 of file functors.py.

372     def shortname(self):
373         """Short name of functor (suitable for column name/dict key)."""
374         return self.name
375
376

Member Data Documentation

◆ _defaultDataset

str lsst.pipe.tasks.functors.Functor._defaultDataset = 'ref'
staticprotected

Definition at line 156 of file functors.py.

◆ _defaultNoDup

bool lsst.pipe.tasks.functors.Functor._defaultNoDup = False
staticprotected

Definition at line 158 of file functors.py.

◆ _dfLevels

tuple lsst.pipe.tasks.functors.Functor._dfLevels = ('column',)
staticprotected

Definition at line 157 of file functors.py.

◆ _noDup

lsst.pipe.tasks.functors.Functor._noDup
protected

Definition at line 163 of file functors.py.

◆ dataset

lsst.pipe.tasks.functors.Functor.dataset

Definition at line 162 of file functors.py.

◆ filt

lsst.pipe.tasks.functors.Functor.filt

◆ log

lsst.pipe.tasks.functors.Functor.log

Definition at line 164 of file functors.py.

◆ name

lsst.pipe.tasks.functors.Functor.name

The documentation for this class was generated from the following file: