LSST Applications 27.0.0
LSST Data Management Base Package
lsst.dax.apdb.sql.apdbSql.ApdbSql Class Reference
Inheritance diagram for lsst.dax.apdb.sql.apdbSql.ApdbSql:
lsst.dax.apdb.apdb.Apdb

Public Member Functions

 __init__ (self, ApdbSqlConfig config)
 
VersionTuple apdbImplementationVersion (cls)
 
ApdbSqlConfig init_database (cls, str db_url, *, str|None schema_file=None, str|None schema_name=None, int|None read_sources_months=None, int|None read_forced_sources_months=None, bool use_insert_id=False, int|None connection_timeout=None, str|None dia_object_index=None, int|None htm_level=None, str|None htm_index_column=None, list[str]|None ra_dec_columns=None, str|None prefix=None, str|None namespace=None, bool drop=False)
 
VersionTuple apdbSchemaVersion (self)
 
ApdbSqlReplica get_replica (self)
 
dict[str, int] tableRowCount (self)
 
Table|None tableDef (self, ApdbTables table)
 
pandas.DataFrame getDiaObjects (self, Region region)
 
pandas.DataFrame|None getDiaSources (self, Region region, Iterable[int]|None object_ids, astropy.time.Time visit_time)
 
pandas.DataFrame|None getDiaForcedSources (self, Region region, Iterable[int]|None object_ids, astropy.time.Time visit_time)
 
bool containsVisitDetector (self, int visit, int detector)
 
bool containsCcdVisit (self, int ccdVisitId)
 
pandas.DataFrame getSSObjects (self)
 
None store (self, astropy.time.Time visit_time, pandas.DataFrame objects, pandas.DataFrame|None sources=None, pandas.DataFrame|None forced_sources=None)
 
None storeSSObjects (self, pandas.DataFrame objects)
 
None reassignDiaSources (self, Mapping[int, int] idMap)
 
None dailyJob (self)
 
int countUnassociatedObjects (self)
 
ApdbMetadata metadata (self)
 

Public Attributes

 config
 
 pixelator
 
 metadataSchemaVersionKey
 
 metadataCodeVersionKey
 
 metadataReplicaVersionKey
 
 metadataConfigKey
 

Static Public Attributes

 ConfigClass = ApdbSqlConfig
 
str metadataSchemaVersionKey = "version:schema"
 
str metadataCodeVersionKey = "version:ApdbSql"
 
str metadataReplicaVersionKey = "version:ApdbSqlReplica"
 
str metadataConfigKey = "config:apdb-sql.json"
 

Protected Member Functions

sqlalchemy.engine.Engine _makeEngine (cls, ApdbSqlConfig config)
 
None _versionCheck (self, ApdbMetadataSql metadata)
 
None _makeSchema (cls, ApdbConfig config, bool drop=False)
 
pandas.DataFrame _getDiaSourcesInRegion (self, Region region, astropy.time.Time visit_time)
 
pandas.DataFrame _getDiaSourcesByIDs (self, list[int] object_ids, astropy.time.Time visit_time)
 
pandas.DataFrame _getSourcesByIDs (self, ApdbTables table_enum, list[int] object_ids, float midpointMjdTai_start)
 
None _storeReplicaChunk (self, ReplicaChunk replica_chunk, astropy.time.Time visit_time, sqlalchemy.engine.Connection connection)
 
None _storeDiaObjects (self, pandas.DataFrame objs, astropy.time.Time visit_time, ReplicaChunk|None replica_chunk, sqlalchemy.engine.Connection connection)
 
None _storeDiaSources (self, pandas.DataFrame sources, ReplicaChunk|None replica_chunk, sqlalchemy.engine.Connection connection)
 
None _storeDiaForcedSources (self, pandas.DataFrame sources, ReplicaChunk|None replica_chunk, sqlalchemy.engine.Connection connection)
 
list[tuple[int, int]] _htm_indices (self, Region region)
 
sql.ColumnElement _filterRegion (self, sqlalchemy.schema.Table table, Region region)
 
pandas.DataFrame _add_obj_htm_index (self, pandas.DataFrame df)
 
pandas.DataFrame _add_src_htm_index (self, pandas.DataFrame sources, pandas.DataFrame objs)
 

Protected Attributes

 _engine
 
 _metadata
 
 _schema
 

Static Protected Attributes

tuple _frozen_parameters
 

Detailed Description

Implementation of APDB interface based on SQL database.

The implementation is configured via the standard ``pex_config`` mechanism
using the `ApdbSqlConfig` configuration class. For examples of different
configurations, check the ``config/`` folder.

Parameters
----------
config : `ApdbSqlConfig`
    Configuration object.

Definition at line 176 of file apdbSql.py.
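
A minimal usage sketch (the SQLite URL is illustrative; `init_database` and
the constructor are documented below):

from lsst.dax.apdb.sql.apdbSql import ApdbSql

# Create a new APDB and obtain its frozen configuration, then
# construct the interface from that configuration.
config = ApdbSql.init_database(db_url="sqlite:///apdb.db")
apdb = ApdbSql(config)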

Constructor & Destructor Documentation

◆ __init__()

lsst.dax.apdb.sql.apdbSql.ApdbSql.__init__ ( self,
ApdbSqlConfig config )

Definition at line 212 of file apdbSql.py.

212 def __init__(self, config: ApdbSqlConfig):
213 self._engine = self._makeEngine(config)
214
215 sa_metadata = sqlalchemy.MetaData(schema=config.namespace)
216 meta_table_name = ApdbTables.metadata.table_name(prefix=config.prefix)
217 meta_table: sqlalchemy.schema.Table | None = None
218 with suppress(sqlalchemy.exc.NoSuchTableError):
219 meta_table = sqlalchemy.schema.Table(meta_table_name, sa_metadata, autoload_with=self._engine)
220
221 self._metadata = ApdbMetadataSql(self._engine, meta_table)
222
223 # Read frozen config from metadata.
224 config_json = self._metadata.get(self.metadataConfigKey)
225 if config_json is not None:
226 # Update config from metadata.
227 freezer = ApdbConfigFreezer[ApdbSqlConfig](self._frozen_parameters)
228 self.config = freezer.update(config, config_json)
229 else:
230 self.config = config
231 self.config.validate()
232
233 self._schema = ApdbSqlSchema(
234 engine=self._engine,
235 dia_object_index=self.config.dia_object_index,
236 schema_file=self.config.schema_file,
237 schema_name=self.config.schema_name,
238 prefix=self.config.prefix,
239 namespace=self.config.namespace,
240 htm_index_column=self.config.htm_index_column,
241 enable_replica=self.config.use_insert_id,
242 )
243
244 if self._metadata.table_exists():
245 self._versionCheck(self._metadata)
246
247 self.pixelator = HtmPixelization(self.config.htm_level)
248
249 _LOG.debug("APDB Configuration:")
250 _LOG.debug(" dia_object_index: %s", self.config.dia_object_index)
251 _LOG.debug(" read_sources_months: %s", self.config.read_sources_months)
252 _LOG.debug(" read_forced_sources_months: %s", self.config.read_forced_sources_months)
253 _LOG.debug(" dia_object_columns: %s", self.config.dia_object_columns)
254 _LOG.debug(" schema_file: %s", self.config.schema_file)
255 _LOG.debug(" extra_schema_file: %s", self.config.extra_schema_file)
256 _LOG.debug(" schema prefix: %s", self.config.prefix)
257

Member Function Documentation

◆ _add_obj_htm_index()

pandas.DataFrame lsst.dax.apdb.sql.apdbSql.ApdbSql._add_obj_htm_index ( self,
pandas.DataFrame df )
protected
Calculate HTM index for each record and add it to a DataFrame.

Notes
-----
This overrides any existing column with the same name (pixelId) in
the DataFrame. The original DataFrame is not changed; a copy is
returned.

Definition at line 1103 of file apdbSql.py.

1103 def _add_obj_htm_index(self, df: pandas.DataFrame) -> pandas.DataFrame:
1104 """Calculate HTM index for each record and add it to a DataFrame.
1105
1106 Notes
1107 -----
1108 This overrides any existing column in a DataFrame with the same name
1109 (pixelId). Original DataFrame is not changed, copy of a DataFrame is
1110 returned.
1111 """
1112 # calculate HTM index for every DiaObject
1113 htm_index = np.zeros(df.shape[0], dtype=np.int64)
1114 ra_col, dec_col = self.config.ra_dec_columns
1115 for i, (ra, dec) in enumerate(zip(df[ra_col], df[dec_col])):
1116 uv3d = UnitVector3d(LonLat.fromDegrees(ra, dec))
1117 idx = self.pixelator.index(uv3d)
1118 htm_index[i] = idx
1119 df = df.copy()
1120 df[self.config.htm_index_column] = htm_index
1121 return df
1122

◆ _add_src_htm_index()

pandas.DataFrame lsst.dax.apdb.sql.apdbSql.ApdbSql._add_src_htm_index ( self,
pandas.DataFrame sources,
pandas.DataFrame objs )
protected
Add pixelId column to DiaSource catalog.

Notes
-----
This method copies the pixelId value from the matching DiaObject
record. The DiaObject catalog needs to have its pixelId column filled
by the ``_add_obj_htm_index`` method, and DiaSource records need to be
associated to DiaObjects via the ``diaObjectId`` column.

This overrides any existing column with the same name (pixelId) in
the DataFrame. The original DataFrame is not changed; a copy is
returned.

Definition at line 1123 of file apdbSql.py.

1123 def _add_src_htm_index(self, sources: pandas.DataFrame, objs: pandas.DataFrame) -> pandas.DataFrame:
1124 """Add pixelId column to DiaSource catalog.
1125
1126 Notes
1127 -----
1128 This method copies pixelId value from a matching DiaObject record.
1129 DiaObject catalog needs to have a pixelId column filled by
1130 ``_add_obj_htm_index`` method and DiaSource records need to be
1131 associated to DiaObjects via ``diaObjectId`` column.
1132
1133 This overrides any existing column in a DataFrame with the same name
1134 (pixelId). Original DataFrame is not changed, copy of a DataFrame is
1135 returned.
1136 """
1137 pixel_id_map: dict[int, int] = {
1138 diaObjectId: pixelId
1139 for diaObjectId, pixelId in zip(objs["diaObjectId"], objs[self.config.htm_index_column])
1140 }
1141 # DiaSources associated with SolarSystemObjects do not have an
1142 # associated DiaObject hence we skip them and set their htmIndex
1143 # value to 0.
1144 pixel_id_map[0] = 0
1145 htm_index = np.zeros(sources.shape[0], dtype=np.int64)
1146 for i, diaObjId in enumerate(sources["diaObjectId"]):
1147 htm_index[i] = pixel_id_map[diaObjId]
1148 sources = sources.copy()
1149 sources[self.config.htm_index_column] = htm_index
1150 return sources

◆ _filterRegion()

sql.ColumnElement lsst.dax.apdb.sql.apdbSql.ApdbSql._filterRegion ( self,
sqlalchemy.schema.Table table,
Region region )
protected
Make SQLAlchemy expression for selecting records in a region.

Definition at line 1089 of file apdbSql.py.

1089 def _filterRegion(self, table: sqlalchemy.schema.Table, region: Region) -> sql.ColumnElement:
1090 """Make SQLAlchemy expression for selecting records in a region."""
1091 htm_index_column = table.columns[self.config.htm_index_column]
1092 exprlist = []
1093 pixel_ranges = self._htm_indices(region)
1094 for low, upper in pixel_ranges:
1095 upper -= 1
1096 if low == upper:
1097 exprlist.append(htm_index_column == low)
1098 else:
1099 exprlist.append(sql.expression.between(htm_index_column, low, upper))
1100
1101 return sql.expression.or_(*exprlist)
1102

◆ _getDiaSourcesByIDs()

pandas.DataFrame lsst.dax.apdb.sql.apdbSql.ApdbSql._getDiaSourcesByIDs ( self,
list[int] object_ids,
astropy.time.Time visit_time )
protected
Return catalog of DiaSource instances given a set of DiaObject IDs.

Parameters
----------
object_ids :
    Collection of DiaObject IDs.
visit_time : `astropy.time.Time`
    Time of the current visit.

Returns
-------
catalog : `pandas.DataFrame`
    Catalog containing DiaSource records.

Definition at line 771 of file apdbSql.py.

771 def _getDiaSourcesByIDs(self, object_ids: list[int], visit_time: astropy.time.Time) -> pandas.DataFrame:
772 """Return catalog of DiaSource instances given set of DiaObject IDs.
773
774 Parameters
775 ----------
776 object_ids :
777 Collection of DiaObject IDs
778 visit_time : `astropy.time.Time`
779 Time of the current visit.
780
781 Returns
782 -------
783 catalog : `pandas.DataFrame`
784 Catalog contaning DiaSource records.
785 """
786 # TODO: DateTime.MJD must be consistent with code in ap_association,
787 # alternatively we can fill midpointMjdTai ourselves in store()
788 midpointMjdTai_start = _make_midpointMjdTai_start(visit_time, self.config.read_sources_months)
789 _LOG.debug("midpointMjdTai_start = %.6f", midpointMjdTai_start)
790
791 with Timer("DiaSource select", self.config.timer):
792 sources = self._getSourcesByIDs(ApdbTables.DiaSource, object_ids, midpointMjdTai_start)
793
794 _LOG.debug("found %s DiaSources", len(sources))
795 return sources
796

◆ _getDiaSourcesInRegion()

pandas.DataFrame lsst.dax.apdb.sql.apdbSql.ApdbSql._getDiaSourcesInRegion ( self,
Region region,
astropy.time.Time visit_time )
protected
Return catalog of DiaSource instances from the given region.

Parameters
----------
region : `lsst.sphgeom.Region`
    Region to search for DIASources.
visit_time : `astropy.time.Time`
    Time of the current visit.

Returns
-------
catalog : `pandas.DataFrame`
    Catalog containing DiaSource records.

Definition at line 735 of file apdbSql.py.

735 def _getDiaSourcesInRegion(self, region: Region, visit_time: astropy.time.Time) -> pandas.DataFrame:
736 """Return catalog of DiaSource instances from given region.
737
738 Parameters
739 ----------
740 region : `lsst.sphgeom.Region`
741 Region to search for DIASources.
742 visit_time : `astropy.time.Time`
743 Time of the current visit.
744
745 Returns
746 -------
747 catalog : `pandas.DataFrame`
748 Catalog containing DiaSource records.
749 """
750 # TODO: DateTime.MJD must be consistent with code in ap_association,
751 # alternatively we can fill midpointMjdTai ourselves in store()
752 midpointMjdTai_start = _make_midpointMjdTai_start(visit_time, self.config.read_sources_months)
753 _LOG.debug("midpointMjdTai_start = %.6f", midpointMjdTai_start)
754
755 table = self._schema.get_table(ApdbTables.DiaSource)
756 columns = self._schema.get_apdb_columns(ApdbTables.DiaSource)
757 query = sql.select(*columns)
758
759 # build selection
760 time_filter = table.columns["midpointMjdTai"] > midpointMjdTai_start
761 where = sql.expression.and_(self._filterRegion(table, region), time_filter)
762 query = query.where(where)
763
764 # execute select
765 with Timer("DiaSource select", self.config.timer):
766 with self._engine.begin() as conn:
767 sources = pandas.read_sql_query(query, conn)
768 _LOG.debug("found %s DiaSources", len(sources))
769 return sources
770

◆ _getSourcesByIDs()

pandas.DataFrame lsst.dax.apdb.sql.apdbSql.ApdbSql._getSourcesByIDs ( self,
ApdbTables table_enum,
list[int] object_ids,
float midpointMjdTai_start )
protected
Return catalog of DiaSource or DiaForcedSource instances given a set
of DiaObject IDs.

Parameters
----------
table_enum : `ApdbTables`
    APDB table to query.
object_ids :
    Collection of DiaObject IDs.
midpointMjdTai_start : `float`
    Earliest midpointMjdTai to retrieve.

Returns
-------
catalog : `pandas.DataFrame`
    Catalog containing DiaSource records. An empty catalog with the
    correct schema is returned when ``object_ids`` is empty.

Definition at line 797 of file apdbSql.py.

797 def _getSourcesByIDs(
798     self, table_enum: ApdbTables, object_ids: list[int], midpointMjdTai_start: float
799 ) -> pandas.DataFrame:
800 """Return catalog of DiaSource or DiaForcedSource instances given set
801 of DiaObject IDs.
802
803 Parameters
804 ----------
805 table : `sqlalchemy.schema.Table`
806 Database table.
807 object_ids :
808 Collection of DiaObject IDs
809 midpointMjdTai_start : `float`
810 Earliest midpointMjdTai to retrieve.
811
812 Returns
813 -------
814 catalog : `pandas.DataFrame`
815 Catalog contaning DiaSource records. `None` is returned if
816 ``read_sources_months`` configuration parameter is set to 0 or
817 when ``object_ids`` is empty.
818 """
819 table = self._schema.get_table(table_enum)
820 columns = self._schema.get_apdb_columns(table_enum)
821
822 sources: pandas.DataFrame | None = None
823 if len(object_ids) <= 0:
824 _LOG.debug("ID list is empty, just fetch empty result")
825 query = sql.select(*columns).where(sql.literal(False))
826 with self._engine.begin() as conn:
827 sources = pandas.read_sql_query(query, conn)
828 else:
829 data_frames: list[pandas.DataFrame] = []
830 for ids in chunk_iterable(sorted(object_ids), 1000):
831 query = sql.select(*columns)
832
833 # Some types like np.int64 can cause issues with
834 # sqlalchemy, convert them to int.
835 int_ids = [int(oid) for oid in ids]
836
837 # select by object id
838 query = query.where(
839 sql.expression.and_(
840 table.columns["diaObjectId"].in_(int_ids),
841 table.columns["midpointMjdTai"] > midpointMjdTai_start,
842 )
843 )
844
845 # execute select
846 with self._engine.begin() as conn:
847 data_frames.append(pandas.read_sql_query(query, conn))
848
849 if len(data_frames) == 1:
850 sources = data_frames[0]
851 else:
852 sources = pandas.concat(data_frames)
853 assert sources is not None, "Catalog cannot be None"
854 return sources
855

◆ _htm_indices()

list[tuple[int, int]] lsst.dax.apdb.sql.apdbSql.ApdbSql._htm_indices ( self,
Region region )
protected
Generate a set of HTM indices covering the specified region.

Parameters
----------
region : `sphgeom.Region`
    Region that needs to be indexed.

Returns
-------
Sequence of ranges; each range is a tuple (minHtmID, maxHtmID).

Definition at line 1072 of file apdbSql.py.

1072 def _htm_indices(self, region: Region) -> list[tuple[int, int]]:
1073 """Generate a set of HTM indices covering specified region.
1074
1075 Parameters
1076 ----------
1077 region: `sphgeom.Region`
1078 Region that needs to be indexed.
1079
1080 Returns
1081 -------
1082 Sequence of ranges, range is a tuple (minHtmID, maxHtmID).
1083 """
1084 _LOG.debug("region: %s", region)
1085 indices = self.pixelator.envelope(region, self.config.htm_max_ranges)
1086
1087 return indices.ranges()
1088
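
For reference, a hedged sketch of the same `lsst.sphgeom` machinery used
here (the level and circle parameters are arbitrary). Note that the returned
ranges are half-open, which is why `_filterRegion` subtracts one from the
upper bound:

from lsst.sphgeom import Angle, Circle, HtmPixelization, LonLat, UnitVector3d

# Cover a small circular region with at most 64 HTM index ranges.
pixelator = HtmPixelization(20)
center = UnitVector3d(LonLat.fromDegrees(35.0, -4.0))
region = Circle(center, Angle.fromDegrees(0.1))
ranges = pixelator.envelope(region, 64).ranges()  # [(begin, end), ...], end exclusive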

◆ _makeEngine()

sqlalchemy.engine.Engine lsst.dax.apdb.sql.apdbSql.ApdbSql._makeEngine ( cls,
ApdbSqlConfig config )
protected
Make SQLAlchemy engine based on configured parameters.

Parameters
----------
config : `ApdbSqlConfig`
    Configuration object.

Definition at line 259 of file apdbSql.py.

259 def _makeEngine(cls, config: ApdbSqlConfig) -> sqlalchemy.engine.Engine:
260 """Make SQLALchemy engine based on configured parameters.
261
262 Parameters
263 ----------
264 config : `ApdbSqlConfig`
265 Configuration object.
266 """
267 # engine is reused between multiple processes, make sure that we don't
268 # share connections by disabling pool (by using NullPool class)
269 kw: MutableMapping[str, Any] = dict(echo=config.sql_echo)
270 conn_args: dict[str, Any] = dict()
271 if not config.connection_pool:
272 kw.update(poolclass=NullPool)
273 if config.isolation_level is not None:
274 kw.update(isolation_level=config.isolation_level)
275 elif config.db_url.startswith("sqlite"): # type: ignore
276 # Use READ_UNCOMMITTED as default value for sqlite.
277 kw.update(isolation_level="READ_UNCOMMITTED")
278 if config.connection_timeout is not None:
279 if config.db_url.startswith("sqlite"):
280 conn_args.update(timeout=config.connection_timeout)
281 elif config.db_url.startswith(("postgresql", "mysql")):
282 conn_args.update(connect_timeout=config.connection_timeout)
283 kw.update(connect_args=conn_args)
284 engine = sqlalchemy.create_engine(config.db_url, **kw)
285
286 if engine.dialect.name == "sqlite":
287 # Need to enable foreign keys on every new connection.
288 sqlalchemy.event.listen(engine, "connect", _onSqlite3Connect)
289
290 return engine
291

◆ _makeSchema()

None lsst.dax.apdb.sql.apdbSql.ApdbSql._makeSchema ( cls,
ApdbConfig config,
bool drop = False )
protected

Definition at line 464 of file apdbSql.py.

464 def _makeSchema(cls, config: ApdbConfig, drop: bool = False) -> None:
465 # docstring is inherited from a base class
466
467 if not isinstance(config, ApdbSqlConfig):
468 raise TypeError(f"Unexpected type of configuration object: {type(config)}")
469
470 engine = cls._makeEngine(config)
471
472 # Ask schema class to create all tables.
473 schema = ApdbSqlSchema(
474 engine=engine,
475 dia_object_index=config.dia_object_index,
476 schema_file=config.schema_file,
477 schema_name=config.schema_name,
478 prefix=config.prefix,
479 namespace=config.namespace,
480 htm_index_column=config.htm_index_column,
481 enable_replica=config.use_insert_id,
482 )
483 schema.makeSchema(drop=drop)
484
485 # Need metadata table to store few items in it, if table exists.
486 meta_table: sqlalchemy.schema.Table | None = None
487 with suppress(ValueError):
488 meta_table = schema.get_table(ApdbTables.metadata)
489
490 apdb_meta = ApdbMetadataSql(engine, meta_table)
491 if apdb_meta.table_exists():
492 # Fill version numbers, overwrite if they are already there.
493 apdb_meta.set(cls.metadataSchemaVersionKey, str(schema.schemaVersion()), force=True)
494 apdb_meta.set(cls.metadataCodeVersionKey, str(cls.apdbImplementationVersion()), force=True)
495 if config.use_insert_id:
496 # Only store replica code version if replica is enabled.
497 apdb_meta.set(
498 cls.metadataReplicaVersionKey,
499 str(ApdbSqlReplica.apdbReplicaImplementationVersion()),
500 force=True,
501 )
502
503 # Store frozen part of a configuration in metadata.
504 freezer = ApdbConfigFreezer[ApdbSqlConfig](cls._frozen_parameters)
505 apdb_meta.set(cls.metadataConfigKey, freezer.to_json(config), force=True)
506

◆ _storeDiaForcedSources()

None lsst.dax.apdb.sql.apdbSql.ApdbSql._storeDiaForcedSources ( self,
pandas.DataFrame sources,
ReplicaChunk | None replica_chunk,
sqlalchemy.engine.Connection connection )
protected
Store a set of DiaForcedSources from the current visit.

Parameters
----------
sources : `pandas.DataFrame`
    Catalog containing DiaForcedSource records.
replica_chunk : `ReplicaChunk` or `None`
    Replica chunk identifier, `None` when replication is disabled.
connection : `sqlalchemy.engine.Connection`
    Database connection.

Definition at line 1039 of file apdbSql.py.

1039 def _storeDiaForcedSources(
1040     self,
1041     sources: pandas.DataFrame,
1042     replica_chunk: ReplicaChunk | None,
1043     connection: sqlalchemy.engine.Connection,
1044 ) -> None:
1045 """Store a set of DiaForcedSources from current visit.
1046
1047 Parameters
1048 ----------
1049 sources : `pandas.DataFrame`
1050 Catalog containing DiaForcedSource records
1051 """
1052 table = self._schema.get_table(ApdbTables.DiaForcedSource)
1053
1054 # Insert replica data
1055 replica_data: list[dict] = []
1056 replica_stmt: Any = None
1057 if replica_chunk is not None:
1058 pk_names = [column.name for column in table.primary_key]
1059 replica_data = sources[pk_names].to_dict("records")
1060 for row in replica_data:
1061 row["apdb_replica_chunk"] = replica_chunk.id
1062 replica_table = self._schema.get_table(ExtraTables.DiaForcedSourceChunks)
1063 replica_stmt = replica_table.insert()
1064
1065 # everything to be done in single transaction
1066 with Timer("DiaForcedSource insert", self.config.timer):
1067 sources = _coerce_uint64(sources)
1068 sources.to_sql(table.name, connection, if_exists="append", index=False, schema=table.schema)
1069 if replica_stmt is not None:
1070 connection.execute(replica_stmt, replica_data)
1071

◆ _storeDiaObjects()

None lsst.dax.apdb.sql.apdbSql.ApdbSql._storeDiaObjects ( self,
pandas.DataFrame objs,
astropy.time.Time visit_time,
ReplicaChunk | None replica_chunk,
sqlalchemy.engine.Connection connection )
protected
Store catalog of DiaObjects from the current visit.

Parameters
----------
objs : `pandas.DataFrame`
    Catalog with DiaObject records.
visit_time : `astropy.time.Time`
    Time of the visit.
replica_chunk : `ReplicaChunk` or `None`
    Replica chunk identifier, `None` when replication is disabled.
connection : `sqlalchemy.engine.Connection`
    Database connection.

Definition at line 880 of file apdbSql.py.

880 def _storeDiaObjects(
881     self,
882     objs: pandas.DataFrame,
883     visit_time: astropy.time.Time,
884     replica_chunk: ReplicaChunk | None,
885     connection: sqlalchemy.engine.Connection,
886 ) -> None:
887 """Store catalog of DiaObjects from current visit.
888
889 Parameters
890 ----------
891 objs : `pandas.DataFrame`
892 Catalog with DiaObject records.
893 visit_time : `astropy.time.Time`
894 Time of the visit.
895 replica_chunk : `ReplicaChunk`
896 Insert identifier.
897 """
898 if len(objs) == 0:
899 _LOG.debug("No objects to write to database.")
900 return
901
902 # Some types like np.int64 can cause issues with sqlalchemy, convert
903 # them to int.
904 ids = sorted(int(oid) for oid in objs["diaObjectId"])
905 _LOG.debug("first object ID: %d", ids[0])
906
907 # TODO: Need to verify that we are using correct scale here for
908 # DATETIME representation (see DM-31996).
909 dt = visit_time.datetime
910
911 # everything to be done in single transaction
912 if self.config.dia_object_index == "last_object_table":
913 # Insert and replace all records in LAST table.
914 table = self._schema.get_table(ApdbTables.DiaObjectLast)
915
916 # Drop the previous objects (pandas cannot upsert).
917 query = table.delete().where(table.columns["diaObjectId"].in_(ids))
918
919 with Timer(table.name + " delete", self.config.timer):
920 res = connection.execute(query)
921 _LOG.debug("deleted %s objects", res.rowcount)
922
923 # DiaObjectLast is a subset of DiaObject, strip missing columns
924 last_column_names = [column.name for column in table.columns]
925 last_objs = objs[last_column_names]
926 last_objs = _coerce_uint64(last_objs)
927
928 if "lastNonForcedSource" in last_objs.columns:
929 # lastNonForcedSource is defined NOT NULL, fill it with visit
930 # time just in case.
931 last_objs["lastNonForcedSource"].fillna(dt, inplace=True)
932 else:
933 extra_column = pandas.Series([dt] * len(objs), name="lastNonForcedSource")
934 last_objs.set_index(extra_column.index, inplace=True)
935 last_objs = pandas.concat([last_objs, extra_column], axis="columns")
936
937 with Timer("DiaObjectLast insert", self.config.timer):
938 last_objs.to_sql(
939 table.name,
940 connection,
941 if_exists="append",
942 index=False,
943 schema=table.schema,
944 )
945 else:
946 # truncate existing validity intervals
947 table = self._schema.get_table(ApdbTables.DiaObject)
948
949 update = (
950 table.update()
951 .values(validityEnd=dt)
952 .where(
953 sql.expression.and_(
954 table.columns["diaObjectId"].in_(ids),
955 table.columns["validityEnd"].is_(None),
956 )
957 )
958 )
959
960 # _LOG.debug("query: %s", query)
961
962 with Timer(table.name + " truncate", self.config.timer):
963 res = connection.execute(update)
964 _LOG.debug("truncated %s intervals", res.rowcount)
965
966 objs = _coerce_uint64(objs)
967
968 # Fill additional columns
969 extra_columns: list[pandas.Series] = []
970 if "validityStart" in objs.columns:
971 objs["validityStart"] = dt
972 else:
973 extra_columns.append(pandas.Series([dt] * len(objs), name="validityStart"))
974 if "validityEnd" in objs.columns:
975 objs["validityEnd"] = None
976 else:
977 extra_columns.append(pandas.Series([None] * len(objs), name="validityEnd"))
978 if "lastNonForcedSource" in objs.columns:
979 # lastNonForcedSource is defined NOT NULL, fill it with visit time
980 # just in case.
981 objs["lastNonForcedSource"].fillna(dt, inplace=True)
982 else:
983 extra_columns.append(pandas.Series([dt] * len(objs), name="lastNonForcedSource"))
984 if extra_columns:
985 objs.set_index(extra_columns[0].index, inplace=True)
986 objs = pandas.concat([objs] + extra_columns, axis="columns")
987
988 # Insert replica data
989 table = self._schema.get_table(ApdbTables.DiaObject)
990 replica_data: list[dict] = []
991 replica_stmt: Any = None
992 if replica_chunk is not None:
993 pk_names = [column.name for column in table.primary_key]
994 replica_data = objs[pk_names].to_dict("records")
995 for row in replica_data:
996 row["apdb_replica_chunk"] = replica_chunk.id
997 replica_table = self._schema.get_table(ExtraTables.DiaObjectChunks)
998 replica_stmt = replica_table.insert()
999
1000 # insert new versions
1001 with Timer("DiaObject insert", self.config.timer):
1002 objs.to_sql(table.name, connection, if_exists="append", index=False, schema=table.schema)
1003 if replica_stmt is not None:
1004 connection.execute(replica_stmt, replica_data)
1005

◆ _storeDiaSources()

None lsst.dax.apdb.sql.apdbSql.ApdbSql._storeDiaSources ( self,
pandas.DataFrame sources,
ReplicaChunk | None replica_chunk,
sqlalchemy.engine.Connection connection )
protected
Store catalog of DiaSources from the current visit.

Parameters
----------
sources : `pandas.DataFrame`
    Catalog containing DiaSource records.
replica_chunk : `ReplicaChunk` or `None`
    Replica chunk identifier, `None` when replication is disabled.
connection : `sqlalchemy.engine.Connection`
    Database connection.

Definition at line 1006 of file apdbSql.py.

1006 def _storeDiaSources(
1007     self,
1008     sources: pandas.DataFrame,
1009     replica_chunk: ReplicaChunk | None,
1010     connection: sqlalchemy.engine.Connection,
1011 ) -> None:
1012 """Store catalog of DiaSources from current visit.
1013
1014 Parameters
1015 ----------
1016 sources : `pandas.DataFrame`
1017 Catalog containing DiaSource records
1018 """
1019 table = self._schema.get_table(ApdbTables.DiaSource)
1020
1021 # Insert replica data
1022 replica_data: list[dict] = []
1023 replica_stmt: Any = None
1024 if replica_chunk is not None:
1025 pk_names = [column.name for column in table.primary_key]
1026 replica_data = sources[pk_names].to_dict("records")
1027 for row in replica_data:
1028 row["apdb_replica_chunk"] = replica_chunk.id
1029 replica_table = self._schema.get_table(ExtraTables.DiaSourceChunks)
1030 replica_stmt = replica_table.insert()
1031
1032 # everything to be done in single transaction
1033 with Timer("DiaSource insert", self.config.timer):
1034 sources = _coerce_uint64(sources)
1035 sources.to_sql(table.name, connection, if_exists="append", index=False, schema=table.schema)
1036 if replica_stmt is not None:
1037 connection.execute(replica_stmt, replica_data)
1038

◆ _storeReplicaChunk()

None lsst.dax.apdb.sql.apdbSql.ApdbSql._storeReplicaChunk ( self,
ReplicaChunk replica_chunk,
astropy.time.Time visit_time,
sqlalchemy.engine.Connection connection )
protected

Definition at line 856 of file apdbSql.py.

856 def _storeReplicaChunk(
857     self,
858     replica_chunk: ReplicaChunk,
859     visit_time: astropy.time.Time,
860     connection: sqlalchemy.engine.Connection,
861 ) -> None:
862 dt = visit_time.datetime
863
864 table = self._schema.get_table(ExtraTables.ApdbReplicaChunks)
865
866 # We need UPSERT which is dialect-specific construct
867 values = {"last_update_time": dt, "unique_id": replica_chunk.unique_id}
868 row = {"apdb_replica_chunk": replica_chunk.id} | values
869 if connection.dialect.name == "sqlite":
870 insert_sqlite = sqlalchemy.dialects.sqlite.insert(table)
871 insert_sqlite = insert_sqlite.on_conflict_do_update(index_elements=table.primary_key, set_=values)
872 connection.execute(insert_sqlite, row)
873 elif connection.dialect.name == "postgresql":
874 insert_pg = sqlalchemy.dialects.postgresql.dml.insert(table)
875 insert_pg = insert_pg.on_conflict_do_update(constraint=table.primary_key, set_=values)
876 connection.execute(insert_pg, row)
877 else:
878 raise TypeError(f"Unsupported dialect {connection.dialect.name} for upsert.")
879

◆ _versionCheck()

None lsst.dax.apdb.sql.apdbSql.ApdbSql._versionCheck ( self,
ApdbMetadataSql metadata )
protected
Check schema version compatibility.

Definition at line 292 of file apdbSql.py.

292 def _versionCheck(self, metadata: ApdbMetadataSql) -> None:
293 """Check schema version compatibility."""
294
295 def _get_version(key: str, default: VersionTuple) -> VersionTuple:
296 """Retrieve version number from given metadata key."""
297 if metadata.table_exists():
298 version_str = metadata.get(key)
299 if version_str is None:
300 # Should not happen with existing metadata table.
301 raise RuntimeError(f"Version key {key!r} does not exist in metadata table.")
302 return VersionTuple.fromString(version_str)
303 return default
304
305 # For old databases where metadata table does not exist we assume that
306 # version of both code and schema is 0.1.0.
307 initial_version = VersionTuple(0, 1, 0)
308 db_schema_version = _get_version(self.metadataSchemaVersionKey, initial_version)
309 db_code_version = _get_version(self.metadataCodeVersionKey, initial_version)
310
311 # For now there is no way to make read-only APDB instances, assume that
312 # any access can do updates.
313 if not self._schema.schemaVersion().checkCompatibility(db_schema_version, True):
314 raise IncompatibleVersionError(
315 f"Configured schema version {self._schema.schemaVersion()} "
316 f"is not compatible with database version {db_schema_version}"
317 )
318 if not self.apdbImplementationVersion().checkCompatibility(db_code_version, True):
319 raise IncompatibleVersionError(
320 f"Current code version {self.apdbImplementationVersion()} "
321 f"is not compatible with database version {db_code_version}"
322 )
323
324 # Check replica code version only if replica is enabled.
325 if self._schema.has_replica_chunks:
326 db_replica_version = _get_version(self.metadataReplicaVersionKey, initial_version)
327 code_replica_version = ApdbSqlReplica.apdbReplicaImplementationVersion()
328 if not code_replica_version.checkCompatibility(db_replica_version, True):
329 raise IncompatibleVersionError(
330 f"Current replication code version {code_replica_version} "
331 f"is not compatible with database version {db_replica_version}"
332 )
333

◆ apdbImplementationVersion()

VersionTuple lsst.dax.apdb.sql.apdbSql.ApdbSql.apdbImplementationVersion ( cls)
Return version number for current APDB implementation.

Returns
-------
version : `VersionTuple`
    Version of the code defined in implementation class.

Reimplemented from lsst.dax.apdb.apdb.Apdb.

Definition at line 335 of file apdbSql.py.

335 def apdbImplementationVersion(cls) -> VersionTuple:
336 # Docstring inherited from base class.
337 return VERSION
338

◆ apdbSchemaVersion()

VersionTuple lsst.dax.apdb.sql.apdbSql.ApdbSql.apdbSchemaVersion ( self)
Return schema version number as defined in config file.

Returns
-------
version : `VersionTuple`
    Version of the schema defined in schema config file.

Reimplemented from lsst.dax.apdb.apdb.Apdb.

Definition at line 427 of file apdbSql.py.

427 def apdbSchemaVersion(self) -> VersionTuple:
428 # Docstring inherited from base class.
429 return self._schema.schemaVersion()
430

◆ containsCcdVisit()

bool lsst.dax.apdb.sql.apdbSql.ApdbSql.containsCcdVisit ( self,
int ccdVisitId )
Test whether data for a given visit-detector is present in the APDB.

This method is a placeholder until `Apdb.containsVisitDetector` can
be implemented.

Parameters
----------
ccdVisitId : `int`
    The packed ID of the visit-detector to search for.

Returns
-------
present : `bool`
    `True` if some DiaSource records exist for the specified
    observation, `False` otherwise.

Definition at line 581 of file apdbSql.py.

581 def containsCcdVisit(self, ccdVisitId: int) -> bool:
582 """Test whether data for a given visit-detector is present in the APDB.
583
584 This method is a placeholder until `Apdb.containsVisitDetector` can
585 be implemented.
586
587 Parameters
588 ----------
589 ccdVisitId : `int`
590 The packed ID of the visit-detector to search for.
591
592 Returns
593 -------
594 present : `bool`
595 `True` if some DiaSource records exist for the specified
596 observation, `False` otherwise.
597 """
598 # TODO: remove this method in favor of containsVisitDetector on either
599 # DM-41671 or a ticket that removes ccdVisitId from these tables
600 src_table: sqlalchemy.schema.Table = self._schema.get_table(ApdbTables.DiaSource)
601 frcsrc_table: sqlalchemy.schema.Table = self._schema.get_table(ApdbTables.DiaForcedSource)
602 # Query should load only one leaf page of the index
603 query1 = sql.select(src_table.c.ccdVisitId).filter_by(ccdVisitId=ccdVisitId).limit(1)
604 # Backup query in case an image was processed but had no diaSources
605 query2 = sql.select(frcsrc_table.c.ccdVisitId).filter_by(ccdVisitId=ccdVisitId).limit(1)
606
607 with self._engine.begin() as conn:
608 result = conn.execute(query1).scalar_one_or_none()
609 if result is not None:
610 return True
611 else:
612 result = conn.execute(query2).scalar_one_or_none()
613 return result is not None
614

◆ containsVisitDetector()

bool lsst.dax.apdb.sql.apdbSql.ApdbSql.containsVisitDetector ( self,
int visit,
int detector )
Test whether data for a given visit-detector is present in the APDB.

Parameters
----------
visit, detector : `int`
    The ID of the visit-detector to search for.

Returns
-------
present : `bool`
    `True` if some DiaObject, DiaSource, or DiaForcedSource records
    exist for the specified observation, `False` otherwise.

Reimplemented from lsst.dax.apdb.apdb.Apdb.

Definition at line 577 of file apdbSql.py.

577 def containsVisitDetector(self, visit: int, detector: int) -> bool:
578 # docstring is inherited from a base class
579 raise NotImplementedError()
580

◆ countUnassociatedObjects()

int lsst.dax.apdb.sql.apdbSql.ApdbSql.countUnassociatedObjects ( self)
Return the number of DiaObjects that have only one DiaSource
associated with them.

Used as part of ap_verify metrics.

Returns
-------
count : `int`
    Number of DiaObjects with exactly one associated DiaSource.

Notes
-----
This method can be very inefficient or slow in some implementations.

Reimplemented from lsst.dax.apdb.apdb.Apdb.

Definition at line 712 of file apdbSql.py.

712 def countUnassociatedObjects(self) -> int:
713 # docstring is inherited from a base class
714
715 # Retrieve the DiaObject table.
716 table: sqlalchemy.schema.Table = self._schema.get_table(ApdbTables.DiaObject)
717
718 # Construct the sql statement.
719 stmt = sql.select(func.count()).select_from(table).where(table.c.nDiaSources == 1)
720 stmt = stmt.where(table.c.validityEnd == None) # noqa: E711
721
722 # Return the count.
723 with self._engine.begin() as conn:
724 count = conn.execute(stmt).scalar_one()
725
726 return count
727

◆ dailyJob()

None lsst.dax.apdb.sql.apdbSql.ApdbSql.dailyJob ( self)
Implement daily activities like cleanup/vacuum.

What should be done during daily activities is determined by the
specific implementation.

Reimplemented from lsst.dax.apdb.apdb.Apdb.

Definition at line 708 of file apdbSql.py.

708 def dailyJob(self) -> None:
709 # docstring is inherited from a base class
710 pass
711

◆ get_replica()

ApdbSqlReplica lsst.dax.apdb.sql.apdbSql.ApdbSql.get_replica ( self)
Return `ApdbReplica` instance for this database.

Definition at line 431 of file apdbSql.py.

431 def get_replica(self) -> ApdbSqlReplica:
432 """Return `ApdbReplica` instance for this database."""
433 return ApdbSqlReplica(self._schema, self._engine)
434

◆ getDiaForcedSources()

pandas.DataFrame | None lsst.dax.apdb.sql.apdbSql.ApdbSql.getDiaForcedSources ( self,
Region region,
Iterable[int] | None object_ids,
astropy.time.Time visit_time )
Return catalog of DiaForcedSource instances from a given region.

Parameters
----------
region : `lsst.sphgeom.Region`
    Region to search for DIASources.
object_ids : iterable [ `int` ], optional
    List of DiaObject IDs to further constrain the set of returned
    sources. If the list is empty then an empty catalog with the
    correct schema is returned. If `None` then the returned sources
    are not constrained. Some implementations may not support the
    latter case.
visit_time : `astropy.time.Time`
    Time of the current visit.

Returns
-------
catalog : `pandas.DataFrame`, or `None`
    Catalog containing DiaSource records. `None` is returned if
    ``read_forced_sources_months`` configuration parameter is set to 0.

Raises
------
NotImplementedError
    May be raised by some implementations if ``object_ids`` is `None`.

Notes
-----
This method returns DiaForcedSource catalog for a region with
additional filtering based on DiaObject IDs. Only a subset of DiaSource
history is returned limited by ``read_forced_sources_months`` config
parameter, w.r.t. ``visit_time``. If ``object_ids`` is empty then an
empty catalog is always returned with the correct schema
(columns/types). If ``object_ids`` is `None` then no filtering is
performed and some of the returned records may be outside the specified
region.

Reimplemented from lsst.dax.apdb.apdb.Apdb.

Definition at line 552 of file apdbSql.py.

552 def getDiaForcedSources(
553     self, region: Region, object_ids: Iterable[int] | None, visit_time: astropy.time.Time
554 ) -> pandas.DataFrame | None:
555 # docstring is inherited from a base class
556 if self.config.read_forced_sources_months == 0:
557 _LOG.debug("Skip DiaForceSources fetching")
558 return None
559
560 if object_ids is None:
561 # This implementation does not support region-based selection.
562 raise NotImplementedError("Region-based selection is not supported")
563
564 # TODO: DateTime.MJD must be consistent with code in ap_association,
565 # alternatively we can fill midpointMjdTai ourselves in store()
566 midpointMjdTai_start = _make_midpointMjdTai_start(visit_time, self.config.read_forced_sources_months)
567 _LOG.debug("midpointMjdTai_start = %.6f", midpointMjdTai_start)
568
569 with Timer("DiaForcedSource select", self.config.timer):
570 sources = self._getSourcesByIDs(
571 ApdbTables.DiaForcedSource, list(object_ids), midpointMjdTai_start
572 )
573
574 _LOG.debug("found %s DiaForcedSources", len(sources))
575 return sources
576

◆ getDiaObjects()

pandas.DataFrame lsst.dax.apdb.sql.apdbSql.ApdbSql.getDiaObjects ( self,
Region region )
Return catalog of DiaObject instances from a given region.

This method returns only the last version of each DiaObject. Some
records in the returned catalog may be outside the specified region;
it is up to the client to ignore those records or clean up the
catalog before further use.

Parameters
----------
region : `lsst.sphgeom.Region`
    Region to search for DIAObjects.

Returns
-------
catalog : `pandas.DataFrame`
    Catalog containing DiaObject records for a region that may be a
    superset of the specified region.

Reimplemented from lsst.dax.apdb.apdb.Apdb.

Definition at line 507 of file apdbSql.py.

507 def getDiaObjects(self, region: Region) -> pandas.DataFrame:
508 # docstring is inherited from a base class
509
510 # decide what columns we need
511 if self.config.dia_object_index == "last_object_table":
512 table_enum = ApdbTables.DiaObjectLast
513 else:
514 table_enum = ApdbTables.DiaObject
515 table = self._schema.get_table(table_enum)
516 if not self.config.dia_object_columns:
517 columns = self._schema.get_apdb_columns(table_enum)
518 else:
519 columns = [table.c[col] for col in self.config.dia_object_columns]
520 query = sql.select(*columns)
521
522 # build selection
523 query = query.where(self._filterRegion(table, region))
524
525 # select latest version of objects
526 if self.config.dia_object_index != "last_object_table":
527 query = query.where(table.c.validityEnd == None) # noqa: E711
528
529 # _LOG.debug("query: %s", query)
530
531 # execute select
532 with Timer("DiaObject select", self.config.timer):
533 with self._engine.begin() as conn:
534 objects = pandas.read_sql_query(query, conn)
535 _LOG.debug("found %s DiaObjects", len(objects))
536 return objects
537
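
A hedged example of a region-based read, assuming `apdb` is a constructed
`ApdbSql` instance (region construction as in the `_htm_indices` sketch
above):

from lsst.sphgeom import Angle, Circle, LonLat, UnitVector3d

# Fetch the latest versions of DiaObjects overlapping a small circle;
# the result may be a superset of the region.
region = Circle(UnitVector3d(LonLat.fromDegrees(35.0, -4.0)), Angle.fromDegrees(0.1))
objects = apdb.getDiaObjects(region)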

◆ getDiaSources()

pandas.DataFrame | None lsst.dax.apdb.sql.apdbSql.ApdbSql.getDiaSources ( self,
Region region,
Iterable[int] | None object_ids,
astropy.time.Time visit_time )
Return catalog of DiaSource instances from a given region.

Parameters
----------
region : `lsst.sphgeom.Region`
    Region to search for DIASources.
object_ids : iterable [ `int` ], optional
    List of DiaObject IDs to further constrain the set of returned
    sources. If `None` then the returned sources are not constrained.
    If the list is empty then an empty catalog with the correct
    schema is returned.
visit_time : `astropy.time.Time`
    Time of the current visit.

Returns
-------
catalog : `pandas.DataFrame`, or `None`
    Catalog containing DiaSource records. `None` is returned if
    ``read_sources_months`` configuration parameter is set to 0.

Notes
-----
This method returns DiaSource catalog for a region with additional
filtering based on DiaObject IDs. Only a subset of DiaSource history
is returned limited by ``read_sources_months`` config parameter, w.r.t.
``visit_time``. If ``object_ids`` is empty then an empty catalog is
always returned with the correct schema (columns/types). If
``object_ids`` is `None` then no filtering is performed and some of the
returned records may be outside the specified region.

Reimplemented from lsst.dax.apdb.apdb.Apdb.

Definition at line 538 of file apdbSql.py.

538 def getDiaSources(
539     self, region: Region, object_ids: Iterable[int] | None, visit_time: astropy.time.Time
540 ) -> pandas.DataFrame | None:
541 # docstring is inherited from a base class
542 if self.config.read_sources_months == 0:
543 _LOG.debug("Skip DiaSources fetching")
544 return None
545
546 if object_ids is None:
547 # region-based select
548 return self._getDiaSourcesInRegion(region, visit_time)
549 else:
550 return self._getDiaSourcesByIDs(list(object_ids), visit_time)
551
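
Continuing the sketch from `getDiaObjects`, the source reads take the
DiaObject IDs and a visit time that bounds the history window through
``read_sources_months`` and ``read_forced_sources_months`` (the timestamp
is illustrative):

import astropy.time

visit_time = astropy.time.Time("2025-01-01T00:00:00", scale="tai")
sources = apdb.getDiaSources(region, objects["diaObjectId"], visit_time)
forced_sources = apdb.getDiaForcedSources(region, objects["diaObjectId"], visit_time)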

◆ getSSObjects()

pandas.DataFrame lsst.dax.apdb.sql.apdbSql.ApdbSql.getSSObjects ( self)
Return catalog of SSObject instances.

Returns
-------
catalog : `pandas.DataFrame`
    Catalog containing SSObject records, all existing records are
    returned.

Reimplemented from lsst.dax.apdb.apdb.Apdb.

Definition at line 615 of file apdbSql.py.

615 def getSSObjects(self) -> pandas.DataFrame:
616 # docstring is inherited from a base class
617
618 columns = self._schema.get_apdb_columns(ApdbTables.SSObject)
619 query = sql.select(*columns)
620
621 # execute select
622 with Timer("DiaObject select", self.config.timer):
623 with self._engine.begin() as conn:
624 objects = pandas.read_sql_query(query, conn)
625 _LOG.debug("found %s SSObjects", len(objects))
626 return objects
627

◆ init_database()

ApdbSqlConfig lsst.dax.apdb.sql.apdbSql.ApdbSql.init_database ( cls,
str db_url,
*, str | None schema_file = None,
str | None schema_name = None,
int | None read_sources_months = None,
int | None read_forced_sources_months = None,
bool use_insert_id = False,
int | None connection_timeout = None,
str | None dia_object_index = None,
int | None htm_level = None,
str | None htm_index_column = None,
list[str] | None ra_dec_columns = None,
str | None prefix = None,
str | None namespace = None,
bool drop = False )
Initialize new APDB instance and make configuration object for it.

Parameters
----------
db_url : `str`
    SQLAlchemy database URL.
schema_file : `str`, optional
    Location of (YAML) configuration file with APDB schema. If not
    specified then default location will be used.
schema_name : `str`, optional
    Name of the schema in YAML configuration file. If not specified
    then the default name will be used.
read_sources_months : `int`, optional
    Number of months of history to read from DiaSource.
read_forced_sources_months : `int`, optional
    Number of months of history to read from DiaForcedSource.
use_insert_id : `bool`
    If True, make additional tables used for replication to PPDB.
connection_timeout : `int`, optional
    Database connection timeout in seconds.
dia_object_index : `str`, optional
    Indexing mode for DiaObject table.
htm_level : `int`, optional
    HTM indexing level.
htm_index_column : `str`, optional
    Name of a HTM index column for DiaObject and DiaSource tables.
ra_dec_columns : `list` [`str`], optional
    Names of ra/dec columns in DiaObject table.
prefix : `str`, optional
    Optional prefix for all table names.
namespace : `str`, optional
    Name of the database schema for all APDB tables. If not specified
    then default schema is used.
drop : `bool`, optional
    If `True` then drop existing tables before re-creating the schema.

Returns
-------
config : `ApdbSqlConfig`
    Resulting configuration object for a created APDB instance.

Definition at line 340 of file apdbSql.py.

340 def init_database(
341     cls,
342     db_url: str,
343     *,
344     schema_file: str | None = None,
345     schema_name: str | None = None,
346     read_sources_months: int | None = None,
347     read_forced_sources_months: int | None = None,
348     use_insert_id: bool = False,
349     connection_timeout: int | None = None,
350     dia_object_index: str | None = None,
351     htm_level: int | None = None,
352     htm_index_column: str | None = None,
353     ra_dec_columns: list[str] | None = None,
354     prefix: str | None = None,
355     namespace: str | None = None,
356     drop: bool = False,
357 ) -> ApdbSqlConfig:
358 """Initialize new APDB instance and make configuration object for it.
359
360 Parameters
361 ----------
362 db_url : `str`
363 SQLAlchemy database URL.
364 schema_file : `str`, optional
365 Location of (YAML) configuration file with APDB schema. If not
366 specified then default location will be used.
367 schema_name : str | None
368 Name of the schema in YAML configuration file. If not specified
369 then default name will be used.
370 read_sources_months : `int`, optional
371 Number of months of history to read from DiaSource.
372 read_forced_sources_months : `int`, optional
373 Number of months of history to read from DiaForcedSource.
374 use_insert_id : `bool`
375 If True, make additional tables used for replication to PPDB.
376 connection_timeout : `int`, optional
377 Database connection timeout in seconds.
378 dia_object_index : `str`, optional
379 Indexing mode for DiaObject table.
380 htm_level : `int`, optional
381 HTM indexing level.
382 htm_index_column : `str`, optional
383 Name of a HTM index column for DiaObject and DiaSource tables.
384 ra_dec_columns : `list` [`str`], optional
385 Names of ra/dec columns in DiaObject table.
386 prefix : `str`, optional
387 Optional prefix for all table names.
388 namespace : `str`, optional
389 Name of the database schema for all APDB tables. If not specified
390 then default schema is used.
391 drop : `bool`, optional
392 If `True` then drop existing tables before re-creating the schema.
393
394 Returns
395 -------
396 config : `ApdbSqlConfig`
397 Resulting configuration object for a created APDB instance.
398 """
399 config = ApdbSqlConfig(db_url=db_url, use_insert_id=use_insert_id)
400 if schema_file is not None:
401 config.schema_file = schema_file
402 if schema_name is not None:
403 config.schema_name = schema_name
404 if read_sources_months is not None:
405 config.read_sources_months = read_sources_months
406 if read_forced_sources_months is not None:
407 config.read_forced_sources_months = read_forced_sources_months
408 if connection_timeout is not None:
409 config.connection_timeout = connection_timeout
410 if dia_object_index is not None:
411 config.dia_object_index = dia_object_index
412 if htm_level is not None:
413 config.htm_level = htm_level
414 if htm_index_column is not None:
415 config.htm_index_column = htm_index_column
416 if ra_dec_columns is not None:
417 config.ra_dec_columns = ra_dec_columns
418 if prefix is not None:
419 config.prefix = prefix
420 if namespace is not None:
421 config.namespace = namespace
422
423 cls._makeSchema(config, drop=drop)
424
425 return config
426
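
A hedged example with a few overrides (the URL and namespace are
placeholders):

# Initialize a new APDB in a dedicated PostgreSQL schema with
# replication tables enabled, then construct the interface.
config = ApdbSql.init_database(
    db_url="postgresql://server/apdb",
    namespace="apdb_schema",
    use_insert_id=True,
)
apdb = ApdbSql(config)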

◆ metadata()

ApdbMetadata lsst.dax.apdb.sql.apdbSql.ApdbSql.metadata ( self)
Object controlling access to APDB metadata (`ApdbMetadata`).

Reimplemented from lsst.dax.apdb.apdb.Apdb.

Definition at line 729 of file apdbSql.py.

729 def metadata(self) -> ApdbMetadata:
730 # docstring is inherited from a base class
731 if self._metadata is None:
732 raise RuntimeError("Database schema was not initialized.")
733 return self._metadata
734
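
A sketch of reading the version keys stored by ``_makeSchema``, assuming
``metadata`` is exposed as a property (as the accessor above suggests) and
that `ApdbMetadata` provides a ``get(key)`` method as used in ``__init__``:

# Keys are the static class attributes documented below.
schema_version = apdb.metadata.get(ApdbSql.metadataSchemaVersionKey)
code_version = apdb.metadata.get(ApdbSql.metadataCodeVersionKey)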

◆ reassignDiaSources()

None lsst.dax.apdb.sql.apdbSql.ApdbSql.reassignDiaSources ( self,
Mapping[int, int] idMap )
Associate DiaSources with SSObjects, disassociating them
from DiaObjects.

Parameters
----------
idMap : `Mapping`
    Maps DiaSource IDs to their new SSObject IDs.

Raises
------
ValueError
    Raised if a DiaSource ID does not exist in the database.

Reimplemented from lsst.dax.apdb.apdb.Apdb.

Definition at line 688 of file apdbSql.py.

688 def reassignDiaSources(self, idMap: Mapping[int, int]) -> None:
689 # docstring is inherited from a base class
690
691 table = self._schema.get_table(ApdbTables.DiaSource)
692 query = table.update().where(table.columns["diaSourceId"] == sql.bindparam("srcId"))
693
694 with self._engine.begin() as conn:
695 # Need to make sure that every ID exists in the database, but
696 # executemany may not support rowcount, so iterate and check what
697 # is missing.
698 missing_ids: list[int] = []
699 for key, value in idMap.items():
700 params = dict(srcId=key, diaObjectId=0, ssObjectId=value)
701 result = conn.execute(query, params)
702 if result.rowcount == 0:
703 missing_ids.append(key)
704 if missing_ids:
705 missing = ",".join(str(item) for item in missing_ids)
706 raise ValueError(f"Following DiaSource IDs do not exist in the database: {missing}")
707
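
A minimal sketch (the IDs are illustrative); `ValueError` is raised if any
of the DiaSource IDs is missing from the database:

# Reassign two DiaSources to SSObject 67890; their DiaObject
# association is cleared (diaObjectId is set to 0).
apdb.reassignDiaSources({12345: 67890, 12346: 67890})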

◆ store()

None lsst.dax.apdb.sql.apdbSql.ApdbSql.store ( self,
astropy.time.Time visit_time,
pandas.DataFrame objects,
pandas.DataFrame | None sources = None,
pandas.DataFrame | None forced_sources = None )
Store all three types of catalogs in the database.

Parameters
----------
visit_time : `astropy.time.Time`
    Time of the visit.
objects : `pandas.DataFrame`
    Catalog with DiaObject records.
sources : `pandas.DataFrame`, optional
    Catalog with DiaSource records.
forced_sources : `pandas.DataFrame`, optional
    Catalog with DiaForcedSource records.

Notes
-----
This method takes DataFrame catalogs; their schemas must be
compatible with the schema of the APDB tables:

  - column names must correspond to database table columns
  - types and units of the columns must match database definitions,
    no unit conversion is performed presently
  - columns that have default values in the database schema can be
    omitted from the catalog
  - this method knows how to fill interval-related columns of DiaObject
    (validityStart, validityEnd); they do not need to appear in the
    catalog
  - source catalogs have a ``diaObjectId`` column associating sources
    with objects

Reimplemented from lsst.dax.apdb.apdb.Apdb.

Definition at line 628 of file apdbSql.py.

628 def store(
629     self,
630     visit_time: astropy.time.Time,
631     objects: pandas.DataFrame,
632     sources: pandas.DataFrame | None = None,
633     forced_sources: pandas.DataFrame | None = None,
634 ) -> None:
635 # docstring is inherited from a base class
636
637 # We want to run all inserts in one transaction.
638 with self._engine.begin() as connection:
639 replica_chunk: ReplicaChunk | None = None
640 if self._schema.has_replica_chunks:
641 replica_chunk = ReplicaChunk.make_replica_chunk(visit_time, self.config.replica_chunk_seconds)
642 self._storeReplicaChunk(replica_chunk, visit_time, connection)
643
644 # fill pixelId column for DiaObjects
645 objects = self._add_obj_htm_index(objects)
646 self._storeDiaObjects(objects, visit_time, replica_chunk, connection)
647
648 if sources is not None:
649 # copy pixelId column from DiaObjects to DiaSources
650 sources = self._add_src_htm_index(sources, objects)
651 self._storeDiaSources(sources, replica_chunk, connection)
652
653 if forced_sources is not None:
654 self._storeDiaForcedSources(forced_sources, replica_chunk, connection)
655
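
A sketch of the call shape, assuming catalogs whose schemas already match
the APDB tables (e.g. catalogs produced by the AP pipeline):

import astropy.time

# All three catalogs are written in a single transaction.
apdb.store(
    astropy.time.Time.now(),
    objects,
    sources=sources,
    forced_sources=forced_sources,
)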

◆ storeSSObjects()

None lsst.dax.apdb.sql.apdbSql.ApdbSql.storeSSObjects ( self,
pandas.DataFrame objects )
Store or update SSObject catalog.

Parameters
----------
objects : `pandas.DataFrame`
    Catalog with SSObject records.

Notes
-----
If SSObjects with matching IDs already exist in the database, their
records will be updated with the information from the provided
records.

Reimplemented from lsst.dax.apdb.apdb.Apdb.

Definition at line 656 of file apdbSql.py.

656 def storeSSObjects(self, objects: pandas.DataFrame) -> None:
657 # docstring is inherited from a base class
658
659 idColumn = "ssObjectId"
660 table = self._schema.get_table(ApdbTables.SSObject)
661
662 # everything to be done in single transaction
663 with self._engine.begin() as conn:
664 # Find record IDs that already exist. Some types like np.int64 can
665 # cause issues with sqlalchemy, convert them to int.
666 ids = sorted(int(oid) for oid in objects[idColumn])
667
668 query = sql.select(table.columns[idColumn], table.columns[idColumn].in_(ids))
669 result = conn.execute(query)
670 knownIds = set(row.ssObjectId for row in result)
671
672 filter = objects[idColumn].isin(knownIds)
673 toUpdate = cast(pandas.DataFrame, objects[filter])
674 toInsert = cast(pandas.DataFrame, objects[~filter])
675
676 # insert new records
677 if len(toInsert) > 0:
678 toInsert.to_sql(table.name, conn, if_exists="append", index=False, schema=table.schema)
679
680 # update existing records
681 if len(toUpdate) > 0:
682 whereKey = f"{idColumn}_param"
683 update = table.update().where(table.columns[idColumn] == sql.bindparam(whereKey))
684 toUpdate = toUpdate.rename({idColumn: whereKey}, axis="columns")
685 values = toUpdate.to_dict("records")
686 result = conn.execute(update, values)
687

◆ tableDef()

Table | None lsst.dax.apdb.sql.apdbSql.ApdbSql.tableDef ( self,
ApdbTables table )
Return table schema definition for a given table.

Parameters
----------
table : `ApdbTables`
    One of the known APDB tables.

Returns
-------
tableSchema : `.schema_model.Table` or `None`
    Table schema description, `None` is returned if table is not
    defined by this implementation.

Reimplemented from lsst.dax.apdb.apdb.Apdb.

Definition at line 459 of file apdbSql.py.

459 def tableDef(self, table: ApdbTables) -> Table | None:
460 # docstring is inherited from a base class
461 return self._schema.tableSchemas.get(table)
462

◆ tableRowCount()

dict[str, int] lsst.dax.apdb.sql.apdbSql.ApdbSql.tableRowCount ( self)
Return dictionary with the table names and row counts.

Used by ``ap_proto`` to keep track of the size of the database tables.
Depending on database technology this could be an expensive operation.

Returns
-------
row_counts : `dict`
    Dict where key is a table name and value is a row count.

Definition at line 435 of file apdbSql.py.

435 def tableRowCount(self) -> dict[str, int]:
436 """Return dictionary with the table names and row counts.
437
438 Used by ``ap_proto`` to keep track of the size of the database tables.
439 Depending on database technology this could be expensive operation.
440
441 Returns
442 -------
443 row_counts : `dict`
444 Dict where key is a table name and value is a row count.
445 """
446 res = {}
447 tables = [ApdbTables.DiaObject, ApdbTables.DiaSource, ApdbTables.DiaForcedSource]
448 if self.config.dia_object_index == "last_object_table":
449 tables.append(ApdbTables.DiaObjectLast)
450 with self._engine.begin() as conn:
451 for table in tables:
452 sa_table = self._schema.get_table(table)
453 stmt = sql.select(func.count()).select_from(sa_table)
454 count: int = conn.execute(stmt).scalar_one()
455 res[table.name] = count
456
457 return res
458

Member Data Documentation

◆ _engine

lsst.dax.apdb.sql.apdbSql.ApdbSql._engine
protected

Definition at line 213 of file apdbSql.py.

◆ _frozen_parameters

tuple lsst.dax.apdb.sql.apdbSql.ApdbSql._frozen_parameters
staticprotected
Initial value:
= (
"use_insert_id",
"dia_object_index",
"htm_level",
"htm_index_column",
"ra_dec_columns",
)

Definition at line 203 of file apdbSql.py.

◆ _metadata

lsst.dax.apdb.sql.apdbSql.ApdbSql._metadata
protected

Definition at line 221 of file apdbSql.py.

◆ _schema

lsst.dax.apdb.sql.apdbSql.ApdbSql._schema
protected

Definition at line 233 of file apdbSql.py.

◆ config

lsst.dax.apdb.sql.apdbSql.ApdbSql.config

Definition at line 228 of file apdbSql.py.

◆ ConfigClass

lsst.dax.apdb.sql.apdbSql.ApdbSql.ConfigClass = ApdbSqlConfig
static

Definition at line 189 of file apdbSql.py.

◆ metadataCodeVersionKey [1/2]

str lsst.dax.apdb.sql.apdbSql.ApdbSql.metadataCodeVersionKey = "version:ApdbSql"
static

Definition at line 194 of file apdbSql.py.

◆ metadataCodeVersionKey [2/2]

lsst.dax.apdb.sql.apdbSql.ApdbSql.metadataCodeVersionKey

Definition at line 494 of file apdbSql.py.

◆ metadataConfigKey [1/2]

str lsst.dax.apdb.sql.apdbSql.ApdbSql.metadataConfigKey = "config:apdb-sql.json"
static

Definition at line 200 of file apdbSql.py.

◆ metadataConfigKey [2/2]

lsst.dax.apdb.sql.apdbSql.ApdbSql.metadataConfigKey

Definition at line 505 of file apdbSql.py.

◆ metadataReplicaVersionKey [1/2]

str lsst.dax.apdb.sql.apdbSql.ApdbSql.metadataReplicaVersionKey = "version:ApdbSqlReplica"
static

Definition at line 197 of file apdbSql.py.

◆ metadataReplicaVersionKey [2/2]

lsst.dax.apdb.sql.apdbSql.ApdbSql.metadataReplicaVersionKey

Definition at line 498 of file apdbSql.py.

◆ metadataSchemaVersionKey [1/2]

str lsst.dax.apdb.sql.apdbSql.ApdbSql.metadataSchemaVersionKey = "version:schema"
static

Definition at line 191 of file apdbSql.py.

◆ metadataSchemaVersionKey [2/2]

lsst.dax.apdb.sql.apdbSql.ApdbSql.metadataSchemaVersionKey

Definition at line 493 of file apdbSql.py.

◆ pixelator

lsst.dax.apdb.sql.apdbSql.ApdbSql.pixelator

Definition at line 247 of file apdbSql.py.


The documentation for this class was generated from the following file: apdbSql.py