lsst.dax.apdb.apdbCassandra.ApdbCassandra Class Reference
Inheritance diagram for lsst.dax.apdb.apdbCassandra.ApdbCassandra:
lsst.dax.apdb.apdb.Apdb

Public Member Functions

 __init__ (self, ApdbCassandraConfig config)
 
None __del__ (self)
 
VersionTuple apdbImplementationVersion (cls)
 
VersionTuple apdbSchemaVersion (self)
 
Table|None tableDef (self, ApdbTables table)
 
ApdbCassandraConfig init_database (cls, list[str] hosts, str keyspace, *, str|None schema_file=None, str|None schema_name=None, int|None read_sources_months=None, int|None read_forced_sources_months=None, bool use_insert_id=False, bool use_insert_id_skips_diaobjects=False, int|None port=None, str|None username=None, str|None prefix=None, str|None part_pixelization=None, int|None part_pix_level=None, bool time_partition_tables=True, str|None time_partition_start=None, str|None time_partition_end=None, str|None read_consistency=None, str|None write_consistency=None, int|None read_timeout=None, int|None write_timeout=None, list[str]|None ra_dec_columns=None, int|None replication_factor=None, bool drop=False)
 
pandas.DataFrame getDiaObjects (self, sphgeom.Region region)
 
pandas.DataFrame|None getDiaSources (self, sphgeom.Region region, Iterable[int]|None object_ids, astropy.time.Time visit_time)
 
pandas.DataFrame|None getDiaForcedSources (self, sphgeom.Region region, Iterable[int]|None object_ids, astropy.time.Time visit_time)
 
bool containsVisitDetector (self, int visit, int detector)
 
list[ApdbInsertId]|None getInsertIds (self)
 
None deleteInsertIds (self, Iterable[ApdbInsertId] ids)
 
ApdbTableData getDiaObjectsHistory (self, Iterable[ApdbInsertId] ids)
 
ApdbTableData getDiaSourcesHistory (self, Iterable[ApdbInsertId] ids)
 
ApdbTableData getDiaForcedSourcesHistory (self, Iterable[ApdbInsertId] ids)
 
pandas.DataFrame getSSObjects (self)
 
None store (self, astropy.time.Time visit_time, pandas.DataFrame objects, pandas.DataFrame|None sources=None, pandas.DataFrame|None forced_sources=None)
 
None storeSSObjects (self, pandas.DataFrame objects)
 
None reassignDiaSources (self, Mapping[int, int] idMap)
 
None dailyJob (self)
 
int countUnassociatedObjects (self)
 
ApdbMetadata metadata (self)
 

Public Attributes

 config
 
 metadataSchemaVersionKey
 
 metadataCodeVersionKey
 
 metadataConfigKey
 

Static Public Attributes

str metadataSchemaVersionKey = "version:schema"
 
str metadataCodeVersionKey = "version:ApdbCassandra"
 
str metadataConfigKey = "config:apdb-cassandra.json"
 
 partition_zero_epoch = astropy.time.Time(0, format="unix_tai")
 

Protected Member Functions

tuple[Cluster, Session] _make_session (cls, ApdbCassandraConfig config)
 
AuthProvider|None _make_auth_provider (cls, ApdbCassandraConfig config)
 
None _versionCheck (self, ApdbMetadataCassandra metadata)
 
None _makeSchema (cls, ApdbConfig config, *, bool drop=False, int|None replication_factor=None)
 
Mapping[Any, ExecutionProfile] _makeProfiles (cls, ApdbCassandraConfig config)
 
pandas.DataFrame _getSources (self, sphgeom.Region region, Iterable[int]|None object_ids, float mjd_start, float mjd_end, ApdbTables table_name)
 
ApdbTableData _get_history (self, ExtraTables table, Iterable[ApdbInsertId] ids)
 
None _storeInsertId (self, ApdbInsertId insert_id, astropy.time.Time visit_time)
 
None _storeDiaObjects (self, pandas.DataFrame objs, astropy.time.Time visit_time, ApdbInsertId|None insert_id)
 
None _storeDiaSources (self, ApdbTables table_name, pandas.DataFrame sources, astropy.time.Time visit_time, ApdbInsertId|None insert_id)
 
None _storeDiaSourcesPartitions (self, pandas.DataFrame sources, astropy.time.Time visit_time, ApdbInsertId|None insert_id)
 
None _storeObjectsPandas (self, pandas.DataFrame records, ApdbTables|ExtraTables table_name, Mapping|None extra_columns=None, int|None time_part=None)
 
pandas.DataFrame _add_obj_part (self, pandas.DataFrame df)
 
pandas.DataFrame _add_src_part (self, pandas.DataFrame sources, pandas.DataFrame objs)
 
pandas.DataFrame _add_fsrc_part (self, pandas.DataFrame sources, pandas.DataFrame objs)
 
int _time_partition_cls (cls, float|astropy.time.Time time, float epoch_mjd, int part_days)
 
int _time_partition (self, float|astropy.time.Time time)
 
pandas.DataFrame _make_empty_catalog (self, ApdbTables table_name)
 
Iterator[tuple[cassandra.query.Statement, tuple]] _combine_where (self, str prefix, list[tuple[str, tuple]] where1, list[tuple[str, tuple]] where2, str|None suffix=None)
 
list[tuple[str, tuple]] _spatial_where (self, sphgeom.Region|None region, bool use_ranges=False)
 
tuple[list[str], list[tuple[str, tuple]]] _temporal_where (self, ApdbTables table, float|astropy.time.Time start_time, float|astropy.time.Time end_time, bool|None query_per_time_part=None)
 

Protected Attributes

 _keyspace
 
 _cluster
 
 _session
 
 _metadata
 
 _pixelization
 
 _schema
 
 _partition_zero_epoch_mjd
 
 _preparer
 

Static Protected Attributes

tuple _frozen_parameters
 

Detailed Description

Implementation of APDB database on top of Apache Cassandra.

The implementation is configured via the standard ``pex_config`` mechanism
using the `ApdbCassandraConfig` configuration class. For examples of
different configurations, check the config/ folder.

Parameters
----------
config : `ApdbCassandraConfig`
    Configuration object.

Definition at line 249 of file apdbCassandra.py.
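
A minimal construction sketch (hypothetical contact point and keyspace; assumes a running Cassandra cluster with an existing APDB schema, and import paths inferred from this module):

    from lsst.dax.apdb.apdbCassandra import ApdbCassandra, ApdbCassandraConfig

    config = ApdbCassandraConfig()
    config.contact_points = ["cassandra.example.org"]  # hypothetical host
    config.keyspace = "apdb"                           # hypothetical keyspace
    apdb = ApdbCassandra(config)

New databases are instead initialized with `init_database`, which also stores the frozen part of the configuration in the metadata table.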

Constructor & Destructor Documentation

◆ __init__()

lsst.dax.apdb.apdbCassandra.ApdbCassandra.__init__ ( self,
ApdbCassandraConfig config )

Definition at line 285 of file apdbCassandra.py.

285 def __init__(self, config: ApdbCassandraConfig):
286 if not CASSANDRA_IMPORTED:
287 raise CassandraMissingError()
288
289 self._keyspace = config.keyspace
290
291 self._cluster, self._session = self._make_session(config)
292
293 meta_table_name = ApdbTables.metadata.table_name(config.prefix)
294 self._metadata = ApdbMetadataCassandra(
295 self._session, meta_table_name, config.keyspace, "read_tuples", "write"
296 )
297
298 # Read frozen config from metadata.
299 config_json = self._metadata.get(self.metadataConfigKey)
300 if config_json is not None:
301 # Update config from metadata.
302 freezer = ApdbConfigFreezer[ApdbCassandraConfig](self._frozen_parameters)
303 self.config = freezer.update(config, config_json)
304 else:
305 self.config = config
306 self.config.validate()
307
308 self._pixelization = Pixelization(
309 self.config.part_pixelization,
310 self.config.part_pix_level,
311 config.part_pix_max_ranges,
312 )
313
314 self._schema = ApdbCassandraSchema(
315 session=self._session,
316 keyspace=self._keyspace,
317 schema_file=self.config.schema_file,
318 schema_name=self.config.schema_name,
319 prefix=self.config.prefix,
320 time_partition_tables=self.config.time_partition_tables,
321 use_insert_id=self.config.use_insert_id,
322 )
323 self._partition_zero_epoch_mjd = float(self.partition_zero_epoch.mjd)
324
325 if self._metadata.table_exists():
326 self._versionCheck(self._metadata)
327
328 # Cache for prepared statements
329 self._preparer = PreparedStatementCache(self._session)
330
331 _LOG.debug("ApdbCassandra Configuration:")
332 for key, value in self.config.items():
333 _LOG.debug(" %s: %s", key, value)
334

◆ __del__()

None lsst.dax.apdb.apdbCassandra.ApdbCassandra.__del__ ( self)

Definition at line 335 of file apdbCassandra.py.

335 def __del__(self) -> None:
336 if hasattr(self, "_cluster"):
337 self._cluster.shutdown()
338

Member Function Documentation

◆ _add_fsrc_part()

pandas.DataFrame lsst.dax.apdb.apdbCassandra.ApdbCassandra._add_fsrc_part ( self,
pandas.DataFrame sources,
pandas.DataFrame objs )
protected
Add apdb_part column to DiaForcedSource catalog.

Notes
-----
This method copies the apdb_part value from the matching DiaObject record.
The DiaObject catalog needs to have an apdb_part column filled by the
``_add_obj_part`` method, and DiaSource records need to be associated to
DiaObjects via the ``diaObjectId`` column.

This overrides any existing column in the DataFrame with the same name
(apdb_part). The original DataFrame is not changed; a copy is returned.

Definition at line 1311 of file apdbCassandra.py.

1311 def _add_fsrc_part(self, sources: pandas.DataFrame, objs: pandas.DataFrame) -> pandas.DataFrame:
1312 """Add apdb_part column to DiaForcedSource catalog.
1313
1314 Notes
1315 -----
1316 This method copies apdb_part value from a matching DiaObject record.
1317 DiaObject catalog needs to have a apdb_part column filled by
1318 ``_add_obj_part`` method and DiaSource records need to be
1319 associated to DiaObjects via ``diaObjectId`` column.
1320
1321 This overrides any existing column in a DataFrame with the same name
1322 (apdb_part). Original DataFrame is not changed, copy of a DataFrame is
1323 returned.
1324 """
1325 pixel_id_map: dict[int, int] = {
1326 diaObjectId: apdb_part for diaObjectId, apdb_part in zip(objs["diaObjectId"], objs["apdb_part"])
1327 }
1328 apdb_part = np.zeros(sources.shape[0], dtype=np.int64)
1329 for i, diaObjId in enumerate(sources["diaObjectId"]):
1330 apdb_part[i] = pixel_id_map[diaObjId]
1331 sources = sources.copy()
1332 sources["apdb_part"] = apdb_part
1333 return sources
1334

◆ _add_obj_part()

pandas.DataFrame lsst.dax.apdb.apdbCassandra.ApdbCassandra._add_obj_part ( self,
pandas.DataFrame df )
protected
Calculate the spatial partition for each record and add it to the
DataFrame.

Notes
-----
This overrides any existing column in the DataFrame with the same name
(apdb_part). The original DataFrame is not changed; a copy is returned.

Definition at line 1255 of file apdbCassandra.py.

1255 def _add_obj_part(self, df: pandas.DataFrame) -> pandas.DataFrame:
1256 """Calculate spatial partition for each record and add it to a
1257 DataFrame.
1258
1259 Notes
1260 -----
1261 This overrides any existing column in a DataFrame with the same name
1262 (apdb_part). Original DataFrame is not changed, copy of a DataFrame is
1263 returned.
1264 """
1265 # calculate HTM index for every DiaObject
1266 apdb_part = np.zeros(df.shape[0], dtype=np.int64)
1267 ra_col, dec_col = self.config.ra_dec_columns
1268 for i, (ra, dec) in enumerate(zip(df[ra_col], df[dec_col])):
1269 uv3d = sphgeom.UnitVector3d(sphgeom.LonLat.fromDegrees(ra, dec))
1270 idx = self._pixelization.pixel(uv3d)
1271 apdb_part[i] = idx
1272 df = df.copy()
1273 df["apdb_part"] = apdb_part
1274 return df
1275
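
For orientation, a comment-only sketch of the computation above (hypothetical coordinates; same sphgeom calls as in the listing):

    # Each row's ra/dec is mapped to a single pixel index, which becomes
    # its "apdb_part" spatial partition key:
    #   uv3d = sphgeom.UnitVector3d(sphgeom.LonLat.fromDegrees(45.0, -30.0))
    #   pixel = self._pixelization.pixel(uv3d)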

◆ _add_src_part()

pandas.DataFrame lsst.dax.apdb.apdbCassandra.ApdbCassandra._add_src_part ( self,
pandas.DataFrame sources,
pandas.DataFrame objs )
protected
Add apdb_part column to DiaSource catalog.

Notes
-----
This method copies the apdb_part value from the matching DiaObject record.
The DiaObject catalog needs to have an apdb_part column filled by the
``_add_obj_part`` method, and DiaSource records need to be associated to
DiaObjects via the ``diaObjectId`` column.

This overrides any existing column in the DataFrame with the same name
(apdb_part). The original DataFrame is not changed; a copy is returned.

Definition at line 1276 of file apdbCassandra.py.

1276 def _add_src_part(self, sources: pandas.DataFrame, objs: pandas.DataFrame) -> pandas.DataFrame:
1277 """Add apdb_part column to DiaSource catalog.
1278
1279 Notes
1280 -----
1281 This method copies apdb_part value from a matching DiaObject record.
1282 DiaObject catalog needs to have a apdb_part column filled by
1283 ``_add_obj_part`` method and DiaSource records need to be
1284 associated to DiaObjects via ``diaObjectId`` column.
1285
1286 This overrides any existing column in a DataFrame with the same name
1287 (apdb_part). Original DataFrame is not changed, copy of a DataFrame is
1288 returned.
1289 """
1290 pixel_id_map: dict[int, int] = {
1291 diaObjectId: apdb_part for diaObjectId, apdb_part in zip(objs["diaObjectId"], objs["apdb_part"])
1292 }
1293 apdb_part = np.zeros(sources.shape[0], dtype=np.int64)
1294 ra_col, dec_col = self.config.ra_dec_columns
1295 for i, (diaObjId, ra, dec) in enumerate(
1296 zip(sources["diaObjectId"], sources[ra_col], sources[dec_col])
1297 ):
1298 if diaObjId == 0:
1299 # DiaSources associated with SolarSystemObjects do not have an
1300 # associated DiaObject hence we skip them and set partition
1301 # based on its own ra/dec
1302 uv3d = sphgeom.UnitVector3d(sphgeom.LonLat.fromDegrees(ra, dec))
1303 idx = self._pixelization.pixel(uv3d)
1304 apdb_part[i] = idx
1305 else:
1306 apdb_part[i] = pixel_id_map[diaObjId]
1307 sources = sources.copy()
1308 sources["apdb_part"] = apdb_part
1309 return sources
1310

◆ _combine_where()

Iterator[tuple[cassandra.query.Statement, tuple]] lsst.dax.apdb.apdbCassandra.ApdbCassandra._combine_where ( self,
str prefix,
list[tuple[str, tuple]] where1,
list[tuple[str, tuple]] where2,
str | None suffix = None )
protected
Make a cartesian product of two parts of the WHERE clause into a series
of statements to execute.

Parameters
----------
prefix : `str`
    Initial statement prefix that comes before the WHERE clause, e.g.
    "SELECT * from Table".

Definition at line 1405 of file apdbCassandra.py.

1411 ) -> Iterator[tuple[cassandra.query.Statement, tuple]]:
1412 """Make cartesian product of two parts of WHERE clause into a series
1413 of statements to execute.
1414
1415 Parameters
1416 ----------
1417 prefix : `str`
1418 Initial statement prefix that comes before WHERE clause, e.g.
1419 "SELECT * from Table"
1420 """
1421 # If lists are empty use special sentinels.
1422 if not where1:
1423 where1 = [("", ())]
1424 if not where2:
1425 where2 = [("", ())]
1426
1427 for expr1, params1 in where1:
1428 for expr2, params2 in where2:
1429 full_query = prefix
1430 wheres = []
1431 if expr1:
1432 wheres.append(expr1)
1433 if expr2:
1434 wheres.append(expr2)
1435 if wheres:
1436 full_query += " WHERE " + " AND ".join(wheres)
1437 if suffix:
1438 full_query += " " + suffix
1439 params = params1 + params2
1440 if params:
1441 statement = self._preparer.prepare(full_query)
1442 else:
1443 # If there are no params then it is likely that query
1444 # has a bunch of literals rendered already, no point
1445 # trying to prepare it.
1446 statement = cassandra.query.SimpleStatement(full_query)
1447 yield (statement, params)
1448
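
A comment-only illustration of the combination (hypothetical expressions and parameter values):

    # Two spatial terms x one temporal term yield two statements:
    #   where1 = [('"apdb_part" = ?', (10,)), ('"apdb_part" = ?', (11,))]
    #   where2 = [('"apdb_time_part" = ?', (600,))]
    # _combine_where('SELECT * from t', where1, where2) generates:
    #   SELECT * from t WHERE "apdb_part" = ? AND "apdb_time_part" = ?   with params (10, 600)
    #   SELECT * from t WHERE "apdb_part" = ? AND "apdb_time_part" = ?   with params (11, 600)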

◆ _get_history()

ApdbTableData lsst.dax.apdb.apdbCassandra.ApdbCassandra._get_history ( self,
ExtraTables table,
Iterable[ApdbInsertId] ids )
protected
Return records from a particular table given a set of insert IDs.

Definition at line 1037 of file apdbCassandra.py.

1037 def _get_history(self, table: ExtraTables, ids: Iterable[ApdbInsertId]) -> ApdbTableData:
1038 """Return records from a particular table given set of insert IDs."""
1039 if not self._schema.has_insert_id:
1040 raise ValueError("APDB is not configured for history retrieval")
1041
1042 insert_ids = [id.id for id in ids]
1043 params = ",".join("?" * len(insert_ids))
1044
1045 table_name = self._schema.tableName(table)
1046 # I know that history table schema has only regular APDB columns plus
1047 # an insert_id column, and this is exactly what we need to return from
1048 # this method, so selecting a star is fine here.
1049 query = f'SELECT * FROM "{self._keyspace}"."{table_name}" WHERE insert_id IN ({params})'
1050 statement = self._preparer.prepare(query)
1051
1052 with Timer("DiaObject history", self.config.timer):
1053 result = self._session.execute(statement, insert_ids, execution_profile="read_raw")
1054 table_data = cast(ApdbCassandraTableData, result._current_rows)
1055 return table_data
1056

◆ _getSources()

pandas.DataFrame lsst.dax.apdb.apdbCassandra.ApdbCassandra._getSources ( self,
sphgeom.Region region,
Iterable[int] | None object_ids,
float mjd_start,
float mjd_end,
ApdbTables table_name )
protected
Return a catalog of DiaSource instances given a set of DiaObject IDs.

Parameters
----------
region : `lsst.sphgeom.Region`
    Spherical region.
object_ids : iterable [ `int` ], or `None`
    Collection of DiaObject IDs.
mjd_start : `float`
    Lower bound of time interval.
mjd_end : `float`
    Upper bound of time interval.
table_name : `ApdbTables`
    Name of the table.

Returns
-------
catalog : `pandas.DataFrame`
    Catalog containing DiaSource records. An empty catalog is returned
    if ``object_ids`` is empty.

Definition at line 970 of file apdbCassandra.py.

977 ) -> pandas.DataFrame:
978 """Return catalog of DiaSource instances given set of DiaObject IDs.
979
980 Parameters
981 ----------
982 region : `lsst.sphgeom.Region`
983 Spherical region.
984 object_ids :
985 Collection of DiaObject IDs
986 mjd_start : `float`
987 Lower bound of time interval.
988 mjd_end : `float`
989 Upper bound of time interval.
990 table_name : `ApdbTables`
991 Name of the table.
992
993 Returns
994 -------
995 catalog : `pandas.DataFrame`, or `None`
996 Catalog containing DiaSource records. Empty catalog is returned if
997 ``object_ids`` is empty.
998 """
999 object_id_set: Set[int] = set()
1000 if object_ids is not None:
1001 object_id_set = set(object_ids)
1002 if len(object_id_set) == 0:
1003 return self._make_empty_catalog(table_name)
1004
1005 sp_where = self._spatial_where(region)
1006 tables, temporal_where = self._temporal_where(table_name, mjd_start, mjd_end)
1007
1008 # We need to exclude extra partitioning columns from result.
1009 column_names = self._schema.apdbColumnNames(table_name)
1010 what = ",".join(quote_id(column) for column in column_names)
1011
1012 # Build all queries
1013 statements: list[tuple] = []
1014 for table in tables:
1015 prefix = f'SELECT {what} from "{self._keyspace}"."{table}"'
1016 statements += list(self._combine_where(prefix, sp_where, temporal_where))
1017 _LOG.debug("_getSources %s: #queries: %s", table_name, len(statements))
1018
1019 with Timer(table_name.name + " select", self.config.timer):
1020 catalog = cast(
1021 pandas.DataFrame,
1022 select_concurrent(
1023 self._session, statements, "read_pandas_multi", self.config.read_concurrency
1024 ),
1025 )
1026
1027 # filter by given object IDs
1028 if len(object_id_set) > 0:
1029 catalog = cast(pandas.DataFrame, catalog[catalog["diaObjectId"].isin(object_id_set)])
1030
1031 # precise filtering on midpointMjdTai
1032 catalog = cast(pandas.DataFrame, catalog[catalog["midpointMjdTai"] > mjd_start])
1033
1034 _LOG.debug("found %d %ss", catalog.shape[0], table_name.name)
1035 return catalog
1036

◆ _make_auth_provider()

AuthProvider | None lsst.dax.apdb.apdbCassandra.ApdbCassandra._make_auth_provider ( cls,
ApdbCassandraConfig config )
protected
Make Cassandra authentication provider instance.

Definition at line 361 of file apdbCassandra.py.

361 def _make_auth_provider(cls, config: ApdbCassandraConfig) -> AuthProvider | None:
362 """Make Cassandra authentication provider instance."""
363 try:
364 dbauth = DbAuth(DB_AUTH_PATH, DB_AUTH_ENVVAR)
365 except DbAuthNotFoundError:
366 # Credentials file doesn't exist, use anonymous login.
367 return None
368
369 empty_username = False
370 # Try every contact point in turn.
371 for hostname in config.contact_points:
372 try:
373 username, password = dbauth.getAuth(
374 "cassandra", config.username, hostname, config.port, config.keyspace
375 )
376 if not username:
377 # Password without user name, try next hostname, but give
378 # warning later if no better match is found.
379 empty_username = True
380 else:
381 return PlainTextAuthProvider(username=username, password=password)
382 except DbAuthNotFoundError:
383 pass
384
385 if empty_username:
386 _LOG.warning(
387 f"Credentials file ({DB_AUTH_PATH} or ${DB_AUTH_ENVVAR}) provided password but not "
388 f"user name, anonymous Cassandra logon will be attempted."
389 )
390
391 return None
392

◆ _make_empty_catalog()

pandas.DataFrame lsst.dax.apdb.apdbCassandra.ApdbCassandra._make_empty_catalog ( self,
ApdbTables table_name )
protected
Make an empty catalog for a table with a given name.

Parameters
----------
table_name : `ApdbTables`
    Name of the table.

Returns
-------
catalog : `pandas.DataFrame`
    An empty catalog.

Definition at line 1384 of file apdbCassandra.py.

1384 def _make_empty_catalog(self, table_name: ApdbTables) -> pandas.DataFrame:
1385 """Make an empty catalog for a table with a given name.
1386
1387 Parameters
1388 ----------
1389 table_name : `ApdbTables`
1390 Name of the table.
1391
1392 Returns
1393 -------
1394 catalog : `pandas.DataFrame`
1395 An empty catalog.
1396 """
1397 table = self._schema.tableSchemas[table_name]
1398
1399 data = {
1400 columnDef.name: pandas.Series(dtype=self._schema.column_dtype(columnDef.datatype))
1401 for columnDef in table.columns
1402 }
1403 return pandas.DataFrame(data)
1404

◆ _make_session()

tuple[Cluster, Session] lsst.dax.apdb.apdbCassandra.ApdbCassandra._make_session ( cls,
ApdbCassandraConfig config )
protected
Make Cassandra session.

Definition at line 340 of file apdbCassandra.py.

340 def _make_session(cls, config: ApdbCassandraConfig) -> tuple[Cluster, Session]:
341 """Make Cassandra session."""
342 addressTranslator: AddressTranslator | None = None
343 if config.private_ips:
344 addressTranslator = _AddressTranslator(list(config.contact_points), list(config.private_ips))
345
346 cluster = Cluster(
347 execution_profiles=cls._makeProfiles(config),
348 contact_points=config.contact_points,
349 port=config.port,
350 address_translator=addressTranslator,
351 protocol_version=config.protocol_version,
352 auth_provider=cls._make_auth_provider(config),
353 )
354 session = cluster.connect()
355 # Disable result paging
356 session.default_fetch_size = None
357
358 return cluster, session
359

◆ _makeProfiles()

Mapping[Any, ExecutionProfile] lsst.dax.apdb.apdbCassandra.ApdbCassandra._makeProfiles ( cls,
ApdbCassandraConfig config )
protected
Make all execution profiles used in the code.

Definition at line 911 of file apdbCassandra.py.

911 def _makeProfiles(cls, config: ApdbCassandraConfig) -> Mapping[Any, ExecutionProfile]:
912 """Make all execution profiles used in the code."""
913 if config.private_ips:
914 loadBalancePolicy = WhiteListRoundRobinPolicy(hosts=config.contact_points)
915 else:
916 loadBalancePolicy = RoundRobinPolicy()
917
918 read_tuples_profile = ExecutionProfile(
919 consistency_level=getattr(cassandra.ConsistencyLevel, config.read_consistency),
920 request_timeout=config.read_timeout,
921 row_factory=cassandra.query.tuple_factory,
922 load_balancing_policy=loadBalancePolicy,
923 )
924 read_pandas_profile = ExecutionProfile(
925 consistency_level=getattr(cassandra.ConsistencyLevel, config.read_consistency),
926 request_timeout=config.read_timeout,
927 row_factory=pandas_dataframe_factory,
928 load_balancing_policy=loadBalancePolicy,
929 )
930 read_raw_profile = ExecutionProfile(
931 consistency_level=getattr(cassandra.ConsistencyLevel, config.read_consistency),
932 request_timeout=config.read_timeout,
933 row_factory=raw_data_factory,
934 load_balancing_policy=loadBalancePolicy,
935 )
936 # Profile to use with select_concurrent to return pandas data frame
937 read_pandas_multi_profile = ExecutionProfile(
938 consistency_level=getattr(cassandra.ConsistencyLevel, config.read_consistency),
939 request_timeout=config.read_timeout,
940 row_factory=pandas_dataframe_factory,
941 load_balancing_policy=loadBalancePolicy,
942 )
943 # Profile to use with select_concurrent to return raw data (columns and
944 # rows)
945 read_raw_multi_profile = ExecutionProfile(
946 consistency_level=getattr(cassandra.ConsistencyLevel, config.read_consistency),
947 request_timeout=config.read_timeout,
948 row_factory=raw_data_factory,
949 load_balancing_policy=loadBalancePolicy,
950 )
951 write_profile = ExecutionProfile(
952 consistency_level=getattr(cassandra.ConsistencyLevel, config.write_consistency),
953 request_timeout=config.write_timeout,
954 load_balancing_policy=loadBalancePolicy,
955 )
956 # To replace default DCAwareRoundRobinPolicy
957 default_profile = ExecutionProfile(
958 load_balancing_policy=loadBalancePolicy,
959 )
960 return {
961 "read_tuples": read_tuples_profile,
962 "read_pandas": read_pandas_profile,
963 "read_raw": read_raw_profile,
964 "read_pandas_multi": read_pandas_multi_profile,
965 "read_raw_multi": read_raw_multi_profile,
966 "write": write_profile,
967 EXEC_PROFILE_DEFAULT: default_profile,
968 }
969

◆ _makeSchema()

None lsst.dax.apdb.apdbCassandra.ApdbCassandra._makeSchema ( cls,
ApdbConfig config,
*bool drop = False,
int | None replication_factor = None )
protected

Definition at line 573 of file apdbCassandra.py.

575 ) -> None:
576 # docstring is inherited from a base class
577
578 if not isinstance(config, ApdbCassandraConfig):
579 raise TypeError(f"Unexpected type of configuration object: {type(config)}")
580
581 cluster, session = cls._make_session(config)
582
583 schema = ApdbCassandraSchema(
584 session=session,
585 keyspace=config.keyspace,
586 schema_file=config.schema_file,
587 schema_name=config.schema_name,
588 prefix=config.prefix,
589 time_partition_tables=config.time_partition_tables,
590 use_insert_id=config.use_insert_id,
591 )
592
593 # Ask schema to create all tables.
594 if config.time_partition_tables:
595 time_partition_start = astropy.time.Time(config.time_partition_start, format="isot", scale="tai")
596 time_partition_end = astropy.time.Time(config.time_partition_end, format="isot", scale="tai")
597 part_epoch = float(cls.partition_zero_epoch.mjd)
598 part_days = config.time_partition_days
599 part_range = (
600 cls._time_partition_cls(time_partition_start, part_epoch, part_days),
601 cls._time_partition_cls(time_partition_end, part_epoch, part_days) + 1,
602 )
603 schema.makeSchema(drop=drop, part_range=part_range, replication_factor=replication_factor)
604 else:
605 schema.makeSchema(drop=drop, replication_factor=replication_factor)
606
607 meta_table_name = ApdbTables.metadata.table_name(config.prefix)
608 metadata = ApdbMetadataCassandra(session, meta_table_name, config.keyspace, "read_tuples", "write")
609
610 # Fill version numbers, overrides if they existed before.
611 if metadata.table_exists():
612 metadata.set(cls.metadataSchemaVersionKey, str(schema.schemaVersion()), force=True)
613 metadata.set(cls.metadataCodeVersionKey, str(cls.apdbImplementationVersion()), force=True)
614
615 # Store frozen part of a configuration in metadata.
616 freezer = ApdbConfigFreezer[ApdbCassandraConfig](cls._frozen_parameters)
617 metadata.set(cls.metadataConfigKey, freezer.to_json(config), force=True)
618
619 cluster.shutdown()
620

◆ _spatial_where()

list[tuple[str, tuple]] lsst.dax.apdb.apdbCassandra.ApdbCassandra._spatial_where ( self,
sphgeom.Region | None region,
bool use_ranges = False )
protected
Generate expressions for spatial part of WHERE clause.

Parameters
----------
region : `sphgeom.Region`
    Spatial region for query results.
use_ranges : `bool`
    If True then use pixel ranges ("apdb_part >= p1 AND apdb_part <=
    p2") instead of exact list of pixels. Should be set to True for
    large regions covering very many pixels.

Returns
-------
expressions : `list` [ `tuple` ]
    Empty list is returned if ``region`` is `None`, otherwise a list
    of one or more (expression, parameters) tuples

Definition at line 1449 of file apdbCassandra.py.

1451 ) -> list[tuple[str, tuple]]:
1452 """Generate expressions for spatial part of WHERE clause.
1453
1454 Parameters
1455 ----------
1456 region : `sphgeom.Region`
1457 Spatial region for query results.
1458 use_ranges : `bool`
1459 If True then use pixel ranges ("apdb_part >= p1 AND apdb_part <=
1460 p2") instead of exact list of pixels. Should be set to True for
1461 large regions covering very many pixels.
1462
1463 Returns
1464 -------
1465 expressions : `list` [ `tuple` ]
1466 Empty list is returned if ``region`` is `None`, otherwise a list
1467 of one or more (expression, parameters) tuples
1468 """
1469 if region is None:
1470 return []
1471 if use_ranges:
1472 pixel_ranges = self._pixelization.envelope(region)
1473 expressions: list[tuple[str, tuple]] = []
1474 for lower, upper in pixel_ranges:
1475 upper -= 1
1476 if lower == upper:
1477 expressions.append(('"apdb_part" = ?', (lower,)))
1478 else:
1479 expressions.append(('"apdb_part" >= ? AND "apdb_part" <= ?', (lower, upper)))
1480 return expressions
1481 else:
1482 pixels = self._pixelization.pixels(region)
1483 if self.config.query_per_spatial_part:
1484 return [('"apdb_part" = ?', (pixel,)) for pixel in pixels]
1485 else:
1486 pixels_str = ",".join([str(pix) for pix in pixels])
1487 return [(f'"apdb_part" IN ({pixels_str})', ())]
1488
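
A comment-only illustration of the two output forms (hypothetical pixel indices):

    # use_ranges=True  -> [('"apdb_part" >= ? AND "apdb_part" <= ?', (512, 515))]
    # use_ranges=False -> [('"apdb_part" IN (512,513,514,515)', ())]
    # (with query_per_spatial_part=True the IN form becomes one
    #  '"apdb_part" = ?' expression per pixel)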

◆ _storeDiaObjects()

None lsst.dax.apdb.apdbCassandra.ApdbCassandra._storeDiaObjects ( self,
pandas.DataFrame objs,
astropy.time.Time visit_time,
ApdbInsertId | None insert_id )
protected
Store catalog of DiaObjects from the current visit.

Parameters
----------
objs : `pandas.DataFrame`
    Catalog with DiaObject records.
visit_time : `astropy.time.Time`
    Time of the current visit.

Definition at line 1077 of file apdbCassandra.py.

1079 ) -> None:
1080 """Store catalog of DiaObjects from current visit.
1081
1082 Parameters
1083 ----------
1084 objs : `pandas.DataFrame`
1085 Catalog with DiaObject records
1086 visit_time : `astropy.time.Time`
1087 Time of the current visit.
1088 """
1089 if len(objs) == 0:
1090 _LOG.debug("No objects to write to database.")
1091 return
1092
1093 visit_time_dt = visit_time.datetime
1094 extra_columns = dict(lastNonForcedSource=visit_time_dt)
1095 self._storeObjectsPandas(objs, ApdbTables.DiaObjectLast, extra_columns=extra_columns)
1096
1097 extra_columns["validityStart"] = visit_time_dt
1098 time_part: int | None = self._time_partition(visit_time)
1099 if not self.config.time_partition_tables:
1100 extra_columns["apdb_time_part"] = time_part
1101 time_part = None
1102
1103 # Only store DiaObjects if not storing insert_ids or explicitly
1104 # configured to always store them
1105 if insert_id is None or not self.config.use_insert_id_skips_diaobjects:
1106 self._storeObjectsPandas(
1107 objs, ApdbTables.DiaObject, extra_columns=extra_columns, time_part=time_part
1108 )
1109
1110 if insert_id is not None:
1111 extra_columns = dict(insert_id=insert_id.id, validityStart=visit_time_dt)
1112 self._storeObjectsPandas(objs, ExtraTables.DiaObjectInsertId, extra_columns=extra_columns)
1113

◆ _storeDiaSources()

None lsst.dax.apdb.apdbCassandra.ApdbCassandra._storeDiaSources ( self,
ApdbTables table_name,
pandas.DataFrame sources,
astropy.time.Time visit_time,
ApdbInsertId | None insert_id )
protected
Store catalog of DIASources or DIAForcedSources from the current visit.

Parameters
----------
sources : `pandas.DataFrame`
    Catalog containing DiaSource records.
visit_time : `astropy.time.Time`
    Time of the current visit.

Definition at line 1114 of file apdbCassandra.py.

1120 ) -> None:
1121 """Store catalog of DIASources or DIAForcedSources from current visit.
1122
1123 Parameters
1124 ----------
1125 sources : `pandas.DataFrame`
1126 Catalog containing DiaSource records
1127 visit_time : `astropy.time.Time`
1128 Time of the current visit.
1129 """
1130 time_part: int | None = self._time_partition(visit_time)
1131 extra_columns: dict[str, Any] = {}
1132 if not self.config.time_partition_tables:
1133 extra_columns["apdb_time_part"] = time_part
1134 time_part = None
1135
1136 self._storeObjectsPandas(sources, table_name, extra_columns=extra_columns, time_part=time_part)
1137
1138 if insert_id is not None:
1139 extra_columns = dict(insert_id=insert_id.id)
1140 if table_name is ApdbTables.DiaSource:
1141 extra_table = ExtraTables.DiaSourceInsertId
1142 else:
1143 extra_table = ExtraTables.DiaForcedSourceInsertId
1144 self._storeObjectsPandas(sources, extra_table, extra_columns=extra_columns)
1145

◆ _storeDiaSourcesPartitions()

None lsst.dax.apdb.apdbCassandra.ApdbCassandra._storeDiaSourcesPartitions ( self,
pandas.DataFrame sources,
astropy.time.Time visit_time,
ApdbInsertId | None insert_id )
protected
Store mapping of diaSourceId to its partitioning values.

Parameters
----------
sources : `pandas.DataFrame`
    Catalog containing DiaSource records
visit_time : `astropy.time.Time`
    Time of the current visit.

Definition at line 1146 of file apdbCassandra.py.

1148 ) -> None:
1149 """Store mapping of diaSourceId to its partitioning values.
1150
1151 Parameters
1152 ----------
1153 sources : `pandas.DataFrame`
1154 Catalog containing DiaSource records
1155 visit_time : `astropy.time.Time`
1156 Time of the current visit.
1157 """
1158 id_map = cast(pandas.DataFrame, sources[["diaSourceId", "apdb_part"]])
1159 extra_columns = {
1160 "apdb_time_part": self._time_partition(visit_time),
1161 "insert_id": insert_id.id if insert_id is not None else None,
1162 }
1163
1164 self._storeObjectsPandas(
1165 id_map, ExtraTables.DiaSourceToPartition, extra_columns=extra_columns, time_part=None
1166 )
1167

◆ _storeInsertId()

None lsst.dax.apdb.apdbCassandra.ApdbCassandra._storeInsertId ( self,
ApdbInsertId insert_id,
astropy.time.Time visit_time )
protected

Definition at line 1057 of file apdbCassandra.py.

1057 def _storeInsertId(self, insert_id: ApdbInsertId, visit_time: astropy.time.Time) -> None:
1058 # Cassandra timestamp uses milliseconds since epoch
1059 timestamp = int(insert_id.insert_time.unix_tai * 1000)
1060
1061 # everything goes into a single partition
1062 partition = 0
1063
1064 table_name = self._schema.tableName(ExtraTables.DiaInsertId)
1065 query = (
1066 f'INSERT INTO "{self._keyspace}"."{table_name}" (partition, insert_id, insert_time) '
1067 "VALUES (?, ?, ?)"
1068 )
1069
1070 self._session.execute(
1071 self._preparer.prepare(query),
1072 (partition, insert_id.id, timestamp),
1073 timeout=self.config.write_timeout,
1074 execution_profile="write",
1075 )
1076

◆ _storeObjectsPandas()

None lsst.dax.apdb.apdbCassandra.ApdbCassandra._storeObjectsPandas ( self,
pandas.DataFrame records,
ApdbTables | ExtraTables table_name,
Mapping | None extra_columns = None,
int | None time_part = None )
protected
Store generic objects.

Takes a Pandas catalog and stores its records in a table.

Parameters
----------
records : `pandas.DataFrame`
    Catalog containing object records.
table_name : `ApdbTables`
    Name of the table as defined in APDB schema.
extra_columns : `dict`, optional
    Mapping (column_name, column_value) which gives fixed values for
    columns in each row, overriding values in ``records`` if matching
    columns exist there.
time_part : `int`, optional
    If not `None` then insert into a per-partition table.

Notes
-----
If the Pandas catalog contains additional columns not defined in the
table schema they are ignored. The catalog does not have to contain all
columns defined in a table, but partition and clustering keys must be
present in the catalog or in ``extra_columns``.
Definition at line 1168 of file apdbCassandra.py.

1174 ) -> None:
1175 """Store generic objects.
1176
1177 Takes Pandas catalog and stores a bunch of records in a table.
1178
1179 Parameters
1180 ----------
1181 records : `pandas.DataFrame`
1182 Catalog containing object records
1183 table_name : `ApdbTables`
1184 Name of the table as defined in APDB schema.
1185 extra_columns : `dict`, optional
1186 Mapping (column_name, column_value) which gives fixed values for
1187 columns in each row, overrides values in ``records`` if matching
1188 columns exist there.
1189 time_part : `int`, optional
1190 If not `None` then insert into a per-partition table.
1191
1192 Notes
1193 -----
1194 If Pandas catalog contains additional columns not defined in table
1195 schema they are ignored. Catalog does not have to contain all columns
1196 defined in a table, but partition and clustering keys must be present
1197 in a catalog or ``extra_columns``.
1198 """
1199 # use extra columns if specified
1200 if extra_columns is None:
1201 extra_columns = {}
1202 extra_fields = list(extra_columns.keys())
1203
1204 # Fields that will come from dataframe.
1205 df_fields = [column for column in records.columns if column not in extra_fields]
1206
1207 column_map = self._schema.getColumnMap(table_name)
1208 # list of columns (as in felis schema)
1209 fields = [column_map[field].name for field in df_fields if field in column_map]
1210 fields += extra_fields
1211
1212 # check that all partitioning and clustering columns are defined
1213 required_columns = self._schema.partitionColumns(table_name) + self._schema.clusteringColumns(
1214 table_name
1215 )
1216 missing_columns = [column for column in required_columns if column not in fields]
1217 if missing_columns:
1218 raise ValueError(f"Primary key columns are missing from catalog: {missing_columns}")
1219
1220 qfields = [quote_id(field) for field in fields]
1221 qfields_str = ",".join(qfields)
1222
1223 with Timer(table_name.name + " query build", self.config.timer):
1224 table = self._schema.tableName(table_name)
1225 if time_part is not None:
1226 table = f"{table}_{time_part}"
1227
1228 holders = ",".join(["?"] * len(qfields))
1229 query = f'INSERT INTO "{self._keyspace}"."{table}" ({qfields_str}) VALUES ({holders})'
1230 statement = self._preparer.prepare(query)
1231 queries = cassandra.query.BatchStatement()
1232 for rec in records.itertuples(index=False):
1233 values = []
1234 for field in df_fields:
1235 if field not in column_map:
1236 continue
1237 value = getattr(rec, field)
1238 if column_map[field].datatype is felis.types.Timestamp:
1239 if isinstance(value, pandas.Timestamp):
1240 value = literal(value.to_pydatetime())
1241 else:
1242 # Assume it's seconds since epoch, Cassandra
1243 # datetime is in milliseconds
1244 value = int(value * 1000)
1245 values.append(literal(value))
1246 for field in extra_fields:
1247 value = extra_columns[field]
1248 values.append(literal(value))
1249 queries.add(statement, values)
1250
1251 _LOG.debug("%s: will store %d records", self._schema.tableName(table_name), records.shape[0])
1252 with Timer(table_name.name + " insert", self.config.timer):
1253 self._session.execute(queries, timeout=self.config.write_timeout, execution_profile="write")
1254
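
A comment-only sketch of a typical call (hypothetical values; mirrors how `_storeDiaSources` uses this method above):

    # Fixed per-row values go through ``extra_columns`` and override any
    # matching DataFrame column:
    #   self._storeObjectsPandas(
    #       sources, ApdbTables.DiaSource,
    #       extra_columns={"apdb_time_part": 600}, time_part=None,
    #   )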

◆ _temporal_where()

tuple[list[str], list[tuple[str, tuple]]] lsst.dax.apdb.apdbCassandra.ApdbCassandra._temporal_where ( self,
ApdbTables table,
float | astropy.time.Time start_time,
float | astropy.time.Time end_time,
bool | None query_per_time_part = None )
protected
Generate table names and expressions for temporal part of WHERE
clauses.

Parameters
----------
table : `ApdbTables`
    Table to select from.
start_time : `astropy.time.Time` or `float`
    Starting Datetime or MJD value of the time range.
end_time : `astropy.time.Time` or `float`
    Ending Datetime or MJD value of the time range.
query_per_time_part : `bool`, optional
    If None then use ``query_per_time_part`` from configuration.

Returns
-------
tables : `list` [ `str` ]
    List of the table names to query.
expressions : `list` [ `tuple` ]
    A list of zero or more (expression, parameters) tuples.

Definition at line 1489 of file apdbCassandra.py.

1495 ) -> tuple[list[str], list[tuple[str, tuple]]]:
1496 """Generate table names and expressions for temporal part of WHERE
1497 clauses.
1498
1499 Parameters
1500 ----------
1501 table : `ApdbTables`
1502 Table to select from.
1503 start_time : `astropy.time.Time` or `float`
1504 Starting Datetime of MJD value of the time range.
1505 end_time : `astropy.time.Time` or `float`
1506 Starting Datetime of MJD value of the time range.
1507 query_per_time_part : `bool`, optional
1508 If None then use ``query_per_time_part`` from configuration.
1509
1510 Returns
1511 -------
1512 tables : `list` [ `str` ]
1513 List of the table names to query.
1514 expressions : `list` [ `tuple` ]
1515 A list of zero or more (expression, parameters) tuples.
1516 """
1517 tables: list[str]
1518 temporal_where: list[tuple[str, tuple]] = []
1519 table_name = self._schema.tableName(table)
1520 time_part_start = self._time_partition(start_time)
1521 time_part_end = self._time_partition(end_time)
1522 time_parts = list(range(time_part_start, time_part_end + 1))
1523 if self.config.time_partition_tables:
1524 tables = [f"{table_name}_{part}" for part in time_parts]
1525 else:
1526 tables = [table_name]
1527 if query_per_time_part is None:
1528 query_per_time_part = self.config.query_per_time_part
1529 if query_per_time_part:
1530 temporal_where = [('"apdb_time_part" = ?', (time_part,)) for time_part in time_parts]
1531 else:
1532 time_part_list = ",".join([str(part) for part in time_parts])
1533 temporal_where = [(f'"apdb_time_part" IN ({time_part_list})', ())]
1534
1535 return tables, temporal_where
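
A comment-only illustration of the two table layouts (hypothetical table names and partition numbers):

    # time_partition_tables=True: one table per partition, no temporal WHERE:
    #   tables = ["DiaSource_600", "DiaSource_601"], expressions = []
    # time_partition_tables=False: a single table, filtered by column:
    #   tables = ["DiaSource"], expressions = [('"apdb_time_part" IN (600,601)', ())]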

◆ _time_partition()

int lsst.dax.apdb.apdbCassandra.ApdbCassandra._time_partition ( self,
float | astropy.time.Time time )
protected
Calculate time partition number for a given time.

Parameters
----------
time : `float` or `astropy.time.Time`
    Time for which to calculate the partition number; a `float` value is
    interpreted as MJD.

Returns
-------
partition : `int`
    Partition number for a given time.

Definition at line 1362 of file apdbCassandra.py.

1362 def _time_partition(self, time: float | astropy.time.Time) -> int:
1363 """Calculate time partition number for a given time.
1364
1365 Parameters
1366 ----------
1367 time : `float` or `astropy.time.Time`
1368 Time for which to calculate partition number. Can be float to mean
1369 MJD or `astropy.time.Time`
1370
1371 Returns
1372 -------
1373 partition : `int`
1374 Partition number for a given time.
1375 """
1376 if isinstance(time, astropy.time.Time):
1377 mjd = float(time.mjd)
1378 else:
1379 mjd = time
1380 days_since_epoch = mjd - self._partition_zero_epoch_mjd
1381 partition = int(days_since_epoch) // self.config.time_partition_days
1382 return partition
1383

◆ _time_partition_cls()

int lsst.dax.apdb.apdbCassandra.ApdbCassandra._time_partition_cls ( cls,
float | astropy.time.Time time,
float epoch_mjd,
int part_days )
protected
Calculate time partition number for a given time.

Parameters
----------
time : `float` or `astropy.time.Time`
    Time for which to calculate the partition number; a `float` value is
    interpreted as MJD.
epoch_mjd : `float`
    Epoch time for partition 0.
part_days : `int`
    Number of days per partition.

Returns
-------
partition : `int`
    Partition number for a given time.

Definition at line 1336 of file apdbCassandra.py.

1336 def _time_partition_cls(cls, time: float | astropy.time.Time, epoch_mjd: float, part_days: int) -> int:
1337 """Calculate time partition number for a given time.
1338
1339 Parameters
1340 ----------
1341 time : `float` or `astropy.time.Time`
1342 Time for which to calculate partition number. Can be float to mean
1343 MJD or `astropy.time.Time`
1344 epoch_mjd : `float`
1345 Epoch time for partition 0.
1346 part_days : `int`
1347 Number of days per partition.
1348
1349 Returns
1350 -------
1351 partition : `int`
1352 Partition number for a given time.
1353 """
1354 if isinstance(time, astropy.time.Time):
1355 mjd = float(time.mjd)
1356 else:
1357 mjd = time
1358 days_since_epoch = mjd - epoch_mjd
1359 partition = int(days_since_epoch) // part_days
1360 return partition
1361
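
A worked arithmetic example (hypothetical numbers): with ``epoch_mjd = 58000.0`` and ``part_days = 30``, MJD 58100.5 falls into partition 3:

    # int(58100.5 - 58000.0) // 30  ->  100 // 30  ->  3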

◆ _versionCheck()

None lsst.dax.apdb.apdbCassandra.ApdbCassandra._versionCheck ( self,
ApdbMetadataCassandra metadata )
protected
Check schema version compatibility.

Definition at line 393 of file apdbCassandra.py.

393 def _versionCheck(self, metadata: ApdbMetadataCassandra) -> None:
394 """Check schema version compatibility."""
395
396 def _get_version(key: str, default: VersionTuple) -> VersionTuple:
397 """Retrieve version number from given metadata key."""
398 if metadata.table_exists():
399 version_str = metadata.get(key)
400 if version_str is None:
401 # Should not happen with existing metadata table.
402 raise RuntimeError(f"Version key {key!r} does not exist in metadata table.")
403 return VersionTuple.fromString(version_str)
404 return default
405
406 # For old databases where metadata table does not exist we assume that
407 # version of both code and schema is 0.1.0.
408 initial_version = VersionTuple(0, 1, 0)
409 db_schema_version = _get_version(self.metadataSchemaVersionKey, initial_version)
410 db_code_version = _get_version(self.metadataCodeVersionKey, initial_version)
411
412 # For now there is no way to make read-only APDB instances, assume that
413 # any access can do updates.
414 if not self._schema.schemaVersion().checkCompatibility(db_schema_version, True):
415 raise IncompatibleVersionError(
416 f"Configured schema version {self._schema.schemaVersion()} "
417 f"is not compatible with database version {db_schema_version}"
418 )
419 if not self.apdbImplementationVersion().checkCompatibility(db_code_version, True):
420 raise IncompatibleVersionError(
421 f"Current code version {self.apdbImplementationVersion()} "
422 f"is not compatible with database version {db_code_version}"
423 )
424

◆ apdbImplementationVersion()

VersionTuple lsst.dax.apdb.apdbCassandra.ApdbCassandra.apdbImplementationVersion ( cls)
Return version number for current APDB implementation.

Returns
-------
version : `VersionTuple`
    Version of the code defined in implementation class.

Reimplemented from lsst.dax.apdb.apdb.Apdb.

Definition at line 426 of file apdbCassandra.py.

426 def apdbImplementationVersion(cls) -> VersionTuple:
427 # Docstring inherited from base class.
428 return VERSION
429

◆ apdbSchemaVersion()

VersionTuple lsst.dax.apdb.apdbCassandra.ApdbCassandra.apdbSchemaVersion ( self)
Return schema version number as defined in config file.

Returns
-------
version : `VersionTuple`
    Version of the schema defined in schema config file.

Reimplemented from lsst.dax.apdb.apdb.Apdb.

Definition at line 430 of file apdbCassandra.py.

430 def apdbSchemaVersion(self) -> VersionTuple:
431 # Docstring inherited from base class.
432 return self._schema.schemaVersion()
433

◆ containsVisitDetector()

bool lsst.dax.apdb.apdbCassandra.ApdbCassandra.containsVisitDetector ( self,
int visit,
int detector )
Test whether data for a given visit-detector is present in the APDB.

Parameters
----------
visit, detector : `int`
    The ID of the visit-detector to search for.

Returns
-------
present : `bool`
    `True` if some DiaObject, DiaSource, or DiaForcedSource records
    exist for the specified observation, `False` otherwise.

Reimplemented from lsst.dax.apdb.apdb.Apdb.

Definition at line 681 of file apdbCassandra.py.

681 def containsVisitDetector(self, visit: int, detector: int) -> bool:
682 # docstring is inherited from a base class
683 raise NotImplementedError()
684

◆ countUnassociatedObjects()

int lsst.dax.apdb.apdbCassandra.ApdbCassandra.countUnassociatedObjects ( self)
Return the number of DiaObjects that have only one DiaSource
associated with them.

Used as part of ap_verify metrics.

Returns
-------
count : `int`
    Number of DiaObjects with exactly one associated DiaSource.

Notes
-----
This method can be very inefficient or slow in some implementations.

Reimplemented from lsst.dax.apdb.apdb.Apdb.

Definition at line 897 of file apdbCassandra.py.

897 def countUnassociatedObjects(self) -> int:
898 # docstring is inherited from a base class
899
900 # It's too inefficient to implement it for Cassandra in current schema.
901 raise NotImplementedError()
902

◆ dailyJob()

None lsst.dax.apdb.apdbCassandra.ApdbCassandra.dailyJob ( self)
Implement daily activities like cleanup/vacuum.

What should be done during daily activities is determined by the
specific implementation.

Reimplemented from lsst.dax.apdb.apdb.Apdb.

Definition at line 893 of file apdbCassandra.py.

893 def dailyJob(self) -> None:
894 # docstring is inherited from a base class
895 pass
896

◆ deleteInsertIds()

None lsst.dax.apdb.apdbCassandra.ApdbCassandra.deleteInsertIds ( self,
Iterable[ApdbInsertId] ids )
Remove insert identifiers from the database.

Parameters
----------
ids : `iterable` [`ApdbInsertId`]
    Insert identifiers, can include items returned from `getInsertIds`.

Notes
-----
This method causes Apdb to forget about the specified identifiers. Any
auxiliary data associated with the identifiers are also removed from the
database (but data in regular tables are not removed). This method should
be called after successful transfer of data from APDB to PPDB to free
space used by history.

Reimplemented from lsst.dax.apdb.apdb.Apdb.

Definition at line 709 of file apdbCassandra.py.

709 def deleteInsertIds(self, ids: Iterable[ApdbInsertId]) -> None:
710 # docstring is inherited from a base class
711 if not self._schema.has_insert_id:
712 raise ValueError("APDB is not configured for history storage")
713
714 all_insert_ids = [id.id for id in ids]
715 # There is 64k limit on number of markers in Cassandra CQL
716 for insert_ids in chunk_iterable(all_insert_ids, 20_000):
717 params = ",".join("?" * len(insert_ids))
718
719 # everything goes into a single partition
720 partition = 0
721
722 table_name = self._schema.tableName(ExtraTables.DiaInsertId)
723 query = (
724 f'DELETE FROM "{self._keyspace}"."{table_name}" '
725 f"WHERE partition = ? AND insert_id IN ({params})"
726 )
727
728 self._session.execute(
729 self._preparer.prepare(query),
730 [partition] + list(insert_ids),
731 timeout=self.config.remove_timeout,
732 )
733
734 # Also remove those insert_ids from Dia*InsertId tables.
735 for table in (
736 ExtraTables.DiaObjectInsertId,
737 ExtraTables.DiaSourceInsertId,
738 ExtraTables.DiaForcedSourceInsertId,
739 ):
740 table_name = self._schema.tableName(table)
741 query = f'DELETE FROM "{self._keyspace}"."{table_name}" WHERE insert_id IN ({params})'
742 self._session.execute(
743 self._preparer.prepare(query),
744 insert_ids,
745 timeout=self.config.remove_timeout,
746 )
747
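
A hedged end-to-end sketch of the intended history workflow, given an `ApdbCassandra` instance ``apdb`` (assumes ``use_insert_id`` was enabled; the PPDB transfer step is elided):

    insert_ids = apdb.getInsertIds()
    if insert_ids is not None:
        data = apdb.getDiaSourcesHistory(insert_ids)
        # ... transfer ``data`` to the PPDB here ...
        apdb.deleteInsertIds(insert_ids)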

◆ getDiaForcedSources()

pandas.DataFrame | None lsst.dax.apdb.apdbCassandra.ApdbCassandra.getDiaForcedSources ( self,
sphgeom.Region region,
Iterable[int] | None object_ids,
astropy.time.Time visit_time )
Return catalog of DiaForcedSource instances from a given region.

Parameters
----------
region : `lsst.sphgeom.Region`
    Region to search for DIASources.
object_ids : iterable [ `int` ], optional
    List of DiaObject IDs to further constrain the set of returned
    sources. If the list is empty then an empty catalog with the correct
    schema is returned. If `None` then returned sources are not
    constrained. Some implementations may not support the latter case.
visit_time : `astropy.time.Time`
    Time of the current visit.

Returns
-------
catalog : `pandas.DataFrame`, or `None`
    Catalog containing DiaSource records. `None` is returned if
    ``read_forced_sources_months`` configuration parameter is set to 0.

Raises
------
NotImplementedError
    May be raised by some implementations if ``object_ids`` is `None`.

Notes
-----
This method returns DiaForcedSource catalog for a region with
additional filtering based on DiaObject IDs. Only a subset of DiaSource
history is returned limited by ``read_forced_sources_months`` config
parameter, w.r.t. ``visit_time``. If ``object_ids`` is empty then an
empty catalog is always returned with the correct schema
(columns/types). If ``object_ids`` is `None` then no filtering is
performed and some of the returned records may be outside the specified
region.

Reimplemented from lsst.dax.apdb.apdb.Apdb.

Definition at line 669 of file apdbCassandra.py.

671 ) -> pandas.DataFrame | None:
672 # docstring is inherited from a base class
673 months = self.config.read_forced_sources_months
674 if months == 0:
675 return None
676 mjd_end = visit_time.mjd
677 mjd_start = mjd_end - months * 30
678
679 return self._getSources(region, object_ids, mjd_start, mjd_end, ApdbTables.DiaForcedSource)
680

◆ getDiaForcedSourcesHistory()

ApdbTableData lsst.dax.apdb.apdbCassandra.ApdbCassandra.getDiaForcedSourcesHistory ( self,
Iterable[ApdbInsertId] ids )
Return catalog of DiaForcedSource instances from a given time
period.

Parameters
----------
ids : `iterable` [`ApdbInsertId`]
    Insert identifiers, can include items returned from `getInsertIds`.

Returns
-------
data : `ApdbTableData`
    Catalog containing DiaForcedSource records. In addition to all
    regular columns it will contain ``insert_id`` column.

Notes
-----
This part of API may not be very stable and can change before the
implementation finalizes.

Reimplemented from lsst.dax.apdb.apdb.Apdb.

Definition at line 756 of file apdbCassandra.py.

756 def getDiaForcedSourcesHistory(self, ids: Iterable[ApdbInsertId]) -> ApdbTableData:
757 # docstring is inherited from a base class
758 return self._get_history(ExtraTables.DiaForcedSourceInsertId, ids)
759

◆ getDiaObjects()

pandas.DataFrame lsst.dax.apdb.apdbCassandra.ApdbCassandra.getDiaObjects ( self,
sphgeom.Region region )
Return catalog of DiaObject instances from a given region.

This method returns only the last version of each DiaObject. Some
records in the returned catalog may be outside the specified region; it
is up to the client to ignore those records or clean up the catalog
before further use.

Parameters
----------
region : `lsst.sphgeom.Region`
    Region to search for DIAObjects.

Returns
-------
catalog : `pandas.DataFrame`
    Catalog containing DiaObject records for a region that may be a
    superset of the specified region.

Reimplemented from lsst.dax.apdb.apdb.Apdb.

Definition at line 621 of file apdbCassandra.py.

621 def getDiaObjects(self, region: sphgeom.Region) -> pandas.DataFrame:
622 # docstring is inherited from a base class
623
624 sp_where = self._spatial_where(region)
625 _LOG.debug("getDiaObjects: #partitions: %s", len(sp_where))
626
627 # We need to exclude extra partitioning columns from result.
628 column_names = self._schema.apdbColumnNames(ApdbTables.DiaObjectLast)
629 what = ",".join(quote_id(column) for column in column_names)
630
631 table_name = self._schema.tableName(ApdbTables.DiaObjectLast)
632 query = f'SELECT {what} from "{self._keyspace}"."{table_name}"'
633 statements: list[tuple] = []
634 for where, params in sp_where:
635 full_query = f"{query} WHERE {where}"
636 if params:
637 statement = self._preparer.prepare(full_query)
638 else:
639 # If there are no params then it is likely that query has a
640 # bunch of literals rendered already, no point trying to
641 # prepare it because it's not reusable.
642 statement = cassandra.query.SimpleStatement(full_query)
643 statements.append((statement, params))
644 _LOG.debug("getDiaObjects: #queries: %s", len(statements))
645
646 with Timer("DiaObject select", self.config.timer):
647 objects = cast(
648 pandas.DataFrame,
649 select_concurrent(
650 self._session, statements, "read_pandas_multi", self.config.read_concurrency
651 ),
652 )
653
654 _LOG.debug("found %s DiaObjects", objects.shape[0])
655 return objects
656

◆ getDiaObjectsHistory()

ApdbTableData lsst.dax.apdb.apdbCassandra.ApdbCassandra.getDiaObjectsHistory ( self,
Iterable[ApdbInsertId] ids )
Return catalog of DiaObject instances from a given time period
including the history of each DiaObject.

Parameters
----------
ids : `iterable` [`ApdbInsertId`]
    Insert identifiers, can include items returned from `getInsertIds`.

Returns
-------
data : `ApdbTableData`
    Catalog containing DiaObject records. In addition to all regular
    columns it will contain ``insert_id`` column.

Notes
-----
This part of API may not be very stable and can change before the
implementation finalizes.

Reimplemented from lsst.dax.apdb.apdb.Apdb.

Definition at line 748 of file apdbCassandra.py.

748 def getDiaObjectsHistory(self, ids: Iterable[ApdbInsertId]) -> ApdbTableData:
749 # docstring is inherited from a base class
750 return self._get_history(ExtraTables.DiaObjectInsertId, ids)
751

◆ getDiaSources()

pandas.DataFrame | None lsst.dax.apdb.apdbCassandra.ApdbCassandra.getDiaSources ( self,
sphgeom.Region region,
Iterable[int] | None object_ids,
astropy.time.Time visit_time )
Return catalog of DiaSource instances from a given region.

Parameters
----------
region : `lsst.sphgeom.Region`
    Region to search for DIASources.
object_ids : `iterable` [`int`], optional
    List of DiaObject IDs to further constrain the set of returned
    sources. If `None` then the returned sources are not constrained.
    If the list is empty then an empty catalog with the correct schema
    is returned.
visit_time : `astropy.time.Time`
    Time of the current visit.

Returns
-------
catalog : `pandas.DataFrame`, or `None`
    Catalog containing DiaSource records. `None` is returned if the
    ``read_sources_months`` configuration parameter is set to 0.

Notes
-----
This method returns the DiaSource catalog for a region, with additional
filtering based on DiaObject IDs. Only a subset of the DiaSource history
is returned, limited by the ``read_sources_months`` config parameter
w.r.t. ``visit_time``. If ``object_ids`` is empty then an empty catalog
with the correct schema (columns/types) is always returned. If
``object_ids`` is `None` then no filtering is performed and some of the
returned records may be outside the specified region.

Reimplemented from lsst.dax.apdb.apdb.Apdb.

Definition at line 657 of file apdbCassandra.py.

657 def getDiaSources(
658     self, region: sphgeom.Region, object_ids: Iterable[int] | None, visit_time: astropy.time.Time
659 ) -> pandas.DataFrame | None:
660 # docstring is inherited from a base class
661 months = self.config.read_sources_months
662 if months == 0:
663 return None
664 mjd_end = visit_time.mjd
665 mjd_start = mjd_end - months * 30
666
667 return self._getSources(region, object_ids, mjd_start, mjd_end, ApdbTables.DiaSource)
668
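
As the code above shows, the read window spans ``read_sources_months`` × 30 days, ending at ``visit_time``. A short sketch, reusing the hypothetical ``region`` and ``objects`` from the `getDiaObjects` example:

import astropy.time

# Time of the visit being processed; TAI per APDB conventions.
visit_time = astropy.time.Time("2025-03-01T06:00:00", format="isot", scale="tai")
object_ids = list(objects["diaObjectId"])
sources = apdb.getDiaSources(region, object_ids, visit_time)
# ``sources`` is None only when read_sources_months is configured to 0.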

◆ getDiaSourcesHistory()

ApdbTableData lsst.dax.apdb.apdbCassandra.ApdbCassandra.getDiaSourcesHistory ( self,
Iterable[ApdbInsertId] ids )
Return catalog of DiaSource instances from a given time period.

Parameters
----------
ids : `iterable` [`ApdbInsertId`]
    Insert identifiers; these can include items returned from
    `getInsertIds`.

Returns
-------
data : `ApdbTableData`
    Catalog containing DiaSource records. In addition to all regular
    columns it will contain an ``insert_id`` column.

Notes
-----
This part of the API may not be stable and can change before the
implementation is finalized.

Reimplemented from lsst.dax.apdb.apdb.Apdb.

Definition at line 752 of file apdbCassandra.py.

752 def getDiaSourcesHistory(self, ids: Iterable[ApdbInsertId]) -> ApdbTableData:
753 # docstring is inherited from a base class
754 return self._get_history(ExtraTables.DiaSourceInsertId, ids)
755

◆ getInsertIds()

list[ApdbInsertId] | None lsst.dax.apdb.apdbCassandra.ApdbCassandra.getInsertIds ( self)
Return collection of insert identifiers known to the database.

Returns
-------
ids : `list` [`ApdbInsertId`] or `None`
    List of identifiers; they may be time-ordered if the database
    supports ordering. `None` is returned if the database is not
    configured to store insert identifiers.

Reimplemented from lsst.dax.apdb.apdb.Apdb.

Definition at line 685 of file apdbCassandra.py.

685 def getInsertIds(self) -> list[ApdbInsertId] | None:
686 # docstring is inherited from a base class
687 if not self._schema.has_insert_id:
688 return None
689
690 # everything goes into a single partition
691 partition = 0
692
693 table_name = self._schema.tableName(ExtraTables.DiaInsertId)
694 query = f'SELECT insert_time, insert_id FROM "{self._keyspace}"."{table_name}" WHERE partition = ?'
695
696 result = self._session.execute(
697 self._preparer.prepare(query),
698 (partition,),
699 timeout=self.config.read_timeout,
700 execution_profile="read_tuples",
701 )
702 # order by insert_time
703 rows = sorted(result)
704 return [
705 ApdbInsertId(id=row[1], insert_time=astropy.time.Time(row[0].timestamp(), format="unix_tai"))
706 for row in rows
707 ]
708
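
A sketch of a replication-style workflow built on this method, assuming the database was created with ``use_insert_id=True`` (otherwise `None` is returned):

# Fetch identifiers, read the corresponding history, then retire the
# identifiers that were processed.
ids = apdb.getInsertIds()
if ids:
    batch = ids[:100]  # this implementation returns them ordered by insert_time
    objects_history = apdb.getDiaObjectsHistory(batch)
    sources_history = apdb.getDiaSourcesHistory(batch)
    # ... transfer the history tables to their destination ...
    apdb.deleteInsertIds(batch)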

◆ getSSObjects()

pandas.DataFrame lsst.dax.apdb.apdbCassandra.ApdbCassandra.getSSObjects ( self)
Return catalog of SSObject instances.

Returns
-------
catalog : `pandas.DataFrame`
    Catalog containing SSObject records; all existing records are
    returned.

Reimplemented from lsst.dax.apdb.apdb.Apdb.

Definition at line 760 of file apdbCassandra.py.

760 def getSSObjects(self) -> pandas.DataFrame:
761 # docstring is inherited from a base class
762 tableName = self._schema.tableName(ApdbTables.SSObject)
763 query = f'SELECT * from "{self._keyspace}"."{tableName}"'
764
765 objects = None
766 with Timer("SSObject select", self.config.timer):
767 result = self._session.execute(query, execution_profile="read_pandas")
768 objects = result._current_rows
769
770 _LOG.debug("found %s SSObjects", objects.shape[0])
771 return objects
772

◆ init_database()

ApdbCassandraConfig lsst.dax.apdb.apdbCassandra.ApdbCassandra.init_database ( cls,
list[str] hosts,
str keyspace,
*str | None schema_file = None,
str | None schema_name = None,
int | None read_sources_months = None,
int | None read_forced_sources_months = None,
bool use_insert_id = False,
bool use_insert_id_skips_diaobjects = False,
int | None port = None,
str | None username = None,
str | None prefix = None,
str | None part_pixelization = None,
int | None part_pix_level = None,
bool time_partition_tables = True,
str | None time_partition_start = None,
str | None time_partition_end = None,
str | None read_consistency = None,
str | None write_consistency = None,
int | None read_timeout = None,
int | None write_timeout = None,
list[str] | None ra_dec_columns = None,
int | None replication_factor = None,
bool drop = False )
Initialize new APDB instance and make configuration object for it.

Parameters
----------
hosts : `list` [`str`]
    List of host names or IP addresses for Cassandra cluster.
keyspace : `str`
    Name of the keyspace for APDB tables.
schema_file : `str`, optional
    Location of (YAML) configuration file with APDB schema. If not
    specified then default location will be used.
schema_name : `str`, optional
    Name of the schema in YAML configuration file. If not specified
    then default name will be used.
read_sources_months : `int`, optional
    Number of months of history to read from DiaSource.
read_forced_sources_months : `int`, optional
    Number of months of history to read from DiaForcedSource.
use_insert_id : `bool`, optional
    If `True`, make additional tables used for replication to PPDB.
use_insert_id_skips_diaobjects : `bool`, optional
    If `True` then do not fill regular ``DiaObject`` table when
    ``use_insert_id`` is `True`.
port : `int`, optional
    Port number to use for Cassandra connections.
username : `str`, optional
    User name for Cassandra connections.
prefix : `str`, optional
    Optional prefix for all table names.
part_pixelization : `str`, optional
    Name of the MOC pixelization used for partitioning.
part_pix_level : `int`, optional
    Pixelization level.
time_partition_tables : `bool`, optional
    Create per-partition tables.
time_partition_start : `str`, optional
    Starting time for per-partition tables, in yyyy-mm-ddThh:mm:ss
    format, in TAI.
time_partition_end : `str`, optional
    Ending time for per-partition tables, in yyyy-mm-ddThh:mm:ss
    format, in TAI.
read_consistency : `str`, optional
    Name of the consistency level for read operations.
write_consistency : `str`, optional
    Name of the consistency level for write operations.
read_timeout : `int`, optional
    Read timeout in seconds.
write_timeout : `int`, optional
    Write timeout in seconds.
ra_dec_columns : `list` [`str`], optional
    Names of ra/dec columns in DiaObject table.
replication_factor : `int`, optional
    Replication factor used when creating a new keyspace; if the
    keyspace already exists its replication factor is not changed.
drop : `bool`, optional
    If `True` then drop existing tables before re-creating the schema.

Returns
-------
config : `ApdbCassandraConfig`
    Resulting configuration object for a created APDB instance.

Definition at line 439 of file apdbCassandra.py.

439 def init_database(
440     cls,
441     hosts: list[str],
442     keyspace: str,
443     *,
444     schema_file: str | None = None,
445     schema_name: str | None = None,
446     read_sources_months: int | None = None,
447     read_forced_sources_months: int | None = None,
448     use_insert_id: bool = False,
449     use_insert_id_skips_diaobjects: bool = False,
450     port: int | None = None,
451     username: str | None = None,
452     prefix: str | None = None,
453     part_pixelization: str | None = None,
454     part_pix_level: int | None = None,
455     time_partition_tables: bool = True,
456     time_partition_start: str | None = None,
457     time_partition_end: str | None = None,
458     read_consistency: str | None = None,
459     write_consistency: str | None = None,
460     read_timeout: int | None = None,
461     write_timeout: int | None = None,
462     ra_dec_columns: list[str] | None = None,
463     replication_factor: int | None = None,
464     drop: bool = False,
465 ) -> ApdbCassandraConfig:
466 """Initialize new APDB instance and make configuration object for it.
467
468 Parameters
469 ----------
470 hosts : `list` [`str`]
471 List of host names or IP addresses for Cassandra cluster.
472 keyspace : `str`
473 Name of the keyspace for APDB tables.
474 schema_file : `str`, optional
475 Location of (YAML) configuration file with APDB schema. If not
476 specified then default location will be used.
477 schema_name : `str`, optional
478 Name of the schema in YAML configuration file. If not specified
479 then default name will be used.
480 read_sources_months : `int`, optional
481 Number of months of history to read from DiaSource.
482 read_forced_sources_months : `int`, optional
483 Number of months of history to read from DiaForcedSource.
484 use_insert_id : `bool`, optional
485 If `True`, make additional tables used for replication to PPDB.
486 use_insert_id_skips_diaobjects : `bool`, optional
487 If `True` then do not fill regular ``DiaObject`` table when
488 ``use_insert_id`` is `True`.
489 port : `int`, optional
490 Port number to use for Cassandra connections.
491 username : `str`, optional
492 User name for Cassandra connections.
493 prefix : `str`, optional
494 Optional prefix for all table names.
495 part_pixelization : `str`, optional
496 Name of the MOC pixelization used for partitioning.
497 part_pix_level : `int`, optional
498 Pixelization level.
499 time_partition_tables : `bool`, optional
500 Create per-partition tables.
501 time_partition_start : `str`, optional
502 Starting time for per-partition tables, in yyyy-mm-ddThh:mm:ss
503 format, in TAI.
504 time_partition_end : `str`, optional
505 Ending time for per-partition tables, in yyyy-mm-ddThh:mm:ss
506 format, in TAI.
507 read_consistency : `str`, optional
508 Name of the consistency level for read operations.
509 write_consistency : `str`, optional
510 Name of the consistency level for write operations.
511 read_timeout : `int`, optional
512 Read timeout in seconds.
513 write_timeout : `int`, optional
514 Write timeout in seconds.
515 ra_dec_columns : `list` [`str`], optional
516 Names of ra/dec columns in DiaObject table.
517 replication_factor : `int`, optional
518 Replication factor used when creating a new keyspace; if the
519 keyspace already exists its replication factor is not changed.
520 drop : `bool`, optional
521 If `True` then drop existing tables before re-creating the schema.
522
523 Returns
524 -------
525 config : `ApdbCassandraConfig`
526 Resulting configuration object for a created APDB instance.
527 """
528 config = ApdbCassandraConfig(
529 contact_points=hosts,
530 keyspace=keyspace,
531 use_insert_id=use_insert_id,
532 use_insert_id_skips_diaobjects=use_insert_id_skips_diaobjects,
533 time_partition_tables=time_partition_tables,
534 )
535 if schema_file is not None:
536 config.schema_file = schema_file
537 if schema_name is not None:
538 config.schema_name = schema_name
539 if read_sources_months is not None:
540 config.read_sources_months = read_sources_months
541 if read_forced_sources_months is not None:
542 config.read_forced_sources_months = read_forced_sources_months
543 if port is not None:
544 config.port = port
545 if username is not None:
546 config.username = username
547 if prefix is not None:
548 config.prefix = prefix
549 if part_pixelization is not None:
550 config.part_pixelization = part_pixelization
551 if part_pix_level is not None:
552 config.part_pix_level = part_pix_level
553 if time_partition_start is not None:
554 config.time_partition_start = time_partition_start
555 if time_partition_end is not None:
556 config.time_partition_end = time_partition_end
557 if read_consistency is not None:
558 config.read_consistency = read_consistency
559 if write_consistency is not None:
560 config.write_consistency = write_consistency
561 if read_timeout is not None:
562 config.read_timeout = read_timeout
563 if write_timeout is not None:
564 config.write_timeout = write_timeout
565 if ra_dec_columns is not None:
566 config.ra_dec_columns = ra_dec_columns
567
568 cls._makeSchema(config, drop=drop, replication_factor=replication_factor)
569
570 return config
571
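
A hypothetical invocation (host names and keyspace are placeholders); only ``hosts`` and ``keyspace`` are required, and every other keyword falls back to a configuration default:

from lsst.dax.apdb.apdbCassandra import ApdbCassandra

config = ApdbCassandra.init_database(
    hosts=["cassandra-1.example.org", "cassandra-2.example.org"],
    keyspace="apdb_test",
    use_insert_id=True,
    replication_factor=3,
)
# The returned pex_config object can be persisted for later use and is
# sufficient to construct an instance against the new schema.
config.save("apdb-cassandra-config.py")
apdb = ApdbCassandra(config)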

◆ metadata()

ApdbMetadata lsst.dax.apdb.apdbCassandra.ApdbCassandra.metadata ( self)
Object controlling access to APDB metadata (`ApdbMetadata`).

Reimplemented from lsst.dax.apdb.apdb.Apdb.

Definition at line 904 of file apdbCassandra.py.

904 def metadata(self) -> ApdbMetadata:
905 # docstring is inherited from a base class
906 if self._metadata is None:
907 raise RuntimeError("Database schema was not initialized.")
908 return self._metadata
909
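
A brief sketch, assuming the generic `ApdbMetadata` string key/value interface:

# ``metadata`` is exposed as a property on the Apdb interface.
meta = apdb.metadata
# The class-level key constants identify well-known entries.
schema_version = meta.get(ApdbCassandra.metadataSchemaVersionKey)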

◆ reassignDiaSources()

None lsst.dax.apdb.apdbCassandra.ApdbCassandra.reassignDiaSources ( self,
Mapping[int, int] idMap )
Associate DiaSources with SSObjects, disassociating them from
DiaObjects.

Parameters
----------
idMap : `Mapping`
    Maps DiaSource IDs to their new SSObject IDs.

Raises
------
ValueError
    Raised if a DiaSource ID does not exist in the database.

Reimplemented from lsst.dax.apdb.apdb.Apdb.

Definition at line 805 of file apdbCassandra.py.

805 def reassignDiaSources(self, idMap: Mapping[int, int]) -> None:
806 # docstring is inherited from a base class
807
808 # To update a record we need to know its exact primary key (including
809 # partition key) so we start by querying for diaSourceId to find the
810 # primary keys.
811
812 table_name = self._schema.tableName(ExtraTables.DiaSourceToPartition)
813 # split it into 1k IDs per query
814 selects: list[tuple] = []
815 for ids in chunk_iterable(idMap.keys(), 1_000):
816 ids_str = ",".join(str(item) for item in ids)
817 selects.append(
818 (
819 (
820 'SELECT "diaSourceId", "apdb_part", "apdb_time_part", "insert_id" '
821 f'FROM "{self._keyspace}"."{table_name}" WHERE "diaSourceId" IN ({ids_str})'
822 ),
823 {},
824 )
825 )
826
827 # No need for DataFrame here, read data as tuples.
828 result = cast(
829 list[tuple[int, int, int, uuid.UUID | None]],
830 select_concurrent(self._session, selects, "read_tuples", self.config.read_concurrency),
831 )
832
833 # Make mapping from source ID to its partition.
834 id2partitions: dict[int, tuple[int, int]] = {}
835 id2insert_id: dict[int, uuid.UUID] = {}
836 for row in result:
837 id2partitions[row[0]] = row[1:3]
838 if row[3] is not None:
839 id2insert_id[row[0]] = row[3]
840
841 # make sure we know partitions for each ID
842 if set(id2partitions) != set(idMap):
843 missing = ",".join(str(item) for item in set(idMap) - set(id2partitions))
844 raise ValueError(f"Following DiaSource IDs do not exist in the database: {missing}")
845
846 # Reassign in standard tables
847 queries = cassandra.query.BatchStatement()
848 table_name = self._schema.tableName(ApdbTables.DiaSource)
849 for diaSourceId, ssObjectId in idMap.items():
850 apdb_part, apdb_time_part = id2partitions[diaSourceId]
851 values: tuple
852 if self.config.time_partition_tables:
853 query = (
854 f'UPDATE "{self._keyspace}"."{table_name}_{apdb_time_part}"'
855 ' SET "ssObjectId" = ?, "diaObjectId" = NULL'
856 ' WHERE "apdb_part" = ? AND "diaSourceId" = ?'
857 )
858 values = (ssObjectId, apdb_part, diaSourceId)
859 else:
860 query = (
861 f'UPDATE "{self._keyspace}"."{table_name}"'
862 ' SET "ssObjectId" = ?, "diaObjectId" = NULL'
863 ' WHERE "apdb_part" = ? AND "apdb_time_part" = ? AND "diaSourceId" = ?'
864 )
865 values = (ssObjectId, apdb_part, apdb_time_part, diaSourceId)
866 queries.add(self._preparer.prepare(query), values)
867
868 # Reassign in history tables, only if history is enabled
869 if id2insert_id:
870 # Filter out insert ids that have been deleted already. There is a
871 # potential race with concurrent removal of insert IDs, but it
872 # should be handled by WHERE in UPDATE.
873 known_ids = set()
874 if insert_ids := self.getInsertIds():
875 known_ids = set(insert_id.id for insert_id in insert_ids)
876 id2insert_id = {key: value for key, value in id2insert_id.items() if value in known_ids}
877 if id2insert_id:
878 table_name = self._schema.tableName(ExtraTables.DiaSourceInsertId)
879 for diaSourceId, ssObjectId in idMap.items():
880 if insert_id := id2insert_id.get(diaSourceId):
881 query = (
882 f'UPDATE "{self._keyspace}"."{table_name}" '
883 ' SET "ssObjectId" = ?, "diaObjectId" = NULL '
884 'WHERE "insert_id" = ? AND "diaSourceId" = ?'
885 )
886 values = (ssObjectId, insert_id, diaSourceId)
887 queries.add(self._preparer.prepare(query), values)
888
889 _LOG.debug("%s: will update %d records", table_name, len(idMap))
890 with Timer(table_name + " update", self.config.timer):
891 self._session.execute(queries, execution_profile="write")
892
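
A minimal, hypothetical call; the mapping keys are existing DiaSource IDs and the values are their new SSObject IDs:

# Reassign DiaSource 12345 to SSObject 67890; unknown DiaSource IDs
# raise ValueError as documented above.
try:
    apdb.reassignDiaSources({12345: 67890})
except ValueError as exc:
    print(f"reassignment failed: {exc}")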

◆ store()

None lsst.dax.apdb.apdbCassandra.ApdbCassandra.store ( self,
astropy.time.Time visit_time,
pandas.DataFrame objects,
pandas.DataFrame | None sources = None,
pandas.DataFrame | None forced_sources = None )
Store all three types of catalogs in the database.

Parameters
----------
visit_time : `astropy.time.Time`
    Time of the visit.
objects : `pandas.DataFrame`
    Catalog with DiaObject records.
sources : `pandas.DataFrame`, optional
    Catalog with DiaSource records.
forced_sources : `pandas.DataFrame`, optional
    Catalog with DiaForcedSource records.

Notes
-----
This method takes DataFrame catalogs; their schemas must be
compatible with the schema of the corresponding APDB table:

  - column names must correspond to database table columns
  - types and units of the columns must match database definitions,
    no unit conversion is performed presently
  - columns that have default values in database schema can be
    omitted from catalog
  - this method knows how to fill interval-related columns of DiaObject
    (validityStart, validityEnd); they do not need to appear in the
    catalog
  - source catalogs have ``diaObjectId`` column associating sources
    with objects

Reimplemented from lsst.dax.apdb.apdb.Apdb.

Definition at line 773 of file apdbCassandra.py.

773 def store(
774     self,
775     visit_time: astropy.time.Time,
776     objects: pandas.DataFrame,
777     sources: pandas.DataFrame | None = None,
778     forced_sources: pandas.DataFrame | None = None,
779 ) -> None:
780 # docstring is inherited from a base class
781
782 insert_id: ApdbInsertId | None = None
783 if self._schema.has_insert_id:
784 insert_id = ApdbInsertId.new_insert_id(visit_time)
785 self._storeInsertId(insert_id, visit_time)
786
787 # fill region partition column for DiaObjects
788 objects = self._add_obj_part(objects)
789 self._storeDiaObjects(objects, visit_time, insert_id)
790
791 if sources is not None:
792 # copy apdb_part column from DiaObjects to DiaSources
793 sources = self._add_src_part(sources, objects)
794 self._storeDiaSources(ApdbTables.DiaSource, sources, visit_time, insert_id)
795 self._storeDiaSourcesPartitions(sources, visit_time, insert_id)
796
797 if forced_sources is not None:
798 forced_sources = self._add_fsrc_part(forced_sources, objects)
799 self._storeDiaSources(ApdbTables.DiaForcedSource, forced_sources, visit_time, insert_id)
800
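
A minimal sketch of a store call. The column lists below are illustrative only; a real catalog must carry every non-defaulted column of the APDB schema:

import astropy.time
import pandas

visit_time = astropy.time.Time.now()
objects = pandas.DataFrame(
    {
        "diaObjectId": [1, 2],
        "ra": [35.00, 35.01],
        "dec": [-4.50, -4.51],
        # ... remaining non-defaulted DiaObject columns ...
    }
)
# ``sources`` and ``forced_sources`` are optional and omitted here.
apdb.store(visit_time, objects)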

◆ storeSSObjects()

None lsst.dax.apdb.apdbCassandra.ApdbCassandra.storeSSObjects ( self,
pandas.DataFrame objects )
Store or update SSObject catalog.

Parameters
----------
objects : `pandas.DataFrame`
    Catalog with SSObject records.

Notes
-----
If SSObjects with matching IDs already exist in the database, their
records will be updated with the information from the provided records.

Reimplemented from lsst.dax.apdb.apdb.Apdb.

Definition at line 801 of file apdbCassandra.py.

801 def storeSSObjects(self, objects: pandas.DataFrame) -> None:
802 # docstring is inherited from a base class
803 self._storeObjectsPandas(objects, ApdbTables.SSObject)
804
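
A sketch of the upsert behaviour; the column set is schema-dependent and abbreviated here:

import pandas

ss_objects = pandas.DataFrame(
    {
        "ssObjectId": [67890],
        # ... remaining SSObject columns ...
    }
)
# Existing rows with the same ssObjectId are updated, new ones inserted.
apdb.storeSSObjects(ss_objects)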

◆ tableDef()

Table | None lsst.dax.apdb.apdbCassandra.ApdbCassandra.tableDef ( self,
ApdbTables table )
Return table schema definition for a given table.

Parameters
----------
table : `ApdbTables`
    One of the known APDB tables.

Returns
-------
tableSchema : `felis.simple.Table` or `None`
    Table schema description; `None` is returned if the table is not
    defined by this implementation.

Reimplemented from lsst.dax.apdb.apdb.Apdb.

Definition at line 434 of file apdbCassandra.py.

434 def tableDef(self, table: ApdbTables) -> Table | None:
435 # docstring is inherited from a base class
436 return self._schema.tableSchemas.get(table)
437
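
For example, assuming the `felis.simple.Table` dataclass exposes its columns as a list:

from lsst.dax.apdb import ApdbTables

table = apdb.tableDef(ApdbTables.DiaObject)
if table is not None:
    # Print the column names defined for DiaObject by this schema.
    print([column.name for column in table.columns])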

Member Data Documentation

◆ _cluster

lsst.dax.apdb.apdbCassandra.ApdbCassandra._cluster
protected

Definition at line 291 of file apdbCassandra.py.

◆ _frozen_parameters

tuple lsst.dax.apdb.apdbCassandra.ApdbCassandra._frozen_parameters
staticprotected
Initial value:
= (
"use_insert_id",
"part_pixelization",
"part_pix_level",
"ra_dec_columns",
"time_partition_tables",
"time_partition_days",
"use_insert_id_skips_diaobjects",
)

Definition at line 271 of file apdbCassandra.py.

◆ _keyspace

lsst.dax.apdb.apdbCassandra.ApdbCassandra._keyspace
protected

Definition at line 289 of file apdbCassandra.py.

◆ _metadata

lsst.dax.apdb.apdbCassandra.ApdbCassandra._metadata
protected

Definition at line 294 of file apdbCassandra.py.

◆ _partition_zero_epoch_mjd

lsst.dax.apdb.apdbCassandra.ApdbCassandra._partition_zero_epoch_mjd
protected

Definition at line 323 of file apdbCassandra.py.

◆ _pixelization

lsst.dax.apdb.apdbCassandra.ApdbCassandra._pixelization
protected

Definition at line 308 of file apdbCassandra.py.

◆ _preparer

lsst.dax.apdb.apdbCassandra.ApdbCassandra._preparer
protected

Definition at line 329 of file apdbCassandra.py.

◆ _schema

lsst.dax.apdb.apdbCassandra.ApdbCassandra._schema
protected

Definition at line 314 of file apdbCassandra.py.

◆ _session

lsst.dax.apdb.apdbCassandra.ApdbCassandra._session
protected

Definition at line 291 of file apdbCassandra.py.

◆ config

lsst.dax.apdb.apdbCassandra.ApdbCassandra.config

Definition at line 303 of file apdbCassandra.py.

◆ metadataCodeVersionKey [1/2]

str lsst.dax.apdb.apdbCassandra.ApdbCassandra.metadataCodeVersionKey = "version:ApdbCassandra"
static

Definition at line 265 of file apdbCassandra.py.

◆ metadataCodeVersionKey [2/2]

lsst.dax.apdb.apdbCassandra.ApdbCassandra.metadataCodeVersionKey

Definition at line 613 of file apdbCassandra.py.

◆ metadataConfigKey [1/2]

str lsst.dax.apdb.apdbCassandra.ApdbCassandra.metadataConfigKey = "config:apdb-cassandra.json"
static

Definition at line 268 of file apdbCassandra.py.

◆ metadataConfigKey [2/2]

lsst.dax.apdb.apdbCassandra.ApdbCassandra.metadataConfigKey

Definition at line 617 of file apdbCassandra.py.

◆ metadataSchemaVersionKey [1/2]

str lsst.dax.apdb.apdbCassandra.ApdbCassandra.metadataSchemaVersionKey = "version:schema"
static

Definition at line 262 of file apdbCassandra.py.

◆ metadataSchemaVersionKey [2/2]

lsst.dax.apdb.apdbCassandra.ApdbCassandra.metadataSchemaVersionKey

Definition at line 612 of file apdbCassandra.py.

◆ partition_zero_epoch

lsst.dax.apdb.apdbCassandra.ApdbCassandra.partition_zero_epoch = astropy.time.Time(0, format="unix_tai")
static

Definition at line 282 of file apdbCassandra.py.


The documentation for this class was generated from the following file: