LSSTApplications  18.1.0
LSSTDataManagementBasePackage
lsst::log; the LSST Logging Framework
Note
See also the Developer Guide for guidance on how to use the logging framework in practice.

Features

Basic C++ examples

The varargs/printf style interface:

subject = "important stuff";
LOG("myLogger", LOG_LVL_INFO, "Here is some information about %s.", subject);

The iostream style interface:

subject = "important stuff";
LOGS("myLogger", LOG_LVL_INFO, "Here is some information about " << subject << ".");

Using the default logger:

LOG_DEBUG("My debugging statement.")

A logger object may be retrieved and used to avoid the cost of excessive lookups:

static LOG_LOGGER logger = LOG_GET("myLogger");
LOG(logger, LOG_LVL_WARN, "Here is a warning sent using a logging object.");

Basic Python examples

 subject = "important stuff"
 lsst.log.log("myLogger", lsst.log.INFO, "Here is some information about %s.", subject)

 lsst.log.warnf("Possible problem {} using {name}", errcode, name=program)
 lsst.log.debug("My debugging statement.")

The standard Python logging module may also be used (and it is recommended for applications mixing Python logging and C++ logging):

lgr = logging.getLogger()
lgr.setLevel(logging.DEBUG)
lgr.addHandler(lsst.log.LogHandler())
lgr.info("This is an info statement via the logging module.")

Note that in this case messages from Python logging are forwarded to log4cxx, and they are filtered by both Python logging and log4cxx. It is easiest to avoid filtering on Python side by setting logging level to DEBUG for root logger and instead configure log4cxx with proper logging level.

Alternatively, get a logger through lsst.log:

lgr = lsst.log.getLogger("myLogger")
lgr.setLevel(lgr.DEBUG)
lgr.info("This is an info statement via %s", "lsst.log interface.")

Example output

 20140313 00:30:06,226 0x7f1f06bf6700 INFO  Here is some information about important stuff.
 20140313 00:30:06,226 0x7f1f06bf6700 DEBUG My debugging statement.



 2014-03-05 17:04:32,380 [0x7feca357e700] INFO  myLogger qserv::master::AsyncQueryManager::add (bld/control/AsyncQueryManager.cc:149) - Here is some information about important stuff.

The first two examples above use the following formatting string to display the date, time, thread id, log level, and log message:

"%d{yyyyMMdd HH:mm:ss,SSS} %t %-5p %m%n"

The last example above uses the following formatting string to display the date, time, thread id, log level, logger name, function, source file, source line number, and log message:

"%d [%t] %-5p %c{2} %M (%F:%L) - %m%n"

Logging Functions

In C++, the following varargs/printf style logging macros are available:

Alternative set of macros allows one to use iostream-based formatting. In the macros below expression is any C++ expression which can appear on the right side of the stream insertion operator, e.g. LOGS_DEBUG("coordinates: x=" << x << " y=" << y);. Usual caveat regarding commas inside macro arguments applies to expression argument:

In Python, the following logging functions are available in the lsst.log module. These functions take a variable number of arguments following a format string in the style of printf(). The use of *args is recommended over the use of calling the % operator directly, to avoid unnecessarily formatting log messages that do not meet the level threshold.

Additionally, Python has variants of the above functions, with a trailing f in the name, supporting the Python string format() syntax. The use of *args and **kwargs is recommended over the use of calling the format() method on the string directly, to avoid unnecessarily formatting log messages that do not meet the level threshold.

Initialization

The underlying log4cxx system can be initialized explicitly from either the C++ or Python layers or default-initialized.

In C++, the following macros can be used:

In Python, the following function is available in the lsst.log module:

If none of the above methods is called by user code then logging is default-initialized and configured according to the rules described below.

Configuration

The logging system is configured either using a standard log4cxx XML config file, a standard log4j Java properties, or using the default configuration. While log4cxx allows for programmatic configuration, as of this writing, only the adjustment of logging level threshold (see below) is exposed via the lsst.log API. All other configuration (e.g. outputs and formatting) are controlled via a configuration file.

In the absence of an explicit call to one of the configuration macros (or Python methods), lsst.log tries first to configure itself from a default configuration file. The name of this default confgiuration file is determined from a value of LSST_LOG_CONFIG environment variable. If this variable is set and points to an existing readable file then that file is used for configuration as if LOG_CONFIG(filename) was called, if errors happen during file parsing then LOG4CXX prints error messages but continues to run in un-configured state. If LSST_LOG_CONFIG is not set or its value does not refer to readable file then log4cxx is configured using pre-defined configuration:

The same configuration algorithm applies to the C++ macro LOG_CONFIG() and the Python lsst.log module function configure(), which do not take a file path argument (which makes calls to this method/macro optional).

LOG_CONFIG(filename) takes name of the file which contains LOG4CXX configuration. If file name ends with .xml extension then it is assumed to be a standard log4cxx XML configuration, otherwise it should contain standard log4j Java properties as explained below. If parsing of the file fails then LOG4CXX will print error messages to standard error but will continue to run (likely in un-configured state).

LOG_CONFIG_PROP(string) takes a string which is a representation of log4j Java properties (including new lines), this is a useful short-cut for cases when configuration has to be included in the application itself.

All configuration macro and their Python wrappers reset existing logging configuration before applying new one, there is no way to configure logging incrementally using these constructs.

Below is an example of an XML file that configures three appenders and two loggers: the root logger and the named logger "debugs". Each of the appenders contains a layout that defines which metadata to display and how to format log messages.

<?xml version="1.0" encoding="UTF-8" ?>
 <log4j:configuration xmlns:log4j="http://jakarta.apache.org/log4j/">

  <appender name="appxConsoleAppender" class="org.apache.log4j.ConsoleAppender">
    <param name="Target" value="System.out"/>
        <layout class="org.apache.log4j.PatternLayout">
            <param name="ConversionPattern" value="%d{yyyyMMdd HH:mm:ss,SSS} %t %-5p %m%n"/>
        </layout>
  </appender>


  <appender name="appxFileAppender" class="org.apache.log4j.FileAppender">
    <param name="file" value="/u1/bchick/sandbox2/modules/var/log/qserv-master.log" />
    <param name="append" value="true" />
    <layout class="org.apache.log4j.PatternLayout">
      <param name="ConversionPattern" value="%d %-5p %c{2} - %m%n" />
    </layout>
  </appender>


  <appender name="debugsAppender" class="org.apache.log4j.FileAppender">
    <param name="file" value="/u1/bchick/sandbox2/modules/var/log/debugs.log" />
    <param name="append" value="true" />
    <layout class="org.apache.log4j.PatternLayout">
      <param name="ConversionPattern" value="%d [%t] %-5p %c{2} %M (%F:%L) - %m%n" />
    </layout>
  </appender>

  <root>
        <priority value="info" />
        <appender-ref ref="appxFileAppender"/>
        <appender-ref ref="appxConsoleAppender"/>
  </root>


  <category name="debugs" additivity="false" >
        <priority value ="debug" />
        <appender-ref ref="debugsAppender"/>
  </category>

 </log4j:configuration>

The root logger is setup to append to both the console and the file var/log/qserv-master.log with a threshold of "INFO". The named logger "debugs", meanwhile, is set to exclusively append to the file var/log/debugs.log with a threshold of "DEBUG". In this way, "debugs" may capture and isolate verbose debugging messages without polluting the main log. This behavior is triggered by setting the additivity attribute of the category tag to "false". If additivity were set to "true" (the default value), all messages sent to "debugs" that met its threshold of "DEBUG" would also be sent to the console and var/log/qserv-master.log, the targets of the appenders associated with the root logger.

Note that as an alternative to XML, a configuration file containing log4j Java properties may be used. Here's a trivial example of a log4j properties file that corresponds to default configuration described above:

# Set root logger level to INFO and its only appender to A1.
log4j.rootLogger=INFO, A1

# A1 is set to be a ConsoleAppender.
log4j.appender.A1=org.apache.log4j.ConsoleAppender

# A1 uses PatternLayout.
log4j.appender.A1.layout=org.apache.log4j.PatternLayout
log4j.appender.A1.layout.ConversionPattern=%c %p: %m%n

Values in the configuration file can include Unix environment variables using the "`${ENVVAR}`" notation. In addition, "`${user.home}`" and "`${user.name}`" may be useful substitutions. Note that "`~`" substitution does ''not'' work.

Useful appenders include:

For the PatternLayout, all the conversion specifiers from this document should work.

Read more about log4cxx configuration here.

Programmatic Control of Threshold

The threshold level of any logger can be set or queried programmatically in both the C++ and Python layers.

In C++, the following macros are available:

In Python, the lsst.log module includes the following functions and variables:

Fine-level Debugging Example

The following is a simple recipe for emulating additional debugging levels using the above API within Python. Analogous code can be readily written in C++ using the corresponding macros.

 def debugLoggerName(num):
     """
     Returns the logger name that corresponds to fine-level debugging number NUM.
     """
     return 'MyLogger.debug{}'.format(num)

 def debugAt(num, fmt, *args):
     """
     Sends the log message created from FMT and *ARGS to the logger corresponding to fine-level debugging number NUM.
     """
     lsst.log.log(debugLoggerName(num), lsst.log.DEBUG, fmt, *args)

 def debugSetAt(num):
     """
     Adjusts logging level thresholds to emulate debugging with fine-level NUM.
     """
     for i in range(5):
         lsst.log.setLevel(debugLoggerName(i), lsst.log.INFO if i < num else lsst.log.DEBUG)

 debugSetAt(1)
 debugAt(1, "Debug 1 statement that will display")
 debugAt(2, "Debug 2 statement that will display")
 debugAt(3, "Debug 3 statement that will display")
 debugAt(4, "Debug 4 statement that will display")
 debugAt(5, "Debug 5 statement that will display")
 debugSetAt(2)
 debugAt(1, "Debug 1 statement that will NOT display")
 debugAt(2, "Debug 2 statement that will display")
 debugAt(3, "Debug 3 statement that will display")
 debugAt(4, "Debug 4 statement that will display")
 debugAt(5, "Debug 5 statement that will display")
 debugSetAt(3)
 debugAt(1, "Debug 1 statement that will NOT display")
 debugAt(2, "Debug 2 statement that will NOT display")
 debugAt(3, "Debug 3 statement that will display")
 debugAt(4, "Debug 4 statement that will display")
 debugAt(5, "Debug 5 statement that will display")
 debugSetAt(4)
 debugAt(1, "Debug 1 statement that will NOT display")
 debugAt(2, "Debug 2 statement that will NOT display")
 debugAt(3, "Debug 3 statement that will NOT display")
 debugAt(4, "Debug 4 statement that will display")
 debugAt(5, "Debug 5 statement that will display")
 debugSetAt(5)
 debugAt(1, "Debug 1 statement that will NOT display")
 debugAt(2, "Debug 2 statement that will NOT display")
 debugAt(3, "Debug 3 statement that will NOT display")
 debugAt(4, "Debug 4 statement that will NOT display")
 debugAt(5, "Debug 5 statement that will display")

Mapped Diagnostic Context

User-specified metadata in the form of key/value pairs may be given to lsst.log using the macro LOG_MDC(key, value) within C++, or the lsst.log function MDC(key, value) within Python. These metadata may then be automatically included as part of any/all subsequent log messages by specifying the corresponding key in the appender's formatting string as per standard log4cxx. These metadata are handled using log4cxx's mapped diagnostic context (MDC) feature, which has thread-level scope.

For example, a session id may be included by using the following formatting string:

"%d [%t] %-5p %c{2} (%X{session}) %m%n"

or alternatively using %X format code without additional key to display full MDC:

"%d [%t] %-5p %c{2} %m%n MDC=%X"

In C++, the following MDC macros are available:

In Python, the lsst.log module provides the following MDC functions:

Display PID in the LSST logs

PID can be displayed in the logs using code below:

In C++:

#include "lsst/log/Log.h"
...
-- Add a PID key in MDC
LOG_MDC("PID", std::to_string(getpid()));
-- Then configure lsst/log
...

In python:

import lsst.log as log
...
# Add a PID key in MDC
log.MDC("PID", os.getpid())
# Then configure lsst/log
...

This has to be done after fork() and corresponding formatting code needs to be added to the logger configuration:

log4j.appender.FILE.layout.conversionPattern=%X{PID} - %m%n

Display thread ID

The conversion pattern in log4cxx has special conversion code (t) to render "thread ID", the rendered value is a pointer to thread_t object (on POSIX systems) and is rendered as a long string of hexadecimal digits which makes it difficult to read for non-hardcore programmers. lsst.log provides more human-friendly way to print thread-identifying information. Special function lsst::log::lwpID() returns light-weight process ID (LWP) on the plaftorms that support this feature and small incremental integer number on other platforms. Using this function one can add LWP to MDC and configure output format to include this special MDC key:

In C++:

#include "lsst/log/Log.h"
...
  -- Add LWP key to MDC
  LOG_MDC("LWP", std::to_string(lsst::log::lwpID()));

In Python:

import lsst.log as log
...
# Add LWP key to MDC
log.MDC("LWP", log.lwpID())

MDC and multi-threading

MDC is a great tool to associate some context information with the messages but its use in multi-threaded applications can be complicated. MDC is per-thread instance and it is initialized as empty in every new thread. If application adds somethig to MDC in main thread this information will not exist in other threads unless it is explicitly set in those other threads. This may be an issue for applications which do not have complete control over thread lifetime.

This package provides a mechanism to initialize MDC in all threads started by a process. Special macro LOG_MDC_INIT(function) can be used to register a function to be called in in every new thread (and in current thread) before any logging message is sent to a logging system. This function can do anything but its main pupose is to initialize MDC with useful values.

Here is an example of this:

#include "lsst/log/Log.h"
...
void init_mdc() { LOG_MDC("PID", std::to_string(getpid())); }
...
int main(int, char**) {
    LOG_MDC_INIT(init_mdc);
    ....
}

or in Python:

import lsst.log as log
...
def init_mdc()
    log.MDC("PID", os.getpid())
...
log.MDCRegisterInit(init_mdc)

LOG_MDC_INIT(function) returns some unspecified integer value, this can be used to do one-time initialization (e.g. in some shared library):

#include "lsst/log/Log.h"
...
void init_mdc() { LOG_MDC("MY_CONTEXT", "data"); }
...
void foo() {
    static int dummyMdcInit = LOG_MDC_INIT(init_mdc);
    ....
}

One should note that this initialization function will only be called in the current thread from which LOG_MDC_INIT() was called and all new threads, if there are some other threads running at a time of this call the function will not be called for them. Normally the function registered with LOG_MDC_INIT() will be called only once per thread but under some circumstances it may be called more than once. Registering the same function multiple times will result is multiple calls to the same function.

Benchmarks

Measuring the performance of lsst.log when actually writing log messages to output targets such as a file or socket provides little to no information due to buffering and the fact that in the absence of buffering these operations are I/O limited. Conversely, timing calls to log functions when the level threshold is not met is quite valuable since an ideal logging system would add no appreciable overhead when deactivated. Basic measurements of the performance of Log have been made with the level threshold such that logging messages are not written. These measurements are made within a single-node instance of Qserv running on lsst-dev03 without significant competition from other system activity. The average time required to submit the following suppressed log message is 26 nanoseconds:

LOG_INFO("Hello default logger!");

The same holds for the iostream style interface (also 26 nanoseconds):

LOGS_INFO("Hello default logger!");

Importantly, the overhead is unaffected when formatting is introduced since the CPU will not encounter these instructions when the logging threshold is not met. For example, the average time required to submit the following suppressed log message is also 26 nanoseconds:

const char* info_stuff = "important stuff";
LOGS_INFO("Hello default logger with " << info_stuff);

Note that these timings are indistinguishable from those using log4cxx directly, without the Log layer.

Meanwhile, the situation is quite different when looking up the logger by name. For instance, the average time required to submit the following suppressed log message is 1.0 microseconds (~40X slower):

LOG("myLogger", LOG_LVL_INFO, "Hello my logger!");

The same holds when using the iostream style interface or when formatting. That is, the following the example also takes 1.0 microseconds:

const char* info_stuff = "important stuff";
LOGS("myLogger", LOG_LVL_INFO, "Hello my logger with " << info_stuff);

As an alternative to using the default logger, intermediate performance can be achieved for suppressed log messages using logger objects. Retrieve the logger object by name:

LOG_LOGGER myLogger = LOG_GET("myLogger");

Now, the following suppressed log message takes 190 nanoseconds:

LOG(myLogger, LOG_LVL_INFO, "Hello my logger!");

The 190 - 26 = 164 nanoseconds differential between this and the LOG_INFO example above is attributable to looking up the logging level LOG_LVL_INFO.

These measurement were made using the OpenMP API with code like the following:

int iterations = 1000;
double start, stop;
LOG_SET_LVL("", LOG_LVL_WARN);
assert(!LOG_CHECK_LVL("", LOG_LVL_INFO));
start = omp_get_wtime();
for (int i=0; i < iterations; i++)

These log messages will not print. LOG_INFO("Hello default logger!"); stop = omp_get_wtime(); LOG_WARN("LOG_INFO(...): avg time = %f" % ((stop - start)/iterations));