This page describes the information that your code should provide in every log entry it generates, and the tools Fybrik provides to ensure consistency across components.
- Log entries should be written to stdout and stderr.
- Fybrik neither collects nor aggregates logs. This may be done by external tools such as logstash or fluentd.
- A globally unique identifier for each FybrikApplication instance is passed to all control plane and data plane components to be included in log entries. This enables correlation of log entries across different logs and clusters for the specific instance, even if the name of the FybrikApplication is reused over time.
Log Entry Contents
All fybrik components, whether control plane or data plane components, should write log entries to stdout and stderr in json format. The contents of the log entries are detailed in fybrik.io/pkg/logging/logging.go.
The fybrik control plane uses zerolog for its golang components, and provides a library of fybrik specific helper functions to be used with it. Examples of how to use zerolog: https://github.com/rs/zerolog/blob/master/log_example_test.go
TBD - fybrik logging helper functions for python and java.
Log Entry Verbosity
The choice of a log level should take into account in which environments the logged information is relevant: production, testing, or development. Although the administrator can configure the verbosity as desired, the following are typical configurations for the different environments.
Errors should always be logged, preferably with as much information as possible. To this end, the function LogStructure in pkg/logging/logging.go converts golang structures to json for inclusion in the log. Please note that panic and fatal should be used sparingly.
- panic (zerolog.PanicLevel, 5) - Errors that prevent the component from operating correctly and handling requests
  - Ex: fybrik control plane did not deploy correctly
  - Ex: Data plane component crashed and cannot handle requests
- fatal (zerolog.FatalLevel, 4) - Errors that prevent the component from successfully completing a particular task
  - Ex: fybrikapplication controller cannot generate a plotter
  - Ex: Arrow/Flight server used to read data cannot access data store
- error (zerolog.ErrorLevel, 3) - Errors that are neither fatal nor panic, but that the user / request initiator is made aware of (typical production setting for stable solution)
  - Ex: Dataset requested in fybrikapplication.spec is not allowed to be used
  - Ex: Query to Arrow/Flight server used to read data returns an error because of incorrect dataset ID
- warn (zerolog.WarnLevel, 2) - Errors not shared with the user / request initiator, typically from which the component recovers on its own
All of the previous plus:
- info (zerolog.InfoLevel, 1) - High level health information that makes it clear the overall status, but without much detail (highest level used in production)
All of the previous plus:
- debug (zerolog.DebugLevel, 0) - Additional information needed to help identify problems (typically used during testing)
All of the previous plus:
- trace (zerolog.TraceLevel, -1) - For tracing step by step flow of control (typically used during development)
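The relationship between the configured verbosity and which entries are written can be sketched as follows. This is a minimal illustration using only the Go standard library, not zerolog itself: the numeric values are the ones listed above, while the Level type, parseLevel, and shouldLog are hypothetical names introduced for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// Level mirrors the numeric values listed above for the zerolog levels.
// It is a stand-in for illustration, not the zerolog type.
type Level int

const (
	TraceLevel Level = iota - 1 // -1
	DebugLevel                  // 0
	InfoLevel                   // 1
	WarnLevel                   // 2
	ErrorLevel                  // 3
	FatalLevel                  // 4
	PanicLevel                  // 5
)

var levelNames = map[string]Level{
	"trace": TraceLevel,
	"debug": DebugLevel,
	"info":  InfoLevel,
	"warn":  WarnLevel,
	"error": ErrorLevel,
	"fatal": FatalLevel,
	"panic": PanicLevel,
}

// parseLevel (hypothetical helper) maps a level name to its numeric value.
func parseLevel(name string) (Level, error) {
	lvl, ok := levelNames[strings.ToLower(strings.TrimSpace(name))]
	if !ok {
		return 0, fmt.Errorf("unknown log level %q", name)
	}
	return lvl, nil
}

// shouldLog reports whether an entry at level entry is written when the
// verbosity is set to configured: everything at or above it is kept.
func shouldLog(configured, entry Level) bool {
	return entry >= configured
}

func main() {
	// Assume a production-like verbosity of "info".
	configured, err := parseLevel("info")
	if err != nil {
		panic(err)
	}
	for _, name := range []string{"trace", "debug", "info", "warn", "error"} {
		lvl, _ := parseLevel(name)
		fmt.Printf("%-5s written=%t\n", name, shouldLog(configured, lvl))
	}
}
```

With the verbosity set to info, the trace and debug entries are dropped, while info, warn, and error are written.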
JSON Logging Standard Format
All Fybrik components should generate logging information in a standard format. This information will be used by different actors for different purposes, so as much relevant information as possible needs to be captured in a consistent format.
We list all mandatory and optional fields to be used by all Fybrik components. In addition to the fields we list, Fybrik components may include extra fields as needed.
The fields in this section are typically generated by the logging libraries.
- level - log level (‘panic’, ‘fatal’, ‘error’, ‘warn’, ‘info’, ‘debug’, or ‘trace’)
- time - timestamp of the log event. Timestamps should be in ISO8601 format with time offset from UTC or timezone. Example: ‘2022-02-16T10:46:21+02:00’
- caller - the line of code that generated the log entry (file name + line number). Example: manager/main.go:319
- app.fybrik.io/app-uuid - unique identifier for kubernetes FybrikApplication, used to correlate log messages across components for a particular FybrikApplication instance. It is also unique over time so one may differentiate between FybrikApplication instances with the same name created at different times
- message - string message for the log entry. Either this field or message_id must be included
- message_id - unique identifier indicating the message string that should be used. This is used instead of a message string for messages that need to support internationalization, such as those that go to users
- funcName - method or function in which the error occurred
- DataSetID - unique identifier for the data set
- ForUser - True if this should be shared with the end user in fybrikapplication status or events. False otherwise
- ForArchive - True if this should be archived long term. For example, if it contains full contents of FybrikApplication and its status and should be stored for auditing purposes
- cluster - cluster name on which the process generating the entry ran
- component - name of the component generating the log entry
- action - current operation being called. For example, “create_catalog” or “update_asset”
- response_time - response time of the current operation in milliseconds. Can be used in monitoring dashboards such as Kibana
- error - the error code or message returned to the fybrik component upon an unsuccessful action. Additional context should usually be provided in the accompanying message field
The following settings control how log entries are produced, rather than appearing as fields in a log entry:
- LOGGING_VERBOSITY - should be set to one of the levels described in the previous section.
- PRETTY_LOGGING - If true, log entries are in human readable format. If false, they are in json. Should only be true during development, since json is preferred to enable easy parsing by aggregator tools.
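To make the field list above concrete, here is a rough sketch of a single log entry serialized with Go's encoding/json. In practice the logging library produces these entries. The LogEntry struct, the encodeEntry and exampleEntry helpers, and all field values (including the UUID and cluster name) are assumptions made up for illustration. Only a subset of the listed fields is shown.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"time"
)

// LogEntry is a hypothetical struct covering some of the fields listed above.
type LogEntry struct {
	Level        string `json:"level"`
	Time         string `json:"time"`
	Caller       string `json:"caller"`
	AppUUID      string `json:"app.fybrik.io/app-uuid"`
	Message      string `json:"message,omitempty"`
	MessageID    string `json:"message_id,omitempty"`
	Cluster      string `json:"cluster,omitempty"`
	Component    string `json:"component,omitempty"`
	Action       string `json:"action,omitempty"`
	ResponseTime int64  `json:"response_time,omitempty"` // milliseconds
	Error        string `json:"error,omitempty"`
}

// encodeEntry serializes an entry, enforcing the rule that either
// message or message_id must be included.
func encodeEntry(e LogEntry) (string, error) {
	if e.Message == "" && e.MessageID == "" {
		return "", errors.New("either message or message_id must be set")
	}
	b, err := json.Marshal(e)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

// exampleEntry builds an entry with made-up values. The timestamp is
// formatted as ISO8601 with a UTC offset via time.RFC3339.
func exampleEntry() LogEntry {
	ts := time.Date(2022, time.February, 16, 10, 46, 21, 0,
		time.FixedZone("UTC+2", 2*60*60))
	return LogEntry{
		Level:        "info",
		Time:         ts.Format(time.RFC3339),
		Caller:       "manager/main.go:319",
		AppUUID:      "123e4567-e89b-12d3-a456-426614174000", // placeholder
		Message:      "asset updated",
		Cluster:      "cluster-1", // placeholder
		Component:    "manager",
		Action:       "update_asset",
		ResponseTime: 42,
	}
}

func main() {
	out, err := encodeEntry(exampleEntry())
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

Optional fields use omitempty here so that absent values do not appear in the entry. That is a design choice of this sketch, not a documented Fybrik requirement.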
Logging of Structures
Fybrik provides a helper function called LogStructure in pkg/logging/logging.go for writing Go structures in json format to the log. It supports different verbosity levels, and thus can be used in production, testing and development environments.
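The general idea of embedding a Go structure in a JSON log entry can be sketched as below. This is not Fybrik's LogStructure, whose signature is not reproduced here. The helper names structField and entryWithStructure and the assetRequest type are hypothetical, and the sketch uses only the standard library.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// assetRequest is a made-up structure used only for this example.
type assetRequest struct {
	DataSetID string `json:"DataSetID"`
	Action    string `json:"action"`
}

// structField (hypothetical) marshals v and returns it as raw JSON, so it
// is embedded in the entry as a nested object, not an escaped string.
func structField(v interface{}) (json.RawMessage, error) {
	b, err := json.Marshal(v)
	if err != nil {
		return nil, err
	}
	return json.RawMessage(b), nil
}

// entryWithStructure (hypothetical) builds a JSON entry that carries the
// structure v under the field called name.
func entryWithStructure(level, message, name string, v interface{}) (string, error) {
	raw, err := structField(v)
	if err != nil {
		return "", err
	}
	entry := map[string]interface{}{
		"level":   level,
		"message": message,
		name:      raw,
	}
	b, err := json.Marshal(entry)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	req := assetRequest{DataSetID: "ds-1", Action: "read"}
	out, err := entryWithStructure("debug", "received request", "request", req)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

Keeping the structure as nested JSON, as opposed to a quoted string, lets aggregator tools query its fields directly.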