
API Reference

Packages:

app.fybrik.io/v1alpha1
katalog.fybrik.io/v1alpha1
motion.fybrik.io/v1alpha1

app.fybrik.io/v1alpha1

Resource Types:

Blueprint
FybrikApplication
FybrikModule
FybrikStorageAccount
Plotter

Blueprint

↩ Parent

Blueprint is the Schema for the blueprints API

| Name | Type | Description | Required |
|---|---|---|---|
| apiVersion | string | app.fybrik.io/v1alpha1 | true |
| kind | string | Blueprint | true |
| metadata | object | Refer to the Kubernetes API documentation for the fields of the `metadata` field. | true |
| spec | object | BlueprintSpec defines the desired state of Blueprint, which defines the components of the workload's data path that run in a particular cluster. In a single-cluster environment there is one Blueprint. In a multi-cluster environment there is one Blueprint per cluster per workload (FybrikApplication). | false |
| status | object | BlueprintStatus defines the observed state of Blueprint. This includes readiness, error message, and indicators for the Kubernetes resources owned by the Blueprint for cleanup and status monitoring. | false |

Blueprint.spec

↩ Parent

BlueprintSpec defines the desired state of Blueprint, which defines the components of the workload's data path that run in a particular cluster. In a single cluster environment there is one blueprint. In a multi-cluster environment there is one Blueprint per cluster per workload (FybrikApplication).

| Name | Type | Description | Required |
|---|---|---|---|
| cluster | string | Cluster indicates the cluster on which the Blueprint runs. | true |
| modules | map[string]object | Modules is a map containing the modules that indicate the data path components that run in this cluster. The map key is InstanceName, which is the unique name for the deployed instance related to this workload. | true |

Blueprint.spec.modules[key]

↩ Parent

BlueprintModule is a copy of a FybrikModule Custom Resource. It contains the information necessary to instantiate a datapath component, including the parameters relevant for the particular workload.

| Name | Type | Description | Required |
|---|---|---|---|
| arguments | object | Arguments are the input parameters for a specific instance of a module. | false |
| assetIds | []string | AssetIDs indicate the assets processed by this module. Included so we can track asset status as well as module status in the future. | false |
| chart | object | Chart contains the location of the Helm chart, with information detailing how to deploy it. | true |
| name | string | Name of the FybrikModule on which this is based. | true |
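As an illustration, a minimal Blueprint following this schema might look like the sketch below; the cluster, instance, module, and chart names are all hypothetical:

```yaml
apiVersion: app.fybrik.io/v1alpha1
kind: Blueprint
metadata:
  name: my-app-blueprint          # hypothetical
  namespace: fybrik-blueprints
spec:
  cluster: cluster-east           # cluster on which this Blueprint runs
  modules:
    # map key is the InstanceName, unique per deployed instance
    my-app-read-instance:
      name: arrow-flight-module   # FybrikModule this instance is based on
      chart:
        name: ghcr.io/example/arrow-flight-chart:0.1.0
      assetIds:
        - "asset-1"
      arguments:
        labels:
          app: my-app
```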

Blueprint.spec.modules[key].arguments

↩ Parent

Arguments are the input parameters for a specific instance of a module.

| Name | Type | Description | Required |
|---|---|---|---|
| appSelector | object | Application selector is used to identify the user workload. It is obtained from the FybrikApplication spec. | false |
| copy | object | CopyArgs are parameters specific to modules that copy data from one data store to another. | false |
| labels | map[string]string | Labels of FybrikApplication. | false |
| read | []object | ReadArgs are parameters that are specific to modules that enable an application to read data. | false |
| write | []object | WriteArgs are parameters that are specific to modules that enable an application to write data. | false |

Blueprint.spec.modules[key].arguments.appSelector

↩ Parent

Application selector is used to identify the user workload. It is obtained from FybrikApplication spec.

| Name | Type | Description | Required |
|---|---|---|---|
| matchExpressions | []object | matchExpressions is a list of label selector requirements. The requirements are ANDed. | false |
| matchLabels | map[string]string | matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is "key", the operator is "In", and the values array contains only "value". The requirements are ANDed. | false |

Blueprint.spec.modules[key].arguments.appSelector.matchExpressions[index]

↩ Parent

A label selector requirement is a selector that contains values, a key, and an operator that relates the key and values.

| Name | Type | Description | Required |
|---|---|---|---|
| values | []string | values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch. | false |
| key | string | key is the label key that the selector applies to. | true |
| operator | string | operator represents a key's relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist. | true |

Blueprint.spec.modules[key].arguments.copy

↩ Parent

CopyArgs are parameters specific to modules that copy data from one data store to another.

| Name | Type | Description | Required |
|---|---|---|---|
| transformations | []object | Transformations are different types of processing that may be done to the data as it is copied. | false |
| assetID | string | AssetID identifies the asset to be used for accessing the data when it is ready. It is copied from the FybrikApplication resource. | true |
| destination | object | Destination is the data store to which the data will be copied. | true |
| source | object | Source is where the data currently resides. | true |

Blueprint.spec.modules[key].arguments.copy.destination

↩ Parent

Destination is the data store to which the data will be copied

| Name | Type | Description | Required |
|---|---|---|---|
| connection | object | Connection has the relevant details for accessing the data (url, table, ssl, etc.). | true |
| format | string | Format represents the data format (e.g. parquet) as received from catalog connectors. | true |
| vault | map[string]object | Holds details for retrieving credentials by the modules from the Vault store. It is a map so that different credentials can be stored for the different DataFlow operations. | true |

Blueprint.spec.modules[key].arguments.copy.destination.vault[key]

↩ Parent

Holds details for retrieving credentials from Vault store.

| Name | Type | Description | Required |
|---|---|---|---|
| address | string | Address is the Vault address. | true |
| authPath | string | AuthPath is the path to the auth method, e.g. kubernetes. | true |
| role | string | Role is the Vault role used for retrieving the credentials. | true |
| secretPath | string | SecretPath is the path of the secret holding the credentials in Vault. | true |
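For example, a vault map entry keyed by the DataFlow operation could look like this sketch; the `read` key, address, and paths are illustrative assumptions, not fixed values:

```yaml
vault:
  read:                            # map key: the DataFlow operation these credentials serve
    address: https://vault.fybrik-system:8200
    authPath: /v1/auth/kubernetes/login
    role: module
    secretPath: /v1/kubernetes-secrets/asset-creds?namespace=my-ns
```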

Blueprint.spec.modules[key].arguments.copy.source

↩ Parent

Source is where the data currently resides

| Name | Type | Description | Required |
|---|---|---|---|
| connection | object | Connection has the relevant details for accessing the data (url, table, ssl, etc.). | true |
| format | string | Format represents the data format (e.g. parquet) as received from catalog connectors. | true |
| vault | map[string]object | Holds details for retrieving credentials by the modules from the Vault store. It is a map so that different credentials can be stored for the different DataFlow operations. | true |

Blueprint.spec.modules[key].arguments.copy.source.vault[key]

↩ Parent

Holds details for retrieving credentials from Vault store.

| Name | Type | Description | Required |
|---|---|---|---|
| address | string | Address is the Vault address. | true |
| authPath | string | AuthPath is the path to the auth method, e.g. kubernetes. | true |
| role | string | Role is the Vault role used for retrieving the credentials. | true |
| secretPath | string | SecretPath is the path of the secret holding the credentials in Vault. | true |

Blueprint.spec.modules[key].arguments.read[index]

↩ Parent

ReadModuleArgs define the input parameters for modules that read data from location A

| Name | Type | Description | Required |
|---|---|---|---|
| transformations | []object | Transformations are different types of processing that may be done to the data. | false |
| assetID | string | AssetID identifies the asset to be used for accessing the data when it is ready. It is copied from the FybrikApplication resource. | true |
| source | object | Source of the read path module. | true |

Blueprint.spec.modules[key].arguments.read[index].source

↩ Parent

Source of the read path module

| Name | Type | Description | Required |
|---|---|---|---|
| connection | object | Connection has the relevant details for accessing the data (url, table, ssl, etc.). | true |
| format | string | Format represents the data format (e.g. parquet) as received from catalog connectors. | true |
| vault | map[string]object | Holds details for retrieving credentials by the modules from the Vault store. It is a map so that different credentials can be stored for the different DataFlow operations. | true |

Blueprint.spec.modules[key].arguments.read[index].source.vault[key]

↩ Parent

Holds details for retrieving credentials from Vault store.

| Name | Type | Description | Required |
|---|---|---|---|
| address | string | Address is the Vault address. | true |
| authPath | string | AuthPath is the path to the auth method, e.g. kubernetes. | true |
| role | string | Role is the Vault role used for retrieving the credentials. | true |
| secretPath | string | SecretPath is the path of the secret holding the credentials in Vault. | true |

Blueprint.spec.modules[key].arguments.write[index]

↩ Parent

WriteModuleArgs define the input parameters for modules that write data to location B

| Name | Type | Description | Required |
|---|---|---|---|
| transformations | []object | Transformations are different types of processing that may be done to the data as it is written. | false |
| assetID | string | AssetID identifies the asset to be used for accessing the data when it is ready. It is copied from the FybrikApplication resource. | true |
| destination | object | Destination is the data store to which the data will be written. | true |

Blueprint.spec.modules[key].arguments.write[index].destination

↩ Parent

Destination is the data store to which the data will be written

| Name | Type | Description | Required |
|---|---|---|---|
| connection | object | Connection has the relevant details for accessing the data (url, table, ssl, etc.). | true |
| format | string | Format represents the data format (e.g. parquet) as received from catalog connectors. | true |
| vault | map[string]object | Holds details for retrieving credentials by the modules from the Vault store. It is a map so that different credentials can be stored for the different DataFlow operations. | true |

Blueprint.spec.modules[key].arguments.write[index].destination.vault[key]

↩ Parent

Holds details for retrieving credentials from Vault store.

| Name | Type | Description | Required |
|---|---|---|---|
| address | string | Address is the Vault address. | true |
| authPath | string | AuthPath is the path to the auth method, e.g. kubernetes. | true |
| role | string | Role is the Vault role used for retrieving the credentials. | true |
| secretPath | string | SecretPath is the path of the secret holding the credentials in Vault. | true |

Blueprint.spec.modules[key].chart

↩ Parent

Chart contains the location of the Helm chart, with information detailing how to deploy it

| Name | Type | Description | Required |
|---|---|---|---|
| chartPullSecret | string | Name of the secret containing Helm registry credentials. | false |
| values | map[string]string | Values to pass to the Helm chart installation. | false |
| name | string | Name of the Helm chart. | true |

Blueprint.status

↩ Parent

BlueprintStatus defines the observed state of Blueprint. This includes readiness, error message, and indicators for the Kubernetes resources owned by the Blueprint for cleanup and status monitoring

| Name | Type | Description | Required |
|---|---|---|---|
| observedGeneration | integer | ObservedGeneration is taken from the Blueprint metadata. This is used to determine during reconcile whether reconcile was called because the desired state changed, or whether the status of the allocated resources should be checked. (Format: int64) | false |
| observedState | object | ObservedState includes information to be reported back to the FybrikApplication resource. It includes readiness and error indications, as well as user instructions. | false |
| releases | map[string]integer | Releases map each release to the observed generation of the blueprint containing this release. At the end of reconcile, each release should be mapped to the latest blueprint version or be uninstalled. | false |
| modules | map[string]object | ModulesState is a map which holds the status of each module. Its key is the instance name, which is the unique name for the deployed instance related to this workload. | true |

Blueprint.status.observedState

↩ Parent

ObservedState includes information to be reported back to the FybrikApplication resource. It includes readiness and error indications, as well as user instructions

| Name | Type | Description | Required |
|---|---|---|---|
| error | string | Error indicates that there has been an error orchestrating the modules, and provides the error message. | false |
| ready | boolean | Ready indicates that the modules have been orchestrated successfully and the data is ready for usage. | false |

Blueprint.status.modules[key]

↩ Parent

ObservedState represents a part of the generated Blueprint/Plotter resource status that allows update of FybrikApplication status

| Name | Type | Description | Required |
|---|---|---|---|
| error | string | Error indicates that there has been an error orchestrating the modules, and provides the error message. | false |
| ready | boolean | Ready indicates that the modules have been orchestrated successfully and the data is ready for usage. | false |

FybrikApplication

↩ Parent

FybrikApplication provides information about the application being used by a Data Scientist, the nature of the processing, and the data sets that the Data Scientist has chosen for processing by the application. The FybrikApplication controller (aka pilot) obtains instructions regarding any governance related changes that must be performed on the data, identifies the modules capable of performing such changes, and finally generates the Blueprint which defines the secure runtime environment and all the components in it. This runtime environment provides the Data Scientist's application with access to the data requested in a secure manner and without having to provide any credentials for the data sets. The credentials are obtained automatically by the manager from an external credential management system, which may or may not be part of a data catalog.

| Name | Type | Description | Required |
|---|---|---|---|
| apiVersion | string | app.fybrik.io/v1alpha1 | true |
| kind | string | FybrikApplication | true |
| metadata | object | Refer to the Kubernetes API documentation for the fields of the `metadata` field. | true |
| spec | object | FybrikApplicationSpec defines the desired state of FybrikApplication. | false |
| status | object | FybrikApplicationStatus defines the observed state of FybrikApplication. | false |

FybrikApplication.spec

↩ Parent

FybrikApplicationSpec defines the desired state of FybrikApplication.

| Name | Type | Description | Required |
|---|---|---|---|
| secretRef | string | SecretRef points to the secret that holds credentials for each system the user has been authenticated with. The secret is deployed in the FybrikApplication namespace. | false |
| selector | object | Selector enables connecting the resource to the application. Application labels should match the labels in the selector. For some flows the selector may not be used. | false |
| appInfo | map[string]string | AppInfo contains information describing the reasons for the processing that will be done by the Data Scientist's application. | true |
| data | []object | Data contains the identifiers of the data to be used by the Data Scientist's application, the protocol used to access it, and the format expected. | true |
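For illustration, a minimal FybrikApplication might be sketched as follows; the names, labels, intent value, and protocol/format strings are hypothetical and in practice depend on the deployed taxonomy:

```yaml
apiVersion: app.fybrik.io/v1alpha1
kind: FybrikApplication
metadata:
  name: my-notebook               # hypothetical
spec:
  selector:
    workloadSelector:
      matchLabels:
        app: my-notebook          # must match the workload's labels
  appInfo:
    intent: fraud-detection       # free-form key/value describing the processing
  data:
    - dataSetID: "my-catalog/asset-1"
      requirements:
        interface:
          protocol: fybrik-arrow-flight
          dataformat: arrow
```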

FybrikApplication.spec.selector

↩ Parent

Selector enables connecting the resource to the application. Application labels should match the labels in the selector. For some flows the selector may not be used.

| Name | Type | Description | Required |
|---|---|---|---|
| clusterName | string | Cluster name. | false |
| workloadSelector | object | WorkloadSelector enables connecting the resource to the application. Application labels should match the labels in the selector. | true |

FybrikApplication.spec.selector.workloadSelector

↩ Parent

WorkloadSelector enables connecting the resource to the application. Application labels should match the labels in the selector.

| Name | Type | Description | Required |
|---|---|---|---|
| matchExpressions | []object | matchExpressions is a list of label selector requirements. The requirements are ANDed. | false |
| matchLabels | map[string]string | matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is "key", the operator is "In", and the values array contains only "value". The requirements are ANDed. | false |

FybrikApplication.spec.selector.workloadSelector.matchExpressions[index]

↩ Parent

A label selector requirement is a selector that contains values, a key, and an operator that relates the key and values.

| Name | Type | Description | Required |
|---|---|---|---|
| values | []string | values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch. | false |
| key | string | key is the label key that the selector applies to. | true |
| operator | string | operator represents a key's relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist. | true |

FybrikApplication.spec.data[index]

↩ Parent

DataContext indicates data set chosen by the Data Scientist to be used by his application, and includes information about the data format and technologies used by the application to access the data.

| Name | Type | Description | Required |
|---|---|---|---|
| catalogService | string | CatalogService represents the catalog service for accessing the requested dataset. If not specified, the enterprise catalog service will be used. | false |
| dataSetID | string | DataSetID is a unique identifier of the dataset chosen from the data catalog for processing by the data user application. | true |
| requirements | object | Requirements from the system. | true |

FybrikApplication.spec.data[index].requirements

↩ Parent

Requirements from the system

| Name | Type | Description | Required |
|---|---|---|---|
| copy | object | CopyRequirements include the requirements for copying the data. | false |
| interface | object | Interface indicates the protocol and format expected by the data user. | true |

FybrikApplication.spec.data[index].requirements.copy

↩ Parent

CopyRequirements include the requirements for copying the data

| Name | Type | Description | Required |
|---|---|---|---|
| catalog | object | Catalog indicates that the data asset must be cataloged. | false |
| required | boolean | Required indicates that the data must be copied. | false |

FybrikApplication.spec.data[index].requirements.copy.catalog

↩ Parent

Catalog indicates that the data asset must be cataloged.

| Name | Type | Description | Required |
|---|---|---|---|
| catalogID | string | CatalogID specifies the catalog where the data will be cataloged. | false |
| service | string | CatalogService specifies the data catalog service that will be used for cataloging the data. | false |

FybrikApplication.spec.data[index].requirements.interface

↩ Parent

Interface indicates the protocol and format expected by the data user

| Name | Type | Description | Required |
|---|---|---|---|
| dataformat | string | DataFormat defines the data format type. | false |
| protocol | string | Protocol defines the interface protocol used for data transactions. | true |
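A data entry that additionally requires the asset to be copied and cataloged might be sketched like this; the catalog ID and protocol/format values are illustrative:

```yaml
data:
  - dataSetID: "my-catalog/asset-1"
    requirements:
      copy:
        required: true              # the data must be copied
        catalog:
          catalogID: enterprise-catalog   # illustrative catalog identifier
      interface:
        protocol: s3
        dataformat: parquet
```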

FybrikApplication.status

↩ Parent

FybrikApplicationStatus defines the observed state of FybrikApplication.

| Name | Type | Description | Required |
|---|---|---|---|
| assetStates | map[string]object | AssetStates provides a status per asset. | false |
| errorMessage | string | ErrorMessage indicates that an error has happened during the reconcile, unrelated to a specific asset. | false |
| generated | object | Generated resource identifier. | false |
| observedGeneration | integer | ObservedGeneration is taken from the FybrikApplication metadata. This is used to determine during reconcile whether reconcile was called because the desired state changed, or whether the Blueprint status changed. (Format: int64) | false |
| provisionedStorage | map[string]object | ProvisionedStorage maps a dataset (identified by AssetID) to the newly provisioned bucket. It allows the FybrikApplication controller to manage buckets in case the spec has been modified, an error has occurred, or a delete event has been received. ProvisionedStorage has the information required to register the dataset once the owned Plotter resource is ready. | false |
| ready | boolean | Ready is true if all specified assets are either ready to be used or are denied access. | false |
| validApplication | string | ValidApplication indicates whether the FybrikApplication is valid given the defined taxonomy. | false |
| validatedGeneration | integer | ValidatedGeneration is the version of the FybrikApplication that has been validated against the defined taxonomy. (Format: int64) | false |

FybrikApplication.status.assetStates[key]

↩ Parent

AssetState defines the observed state of an asset

| Name | Type | Description | Required |
|---|---|---|---|
| catalogedAsset | string | CatalogedAsset provides a new asset identifier after being registered in the enterprise catalog. | false |
| conditions | []object | Conditions indicate the asset state (Ready, Deny, Error). | false |
| endpoint | object | Endpoint provides the endpoint spec from which the asset will be served to the application. | false |

FybrikApplication.status.assetStates[key].conditions[index]

↩ Parent

Condition describes the state of a FybrikApplication at a certain point.

| Name | Type | Description | Required |
|---|---|---|---|
| message | string | Message contains the details of the current condition. | false |
| status | string | Status of the condition: true or false. | true |
| type | string | Type of the condition. | true |

FybrikApplication.status.assetStates[key].endpoint

↩ Parent

Endpoint provides the endpoint spec from which the asset will be served to the application

| Name | Type | Description | Required |
|---|---|---|---|
| hostname | string | Hostname is the hostname to connect to for reaching the module's exposed service. By default this equals "{{.Release.Name}}.{{.Release.Namespace}}" of the module. Module developers can override the default behavior by providing a template that may use the ".Release.Name", ".Release.Namespace" and ".Values.labels" variables. | false |
| port | integer | (Format: int32) | true |
| scheme | string | For example: http, https, grpc, grpc+tls, jdbc:oracle:thin:@, etc. | true |
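For instance, a populated endpoint in the status might look like the following sketch, with illustrative values and the default hostname template already rendered:

```yaml
endpoint:
  hostname: my-release.fybrik-blueprints   # default: {{.Release.Name}}.{{.Release.Namespace}}
  port: 80
  scheme: grpc
```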

FybrikApplication.status.generated

↩ Parent

Generated resource identifier

| Name | Type | Description | Required |
|---|---|---|---|
| appVersion | integer | Version of the FybrikApplication that has generated this resource. (Format: int64) | true |
| kind | string | Kind of the resource (Blueprint, Plotter). | true |
| name | string | Name of the resource. | true |
| namespace | string | Namespace of the resource. | true |

FybrikApplication.status.provisionedStorage[key]

↩ Parent

DatasetDetails contain dataset connection and metadata required to register this dataset in the enterprise catalog

| Name | Type | Description | Required |
|---|---|---|---|
| datasetRef | string | Reference to a Dataset resource containing the request to provision storage. | false |
| details | object | Dataset information. | false |
| secretRef | string | Reference to a secret where the credentials are stored. | false |

FybrikModule

↩ Parent

FybrikModule is a description of an injectable component: the parameters it requires, as well as the specification of how to instantiate such a component. It is used as metadata only. There is no status and no reconciliation.

| Name | Type | Description | Required |
|---|---|---|---|
| apiVersion | string | app.fybrik.io/v1alpha1 | true |
| kind | string | FybrikModule | true |
| metadata | object | Refer to the Kubernetes API documentation for the fields of the `metadata` field. | true |
| spec | object | FybrikModuleSpec contains the info common to all modules, which are one of the components that process, load, write, audit, or monitor the data used by the data scientist's application. | true |

FybrikModule.spec

↩ Parent

FybrikModuleSpec contains the info common to all modules, which are one of the components that process, load, write, audit, monitor the data used by the data scientist's application.

| Name | Type | Description | Required |
|---|---|---|---|
| dependencies | []object | Other components that must be installed in order for this module to work. | false |
| description | string | An explanation of what this module does. | false |
| pluginType | string | PluginType indicates the plugin technology used to invoke the capabilities, e.g. vault, fybrik-wasm. Should be provided if type is plugin. | false |
| statusIndicators | []object | StatusIndicators allow checking the status of a non-standard resource that cannot be computed by helm/kstatus. | false |
| capabilities | []object | Capabilities declares what this module knows how to do and the types of data it knows how to handle. The key to the map is a CapabilityType string. | true |
| chart | object | Reference to a Helm chart that allows deployment of the resources required for this module. | true |
| type | string | May be one of service, config, or plugin. Service: the control plane deploys the component that performs the capability. Config: another pre-installed service performs the capability, and the deployed module configures it for the particular workload or dataset. Plugin: this module performs a capability as part of another service or module rather than as a stand-alone module. | true |
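As an illustration, a read module of type service might be declared like the sketch below; the module name, chart reference, and protocol strings are hypothetical:

```yaml
apiVersion: app.fybrik.io/v1alpha1
kind: FybrikModule
metadata:
  name: arrow-flight-module       # hypothetical
  namespace: fybrik
spec:
  type: service                   # the control plane deploys the component
  chart:
    name: ghcr.io/example/arrow-flight-chart:0.1.0
  capabilities:
    - capability: read
      scope: workload
      api:
        protocol: fybrik-arrow-flight
        endpoint:
          port: 80
          scheme: grpc
      supportedInterfaces:
        - source:                 # read: only source populated
            protocol: s3
            dataformat: parquet
```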

FybrikModule.spec.dependencies[index]

↩ Parent

Dependency details another component on which this module relies, i.e. a prerequisite

| Name | Type | Description | Required |
|---|---|---|---|
| name | string | Name is the name of the dependent component. | true |
| type | enum | Type provides information used in determining how to instantiate the component. (Enum: module, connector, feature) | true |

FybrikModule.spec.statusIndicators[index]

↩ Parent

ResourceStatusIndicator is used to determine the status of an orchestrated resource

| Name | Type | Description | Required |
|---|---|---|---|
| errorMessage | string | ErrorMessage specifies the resource field to check for an error, e.g. status.errorMsg. | false |
| failureCondition | string | FailureCondition specifies a condition that indicates resource failure. It uses Kubernetes label-selection syntax (https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/). | false |
| kind | string | Kind provides information about the resource kind. | true |
| successCondition | string | SuccessCondition specifies a condition that indicates that the resource is ready. It uses Kubernetes label-selection syntax (https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/). | true |
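A status indicator for a hypothetical orchestrated resource could be sketched as follows; the kind, field paths, and condition expressions are illustrative assumptions about what such a resource's status looks like:

```yaml
statusIndicators:
  - kind: BatchTransfer                      # resource kind to watch (illustrative)
    successCondition: status.status == SUCCEEDED   # label-selection style expression
    failureCondition: status.status == FAILED
    errorMessage: status.error                # field holding the error text
```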

FybrikModule.spec.capabilities[index]

↩ Parent

Capability declares what this module knows how to do and the types of data it knows how to handle

| Name | Type | Description | Required |
|---|---|---|---|
| actions | []object | Actions are the data transformations that the module supports. | false |
| api | object | API indicates to the application how to access the capabilities provided by the module. TODO: This is optional, but in ModuleAPI the endpoint is required? | false |
| plugins | []object | Plugins enable the module to add libraries to perform actions rather than implementing them by itself. | false |
| scope | enum | Scope indicates at what level the capability is used: workload, asset, or cluster. If not indicated, it is assumed to be asset. (Enum: asset, workload, cluster) | false |
| supportedInterfaces | []object | Copy should have one or more instances in the list, and its content should have source and sink. Read should have one or more instances in the list, each with source populated. Write should have one or more instances in the list, each with sink populated. This field may not be required if not handling data. | false |
| capability | enum | Capability declares what this module knows how to do, e.g. read, write, transform. (Enum: copy, read, write, transform) | true |

FybrikModule.spec.capabilities[index].api

↩ Parent

API indicates to the application how to access the capabilities provided by the module. TODO: This is optional, but in ModuleAPI the endpoint is required?

| Name | Type | Description | Required |
|---|---|---|---|
| dataformat | string | DataFormat defines the data format type. | false |
| endpoint | object | EndpointSpec is used both by the module creator and by the status of the FybrikApplication. | true |
| protocol | string | Protocol defines the interface protocol used for data transactions. | true |

FybrikModule.spec.capabilities[index].api.endpoint

↩ Parent

EndpointSpec is used both by the module creator and by the status of the fybrikapplication

| Name | Type | Description | Required |
|---|---|---|---|
| hostname | string | Hostname is the hostname to connect to for reaching the module's exposed service. By default this equals "{{.Release.Name}}.{{.Release.Namespace}}" of the module. Module developers can override the default behavior by providing a template that may use the ".Release.Name", ".Release.Namespace" and ".Values.labels" variables. | false |
| port | integer | (Format: int32) | true |
| scheme | string | For example: http, https, grpc, grpc+tls, jdbc:oracle:thin:@, etc. | true |

FybrikModule.spec.capabilities[index].plugins[index]

↩ Parent

| Name | Type | Description | Required |
|---|---|---|---|
| dataFormat | string | DataFormat indicates the format of data the plugin knows how to process. | true |
| pluginType | string | PluginType indicates the technology used for the module and the plugin to interact. The values supported should come from the module taxonomy. Examples of such mechanisms are vault plugins, wasm, etc. | true |

FybrikModule.spec.capabilities[index].supportedInterfaces[index]

↩ Parent

ModuleInOut specifies the protocol and format of the data input and output by the module - if any

| Name | Type | Description | Required |
|---|---|---|---|
| sink | object | Sink specifies the output data protocol and format. | false |
| source | object | Source specifies the input data protocol and format. | false |

FybrikModule.spec.capabilities[index].supportedInterfaces[index].sink

↩ Parent

Sink specifies the output data protocol and format

| Name | Type | Description | Required |
|---|---|---|---|
| dataformat | string | DataFormat defines the data format type. | false |
| protocol | string | Protocol defines the interface protocol used for data transactions. | true |

FybrikModule.spec.capabilities[index].supportedInterfaces[index].source

↩ Parent

Source specifies the input data protocol and format

| Name | Type | Description | Required |
|---|---|---|---|
| dataformat | string | DataFormat defines the data format type. | false |
| protocol | string | Protocol defines the interface protocol used for data transactions. | true |

FybrikModule.spec.chart

↩ Parent

Reference to a Helm chart that allows deployment of the resources required for this module

| Name | Type | Description | Required |
|---|---|---|---|
| chartPullSecret | string | Name of the secret containing Helm registry credentials. | false |
| values | map[string]string | Values to pass to the Helm chart installation. | false |
| name | string | Name of the Helm chart. | true |
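Putting these fields together, a chart reference might be sketched as follows; the chart location, secret name, and value key are illustrative:

```yaml
chart:
  name: ghcr.io/example/arrow-flight-chart:0.1.0   # chart location (illustrative)
  chartPullSecret: regcred                         # secret with registry credentials
  values:
    "image.tag": "1.2.3"                           # string key/value pairs passed to the installation
```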

FybrikStorageAccount

↩ Parent

FybrikStorageAccount defines a storage account used for copying data. Only S3-based storage is supported. It contains the endpoint, regions, and a reference to the credentials. The owner of the asset is responsible for storing the credentials.

| Name | Type | Description | Required |
|---|---|---|---|
| apiVersion | string | app.fybrik.io/v1alpha1 | true |
| kind | string | FybrikStorageAccount | true |
| metadata | object | Refer to the Kubernetes API documentation for the fields of the `metadata` field. | true |
| spec | object | FybrikStorageAccountSpec defines the desired state of FybrikStorageAccount. | false |
| status | object | FybrikStorageAccountStatus defines the observed state of FybrikStorageAccount. | false |

FybrikStorageAccount.spec

↩ Parent

FybrikStorageAccountSpec defines the desired state of FybrikStorageAccount

| Name | Type | Description | Required |
|---|---|---|---|
| endpoint | string | Endpoint. | true |
| regions | []string | Regions. | true |
| secretRef | string | Name of a k8s secret deployed in the control plane. This secret includes the secretKey and accessKey credentials for the S3 bucket. | true |
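A minimal sketch of such a storage account, with hypothetical endpoint, region, and secret names:

```yaml
apiVersion: app.fybrik.io/v1alpha1
kind: FybrikStorageAccount
metadata:
  name: account-theshire          # hypothetical
spec:
  endpoint: http://s3.example.com
  regions:
    - theshire
  secretRef: bucket-creds         # k8s secret holding accessKey and secretKey
```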

Plotter

↩ Parent

Plotter is the Schema for the plotters API

| Name | Type | Description | Required |
|---|---|---|---|
| apiVersion | string | app.fybrik.io/v1alpha1 | true |
| kind | string | Plotter | true |
| metadata | object | Refer to the Kubernetes API documentation for the fields of the `metadata` field. | true |
| spec | object | PlotterSpec defines the desired state of Plotter, which is applied in a multi-clustered environment. Plotter declares what needs to be installed and where (as blueprints running on remote clusters), which provides the Data Scientist's application with secure and governed access to the data requested in the FybrikApplication. | false |
| status | object | PlotterStatus defines the observed state of Plotter. This includes readiness, error message, and indicators received from blueprint resources owned by the Plotter for cleanup and status monitoring. | false |

Plotter.spec

↩ Parent

PlotterSpec defines the desired state of Plotter, which is applied in a multi-clustered environment. Plotter declares what needs to be installed and where (as blueprints running on remote clusters), which provides the Data Scientist's application with secure and governed access to the data requested in the FybrikApplication.

| Name | Type | Description | Required |
|---|---|---|---|
| appSelector | object | Selector enables connecting the resource to the application. Application labels should match the labels in the selector. For some flows the selector may not be used. | false |
| assets | map[string]object | Assets is a map holding information about the assets. The key is the assetID. | true |
| flows | []object | | true |
| templates | map[string]object | Templates is a map holding the templates used in this plotter's steps. The key is the template name. | true |

Plotter.spec.appSelector

↩ Parent

Selector enables connecting the resource to the application. Application labels should match the labels in the selector. For some flows the selector may not be used.

| Name | Type | Description | Required |
|---|---|---|---|
| clusterName | string | Cluster name. | false |
| workloadSelector | object | WorkloadSelector enables connecting the resource to the application. Application labels should match the labels in the selector. | true |

Plotter.spec.appSelector.workloadSelector

↩ Parent

WorkloadSelector enables connecting the resource to the application. Application labels should match the labels in the selector.

| Name | Type | Description | Required |
|---|---|---|---|
| matchExpressions | []object | matchExpressions is a list of label selector requirements. The requirements are ANDed. | false |
| matchLabels | map[string]string | matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is "key", the operator is "In", and the values array contains only "value". The requirements are ANDed. | false |

Plotter.spec.appSelector.workloadSelector.matchExpressions[index]

↩ Parent

A label selector requirement is a selector that contains values, a key, and an operator that relates the key and values.

| Name | Type | Description | Required |
|---|---|---|---|
| values | []string | values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch. | false |
| key | string | key is the label key that the selector applies to. | true |
| operator | string | operator represents a key's relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist. | true |

Plotter.spec.assets[key]

↩ Parent

AssetDetails is a list of assets used in the fybrikapplication. In addition to assets declared in the fybrikapplication, the AssetDetails list also contains assets that are allocated by the control plane in order to serve the fybrikapplication.

| Name | Type | Description | Required |
|---|---|---|---|
| advertisedAssetId | string | AdvertisedAssetID links this asset to an asset from the fybrikapplication and is used by user-facing services. | false |
| assetDetails | object | DataStore contains the details for accessing the data, as sent by catalog connectors. Credentials for accessing the data are stored in Vault, in the location represented by the Vault property. | true |

Plotter.spec.assets[key].assetDetails

↩ Parent

DataStore contains the details for accessing the data, as sent by catalog connectors. Credentials for accessing the data are stored in Vault, in the location represented by the Vault property.

| Name | Type | Description | Required |
|---|---|---|---|
| connection | object | Connection has the relevant details for accessing the data (url, table, ssl, etc.). | true |
| format | string | Format represents the data format (e.g. parquet) as received from catalog connectors. | true |
| vault | map[string]object | Holds details for retrieving credentials by the modules from the Vault store. It is a map so that different credentials can be stored for the different DataFlow operations. | true |

Plotter.spec.assets[key].assetDetails.vault[key]

↩ Parent

Holds details for retrieving credentials from Vault store.

| Name | Type | Description | Required |
|---|---|---|---|
| address | string | Address is the Vault address. | true |
| authPath | string | AuthPath is the path to the auth method, e.g. kubernetes. | true |
| role | string | Role is the Vault role used for retrieving the credentials. | true |
| secretPath | string | SecretPath is the path of the secret holding the credentials in Vault. | true |

Plotter.spec.flows[index]

↩ Parent

Flows is the list of data flows driven from fybrikapplication: Each element in the list holds the flow of the data requested in fybrikapplication.

| Name | Type | Description | Required |
|---|---|---|---|
| assetId | string | AssetID indicates the data set being used in this data flow. | true |
| flowType | string | Type of the flow (e.g. read). | true |
| name | string | Name of the flow. | true |
| subFlows | []object | | true |

Plotter.spec.flows[index].subFlows[index]

↩ Parent

Subflows is a list of data flows which originate from the same data asset but are triggered differently (e.g., one upon init trigger and one upon workload trigger)

| Name | Type | Description | Required |
|---|---|---|---|
| flowType | string | Type of the flow (e.g. read). | true |
| name | string | Name of the SubFlow. | true |
| steps | [][]object | Steps defines a series of sequential/parallel data flow steps. The first dimension represents parallel data flows; the second, sequential components within the same parallel data flow. | true |
| triggers | []enum | Triggers. | true |

Plotter.spec.flows[index].subFlows[index].steps[index][index]

↩ Parent

DataFlowStep contains details on a single data flow step

| Name | Type | Description | Required |
|---|---|---|---|
| parameters | object | Step parameters. TODO: why not flatten the parameters into this data flow step? | false |
| cluster | string | Name of the cluster this step is executed on. | true |
| name | string | Name of the step. | true |
| template | string | Template is the name of the template used to execute the step. The full details of the template can be extracted from the Plotter.spec.templates field. | true |

Plotter.spec.flows[index].subFlows[index].steps[index][index].parameters

↩ Parent

Step parameters. TODO: why not flatten the parameters into this data flow step?

| Name | Type | Description | Required |
|---|---|---|---|
| action | []object | Actions are the data transformations that the module supports. | false |
| api | object | Service holds information for accessing a module instance. | false |
| sink | object | StepSink holds information about where the target data will be written: it could be the assetID of an asset specified in the fybrikapplication or of an asset created by the fybrik control plane. | false |
| source | object | StepSource is the source of this step: it could be an assetID or an endpoint of another step. | false |
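To make the two-dimensional steps structure concrete, here is a sketch of a subflow with one parallel branch containing two sequential steps; the step names, cluster, templates, and asset IDs are all hypothetical:

```yaml
steps:
  - # first (and only) parallel branch: a list of sequential steps
    - name: copy-asset-1          # step 1: copy the data to a new location
      cluster: cluster-east
      template: copy-template
      parameters:
        source:
          assetId: asset-1
        sink:
          assetId: asset-1-copy
    - name: read-asset-1          # step 2: serve the copied data
      cluster: cluster-east
      template: read-template
      parameters:
        source:
          assetId: asset-1-copy
```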

Plotter.spec.flows[index].subFlows[index].steps[index][index].parameters.api

↩ Parent

Service holds information for accessing a module instance

| Name | Type | Description | Required |
|---|---|---|---|
| endpoint | object | EndpointSpec is used both by the module creator and by the status of the FybrikApplication. | true |
| format | string | Format represents the data format (e.g. parquet) as received from catalog connectors. | true |

Plotter.spec.flows[index].subFlows[index].steps[index][index].parameters.api.endpoint

↩ Parent

EndpointSpec is used both by the module creator and by the status of the fybrikapplication

| Name | Type | Description | Required |
|---|---|---|---|
| hostname | string | Hostname is the hostname to connect to for reaching the module's exposed service. By default this equals "{{.Release.Name}}.{{.Release.Namespace}}" of the module. Module developers can override the default behavior by providing a template that may use the ".Release.Name", ".Release.Namespace" and ".Values.labels" variables. | false |
| port | integer | (Format: int32) | true |
| scheme | string | For example: http, https, grpc, grpc+tls, jdbc:oracle:thin:@, etc. | true |

Plotter.spec.flows[index].subFlows[index].steps[index][index].parameters.sink

↩ Parent

StepSink holds information about where the target data will be written: it could be the assetID of an asset specified in the fybrikapplication or of an asset created by the fybrik control plane

| Name | Type | Description | Required |
|---|---|---|---|
| assetId | string | AssetID identifies the target asset of this step. | true |

Plotter.spec.flows[index].subFlows[index].steps[index][index].parameters.source

↩ Parent

StepSource is the source of this step: it could be an assetID or an endpoint of another step

| Name | Type | Description | Required |
|---|---|---|---|
| api | object | Service holds information for accessing a module instance. | false |
| assetId | string | AssetID identifies the source asset of this step. | false |

Plotter.spec.flows[index].subFlows[index].steps[index][index].parameters.source.api

↩ Parent

Service holds information for accessing a module instance

| Name | Type | Description | Required |
|---|---|---|---|
| endpoint | object | EndpointSpec is used both by the module creator and by the status of the FybrikApplication. | true |
| format | string | Format represents the data format (e.g. parquet) as received from catalog connectors. | true |

Plotter.spec.flows[index].subFlows[index].steps[index][index].parameters.source.api.endpoint

↩ Parent

EndpointSpec is used both by the module creator and by the status of the fybrikapplication

| Name | Type | Description | Required |
|---|---|---|---|
| hostname | string | Hostname is the hostname to connect to for reaching the module's exposed service. By default this equals "{{.Release.Name}}.{{.Release.Namespace}}" of the module. Module developers can override the default behavior by providing a template that may use the ".Release.Name", ".Release.Namespace" and ".Values.labels" variables. | false |
| port | integer | (Format: int32) | true |
| scheme | string | For example: http, https, grpc, grpc+tls, jdbc:oracle:thin:@, etc. | true |

Plotter.spec.templates[key]

↩ Parent

Template contains basic information about the modules required to serve the fybrikapplication, e.g., the module Helm chart name.

| Name | Type | Description | Required |
|---|---|---|---|
| name | string | Name of the template. | false |
| modules | []object | Modules is a list of dependent modules; e.g., if a plugin module is used, then the service module it is used in should appear first in the modules list of the same template. If the modules list contains more than one module, the first module in the list is referred to as the "primary module", to which all the parameters of this template are sent. | true |

Plotter.spec.templates[key].modules[index]

↩ Parent

ModuleInfo is a copy of the FybrikModule Custom Resource. It contains the information needed to instantiate a resource of type FybrikModule.

| Name | Type | Description | Required |
|---|---|---|---|
| scope | enum | Scope indicates at what level the capability is used: workload, asset, or cluster. If not indicated, it is assumed to be asset. (Enum: asset, workload, cluster) | false |
| chart | object | Chart contains the information needed to use Helm to install the capability. | true |
| name | string | Name of the module. | true |
| type | string | May be one of service, config, or plugin. Service: the control plane deploys the component that performs the capability. Config: another pre-installed service performs the capability, and the deployed module configures it for the particular workload or dataset. Plugin: this module performs a capability as part of another service or module rather than as a stand-alone module. | true |

Plotter.spec.templates[key].modules[index].chart

↩ Parent

Chart contains the information needed to use helm to install the capability

| Name | Type | Description | Required |
|---|---|---|---|
| chartPullSecret | string | Name of the secret containing Helm registry credentials. | false |
| values | map[string]string | Values to pass to the Helm chart installation. | false |
| name | string | Name of the Helm chart. | true |

Plotter.status

↩ Parent

PlotterStatus defines the observed state of Plotter. This includes readiness, error message, and indicators received from blueprint resources owned by the Plotter for cleanup and status monitoring.

| Name | Type | Description | Required |
|---|---|---|---|
| assets | map[string]object | Assets is a map containing the status per asset. The key of this map is assetId. | false |
| blueprints | map[string]object | | false |
| conditions | []object | Conditions represent the possible error and failure conditions. | false |
| flows | map[string]object | Flows is a map containing the status for each flow; the key is the flow name. | false |
| observedGeneration | integer | ObservedGeneration is taken from the Plotter metadata. This is used to determine during reconcile whether reconcile was called because the desired state changed, or whether the status of the allocated blueprints should be checked. (Format: int64) | false |
| observedState | object | ObservedState includes information to be reported back to the FybrikApplication resource. It includes readiness and error indications, as well as user instructions. | false |
| readyTimestamp | string | (Format: date-time) | false |

Plotter.status.assets[key]

↩ Parent

ObservedState represents a part of the generated Blueprint/Plotter resource status that allows update of FybrikApplication status

| Name | Type | Description | Required |
|---|---|---|---|
| error | string | Error indicates that there has been an error orchestrating the modules, and provides the error message. | false |
| ready | boolean | Ready indicates that the modules have been orchestrated successfully and the data is ready for usage. | false |

Plotter.status.blueprints[key]

↩ Parent

MetaBlueprint defines blueprint metadata (name, namespace) and status

| Name | Type | Description | Required |
|---|---|---|---|
| name | string | | true |
| namespace | string | | true |
| status | object | BlueprintStatus defines the observed state of Blueprint. This includes readiness, error message, and indicators for the Kubernetes resources owned by the Blueprint for cleanup and status monitoring. | true |

Plotter.status.blueprints[key].status

↩ Parent

BlueprintStatus defines the observed state of Blueprint. This includes readiness, error message, and indicators for the Kubernetes resources owned by the Blueprint for cleanup and status monitoring

| Name | Type | Description | Required |
|---|---|---|---|
| observedGeneration | integer | ObservedGeneration is taken from the Blueprint metadata. This is used to determine during reconcile whether reconcile was called because the desired state changed, or whether the status of the allocated resources should be checked. (Format: int64) | false |
| observedState | object | ObservedState includes information to be reported back to the FybrikApplication resource. It includes readiness and error indications, as well as user instructions. | false |
| releases | map[string]integer | Releases map each release to the observed generation of the blueprint containing this release. At the end of reconcile, each release should be mapped to the latest blueprint version or be uninstalled. | false |
| modules | map[string]object | ModulesState is a map which holds the status of each module. Its key is the instance name, which is the unique name for the deployed instance related to this workload. | true |

Plotter.status.blueprints[key].status.observedState

↩ Parent

ObservedState includes information to be reported back to the FybrikApplication resource. It includes readiness and error indications, as well as user instructions

| Name | Type | Description | Required |
|---|---|---|---|
| error | string | Error indicates that there has been an error orchestrating the modules, and provides the error message. | false |
| ready | boolean | Ready indicates that the modules have been orchestrated successfully and the data is ready for usage. | false |

Plotter.status.blueprints[key].status.modules[key]

↩ Parent

ObservedState represents a part of the generated Blueprint/Plotter resource status that allows update of FybrikApplication status

| Name | Type | Description | Required |
|---|---|---|---|
| error | string | Error indicates that there has been an error orchestrating the modules, and provides the error message. | false |
| ready | boolean | Ready indicates that the modules have been orchestrated successfully and the data is ready for usage. | false |

Plotter.status.conditions[index]

↩ Parent

Condition describes the state of a FybrikApplication at a certain point.

| Name | Type | Description | Required |
|---|---|---|---|
| message | string | Message contains the details of the current condition. | false |
| status | string | Status of the condition: true or false. | true |
| type | string | Type of the condition. | true |

Plotter.status.flows[key]

↩ Parent

FlowStatus includes information to be reported back to the FybrikApplication resource. It holds the status per data flow

| Name | Type | Description | Required |
|---|---|---|---|
| status | object | ObservedState includes information about the current flow. It includes readiness and error indications, as well as user instructions. | false |
| subFlows | map[string]object | | true |

Plotter.status.flows[key].status

↩ Parent

ObservedState includes information about the current flow. It includes readiness and error indications, as well as user instructions

| Name | Type | Description | Required |
|---|---|---|---|
| error | string | Error indicates that there has been an error orchestrating the modules, and provides the error message. | false |
| ready | boolean | Ready indicates that the modules have been orchestrated successfully and the data is ready for usage. | false |

Plotter.status.flows[key].subFlows[key]

↩ Parent

ObservedState represents a part of the generated Blueprint/Plotter resource status that allows update of FybrikApplication status

| Name | Type | Description | Required |
|---|---|---|---|
| error | string | Error indicates that there has been an error orchestrating the modules, and provides the error message. | false |
| ready | boolean | Ready indicates that the modules have been orchestrated successfully and the data is ready for usage. | false |

Plotter.status.observedState

↩ Parent

ObservedState includes information to be reported back to the FybrikApplication resource. It includes readiness and error indications, as well as user instructions

| Name | Type | Description | Required |
|---|---|---|---|
| error | string | Error indicates that there has been an error orchestrating the modules, and provides the error message. | false |
| ready | boolean | Ready indicates that the modules have been orchestrated successfully and the data is ready for usage. | false |

katalog.fybrik.io/v1alpha1

Resource Types:

Asset

↩ Parent

| Name | Type | Description | Required |
|---|---|---|---|
| apiVersion | string | katalog.fybrik.io/v1alpha1 | true |
| kind | string | Asset | true |
| metadata | object | Refer to the Kubernetes API documentation for the fields of the `metadata` field. | true |
| spec | object | | true |

Asset.spec

↩ Parent

| Name | Type | Description | Required |
|---|---|---|---|
| assetDetails | object | Asset details. | true |
| assetMetadata | object | | true |
| secretRef | object | Reference to a Secret resource holding credentials for this asset. | true |

Asset.spec.assetDetails

↩ Parent

Asset details

| Name | Type | Description | Required |
|---|---|---|---|
| dataFormat | string | | false |
| connection | object | Connection information. | true |

Asset.spec.assetDetails.connection

↩ Parent

Connection information

| Name | Type | Description | Required |
|---|---|---|---|
| db2 | object | | false |
| kafka | object | | false |
| s3 | object | Connection information for an S3-compatible object store. | false |
| type | enum | (Enum: s3, db2, kafka) | true |

Asset.spec.assetDetails.connection.db2

↩ Parent

| Name | Type | Description | Required |
|---|---|---|---|
| database | string | | false |
| port | string | | false |
| ssl | string | | false |
| table | string | | false |
| url | string | | false |

Asset.spec.assetDetails.connection.kafka

↩ Parent

| Name | Type | Description | Required |
|---|---|---|---|
| bootstrap_servers | string | | false |
| key_deserializer | string | | false |
| sasl_mechanism | string | | false |
| schema_registry | string | | false |
| security_protocol | string | | false |
| ssl_truststore | string | | false |
| ssl_truststore_password | string | | false |
| topic_name | string | | false |
| value_deserializer | string | | false |

Asset.spec.assetDetails.connection.s3

↩ Parent

Connection information for S3 compatible object store

| Name | Type | Description | Required |
|---|---|---|---|
| region | string | | false |
| bucket | string | | true |
| endpoint | string | | true |
| objectKey | string | | true |

Asset.spec.assetMetadata

↩ Parent

| Name | Type | Description | Required |
|---|---|---|---|
| componentsMetadata | map[string]object | Metadata for each component in the asset (e.g., column). | false |
| geography | string | | false |
| namedMetadata | map[string]string | | false |
| owner | string | | false |
| tags | []string | Tags associated with the asset. | false |

Asset.spec.assetMetadata.componentsMetadata[key]

↩ Parent

| Name | Type | Description | Required |
|---|---|---|---|
| componentType | string | | false |
| namedMetadata | map[string]string | Named terms that exist in the catalog taxonomy, and the values for these terms. For columns there will be a "SchemaDetails" key that includes the technical schema details for the column. TODO: Consider creating a special field for schema outside of metadata. | false |
| tags | []string | Tags can be any free text added to a component (no taxonomy). | false |

Asset.spec.secretRef

↩ Parent

Reference to a Secret resource holding credentials for this asset

| Name | Type | Description | Required |
|---|---|---|---|
| name | string | Name of the Secret resource (must exist in the same namespace). | true |
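Assembling the pieces above, an Asset for an S3-hosted parquet object might be sketched as follows; the names, bucket, and metadata values are hypothetical:

```yaml
apiVersion: katalog.fybrik.io/v1alpha1
kind: Asset
metadata:
  name: asset-1                   # hypothetical
spec:
  secretRef:
    name: asset-1-creds           # Secret in the same namespace
  assetDetails:
    dataFormat: parquet
    connection:
      type: s3
      s3:
        endpoint: s3.example.com
        bucket: my-bucket
        objectKey: data.parquet
        region: us-south
  assetMetadata:
    geography: us-south
    owner: data-owner@example.com
    tags:
      - finance
```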

motion.fybrik.io/v1alpha1

Resource Types:

BatchTransfer

↩ Parent

BatchTransfer is the Schema for the batchtransfers API

| Name | Type | Description | Required |
|---|---|---|---|
| apiVersion | string | motion.fybrik.io/v1alpha1 | true |
| kind | string | BatchTransfer | true |
| metadata | object | Refer to the Kubernetes API documentation for the fields of the `metadata` field. | true |
| spec | object | BatchTransferSpec defines the state of a BatchTransfer. The state includes the source/destination specification, a schedule, and the means by which data movement is to be conducted. The means is given as a Kubernetes job description. In addition, the state also contains a sketch of a transformation instruction. In future releases, the transformation description should be specified in a separate CRD. | false |
| status | object | BatchTransferStatus defines the observed state of BatchTransfer. This includes a reference to the job that implements the movement, as well as the last schedule time. What is missing: extended status information such as the number of records moved and technical metadata. | false |

BatchTransfer.spec

↩ Parent

BatchTransferSpec defines the state of a BatchTransfer. The state includes the source/destination specification, a schedule, and the means by which data movement is to be conducted. The means is given as a Kubernetes job description. In addition, the state also contains a sketch of a transformation instruction. In future releases, the transformation description should be specified in a separate CRD.

| Name | Type | Description | Required |
|---|---|---|---|
| failedJobHistoryLimit | integer | Maximal number of failed Kubernetes job objects that should be kept. This property will be defaulted by the webhook if not set. (Minimum: 0. Maximum: 20) | false |
| flowType | enum | Data flow type that specifies whether this is a stream or a batch workflow. (Enum: Batch, Stream) | false |
| image | string | Image that should be used for the actual batch job. This is usually a datamover image. This property will be defaulted by the webhook if not set. | false |
| imagePullPolicy | string | Image pull policy that should be used for the actual job. This property will be defaulted by the webhook if not set. | false |
| maxFailedRetries | integer | Maximal number of failed retries until the batch job should stop trying. This property will be defaulted by the webhook if not set. (Minimum: 0. Maximum: 10) | false |
| noFinalizer | boolean | Whether this batch job instance should have a finalizer or not. This property will be defaulted by the webhook if not set. | false |
| readDataType | enum | Data type of the data that is read from the source (log data or change data). (Enum: LogData, ChangeData) | false |
| schedule | string | Cron schedule if this BatchTransfer job should run on a regular schedule. Values are specified like cron job schedules. A good translation to human language can be found at https://crontab.guru/. | false |
| secretProviderRole | string | Secret provider role that should be used for the actual job. This property will be defaulted by the webhook if not set. | false |
| secretProviderURL | string | Secret provider URL that should be used for the actual job. This property will be defaulted by the webhook if not set. | false |
| spark | object | Optional Spark configuration for tuning. | false |
| successfulJobHistoryLimit | integer | Maximal number of successful Kubernetes job objects that should be kept. This property will be defaulted by the webhook if not set. (Minimum: 0. Maximum: 20) | false |
| suspend | boolean | If this batch job instance is run on a schedule, the regular schedule can be suspended with this property. This property will be defaulted by the webhook if not set. | false |
| transformation | []object | Transformations to be applied to the source data before writing to the destination. | false |
| writeDataType | enum | Data type of how the data should be written to the target (log data or change data). (Enum: LogData, ChangeData) | false |
| writeOperation | enum | Write operation that should be performed when writing (overwrite, append, update). Caution: some write operations are only available for batch and some only for stream. (Enum: Overwrite, Append, Update) | false |
| destination | object | Destination data store for this batch job. | true |
| source | object | Source data store for this batch job. | true |
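As a small sketch of the scheduling fields, the fragment below runs a batch transfer daily at 02:00 using standard cron syntax; the field values are illustrative:

```yaml
spec:
  flowType: Batch
  schedule: "0 2 * * *"       # standard cron: minute hour day-of-month month day-of-week
  writeOperation: Overwrite   # overwrite the target on each run
  suspend: false              # set true to pause the regular schedule
```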

BatchTransfer.spec.spark

↩ Parent

Optional Spark configuration for tuning

| Name | Type | Description | Required |
|---|---|---|---|
| appName | string | Name of the transaction. Mainly used for debugging and lineage tracking. | false |
| driverCores | integer | Number of cores that the driver should use. | false |
| driverMemory | integer | Memory that the driver should have. | false |
| executorCores | integer | Number of cores that each executor should have. | false |
| executorMemory | string | Memory that each executor should have. | false |
| image | string | Image to be used for executors. | false |
| imagePullPolicy | string | Image pull policy to be used for executors. | false |
| numExecutors | integer | Number of executors to be started. | false |
| options | map[string]string | Additional options for Spark configuration. | false |
| shufflePartitions | integer | Number of shuffle partitions for Spark. | false |

BatchTransfer.spec.transformation[index]

↩ Parent

to be refined...

| Name | Type | Description | Required |
|---|---|---|---|
| action | enum | Transformation action that should be performed. (Enum: RemoveColumns, EncryptColumns, DigestColumns, RedactColumns, SampleRows, FilterRows) | false |
| columns | []string | Columns that are involved in this action. This property is optional, as for some actions no columns have to be specified; e.g., filter is a row-based transformation. | false |
| name | string | Name of the transaction. Mainly used for debugging and lineage tracking. | false |
| options | map[string]string | Additional options for this transformation. | false |
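For example, a transformation entry that removes two columns before the data is written could be sketched as follows; the name and column names are hypothetical:

```yaml
transformation:
  - action: RemoveColumns
    name: drop-pii              # used for debugging and lineage tracking
    columns:
      - credit_card
      - ssn
```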

BatchTransfer.spec.destination

↩ Parent

Destination data store for this batch job

| Name | Type | Description | Required |
|---|---|---|---|
| cloudant | object | IBM Cloudant. Needs Cloudant legacy credentials. | false |
| database | object | Database data store. For the moment only Db2 is supported. | false |
| description | string | Description of the transfer in human-readable form that is displayed in kubectl get. If not provided, this will be filled in depending on the datastore that is specified. | false |
| kafka | object | Kafka data store. The supposed format within the given Kafka topic is a Confluent-compatible format stored as Avro. A schema registry needs to be specified as well. | false |
| s3 | object | An object store data store that is compatible with S3. This can be a COS bucket. | false |

BatchTransfer.spec.destination.cloudant

↩ Parent

IBM Cloudant. Requires Cloudant legacy credentials.

Name Type Description Required
password string Cloudant password. Can be retrieved from Vault if specified in the vault parameter, and is thus optional.
false
secretImport string Define a secret import definition.
false
username string Cloudant user. Can be retrieved from Vault if specified in the vault parameter, and is thus optional.
false
vault object Define secrets that are fetched from a Vault instance
false
database string Database to be read from/written to
true
host string Host of the Cloudant instance
true
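
A minimal cloudant block sketch with inline legacy credentials. Host, database, and credentials are placeholders; in practice the vault or secretImport mechanisms are preferable to inline secrets.

```yaml
cloudant:
  host: "myinstance.cloudantnosqldb.appdomain.cloud"  # placeholder host
  database: "customers"                               # placeholder database
  username: "legacy-user"       # optional if retrieved from Vault instead
  password: "legacy-password"   # optional if retrieved from Vault instead
```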

BatchTransfer.spec.destination.cloudant.vault

↩ Parent

Define secrets that are fetched from a Vault instance

Name Type Description Required
address string Address is Vault address
true
authPath string AuthPath is the path to the auth method, e.g. kubernetes
true
role string Role is the Vault role used for retrieving the credentials
true
secretPath string SecretPath is the path of the secret holding the credentials in Vault
true
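
All four vault fields are required whenever the block is used. A filled-in sketch, with placeholder address, role, and paths:

```yaml
vault:
  address: "https://vault.example.com:8200"  # placeholder Vault address
  authPath: "kubernetes"                     # mount path of the auth method
  role: "transfer-role"                      # placeholder Vault role
  secretPath: "secret/data/cloudant-creds"   # placeholder secret path
```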

BatchTransfer.spec.destination.database

↩ Parent

Database data store. For the moment only Db2 is supported.

Name Type Description Required
password string Database password. Can be retrieved from Vault if specified in the vault parameter, and is thus optional.
false
secretImport string Define a secret import definition.
false
user string Database user. Can be retrieved from Vault if specified in the vault parameter, and is thus optional.
false
vault object Define secrets that are fetched from a Vault instance
false
db2URL string URL of the Db2 instance in JDBC format. Currently supported SSL certificates are those signed by the IBM Intermediate CA or cloud-signed certificates.
true
table string Table to be read from/written to
true
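
A database block sketch for Db2; the JDBC URL, table, and Vault details are placeholders.

```yaml
database:
  db2URL: "jdbc:db2://db2.example.com:50001/BLUDB:sslConnection=true;"  # placeholder JDBC URL
  table: "MYSCHEMA.MYTABLE"     # placeholder table name
  vault:                        # credentials fetched from Vault instead of inline user/password
    address: "https://vault.example.com:8200"
    authPath: "kubernetes"
    role: "db2-role"
    secretPath: "secret/data/db2-creds"
```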

BatchTransfer.spec.destination.database.vault

↩ Parent

Define secrets that are fetched from a Vault instance

Name Type Description Required
address string Address is Vault address
true
authPath string AuthPath is the path to the auth method, e.g. kubernetes
true
role string Role is the Vault role used for retrieving the credentials
true
secretPath string SecretPath is the path of the secret holding the credentials in Vault
true

BatchTransfer.spec.destination.kafka

↩ Parent

Kafka data store. The data in the given Kafka topic is expected to be in a Confluent-compatible format stored as Avro. A schema registry needs to be specified as well.

Name Type Description Required
createSnapshot boolean Whether a snapshot of the topic should be created. Records in Kafka are stored as key-value pairs; updates/deletes for the same key are appended to the Kafka topic, and the last value for a given key is the valid one in a snapshot. When this property is true, only the last value per key is written; if it is false, all values are written out. As a CDC example: if true, a valid snapshot of the log stream is created; if false, the CDC stream is dumped as-is, like a change log.
false
dataFormat string Data format of the messages, e.g. parquet or csv. Please refer to the struct for allowed values.
false
keyDeserializer string Deserializer to be used for the keys of the topic
false
password string Kafka user password. Can be retrieved from Vault if specified in the vault parameter, and is thus optional.
false
saslMechanism string SASL mechanism to be used (e.g. PLAIN or SCRAM-SHA-512). SCRAM-SHA-512 is assumed by default if not specified.
false
schemaRegistryURL string URL of the schema registry. The registry has to be Confluent schema registry compatible.
false
secretImport string Define a secret import definition.
false
securityProtocol string Kafka security protocol, one of PLAINTEXT, SASL_PLAINTEXT, SASL_SSL, SSL. SASL_SSL is assumed by default if not specified.
false
sslTruststore string A truststore or certificate encoded as base64. The format can be JKS or PKCS12. A truststore can be specified like this or in a predefined Kubernetes secret.
false
sslTruststoreLocation string SSL truststore location.
false
sslTruststorePassword string SSL truststore password.
false
sslTruststoreSecret string Kubernetes secret that contains the SSL truststore. The format can be JKS or PKCS12. A truststore can be specified like this or directly in the sslTruststore property.
false
user string Kafka user name. Can be retrieved from Vault if specified in the vault parameter, and is thus optional.
false
valueDeserializer string Deserializer to be used for the values of the topic
false
vault object Define secrets that are fetched from a Vault instance
false
kafkaBrokers string Kafka broker URLs as a comma-separated list.
true
kafkaTopic string Kafka topic
true
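
A kafka block sketch that relies on the SASL_SSL/SCRAM-SHA-512 defaults; brokers, topic, registry URL, and secret names are placeholders.

```yaml
kafka:
  kafkaBrokers: "broker-1.example.com:9093,broker-2.example.com:9093"  # comma-separated list
  kafkaTopic: "customer-events"            # placeholder topic
  schemaRegistryURL: "https://schema-registry.example.com"
  createSnapshot: true                     # keep only the last value per key
  sslTruststoreSecret: "kafka-truststore"  # hypothetical Kubernetes secret with a JKS/PKCS12 truststore
  vault:                                   # user/password fetched from Vault
    address: "https://vault.example.com:8200"
    authPath: "kubernetes"
    role: "kafka-role"
    secretPath: "secret/data/kafka-creds"
```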

BatchTransfer.spec.destination.kafka.vault

↩ Parent

Define secrets that are fetched from a Vault instance

Name Type Description Required
address string Address is Vault address
true
authPath string AuthPath is the path to the auth method, e.g. kubernetes
true
role string Role is the Vault role used for retrieving the credentials
true
secretPath string SecretPath is the path of the secret holding the credentials in Vault
true

BatchTransfer.spec.destination.s3

↩ Parent

An object store that is compatible with S3. This can be a COS bucket.

Name Type Description Required
accessKey string Access key of the HMAC credentials that can access the given bucket. Can be retrieved from Vault if specified in the vault parameter, and is thus optional.
false
dataFormat string Data format of the objects in S3, e.g. parquet or csv. Please refer to the struct for allowed values.
false
partitionBy []string Defines the columns to partition the output by (for target data stores).
false
region string Region of the S3 service
false
secretImport string Define a secret import definition.
false
secretKey string Secret key of the HMAC credentials that can access the given bucket. Can be retrieved from Vault if specified in the vault parameter, and is thus optional.
false
vault object Define secrets that are fetched from a Vault instance
false
bucket string Bucket of the S3 service
true
endpoint string Endpoint of the S3 service
true
objectKey string Object key of the object in S3. Note that this is used as a prefix: all objects that have the given objectKey as a prefix are used as input.
true
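
An s3 destination block sketch with partitioned parquet output; endpoint, bucket, and column names are placeholders.

```yaml
s3:
  endpoint: "s3.example.cloud"   # placeholder endpoint
  region: "eu-de"                # placeholder region
  bucket: "target-bucket"        # placeholder bucket
  objectKey: "output/data"       # used as a prefix
  dataFormat: "parquet"
  partitionBy:                   # partition output by these columns
    - "year"
    - "month"
  vault:                         # HMAC credentials fetched from Vault
    address: "https://vault.example.com:8200"
    authPath: "kubernetes"
    role: "s3-role"
    secretPath: "secret/data/s3-hmac"
```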

BatchTransfer.spec.destination.s3.vault

↩ Parent

Define secrets that are fetched from a Vault instance

Name Type Description Required
address string Address is Vault address
true
authPath string AuthPath is the path to the auth method, e.g. kubernetes
true
role string Role is the Vault role used for retrieving the credentials
true
secretPath string SecretPath is the path of the secret holding the credentials in Vault
true

BatchTransfer.spec.source

↩ Parent

Source data store for this batch job

Name Type Description Required
cloudant object IBM Cloudant. Requires Cloudant legacy credentials.
false
database object Database data store. For the moment only Db2 is supported.
false
description string Description of the transfer in human-readable form, displayed in the kubectl get output. If not provided, it is filled in based on the specified data store.
false
kafka object Kafka data store. The data in the given Kafka topic is expected to be in a Confluent-compatible format stored as Avro. A schema registry needs to be specified as well.
false
s3 object An object store that is compatible with S3. This can be a COS bucket.
false

BatchTransfer.spec.source.cloudant

↩ Parent

IBM Cloudant. Requires Cloudant legacy credentials.

Name Type Description Required
password string Cloudant password. Can be retrieved from Vault if specified in the vault parameter, and is thus optional.
false
secretImport string Define a secret import definition.
false
username string Cloudant user. Can be retrieved from Vault if specified in the vault parameter, and is thus optional.
false
vault object Define secrets that are fetched from a Vault instance
false
database string Database to be read from/written to
true
host string Host of the Cloudant instance
true

BatchTransfer.spec.source.cloudant.vault

↩ Parent

Define secrets that are fetched from a Vault instance

Name Type Description Required
address string Address is Vault address
true
authPath string AuthPath is the path to the auth method, e.g. kubernetes
true
role string Role is the Vault role used for retrieving the credentials
true
secretPath string SecretPath is the path of the secret holding the credentials in Vault
true

BatchTransfer.spec.source.database

↩ Parent

Database data store. For the moment only Db2 is supported.

Name Type Description Required
password string Database password. Can be retrieved from Vault if specified in the vault parameter, and is thus optional.
false
secretImport string Define a secret import definition.
false
user string Database user. Can be retrieved from Vault if specified in the vault parameter, and is thus optional.
false
vault object Define secrets that are fetched from a Vault instance
false
db2URL string URL of the Db2 instance in JDBC format. Currently supported SSL certificates are those signed by the IBM Intermediate CA or cloud-signed certificates.
true
table string Table to be read from/written to
true

BatchTransfer.spec.source.database.vault

↩ Parent

Define secrets that are fetched from a Vault instance

Name Type Description Required
address string Address is Vault address
true
authPath string AuthPath is the path to the auth method, e.g. kubernetes
true
role string Role is the Vault role used for retrieving the credentials
true
secretPath string SecretPath is the path of the secret holding the credentials in Vault
true

BatchTransfer.spec.source.kafka

↩ Parent

Kafka data store. The data in the given Kafka topic is expected to be in a Confluent-compatible format stored as Avro. A schema registry needs to be specified as well.

Name Type Description Required
createSnapshot boolean Whether a snapshot of the topic should be created. Records in Kafka are stored as key-value pairs; updates/deletes for the same key are appended to the Kafka topic, and the last value for a given key is the valid one in a snapshot. When this property is true, only the last value per key is written; if it is false, all values are written out. As a CDC example: if true, a valid snapshot of the log stream is created; if false, the CDC stream is dumped as-is, like a change log.
false
dataFormat string Data format of the messages, e.g. parquet or csv. Please refer to the struct for allowed values.
false
keyDeserializer string Deserializer to be used for the keys of the topic
false
password string Kafka user password. Can be retrieved from Vault if specified in the vault parameter, and is thus optional.
false
saslMechanism string SASL mechanism to be used (e.g. PLAIN or SCRAM-SHA-512). SCRAM-SHA-512 is assumed by default if not specified.
false
schemaRegistryURL string URL of the schema registry. The registry has to be Confluent schema registry compatible.
false
secretImport string Define a secret import definition.
false
securityProtocol string Kafka security protocol, one of PLAINTEXT, SASL_PLAINTEXT, SASL_SSL, SSL. SASL_SSL is assumed by default if not specified.
false
sslTruststore string A truststore or certificate encoded as base64. The format can be JKS or PKCS12. A truststore can be specified like this or in a predefined Kubernetes secret.
false
sslTruststoreLocation string SSL truststore location.
false
sslTruststorePassword string SSL truststore password.
false
sslTruststoreSecret string Kubernetes secret that contains the SSL truststore. The format can be JKS or PKCS12. A truststore can be specified like this or directly in the sslTruststore property.
false
user string Kafka user name. Can be retrieved from Vault if specified in the vault parameter, and is thus optional.
false
valueDeserializer string Deserializer to be used for the values of the topic
false
vault object Define secrets that are fetched from a Vault instance
false
kafkaBrokers string Kafka broker URLs as a comma-separated list.
true
kafkaTopic string Kafka topic
true

BatchTransfer.spec.source.kafka.vault

↩ Parent

Define secrets that are fetched from a Vault instance

Name Type Description Required
address string Address is Vault address
true
authPath string AuthPath is the path to the auth method, e.g. kubernetes
true
role string Role is the Vault role used for retrieving the credentials
true
secretPath string SecretPath is the path of the secret holding the credentials in Vault
true

BatchTransfer.spec.source.s3

↩ Parent

An object store that is compatible with S3. This can be a COS bucket.

Name Type Description Required
accessKey string Access key of the HMAC credentials that can access the given bucket. Can be retrieved from Vault if specified in the vault parameter, and is thus optional.
false
dataFormat string Data format of the objects in S3, e.g. parquet or csv. Please refer to the struct for allowed values.
false
partitionBy []string Defines the columns to partition the output by (for target data stores).
false
region string Region of the S3 service
false
secretImport string Define a secret import definition.
false
secretKey string Secret key of the HMAC credentials that can access the given bucket. Can be retrieved from Vault if specified in the vault parameter, and is thus optional.
false
vault object Define secrets that are fetched from a Vault instance
false
bucket string Bucket of the S3 service
true
endpoint string Endpoint of the S3 service
true
objectKey string Object key of the object in S3. Note that this is used as a prefix: all objects that have the given objectKey as a prefix are used as input.
true

BatchTransfer.spec.source.s3.vault

↩ Parent

Define secrets that are fetched from a Vault instance

Name Type Description Required
address string Address is Vault address
true
authPath string AuthPath is the path to the auth method, e.g. kubernetes
true
role string Role is the Vault role used for retrieving the credentials
true
secretPath string SecretPath is the path of the secret holding the credentials in Vault
true

BatchTransfer.status

↩ Parent

BatchTransferStatus defines the observed state of BatchTransfer. This includes a reference to the job that implements the movement, as well as the last schedule time. Still missing is extended status information, such as the number of records moved and technical metadata.

Name Type Description Required
active object A pointer to the currently running job (or nil)
false
error string
false
lastCompleted object ObjectReference contains enough information to let you inspect or modify the referred object. New uses of this type are discouraged because of the difficulty of describing its usage when embedded in APIs:
1. Ignored fields. It includes many fields which are not generally honored. For instance, ResourceVersion and FieldPath are both very rarely valid in actual usage.
2. Invalid usage help. It is impossible to add specific help for individual usage. In most embedded usages, there are particular restrictions like "must refer only to types A and B" or "UID not honored" or "name must be restricted". Those cannot be well described when embedded.
3. Inconsistent validation. Because the usages are different, the validation rules differ by usage, which makes it hard for users to predict what will happen.
4. The fields are both imprecise and overly precise. Kind is not a precise mapping to a URL, which can produce ambiguity during interpretation and require a REST mapping. In most cases, the dependency is on the group,resource tuple, and the version of the actual struct is irrelevant.
5. We cannot easily change it. Because this type is embedded in many locations, updates to this type will affect numerous schemas.
Don't make new APIs embed an underspecified API type they do not control. Instead of using this type, create a locally provided and used type that is well-focused on your reference, such as ServiceReferences for admission registration: https://github.com/kubernetes/api/blob/release-1.17/admissionregistration/v1/types.go#L533
false
lastFailed object ObjectReference contains enough information to let you inspect or modify the referred object. New uses of this type are discouraged because of the difficulty of describing its usage when embedded in APIs:
1. Ignored fields. It includes many fields which are not generally honored. For instance, ResourceVersion and FieldPath are both very rarely valid in actual usage.
2. Invalid usage help. It is impossible to add specific help for individual usage. In most embedded usages, there are particular restrictions like "must refer only to types A and B" or "UID not honored" or "name must be restricted". Those cannot be well described when embedded.
3. Inconsistent validation. Because the usages are different, the validation rules differ by usage, which makes it hard for users to predict what will happen.
4. The fields are both imprecise and overly precise. Kind is not a precise mapping to a URL, which can produce ambiguity during interpretation and require a REST mapping. In most cases, the dependency is on the group,resource tuple, and the version of the actual struct is irrelevant.
5. We cannot easily change it. Because this type is embedded in many locations, updates to this type will affect numerous schemas.
Don't make new APIs embed an underspecified API type they do not control. Instead of using this type, create a locally provided and used type that is well-focused on your reference, such as ServiceReferences for admission registration: https://github.com/kubernetes/api/blob/release-1.17/admissionregistration/v1/types.go#L533
false
lastRecordTime string

Format: date-time
false
lastScheduleTime string Information on when the job was last successfully scheduled.

Format: date-time
false
lastSuccessTime string

Format: date-time
false
numRecords integer

Format: int64
Minimum: 0
false
status enum

Enum: STARTING, RUNNING, SUCCEEDED, FAILED
false

BatchTransfer.status.active

↩ Parent

A pointer to the currently running job (or nil)

Name Type Description Required
apiVersion string API version of the referent.
false
fieldPath string If referring to a piece of an object instead of an entire object, this string should contain a valid JSON/Go field access statement, such as desiredState.manifest.containers[2]. For example, if the object reference is to a container within a pod, this would take on a value like: "spec.containers{name}" (where "name" refers to the name of the container that triggered the event) or if no container name is specified "spec.containers[2]" (container with index 2 in this pod). This syntax is chosen only to have some well-defined way of referencing a part of an object. TODO: this design is not final and this field is subject to change in the future.
false
kind string Kind of the referent. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds
false
name string Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
false
namespace string Namespace of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/namespaces/
false
resourceVersion string Specific resourceVersion to which this reference is made, if any. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#concurrency-control-and-consistency
false
uid string UID of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#uids
false

BatchTransfer.status.lastCompleted

↩ Parent

ObjectReference contains enough information to let you inspect or modify the referred object. New uses of this type are discouraged because of the difficulty of describing its usage when embedded in APIs:
1. Ignored fields. It includes many fields which are not generally honored. For instance, ResourceVersion and FieldPath are both very rarely valid in actual usage.
2. Invalid usage help. It is impossible to add specific help for individual usage. In most embedded usages, there are particular restrictions like "must refer only to types A and B" or "UID not honored" or "name must be restricted". Those cannot be well described when embedded.
3. Inconsistent validation. Because the usages are different, the validation rules differ by usage, which makes it hard for users to predict what will happen.
4. The fields are both imprecise and overly precise. Kind is not a precise mapping to a URL, which can produce ambiguity during interpretation and require a REST mapping. In most cases, the dependency is on the group,resource tuple, and the version of the actual struct is irrelevant.
5. We cannot easily change it. Because this type is embedded in many locations, updates to this type will affect numerous schemas.
Don't make new APIs embed an underspecified API type they do not control. Instead of using this type, create a locally provided and used type that is well-focused on your reference, such as ServiceReferences for admission registration: https://github.com/kubernetes/api/blob/release-1.17/admissionregistration/v1/types.go#L533

Name Type Description Required
apiVersion string API version of the referent.
false
fieldPath string If referring to a piece of an object instead of an entire object, this string should contain a valid JSON/Go field access statement, such as desiredState.manifest.containers[2]. For example, if the object reference is to a container within a pod, this would take on a value like: "spec.containers{name}" (where "name" refers to the name of the container that triggered the event) or if no container name is specified "spec.containers[2]" (container with index 2 in this pod). This syntax is chosen only to have some well-defined way of referencing a part of an object. TODO: this design is not final and this field is subject to change in the future.
false
kind string Kind of the referent. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds
false
name string Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
false
namespace string Namespace of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/namespaces/
false
resourceVersion string Specific resourceVersion to which this reference is made, if any. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#concurrency-control-and-consistency
false
uid string UID of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#uids
false

BatchTransfer.status.lastFailed

↩ Parent

ObjectReference contains enough information to let you inspect or modify the referred object. New uses of this type are discouraged because of the difficulty of describing its usage when embedded in APIs:
1. Ignored fields. It includes many fields which are not generally honored. For instance, ResourceVersion and FieldPath are both very rarely valid in actual usage.
2. Invalid usage help. It is impossible to add specific help for individual usage. In most embedded usages, there are particular restrictions like "must refer only to types A and B" or "UID not honored" or "name must be restricted". Those cannot be well described when embedded.
3. Inconsistent validation. Because the usages are different, the validation rules differ by usage, which makes it hard for users to predict what will happen.
4. The fields are both imprecise and overly precise. Kind is not a precise mapping to a URL, which can produce ambiguity during interpretation and require a REST mapping. In most cases, the dependency is on the group,resource tuple, and the version of the actual struct is irrelevant.
5. We cannot easily change it. Because this type is embedded in many locations, updates to this type will affect numerous schemas.
Don't make new APIs embed an underspecified API type they do not control. Instead of using this type, create a locally provided and used type that is well-focused on your reference, such as ServiceReferences for admission registration: https://github.com/kubernetes/api/blob/release-1.17/admissionregistration/v1/types.go#L533

Name Type Description Required
apiVersion string API version of the referent.
false
fieldPath string If referring to a piece of an object instead of an entire object, this string should contain a valid JSON/Go field access statement, such as desiredState.manifest.containers[2]. For example, if the object reference is to a container within a pod, this would take on a value like: "spec.containers{name}" (where "name" refers to the name of the container that triggered the event) or if no container name is specified "spec.containers[2]" (container with index 2 in this pod). This syntax is chosen only to have some well-defined way of referencing a part of an object. TODO: this design is not final and this field is subject to change in the future.
false
kind string Kind of the referent. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds
false
name string Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
false
namespace string Namespace of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/namespaces/
false
resourceVersion string Specific resourceVersion to which this reference is made, if any. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#concurrency-control-and-consistency
false
uid string UID of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#uids
false

StreamTransfer

↩ Parent

StreamTransfer is the Schema for the streamtransfers API

Name Type Description Required
apiVersion string motion.fybrik.io/v1alpha1 true
kind string StreamTransfer true
metadata object Refer to the Kubernetes API documentation for the fields of the `metadata` field. true
spec object StreamTransferSpec defines the desired state of StreamTransfer
false
status object StreamTransferStatus defines the observed state of StreamTransfer
false

StreamTransfer.spec

↩ Parent

StreamTransferSpec defines the desired state of StreamTransfer

Name Type Description Required
flowType enum Data flow type that specifies whether this is a stream or a batch workflow

Enum: Batch, Stream
false
image string Image that should be used for the actual job. This is usually a datamover image. This property will be defaulted by the webhook if not set.
false
imagePullPolicy string Image pull policy that should be used for the actual job. This property will be defaulted by the webhook if not set.
false
noFinalizer boolean Whether this job instance should have a finalizer. This property will be defaulted by the webhook if not set.
false
readDataType enum Data type of the data that is read from source (log data or change data)

Enum: LogData, ChangeData
false
secretProviderRole string Secret provider role that should be used for the actual job. This property will be defaulted by the webhook if not set.
false
secretProviderURL string Secret provider URL that should be used for the actual job. This property will be defaulted by the webhook if not set.
false
suspend boolean If this job instance runs on a schedule, the regular schedule can be suspended with this property. This property will be defaulted by the webhook if not set.
false
transformation []object Transformations to be applied to the source data before writing to destination
false
triggerInterval string Interval at which the micro-batches of this stream should be triggered. The default is '5 seconds'.
false
writeDataType enum Data type in which the data should be written to the target (log data or change data)

Enum: LogData, ChangeData
false
writeOperation enum Write operation that should be performed when writing (overwrite, append, update). Caution: some write operations are only available for batch and some only for stream.

Enum: Overwrite, Append, Update
false
destination object Destination data store for this job
true
source object Source data store for this job
true
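
A minimal StreamTransfer sketch that reads a Kafka topic and continuously appends to S3; as everywhere in these examples, the endpoints and names are placeholders.

```yaml
apiVersion: motion.fybrik.io/v1alpha1
kind: StreamTransfer
metadata:
  name: sample-streamtransfer     # hypothetical name
spec:
  triggerInterval: "5 seconds"    # micro-batch trigger interval (the default)
  writeOperation: Append          # Append suits continuous streams
  source:
    kafka:
      kafkaBrokers: "broker-1.example.com:9093"  # placeholder broker list
      kafkaTopic: "customer-events"              # placeholder topic
      schemaRegistryURL: "https://schema-registry.example.com"
  destination:
    s3:
      endpoint: "s3.example.cloud"
      bucket: "stream-sink"
      objectKey: "events/data"
      dataFormat: "parquet"
```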

StreamTransfer.spec.transformation[index]

↩ Parent

A transformation step applied to the source data before it is written to the destination.

Name Type Description Required
action enum Transformation action that should be performed.

Enum: RemoveColumns, EncryptColumns, DigestColumns, RedactColumns, SampleRows, FilterRows
false
columns []string Columns that are involved in this action. This property is optional, as some actions do not require columns; a filter, for example, is a row-based transformation.
false
name string Name of the transformation. Mainly used for debugging and lineage tracking.
false
options map[string]string Additional options for this transformation.
false

StreamTransfer.spec.destination

↩ Parent

Destination data store for this job

Name Type Description Required
cloudant object IBM Cloudant. Requires Cloudant legacy credentials.
false
database object Database data store. For the moment only Db2 is supported.
false
description string Description of the transfer in human-readable form, displayed in the kubectl get output. If not provided, it is filled in based on the specified data store.
false
kafka object Kafka data store. The data in the given Kafka topic is expected to be in a Confluent-compatible format stored as Avro. A schema registry needs to be specified as well.
false
s3 object An object store that is compatible with S3. This can be a COS bucket.
false

StreamTransfer.spec.destination.cloudant

↩ Parent

IBM Cloudant. Requires Cloudant legacy credentials.

Name Type Description Required
password string Cloudant password. Can be retrieved from Vault if specified in the vault parameter, and is thus optional.
false
secretImport string Define a secret import definition.
false
username string Cloudant user. Can be retrieved from Vault if specified in the vault parameter, and is thus optional.
false
vault object Define secrets that are fetched from a Vault instance
false
database string Database to be read from/written to
true
host string Host of the Cloudant instance
true

StreamTransfer.spec.destination.cloudant.vault

↩ Parent

Define secrets that are fetched from a Vault instance

Name Type Description Required
address string Address is Vault address
true
authPath string AuthPath is the path to the auth method, e.g. kubernetes
true
role string Role is the Vault role used for retrieving the credentials
true
secretPath string SecretPath is the path of the secret holding the credentials in Vault
true

StreamTransfer.spec.destination.database

↩ Parent

Database data store. For the moment only Db2 is supported.

Name Type Description Required
password string Database password. Can be retrieved from Vault if specified in the vault parameter, and is thus optional.
false
secretImport string Define a secret import definition.
false
user string Database user. Can be retrieved from Vault if specified in the vault parameter, and is thus optional.
false
vault object Define secrets that are fetched from a Vault instance
false
db2URL string URL of the Db2 instance in JDBC format. Currently supported SSL certificates are those signed by the IBM Intermediate CA or cloud-signed certificates.
true
table string Table to be read from/written to
true

StreamTransfer.spec.destination.database.vault

↩ Parent

Define secrets that are fetched from a Vault instance

Name Type Description Required
address string Address is Vault address
true
authPath string AuthPath is the path to the auth method, e.g. kubernetes
true
role string Role is the Vault role used for retrieving the credentials
true
secretPath string SecretPath is the path of the secret holding the credentials in Vault
true

StreamTransfer.spec.destination.kafka

↩ Parent

Kafka data store. The data in the given Kafka topic is expected to be in a Confluent-compatible format stored as Avro. A schema registry needs to be specified as well.

Name Type Description Required
createSnapshot boolean Whether a snapshot of the topic should be created. Records in Kafka are stored as key-value pairs; updates/deletes for the same key are appended to the Kafka topic, and the last value for a given key is the valid one in a snapshot. When this property is true, only the last value per key is written; if it is false, all values are written out. As a CDC example: if true, a valid snapshot of the log stream is created; if false, the CDC stream is dumped as-is, like a change log.
false
dataFormat string Data format of the messages, e.g. parquet or csv. Please refer to the struct for allowed values.
false
keyDeserializer string Deserializer to be used for the keys of the topic
false
password string Kafka user password. Can be retrieved from Vault if specified in the vault parameter, and is thus optional.
false
saslMechanism string SASL mechanism to be used (e.g. PLAIN or SCRAM-SHA-512). SCRAM-SHA-512 is assumed by default if not specified.
false
schemaRegistryURL string URL of the schema registry. The registry has to be Confluent schema registry compatible.
false
secretImport string Define a secret import definition.
false
securityProtocol string Kafka security protocol, one of PLAINTEXT, SASL_PLAINTEXT, SASL_SSL, SSL. SASL_SSL is assumed by default if not specified.
false
sslTruststore string A truststore or certificate encoded as base64. The format can be JKS or PKCS12. A truststore can be specified like this or in a predefined Kubernetes secret.
false
sslTruststoreLocation string SSL truststore location.
false
sslTruststorePassword string SSL truststore password.
false
sslTruststoreSecret string Kubernetes secret that contains the SSL truststore. The format can be JKS or PKCS12. A truststore can be specified like this or directly in the sslTruststore property.
false
user string Kafka user name. Can be retrieved from Vault if specified in the vault parameter, and is thus optional.
false
valueDeserializer string Deserializer to be used for the values of the topic
false
vault object Define secrets that are fetched from a Vault instance
false
kafkaBrokers string Kafka broker URLs as a comma-separated list.
true
kafkaTopic string Kafka topic
true

StreamTransfer.spec.destination.kafka.vault

↩ Parent

Define secrets that are fetched from a Vault instance

Name Type Description Required
address string Address is Vault address
true
authPath string AuthPath is the path to the auth method, e.g. kubernetes
true
role string Role is the Vault role used for retrieving the credentials
true
secretPath string SecretPath is the path of the secret holding the credentials in Vault
true

StreamTransfer.spec.destination.s3

↩ Parent

An object store that is compatible with S3. This can be a COS bucket.

Name Type Description Required
accessKey string Access key of the HMAC credentials that can access the given bucket. Can be retrieved from Vault if specified in the vault parameter, and is thus optional.
false
dataFormat string Data format of the objects in S3, e.g. parquet or csv. Please refer to the struct for allowed values.
false
partitionBy []string Defines the columns to partition the output by (for target data stores).
false
region string Region of the S3 service
false
secretImport string Define a secret import definition.
false
secretKey string Secret key of the HMAC credentials that can access the given bucket. Can be retrieved from Vault if specified in the vault parameter, and is thus optional.
false
vault object Define secrets that are fetched from a Vault instance
false
bucket string Bucket of the S3 service
true
endpoint string Endpoint of the S3 service
true
objectKey string Object key of the object in S3. Note that this is used as a prefix: all objects that have the given objectKey as a prefix are used as input.
true

StreamTransfer.spec.destination.s3.vault

↩ Parent

Define secrets that are fetched from a Vault instance

Name Type Description Required
address string Address is Vault address
true
authPath string AuthPath is the path to the auth method, e.g. kubernetes
true
role string Role is the Vault role used for retrieving the credentials
true
secretPath string SecretPath is the path of the secret holding the credentials in Vault
true

StreamTransfer.spec.source

↩ Parent

Source data store for this job

Name Type Description Required
cloudant object IBM Cloudant. Requires Cloudant legacy credentials.
false
database object Database data store. For the moment only Db2 is supported.
false
description string Description of the transfer in human-readable form, displayed in the kubectl get output. If not provided, it is filled in based on the specified data store.
false
kafka object Kafka data store. The data in the given Kafka topic is expected to be in a Confluent-compatible format stored as Avro. A schema registry needs to be specified as well.
false
s3 object An object store that is compatible with S3. This can be a COS bucket.
false

StreamTransfer.spec.source.cloudant

↩ Parent

IBM Cloudant. Requires Cloudant legacy credentials.

Name Type Description Required
password string Cloudant password. Can be retrieved from Vault if specified in the vault parameter, and is thus optional.
false
secretImport string Define a secret import definition.
false
username string Cloudant user. Can be retrieved from Vault if specified in the vault parameter, and is thus optional.
false
vault object Define secrets that are fetched from a Vault instance
false
database string Database to be read from/written to
true
host string Host of the Cloudant instance
true

StreamTransfer.spec.source.cloudant.vault

↩ Parent

Define secrets that are fetched from a Vault instance

Name Type Description Required
address string Address is Vault address
true
authPath string AuthPath is the path to the auth method, e.g. kubernetes
true
role string Role is the Vault role used for retrieving the credentials
true
secretPath string SecretPath is the path of the secret holding the credentials in Vault
true

StreamTransfer.spec.source.database

↩ Parent

Database data store. For the moment only Db2 is supported.

Name Type Description Required
password string Database password. Can be retrieved from Vault if specified in the vault parameter, and is thus optional.
false
secretImport string Define a secret import definition.
false
user string Database user. Can be retrieved from Vault if specified in the vault parameter, and is thus optional.
false
vault object Define secrets that are fetched from a Vault instance
false
db2URL string URL of the Db2 instance in JDBC format. Currently supported SSL certificates are those signed by the IBM Intermediate CA or cloud-signed certificates.
true
table string Table to be read from/written to
true

StreamTransfer.spec.source.database.vault

↩ Parent

Define secrets that are fetched from a Vault instance

Name Type Description Required
address string Address is Vault address
true
authPath string AuthPath is the path to the auth method, e.g. kubernetes
true
role string Role is the Vault role used for retrieving the credentials
true
secretPath string SecretPath is the path of the secret holding the credentials in Vault
true

StreamTransfer.spec.source.kafka

↩ Parent

Kafka data store. The data in the given Kafka topic is expected to be in a Confluent-compatible format stored as Avro. A schema registry needs to be specified as well.

Name Type Description Required
createSnapshot boolean Whether a snapshot of the topic should be created. Records in Kafka are stored as key-value pairs; updates/deletes for the same key are appended to the Kafka topic, and the last value for a given key is the valid one in a snapshot. When this property is true, only the last value per key is written; if it is false, all values are written out. As a CDC example: if true, a valid snapshot of the log stream is created; if false, the CDC stream is dumped as-is, like a change log.
false
dataFormat string Data format of the messages, e.g. parquet or csv. Please refer to the struct for allowed values.
false
keyDeserializer string Deserializer to be used for the keys of the topic
false
password string Kafka user password. Can be retrieved from Vault if specified in the vault parameter, and is thus optional.
false
saslMechanism string SASL mechanism to be used (e.g. PLAIN or SCRAM-SHA-512). SCRAM-SHA-512 is assumed by default if not specified.
false
schemaRegistryURL string URL of the schema registry. The registry has to be Confluent schema registry compatible.
false
secretImport string Define a secret import definition.
false
securityProtocol string Kafka security protocol, one of PLAINTEXT, SASL_PLAINTEXT, SASL_SSL, SSL. SASL_SSL is assumed by default if not specified.
false
sslTruststore string A truststore or certificate encoded as base64. The format can be JKS or PKCS12. A truststore can be specified like this or in a predefined Kubernetes secret.
false
sslTruststoreLocation string SSL truststore location.
false
sslTruststorePassword string SSL truststore password.
false
sslTruststoreSecret string Kubernetes secret that contains the SSL truststore. The format can be JKS or PKCS12. A truststore can be specified like this or directly in the sslTruststore property.
false
user string Kafka user name. Can be retrieved from Vault if specified in the vault parameter, and is thus optional.
false
valueDeserializer string Deserializer to be used for the values of the topic
false
vault object Define secrets that are fetched from a Vault instance
false
kafkaBrokers string Kafka broker URLs as a comma-separated list.
true
kafkaTopic string Kafka topic
true

StreamTransfer.spec.source.kafka.vault

↩ Parent

Define secrets that are fetched from a Vault instance

Name Type Description Required
address string Address is Vault address
true
authPath string AuthPath is the path to the auth method, e.g. kubernetes
true
role string Role is the Vault role used for retrieving the credentials
true
secretPath string SecretPath is the path of the secret holding the credentials in Vault
true

StreamTransfer.spec.source.s3

↩ Parent

An object store that is compatible with S3. This can be a COS bucket.

Name Type Description Required
accessKey string Access key of the HMAC credentials that can access the given bucket. Can be retrieved from Vault if specified in the vault parameter, and is thus optional.
false
dataFormat string Data format of the objects in S3, e.g. parquet or csv. Please refer to the struct for allowed values.
false
partitionBy []string Defines the columns to partition the output by (for target data stores).
false
region string Region of the S3 service
false
secretImport string Define a secret import definition.
false
secretKey string Secret key of the HMAC credentials that can access the given bucket. Can be retrieved from Vault if specified in the vault parameter, and is thus optional.
false
vault object Define secrets that are fetched from a Vault instance
false
bucket string Bucket of the S3 service
true
endpoint string Endpoint of the S3 service
true
objectKey string Object key of the object in S3. Note that this is used as a prefix: all objects that have the given objectKey as a prefix are used as input.
true

StreamTransfer.spec.source.s3.vault

↩ Parent

Define secrets that are fetched from a Vault instance

Name Type Description Required
address string Address is Vault address
true
authPath string AuthPath is the path to the auth method, e.g. kubernetes
true
role string Role is the Vault role used for retrieving the credentials
true
secretPath string SecretPath is the path of the secret holding the credentials in Vault
true

StreamTransfer.status

↩ Parent

StreamTransferStatus defines the observed state of StreamTransfer

Name Type Description Required
active object A pointer to the currently running job (or nil)
false
error string
false
status enum

Enum: STARTING, RUNNING, STOPPED, FAILING
false

StreamTransfer.status.active

↩ Parent

A pointer to the currently running job (or nil)

Name Type Description Required
apiVersion string API version of the referent.
false
fieldPath string If referring to a piece of an object instead of an entire object, this string should contain a valid JSON/Go field access statement, such as desiredState.manifest.containers[2]. For example, if the object reference is to a container within a pod, this would take on a value like: "spec.containers{name}" (where "name" refers to the name of the container that triggered the event) or if no container name is specified "spec.containers[2]" (container with index 2 in this pod). This syntax is chosen only to have some well-defined way of referencing a part of an object. TODO: this design is not final and this field is subject to change in the future.
false
kind string Kind of the referent. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds
false
name string Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
false
namespace string Namespace of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/namespaces/
false
resourceVersion string Specific resourceVersion to which this reference is made, if any. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#concurrency-control-and-consistency
false
uid string UID of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#uids
false