API Reference
Packages:
app.fybrik.io/v1alpha1
Resource Types:
Blueprint
Blueprint is the Schema for the blueprints API
Name | Type | Description | Required |
---|---|---|---|
apiVersion | string | app.fybrik.io/v1alpha1 | true |
kind | string | Blueprint | true |
metadata | object | Refer to the Kubernetes API documentation for the fields of the `metadata` field. | true |
spec | object | BlueprintSpec defines the desired state of Blueprint, which defines the components of the workload's data path that run in a particular cluster. In a single-cluster environment there is one Blueprint. In a multi-cluster environment there is one Blueprint per cluster per workload (FybrikApplication). | false |
status | object | BlueprintStatus defines the observed state of Blueprint. This includes readiness, error message, and indicators for the Kubernetes resources owned by the Blueprint for cleanup and status monitoring. | false |
Blueprint.spec
BlueprintSpec defines the desired state of Blueprint, which defines the components of the workload's data path that run in a particular cluster. In a single cluster environment there is one blueprint. In a multi-cluster environment there is one Blueprint per cluster per workload (FybrikApplication).
Name | Type | Description | Required |
---|---|---|---|
cluster | string | Cluster indicates the cluster on which the Blueprint runs. | true |
modules | map[string]object | Modules is a map containing the modules that indicate the data path components that run in this cluster. The map key is InstanceName, the unique name for the deployed instance related to this workload. | true |
Blueprint.spec.modules[key]
BlueprintModule is a copy of a FybrikModule Custom Resource. It contains the information necessary to instantiate a datapath component, including the parameters relevant for the particular workload.
Name | Type | Description | Required |
---|---|---|---|
arguments | object | Arguments are the input parameters for a specific instance of a module. | false |
assetIds | []string | AssetIDs indicate the assets processed by this module. Included so that asset status can be tracked alongside module status in the future. | false |
chart | object | Chart contains the location of the Helm chart, with information detailing how to deploy it. | true |
name | string | Name of the FybrikModule on which this is based. | true |
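The fields above can be illustrated with a minimal Blueprint sketch; every name and value here is invented for illustration and is not part of the schema:

```yaml
apiVersion: app.fybrik.io/v1alpha1
kind: Blueprint
metadata:
  name: my-app-blueprint          # illustrative name
spec:
  cluster: cluster-us-east        # cluster on which the Blueprint runs
  modules:
    read-module-instance:         # map key: InstanceName, unique per deployed instance
      name: arrow-flight-module   # the FybrikModule this entry is based on
      chart:
        name: ghcr.io/example/arrow-flight-chart:0.1.0   # illustrative chart location
      assetIds:
        - "asset-1"
      arguments: {}               # populated per the arguments schema
```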
Blueprint.spec.modules[key].arguments
Arguments are the input parameters for a specific instance of a module.
Name | Type | Description | Required |
---|---|---|---|
appSelector | object | AppSelector is used to identify the user workload. It is obtained from the FybrikApplication spec. | false |
copy | object | CopyArgs are parameters specific to modules that copy data from one data store to another. | false |
labels | map[string]string | Labels of the FybrikApplication. | false |
read | []object | ReadArgs are parameters specific to modules that enable an application to read data. | false |
write | []object | WriteArgs are parameters specific to modules that enable an application to write data. | false |
Blueprint.spec.modules[key].arguments.appSelector
Application selector is used to identify the user workload. It is obtained from FybrikApplication spec.
Name | Type | Description | Required |
---|---|---|---|
matchExpressions | []object | matchExpressions is a list of label selector requirements. The requirements are ANDed. | false |
matchLabels | map[string]string | matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is "key", the operator is "In", and the values array contains only "value". The requirements are ANDed. | false |
Blueprint.spec.modules[key].arguments.appSelector.matchExpressions[index]
A label selector requirement is a selector that contains values, a key, and an operator that relates the key and values.
Name | Type | Description | Required |
---|---|---|---|
values | []string | values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch. | false |
key | string | key is the label key that the selector applies to. | true |
operator | string | operator represents a key's relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist. | true |
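A selector that combines both forms might look like the following sketch (all labels and values are invented for illustration):

```yaml
appSelector:
  matchLabels:
    app: my-workload              # illustrative label
  matchExpressions:
    - key: environment
      operator: In
      values: ["dev", "staging"]  # non-empty, since the operator is In
    - key: legacy
      operator: DoesNotExist      # values must be empty for Exists/DoesNotExist
```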
Blueprint.spec.modules[key].arguments.copy
CopyArgs are parameters specific to modules that copy data from one data store to another.
Name | Type | Description | Required |
---|---|---|---|
transformations | []object | Transformations are different types of processing that may be done to the data as it is copied. | false |
assetID | string | AssetID identifies the asset to be used for accessing the data when it is ready. It is copied from the FybrikApplication resource. | true |
destination | object | Destination is the data store to which the data will be copied. | true |
source | object | Source is the data store where the data currently resides. | true |
Blueprint.spec.modules[key].arguments.copy.destination
Destination is the data store to which the data will be copied
Name | Type | Description | Required |
---|---|---|---|
connection | object | Connection holds the relevant details for accessing the data (url, table, ssl, etc.). | true |
format | string | Format represents the data format (e.g. parquet) as received from catalog connectors. | true |
vault | map[string]object | Vault holds details for retrieving credentials by the modules from the Vault store. It is a map so that different credentials can be stored for the different DataFlow operations. | true |
Blueprint.spec.modules[key].arguments.copy.destination.vault[key]
Holds details for retrieving credentials from Vault store.
Name | Type | Description | Required |
---|---|---|---|
address | string | Address is the Vault address. | true |
authPath | string | AuthPath is the path to the auth method, e.g. kubernetes. | true |
role | string | Role is the Vault role used for retrieving the credentials. | true |
secretPath | string | SecretPath is the path of the secret holding the credentials in Vault. | true |
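Because `vault` is a map keyed by DataFlow operation, a destination can carry separate credentials per operation. A sketch with invented addresses and paths:

```yaml
vault:
  write:                          # map key: the DataFlow operation
    address: https://vault.example.com:8200       # illustrative Vault address
    authPath: /v1/auth/kubernetes/login           # illustrative auth method path
    role: module                                  # illustrative Vault role
    secretPath: /v1/secret/data/datasets/asset-1  # illustrative secret path
```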
Blueprint.spec.modules[key].arguments.copy.source
Source is the data store where the data currently resides
Name | Type | Description | Required |
---|---|---|---|
connection | object | Connection holds the relevant details for accessing the data (url, table, ssl, etc.). | true |
format | string | Format represents the data format (e.g. parquet) as received from catalog connectors. | true |
vault | map[string]object | Vault holds details for retrieving credentials by the modules from the Vault store. It is a map so that different credentials can be stored for the different DataFlow operations. | true |
Blueprint.spec.modules[key].arguments.copy.source.vault[key]
Holds details for retrieving credentials from Vault store.
Name | Type | Description | Required |
---|---|---|---|
address | string | Address is the Vault address. | true |
authPath | string | AuthPath is the path to the auth method, e.g. kubernetes. | true |
role | string | Role is the Vault role used for retrieving the credentials. | true |
secretPath | string | SecretPath is the path of the secret holding the credentials in Vault. | true |
Blueprint.spec.modules[key].arguments.read[index]
ReadModuleArgs define the input parameters for modules that read data from location A
Name | Type | Description | Required |
---|---|---|---|
transformations | []object | Transformations are different types of processing that may be done to the data. | false |
assetID | string | AssetID identifies the asset to be used for accessing the data when it is ready. It is copied from the FybrikApplication resource. | true |
source | object | Source of the read path module. | true |
Blueprint.spec.modules[key].arguments.read[index].source
Source of the read path module
Name | Type | Description | Required |
---|---|---|---|
connection | object | Connection holds the relevant details for accessing the data (url, table, ssl, etc.). | true |
format | string | Format represents the data format (e.g. parquet) as received from catalog connectors. | true |
vault | map[string]object | Vault holds details for retrieving credentials by the modules from the Vault store. It is a map so that different credentials can be stored for the different DataFlow operations. | true |
Blueprint.spec.modules[key].arguments.read[index].source.vault[key]
Holds details for retrieving credentials from Vault store.
Name | Type | Description | Required |
---|---|---|---|
address | string | Address is the Vault address. | true |
authPath | string | AuthPath is the path to the auth method, e.g. kubernetes. | true |
role | string | Role is the Vault role used for retrieving the credentials. | true |
secretPath | string | SecretPath is the path of the secret holding the credentials in Vault. | true |
Blueprint.spec.modules[key].arguments.write[index]
WriteModuleArgs define the input parameters for modules that write data to location B
Name | Type | Description | Required |
---|---|---|---|
transformations | []object | Transformations are different types of processing that may be done to the data as it is written. | false |
assetID | string | AssetID identifies the asset to be used for accessing the data when it is ready. It is copied from the FybrikApplication resource. | true |
destination | object | Destination is the data store to which the data will be written. | true |
Blueprint.spec.modules[key].arguments.write[index].destination
Destination is the data store to which the data will be written
Name | Type | Description | Required |
---|---|---|---|
connection | object | Connection holds the relevant details for accessing the data (url, table, ssl, etc.). | true |
format | string | Format represents the data format (e.g. parquet) as received from catalog connectors. | true |
vault | map[string]object | Vault holds details for retrieving credentials by the modules from the Vault store. It is a map so that different credentials can be stored for the different DataFlow operations. | true |
Blueprint.spec.modules[key].arguments.write[index].destination.vault[key]
Holds details for retrieving credentials from Vault store.
Name | Type | Description | Required |
---|---|---|---|
address | string | Address is the Vault address. | true |
authPath | string | AuthPath is the path to the auth method, e.g. kubernetes. | true |
role | string | Role is the Vault role used for retrieving the credentials. | true |
secretPath | string | SecretPath is the path of the secret holding the credentials in Vault. | true |
Blueprint.spec.modules[key].chart
Chart contains the location of the Helm chart, with information detailing how to deploy it
Name | Type | Description | Required |
---|---|---|---|
chartPullSecret | string | Name of the secret containing the Helm registry credentials. | false |
values | map[string]string | Values to pass to the Helm chart installation. | false |
name | string | Name of the Helm chart. | true |
Blueprint.status
BlueprintStatus defines the observed state of Blueprint. This includes readiness, error message, and indicators for the Kubernetes resources owned by the Blueprint for cleanup and status monitoring.
Name | Type | Description | Required |
---|---|---|---|
observedGeneration | integer | ObservedGeneration is taken from the Blueprint metadata. It is used to determine during reconcile whether reconcile was called because the desired state changed, or whether the status of the allocated resources should be checked. Format: int64 | false |
observedState | object | ObservedState includes information to be reported back to the FybrikApplication resource. It includes readiness and error indications, as well as user instructions. | false |
releases | map[string]integer | Releases maps each release to the observed generation of the blueprint containing that release. At the end of reconcile, each release should be mapped to the latest blueprint version or be uninstalled. | false |
modules | map[string]object | ModulesState is a map holding the status of each module. Its key is the instance name, the unique name for the deployed instance related to this workload. | true |
Blueprint.status.observedState
ObservedState includes information to be reported back to the FybrikApplication resource. It includes readiness and error indications, as well as user instructions.
Name | Type | Description | Required |
---|---|---|---|
error | string | Error indicates that there has been an error orchestrating the modules, and provides the error message. | false |
ready | boolean | Ready indicates that the modules have been orchestrated successfully and the data is ready for usage. | false |
Blueprint.status.modules[key]
ObservedState represents a part of the generated Blueprint/Plotter resource status that allows update of FybrikApplication status
Name | Type | Description | Required |
---|---|---|---|
error | string | Error indicates that there has been an error orchestrating the modules, and provides the error message. | false |
ready | boolean | Ready indicates that the modules have been orchestrated successfully and the data is ready for usage. | false |
FybrikApplication
FybrikApplication provides information about the application being used by a Data Scientist, the nature of the processing, and the data sets that the Data Scientist has chosen for processing by the application. The FybrikApplication controller (aka pilot) obtains instructions regarding any governance related changes that must be performed on the data, identifies the modules capable of performing such changes, and finally generates the Blueprint which defines the secure runtime environment and all the components in it. This runtime environment provides the Data Scientist's application with access to the data requested in a secure manner and without having to provide any credentials for the data sets. The credentials are obtained automatically by the manager from an external credential management system, which may or may not be part of a data catalog.
Name | Type | Description | Required |
---|---|---|---|
apiVersion | string | app.fybrik.io/v1alpha1 | true |
kind | string | FybrikApplication | true |
metadata | object | Refer to the Kubernetes API documentation for the fields of the `metadata` field. | true |
spec | object | FybrikApplicationSpec defines the desired state of FybrikApplication. | false |
status | object | FybrikApplicationStatus defines the observed state of FybrikApplication. | false |
FybrikApplication.spec
FybrikApplicationSpec defines the desired state of FybrikApplication.
Name | Type | Description | Required |
---|---|---|---|
secretRef | string | SecretRef points to the secret that holds credentials for each system the user has been authenticated with. The secret is deployed in the FybrikApplication namespace. | false |
selector | object | Selector connects the resource to the application. Application labels should match the labels in the selector. For some flows the selector may not be used. | false |
appInfo | map[string]string | AppInfo contains information describing the reasons for the processing that will be done by the Data Scientist's application. | true |
data | []object | Data contains the identifiers of the data to be used by the Data Scientist's application, the protocol used to access it, and the format expected. | true |
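A minimal FybrikApplication combining these fields might be sketched as follows; the names, labels, and interface values are all illustrative and not taken from the schema:

```yaml
apiVersion: app.fybrik.io/v1alpha1
kind: FybrikApplication
metadata:
  name: my-notebook               # illustrative name
spec:
  selector:
    workloadSelector:
      matchLabels:
        app: my-notebook          # must match the user workload's labels
  appInfo:
    intent: fraud-detection       # free-form key/value pairs describing the processing
  data:
    - dataSetID: "catalog/asset-1"  # identifier from the data catalog (illustrative)
      requirements:
        interface:
          protocol: fybrik-arrow-flight   # illustrative protocol value
          dataformat: arrow               # illustrative format value
```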
FybrikApplication.spec.selector
Selector connects the resource to the application. Application labels should match the labels in the selector. For some flows the selector may not be used.
Name | Type | Description | Required |
---|---|---|---|
clusterName | string | Cluster name. | false |
workloadSelector | object | WorkloadSelector connects the resource to the application. Application labels should match the labels in the selector. | true |
FybrikApplication.spec.selector.workloadSelector
WorkloadSelector connects the resource to the application. Application labels should match the labels in the selector.
Name | Type | Description | Required |
---|---|---|---|
matchExpressions | []object | matchExpressions is a list of label selector requirements. The requirements are ANDed. | false |
matchLabels | map[string]string | matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is "key", the operator is "In", and the values array contains only "value". The requirements are ANDed. | false |
FybrikApplication.spec.selector.workloadSelector.matchExpressions[index]
A label selector requirement is a selector that contains values, a key, and an operator that relates the key and values.
Name | Type | Description | Required |
---|---|---|---|
values | []string | values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch. | false |
key | string | key is the label key that the selector applies to. | true |
operator | string | operator represents a key's relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist. | true |
FybrikApplication.spec.data[index]
DataContext indicates a dataset chosen by the Data Scientist to be used by the application, and includes information about the data format and the technologies used by the application to access the data.
Name | Type | Description | Required |
---|---|---|---|
catalogService | string | CatalogService represents the catalog service for accessing the requested dataset. If not specified, the enterprise catalog service will be used. | false |
dataSetID | string | DataSetID is a unique identifier of the dataset chosen from the data catalog for processing by the data user application. | true |
requirements | object | Requirements from the system. | true |
FybrikApplication.spec.data[index].requirements
Requirements from the system
Name | Type | Description | Required |
---|---|---|---|
copy | object | CopyRequirements include the requirements for copying the data. | false |
interface | object | Interface indicates the protocol and format expected by the data user. | true |
FybrikApplication.spec.data[index].requirements.copy
CopyRequirements include the requirements for copying the data.
Name | Type | Description | Required |
---|---|---|---|
catalog | object | Catalog indicates that the data asset must be cataloged. | false |
required | boolean | Required indicates that the data must be copied. | false |
FybrikApplication.spec.data[index].requirements.copy.catalog
Catalog indicates that the data asset must be cataloged.
Name | Type | Description | Required |
---|---|---|---|
catalogID | string | CatalogID specifies the catalog where the data will be cataloged. | false |
service | string | CatalogService specifies the data catalog service into which the data will be cataloged. | false |
FybrikApplication.spec.data[index].requirements.interface
Interface indicates the protocol and format expected by the data user
Name | Type | Description | Required |
---|---|---|---|
dataformat | string | DataFormat defines the data format type. | false |
protocol | string | Protocol defines the interface protocol used for data transactions. | true |
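Putting the requirements together, a data entry that asks for a cataloged copy could be sketched as follows (catalog ID, protocol, and format values are invented):

```yaml
requirements:
  copy:
    required: true                # the data must be copied
    catalog:
      catalogID: enterprise-catalog   # illustrative catalog identifier
  interface:
    protocol: s3                  # illustrative protocol expected by the data user
    dataformat: parquet           # illustrative data format
```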
FybrikApplication.status
FybrikApplicationStatus defines the observed state of FybrikApplication.
Name | Type | Description | Required |
---|---|---|---|
assetStates | map[string]object | AssetStates provides a status per asset. | false |
errorMessage | string | ErrorMessage indicates that an error has occurred during reconcile, unrelated to a specific asset. | false |
generated | object | Generated resource identifier. | false |
observedGeneration | integer | ObservedGeneration is taken from the FybrikApplication metadata. It is used to determine during reconcile whether reconcile was called because the desired state changed, or whether the Blueprint status changed. Format: int64 | false |
provisionedStorage | map[string]object | ProvisionedStorage maps a dataset (identified by AssetID) to the newly provisioned bucket. It allows the FybrikApplication controller to manage buckets in case the spec has been modified, an error has occurred, or a delete event has been received. ProvisionedStorage has the information required to register the dataset once the owned plotter resource is ready. | false |
ready | boolean | Ready is true if all specified assets are either ready to be used or are denied access. | false |
validApplication | string | ValidApplication indicates whether the FybrikApplication is valid given the defined taxonomy. | false |
validatedGeneration | integer | ValidatedGeneration is the version of the FybrikApplication that has been validated against the defined taxonomy. Format: int64 | false |
FybrikApplication.status.assetStates[key]
AssetState defines the observed state of an asset
Name | Type | Description | Required |
---|---|---|---|
catalogedAsset | string | CatalogedAsset provides a new asset identifier after being registered in the enterprise catalog. | false |
conditions | []object | Conditions indicate the asset state (Ready, Deny, Error). | false |
endpoint | object | Endpoint provides the endpoint spec from which the asset will be served to the application. | false |
FybrikApplication.status.assetStates[key].conditions[index]
Condition describes the state of a FybrikApplication at a certain point.
Name | Type | Description | Required |
---|---|---|---|
message | string | Message contains the details of the current condition. | false |
status | string | Status of the condition: true or false. | true |
type | string | Type of the condition. | true |
FybrikApplication.status.assetStates[key].endpoint
Endpoint provides the endpoint spec from which the asset will be served to the application
Name | Type | Description | Required |
---|---|---|---|
hostname | string | Hostname is the hostname to connect to for reaching a module-exposed service. By default this equals "{{.Release.Name}}.{{.Release.Namespace}}" of the module. Module developers can override the default behavior by providing a template that may use the ".Release.Name", ".Release.Namespace" and ".Values.labels" variables. | false |
port | integer | Format: int32 | true |
scheme | string | For example: http, https, grpc, grpc+tls, jdbc:oracle:thin:@, etc. | true |
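An asset's endpoint in the status might then look like the following sketch; the release name, namespace, port, and scheme are invented for illustration:

```yaml
endpoint:
  hostname: my-release.fybrik-blueprints   # default "{{.Release.Name}}.{{.Release.Namespace}}"
  port: 80
  scheme: grpc
```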
FybrikApplication.status.generated
Generated resource identifier
Name | Type | Description | Required |
---|---|---|---|
appVersion | integer | Version of the FybrikApplication that generated this resource. Format: int64 | true |
kind | string | Kind of the resource (Blueprint, Plotter). | true |
name | string | Name of the resource. | true |
namespace | string | Namespace of the resource. | true |
FybrikApplication.status.provisionedStorage[key]
DatasetDetails contain dataset connection and metadata required to register this dataset in the enterprise catalog
Name | Type | Description | Required |
---|---|---|---|
datasetRef | string | Reference to a Dataset resource containing the request to provision storage. | false |
details | object | Dataset information. | false |
secretRef | string | Reference to a secret where the credentials are stored. | false |
FybrikModule
FybrikModule is a description of an injectable component: the parameters it requires, as well as the specification of how to instantiate such a component. It is used as metadata only; there is no status and no reconciliation.
Name | Type | Description | Required |
---|---|---|---|
apiVersion | string | app.fybrik.io/v1alpha1 | true |
kind | string | FybrikModule | true |
metadata | object | Refer to the Kubernetes API documentation for the fields of the `metadata` field. | true |
spec | object | FybrikModuleSpec contains the information common to all modules, which are components that process, load, write, audit, or monitor the data used by the data scientist's application. | true |
FybrikModule.spec
FybrikModuleSpec contains the information common to all modules, which are components that process, load, write, audit, or monitor the data used by the data scientist's application.
Name | Type | Description | Required |
---|---|---|---|
dependencies | []object | Other components that must be installed in order for this module to work. | false |
description | string | An explanation of what this module does. | false |
pluginType | string | PluginType indicates the plugin technology used to invoke the capabilities, e.g. vault, fybrik-wasm. Should be provided if type is plugin. | false |
statusIndicators | []object | StatusIndicators allow checking the status of a non-standard resource that cannot be computed by helm/kstatus. | false |
capabilities | []object | Capabilities declares what this module knows how to do and the types of data it knows how to handle. The key to the map is a CapabilityType string. | true |
chart | object | Reference to a Helm chart that allows deployment of the resources required for this module. | true |
type | string | May be one of service, config or plugin. Service: the control plane deploys the component that performs the capability. Config: another pre-installed service performs the capability, and the deployed module configures it for the particular workload or dataset. Plugin: this module performs a capability as part of another service or module rather than as a stand-alone module. | true |
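A skeletal FybrikModule tying these fields together; the module name, chart reference, and interface values are illustrative, not prescribed by the schema:

```yaml
apiVersion: app.fybrik.io/v1alpha1
kind: FybrikModule
metadata:
  name: arrow-flight-module       # illustrative name
spec:
  type: service                   # the control plane deploys the component
  description: Serves data to the workload   # free-form explanation
  chart:
    name: ghcr.io/example/arrow-flight-chart:0.1.0   # illustrative chart location
  capabilities:
    - capability: read
      scope: asset                # the assumed default when omitted
      supportedInterfaces:
        - source:                 # read: each entry has source populated
            protocol: s3          # illustrative protocol
            dataformat: parquet   # illustrative format
```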
FybrikModule.spec.dependencies[index]
Dependency details another component on which this module relies, i.e. a prerequisite.
Name | Type | Description | Required |
---|---|---|---|
name | string | Name is the name of the dependent component. | true |
type | enum | Type provides information used in determining how to instantiate the component. Enum: module, connector, feature | true |
FybrikModule.spec.statusIndicators[index]
ResourceStatusIndicator is used to determine the status of an orchestrated resource
Name | Type | Description | Required |
---|---|---|---|
errorMessage | string | ErrorMessage specifies the resource field to check for an error, e.g. status.errorMsg. | false |
failureCondition | string | FailureCondition specifies a condition that indicates resource failure. It uses the Kubernetes label selection syntax (https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/). | false |
kind | string | Kind provides information about the resource kind. | true |
successCondition | string | SuccessCondition specifies a condition that indicates that the resource is ready. It uses the Kubernetes label selection syntax (https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/). | true |
FybrikModule.spec.capabilities[index]
Capability declares what this module knows how to do and the types of data it knows how to handle
Name | Type | Description | Required |
---|---|---|---|
actions | []object | Actions are the data transformations that the module supports. | false |
api | object | API indicates to the application how to access the capabilities provided by the module. | false |
plugins | []object | Plugins enable the module to add libraries to perform actions rather than implementing them by itself. | false |
scope | enum | Scope indicates at what level the capability is used: workload, asset, or cluster. If not indicated, it is assumed to be asset. Enum: asset, workload, cluster | false |
supportedInterfaces | []object | Copy should have one or more instances in the list, and its content should have source and sink. Read should have one or more instances in the list, each with source populated. Write should have one or more instances in the list, each with sink populated. This field may not be required if not handling data. | false |
capability | enum | Capability declares what this module knows how to do, e.g. read, write, transform. Enum: copy, read, write, transform | true |
FybrikModule.spec.capabilities[index].api
API indicates to the application how to access the capabilities provided by the module.
Name | Type | Description | Required |
---|---|---|---|
dataformat | string | DataFormat defines the data format type. | false |
endpoint | object | EndpointSpec is used both by the module creator and by the status of the FybrikApplication. | true |
protocol | string | Protocol defines the interface protocol used for data transactions. | true |
FybrikModule.spec.capabilities[index].api.endpoint
EndpointSpec is used both by the module creator and by the status of the FybrikApplication.
Name | Type | Description | Required |
---|---|---|---|
hostname | string | Hostname is the hostname to connect to for reaching a module-exposed service. By default this equals "{{.Release.Name}}.{{.Release.Namespace}}" of the module. Module developers can override the default behavior by providing a template that may use the ".Release.Name", ".Release.Namespace" and ".Values.labels" variables. | false |
port | integer | Format: int32 | true |
scheme | string | For example: http, https, grpc, grpc+tls, jdbc:oracle:thin:@, etc. | true |
FybrikModule.spec.capabilities[index].plugins[index]
Name | Type | Description | Required |
---|---|---|---|
dataFormat | string | DataFormat indicates the format of data the plugin knows how to process. | true |
pluginType | string | PluginType indicates the technology used for the module and the plugin to interact. The supported values should come from the module taxonomy. Examples of such mechanisms are vault plugins, wasm, etc. | true |
FybrikModule.spec.capabilities[index].supportedInterfaces[index]
ModuleInOut specifies the protocol and format of the data input and output by the module - if any
Name | Type | Description | Required |
---|---|---|---|
sink | object | Sink specifies the output data protocol and format. | false |
source | object | Source specifies the input data protocol and format. | false |
FybrikModule.spec.capabilities[index].supportedInterfaces[index].sink
Sink specifies the output data protocol and format
Name | Type | Description | Required |
---|---|---|---|
dataformat | string | DataFormat defines the data format type. | false |
protocol | string | Protocol defines the interface protocol used for data transactions. | true |
FybrikModule.spec.capabilities[index].supportedInterfaces[index].source
Source specifies the input data protocol and format
Name | Type | Description | Required |
---|---|---|---|
dataformat | string | DataFormat defines the data format type. | false |
protocol | string | Protocol defines the interface protocol used for data transactions. | true |
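For example, a copy capability would populate both source and sink in each supportedInterfaces entry, while read and write populate only one side. The protocol and format values below are invented for illustration:

```yaml
capabilities:
  - capability: copy
    supportedInterfaces:
      - source:                   # where the data is read from
          protocol: db2           # illustrative source protocol
          dataformat: table       # illustrative source format
        sink:                     # where the copy is written to
          protocol: s3            # illustrative sink protocol
          dataformat: parquet     # illustrative sink format
```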
FybrikModule.spec.chart
Reference to a Helm chart that allows deployment of the resources required for this module
Name | Type | Description | Required |
---|---|---|---|
chartPullSecret | string | Name of the secret containing the Helm registry credentials. | false |
values | map[string]string | Values to pass to the Helm chart installation. | false |
name | string | Name of the Helm chart. | true |
FybrikStorageAccount
FybrikStorageAccount defines a storage account used for copying data. Only S3-based storage is supported. It contains the endpoint, the regions, and a reference to the credentials. The owner of the asset is responsible for storing the credentials.
Name | Type | Description | Required |
---|---|---|---|
apiVersion | string | app.fybrik.io/v1alpha1 | true |
kind | string | FybrikStorageAccount | true |
metadata | object | Refer to the Kubernetes API documentation for the fields of the `metadata` field. | true |
spec | object | FybrikStorageAccountSpec defines the desired state of FybrikStorageAccount. | false |
status | object | FybrikStorageAccountStatus defines the observed state of FybrikStorageAccount. | false |
FybrikStorageAccount.spec
FybrikStorageAccountSpec defines the desired state of FybrikStorageAccount
Name | Type | Description | Required |
---|---|---|---|
endpoint | string |
Endpoint |
true |
regions | []string |
Regions |
true |
secretRef | string |
Name of a Kubernetes secret deployed in the control plane. The secret includes the secretKey and accessKey credentials for the S3 bucket |
true |
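Putting the fields above together, a minimal FybrikStorageAccount manifest might look as follows; the endpoint, region name, and secret name are illustrative values, not defaults:

```yaml
apiVersion: app.fybrik.io/v1alpha1
kind: FybrikStorageAccount
metadata:
  name: storage-account-sample      # hypothetical name
spec:
  endpoint: "http://s3.eu-gb.cloud-object-storage.appdomain.cloud"  # illustrative S3 endpoint
  regions:
    - eu-gb                         # illustrative region
  secretRef: bucket-credentials     # control-plane secret holding accessKey and secretKey
```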
Plotter
Plotter is the Schema for the plotters API
Name | Type | Description | Required |
---|---|---|---|
apiVersion | string | app.fybrik.io/v1alpha1 | true |
kind | string | Plotter | true |
metadata | object | Refer to the Kubernetes API documentation for the fields of the `metadata` field. | true |
spec | object |
PlotterSpec defines the desired state of Plotter, which is applied in a multi-clustered environment. Plotter declares what needs to be installed and where (as blueprints running on remote clusters), providing the Data Scientist's application with secure and governed access to the data requested in the FybrikApplication. |
false |
status | object |
PlotterStatus defines the observed state of Plotter This includes readiness, error message, and indicators received from blueprint resources owned by the Plotter for cleanup and status monitoring |
false |
Plotter.spec
PlotterSpec defines the desired state of Plotter, which is applied in a multi-clustered environment. Plotter declares what needs to be installed and where (as blueprints running on remote clusters), providing the Data Scientist's application with secure and governed access to the data requested in the FybrikApplication.
Name | Type | Description | Required |
---|---|---|---|
appSelector | object |
Selector connects the resource to the application. Application labels should match the labels in the selector. For some flows the selector may not be used. |
false |
assets | map[string]object |
Assets is a map holding information about the assets. The key is the assetID |
true |
flows | []object |
|
true |
templates | map[string]object |
Templates is a map holding the templates used in the steps of this plotter. The key is the template name |
true |
Plotter.spec.appSelector
Selector connects the resource to the application. Application labels should match the labels in the selector. For some flows the selector may not be used.
Name | Type | Description | Required |
---|---|---|---|
clusterName | string |
Cluster name |
false |
workloadSelector | object |
WorkloadSelector connects the resource to the application. Application labels should match the labels in the selector. |
true |
Plotter.spec.appSelector.workloadSelector
WorkloadSelector connects the resource to the application. Application labels should match the labels in the selector.
Name | Type | Description | Required |
---|---|---|---|
matchExpressions | []object |
matchExpressions is a list of label selector requirements. The requirements are ANDed. |
false |
matchLabels | map[string]string |
matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is "key", the operator is "In", and the values array contains only "value". The requirements are ANDed. |
false |
Plotter.spec.appSelector.workloadSelector.matchExpressions[index]
A label selector requirement is a selector that contains values, a key, and an operator that relates the key and values.
Name | Type | Description | Required |
---|---|---|---|
values | []string |
values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch. |
false |
key | string |
key is the label key that the selector applies to. |
true |
operator | string |
operator represents a key's relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist. |
true |
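As a sketch, the appSelector fields above combine like this (cluster name, label keys, and values are hypothetical; matchLabels and matchExpressions requirements are ANDed):

```yaml
appSelector:
  clusterName: cluster-us-east        # hypothetical cluster name
  workloadSelector:
    matchLabels:
      app: my-workload                # hypothetical application label
    matchExpressions:
      - key: environment              # hypothetical label key
        operator: In
        values:
          - dev
          - staging
```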
Plotter.spec.assets[key]
AssetDetails holds the details of an asset used in the FybrikApplication. In addition to assets declared in the FybrikApplication, the assets map also contains assets allocated by the control plane in order to serve the FybrikApplication.
Name | Type | Description | Required |
---|---|---|---|
advertisedAssetId | string |
AdvertisedAssetID links this asset to the corresponding asset in the FybrikApplication and is used by user-facing services |
false |
assetDetails | object |
DataStore contains the details for accessing the data, as sent by catalog connectors. Credentials for accessing the data are stored in Vault, in the location represented by the Vault property. |
true |
Plotter.spec.assets[key].assetDetails
DataStore contains the details for accessing the data, as sent by catalog connectors. Credentials for accessing the data are stored in Vault, in the location represented by the Vault property.
Name | Type | Description | Required |
---|---|---|---|
connection | object |
Connection has the relevant details for accessing the data (url, table, ssl, etc.) |
true |
format | string |
Format represents data format (e.g. parquet) as received from catalog connectors |
true |
vault | map[string]object |
Holds the details used by the modules to retrieve credentials from the Vault store. It is a map so that different credentials can be stored for the different DataFlow operations. |
true |
Plotter.spec.assets[key].assetDetails.vault[key]
Holds details for retrieving credentials from Vault store.
Name | Type | Description | Required |
---|---|---|---|
address | string |
Address is Vault address |
true |
authPath | string |
AuthPath is the path to the auth method, e.g. kubernetes |
true |
role | string |
Role is the Vault role used for retrieving the credentials |
true |
secretPath | string |
SecretPath is the path of the secret holding the Credentials in Vault |
true |
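A sketch of a vault map keyed by DataFlow operation; the "read" key, addresses, role, and secret path below are illustrative, not prescribed values:

```yaml
vault:
  read:                                     # map key per DataFlow operation (illustrative)
    address: "http://vault.fybrik-system:8200"
    authPath: "/v1/auth/kubernetes/login"   # path to the kubernetes auth method
    role: "module"                          # hypothetical Vault role
    secretPath: "/v1/secret/data/my-asset"  # hypothetical secret location
```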
Plotter.spec.flows[index]
Flows is the list of data flows driven by the FybrikApplication. Each element in the list holds the flow of the data requested in the FybrikApplication.
Name | Type | Description | Required |
---|---|---|---|
assetId | string |
AssetID indicates the data set being used in this data flow |
true |
flowType | string |
Type of the flow (e.g. read) |
true |
name | string |
Name of the flow |
true |
subFlows | []object |
|
true |
Plotter.spec.flows[index].subFlows[index]
SubFlows is a list of data flows which originate from the same data asset but are triggered differently (e.g., one upon the init trigger and one upon the workload trigger)
Name | Type | Description | Required |
---|---|---|---|
flowType | string |
Type of the flow (e.g. read) |
true |
name | string |
Name of the SubFlow |
true |
steps | [][]object |
Steps defines a series of sequential/parallel data flow steps. The first dimension represents parallel data flows; the second, sequential components within the same parallel data flow. |
true |
triggers | []enum |
Triggers |
true |
Plotter.spec.flows[index].subFlows[index].steps[index][index]
DataFlowStep contains details on a single data flow step
Name | Type | Description | Required |
---|---|---|---|
parameters | object |
Step parameters |
false |
cluster | string |
Name of the cluster this step is executed on |
true |
name | string |
Name of the step |
true |
template | string |
Template is the name of the template used to execute the step. The full details of the template can be extracted from the Plotter.spec.templates field. |
true |
Plotter.spec.flows[index].subFlows[index].steps[index][index].parameters
Step parameters
Name | Type | Description | Required |
---|---|---|---|
action | []object |
Actions are the data transformations that the module supports |
false |
api | object |
Service holds information for accessing a module instance |
false |
sink | object |
StepSink holds information about where the target data will be written: the assetID of an asset specified in the FybrikApplication, or of an asset created by the fybrik control plane |
false |
source | object |
StepSource is the source of this step: it can be an assetID or an endpoint of another step |
false |
Plotter.spec.flows[index].subFlows[index].steps[index][index].parameters.api
Service holds information for accessing a module instance
Name | Type | Description | Required |
---|---|---|---|
endpoint | object |
EndpointSpec is used both by the module creator and by the status of the fybrikapplication |
true |
format | string |
Format represents data format (e.g. parquet) as received from catalog connectors |
true |
Plotter.spec.flows[index].subFlows[index].steps[index][index].parameters.api.endpoint
EndpointSpec is used both by the module creator and by the status of the fybrikapplication
Name | Type | Description | Required |
---|---|---|---|
hostname | string |
Hostname is the hostname to connect to when accessing a module's exposed service. By default this equals "{{.Release.Name}}.{{.Release.Namespace}}" of the module. Module developers can override the default behavior by providing a template that may use the ".Release.Name", ".Release.Namespace" and ".Values.labels" variables. |
false |
port | integer |
Format: int32 |
true |
scheme | string |
For example: http, https, grpc, grpc+tls, jdbc:oracle:thin:@ etc |
true |
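A sketch of an api step parameter with its endpoint; hostname, port, scheme, and format values are illustrative:

```yaml
api:
  endpoint:
    hostname: "my-module.fybrik-blueprints"   # hypothetical exposed-service hostname
    port: 80
    scheme: grpc
  format: arrow                               # hypothetical data format
```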
Plotter.spec.flows[index].subFlows[index].steps[index][index].parameters.sink
StepSink holds information about where the target data will be written: the assetID of an asset specified in the FybrikApplication, or of an asset created by the fybrik control plane
Name | Type | Description | Required |
---|---|---|---|
assetId | string |
AssetID identifies the target asset of this step |
true |
Plotter.spec.flows[index].subFlows[index].steps[index][index].parameters.source
StepSource is the source of this step: it can be an assetID or an endpoint of another step
Name | Type | Description | Required |
---|---|---|---|
api | object |
Service holds information for accessing a module instance |
false |
assetId | string |
AssetID identifies the source asset of this step |
false |
Plotter.spec.flows[index].subFlows[index].steps[index][index].parameters.source.api
Service holds information for accessing a module instance
Name | Type | Description | Required |
---|---|---|---|
endpoint | object |
EndpointSpec is used both by the module creator and by the status of the fybrikapplication |
true |
format | string |
Format represents data format (e.g. parquet) as received from catalog connectors |
true |
Plotter.spec.flows[index].subFlows[index].steps[index][index].parameters.source.api.endpoint
EndpointSpec is used both by the module creator and by the status of the fybrikapplication
Name | Type | Description | Required |
---|---|---|---|
hostname | string |
Hostname is the hostname to connect to when accessing a module's exposed service. By default this equals "{{.Release.Name}}.{{.Release.Namespace}}" of the module. Module developers can override the default behavior by providing a template that may use the ".Release.Name", ".Release.Namespace" and ".Values.labels" variables. |
false |
port | integer |
Format: int32 |
true |
scheme | string |
For example: http, https, grpc, grpc+tls, jdbc:oracle:thin:@ etc |
true |
Plotter.spec.templates[key]
Template contains basic information about the modules required to serve the FybrikApplication, e.g., the module Helm chart name.
Name | Type | Description | Required |
---|---|---|---|
name | string |
Name of the template |
false |
modules | []object |
Modules is a list of dependent modules, e.g., if a plugin module is used then the service module it plugs into should appear first in the modules list of the same template. If the modules list contains more than one module, the first module in the list is referred to as the "primary module", to which all the parameters of this template are sent. |
true |
Plotter.spec.templates[key].modules[index]
ModuleInfo is a copy of FybrikModule Custom Resource. It contains information to instantiate resource of type FybrikModule.
Name | Type | Description | Required |
---|---|---|---|
scope | enum |
Scope indicates at what level the capability is used: workload, asset, or cluster. If not indicated, it is assumed to be asset. Enum: asset, workload, cluster |
false |
chart | object |
Chart contains the information needed to use helm to install the capability |
true |
name | string |
Name of the module |
true |
type | string |
May be one of service, config or plugin. Service: the control plane deploys the component that performs the capability. Config: another pre-installed service performs the capability, and the deployed module configures it for the particular workload or dataset. Plugin: this module performs a capability as part of another service or module rather than as a stand-alone module. |
true |
Plotter.spec.templates[key].modules[index].chart
Chart contains the information needed to use helm to install the capability
Name | Type | Description | Required |
---|---|---|---|
chartPullSecret | string |
Name of secret containing helm registry credentials |
false |
values | map[string]string |
Values to pass to helm chart installation |
false |
name | string |
Name of helm chart |
true |
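A sketch of a single-module template entry in Plotter.spec.templates; the template key, module name, and chart reference below are hypothetical:

```yaml
templates:
  read:                                  # template name (map key)
    name: read
    modules:
      - name: arrow-flight-module        # primary module: receives the template parameters
        type: service                    # deployed by the control plane
        scope: workload
        chart:
          name: ghcr.io/example/arrow-flight-module-chart:latest  # hypothetical chart reference
```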
Plotter.status
PlotterStatus defines the observed state of Plotter. This includes readiness, error message, and indicators received from blueprint resources owned by the Plotter for cleanup and status monitoring
Name | Type | Description | Required |
---|---|---|---|
assets | map[string]object |
Assets is a map containing the status per asset. The key of this map is assetId |
false |
blueprints | map[string]object |
|
false |
conditions | []object |
Conditions represent the possible error and failure conditions |
false |
flows | map[string]object |
Flows is a map containing the status of each flow. The key is the flow name |
false |
observedGeneration | integer |
ObservedGeneration is taken from the Plotter metadata. This is used to determine during reconcile whether reconcile was called because the desired state changed, or whether status of the allocated blueprints should be checked. Format: int64 |
false |
observedState | object |
ObservedState includes information to be reported back to the FybrikApplication resource. It includes readiness and error indications, as well as user instructions |
false |
readyTimestamp | string |
Format: date-time |
false |
Plotter.status.assets[key]
ObservedState represents a part of the generated Blueprint/Plotter resource status that allows update of FybrikApplication status
Name | Type | Description | Required |
---|---|---|---|
error | string |
Error indicates that an error occurred while orchestrating the modules, and provides the error message |
false |
ready | boolean |
Ready indicates that the modules have been orchestrated successfully and the data is ready for usage |
false |
Plotter.status.blueprints[key]
MetaBlueprint defines blueprint metadata (name, namespace) and status
Name | Type | Description | Required |
---|---|---|---|
name | string |
|
true |
namespace | string |
|
true |
status | object |
BlueprintStatus defines the observed state of Blueprint. This includes readiness, error message, and indicators for the Kubernetes resources owned by the Blueprint for cleanup and status monitoring |
true |
Plotter.status.blueprints[key].status
BlueprintStatus defines the observed state of Blueprint. This includes readiness, error message, and indicators for the Kubernetes resources owned by the Blueprint for cleanup and status monitoring
Name | Type | Description | Required |
---|---|---|---|
observedGeneration | integer |
ObservedGeneration is taken from the Blueprint metadata. This is used to determine during reconcile whether reconcile was called because the desired state changed, or whether status of the allocated resources should be checked. Format: int64 |
false |
observedState | object |
ObservedState includes information to be reported back to the FybrikApplication resource. It includes readiness and error indications, as well as user instructions |
false |
releases | map[string]integer |
Releases map each release to the observed generation of the blueprint containing this release. At the end of reconcile, each release should be mapped to the latest blueprint version or be uninstalled. |
false |
modules | map[string]object |
ModulesState is a map which holds the status of each module. Its key is the instance name, which is the unique name for the deployed instance related to this workload |
true |
Plotter.status.blueprints[key].status.observedState
ObservedState includes information to be reported back to the FybrikApplication resource. It includes readiness and error indications, as well as user instructions
Name | Type | Description | Required |
---|---|---|---|
error | string |
Error indicates that an error occurred while orchestrating the modules, and provides the error message |
false |
ready | boolean |
Ready indicates that the modules have been orchestrated successfully and the data is ready for usage |
false |
Plotter.status.blueprints[key].status.modules[key]
ObservedState represents a part of the generated Blueprint/Plotter resource status that allows update of FybrikApplication status
Name | Type | Description | Required |
---|---|---|---|
error | string |
Error indicates that an error occurred while orchestrating the modules, and provides the error message |
false |
ready | boolean |
Ready indicates that the modules have been orchestrated successfully and the data is ready for usage |
false |
Plotter.status.conditions[index]
Condition describes the state of a FybrikApplication at a certain point.
Name | Type | Description | Required |
---|---|---|---|
message | string |
Message contains the details of the current condition |
false |
status | string |
Status of the condition: true or false |
true |
type | string |
Type of the condition |
true |
Plotter.status.flows[key]
FlowStatus includes information to be reported back to the FybrikApplication resource. It holds the status per data flow
Name | Type | Description | Required |
---|---|---|---|
status | object |
ObservedState includes information about the current flow. It includes readiness and error indications, as well as user instructions |
false |
subFlows | map[string]object |
|
true |
Plotter.status.flows[key].status
ObservedState includes information about the current flow. It includes readiness and error indications, as well as user instructions
Name | Type | Description | Required |
---|---|---|---|
error | string |
Error indicates that an error occurred while orchestrating the modules, and provides the error message |
false |
ready | boolean |
Ready indicates that the modules have been orchestrated successfully and the data is ready for usage |
false |
Plotter.status.flows[key].subFlows[key]
ObservedState represents a part of the generated Blueprint/Plotter resource status that allows update of FybrikApplication status
Name | Type | Description | Required |
---|---|---|---|
error | string |
Error indicates that an error occurred while orchestrating the modules, and provides the error message |
false |
ready | boolean |
Ready indicates that the modules have been orchestrated successfully and the data is ready for usage |
false |
Plotter.status.observedState
ObservedState includes information to be reported back to the FybrikApplication resource. It includes readiness and error indications, as well as user instructions
Name | Type | Description | Required |
---|---|---|---|
error | string |
Error indicates that an error occurred while orchestrating the modules, and provides the error message |
false |
ready | boolean |
Ready indicates that the modules have been orchestrated successfully and the data is ready for usage |
false |
katalog.fybrik.io/v1alpha1
Resource Types:
Asset
Name | Type | Description | Required |
---|---|---|---|
apiVersion | string | katalog.fybrik.io/v1alpha1 | true |
kind | string | Asset | true |
metadata | object | Refer to the Kubernetes API documentation for the fields of the `metadata` field. | true |
spec | object |
|
true |
Asset.spec
Name | Type | Description | Required |
---|---|---|---|
assetDetails | object |
Asset details |
true |
assetMetadata | object |
|
true |
secretRef | object |
Reference to a Secret resource holding credentials for this asset |
true |
Asset.spec.assetDetails
Asset details
Name | Type | Description | Required |
---|---|---|---|
dataFormat | string |
|
false |
connection | object |
Connection information |
true |
Asset.spec.assetDetails.connection
Connection information
Name | Type | Description | Required |
---|---|---|---|
db2 | object |
|
false |
kafka | object |
|
false |
s3 | object |
Connection information for S3 compatible object store |
false |
type | enum |
Enum: s3, db2, kafka |
true |
Asset.spec.assetDetails.connection.db2
Name | Type | Description | Required |
---|---|---|---|
database | string |
|
false |
port | string |
|
false |
ssl | string |
|
false |
table | string |
|
false |
url | string |
|
false |
Asset.spec.assetDetails.connection.kafka
Name | Type | Description | Required |
---|---|---|---|
bootstrap_servers | string |
|
false |
key_deserializer | string |
|
false |
sasl_mechanism | string |
|
false |
schema_registry | string |
|
false |
security_protocol | string |
|
false |
ssl_truststore | string |
|
false |
ssl_truststore_password | string |
|
false |
topic_name | string |
|
false |
value_deserializer | string |
|
false |
Asset.spec.assetDetails.connection.s3
Connection information for S3 compatible object store
Name | Type | Description | Required |
---|---|---|---|
region | string |
|
false |
bucket | string |
|
true |
endpoint | string |
|
true |
objectKey | string |
|
true |
Asset.spec.assetMetadata
Name | Type | Description | Required |
---|---|---|---|
componentsMetadata | map[string]object |
Metadata for each component in the asset (e.g., a column) |
false |
geography | string |
|
false |
namedMetadata | map[string]string |
|
false |
owner | string |
|
false |
tags | []string |
Tags associated with the asset |
false |
Asset.spec.assetMetadata.componentsMetadata[key]
Name | Type | Description | Required |
---|---|---|---|
componentType | string |
|
false |
namedMetadata | map[string]string |
Named terms that exist in the catalog taxonomy, together with the values for these terms. For columns there will be a "SchemaDetails" key that includes the technical schema details for the column |
false |
tags | []string |
Tags - can be any free text added to a component (no taxonomy) |
false |
Asset.spec.secretRef
Reference to a Secret resource holding credentials for this asset
Name | Type | Description | Required |
---|---|---|---|
name | string |
Name of the Secret resource (must exist in the same namespace) |
true |
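Combining the Asset fields above, a minimal manifest for an S3-backed CSV object might look like this; all names, endpoints, and metadata values are illustrative:

```yaml
apiVersion: katalog.fybrik.io/v1alpha1
kind: Asset
metadata:
  name: sample-asset                      # hypothetical asset name
spec:
  secretRef:
    name: sample-asset-creds              # Secret in the same namespace
  assetDetails:
    dataFormat: csv
    connection:
      type: s3
      s3:
        endpoint: "http://s3.example.cloud"   # illustrative endpoint
        bucket: sample-bucket
        objectKey: data.csv
  assetMetadata:
    geography: theshire                   # illustrative geography
    owner: data-owner
    tags:
      - finance
```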
motion.fybrik.io/v1alpha1
Resource Types:
BatchTransfer
BatchTransfer is the Schema for the batchtransfers API
Name | Type | Description | Required |
---|---|---|---|
apiVersion | string | motion.fybrik.io/v1alpha1 | true |
kind | string | BatchTransfer | true |
metadata | object | Refer to the Kubernetes API documentation for the fields of the `metadata` field. | true |
spec | object |
BatchTransferSpec defines the state of a BatchTransfer. The state includes source/destination specification, a schedule and the means by which data movement is to be conducted. The means is given as a Kubernetes Job description. In addition, the state also contains a sketch of a transformation instruction. In future releases, the transformation description should be specified in a separate CRD. |
false |
status | object |
BatchTransferStatus defines the observed state of BatchTransfer. This includes a reference to the job that implements the movement, as well as the last schedule time. Not yet included: extended status information such as the number of records moved and technical metadata |
false |
BatchTransfer.spec
BatchTransferSpec defines the state of a BatchTransfer. The state includes source/destination specification, a schedule and the means by which data movement is to be conducted. The means is given as a Kubernetes Job description. In addition, the state also contains a sketch of a transformation instruction. In future releases, the transformation description should be specified in a separate CRD.
Name | Type | Description | Required |
---|---|---|---|
failedJobHistoryLimit | integer |
Maximum number of failed Kubernetes Job objects that should be kept. This property will be defaulted by the webhook if not set. Minimum: 0 Maximum: 20 |
false |
flowType | enum |
Data flow type that specifies if this is a stream or a batch workflow Enum: Batch, Stream |
false |
image | string |
Image that should be used for the actual batch job. This is usually a datamover image. This property will be defaulted by the webhook if not set. |
false |
imagePullPolicy | string |
Image pull policy that should be used for the actual job. This property will be defaulted by the webhook if not set. |
false |
maxFailedRetries | integer |
Maximum number of failed retries before the batch job stops trying. This property will be defaulted by the webhook if not set. Minimum: 0 Maximum: 10 |
false |
noFinalizer | boolean |
Whether this batch job instance should have a finalizer. This property will be defaulted by the webhook if not set. |
false |
readDataType | enum |
Data type of the data that is read from source (log data or change data) Enum: LogData, ChangeData |
false |
schedule | string |
Cron schedule if this BatchTransfer job should run on a regular schedule. Values are specified like cron job schedules. A good translation to human language can be found at https://crontab.guru/ |
false |
secretProviderRole | string |
Secret provider role that should be used for the actual job. This property will be defaulted by the webhook if not set. |
false |
secretProviderURL | string |
Secret provider url that should be used for the actual job. This property will be defaulted by the webhook if not set. |
false |
spark | object |
Optional Spark configuration for tuning |
false |
successfulJobHistoryLimit | integer |
Maximum number of successful Kubernetes Job objects that should be kept. This property will be defaulted by the webhook if not set. Minimum: 0 Maximum: 20 |
false |
suspend | boolean |
If this batch job instance runs on a schedule, the regular schedule can be suspended with this property. This property will be defaulted by the webhook if not set. |
false |
transformation | []object |
Transformations to be applied to the source data before writing to destination |
false |
writeDataType | enum |
Data type of how the data should be written to the target (log data or change data) Enum: LogData, ChangeData |
false |
writeOperation | enum |
Write operation that should be performed when writing (overwrite, append, update). Caution: some write operations are only available for batch and some only for stream. Enum: Overwrite, Append, Update |
false |
destination | object |
Destination data store for this batch job |
true |
source | object |
Source data store for this batch job |
true |
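As a rough sketch, a scheduled BatchTransfer copying from an S3 object to a Db2 table; the source s3 field names are assumed by analogy with similar connection schemas (the source schema is not documented above), and all endpoints and names are illustrative:

```yaml
apiVersion: motion.fybrik.io/v1alpha1
kind: BatchTransfer
metadata:
  name: sample-transfer
spec:
  flowType: Batch
  schedule: "0 2 * * *"                 # daily at 02:00, standard cron syntax
  source:
    s3:                                 # field names assumed, not documented above
      endpoint: "s3.example.cloud"
      bucket: source-bucket
      objectKey: input.parquet
      dataFormat: parquet
  destination:
    database:
      db2URL: "jdbc:db2://db2.example.com:50001/BLUDB:sslConnection=true;"  # illustrative JDBC URL
      table: MYSCHEMA.MYTABLE
```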
BatchTransfer.spec.spark
Optional Spark configuration for tuning
Name | Type | Description | Required |
---|---|---|---|
appName | string |
Name of the transaction. Mainly used for debugging and lineage tracking. |
false |
driverCores | integer |
Number of cores that the driver should use |
false |
driverMemory | integer |
Memory that the driver should have |
false |
executorCores | integer |
Number of cores that each executor should have |
false |
executorMemory | string |
Memory that each executor should have |
false |
image | string |
Image to be used for executors |
false |
imagePullPolicy | string |
Image pull policy to be used for executor |
false |
numExecutors | integer |
Number of executors to be started |
false |
options | map[string]string |
Additional options for Spark configuration. |
false |
shufflePartitions | integer |
Number of shuffle partitions for Spark |
false |
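A sketch of the optional spark tuning block; all values are illustrative, and the options key is hypothetical (note that executorMemory is a string while driverMemory is an integer in this schema):

```yaml
spark:
  appName: sample-transfer-tuning     # used for debugging and lineage tracking
  numExecutors: 4
  executorCores: 2
  executorMemory: "4g"                # string-typed in this schema
  shufflePartitions: 32
  options:                            # free-form extra Spark configuration
    spark.hadoop.fs.s3a.fast.upload: "true"   # hypothetical option key
```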
BatchTransfer.spec.transformation[index]
to be refined...
Name | Type | Description | Required |
---|---|---|---|
action | enum |
Transformation action that should be performed. Enum: RemoveColumns, EncryptColumns, DigestColumns, RedactColumns, SampleRows, FilterRows |
false |
columns | []string |
Columns that are involved in this action. This property is optional as for some actions no columns have to be specified. E.g. filter is a row based transformation. |
false |
name | string |
Name of the transaction. Mainly used for debugging and lineage tracking. |
false |
options | map[string]string |
Additional options for this transformation. |
false |
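A sketch of a transformation list combining a column-based and a row-based action; the column names and the options key are hypothetical:

```yaml
transformation:
  - name: redact-pii                  # used for debugging and lineage tracking
    action: RedactColumns
    columns:                          # hypothetical column names
      - name
      - address
  - name: sample-ten-percent
    action: SampleRows                # row-based action: no columns needed
    options:
      fraction: "0.1"                 # hypothetical option key
```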
BatchTransfer.spec.destination
Destination data store for this batch job
Name | Type | Description | Required |
---|---|---|---|
cloudant | object |
IBM Cloudant. Needs cloudant legacy credentials. |
false |
database | object |
Database data store. For the moment only Db2 is supported. |
false |
description | string |
Description of the transfer in human-readable form that is displayed in kubectl get. If not provided, this will be filled in depending on the datastore that is specified. |
false |
kafka | object |
Kafka data store. The assumed format within the given Kafka topic is a Confluent-compatible format stored as Avro. A schema registry needs to be specified as well. |
false |
s3 | object |
An object store data store that is compatible with S3. This can be a COS bucket. |
false |
BatchTransfer.spec.destination.cloudant
IBM Cloudant. Needs cloudant legacy credentials.
Name | Type | Description | Required |
---|---|---|---|
password | string |
Cloudant password. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
secretImport | string |
Define a secret import definition. |
false |
username | string |
Cloudant user. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
vault | object |
Define secrets that are fetched from a Vault instance |
false |
database | string |
Database to be read from/written to |
true |
host | string |
Host of cloudant instance |
true |
BatchTransfer.spec.destination.cloudant.vault
Define secrets that are fetched from a Vault instance
Name | Type | Description | Required |
---|---|---|---|
address | string |
Address is Vault address |
true |
authPath | string |
AuthPath is the path to the auth method, e.g. kubernetes |
true |
role | string |
Role is the Vault role used for retrieving the credentials |
true |
secretPath | string |
SecretPath is the path of the secret holding the Credentials in Vault |
true |
BatchTransfer.spec.destination.database
Database data store. For the moment only Db2 is supported.
Name | Type | Description | Required |
---|---|---|---|
password | string |
Database password. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
secretImport | string |
Define a secret import definition. |
false |
user | string |
Database user. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
vault | object |
Define secrets that are fetched from a Vault instance |
false |
db2URL | string |
URL to the Db2 instance in JDBC format. Currently supported SSL certificates are those signed with the IBM Intermediate CA or cloud-signed certificates. |
true |
table | string |
Table to be read |
true |
BatchTransfer.spec.destination.database.vault
Define secrets that are fetched from a Vault instance
Name | Type | Description | Required |
---|---|---|---|
address | string |
Address is Vault address |
true |
authPath | string |
AuthPath is the path to the auth method, e.g. kubernetes |
true |
role | string |
Role is the Vault role used for retrieving the credentials |
true |
secretPath | string |
SecretPath is the path of the secret holding the Credentials in Vault |
true |
BatchTransfer.spec.destination.kafka
Kafka data store. The assumed format within the given Kafka topic is a Confluent-compatible format stored as Avro. A schema registry needs to be specified as well.
Name | Type | Description | Required |
---|---|---|---|
createSnapshot | boolean |
Whether a snapshot of the topic should be created. Records in Kafka are stored as key-value pairs; updates/deletes for the same key are appended to the Kafka topic, and the last value for a given key is the valid value in a snapshot. When this property is true, only the last value per key is written; when false, all values are written out. As a CDC example: if true, a valid snapshot of the log stream is created; if false, the CDC stream is dumped as-is, like a change log. |
false |
dataFormat | string |
Data format of the objects, e.g. parquet or csv. Please refer to the struct for allowed values. |
false |
keyDeserializer | string |
Deserializer to be used for the keys of the topic |
false |
password | string |
Kafka user password. Can be retrieved from vault if specified in the vault parameter and is thus optional. |
false |
saslMechanism | string |
SASL mechanism to be used (e.g. PLAIN or SCRAM-SHA-512). SCRAM-SHA-512 is assumed if not specified. |
false |
schemaRegistryURL | string |
URL to the schema registry. The registry has to be Confluent schema registry compatible. |
false |
secretImport | string |
Define a secret import definition. |
false |
securityProtocol | string |
Kafka security protocol, one of PLAINTEXT, SASL_PLAINTEXT, SASL_SSL, SSL. SASL_SSL is assumed if not specified. |
false |
sslTruststore | string |
A truststore or certificate encoded as base64. The format can be JKS or PKCS12. A truststore can be specified like this or in a predefined Kubernetes secret. |
false |
sslTruststoreLocation | string |
SSL truststore location. |
false |
sslTruststorePassword | string |
SSL truststore password. |
false |
sslTruststoreSecret | string |
Kubernetes secret that contains the SSL truststore. The format can be JKS or PKCS12. A truststore can be specified like this or as a base64-encoded string in sslTruststore. |
false |
user | string |
Kafka user name. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
valueDeserializer | string |
Deserializer to be used for the values of the topic |
false |
vault | object |
Define secrets that are fetched from a Vault instance |
false |
kafkaBrokers | string |
Kafka broker URLs as a comma separated list. |
true |
kafkaTopic | string |
Kafka topic |
true |
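A Kafka store sketch combining the required fields with common optional ones; broker addresses, registry URL, and topic name are assumptions:

```yaml
kafka:
  kafkaBrokers: broker-0.example.com:9093,broker-1.example.com:9093  # comma separated list
  kafkaTopic: transfer.topic
  schemaRegistryURL: https://schema-registry.example.com  # Confluent-compatible registry
  securityProtocol: SASL_SSL    # default if not specified
  saslMechanism: SCRAM-SHA-512  # default if not specified
  createSnapshot: true          # write only the last value per key
  user: kafkauser               # optional if vault is used
  password: changeme            # optional if vault is used
```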
BatchTransfer.spec.destination.kafka.vault
Define secrets that are fetched from a Vault instance
Name | Type | Description | Required |
---|---|---|---|
address | string |
Address is the Vault address |
true |
authPath | string |
AuthPath is the path to the auth method, e.g. kubernetes |
true |
role | string |
Role is the Vault role used for retrieving the credentials |
true |
secretPath | string |
SecretPath is the path of the secret holding the credentials in Vault |
true |
BatchTransfer.spec.destination.s3
An object store data store that is compatible with S3. This can be a COS bucket.
Name | Type | Description | Required |
---|---|---|---|
accessKey | string |
Access key of the HMAC credentials that can access the given bucket. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
dataFormat | string |
Data format of the objects in S3, e.g. parquet or csv. Please refer to the struct for allowed values. |
false |
partitionBy | []string |
Defines the columns to partition the output by (for a target data store). |
false |
region | string |
Region of S3 service |
false |
secretImport | string |
Define a secret import definition. |
false |
secretKey | string |
Secret key of the HMAC credentials that can access the given bucket. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
vault | object |
Define secrets that are fetched from a Vault instance |
false |
bucket | string |
Bucket of S3 service |
true |
endpoint | string |
Endpoint of S3 service |
true |
objectKey | string |
Object key of the object in S3. This is used as a prefix: all objects that have the given objectKey as prefix will be used as input. |
true |
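An S3 store sketch; endpoint, region, bucket, and key values are illustrative:

```yaml
s3:
  endpoint: s3.eu-de.cloud-object-storage.appdomain.cloud  # hypothetical COS endpoint
  region: eu-de
  bucket: my-transfer-bucket
  objectKey: datasets/table1  # used as a prefix
  dataFormat: parquet
  partitionBy:                # only meaningful for a target data store
    - year
    - month
```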
BatchTransfer.spec.destination.s3.vault
Define secrets that are fetched from a Vault instance
Name | Type | Description | Required |
---|---|---|---|
address | string |
Address is the Vault address |
true |
authPath | string |
AuthPath is the path to the auth method, e.g. kubernetes |
true |
role | string |
Role is the Vault role used for retrieving the credentials |
true |
secretPath | string |
SecretPath is the path of the secret holding the credentials in Vault |
true |
BatchTransfer.spec.source
Source data store for this batch job
Name | Type | Description | Required |
---|---|---|---|
cloudant | object |
IBM Cloudant. Needs cloudant legacy credentials. |
false |
database | object |
Database data store. For the moment only Db2 is supported. |
false |
description | string |
Description of the transfer in human-readable form that is displayed in kubectl get. If not provided, this will be filled in depending on the datastore that is specified. |
false |
kafka | object |
Kafka data store. The assumed format within the given Kafka topic is a Confluent-compatible format stored as Avro. A schema registry needs to be specified as well. |
false |
s3 | object |
An object store data store that is compatible with S3. This can be a COS bucket. |
false |
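Exactly one of the store objects should be set per source or destination. A minimal BatchTransfer sketch reading from S3 and writing to Db2; all names, endpoints, and the apiVersion group are taken from this reference but the concrete values are hypothetical:

```yaml
apiVersion: motion.fybrik.io/v1alpha1
kind: BatchTransfer
metadata:
  name: s3-to-db2-sample
spec:
  source:
    s3:
      endpoint: s3.example.com
      bucket: input-bucket
      objectKey: datasets/input  # prefix of the input objects
      dataFormat: parquet
  destination:
    database:
      db2URL: jdbc:db2://db2.example.com:50001/BLUDB:sslConnection=true;
      table: MYSCHEMA.MYTABLE
```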
BatchTransfer.spec.source.cloudant
IBM Cloudant. Needs cloudant legacy credentials.
Name | Type | Description | Required |
---|---|---|---|
password | string |
Cloudant password. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
secretImport | string |
Define a secret import definition. |
false |
username | string |
Cloudant user. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
vault | object |
Define secrets that are fetched from a Vault instance |
false |
database | string |
Database to be read from/written to |
true |
host | string |
Host of the Cloudant instance |
true |
BatchTransfer.spec.source.cloudant.vault
Define secrets that are fetched from a Vault instance
Name | Type | Description | Required |
---|---|---|---|
address | string |
Address is the Vault address |
true |
authPath | string |
AuthPath is the path to the auth method, e.g. kubernetes |
true |
role | string |
Role is the Vault role used for retrieving the credentials |
true |
secretPath | string |
SecretPath is the path of the secret holding the credentials in Vault |
true |
BatchTransfer.spec.source.database
Database data store. For the moment only Db2 is supported.
Name | Type | Description | Required |
---|---|---|---|
password | string |
Database password. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
secretImport | string |
Define a secret import definition. |
false |
user | string |
Database user. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
vault | object |
Define secrets that are fetched from a Vault instance |
false |
db2URL | string |
URL to the Db2 instance in JDBC format. Currently supported SSL certificates are certificates signed with the IBM Intermediate CA or cloud-signed certificates. |
true |
table | string |
Table to be read |
true |
BatchTransfer.spec.source.database.vault
Define secrets that are fetched from a Vault instance
Name | Type | Description | Required |
---|---|---|---|
address | string |
Address is the Vault address |
true |
authPath | string |
AuthPath is the path to the auth method, e.g. kubernetes |
true |
role | string |
Role is the Vault role used for retrieving the credentials |
true |
secretPath | string |
SecretPath is the path of the secret holding the credentials in Vault |
true |
BatchTransfer.spec.source.kafka
Kafka data store. The assumed format within the given Kafka topic is a Confluent-compatible format stored as Avro. A schema registry needs to be specified as well.
Name | Type | Description | Required |
---|---|---|---|
createSnapshot | boolean |
Whether a snapshot of the topic should be created. Records in Kafka are stored as key-value pairs; updates/deletes for the same key are appended to the Kafka topic, and the last value for a given key is the valid value in a snapshot. When this property is true, only the last value per key is written; when false, all values are written out. As a CDC example: if true, a valid snapshot of the log stream is created; if false, the CDC stream is dumped as-is, like a change log. |
false |
dataFormat | string |
Data format of the objects, e.g. parquet or csv. Please refer to the struct for allowed values. |
false |
keyDeserializer | string |
Deserializer to be used for the keys of the topic |
false |
password | string |
Kafka user password. Can be retrieved from vault if specified in the vault parameter and is thus optional. |
false |
saslMechanism | string |
SASL mechanism to be used (e.g. PLAIN or SCRAM-SHA-512). SCRAM-SHA-512 is assumed if not specified. |
false |
schemaRegistryURL | string |
URL to the schema registry. The registry has to be Confluent schema registry compatible. |
false |
secretImport | string |
Define a secret import definition. |
false |
securityProtocol | string |
Kafka security protocol, one of PLAINTEXT, SASL_PLAINTEXT, SASL_SSL, SSL. SASL_SSL is assumed if not specified. |
false |
sslTruststore | string |
A truststore or certificate encoded as base64. The format can be JKS or PKCS12. A truststore can be specified like this or in a predefined Kubernetes secret. |
false |
sslTruststoreLocation | string |
SSL truststore location. |
false |
sslTruststorePassword | string |
SSL truststore password. |
false |
sslTruststoreSecret | string |
Kubernetes secret that contains the SSL truststore. The format can be JKS or PKCS12. A truststore can be specified like this or as a base64-encoded string in sslTruststore. |
false |
user | string |
Kafka user name. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
valueDeserializer | string |
Deserializer to be used for the values of the topic |
false |
vault | object |
Define secrets that are fetched from a Vault instance |
false |
kafkaBrokers | string |
Kafka broker URLs as a comma separated list. |
true |
kafkaTopic | string |
Kafka topic |
true |
BatchTransfer.spec.source.kafka.vault
Define secrets that are fetched from a Vault instance
Name | Type | Description | Required |
---|---|---|---|
address | string |
Address is the Vault address |
true |
authPath | string |
AuthPath is the path to the auth method, e.g. kubernetes |
true |
role | string |
Role is the Vault role used for retrieving the credentials |
true |
secretPath | string |
SecretPath is the path of the secret holding the credentials in Vault |
true |
BatchTransfer.spec.source.s3
An object store data store that is compatible with S3. This can be a COS bucket.
Name | Type | Description | Required |
---|---|---|---|
accessKey | string |
Access key of the HMAC credentials that can access the given bucket. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
dataFormat | string |
Data format of the objects in S3, e.g. parquet or csv. Please refer to the struct for allowed values. |
false |
partitionBy | []string |
Defines the columns to partition the output by (for a target data store). |
false |
region | string |
Region of S3 service |
false |
secretImport | string |
Define a secret import definition. |
false |
secretKey | string |
Secret key of the HMAC credentials that can access the given bucket. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
vault | object |
Define secrets that are fetched from a Vault instance |
false |
bucket | string |
Bucket of S3 service |
true |
endpoint | string |
Endpoint of S3 service |
true |
objectKey | string |
Object key of the object in S3. This is used as a prefix: all objects that have the given objectKey as prefix will be used as input. |
true |
BatchTransfer.spec.source.s3.vault
Define secrets that are fetched from a Vault instance
Name | Type | Description | Required |
---|---|---|---|
address | string |
Address is the Vault address |
true |
authPath | string |
AuthPath is the path to the auth method, e.g. kubernetes |
true |
role | string |
Role is the Vault role used for retrieving the credentials |
true |
secretPath | string |
SecretPath is the path of the secret holding the credentials in Vault |
true |
BatchTransfer.status
BatchTransferStatus defines the observed state of BatchTransfer. This includes a reference to the job that implements the movement as well as the last schedule time. What is missing: extended status information such as the number of records moved and technical metadata.
Name | Type | Description | Required |
---|---|---|---|
active | object |
A pointer to the currently running job (or nil) |
false |
error | string |
|
false |
lastCompleted | object |
ObjectReference contains enough information to let you inspect or modify the referred object. --- New uses of this type are discouraged because of difficulty describing its usage when embedded in APIs. 1. Ignored fields. It includes many fields which are not generally honored. For instance, ResourceVersion and FieldPath are both very rarely valid in actual usage. 2. Invalid usage help. It is impossible to add specific help for individual usage. In most embedded usages, there are particular restrictions like, "must refer only to types A and B" or "UID not honored" or "name must be restricted". Those cannot be well described when embedded. 3. Inconsistent validation. Because the usages are different, the validation rules are different by usage, which makes it hard for users to predict what will happen. 4. The fields are both imprecise and overly precise. Kind is not a precise mapping to a URL. This can produce ambiguity during interpretation and require a REST mapping. In most cases, the dependency is on the group,resource tuple and the version of the actual struct is irrelevant. 5. We cannot easily change it. Because this type is embedded in many locations, updates to this type will affect numerous schemas. Don't make new APIs embed an underspecified API type they do not control. Instead of using this type, create a locally provided and used type that is well-focused on your reference. For example, ServiceReferences for admission registration: https://github.com/kubernetes/api/blob/release-1.17/admissionregistration/v1/types.go#L533 . |
false |
lastFailed | object |
ObjectReference contains enough information to let you inspect or modify the referred object. --- New uses of this type are discouraged because of difficulty describing its usage when embedded in APIs. 1. Ignored fields. It includes many fields which are not generally honored. For instance, ResourceVersion and FieldPath are both very rarely valid in actual usage. 2. Invalid usage help. It is impossible to add specific help for individual usage. In most embedded usages, there are particular restrictions like, "must refer only to types A and B" or "UID not honored" or "name must be restricted". Those cannot be well described when embedded. 3. Inconsistent validation. Because the usages are different, the validation rules are different by usage, which makes it hard for users to predict what will happen. 4. The fields are both imprecise and overly precise. Kind is not a precise mapping to a URL. This can produce ambiguity during interpretation and require a REST mapping. In most cases, the dependency is on the group,resource tuple and the version of the actual struct is irrelevant. 5. We cannot easily change it. Because this type is embedded in many locations, updates to this type will affect numerous schemas. Don't make new APIs embed an underspecified API type they do not control. Instead of using this type, create a locally provided and used type that is well-focused on your reference. For example, ServiceReferences for admission registration: https://github.com/kubernetes/api/blob/release-1.17/admissionregistration/v1/types.go#L533 . |
false |
lastRecordTime | string |
Format: date-time |
false |
lastScheduleTime | string |
Information about the last time the job was successfully scheduled. Format: date-time |
false |
lastSuccessTime | string |
Format: date-time |
false |
numRecords | integer |
Format: int64 Minimum: 0 |
false |
status | enum |
Enum: STARTING, RUNNING, SUCCEEDED, FAILED |
false |
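For orientation, an observed status as the controller might report it; all values below are illustrative, and the object reference to a batch Job is an assumption about how the movement is implemented:

```yaml
status:
  status: SUCCEEDED
  lastScheduleTime: "2023-05-01T12:00:00Z"
  lastSuccessTime: "2023-05-01T12:03:21Z"
  numRecords: 150000
  lastCompleted:          # ObjectReference to the last completed job
    kind: Job             # hypothetical referent kind
    name: s3-to-db2-sample
    namespace: default
```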
BatchTransfer.status.active
A pointer to the currently running job (or nil)
Name | Type | Description | Required |
---|---|---|---|
apiVersion | string |
API version of the referent. |
false |
fieldPath | string |
If referring to a piece of an object instead of an entire object, this string should contain a valid JSON/Go field access statement, such as desiredState.manifest.containers[2]. For example, if the object reference is to a container within a pod, this would take on a value like: "spec.containers{name}" (where "name" refers to the name of the container that triggered the event) or if no container name is specified "spec.containers[2]" (container with index 2 in this pod). This syntax is chosen only to have some well-defined way of referencing a part of an object. TODO: this design is not final and this field is subject to change in the future. |
false |
kind | string |
Kind of the referent. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds |
false |
name | string |
Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names |
false |
namespace | string |
Namespace of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/namespaces/ |
false |
resourceVersion | string |
Specific resourceVersion to which this reference is made, if any. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#concurrency-control-and-consistency |
false |
uid | string |
UID of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#uids |
false |
BatchTransfer.status.lastCompleted
ObjectReference contains enough information to let you inspect or modify the referred object. --- New uses of this type are discouraged because of difficulty describing its usage when embedded in APIs. 1. Ignored fields. It includes many fields which are not generally honored. For instance, ResourceVersion and FieldPath are both very rarely valid in actual usage. 2. Invalid usage help. It is impossible to add specific help for individual usage. In most embedded usages, there are particular restrictions like, "must refer only to types A and B" or "UID not honored" or "name must be restricted". Those cannot be well described when embedded. 3. Inconsistent validation. Because the usages are different, the validation rules are different by usage, which makes it hard for users to predict what will happen. 4. The fields are both imprecise and overly precise. Kind is not a precise mapping to a URL. This can produce ambiguity during interpretation and require a REST mapping. In most cases, the dependency is on the group,resource tuple and the version of the actual struct is irrelevant. 5. We cannot easily change it. Because this type is embedded in many locations, updates to this type will affect numerous schemas. Don't make new APIs embed an underspecified API type they do not control. Instead of using this type, create a locally provided and used type that is well-focused on your reference. For example, ServiceReferences for admission registration: https://github.com/kubernetes/api/blob/release-1.17/admissionregistration/v1/types.go#L533 .
Name | Type | Description | Required |
---|---|---|---|
apiVersion | string |
API version of the referent. |
false |
fieldPath | string |
If referring to a piece of an object instead of an entire object, this string should contain a valid JSON/Go field access statement, such as desiredState.manifest.containers[2]. For example, if the object reference is to a container within a pod, this would take on a value like: "spec.containers{name}" (where "name" refers to the name of the container that triggered the event) or if no container name is specified "spec.containers[2]" (container with index 2 in this pod). This syntax is chosen only to have some well-defined way of referencing a part of an object. TODO: this design is not final and this field is subject to change in the future. |
false |
kind | string |
Kind of the referent. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds |
false |
name | string |
Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names |
false |
namespace | string |
Namespace of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/namespaces/ |
false |
resourceVersion | string |
Specific resourceVersion to which this reference is made, if any. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#concurrency-control-and-consistency |
false |
uid | string |
UID of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#uids |
false |
BatchTransfer.status.lastFailed
ObjectReference contains enough information to let you inspect or modify the referred object. --- New uses of this type are discouraged because of difficulty describing its usage when embedded in APIs. 1. Ignored fields. It includes many fields which are not generally honored. For instance, ResourceVersion and FieldPath are both very rarely valid in actual usage. 2. Invalid usage help. It is impossible to add specific help for individual usage. In most embedded usages, there are particular restrictions like, "must refer only to types A and B" or "UID not honored" or "name must be restricted". Those cannot be well described when embedded. 3. Inconsistent validation. Because the usages are different, the validation rules are different by usage, which makes it hard for users to predict what will happen. 4. The fields are both imprecise and overly precise. Kind is not a precise mapping to a URL. This can produce ambiguity during interpretation and require a REST mapping. In most cases, the dependency is on the group,resource tuple and the version of the actual struct is irrelevant. 5. We cannot easily change it. Because this type is embedded in many locations, updates to this type will affect numerous schemas. Don't make new APIs embed an underspecified API type they do not control. Instead of using this type, create a locally provided and used type that is well-focused on your reference. For example, ServiceReferences for admission registration: https://github.com/kubernetes/api/blob/release-1.17/admissionregistration/v1/types.go#L533 .
Name | Type | Description | Required |
---|---|---|---|
apiVersion | string |
API version of the referent. |
false |
fieldPath | string |
If referring to a piece of an object instead of an entire object, this string should contain a valid JSON/Go field access statement, such as desiredState.manifest.containers[2]. For example, if the object reference is to a container within a pod, this would take on a value like: "spec.containers{name}" (where "name" refers to the name of the container that triggered the event) or if no container name is specified "spec.containers[2]" (container with index 2 in this pod). This syntax is chosen only to have some well-defined way of referencing a part of an object. TODO: this design is not final and this field is subject to change in the future. |
false |
kind | string |
Kind of the referent. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds |
false |
name | string |
Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names |
false |
namespace | string |
Namespace of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/namespaces/ |
false |
resourceVersion | string |
Specific resourceVersion to which this reference is made, if any. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#concurrency-control-and-consistency |
false |
uid | string |
UID of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#uids |
false |
StreamTransfer
StreamTransfer is the Schema for the streamtransfers API
Name | Type | Description | Required |
---|---|---|---|
apiVersion | string | motion.fybrik.io/v1alpha1 | true |
kind | string | StreamTransfer | true |
metadata | object | Refer to the Kubernetes API documentation for the fields of the `metadata` field. | true |
spec | object |
StreamTransferSpec defines the desired state of StreamTransfer |
false |
status | object |
StreamTransferStatus defines the observed state of StreamTransfer |
false |
StreamTransfer.spec
StreamTransferSpec defines the desired state of StreamTransfer
Name | Type | Description | Required |
---|---|---|---|
flowType | enum |
Data flow type that specifies whether this is a stream or a batch workflow. Enum: Batch, Stream |
false |
image | string |
Image that should be used for the actual batch job. This is usually a datamover image. This property will be defaulted by the webhook if not set. |
false |
imagePullPolicy | string |
Image pull policy that should be used for the actual job. This property will be defaulted by the webhook if not set. |
false |
noFinalizer | boolean |
Whether this job instance should have a finalizer. This property will be defaulted by the webhook if not set. |
false |
readDataType | enum |
Data type of the data that is read from the source (log data or change data). Enum: LogData, ChangeData |
false |
secretProviderRole | string |
Secret provider role that should be used for the actual job. This property will be defaulted by the webhook if not set. |
false |
secretProviderURL | string |
Secret provider URL that should be used for the actual job. This property will be defaulted by the webhook if not set. |
false |
suspend | boolean |
If this job instance runs on a schedule, the regular schedule can be suspended with this property. This property will be defaulted by the webhook if not set. |
false |
transformation | []object |
Transformations to be applied to the source data before writing to the destination |
false |
triggerInterval | string |
Interval at which the micro-batches of this stream should be triggered. The default is '5 seconds'. |
false |
writeDataType | enum |
Data type of how the data should be written to the target (log data or change data). Enum: LogData, ChangeData |
false |
writeOperation | enum |
Write operation that should be performed when writing (overwrite, append, update). Caution: some write operations are only available for batch and some only for stream. Enum: Overwrite, Append, Update |
false |
destination | object |
Destination data store for this batch job |
true |
source | object |
Source data store for this batch job |
true |
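The fields above can be combined into a minimal StreamTransfer sketch reading from Kafka and writing micro-batches to S3; all concrete names and endpoints are hypothetical:

```yaml
apiVersion: motion.fybrik.io/v1alpha1
kind: StreamTransfer
metadata:
  name: kafka-to-s3-sample
spec:
  triggerInterval: "5 seconds"  # default micro-batch interval
  writeOperation: Append        # caution: not every operation is valid for streams
  source:
    kafka:
      kafkaBrokers: broker-0.example.com:9093
      kafkaTopic: events.topic
      schemaRegistryURL: https://schema-registry.example.com
  destination:
    s3:
      endpoint: s3.example.com
      bucket: output-bucket
      objectKey: streams/events  # used as a prefix
      dataFormat: parquet
```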
StreamTransfer.spec.transformation[index]
to be refined...
Name | Type | Description | Required |
---|---|---|---|
action | enum |
Transformation action that should be performed. Enum: RemoveColumns, EncryptColumns, DigestColumns, RedactColumns, SampleRows, FilterRows |
false |
columns | []string |
Columns that are involved in this action. This property is optional as for some actions no columns have to be specified. E.g. filter is a row based transformation. |
false |
name | string |
Name of the transformation. Mainly used for debugging and lineage tracking. |
false |
options | map[string]string |
Additional options for this transformation. |
false |
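A sketch of the transformation list; the entry names, column names, and the option key below are hypothetical:

```yaml
transformation:
  - name: remove-pii       # used for debugging and lineage tracking
    action: RemoveColumns
    columns:
      - credit_card_number
  - name: keep-recent-rows
    action: FilterRows     # row-based transformation; no columns required
    options:
      clause: "year > 2020"  # hypothetical option key
```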
StreamTransfer.spec.destination
Destination data store for this batch job
Name | Type | Description | Required |
---|---|---|---|
cloudant | object |
IBM Cloudant. Needs cloudant legacy credentials. |
false |
database | object |
Database data store. For the moment only Db2 is supported. |
false |
description | string |
Description of the transfer in human-readable form that is displayed in kubectl get. If not provided, this will be filled in depending on the datastore that is specified. |
false |
kafka | object |
Kafka data store. The assumed format within the given Kafka topic is a Confluent-compatible format stored as Avro. A schema registry needs to be specified as well. |
false |
s3 | object |
An object store data store that is compatible with S3. This can be a COS bucket. |
false |
StreamTransfer.spec.destination.cloudant
IBM Cloudant. Needs cloudant legacy credentials.
Name | Type | Description | Required |
---|---|---|---|
password | string |
Cloudant password. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
secretImport | string |
Define a secret import definition. |
false |
username | string |
Cloudant user. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
vault | object |
Define secrets that are fetched from a Vault instance |
false |
database | string |
Database to be read from/written to |
true |
host | string |
Host of the Cloudant instance |
true |
StreamTransfer.spec.destination.cloudant.vault
Define secrets that are fetched from a Vault instance
Name | Type | Description | Required |
---|---|---|---|
address | string |
Address is the Vault address |
true |
authPath | string |
AuthPath is the path to the auth method, e.g. kubernetes |
true |
role | string |
Role is the Vault role used for retrieving the credentials |
true |
secretPath | string |
SecretPath is the path of the secret holding the credentials in Vault |
true |
StreamTransfer.spec.destination.database
Database data store. For the moment only Db2 is supported.
Name | Type | Description | Required |
---|---|---|---|
password | string |
Database password. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
secretImport | string |
Define a secret import definition. |
false |
user | string |
Database user. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
vault | object |
Define secrets that are fetched from a Vault instance |
false |
db2URL | string |
URL to the Db2 instance in JDBC format. Currently supported SSL certificates are certificates signed with the IBM Intermediate CA or cloud-signed certificates. |
true |
table | string |
Table to be read |
true |
StreamTransfer.spec.destination.database.vault
Define secrets that are fetched from a Vault instance
Name | Type | Description | Required |
---|---|---|---|
address | string |
Address is the Vault address |
true |
authPath | string |
AuthPath is the path to the auth method, e.g. kubernetes |
true |
role | string |
Role is the Vault role used for retrieving the credentials |
true |
secretPath | string |
SecretPath is the path of the secret holding the credentials in Vault |
true |
StreamTransfer.spec.destination.kafka
Kafka data store. The assumed format within the given Kafka topic is a Confluent-compatible format stored as Avro. A schema registry needs to be specified as well.
Name | Type | Description | Required |
---|---|---|---|
createSnapshot | boolean |
Whether a snapshot of the topic should be created. Records in Kafka are stored as key-value pairs; updates/deletes for the same key are appended to the Kafka topic, and the last value for a given key is the valid value in a snapshot. When this property is true, only the last value per key is written; when false, all values are written out. As a CDC example: if true, a valid snapshot of the log stream is created; if false, the CDC stream is dumped as-is, like a change log. |
false |
dataFormat | string |
Data format of the objects, e.g. parquet or csv. Please refer to the struct for allowed values. |
false |
keyDeserializer | string |
Deserializer to be used for the keys of the topic |
false |
password | string |
Kafka user password. Can be retrieved from vault if specified in the vault parameter and is thus optional. |
false |
saslMechanism | string |
SASL mechanism to be used (e.g. PLAIN or SCRAM-SHA-512). SCRAM-SHA-512 is assumed if not specified. |
false |
schemaRegistryURL | string |
URL to the schema registry. The registry has to be Confluent schema registry compatible. |
false |
secretImport | string |
Define a secret import definition. |
false |
securityProtocol | string |
Kafka security protocol, one of PLAINTEXT, SASL_PLAINTEXT, SASL_SSL, SSL. SASL_SSL is assumed if not specified. |
false |
sslTruststore | string |
A truststore or certificate encoded as base64. The format can be JKS or PKCS12. A truststore can be specified like this or in a predefined Kubernetes secret. |
false |
sslTruststoreLocation | string |
SSL truststore location. |
false |
sslTruststorePassword | string |
SSL truststore password. |
false |
sslTruststoreSecret | string |
Kubernetes secret that contains the SSL truststore. The format can be JKS or PKCS12. A truststore can be specified like this or as a base64-encoded string in sslTruststore. |
false |
user | string |
Kafka user name. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
valueDeserializer | string |
Deserializer to be used for the values of the topic |
false |
vault | object |
Define secrets that are fetched from a Vault instance |
false |
kafkaBrokers | string |
Kafka broker URLs as a comma separated list. |
true |
kafkaTopic | string |
Kafka topic |
true |
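A Kafka destination block assembled from the fields above might look like this sketch; broker hosts, topic, and registry URL are placeholders, and the two protocol fields are shown at their stated defaults:

```yaml
spec:
  destination:
    kafka:
      kafkaBrokers: "broker-0.example.com:9093,broker-1.example.com:9093"
      kafkaTopic: "transactions"
      schemaRegistryURL: "https://schema-registry.example.com"
      securityProtocol: "SASL_SSL"     # default when not specified
      saslMechanism: "SCRAM-SHA-512"   # default when not specified
      user: "kafka-user"               # optional if retrieved from Vault
      password: "changeme"             # optional if retrieved from Vault
      createSnapshot: false            # write all values, not just the latest per key
```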
StreamTransfer.spec.destination.kafka.vault

Define secrets that are fetched from a Vault instance.

Name | Type | Description | Required |
---|---|---|---|
address | string | Address of the Vault instance | true |
authPath | string | Path to the auth method, e.g. kubernetes | true |
role | string | Vault role used for retrieving the credentials | true |
secretPath | string | Path of the secret holding the credentials in Vault | true |
StreamTransfer.spec.destination.s3

An object store that is compatible with S3. This can be a COS bucket.

Name | Type | Description | Required |
---|---|---|---|
accessKey | string | Access key of the HMAC credentials that can access the given bucket. Can be retrieved from Vault if specified in the vault parameter, and is thus optional. | false |
dataFormat | string | Data format of the objects in S3, e.g. parquet or csv. Please refer to the struct for allowed values. | false |
partitionBy | []string | Columns to partition the output by (for target data stores). | false |
region | string | Region of the S3 service | false |
secretImport | string | Define a secret import definition. | false |
secretKey | string | Secret key of the HMAC credentials that can access the given bucket. Can be retrieved from Vault if specified in the vault parameter, and is thus optional. | false |
vault | object | Define secrets that are fetched from a Vault instance | false |
bucket | string | Bucket of the S3 service | true |
endpoint | string | Endpoint of the S3 service | true |
objectKey | string | Object key of the object in S3. This is used as a prefix: all objects whose keys start with the given objectKey are used as input. | true |
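An S3 (COS) destination block might be sketched as follows; endpoint, bucket, and key prefix are placeholders, and partitionBy assumes the written data has year and month columns:

```yaml
spec:
  destination:
    s3:
      endpoint: "s3.eu-de.cloud-object-storage.appdomain.cloud"  # placeholder COS endpoint
      bucket: "my-bucket"
      objectKey: "transfers/output"   # used as a prefix for all written objects
      dataFormat: "parquet"
      partitionBy:                    # hypothetical partition columns
        - "year"
        - "month"
```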
StreamTransfer.spec.destination.s3.vault

Define secrets that are fetched from a Vault instance.

Name | Type | Description | Required |
---|---|---|---|
address | string | Address of the Vault instance | true |
authPath | string | Path to the auth method, e.g. kubernetes | true |
role | string | Vault role used for retrieving the credentials | true |
secretPath | string | Path of the secret holding the credentials in Vault | true |
StreamTransfer.spec.source

Source data store for this batch job.

Name | Type | Description | Required |
---|---|---|---|
cloudant | object | IBM Cloudant. Needs Cloudant legacy credentials. | false |
database | object | Database data store. Currently only Db2 is supported. | false |
description | string | Human-readable description of the transfer, displayed in kubectl get. If not provided, it is filled in based on the data store that is specified. | false |
kafka | object | Kafka data store. The expected format within the given Kafka topic is a Confluent-compatible format stored as Avro. A schema registry needs to be specified as well. | false |
s3 | object | An object store that is compatible with S3. This can be a COS bucket. | false |
StreamTransfer.spec.source.cloudant

IBM Cloudant. Needs Cloudant legacy credentials.

Name | Type | Description | Required |
---|---|---|---|
password | string | Cloudant password. Can be retrieved from Vault if specified in the vault parameter, and is thus optional. | false |
secretImport | string | Define a secret import definition. | false |
username | string | Cloudant user. Can be retrieved from Vault if specified in the vault parameter, and is thus optional. | false |
vault | object | Define secrets that are fetched from a Vault instance | false |
database | string | Database to be read from/written to | true |
host | string | Host of the Cloudant instance | true |
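A Cloudant source block using the fields above might be sketched like this; host and database names are placeholders, and the inline legacy credentials can be omitted when a vault block is given instead:

```yaml
spec:
  source:
    cloudant:
      host: "myaccount.cloudantnosqldb.appdomain.cloud"  # placeholder Cloudant host
      database: "orders"
      username: "reader"     # optional if retrieved from Vault
      password: "changeme"   # optional if retrieved from Vault
```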
StreamTransfer.spec.source.cloudant.vault

Define secrets that are fetched from a Vault instance.

Name | Type | Description | Required |
---|---|---|---|
address | string | Address of the Vault instance | true |
authPath | string | Path to the auth method, e.g. kubernetes | true |
role | string | Vault role used for retrieving the credentials | true |
secretPath | string | Path of the secret holding the credentials in Vault | true |
StreamTransfer.spec.source.database

Database data store. Currently only Db2 is supported.

Name | Type | Description | Required |
---|---|---|---|
password | string | Database password. Can be retrieved from Vault if specified in the vault parameter, and is thus optional. | false |
secretImport | string | Define a secret import definition. | false |
user | string | Database user. Can be retrieved from Vault if specified in the vault parameter, and is thus optional. | false |
vault | object | Define secrets that are fetched from a Vault instance | false |
db2URL | string | URL to the Db2 instance in JDBC format. Supported SSL certificates are currently certificates signed with the IBM Intermediate CA or cloud-signed certificates. | true |
table | string | Table to be read | true |
StreamTransfer.spec.source.database.vault

Define secrets that are fetched from a Vault instance.

Name | Type | Description | Required |
---|---|---|---|
address | string | Address of the Vault instance | true |
authPath | string | Path to the auth method, e.g. kubernetes | true |
role | string | Vault role used for retrieving the credentials | true |
secretPath | string | Path of the secret holding the credentials in Vault | true |
StreamTransfer.spec.source.kafka

Kafka data store. The expected format within the given Kafka topic is a Confluent-compatible format stored as Avro. A schema registry needs to be specified as well.

Name | Type | Description | Required |
---|---|---|---|
createSnapshot | boolean | Whether a snapshot of the topic should be created. Records in Kafka are stored as key-value pairs; updates and deletes for the same key are appended to the topic, and the last value for a given key is the valid value in a snapshot. If true, only the last value for each key is written; if false, all values are written. As a CDC example: if true, a valid snapshot of the log stream is created; if false, the CDC stream is dumped as-is, like a change log. | false |
dataFormat | string | Data format of the data, e.g. parquet or csv. Please refer to the struct for allowed values. | false |
keyDeserializer | string | Deserializer to be used for the keys of the topic | false |
password | string | Kafka user password. Can be retrieved from Vault if specified in the vault parameter, and is thus optional. | false |
saslMechanism | string | SASL mechanism to be used (e.g. PLAIN or SCRAM-SHA-512). Defaults to SCRAM-SHA-512 if not specified. | false |
schemaRegistryURL | string | URL of the schema registry. The registry has to be Confluent schema registry compatible. | false |
secretImport | string | Define a secret import definition. | false |
securityProtocol | string | Kafka security protocol, one of PLAINTEXT, SASL_PLAINTEXT, SASL_SSL, SSL. Defaults to SASL_SSL if not specified. | false |
sslTruststore | string | A truststore or certificate encoded as base64. The format can be JKS or PKCS12. A truststore can be specified this way or in a predefined Kubernetes secret. | false |
sslTruststoreLocation | string | SSL truststore location. | false |
sslTruststorePassword | string | SSL truststore password. | false |
sslTruststoreSecret | string | Kubernetes secret that contains the SSL truststore. The format can be JKS or PKCS12. A truststore can be specified this way or inline via sslTruststore. | false |
user | string | Kafka user name. Can be retrieved from Vault if specified in the vault parameter, and is thus optional. | false |
valueDeserializer | string | Deserializer to be used for the values of the topic | false |
vault | object | Define secrets that are fetched from a Vault instance | false |
kafkaBrokers | string | Kafka broker URLs as a comma-separated list. | true |
kafkaTopic | string | Kafka topic | true |
StreamTransfer.spec.source.kafka.vault

Define secrets that are fetched from a Vault instance.

Name | Type | Description | Required |
---|---|---|---|
address | string | Address of the Vault instance | true |
authPath | string | Path to the auth method, e.g. kubernetes | true |
role | string | Vault role used for retrieving the credentials | true |
secretPath | string | Path of the secret holding the credentials in Vault | true |
StreamTransfer.spec.source.s3

An object store that is compatible with S3. This can be a COS bucket.

Name | Type | Description | Required |
---|---|---|---|
accessKey | string | Access key of the HMAC credentials that can access the given bucket. Can be retrieved from Vault if specified in the vault parameter, and is thus optional. | false |
dataFormat | string | Data format of the objects in S3, e.g. parquet or csv. Please refer to the struct for allowed values. | false |
partitionBy | []string | Columns to partition the output by (for target data stores). | false |
region | string | Region of the S3 service | false |
secretImport | string | Define a secret import definition. | false |
secretKey | string | Secret key of the HMAC credentials that can access the given bucket. Can be retrieved from Vault if specified in the vault parameter, and is thus optional. | false |
vault | object | Define secrets that are fetched from a Vault instance | false |
bucket | string | Bucket of the S3 service | true |
endpoint | string | Endpoint of the S3 service | true |
objectKey | string | Object key of the object in S3. This is used as a prefix: all objects whose keys start with the given objectKey are used as input. | true |
StreamTransfer.spec.source.s3.vault

Define secrets that are fetched from a Vault instance.

Name | Type | Description | Required |
---|---|---|---|
address | string | Address of the Vault instance | true |
authPath | string | Path to the auth method, e.g. kubernetes | true |
role | string | Vault role used for retrieving the credentials | true |
secretPath | string | Path of the secret holding the credentials in Vault | true |
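Tying source and destination together, a complete resource might look like the following sketch. All names and values are illustrative, and the API group is taken from this reference's package listing; verify it against the CRDs installed in your cluster:

```yaml
apiVersion: app.fybrik.io/v1alpha1   # group as listed in this reference; verify locally
kind: StreamTransfer
metadata:
  name: kafka-to-cos                 # hypothetical resource name
spec:
  source:
    kafka:
      kafkaBrokers: "broker-0.example.com:9093"
      kafkaTopic: "transactions"
      schemaRegistryURL: "https://schema-registry.example.com"
  destination:
    s3:
      endpoint: "s3.eu-de.cloud-object-storage.appdomain.cloud"
      bucket: "my-bucket"
      objectKey: "transactions/"     # prefix for written objects
      dataFormat: "parquet"
```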
StreamTransfer.status

StreamTransferStatus defines the observed state of a StreamTransfer.

Name | Type | Description | Required |
---|---|---|---|
active | object | A pointer to the currently running job (or nil) | false |
error | string |  | false |
status | enum | Enum: STARTING, RUNNING, STOPPED, FAILING | false |
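For illustration, a controller-populated status for a healthy transfer might look like this sketch; the referenced object is assumed to be a batch Job, and all names are placeholders:

```yaml
status:
  status: RUNNING
  active:                       # object reference to the running workload
    apiVersion: batch/v1        # assumed referent type
    kind: Job
    name: kafka-to-cos-job      # hypothetical job name
    namespace: default
```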
StreamTransfer.status.active

A pointer to the currently running job (or nil).

Name | Type | Description | Required |
---|---|---|---|
apiVersion | string | API version of the referent. | false |
fieldPath | string | If referring to a piece of an object instead of an entire object, this string should contain a valid JSON/Go field access statement, such as desiredState.manifest.containers[2]. For example, if the object reference is to a container within a pod, this would take on a value like "spec.containers{name}" (where "name" refers to the name of the container that triggered the event), or if no container name is specified, "spec.containers[2]" (container with index 2 in this pod). This syntax is chosen only to have some well-defined way of referencing a part of an object. TODO: this design is not final and this field is subject to change in the future. | false |
kind | string | Kind of the referent. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds | false |
name | string | Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names | false |
namespace | string | Namespace of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/namespaces/ | false |
resourceVersion | string | Specific resourceVersion to which this reference is made, if any. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#concurrency-control-and-consistency | false |
uid | string | UID of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#uids | false |