API Reference
Packages:
app.fybrik.io/v1alpha1
Resource Types:
Blueprint
Blueprint is the Schema for the blueprints API
Name | Type | Description | Required |
---|---|---|---|
apiVersion | string | app.fybrik.io/v1alpha1 | true |
kind | string | Blueprint | true |
metadata | object | Refer to the Kubernetes API documentation for the fields of the `metadata` field. | true |
spec | object | BlueprintSpec defines the desired state of Blueprint, which defines the components of the workload's data path that run in a particular cluster. In a single-cluster environment there is one Blueprint. In a multi-cluster environment there is one Blueprint per cluster per workload (FybrikApplication). | false |
status | object | BlueprintStatus defines the observed state of Blueprint. This includes readiness, error message, and indicators for the Kubernetes resources owned by the Blueprint for cleanup and status monitoring. | false |
Blueprint.spec
BlueprintSpec defines the desired state of Blueprint, which defines the components of the workload's data path that run in a particular cluster. In a single cluster environment there is one blueprint. In a multi-cluster environment there is one Blueprint per cluster per workload (FybrikApplication).
Name | Type | Description | Required |
---|---|---|---|
cluster | string | Cluster indicates the cluster on which the Blueprint runs. | true |
modules | map[string]object | Modules is a map containing the modules that indicate the data path components that run in this cluster. The map key is InstanceName, the unique name for the deployed instance related to this workload. | true |
Blueprint.spec.modules[key]
BlueprintModule is a copy of a FybrikModule Custom Resource. It contains the information necessary to instantiate a datapath component, including the parameters relevant for the particular workload.
Name | Type | Description | Required |
---|---|---|---|
arguments | object | Arguments are the input parameters for a specific instance of a module. | false |
assetIds | []string | AssetIDs indicate the assets processed by this module. Included so that asset status can be tracked alongside module status in the future. | false |
chart | object | Chart contains the location of the Helm chart, with information detailing how to deploy it. | true |
name | string | Name of the FybrikModule on which this is based. | true |
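The fields above can be illustrated with a minimal Blueprint sketch; every name and value here is invented for illustration and is not part of the schema:

```yaml
apiVersion: app.fybrik.io/v1alpha1
kind: Blueprint
metadata:
  name: my-app-blueprint          # illustrative name
spec:
  cluster: cluster-us-east        # cluster on which the Blueprint runs
  modules:
    read-module-instance:         # map key: InstanceName, unique per deployed instance
      name: arrow-flight-module   # the FybrikModule this entry is based on
      chart:
        name: ghcr.io/example/arrow-flight-chart:0.1.0   # illustrative chart location
      assetIds:
        - "asset-1"
      arguments: {}               # populated per the arguments schema
```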
Blueprint.spec.modules[key].arguments
Arguments are the input parameters for a specific instance of a module.
Name | Type | Description | Required |
---|---|---|---|
appSelector | object | AppSelector is used to identify the user workload. It is obtained from the FybrikApplication spec. | false |
copy | object | CopyArgs are parameters specific to modules that copy data from one data store to another. | false |
labels | map[string]string | Labels of the FybrikApplication. | false |
read | []object | ReadArgs are parameters specific to modules that enable an application to read data. | false |
write | []object | WriteArgs are parameters specific to modules that enable an application to write data. | false |
Blueprint.spec.modules[key].arguments.appSelector
Application selector is used to identify the user workload. It is obtained from FybrikApplication spec.
Name | Type | Description | Required |
---|---|---|---|
matchExpressions | []object | matchExpressions is a list of label selector requirements. The requirements are ANDed. | false |
matchLabels | map[string]string | matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is "key", the operator is "In", and the values array contains only "value". The requirements are ANDed. | false |
Blueprint.spec.modules[key].arguments.appSelector.matchExpressions[index]
A label selector requirement is a selector that contains values, a key, and an operator that relates the key and values.
Name | Type | Description | Required |
---|---|---|---|
values | []string | values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch. | false |
key | string | key is the label key that the selector applies to. | true |
operator | string | operator represents a key's relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist. | true |
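A selector that combines both forms might look like the following sketch (all labels and values are invented for illustration):

```yaml
appSelector:
  matchLabels:
    app: my-workload              # illustrative label
  matchExpressions:
    - key: environment
      operator: In
      values: ["dev", "staging"]  # non-empty, since the operator is In
    - key: legacy
      operator: DoesNotExist      # values must be empty for Exists/DoesNotExist
```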
Blueprint.spec.modules[key].arguments.copy
CopyArgs are parameters specific to modules that copy data from one data store to another.
Name | Type | Description | Required |
---|---|---|---|
transformations | []object | Transformations are different types of processing that may be done to the data as it is copied. | false |
assetID | string | AssetID identifies the asset to be used for accessing the data when it is ready. It is copied from the FybrikApplication resource. | true |
destination | object | Destination is the data store to which the data will be copied. | true |
source | object | Source is the data store where the data currently resides. | true |
Blueprint.spec.modules[key].arguments.copy.destination
Destination is the data store to which the data will be copied
Name | Type | Description | Required |
---|---|---|---|
connection | object | Connection holds the relevant details for accessing the data (url, table, ssl, etc.). | true |
format | string | Format represents the data format (e.g. parquet) as received from catalog connectors. | true |
vault | map[string]object | Vault holds details for retrieving credentials by the modules from the Vault store. It is a map so that different credentials can be stored for the different DataFlow operations. | true |
Blueprint.spec.modules[key].arguments.copy.destination.vault[key]
Holds details for retrieving credentials from Vault store.
Name | Type | Description | Required |
---|---|---|---|
address | string | Address is the Vault address. | true |
authPath | string | AuthPath is the path to the auth method, e.g. kubernetes. | true |
role | string | Role is the Vault role used for retrieving the credentials. | true |
secretPath | string | SecretPath is the path of the secret holding the credentials in Vault. | true |
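Because `vault` is a map keyed by DataFlow operation, a destination can carry separate credentials per operation. A sketch with invented addresses and paths:

```yaml
vault:
  write:                          # map key: the DataFlow operation
    address: https://vault.example.com:8200       # illustrative Vault address
    authPath: /v1/auth/kubernetes/login           # illustrative auth method path
    role: module                                  # illustrative Vault role
    secretPath: /v1/secret/data/datasets/asset-1  # illustrative secret path
```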
Blueprint.spec.modules[key].arguments.copy.source
Source is the data store where the data currently resides
Name | Type | Description | Required |
---|---|---|---|
connection | object | Connection holds the relevant details for accessing the data (url, table, ssl, etc.). | true |
format | string | Format represents the data format (e.g. parquet) as received from catalog connectors. | true |
vault | map[string]object | Vault holds details for retrieving credentials by the modules from the Vault store. It is a map so that different credentials can be stored for the different DataFlow operations. | true |
Blueprint.spec.modules[key].arguments.copy.source.vault[key]
Holds details for retrieving credentials from Vault store.
Name | Type | Description | Required |
---|---|---|---|
address | string | Address is the Vault address. | true |
authPath | string | AuthPath is the path to the auth method, e.g. kubernetes. | true |
role | string | Role is the Vault role used for retrieving the credentials. | true |
secretPath | string | SecretPath is the path of the secret holding the credentials in Vault. | true |
Blueprint.spec.modules[key].arguments.read[index]
ReadModuleArgs define the input parameters for modules that read data from location A
Name | Type | Description | Required |
---|---|---|---|
transformations | []object | Transformations are different types of processing that may be done to the data. | false |
assetID | string | AssetID identifies the asset to be used for accessing the data when it is ready. It is copied from the FybrikApplication resource. | true |
source | object | Source of the read path module. | true |
Blueprint.spec.modules[key].arguments.read[index].source
Source of the read path module
Name | Type | Description | Required |
---|---|---|---|
connection | object | Connection holds the relevant details for accessing the data (url, table, ssl, etc.). | true |
format | string | Format represents the data format (e.g. parquet) as received from catalog connectors. | true |
vault | map[string]object | Vault holds details for retrieving credentials by the modules from the Vault store. It is a map so that different credentials can be stored for the different DataFlow operations. | true |
Blueprint.spec.modules[key].arguments.read[index].source.vault[key]
Holds details for retrieving credentials from Vault store.
Name | Type | Description | Required |
---|---|---|---|
address | string | Address is the Vault address. | true |
authPath | string | AuthPath is the path to the auth method, e.g. kubernetes. | true |
role | string | Role is the Vault role used for retrieving the credentials. | true |
secretPath | string | SecretPath is the path of the secret holding the credentials in Vault. | true |
Blueprint.spec.modules[key].arguments.write[index]
WriteModuleArgs define the input parameters for modules that write data to location B
Name | Type | Description | Required |
---|---|---|---|
transformations | []object | Transformations are different types of processing that may be done to the data as it is written. | false |
assetID | string | AssetID identifies the asset to be used for accessing the data when it is ready. It is copied from the FybrikApplication resource. | true |
destination | object | Destination is the data store to which the data will be written. | true |
Blueprint.spec.modules[key].arguments.write[index].destination
Destination is the data store to which the data will be written
Name | Type | Description | Required |
---|---|---|---|
connection | object | Connection holds the relevant details for accessing the data (url, table, ssl, etc.). | true |
format | string | Format represents the data format (e.g. parquet) as received from catalog connectors. | true |
vault | map[string]object | Vault holds details for retrieving credentials by the modules from the Vault store. It is a map so that different credentials can be stored for the different DataFlow operations. | true |
Blueprint.spec.modules[key].arguments.write[index].destination.vault[key]
Holds details for retrieving credentials from Vault store.
Name | Type | Description | Required |
---|---|---|---|
address | string | Address is the Vault address. | true |
authPath | string | AuthPath is the path to the auth method, e.g. kubernetes. | true |
role | string | Role is the Vault role used for retrieving the credentials. | true |
secretPath | string | SecretPath is the path of the secret holding the credentials in Vault. | true |
Blueprint.spec.modules[key].chart
Chart contains the location of the Helm chart, with information detailing how to deploy it
Name | Type | Description | Required |
---|---|---|---|
chartPullSecret | string | Name of the secret containing the Helm registry credentials. | false |
values | map[string]string | Values to pass to the Helm chart installation. | false |
name | string | Name of the Helm chart. | true |
Blueprint.status
BlueprintStatus defines the observed state of Blueprint. This includes readiness, error message, and indicators for the Kubernetes resources owned by the Blueprint for cleanup and status monitoring.
Name | Type | Description | Required |
---|---|---|---|
observedGeneration | integer | ObservedGeneration is taken from the Blueprint metadata. It is used to determine during reconcile whether reconcile was called because the desired state changed, or whether the status of the allocated resources should be checked. Format: int64 | false |
observedState | object | ObservedState includes information to be reported back to the FybrikApplication resource. It includes readiness and error indications, as well as user instructions. | false |
releases | map[string]integer | Releases maps each release to the observed generation of the blueprint containing that release. At the end of reconcile, each release should be mapped to the latest blueprint version or be uninstalled. | false |
modules | map[string]object | ModulesState is a map holding the status of each module. Its key is the instance name, the unique name for the deployed instance related to this workload. | true |
Blueprint.status.observedState
ObservedState includes information to be reported back to the FybrikApplication resource. It includes readiness and error indications, as well as user instructions.
Name | Type | Description | Required |
---|---|---|---|
error | string | Error indicates that there has been an error orchestrating the modules, and provides the error message. | false |
ready | boolean | Ready indicates that the modules have been orchestrated successfully and the data is ready for usage. | false |
Blueprint.status.modules[key]
ObservedState represents a part of the generated Blueprint/Plotter resource status that allows update of FybrikApplication status
Name | Type | Description | Required |
---|---|---|---|
error | string | Error indicates that there has been an error orchestrating the modules, and provides the error message. | false |
ready | boolean | Ready indicates that the modules have been orchestrated successfully and the data is ready for usage. | false |
FybrikApplication
FybrikApplication provides information about the application being used by a Data Scientist, the nature of the processing, and the data sets that the Data Scientist has chosen for processing by the application. The FybrikApplication controller (aka pilot) obtains instructions regarding any governance related changes that must be performed on the data, identifies the modules capable of performing such changes, and finally generates the Blueprint which defines the secure runtime environment and all the components in it. This runtime environment provides the Data Scientist's application with access to the data requested in a secure manner and without having to provide any credentials for the data sets. The credentials are obtained automatically by the manager from an external credential management system, which may or may not be part of a data catalog.
Name | Type | Description | Required |
---|---|---|---|
apiVersion | string | app.fybrik.io/v1alpha1 | true |
kind | string | FybrikApplication | true |
metadata | object | Refer to the Kubernetes API documentation for the fields of the `metadata` field. | true |
spec | object | FybrikApplicationSpec defines the desired state of FybrikApplication. | false |
status | object | FybrikApplicationStatus defines the observed state of FybrikApplication. | false |
FybrikApplication.spec
FybrikApplicationSpec defines the desired state of FybrikApplication.
Name | Type | Description | Required |
---|---|---|---|
secretRef | string | SecretRef points to the secret that holds credentials for each system the user has been authenticated with. The secret is deployed in the FybrikApplication namespace. | false |
selector | object | Selector connects the resource to the application. Application labels should match the labels in the selector. For some flows the selector may not be used. | false |
appInfo | map[string]string | AppInfo contains information describing the reasons for the processing that will be done by the Data Scientist's application. | true |
data | []object | Data contains the identifiers of the data to be used by the Data Scientist's application, the protocol used to access it, and the format expected. | true |
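A minimal FybrikApplication combining these fields might be sketched as follows; the names, labels, and interface values are all illustrative and not taken from the schema:

```yaml
apiVersion: app.fybrik.io/v1alpha1
kind: FybrikApplication
metadata:
  name: my-notebook               # illustrative name
spec:
  selector:
    workloadSelector:
      matchLabels:
        app: my-notebook          # must match the user workload's labels
  appInfo:
    intent: fraud-detection       # free-form key/value pairs describing the processing
  data:
    - dataSetID: "catalog/asset-1"  # identifier from the data catalog (illustrative)
      requirements:
        interface:
          protocol: fybrik-arrow-flight   # illustrative protocol value
          dataformat: arrow               # illustrative format value
```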
FybrikApplication.spec.selector
Selector connects the resource to the application. Application labels should match the labels in the selector. For some flows the selector may not be used.
Name | Type | Description | Required |
---|---|---|---|
clusterName | string | Cluster name. | false |
workloadSelector | object | WorkloadSelector connects the resource to the application. Application labels should match the labels in the selector. | true |
FybrikApplication.spec.selector.workloadSelector
WorkloadSelector connects the resource to the application. Application labels should match the labels in the selector.
Name | Type | Description | Required |
---|---|---|---|
matchExpressions | []object | matchExpressions is a list of label selector requirements. The requirements are ANDed. | false |
matchLabels | map[string]string | matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is "key", the operator is "In", and the values array contains only "value". The requirements are ANDed. | false |
FybrikApplication.spec.selector.workloadSelector.matchExpressions[index]
A label selector requirement is a selector that contains values, a key, and an operator that relates the key and values.
Name | Type | Description | Required |
---|---|---|---|
values | []string | values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch. | false |
key | string | key is the label key that the selector applies to. | true |
operator | string | operator represents a key's relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist. | true |
FybrikApplication.spec.data[index]
DataContext indicates a dataset chosen by the Data Scientist to be used by the application, and includes information about the data format and the technologies used by the application to access the data.
Name | Type | Description | Required |
---|---|---|---|
catalogService | string | CatalogService represents the catalog service for accessing the requested dataset. If not specified, the enterprise catalog service will be used. | false |
dataSetID | string | DataSetID is a unique identifier of the dataset chosen from the data catalog for processing by the data user application. | true |
requirements | object | Requirements from the system. | true |
FybrikApplication.spec.data[index].requirements
Requirements from the system
Name | Type | Description | Required |
---|---|---|---|
copy | object | CopyRequirements include the requirements for copying the data. | false |
interface | object | Interface indicates the protocol and format expected by the data user. | true |
FybrikApplication.spec.data[index].requirements.copy
CopyRequirements include the requirements for copying the data.
Name | Type | Description | Required |
---|---|---|---|
catalog | object | Catalog indicates that the data asset must be cataloged. | false |
required | boolean | Required indicates that the data must be copied. | false |
FybrikApplication.spec.data[index].requirements.copy.catalog
Catalog indicates that the data asset must be cataloged.
Name | Type | Description | Required |
---|---|---|---|
catalogID | string | CatalogID specifies the catalog where the data will be cataloged. | false |
service | string | CatalogService specifies the data catalog service into which the data will be cataloged. | false |
FybrikApplication.spec.data[index].requirements.interface
Interface indicates the protocol and format expected by the data user
Name | Type | Description | Required |
---|---|---|---|
dataformat | string | DataFormat defines the data format type. | false |
protocol | string | Protocol defines the interface protocol used for data transactions. | true |
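Putting the requirements together, a data entry that asks for a cataloged copy could be sketched as follows (catalog ID, protocol, and format values are invented):

```yaml
requirements:
  copy:
    required: true                # the data must be copied
    catalog:
      catalogID: enterprise-catalog   # illustrative catalog identifier
  interface:
    protocol: s3                  # illustrative protocol expected by the data user
    dataformat: parquet           # illustrative data format
```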
FybrikApplication.status
FybrikApplicationStatus defines the observed state of FybrikApplication.
Name | Type | Description | Required |
---|---|---|---|
assetStates | map[string]object | AssetStates provides a status per asset. | false |
errorMessage | string | ErrorMessage indicates that an error has occurred during reconcile, unrelated to a specific asset. | false |
generated | object | Generated resource identifier. | false |
observedGeneration | integer | ObservedGeneration is taken from the FybrikApplication metadata. It is used to determine during reconcile whether reconcile was called because the desired state changed, or whether the Blueprint status changed. Format: int64 | false |
provisionedStorage | map[string]object | ProvisionedStorage maps a dataset (identified by AssetID) to the newly provisioned bucket. It allows the FybrikApplication controller to manage buckets in case the spec has been modified, an error has occurred, or a delete event has been received. ProvisionedStorage has the information required to register the dataset once the owned plotter resource is ready. | false |
ready | boolean | Ready is true if all specified assets are either ready to be used or are denied access. | false |
validApplication | string | ValidApplication indicates whether the FybrikApplication is valid given the defined taxonomy. | false |
validatedGeneration | integer | ValidatedGeneration is the version of the FybrikApplication that has been validated against the defined taxonomy. Format: int64 | false |
FybrikApplication.status.assetStates[key]
AssetState defines the observed state of an asset
Name | Type | Description | Required |
---|---|---|---|
catalogedAsset | string | CatalogedAsset provides a new asset identifier after being registered in the enterprise catalog. | false |
conditions | []object | Conditions indicate the asset state (Ready, Deny, Error). | false |
endpoint | object | Endpoint provides the endpoint spec from which the asset will be served to the application. | false |
FybrikApplication.status.assetStates[key].conditions[index]
Condition describes the state of a FybrikApplication at a certain point.
Name | Type | Description | Required |
---|---|---|---|
message | string | Message contains the details of the current condition. | false |
status | string | Status of the condition: true or false. | true |
type | string | Type of the condition. | true |
FybrikApplication.status.assetStates[key].endpoint
Endpoint provides the endpoint spec from which the asset will be served to the application
Name | Type | Description | Required |
---|---|---|---|
hostname | string | Hostname is the hostname to connect to for reaching a module-exposed service. By default this equals "{{.Release.Name}}.{{.Release.Namespace}}" of the module. Module developers can override the default behavior by providing a template that may use the ".Release.Name", ".Release.Namespace" and ".Values.labels" variables. | false |
port | integer | Format: int32 | true |
scheme | string | For example: http, https, grpc, grpc+tls, jdbc:oracle:thin:@, etc. | true |
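An asset's endpoint in the status might then look like the following sketch; the release name, namespace, port, and scheme are invented for illustration:

```yaml
endpoint:
  hostname: my-release.fybrik-blueprints   # default "{{.Release.Name}}.{{.Release.Namespace}}"
  port: 80
  scheme: grpc
```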
FybrikApplication.status.generated
Generated resource identifier
Name | Type | Description | Required |
---|---|---|---|
appVersion | integer | Version of the FybrikApplication that generated this resource. Format: int64 | true |
kind | string | Kind of the resource (Blueprint, Plotter). | true |
name | string | Name of the resource. | true |
namespace | string | Namespace of the resource. | true |
FybrikApplication.status.provisionedStorage[key]
DatasetDetails contain dataset connection and metadata required to register this dataset in the enterprise catalog
Name | Type | Description | Required |
---|---|---|---|
datasetRef | string | Reference to a Dataset resource containing the request to provision storage. | false |
details | object | Dataset information. | false |
secretRef | string | Reference to a secret where the credentials are stored. | false |
FybrikModule
FybrikModule is a description of an injectable component: the parameters it requires, as well as the specification of how to instantiate such a component. It is used as metadata only; there is no status and no reconciliation.
Name | Type | Description | Required |
---|---|---|---|
apiVersion | string | app.fybrik.io/v1alpha1 | true |
kind | string | FybrikModule | true |
metadata | object | Refer to the Kubernetes API documentation for the fields of the `metadata` field. | true |
spec | object | FybrikModuleSpec contains the information common to all modules, which are components that process, load, write, audit, or monitor the data used by the data scientist's application. | true |
FybrikModule.spec
FybrikModuleSpec contains the information common to all modules, which are components that process, load, write, audit, or monitor the data used by the data scientist's application.
Name | Type | Description | Required |
---|---|---|---|
dependencies | []object | Other components that must be installed in order for this module to work. | false |
description | string | An explanation of what this module does. | false |
pluginType | string | PluginType indicates the plugin technology used to invoke the capabilities, e.g. vault, fybrik-wasm. Should be provided if type is plugin. | false |
statusIndicators | []object | StatusIndicators allow checking the status of a non-standard resource that cannot be computed by helm/kstatus. | false |
capabilities | []object | Capabilities declares what this module knows how to do and the types of data it knows how to handle. The key to the map is a CapabilityType string. | true |
chart | object | Reference to a Helm chart that allows deployment of the resources required for this module. | true |
type | string | May be one of service, config or plugin. Service: the control plane deploys the component that performs the capability. Config: another pre-installed service performs the capability, and the deployed module configures it for the particular workload or dataset. Plugin: this module performs a capability as part of another service or module rather than as a stand-alone module. | true |
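A skeletal FybrikModule tying these fields together; the module name, chart reference, and interface values are illustrative, not prescribed by the schema:

```yaml
apiVersion: app.fybrik.io/v1alpha1
kind: FybrikModule
metadata:
  name: arrow-flight-module       # illustrative name
spec:
  type: service                   # the control plane deploys the component
  description: Serves data to the workload   # free-form explanation
  chart:
    name: ghcr.io/example/arrow-flight-chart:0.1.0   # illustrative chart location
  capabilities:
    - capability: read
      scope: asset                # the assumed default when omitted
      supportedInterfaces:
        - source:                 # read: each entry has source populated
            protocol: s3          # illustrative protocol
            dataformat: parquet   # illustrative format
```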
FybrikModule.spec.dependencies[index]
Dependency details another component on which this module relies, i.e. a prerequisite.
Name | Type | Description | Required |
---|---|---|---|
name | string | Name is the name of the dependent component. | true |
type | enum | Type provides information used in determining how to instantiate the component. Enum: module, connector, feature | true |
FybrikModule.spec.statusIndicators[index]
ResourceStatusIndicator is used to determine the status of an orchestrated resource
Name | Type | Description | Required |
---|---|---|---|
errorMessage | string | ErrorMessage specifies the resource field to check for an error, e.g. status.errorMsg. | false |
failureCondition | string | FailureCondition specifies a condition that indicates resource failure. It uses the Kubernetes label selection syntax (https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/). | false |
kind | string | Kind provides information about the resource kind. | true |
successCondition | string | SuccessCondition specifies a condition that indicates that the resource is ready. It uses the Kubernetes label selection syntax (https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/). | true |
FybrikModule.spec.capabilities[index]
Capability declares what this module knows how to do and the types of data it knows how to handle
Name | Type | Description | Required |
---|---|---|---|
actions | []object | Actions are the data transformations that the module supports. | false |
api | object | API indicates to the application how to access the capabilities provided by the module. | false |
plugins | []object | Plugins enable the module to add libraries to perform actions rather than implementing them by itself. | false |
scope | enum | Scope indicates at what level the capability is used: workload, asset, or cluster. If not indicated, it is assumed to be asset. Enum: asset, workload, cluster | false |
supportedInterfaces | []object | Copy should have one or more instances in the list, and its content should have source and sink. Read should have one or more instances in the list, each with source populated. Write should have one or more instances in the list, each with sink populated. This field may not be required if not handling data. | false |
capability | enum | Capability declares what this module knows how to do, e.g. read, write, transform. Enum: copy, read, write, transform | true |
FybrikModule.spec.capabilities[index].api
API indicates to the application how to access the capabilities provided by the module.
Name | Type | Description | Required |
---|---|---|---|
dataformat | string | DataFormat defines the data format type. | false |
endpoint | object | EndpointSpec is used both by the module creator and by the status of the FybrikApplication. | true |
protocol | string | Protocol defines the interface protocol used for data transactions. | true |
FybrikModule.spec.capabilities[index].api.endpoint
EndpointSpec is used both by the module creator and by the status of the FybrikApplication.
Name | Type | Description | Required |
---|---|---|---|
hostname | string | Hostname is the hostname to connect to for reaching a module-exposed service. By default this equals "{{.Release.Name}}.{{.Release.Namespace}}" of the module. Module developers can override the default behavior by providing a template that may use the ".Release.Name", ".Release.Namespace" and ".Values.labels" variables. | false |
port | integer | Format: int32 | true |
scheme | string | For example: http, https, grpc, grpc+tls, jdbc:oracle:thin:@, etc. | true |
FybrikModule.spec.capabilities[index].plugins[index]
Name | Type | Description | Required |
---|---|---|---|
dataFormat | string | DataFormat indicates the format of data the plugin knows how to process. | true |
pluginType | string | PluginType indicates the technology used for the module and the plugin to interact. The supported values should come from the module taxonomy. Examples of such mechanisms are vault plugins, wasm, etc. | true |
FybrikModule.spec.capabilities[index].supportedInterfaces[index]
ModuleInOut specifies the protocol and format of the data input and output by the module - if any
Name | Type | Description | Required |
---|---|---|---|
sink | object | Sink specifies the output data protocol and format. | false |
source | object | Source specifies the input data protocol and format. | false |
FybrikModule.spec.capabilities[index].supportedInterfaces[index].sink
Sink specifies the output data protocol and format
Name | Type | Description | Required |
---|---|---|---|
dataformat | string | DataFormat defines the data format type. | false |
protocol | string | Protocol defines the interface protocol used for data transactions. | true |
FybrikModule.spec.capabilities[index].supportedInterfaces[index].source
Source specifies the input data protocol and format
Name | Type | Description | Required |
---|---|---|---|
dataformat | string | DataFormat defines the data format type. | false |
protocol | string | Protocol defines the interface protocol used for data transactions. | true |
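For example, a copy capability would populate both source and sink in each supportedInterfaces entry, while read and write populate only one side. The protocol and format values below are invented for illustration:

```yaml
capabilities:
  - capability: copy
    supportedInterfaces:
      - source:                   # where the data is read from
          protocol: db2           # illustrative source protocol
          dataformat: table       # illustrative source format
        sink:                     # where the copy is written to
          protocol: s3            # illustrative sink protocol
          dataformat: parquet     # illustrative sink format
```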
FybrikModule.spec.chart
Reference to a Helm chart that allows deployment of the resources required for this module
Name | Type | Description | Required |
---|---|---|---|
chartPullSecret | string | Name of the secret containing the Helm registry credentials. | false |
values | map[string]string | Values to pass to the Helm chart installation. | false |
name | string | Name of the Helm chart. | true |
FybrikStorageAccount
FybrikStorageAccount defines a storage account used for copying data. Only S3-based storage is supported. It contains the endpoint, the regions, and a reference to the credentials. The owner of the asset is responsible for storing the credentials.
Name | Type | Description | Required |
---|---|---|---|
apiVersion | string | app.fybrik.io/v1alpha1 | true |
kind | string | FybrikStorageAccount | true |
metadata | object | Refer to the Kubernetes API documentation for the fields of the `metadata` field. | true |
spec | object | FybrikStorageAccountSpec defines the desired state of FybrikStorageAccount. | false |
status | object | FybrikStorageAccountStatus defines the observed state of FybrikStorageAccount. | false |
FybrikStorageAccount.spec
FybrikStorageAccountSpec defines the desired state of FybrikStorageAccount
Name | Type | Description | Required |
---|---|---|---|
endpoint | string |
Endpoint |
true |
regions | []string |
Regions |
true |
secretRef | string |
Name of a Kubernetes secret deployed in the control plane. The secret includes the secretKey and accessKey credentials for the S3 bucket |
true |
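Putting the fields above together, a minimal FybrikStorageAccount manifest might look as follows; the endpoint, region name, and secret name are illustrative values, not defaults:

```yaml
apiVersion: app.fybrik.io/v1alpha1
kind: FybrikStorageAccount
metadata:
  name: storage-account-sample      # hypothetical name
spec:
  endpoint: "http://s3.eu-gb.cloud-object-storage.appdomain.cloud"  # illustrative S3 endpoint
  regions:
    - eu-gb                         # illustrative region
  secretRef: bucket-credentials     # control-plane secret holding accessKey and secretKey
```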
Plotter
Plotter is the Schema for the plotters API
Name | Type | Description | Required |
---|---|---|---|
apiVersion | string | app.fybrik.io/v1alpha1 | true |
kind | string | Plotter | true |
metadata | object | Refer to the Kubernetes API documentation for the fields of the `metadata` field. | true |
spec | object |
PlotterSpec defines the desired state of Plotter, which is applied in a multi-clustered environment. Plotter declares what needs to be installed and where (as blueprints running on remote clusters), providing the Data Scientist's application with secure and governed access to the data requested in the FybrikApplication. |
false |
status | object |
PlotterStatus defines the observed state of Plotter This includes readiness, error message, and indicators received from blueprint resources owned by the Plotter for cleanup and status monitoring |
false |
Plotter.spec
PlotterSpec defines the desired state of Plotter, which is applied in a multi-clustered environment. Plotter declares what needs to be installed and where (as blueprints running on remote clusters), providing the Data Scientist's application with secure and governed access to the data requested in the FybrikApplication.
Name | Type | Description | Required |
---|---|---|---|
appSelector | object |
Selector connects the resource to the application. Application labels should match the labels in the selector. For some flows the selector may not be used. |
false |
assets | map[string]object |
Assets is a map holding information about the assets. The key is the assetID |
true |
flows | []object |
|
true |
templates | map[string]object |
Templates is a map holding the templates used in the steps of this plotter. The key is the template name |
true |
Plotter.spec.appSelector
Selector connects the resource to the application. Application labels should match the labels in the selector. For some flows the selector may not be used.
Name | Type | Description | Required |
---|---|---|---|
clusterName | string |
Cluster name |
false |
workloadSelector | object |
WorkloadSelector connects the resource to the application. Application labels should match the labels in the selector. |
true |
Plotter.spec.appSelector.workloadSelector
WorkloadSelector connects the resource to the application. Application labels should match the labels in the selector.
Name | Type | Description | Required |
---|---|---|---|
matchExpressions | []object |
matchExpressions is a list of label selector requirements. The requirements are ANDed. |
false |
matchLabels | map[string]string |
matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is "key", the operator is "In", and the values array contains only "value". The requirements are ANDed. |
false |
Plotter.spec.appSelector.workloadSelector.matchExpressions[index]
A label selector requirement is a selector that contains values, a key, and an operator that relates the key and values.
Name | Type | Description | Required |
---|---|---|---|
values | []string |
values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch. |
false |
key | string |
key is the label key that the selector applies to. |
true |
operator | string |
operator represents a key's relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist. |
true |
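As a sketch, the appSelector fields above combine like this (cluster name, label keys, and values are hypothetical; matchLabels and matchExpressions requirements are ANDed):

```yaml
appSelector:
  clusterName: cluster-us-east        # hypothetical cluster name
  workloadSelector:
    matchLabels:
      app: my-workload                # hypothetical application label
    matchExpressions:
      - key: environment              # hypothetical label key
        operator: In
        values:
          - dev
          - staging
```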
Plotter.spec.assets[key]
AssetDetails holds the details of an asset used in the FybrikApplication. In addition to assets declared in the FybrikApplication, the assets map also contains assets allocated by the control plane in order to serve the FybrikApplication.
Name | Type | Description | Required |
---|---|---|---|
advertisedAssetId | string |
AdvertisedAssetID links this asset to the corresponding asset in the FybrikApplication and is used by user-facing services |
false |
assetDetails | object |
DataStore contains the details for accessing the data, as sent by catalog connectors. Credentials for accessing the data are stored in Vault, in the location represented by the Vault property. |
true |
Plotter.spec.assets[key].assetDetails
DataStore contains the details for accessing the data, as sent by catalog connectors. Credentials for accessing the data are stored in Vault, in the location represented by the Vault property.
Name | Type | Description | Required |
---|---|---|---|
connection | object |
Connection has the relevant details for accessing the data (url, table, ssl, etc.) |
true |
format | string |
Format represents data format (e.g. parquet) as received from catalog connectors |
true |
vault | map[string]object |
Holds the details used by the modules to retrieve credentials from the Vault store. It is a map so that different credentials can be stored for the different DataFlow operations. |
true |
Plotter.spec.assets[key].assetDetails.vault[key]
Holds details for retrieving credentials from Vault store.
Name | Type | Description | Required |
---|---|---|---|
address | string |
Address is Vault address |
true |
authPath | string |
AuthPath is the path to the auth method, e.g. kubernetes |
true |
role | string |
Role is the Vault role used for retrieving the credentials |
true |
secretPath | string |
SecretPath is the path of the secret holding the Credentials in Vault |
true |
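A sketch of a vault map keyed by DataFlow operation; the "read" key, addresses, role, and secret path below are illustrative, not prescribed values:

```yaml
vault:
  read:                                     # map key per DataFlow operation (illustrative)
    address: "http://vault.fybrik-system:8200"
    authPath: "/v1/auth/kubernetes/login"   # path to the kubernetes auth method
    role: "module"                          # hypothetical Vault role
    secretPath: "/v1/secret/data/my-asset"  # hypothetical secret location
```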
Plotter.spec.flows[index]
Flows is the list of data flows driven by the FybrikApplication. Each element in the list holds the flow of the data requested in the FybrikApplication.
Name | Type | Description | Required |
---|---|---|---|
assetId | string |
AssetID indicates the data set being used in this data flow |
true |
flowType | string |
Type of the flow (e.g. read) |
true |
name | string |
Name of the flow |
true |
subFlows | []object |
|
true |
Plotter.spec.flows[index].subFlows[index]
SubFlows is a list of data flows which originate from the same data asset but are triggered differently (e.g., one upon the init trigger and one upon the workload trigger)
Name | Type | Description | Required |
---|---|---|---|
flowType | string |
Type of the flow (e.g. read) |
true |
name | string |
Name of the SubFlow |
true |
steps | [][]object |
Steps defines a series of sequential/parallel data flow steps. The first dimension represents parallel data flows; the second, sequential components within the same parallel data flow. |
true |
triggers | []enum |
Triggers |
true |
Plotter.spec.flows[index].subFlows[index].steps[index][index]
DataFlowStep contains details on a single data flow step
Name | Type | Description | Required |
---|---|---|---|
parameters | object |
Step parameters |
false |
cluster | string |
Name of the cluster this step is executed on |
true |
name | string |
Name of the step |
true |
template | string |
Template is the name of the template used to execute the step. The full details of the template can be extracted from the Plotter.spec.templates field. |
true |
Plotter.spec.flows[index].subFlows[index].steps[index][index].parameters
Step parameters
Name | Type | Description | Required |
---|---|---|---|
action | []object |
Actions are the data transformations that the module supports |
false |
api | object |
Service holds information for accessing a module instance |
false |
sink | object |
StepSink holds information about where the target data will be written: the assetID of an asset specified in the FybrikApplication, or of an asset created by the fybrik control plane |
false |
source | object |
StepSource is the source of this step: it can be an assetID or an endpoint of another step |
false |
Plotter.spec.flows[index].subFlows[index].steps[index][index].parameters.api
Service holds information for accessing a module instance
Name | Type | Description | Required |
---|---|---|---|
endpoint | object |
EndpointSpec is used both by the module creator and by the status of the fybrikapplication |
true |
format | string |
Format represents data format (e.g. parquet) as received from catalog connectors |
true |
Plotter.spec.flows[index].subFlows[index].steps[index][index].parameters.api.endpoint
EndpointSpec is used both by the module creator and by the status of the fybrikapplication
Name | Type | Description | Required |
---|---|---|---|
hostname | string |
Hostname is the hostname to connect to when accessing a module's exposed service. By default this equals "{{.Release.Name}}.{{.Release.Namespace}}" of the module. Module developers can override the default behavior by providing a template that may use the ".Release.Name", ".Release.Namespace" and ".Values.labels" variables. |
false |
port | integer |
Format: int32 |
true |
scheme | string |
For example: http, https, grpc, grpc+tls, jdbc:oracle:thin:@ etc |
true |
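A sketch of an api step parameter with its endpoint; hostname, port, scheme, and format values are illustrative:

```yaml
api:
  endpoint:
    hostname: "my-module.fybrik-blueprints"   # hypothetical exposed-service hostname
    port: 80
    scheme: grpc
  format: arrow                               # hypothetical data format
```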
Plotter.spec.flows[index].subFlows[index].steps[index][index].parameters.sink
StepSink holds information about where the target data will be written: the assetID of an asset specified in the FybrikApplication, or of an asset created by the fybrik control plane
Name | Type | Description | Required |
---|---|---|---|
assetId | string |
AssetID identifies the target asset of this step |
true |
Plotter.spec.flows[index].subFlows[index].steps[index][index].parameters.source
StepSource is the source of this step: it can be an assetID or an endpoint of another step
Name | Type | Description | Required |
---|---|---|---|
api | object |
Service holds information for accessing a module instance |
false |
assetId | string |
AssetID identifies the source asset of this step |
false |
Plotter.spec.flows[index].subFlows[index].steps[index][index].parameters.source.api
Service holds information for accessing a module instance
Name | Type | Description | Required |
---|---|---|---|
endpoint | object |
EndpointSpec is used both by the module creator and by the status of the fybrikapplication |
true |
format | string |
Format represents data format (e.g. parquet) as received from catalog connectors |
true |
Plotter.spec.flows[index].subFlows[index].steps[index][index].parameters.source.api.endpoint
EndpointSpec is used both by the module creator and by the status of the fybrikapplication
Name | Type | Description | Required |
---|---|---|---|
hostname | string |
Hostname is the hostname to connect to when accessing a module's exposed service. By default this equals "{{.Release.Name}}.{{.Release.Namespace}}" of the module. Module developers can override the default behavior by providing a template that may use the ".Release.Name", ".Release.Namespace" and ".Values.labels" variables. |
false |
port | integer |
Format: int32 |
true |
scheme | string |
For example: http, https, grpc, grpc+tls, jdbc:oracle:thin:@ etc |
true |
Plotter.spec.templates[key]
Template contains basic information about the modules required to serve the FybrikApplication, e.g., the module Helm chart name.
Name | Type | Description | Required |
---|---|---|---|
name | string |
Name of the template |
false |
modules | []object |
Modules is a list of dependent modules, e.g., if a plugin module is used then the service module it plugs into should appear first in the modules list of the same template. If the modules list contains more than one module, the first module in the list is referred to as the "primary module", to which all the parameters of this template are sent. |
true |
Plotter.spec.templates[key].modules[index]
ModuleInfo is a copy of FybrikModule Custom Resource. It contains information to instantiate resource of type FybrikModule.
Name | Type | Description | Required |
---|---|---|---|
scope | enum |
Scope indicates at what level the capability is used: workload, asset, or cluster. If not indicated, it is assumed to be asset. Enum: asset, workload, cluster |
false |
chart | object |
Chart contains the information needed to use helm to install the capability |
true |
name | string |
Name of the module |
true |
type | string |
May be one of service, config or plugin. Service: the control plane deploys the component that performs the capability. Config: another pre-installed service performs the capability, and the deployed module configures it for the particular workload or dataset. Plugin: this module performs a capability as part of another service or module rather than as a stand-alone module. |
true |
Plotter.spec.templates[key].modules[index].chart
Chart contains the information needed to use helm to install the capability
Name | Type | Description | Required |
---|---|---|---|
chartPullSecret | string |
Name of secret containing helm registry credentials |
false |
values | map[string]string |
Values to pass to helm chart installation |
false |
name | string |
Name of helm chart |
true |
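A sketch of a single-module template entry in Plotter.spec.templates; the template key, module name, and chart reference below are hypothetical:

```yaml
templates:
  read:                                  # template name (map key)
    name: read
    modules:
      - name: arrow-flight-module        # primary module: receives the template parameters
        type: service                    # deployed by the control plane
        scope: workload
        chart:
          name: ghcr.io/example/arrow-flight-module-chart:latest  # hypothetical chart reference
```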
Plotter.status
PlotterStatus defines the observed state of Plotter. This includes readiness, error message, and indicators received from blueprint resources owned by the Plotter for cleanup and status monitoring
Name | Type | Description | Required |
---|---|---|---|
assets | map[string]object |
Assets is a map containing the status per asset. The key of this map is assetId |
false |
blueprints | map[string]object |
|
false |
conditions | []object |
Conditions represent the possible error and failure conditions |
false |
flows | map[string]object |
Flows is a map containing the status of each flow. The key is the flow name |
false |
observedGeneration | integer |
ObservedGeneration is taken from the Plotter metadata. This is used to determine during reconcile whether reconcile was called because the desired state changed, or whether status of the allocated blueprints should be checked. Format: int64 |
false |
observedState | object |
ObservedState includes information to be reported back to the FybrikApplication resource. It includes readiness and error indications, as well as user instructions |
false |
readyTimestamp | string |
Format: date-time |
false |
Plotter.status.assets[key]
ObservedState represents a part of the generated Blueprint/Plotter resource status that allows update of FybrikApplication status
Name | Type | Description | Required |
---|---|---|---|
error | string |
Error indicates that an error occurred while orchestrating the modules, and provides the error message |
false |
ready | boolean |
Ready indicates that the modules have been orchestrated successfully and the data is ready for usage |
false |
Plotter.status.blueprints[key]
MetaBlueprint defines blueprint metadata (name, namespace) and status
Name | Type | Description | Required |
---|---|---|---|
name | string |
|
true |
namespace | string |
|
true |
status | object |
BlueprintStatus defines the observed state of Blueprint. This includes readiness, error message, and indicators for the Kubernetes resources owned by the Blueprint for cleanup and status monitoring |
true |
Plotter.status.blueprints[key].status
BlueprintStatus defines the observed state of Blueprint. This includes readiness, error message, and indicators for the Kubernetes resources owned by the Blueprint for cleanup and status monitoring
Name | Type | Description | Required |
---|---|---|---|
observedGeneration | integer |
ObservedGeneration is taken from the Blueprint metadata. This is used to determine during reconcile whether reconcile was called because the desired state changed, or whether status of the allocated resources should be checked. Format: int64 |
false |
observedState | object |
ObservedState includes information to be reported back to the FybrikApplication resource. It includes readiness and error indications, as well as user instructions |
false |
releases | map[string]integer |
Releases map each release to the observed generation of the blueprint containing this release. At the end of reconcile, each release should be mapped to the latest blueprint version or be uninstalled. |
false |
modules | map[string]object |
ModulesState is a map which holds the status of each module. Its key is the instance name, which is the unique name for the deployed instance related to this workload |
true |
Plotter.status.blueprints[key].status.observedState
ObservedState includes information to be reported back to the FybrikApplication resource. It includes readiness and error indications, as well as user instructions
Name | Type | Description | Required |
---|---|---|---|
error | string |
Error indicates that an error occurred while orchestrating the modules, and provides the error message |
false |
ready | boolean |
Ready indicates that the modules have been orchestrated successfully and the data is ready for usage |
false |
Plotter.status.blueprints[key].status.modules[key]
ObservedState represents a part of the generated Blueprint/Plotter resource status that allows update of FybrikApplication status
Name | Type | Description | Required |
---|---|---|---|
error | string |
Error indicates that an error occurred while orchestrating the modules, and provides the error message |
false |
ready | boolean |
Ready indicates that the modules have been orchestrated successfully and the data is ready for usage |
false |
Plotter.status.conditions[index]
Condition describes the state of a FybrikApplication at a certain point.
Name | Type | Description | Required |
---|---|---|---|
message | string |
Message contains the details of the current condition |
false |
status | string |
Status of the condition: true or false |
true |
type | string |
Type of the condition |
true |
Plotter.status.flows[key]
FlowStatus includes information to be reported back to the FybrikApplication resource. It holds the status per data flow
Name | Type | Description | Required |
---|---|---|---|
status | object |
ObservedState includes information about the current flow. It includes readiness and error indications, as well as user instructions |
false |
subFlows | map[string]object |
|
true |
Plotter.status.flows[key].status
ObservedState includes information about the current flow. It includes readiness and error indications, as well as user instructions
Name | Type | Description | Required |
---|---|---|---|
error | string |
Error indicates that an error occurred while orchestrating the modules, and provides the error message |
false |
ready | boolean |
Ready indicates that the modules have been orchestrated successfully and the data is ready for usage |
false |
Plotter.status.flows[key].subFlows[key]
ObservedState represents a part of the generated Blueprint/Plotter resource status that allows update of FybrikApplication status
Name | Type | Description | Required |
---|---|---|---|
error | string |
Error indicates that an error occurred while orchestrating the modules, and provides the error message |
false |
ready | boolean |
Ready indicates that the modules have been orchestrated successfully and the data is ready for usage |
false |
Plotter.status.observedState
ObservedState includes information to be reported back to the FybrikApplication resource. It includes readiness and error indications, as well as user instructions
Name | Type | Description | Required |
---|---|---|---|
error | string |
Error indicates that an error occurred while orchestrating the modules, and provides the error message |
false |
ready | boolean |
Ready indicates that the modules have been orchestrated successfully and the data is ready for usage |
false |
katalog.fybrik.io/v1alpha1
Resource Types:
Asset
Name | Type | Description | Required |
---|---|---|---|
apiVersion | string | katalog.fybrik.io/v1alpha1 | true |
kind | string | Asset | true |
metadata | object | Refer to the Kubernetes API documentation for the fields of the `metadata` field. | true |
spec | object |
|
true |
Asset.spec
Name | Type | Description | Required |
---|---|---|---|
assetDetails | object |
Asset details |
true |
assetMetadata | object |
|
true |
secretRef | object |
Reference to a Secret resource holding credentials for this asset |
true |
Asset.spec.assetDetails
Asset details
Name | Type | Description | Required |
---|---|---|---|
dataFormat | string |
|
false |
connection | object |
Connection information |
true |
Asset.spec.assetDetails.connection
Connection information
Name | Type | Description | Required |
---|---|---|---|
db2 | object |
|
false |
kafka | object |
|
false |
s3 | object |
Connection information for S3 compatible object store |
false |
type | enum |
Enum: s3, db2, kafka |
true |
Asset.spec.assetDetails.connection.db2
Name | Type | Description | Required |
---|---|---|---|
database | string |
|
false |
port | string |
|
false |
ssl | string |
|
false |
table | string |
|
false |
url | string |
|
false |
Asset.spec.assetDetails.connection.kafka
Name | Type | Description | Required |
---|---|---|---|
bootstrap_servers | string |
|
false |
key_deserializer | string |
|
false |
sasl_mechanism | string |
|
false |
schema_registry | string |
|
false |
security_protocol | string |
|
false |
ssl_truststore | string |
|
false |
ssl_truststore_password | string |
|
false |
topic_name | string |
|
false |
value_deserializer | string |
|
false |
Asset.spec.assetDetails.connection.s3
Connection information for S3 compatible object store
Name | Type | Description | Required |
---|---|---|---|
region | string |
|
false |
bucket | string |
|
true |
endpoint | string |
|
true |
objectKey | string |
|
true |
Asset.spec.assetMetadata
Name | Type | Description | Required |
---|---|---|---|
componentsMetadata | map[string]object |
Metadata for each component in the asset (e.g., a column) |
false |
geography | string |
|
false |
namedMetadata | map[string]string |
|
false |
owner | string |
|
false |
tags | []string |
Tags associated with the asset |
false |
Asset.spec.assetMetadata.componentsMetadata[key]
Name | Type | Description | Required |
---|---|---|---|
componentType | string |
|
false |
namedMetadata | map[string]string |
Named terms that exist in the catalog taxonomy, together with the values for these terms. For columns there will be a "SchemaDetails" key that includes the technical schema details for the column |
false |
tags | []string |
Tags - can be any free text added to a component (no taxonomy) |
false |
Asset.spec.secretRef
Reference to a Secret resource holding credentials for this asset
Name | Type | Description | Required |
---|---|---|---|
name | string |
Name of the Secret resource (must exist in the same namespace) |
true |
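Combining the Asset fields above, a minimal manifest for an S3-backed CSV object might look like this; all names, endpoints, and metadata values are illustrative:

```yaml
apiVersion: katalog.fybrik.io/v1alpha1
kind: Asset
metadata:
  name: sample-asset                      # hypothetical asset name
spec:
  secretRef:
    name: sample-asset-creds              # Secret in the same namespace
  assetDetails:
    dataFormat: csv
    connection:
      type: s3
      s3:
        endpoint: "http://s3.example.cloud"   # illustrative endpoint
        bucket: sample-bucket
        objectKey: data.csv
  assetMetadata:
    geography: theshire                   # illustrative geography
    owner: data-owner
    tags:
      - finance
```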
motion.fybrik.io/v1alpha1
Resource Types:
BatchTransfer
BatchTransfer is the Schema for the batchtransfers API
Name | Type | Description | Required |
---|---|---|---|
apiVersion | string | motion.fybrik.io/v1alpha1 | true |
kind | string | BatchTransfer | true |
metadata | object | Refer to the Kubernetes API documentation for the fields of the `metadata` field. | true |
spec | object |
BatchTransferSpec defines the state of a BatchTransfer. The state includes source/destination specification, a schedule and the means by which data movement is to be conducted. The means is given as a Kubernetes Job description. In addition, the state also contains a sketch of a transformation instruction. In future releases, the transformation description should be specified in a separate CRD. |
false |
status | object |
BatchTransferStatus defines the observed state of BatchTransfer. This includes a reference to the job that implements the movement, as well as the last schedule time. Not yet included: extended status information such as the number of records moved and technical metadata |
false |
BatchTransfer.spec
BatchTransferSpec defines the state of a BatchTransfer. The state includes source/destination specification, a schedule and the means by which data movement is to be conducted. The means is given as a Kubernetes Job description. In addition, the state also contains a sketch of a transformation instruction. In future releases, the transformation description should be specified in a separate CRD.
Name | Type | Description | Required |
---|---|---|---|
failedJobHistoryLimit | integer |
Maximum number of failed Kubernetes Job objects that should be kept. This property will be defaulted by the webhook if not set. Minimum: 0 Maximum: 20 |
false |
flowType | enum |
Data flow type that specifies if this is a stream or a batch workflow Enum: Batch, Stream |
false |
image | string |
Image that should be used for the actual batch job. This is usually a datamover image. This property will be defaulted by the webhook if not set. |
false |
imagePullPolicy | string |
Image pull policy that should be used for the actual job. This property will be defaulted by the webhook if not set. |
false |
maxFailedRetries | integer |
Maximum number of failed retries before the batch job stops trying. This property will be defaulted by the webhook if not set. Minimum: 0 Maximum: 10 |
false |
noFinalizer | boolean |
Whether this batch job instance should have a finalizer. This property will be defaulted by the webhook if not set. |
false |
readDataType | enum |
Data type of the data that is read from source (log data or change data) Enum: LogData, ChangeData |
false |
schedule | string |
Cron schedule if this BatchTransfer job should run on a regular schedule. Values are specified like cron job schedules. A good translation to human language can be found at https://crontab.guru/ |
false |
secretProviderRole | string |
Secret provider role that should be used for the actual job. This property will be defaulted by the webhook if not set. |
false |
secretProviderURL | string |
Secret provider url that should be used for the actual job. This property will be defaulted by the webhook if not set. |
false |
spark | object |
Optional Spark configuration for tuning |
false |
successfulJobHistoryLimit | integer |
Maximum number of successful Kubernetes Job objects that should be kept. This property will be defaulted by the webhook if not set. Minimum: 0 Maximum: 20 |
false |
suspend | boolean |
If this batch job instance runs on a schedule, the regular schedule can be suspended with this property. This property will be defaulted by the webhook if not set. |
false |
transformation | []object |
Transformations to be applied to the source data before writing to destination |
false |
writeDataType | enum |
Data type of how the data should be written to the target (log data or change data) Enum: LogData, ChangeData |
false |
writeOperation | enum |
Write operation that should be performed when writing (overwrite, append, update). Caution: some write operations are only available for batch and some only for stream. Enum: Overwrite, Append, Update |
false |
destination | object |
Destination data store for this batch job |
true |
source | object |
Source data store for this batch job |
true |
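As a rough sketch, a scheduled BatchTransfer copying from an S3 object to a Db2 table; the source s3 field names are assumed by analogy with similar connection schemas (the source schema is not documented above), and all endpoints and names are illustrative:

```yaml
apiVersion: motion.fybrik.io/v1alpha1
kind: BatchTransfer
metadata:
  name: sample-transfer
spec:
  flowType: Batch
  schedule: "0 2 * * *"                 # daily at 02:00, standard cron syntax
  source:
    s3:                                 # field names assumed, not documented above
      endpoint: "s3.example.cloud"
      bucket: source-bucket
      objectKey: input.parquet
      dataFormat: parquet
  destination:
    database:
      db2URL: "jdbc:db2://db2.example.com:50001/BLUDB:sslConnection=true;"  # illustrative JDBC URL
      table: MYSCHEMA.MYTABLE
```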
BatchTransfer.spec.spark
Optional Spark configuration for tuning
Name | Type | Description | Required |
---|---|---|---|
appName | string |
Name of the transaction. Mainly used for debugging and lineage tracking. |
false |
driverCores | integer |
Number of cores that the driver should use |
false |
driverMemory | integer |
Memory that the driver should have |
false |
executorCores | integer |
Number of cores that each executor should have |
false |
executorMemory | string |
Memory that each executor should have |
false |
image | string |
Image to be used for executors |
false |
imagePullPolicy | string |
Image pull policy to be used for executor |
false |
numExecutors | integer |
Number of executors to be started |
false |
options | map[string]string |
Additional options for Spark configuration. |
false |
shufflePartitions | integer |
Number of shuffle partitions for Spark |
false |
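A sketch of the optional spark tuning block; all values are illustrative, and the options key is hypothetical (note that executorMemory is a string while driverMemory is an integer in this schema):

```yaml
spark:
  appName: sample-transfer-tuning     # used for debugging and lineage tracking
  numExecutors: 4
  executorCores: 2
  executorMemory: "4g"                # string-typed in this schema
  shufflePartitions: 32
  options:                            # free-form extra Spark configuration
    spark.hadoop.fs.s3a.fast.upload: "true"   # hypothetical option key
```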
BatchTransfer.spec.transformation[index]
to be refined...
Name | Type | Description | Required |
---|---|---|---|
action | enum |
Transformation action that should be performed. Enum: RemoveColumns, EncryptColumns, DigestColumns, RedactColumns, SampleRows, FilterRows |
false |
columns | []string |
Columns that are involved in this action. This property is optional as for some actions no columns have to be specified. E.g. filter is a row based transformation. |
false |
name | string |
Name of the transaction. Mainly used for debugging and lineage tracking. |
false |
options | map[string]string |
Additional options for this transformation. |
false |
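A sketch of a transformation list combining a column-based and a row-based action; the column names and the options key are hypothetical:

```yaml
transformation:
  - name: redact-pii                  # used for debugging and lineage tracking
    action: RedactColumns
    columns:                          # hypothetical column names
      - name
      - address
  - name: sample-ten-percent
    action: SampleRows                # row-based action: no columns needed
    options:
      fraction: "0.1"                 # hypothetical option key
```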
BatchTransfer.spec.destination
Destination data store for this batch job
Name | Type | Description | Required |
---|---|---|---|
cloudant | object |
IBM Cloudant. Needs cloudant legacy credentials. |
false |
database | object |
Database data store. For the moment only Db2 is supported. |
false |
description | string |
Description of the transfer in human-readable form that is displayed in kubectl get. If not provided, this will be filled in depending on the datastore that is specified. |
false |
kafka | object |
Kafka data store. The assumed format within the given Kafka topic is a Confluent-compatible format stored as Avro. A schema registry needs to be specified as well. |
false |
s3 | object |
An object store data store that is compatible with S3. This can be a COS bucket. |
false |
BatchTransfer.spec.destination.cloudant
IBM Cloudant. Needs cloudant legacy credentials.
Name | Type | Description | Required |
---|---|---|---|
password | string |
Cloudant password. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
secretImport | string |
Define a secret import definition. |
false |
username | string |
Cloudant user. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
vault | object |
Define secrets that are fetched from a Vault instance |
false |
database | string |
Database to be read from/written to |
true |
host | string |
Host of cloudant instance |
true |
BatchTransfer.spec.destination.cloudant.vault
Define secrets that are fetched from a Vault instance
Name | Type | Description | Required |
---|---|---|---|
address | string |
Address is Vault address |
true |
authPath | string |
AuthPath is the path to the auth method, e.g. kubernetes |
true |
role | string |
Role is the Vault role used for retrieving the credentials |
true |
secretPath | string |
SecretPath is the path of the secret holding the Credentials in Vault |
true |
BatchTransfer.spec.destination.database
Database data store. For the moment only Db2 is supported.
Name | Type | Description | Required |
---|---|---|---|
password | string |
Database password. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
secretImport | string |
Define a secret import definition. |
false |
user | string |
Database user. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
vault | object |
Define secrets that are fetched from a Vault instance |
false |
db2URL | string |
URL to the Db2 instance in JDBC format. Currently supported SSL certificates are those signed with the IBM Intermediate CA or cloud-signed certificates. |
true |
table | string |
Table to be read |
true |
BatchTransfer.spec.destination.database.vault
Define secrets that are fetched from a Vault instance
Name | Type | Description | Required |
---|---|---|---|
address | string |
Address is Vault address |
true |
authPath | string |
AuthPath is the path to the auth method, e.g. kubernetes |
true |
role | string |
Role is the Vault role used for retrieving the credentials |
true |
secretPath | string |
SecretPath is the path of the secret holding the Credentials in Vault |
true |
BatchTransfer.spec.destination.kafka
Kafka data store. The assumed format within the given Kafka topic is a Confluent-compatible format stored as Avro. A schema registry needs to be specified as well.
Name | Type | Description | Required |
---|---|---|---|
createSnapshot | boolean |
Whether a snapshot of the topic should be created. Records in Kafka are stored as key-value pairs; updates/deletes for the same key are appended to the Kafka topic, and the last value for a given key is the valid value in a snapshot. When this property is true, only the last value per key is written; when false, all values are written out. As a CDC example: if true, a valid snapshot of the log stream is created; if false, the CDC stream is dumped as-is, like a change log. |
false |
dataFormat | string |
Data format of the objects, e.g. parquet or csv. Please refer to the struct for allowed values. |
false |
keyDeserializer | string |
Deserializer to be used for the keys of the topic |
false |
password | string |
Kafka user password. Can be retrieved from vault if specified in the vault parameter and is thus optional. |
false |
saslMechanism | string |
SASL mechanism to be used (e.g. PLAIN or SCRAM-SHA-512). SCRAM-SHA-512 is assumed if not specified. |
false |
schemaRegistryURL | string |
URL to the schema registry. The registry has to be Confluent schema registry compatible. |
false |
secretImport | string |
Define a secret import definition. |
false |
securityProtocol | string |
Kafka security protocol, one of PLAINTEXT, SASL_PLAINTEXT, SASL_SSL, SSL. SASL_SSL is assumed if not specified. |
false |
sslTruststore | string |
A truststore or certificate encoded as base64. The format can be JKS or PKCS12. A truststore can be specified like this or in a predefined Kubernetes secret. |
false |
sslTruststoreLocation | string |
SSL truststore location. |
false |
sslTruststorePassword | string |
SSL truststore password. |
false |
sslTruststoreSecret | string |
Kubernetes secret that contains the SSL truststore. The format can be JKS or PKCS12. A truststore can be specified like this or as a base64-encoded string in sslTruststore. |
false |
user | string |
Kafka user name. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
valueDeserializer | string |
Deserializer to be used for the values of the topic |
false |
vault | object |
Define secrets that are fetched from a Vault instance |
false |
kafkaBrokers | string |
Kafka broker URLs as a comma separated list. |
true |
kafkaTopic | string |
Kafka topic |
true |
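A Kafka store sketch combining the required fields with common optional ones; broker addresses, registry URL, and topic name are assumptions:

```yaml
kafka:
  kafkaBrokers: broker-0.example.com:9093,broker-1.example.com:9093  # comma separated list
  kafkaTopic: transfer.topic
  schemaRegistryURL: https://schema-registry.example.com  # Confluent-compatible registry
  securityProtocol: SASL_SSL    # default if not specified
  saslMechanism: SCRAM-SHA-512  # default if not specified
  createSnapshot: true          # write only the last value per key
  user: kafkauser               # optional if vault is used
  password: changeme            # optional if vault is used
```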
BatchTransfer.spec.destination.kafka.vault
Define secrets that are fetched from a Vault instance
Name | Type | Description | Required |
---|---|---|---|
address | string |
Address is the Vault address |
true |
authPath | string |
AuthPath is the path to the auth method, e.g. kubernetes |
true |
role | string |
Role is the Vault role used for retrieving the credentials |
true |
secretPath | string |
SecretPath is the path of the secret holding the credentials in Vault |
true |
BatchTransfer.spec.destination.s3
An object store data store that is compatible with S3. This can be a COS bucket.
Name | Type | Description | Required |
---|---|---|---|
accessKey | string |
Access key of the HMAC credentials that can access the given bucket. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
dataFormat | string |
Data format of the objects in S3, e.g. parquet or csv. Please refer to the struct for allowed values. |
false |
partitionBy | []string |
Defines the columns to partition the output by (for a target data store). |
false |
region | string |
Region of S3 service |
false |
secretImport | string |
Define a secret import definition. |
false |
secretKey | string |
Secret key of the HMAC credentials that can access the given bucket. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
vault | object |
Define secrets that are fetched from a Vault instance |
false |
bucket | string |
Bucket of S3 service |
true |
endpoint | string |
Endpoint of S3 service |
true |
objectKey | string |
Object key of the object in S3. This is used as a prefix: all objects that have the given objectKey as prefix will be used as input. |
true |
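An S3 store sketch; endpoint, region, bucket, and key values are illustrative:

```yaml
s3:
  endpoint: s3.eu-de.cloud-object-storage.appdomain.cloud  # hypothetical COS endpoint
  region: eu-de
  bucket: my-transfer-bucket
  objectKey: datasets/table1  # used as a prefix
  dataFormat: parquet
  partitionBy:                # only meaningful for a target data store
    - year
    - month
```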
BatchTransfer.spec.destination.s3.vault
Define secrets that are fetched from a Vault instance
Name | Type | Description | Required |
---|---|---|---|
address | string |
Address is the Vault address |
true |
authPath | string |
AuthPath is the path to the auth method, e.g. kubernetes |
true |
role | string |
Role is the Vault role used for retrieving the credentials |
true |
secretPath | string |
SecretPath is the path of the secret holding the credentials in Vault |
true |
BatchTransfer.spec.source
Source data store for this batch job
Name | Type | Description | Required |
---|---|---|---|
cloudant | object |
IBM Cloudant. Needs cloudant legacy credentials. |
false |
database | object |
Database data store. For the moment only Db2 is supported. |
false |
description | string |
Description of the transfer in human-readable form that is displayed in kubectl get. If not provided, this will be filled in depending on the datastore that is specified. |
false |
kafka | object |
Kafka data store. The assumed format within the given Kafka topic is a Confluent-compatible format stored as Avro. A schema registry needs to be specified as well. |
false |
s3 | object |
An object store data store that is compatible with S3. This can be a COS bucket. |
false |
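Exactly one of the store objects should be set per source or destination. A minimal BatchTransfer sketch reading from S3 and writing to Db2; all names, endpoints, and the apiVersion group are taken from this reference but the concrete values are hypothetical:

```yaml
apiVersion: motion.fybrik.io/v1alpha1
kind: BatchTransfer
metadata:
  name: s3-to-db2-sample
spec:
  source:
    s3:
      endpoint: s3.example.com
      bucket: input-bucket
      objectKey: datasets/input  # prefix of the input objects
      dataFormat: parquet
  destination:
    database:
      db2URL: jdbc:db2://db2.example.com:50001/BLUDB:sslConnection=true;
      table: MYSCHEMA.MYTABLE
```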
BatchTransfer.spec.source.cloudant
IBM Cloudant. Needs cloudant legacy credentials.
Name | Type | Description | Required |
---|---|---|---|
password | string |
Cloudant password. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
secretImport | string |
Define a secret import definition. |
false |
username | string |
Cloudant user. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
vault | object |
Define secrets that are fetched from a Vault instance |
false |
database | string |
Database to be read from/written to |
true |
host | string |
Host of the Cloudant instance |
true |
BatchTransfer.spec.source.cloudant.vault
Define secrets that are fetched from a Vault instance
Name | Type | Description | Required |
---|---|---|---|
address | string |
Address is the Vault address |
true |
authPath | string |
AuthPath is the path to the auth method, e.g. kubernetes |
true |
role | string |
Role is the Vault role used for retrieving the credentials |
true |
secretPath | string |
SecretPath is the path of the secret holding the credentials in Vault |
true |
BatchTransfer.spec.source.database
Database data store. For the moment only Db2 is supported.
Name | Type | Description | Required |
---|---|---|---|
password | string |
Database password. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
secretImport | string |
Define a secret import definition. |
false |
user | string |
Database user. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
vault | object |
Define secrets that are fetched from a Vault instance |
false |
db2URL | string |
URL to the Db2 instance in JDBC format. Currently supported SSL certificates are certificates signed with the IBM Intermediate CA or cloud-signed certificates. |
true |
table | string |
Table to be read |
true |
BatchTransfer.spec.source.database.vault
Define secrets that are fetched from a Vault instance
Name | Type | Description | Required |
---|---|---|---|
address | string |
Address is the Vault address |
true |
authPath | string |
AuthPath is the path to the auth method, e.g. kubernetes |
true |
role | string |
Role is the Vault role used for retrieving the credentials |
true |
secretPath | string |
SecretPath is the path of the secret holding the credentials in Vault |
true |
BatchTransfer.spec.source.kafka
Kafka data store. The assumed format within the given Kafka topic is a Confluent-compatible format stored as Avro. A schema registry needs to be specified as well.
Name | Type | Description | Required |
---|---|---|---|
createSnapshot | boolean |
Whether a snapshot of the topic should be created. Records in Kafka are stored as key-value pairs; updates/deletes for the same key are appended to the Kafka topic, and the last value for a given key is the valid value in a snapshot. When this property is true, only the last value per key is written; when false, all values are written out. As a CDC example: if true, a valid snapshot of the log stream is created; if false, the CDC stream is dumped as-is, like a change log. |
false |
dataFormat | string |
Data format of the objects, e.g. parquet or csv. Please refer to the struct for allowed values. |
false |
keyDeserializer | string |
Deserializer to be used for the keys of the topic |
false |
password | string |
Kafka user password. Can be retrieved from vault if specified in the vault parameter and is thus optional. |
false |
saslMechanism | string |
SASL mechanism to be used (e.g. PLAIN or SCRAM-SHA-512). SCRAM-SHA-512 is assumed if not specified. |
false |
schemaRegistryURL | string |
URL to the schema registry. The registry has to be Confluent schema registry compatible. |
false |
secretImport | string |
Define a secret import definition. |
false |
securityProtocol | string |
Kafka security protocol, one of PLAINTEXT, SASL_PLAINTEXT, SASL_SSL, SSL. SASL_SSL is assumed if not specified. |
false |
sslTruststore | string |
A truststore or certificate encoded as base64. The format can be JKS or PKCS12. A truststore can be specified like this or in a predefined Kubernetes secret. |
false |
sslTruststoreLocation | string |
SSL truststore location. |
false |
sslTruststorePassword | string |
SSL truststore password. |
false |
sslTruststoreSecret | string |
Kubernetes secret that contains the SSL truststore. The format can be JKS or PKCS12. A truststore can be specified like this or as a base64-encoded string in sslTruststore. |
false |
user | string |
Kafka user name. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
valueDeserializer | string |
Deserializer to be used for the values of the topic |
false |
vault | object |
Define secrets that are fetched from a Vault instance |
false |
kafkaBrokers | string |
Kafka broker URLs as a comma separated list. |
true |
kafkaTopic | string |
Kafka topic |
true |
BatchTransfer.spec.source.kafka.vault
Define secrets that are fetched from a Vault instance
Name | Type | Description | Required |
---|---|---|---|
address | string |
Address is the Vault address |
true |
authPath | string |
AuthPath is the path to the auth method, e.g. kubernetes |
true |
role | string |
Role is the Vault role used for retrieving the credentials |
true |
secretPath | string |
SecretPath is the path of the secret holding the credentials in Vault |
true |
BatchTransfer.spec.source.s3
An object store data store that is compatible with S3. This can be a COS bucket.
Name | Type | Description | Required |
---|---|---|---|
accessKey | string |
Access key of the HMAC credentials that can access the given bucket. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
dataFormat | string |
Data format of the objects in S3, e.g. parquet or csv. Please refer to the struct for allowed values. |
false |
partitionBy | []string |
Defines the columns to partition the output by (for a target data store). |
false |
region | string |
Region of S3 service |
false |
secretImport | string |
Define a secret import definition. |
false |
secretKey | string |
Secret key of the HMAC credentials that can access the given bucket. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
vault | object |
Define secrets that are fetched from a Vault instance |
false |
bucket | string |
Bucket of S3 service |
true |
endpoint | string |
Endpoint of S3 service |
true |
objectKey | string |
Object key of the object in S3. This is used as a prefix: all objects that have the given objectKey as prefix will be used as input. |
true |
BatchTransfer.spec.source.s3.vault
Define secrets that are fetched from a Vault instance
Name | Type | Description | Required |
---|---|---|---|
address | string |
Address is the Vault address |
true |
authPath | string |
AuthPath is the path to the auth method, e.g. kubernetes |
true |
role | string |
Role is the Vault role used for retrieving the credentials |
true |
secretPath | string |
SecretPath is the path of the secret holding the credentials in Vault |
true |
BatchTransfer.status
BatchTransferStatus defines the observed state of BatchTransfer. This includes a reference to the job that implements the movement as well as the last schedule time. What is missing: extended status information such as the number of records moved and technical metadata.
Name | Type | Description | Required |
---|---|---|---|
active | object |
A pointer to the currently running job (or nil) |
false |
error | string |
|
false |
lastCompleted | object |
ObjectReference contains enough information to let you inspect or modify the referred object. --- New uses of this type are discouraged because of difficulty describing its usage when embedded in APIs. 1. Ignored fields. It includes many fields which are not generally honored. For instance, ResourceVersion and FieldPath are both very rarely valid in actual usage. 2. Invalid usage help. It is impossible to add specific help for individual usage. In most embedded usages, there are particular restrictions like, "must refer only to types A and B" or "UID not honored" or "name must be restricted". Those cannot be well described when embedded. 3. Inconsistent validation. Because the usages are different, the validation rules are different by usage, which makes it hard for users to predict what will happen. 4. The fields are both imprecise and overly precise. Kind is not a precise mapping to a URL. This can produce ambiguity during interpretation and require a REST mapping. In most cases, the dependency is on the group,resource tuple and the version of the actual struct is irrelevant. 5. We cannot easily change it. Because this type is embedded in many locations, updates to this type will affect numerous schemas. Don't make new APIs embed an underspecified API type they do not control. Instead of using this type, create a locally provided and used type that is well-focused on your reference. For example, ServiceReferences for admission registration: https://github.com/kubernetes/api/blob/release-1.17/admissionregistration/v1/types.go#L533 . |
false |
lastFailed | object |
ObjectReference contains enough information to let you inspect or modify the referred object. --- New uses of this type are discouraged because of difficulty describing its usage when embedded in APIs. 1. Ignored fields. It includes many fields which are not generally honored. For instance, ResourceVersion and FieldPath are both very rarely valid in actual usage. 2. Invalid usage help. It is impossible to add specific help for individual usage. In most embedded usages, there are particular restrictions like, "must refer only to types A and B" or "UID not honored" or "name must be restricted". Those cannot be well described when embedded. 3. Inconsistent validation. Because the usages are different, the validation rules are different by usage, which makes it hard for users to predict what will happen. 4. The fields are both imprecise and overly precise. Kind is not a precise mapping to a URL. This can produce ambiguity during interpretation and require a REST mapping. In most cases, the dependency is on the group,resource tuple and the version of the actual struct is irrelevant. 5. We cannot easily change it. Because this type is embedded in many locations, updates to this type will affect numerous schemas. Don't make new APIs embed an underspecified API type they do not control. Instead of using this type, create a locally provided and used type that is well-focused on your reference. For example, ServiceReferences for admission registration: https://github.com/kubernetes/api/blob/release-1.17/admissionregistration/v1/types.go#L533 . |
false |
lastRecordTime | string |
Format: date-time |
false |
lastScheduleTime | string |
Information about the last time the job was successfully scheduled. Format: date-time |
false |
lastSuccessTime | string |
Format: date-time |
false |
numRecords | integer |
Format: int64 Minimum: 0 |
false |
status | enum |
Enum: STARTING, RUNNING, SUCCEEDED, FAILED |
false |
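For orientation, an observed status as the controller might report it; all values below are illustrative, and the object reference to a batch Job is an assumption about how the movement is implemented:

```yaml
status:
  status: SUCCEEDED
  lastScheduleTime: "2023-05-01T12:00:00Z"
  lastSuccessTime: "2023-05-01T12:03:21Z"
  numRecords: 150000
  lastCompleted:          # ObjectReference to the last completed job
    kind: Job             # hypothetical referent kind
    name: s3-to-db2-sample
    namespace: default
```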
BatchTransfer.status.active
A pointer to the currently running job (or nil)
Name | Type | Description | Required |
---|---|---|---|
apiVersion | string |
API version of the referent. |
false |
fieldPath | string |
If referring to a piece of an object instead of an entire object, this string should contain a valid JSON/Go field access statement, such as desiredState.manifest.containers[2]. For example, if the object reference is to a container within a pod, this would take on a value like: "spec.containers{name}" (where "name" refers to the name of the container that triggered the event) or if no container name is specified "spec.containers[2]" (container with index 2 in this pod). This syntax is chosen only to have some well-defined way of referencing a part of an object. TODO: this design is not final and this field is subject to change in the future. |
false |
kind | string |
Kind of the referent. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds |
false |
name | string |
Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names |
false |
namespace | string |
Namespace of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/namespaces/ |
false |
resourceVersion | string |
Specific resourceVersion to which this reference is made, if any. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#concurrency-control-and-consistency |
false |
uid | string |
UID of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#uids |
false |
BatchTransfer.status.lastCompleted
ObjectReference contains enough information to let you inspect or modify the referred object. --- New uses of this type are discouraged because of difficulty describing its usage when embedded in APIs. 1. Ignored fields. It includes many fields which are not generally honored. For instance, ResourceVersion and FieldPath are both very rarely valid in actual usage. 2. Invalid usage help. It is impossible to add specific help for individual usage. In most embedded usages, there are particular restrictions like, "must refer only to types A and B" or "UID not honored" or "name must be restricted". Those cannot be well described when embedded. 3. Inconsistent validation. Because the usages are different, the validation rules are different by usage, which makes it hard for users to predict what will happen. 4. The fields are both imprecise and overly precise. Kind is not a precise mapping to a URL. This can produce ambiguity during interpretation and require a REST mapping. In most cases, the dependency is on the group,resource tuple and the version of the actual struct is irrelevant. 5. We cannot easily change it. Because this type is embedded in many locations, updates to this type will affect numerous schemas. Don't make new APIs embed an underspecified API type they do not control. Instead of using this type, create a locally provided and used type that is well-focused on your reference. For example, ServiceReferences for admission registration: https://github.com/kubernetes/api/blob/release-1.17/admissionregistration/v1/types.go#L533 .
Name | Type | Description | Required |
---|---|---|---|
apiVersion | string |
API version of the referent. |
false |
fieldPath | string |
If referring to a piece of an object instead of an entire object, this string should contain a valid JSON/Go field access statement, such as desiredState.manifest.containers[2]. For example, if the object reference is to a container within a pod, this would take on a value like: "spec.containers{name}" (where "name" refers to the name of the container that triggered the event) or if no container name is specified "spec.containers[2]" (container with index 2 in this pod). This syntax is chosen only to have some well-defined way of referencing a part of an object. TODO: this design is not final and this field is subject to change in the future. |
false |
kind | string |
Kind of the referent. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds |
false |
name | string |
Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names |
false |
namespace | string |
Namespace of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/namespaces/ |
false |
resourceVersion | string |
Specific resourceVersion to which this reference is made, if any. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#concurrency-control-and-consistency |
false |
uid | string |
UID of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#uids |
false |
BatchTransfer.status.lastFailed
ObjectReference contains enough information to let you inspect or modify the referred object. --- New uses of this type are discouraged because of difficulty describing its usage when embedded in APIs. 1. Ignored fields. It includes many fields which are not generally honored. For instance, ResourceVersion and FieldPath are both very rarely valid in actual usage. 2. Invalid usage help. It is impossible to add specific help for individual usage. In most embedded usages, there are particular restrictions like, "must refer only to types A and B" or "UID not honored" or "name must be restricted". Those cannot be well described when embedded. 3. Inconsistent validation. Because the usages are different, the validation rules are different by usage, which makes it hard for users to predict what will happen. 4. The fields are both imprecise and overly precise. Kind is not a precise mapping to a URL. This can produce ambiguity during interpretation and require a REST mapping. In most cases, the dependency is on the group,resource tuple and the version of the actual struct is irrelevant. 5. We cannot easily change it. Because this type is embedded in many locations, updates to this type will affect numerous schemas. Don't make new APIs embed an underspecified API type they do not control. Instead of using this type, create a locally provided and used type that is well-focused on your reference. For example, ServiceReferences for admission registration: https://github.com/kubernetes/api/blob/release-1.17/admissionregistration/v1/types.go#L533 .
Name | Type | Description | Required |
---|---|---|---|
apiVersion | string |
API version of the referent. |
false |
fieldPath | string |
If referring to a piece of an object instead of an entire object, this string should contain a valid JSON/Go field access statement, such as desiredState.manifest.containers[2]. For example, if the object reference is to a container within a pod, this would take on a value like: "spec.containers{name}" (where "name" refers to the name of the container that triggered the event) or if no container name is specified "spec.containers[2]" (container with index 2 in this pod). This syntax is chosen only to have some well-defined way of referencing a part of an object. TODO: this design is not final and this field is subject to change in the future. |
false |
kind | string |
Kind of the referent. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds |
false |
name | string |
Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names |
false |
namespace | string |
Namespace of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/namespaces/ |
false |
resourceVersion | string |
Specific resourceVersion to which this reference is made, if any. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#concurrency-control-and-consistency |
false |
uid | string |
UID of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#uids |
false |
StreamTransfer
StreamTransfer is the Schema for the streamtransfers API
Name | Type | Description | Required |
---|---|---|---|
apiVersion | string | motion.fybrik.io/v1alpha1 | true |
kind | string | StreamTransfer | true |
metadata | object | Refer to the Kubernetes API documentation for the fields of the `metadata` field. | true |
spec | object |
StreamTransferSpec defines the desired state of StreamTransfer |
false |
status | object |
StreamTransferStatus defines the observed state of StreamTransfer |
false |
StreamTransfer.spec
StreamTransferSpec defines the desired state of StreamTransfer
Name | Type | Description | Required |
---|---|---|---|
flowType | enum |
Data flow type that specifies whether this is a stream or a batch workflow. Enum: Batch, Stream |
false |
image | string |
Image that should be used for the actual batch job. This is usually a datamover image. This property will be defaulted by the webhook if not set. |
false |
imagePullPolicy | string |
Image pull policy that should be used for the actual job. This property will be defaulted by the webhook if not set. |
false |
noFinalizer | boolean |
Whether this job instance should have a finalizer. This property will be defaulted by the webhook if not set. |
false |
readDataType | enum |
Data type of the data that is read from the source (log data or change data). Enum: LogData, ChangeData |
false |
secretProviderRole | string |
Secret provider role that should be used for the actual job. This property will be defaulted by the webhook if not set. |
false |
secretProviderURL | string |
Secret provider URL that should be used for the actual job. This property will be defaulted by the webhook if not set. |
false |
suspend | boolean |
If this job instance runs on a schedule, the regular schedule can be suspended with this property. This property will be defaulted by the webhook if not set. |
false |
transformation | []object |
Transformations to be applied to the source data before writing to the destination |
false |
triggerInterval | string |
Interval at which the micro-batches of this stream should be triggered. The default is '5 seconds'. |
false |
writeDataType | enum |
Data type of how the data should be written to the target (log data or change data). Enum: LogData, ChangeData |
false |
writeOperation | enum |
Write operation that should be performed when writing (overwrite, append, update). Caution: some write operations are only available for batch and some only for stream. Enum: Overwrite, Append, Update |
false |
destination | object |
Destination data store for this batch job |
true |
source | object |
Source data store for this batch job |
true |
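The fields above can be combined into a minimal StreamTransfer sketch reading from Kafka and writing micro-batches to S3; all concrete names and endpoints are hypothetical:

```yaml
apiVersion: motion.fybrik.io/v1alpha1
kind: StreamTransfer
metadata:
  name: kafka-to-s3-sample
spec:
  triggerInterval: "5 seconds"  # default micro-batch interval
  writeOperation: Append        # caution: not every operation is valid for streams
  source:
    kafka:
      kafkaBrokers: broker-0.example.com:9093
      kafkaTopic: events.topic
      schemaRegistryURL: https://schema-registry.example.com
  destination:
    s3:
      endpoint: s3.example.com
      bucket: output-bucket
      objectKey: streams/events  # used as a prefix
      dataFormat: parquet
```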
StreamTransfer.spec.transformation[index]
to be refined...
Name | Type | Description | Required |
---|---|---|---|
action | enum |
Transformation action that should be performed. Enum: RemoveColumns, EncryptColumns, DigestColumns, RedactColumns, SampleRows, FilterRows |
false |
columns | []string |
Columns that are involved in this action. This property is optional as for some actions no columns have to be specified. E.g. filter is a row based transformation. |
false |
name | string |
Name of the transformation. Mainly used for debugging and lineage tracking. |
false |
options | map[string]string |
Additional options for this transformation. |
false |
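A sketch of the transformation list; the entry names, column names, and the option key below are hypothetical:

```yaml
transformation:
  - name: remove-pii       # used for debugging and lineage tracking
    action: RemoveColumns
    columns:
      - credit_card_number
  - name: keep-recent-rows
    action: FilterRows     # row-based transformation; no columns required
    options:
      clause: "year > 2020"  # hypothetical option key
```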
StreamTransfer.spec.destination
Destination data store for this batch job
Name | Type | Description | Required |
---|---|---|---|
cloudant | object |
IBM Cloudant. Needs cloudant legacy credentials. |
false |
database | object |
Database data store. For the moment only Db2 is supported. |
false |
description | string |
Description of the transfer in human-readable form that is displayed in kubectl get. If not provided, this will be filled in depending on the datastore that is specified. |
false |
kafka | object |
Kafka data store. The assumed format within the given Kafka topic is a Confluent-compatible format stored as Avro. A schema registry needs to be specified as well. |
false |
s3 | object |
An object store data store that is compatible with S3. This can be a COS bucket. |
false |
StreamTransfer.spec.destination.cloudant
IBM Cloudant. Needs cloudant legacy credentials.
Name | Type | Description | Required |
---|---|---|---|
password | string |
Cloudant password. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
secretImport | string |
Define a secret import definition. |
false |
username | string |
Cloudant user. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
vault | object |
Define secrets that are fetched from a Vault instance |
false |
database | string |
Database to be read from/written to |
true |
host | string |
Host of the Cloudant instance |
true |
StreamTransfer.spec.destination.cloudant.vault
Define secrets that are fetched from a Vault instance
Name | Type | Description | Required |
---|---|---|---|
address | string |
Address is the Vault address |
true |
authPath | string |
AuthPath is the path to the auth method, e.g. kubernetes |
true |
role | string |
Role is the Vault role used for retrieving the credentials |
true |
secretPath | string |
SecretPath is the path of the secret holding the credentials in Vault |
true |
StreamTransfer.spec.destination.database
Database data store. For the moment only Db2 is supported.
Name | Type | Description | Required |
---|---|---|---|
password | string |
Database password. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
secretImport | string |
Define a secret import definition. |
false |
user | string |
Database user. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
vault | object |
Define secrets that are fetched from a Vault instance |
false |
db2URL | string |
URL to the Db2 instance in JDBC format. Currently supported SSL certificates are certificates signed with the IBM Intermediate CA or cloud-signed certificates. |
true |
table | string |
Table to be read |
true |
StreamTransfer.spec.destination.database.vault
Define secrets that are fetched from a Vault instance
Name | Type | Description | Required |
---|---|---|---|
address | string |
Address is the Vault address |
true |
authPath | string |
AuthPath is the path to the auth method, e.g. kubernetes |
true |
role | string |
Role is the Vault role used for retrieving the credentials |
true |
secretPath | string |
SecretPath is the path of the secret holding the credentials in Vault |
true |
StreamTransfer.spec.destination.kafka
Kafka data store. The assumed format within the given Kafka topic is a Confluent-compatible format stored as Avro. A schema registry needs to be specified as well.
Name | Type | Description | Required |
---|---|---|---|
createSnapshot | boolean |
Whether a snapshot of the topic should be created. Records in Kafka are stored as key-value pairs; updates/deletes for the same key are appended to the Kafka topic, and the last value for a given key is the valid value in a snapshot. When this property is true, only the last value per key is written; when false, all values are written out. As a CDC example: if true, a valid snapshot of the log stream is created; if false, the CDC stream is dumped as-is, like a change log. |
false |
dataFormat | string |
Data format of the objects, e.g. parquet or csv. Please refer to the struct for allowed values. |
false |
keyDeserializer | string |
Deserializer to be used for the keys of the topic |
false |
password | string |
Kafka user password. Can be retrieved from vault if specified in the vault parameter and is thus optional. |
false |
saslMechanism | string |
SASL mechanism to be used (e.g. PLAIN or SCRAM-SHA-512). SCRAM-SHA-512 is assumed if not specified. |
false |
schemaRegistryURL | string |
URL to the schema registry. The registry has to be Confluent schema registry compatible. |
false |
secretImport | string |
Define a secret import definition. |
false |
securityProtocol | string |
Kafka security protocol, one of PLAINTEXT, SASL_PLAINTEXT, SASL_SSL, SSL. SASL_SSL is assumed if not specified. |
false |
sslTruststore | string |
A truststore or certificate encoded as base64. The format can be JKS or PKCS12. A truststore can be specified like this or in a predefined Kubernetes secret. |
false |
sslTruststoreLocation | string |
SSL truststore location. |
false |
sslTruststorePassword | string |
SSL truststore password. |
false |
sslTruststoreSecret | string |
Kubernetes secret that contains the SSL truststore. The format can be JKS or PKCS12. A truststore can be specified like this or as a base64-encoded string in sslTruststore. |
false |
user | string |
Kafka user name. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
valueDeserializer | string |
Deserializer to be used for the values of the topic |
false |
vault | object |
Define secrets that are fetched from a Vault instance |
false |
kafkaBrokers | string |
Kafka broker URLs as a comma separated list. |
true |
kafkaTopic | string |
Kafka topic |
true |
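A Kafka destination block assembled from the fields above might look like this sketch; broker hosts, topic, and registry URL are placeholders, and the two protocol fields are shown at their stated defaults:

```yaml
spec:
  destination:
    kafka:
      kafkaBrokers: "broker-0.example.com:9093,broker-1.example.com:9093"
      kafkaTopic: "transactions"
      schemaRegistryURL: "https://schema-registry.example.com"
      securityProtocol: "SASL_SSL"     # default when not specified
      saslMechanism: "SCRAM-SHA-512"   # default when not specified
      user: "kafka-user"               # optional if retrieved from Vault
      password: "changeme"             # optional if retrieved from Vault
      createSnapshot: false            # write all values, not just the latest per key
```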
StreamTransfer.spec.destination.kafka.vault

Define secrets that are fetched from a Vault instance.

Name | Type | Description | Required |
---|---|---|---|
address | string | Address of the Vault instance | true |
authPath | string | Path to the auth method, e.g. kubernetes | true |
role | string | Vault role used for retrieving the credentials | true |
secretPath | string | Path of the secret holding the credentials in Vault | true |
StreamTransfer.spec.destination.s3

An object store that is compatible with S3. This can be a COS bucket.

Name | Type | Description | Required |
---|---|---|---|
accessKey | string | Access key of the HMAC credentials that can access the given bucket. Can be retrieved from Vault if specified in the vault parameter, and is thus optional. | false |
dataFormat | string | Data format of the objects in S3, e.g. parquet or csv. Please refer to the struct for allowed values. | false |
partitionBy | []string | Columns to partition the output by (for target data stores). | false |
region | string | Region of the S3 service | false |
secretImport | string | Define a secret import definition. | false |
secretKey | string | Secret key of the HMAC credentials that can access the given bucket. Can be retrieved from Vault if specified in the vault parameter, and is thus optional. | false |
vault | object | Define secrets that are fetched from a Vault instance | false |
bucket | string | Bucket of the S3 service | true |
endpoint | string | Endpoint of the S3 service | true |
objectKey | string | Object key of the object in S3. This is used as a prefix: all objects whose keys start with the given objectKey are used as input. | true |
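An S3 (COS) destination block might be sketched as follows; endpoint, bucket, and key prefix are placeholders, and partitionBy assumes the written data has year and month columns:

```yaml
spec:
  destination:
    s3:
      endpoint: "s3.eu-de.cloud-object-storage.appdomain.cloud"  # placeholder COS endpoint
      bucket: "my-bucket"
      objectKey: "transfers/output"   # used as a prefix for all written objects
      dataFormat: "parquet"
      partitionBy:                    # hypothetical partition columns
        - "year"
        - "month"
```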
StreamTransfer.spec.destination.s3.vault

Define secrets that are fetched from a Vault instance.

Name | Type | Description | Required |
---|---|---|---|
address | string | Address of the Vault instance | true |
authPath | string | Path to the auth method, e.g. kubernetes | true |
role | string | Vault role used for retrieving the credentials | true |
secretPath | string | Path of the secret holding the credentials in Vault | true |
StreamTransfer.spec.source

Source data store for this batch job.

Name | Type | Description | Required |
---|---|---|---|
cloudant | object | IBM Cloudant. Needs Cloudant legacy credentials. | false |
database | object | Database data store. Currently only Db2 is supported. | false |
description | string | Human-readable description of the transfer, displayed in kubectl get. If not provided, it is filled in based on the data store that is specified. | false |
kafka | object | Kafka data store. The expected format within the given Kafka topic is a Confluent-compatible format stored as Avro. A schema registry needs to be specified as well. | false |
s3 | object | An object store that is compatible with S3. This can be a COS bucket. | false |
StreamTransfer.spec.source.cloudant

IBM Cloudant. Needs Cloudant legacy credentials.

Name | Type | Description | Required |
---|---|---|---|
password | string | Cloudant password. Can be retrieved from Vault if specified in the vault parameter, and is thus optional. | false |
secretImport | string | Define a secret import definition. | false |
username | string | Cloudant user. Can be retrieved from Vault if specified in the vault parameter, and is thus optional. | false |
vault | object | Define secrets that are fetched from a Vault instance | false |
database | string | Database to be read from/written to | true |
host | string | Host of the Cloudant instance | true |
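A Cloudant source block using the fields above might be sketched like this; host and database names are placeholders, and the inline legacy credentials can be omitted when a vault block is given instead:

```yaml
spec:
  source:
    cloudant:
      host: "myaccount.cloudantnosqldb.appdomain.cloud"  # placeholder Cloudant host
      database: "orders"
      username: "reader"     # optional if retrieved from Vault
      password: "changeme"   # optional if retrieved from Vault
```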
StreamTransfer.spec.source.cloudant.vault

Define secrets that are fetched from a Vault instance.

Name | Type | Description | Required |
---|---|---|---|
address | string | Address of the Vault instance | true |
authPath | string | Path to the auth method, e.g. kubernetes | true |
role | string | Vault role used for retrieving the credentials | true |
secretPath | string | Path of the secret holding the credentials in Vault | true |
StreamTransfer.spec.source.database

Database data store. Currently only Db2 is supported.

Name | Type | Description | Required |
---|---|---|---|
password | string | Database password. Can be retrieved from Vault if specified in the vault parameter, and is thus optional. | false |
secretImport | string | Define a secret import definition. | false |
user | string | Database user. Can be retrieved from Vault if specified in the vault parameter, and is thus optional. | false |
vault | object | Define secrets that are fetched from a Vault instance | false |
db2URL | string | URL to the Db2 instance in JDBC format. Supported SSL certificates are currently certificates signed with the IBM Intermediate CA or cloud-signed certificates. | true |
table | string | Table to be read | true |
StreamTransfer.spec.source.database.vault

Define secrets that are fetched from a Vault instance.

Name | Type | Description | Required |
---|---|---|---|
address | string | Address of the Vault instance | true |
authPath | string | Path to the auth method, e.g. kubernetes | true |
role | string | Vault role used for retrieving the credentials | true |
secretPath | string | Path of the secret holding the credentials in Vault | true |
StreamTransfer.spec.source.kafka

Kafka data store. The expected format within the given Kafka topic is a Confluent-compatible format stored as Avro. A schema registry needs to be specified as well.

Name | Type | Description | Required |
---|---|---|---|
createSnapshot | boolean | Whether a snapshot of the topic should be created. Records in Kafka are stored as key-value pairs; updates and deletes for the same key are appended to the topic, and the last value for a given key is the valid value in a snapshot. If true, only the last value for each key is written; if false, all values are written. As a CDC example: if true, a valid snapshot of the log stream is created; if false, the CDC stream is dumped as-is, like a change log. | false |
dataFormat | string | Data format of the data, e.g. parquet or csv. Please refer to the struct for allowed values. | false |
keyDeserializer | string | Deserializer to be used for the keys of the topic | false |
password | string | Kafka user password. Can be retrieved from Vault if specified in the vault parameter, and is thus optional. | false |
saslMechanism | string | SASL mechanism to be used (e.g. PLAIN or SCRAM-SHA-512). Defaults to SCRAM-SHA-512 if not specified. | false |
schemaRegistryURL | string | URL of the schema registry. The registry has to be Confluent schema registry compatible. | false |
secretImport | string | Define a secret import definition. | false |
securityProtocol | string | Kafka security protocol, one of PLAINTEXT, SASL_PLAINTEXT, SASL_SSL, SSL. Defaults to SASL_SSL if not specified. | false |
sslTruststore | string | A truststore or certificate encoded as base64. The format can be JKS or PKCS12. A truststore can be specified this way or in a predefined Kubernetes secret. | false |
sslTruststoreLocation | string | SSL truststore location. | false |
sslTruststorePassword | string | SSL truststore password. | false |
sslTruststoreSecret | string | Kubernetes secret that contains the SSL truststore. The format can be JKS or PKCS12. A truststore can be specified this way or inline via sslTruststore. | false |
user | string | Kafka user name. Can be retrieved from Vault if specified in the vault parameter, and is thus optional. | false |
valueDeserializer | string | Deserializer to be used for the values of the topic | false |
vault | object | Define secrets that are fetched from a Vault instance | false |
kafkaBrokers | string | Kafka broker URLs as a comma-separated list. | true |
kafkaTopic | string | Kafka topic | true |
StreamTransfer.spec.source.kafka.vault

Define secrets that are fetched from a Vault instance.

Name | Type | Description | Required |
---|---|---|---|
address | string | Address of the Vault instance | true |
authPath | string | Path to the auth method, e.g. kubernetes | true |
role | string | Vault role used for retrieving the credentials | true |
secretPath | string | Path of the secret holding the credentials in Vault | true |
StreamTransfer.spec.source.s3

An object store that is compatible with S3. This can be a COS bucket.

Name | Type | Description | Required |
---|---|---|---|
accessKey | string | Access key of the HMAC credentials that can access the given bucket. Can be retrieved from Vault if specified in the vault parameter, and is thus optional. | false |
dataFormat | string | Data format of the objects in S3, e.g. parquet or csv. Please refer to the struct for allowed values. | false |
partitionBy | []string | Columns to partition the output by (for target data stores). | false |
region | string | Region of the S3 service | false |
secretImport | string | Define a secret import definition. | false |
secretKey | string | Secret key of the HMAC credentials that can access the given bucket. Can be retrieved from Vault if specified in the vault parameter, and is thus optional. | false |
vault | object | Define secrets that are fetched from a Vault instance | false |
bucket | string | Bucket of the S3 service | true |
endpoint | string | Endpoint of the S3 service | true |
objectKey | string | Object key of the object in S3. This is used as a prefix: all objects whose keys start with the given objectKey are used as input. | true |
StreamTransfer.spec.source.s3.vault

Define secrets that are fetched from a Vault instance.

Name | Type | Description | Required |
---|---|---|---|
address | string | Address of the Vault instance | true |
authPath | string | Path to the auth method, e.g. kubernetes | true |
role | string | Vault role used for retrieving the credentials | true |
secretPath | string | Path of the secret holding the credentials in Vault | true |
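Tying source and destination together, a complete resource might look like the following sketch. All names and values are illustrative, and the API group is taken from this reference's package listing; verify it against the CRDs installed in your cluster:

```yaml
apiVersion: app.fybrik.io/v1alpha1   # group as listed in this reference; verify locally
kind: StreamTransfer
metadata:
  name: kafka-to-cos                 # hypothetical resource name
spec:
  source:
    kafka:
      kafkaBrokers: "broker-0.example.com:9093"
      kafkaTopic: "transactions"
      schemaRegistryURL: "https://schema-registry.example.com"
  destination:
    s3:
      endpoint: "s3.eu-de.cloud-object-storage.appdomain.cloud"
      bucket: "my-bucket"
      objectKey: "transactions/"     # prefix for written objects
      dataFormat: "parquet"
```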
StreamTransfer.status

StreamTransferStatus defines the observed state of a StreamTransfer.

Name | Type | Description | Required |
---|---|---|---|
active | object | A pointer to the currently running job (or nil) | false |
error | string |  | false |
status | enum | Enum: STARTING, RUNNING, STOPPED, FAILING | false |
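For illustration, a controller-populated status for a healthy transfer might look like this sketch; the referenced object is assumed to be a batch Job, and all names are placeholders:

```yaml
status:
  status: RUNNING
  active:                       # object reference to the running workload
    apiVersion: batch/v1        # assumed referent type
    kind: Job
    name: kafka-to-cos-job      # hypothetical job name
    namespace: default
```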
StreamTransfer.status.active

A pointer to the currently running job (or nil).

Name | Type | Description | Required |
---|---|---|---|
apiVersion | string | API version of the referent. | false |
fieldPath | string | If referring to a piece of an object instead of an entire object, this string should contain a valid JSON/Go field access statement, such as desiredState.manifest.containers[2]. For example, if the object reference is to a container within a pod, this would take on a value like "spec.containers{name}" (where "name" refers to the name of the container that triggered the event), or if no container name is specified, "spec.containers[2]" (container with index 2 in this pod). This syntax is chosen only to have some well-defined way of referencing a part of an object. TODO: this design is not final and this field is subject to change in the future. | false |
kind | string | Kind of the referent. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds | false |
name | string | Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names | false |
namespace | string | Namespace of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/namespaces/ | false |
resourceVersion | string | Specific resourceVersion to which this reference is made, if any. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#concurrency-control-and-consistency | false |
uid | string | UID of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#uids | false |