API Reference
Packages:
app.fybrik.io/v1alpha1
Resource Types:
Blueprint
Blueprint is the Schema for the blueprints API
Name | Type | Description | Required |
---|---|---|---|
apiVersion | string | app.fybrik.io/v1alpha1 | true |
kind | string | Blueprint | true |
metadata | object | Refer to the Kubernetes API documentation for the fields of the `metadata` field. | true |
spec | object |
BlueprintSpec defines the desired state of Blueprint, the runtime environment that provides the Data Scientist's application with secure and governed access to the data requested in the FybrikApplication. The blueprint uses an Argo-like syntax that describes the components and the flow of data between them as steps. TODO: Add an indication of the communication relationships between the components. |
false |
status | object |
BlueprintStatus defines the observed state of Blueprint. This includes readiness, error message, and indicators for the Kubernetes resources owned by the Blueprint for cleanup and status monitoring. |
false |
Blueprint.spec
BlueprintSpec defines the desired state of Blueprint, the runtime environment that provides the Data Scientist's application with secure and governed access to the data requested in the FybrikApplication. The blueprint uses an Argo-like syntax that describes the components and the flow of data between them as steps. TODO: Add an indication of the communication relationships between the components.
Name | Type | Description | Required |
---|---|---|---|
entrypoint | string |
|
true |
flow | object |
DataFlow indicates the flow of the data between the components. Currently we assume this is linear and thus use steps, but more complex graphs could be defined, as is done in Argo Workflows. |
true |
templates | []object |
|
true |
Blueprint.spec.flow
DataFlow indicates the flow of the data between the components. Currently we assume this is linear and thus use steps, but more complex graphs could be defined, as is done in Argo Workflows.
Name | Type | Description | Required |
---|---|---|---|
name | string |
|
true |
steps | []object |
|
true |
Blueprint.spec.flow.steps[index]
FlowStep is one step in the flow and indicates an instance of a module in the blueprint. It includes the name of the module template (spec) and the parameters received by the component instance that is initiated by the orchestrator.
Name | Type | Description | Required |
---|---|---|---|
arguments | object |
Arguments are the input parameters for a specific instance of a module. |
false |
name | string |
Name is the name of the instance of the module. For example, if the application is named "notebook" and an implicitcopy module is deemed necessary, the FlowStep name would be notebook-implicitcopy. |
true |
template | string |
Template is the name of the specification in the Blueprint describing how to instantiate a component indicated by the module. It is the name of a FybrikModule CRD. For example: implicit-copy-db2wh-to-s3-latest |
true |
Blueprint.spec.flow.steps[index].arguments
Arguments are the input parameters for a specific instance of a module.
Name | Type | Description | Required |
---|---|---|---|
copy | object |
CopyArgs are parameters specific to modules that copy data from one data store to another. |
false |
read | []object |
ReadArgs are parameters that are specific to modules that enable an application to read data |
false |
write | []object |
WriteArgs are parameters that are specific to modules that enable an application to write data |
false |
Blueprint.spec.flow.steps[index].arguments.copy
CopyArgs are parameters specific to modules that copy data from one data store to another.
Name | Type | Description | Required |
---|---|---|---|
transformations | []object |
Transformations are different types of processing that may be done to the data as it is copied. |
false |
destination | object |
Destination is the data store to which the data will be copied |
true |
source | object |
Source is where the data currently resides |
true |
Blueprint.spec.flow.steps[index].arguments.copy.destination
Destination is the data store to which the data will be copied
Name | Type | Description | Required |
---|---|---|---|
connection | object |
Connection has the relevant details for accessing the data (url, table, ssl, etc.) |
true |
format | string |
Format represents data format (e.g. parquet) as received from catalog connectors |
true |
vault | object |
Holds details for retrieving credentials by the modules from Vault store. |
true |
Blueprint.spec.flow.steps[index].arguments.copy.destination.vault
Holds details for retrieving credentials by the modules from Vault store.
Name | Type | Description | Required |
---|---|---|---|
address | string |
Address is Vault address |
true |
authPath | string |
AuthPath is the path to the auth method, e.g. kubernetes |
true |
role | string |
Role is the Vault role used for retrieving the credentials |
true |
secretPath | string |
SecretPath is the path of the secret holding the Credentials in Vault |
true |
Blueprint.spec.flow.steps[index].arguments.copy.source
Source is where the data currently resides
Name | Type | Description | Required |
---|---|---|---|
connection | object |
Connection has the relevant details for accessing the data (url, table, ssl, etc.) |
true |
format | string |
Format represents data format (e.g. parquet) as received from catalog connectors |
true |
vault | object |
Holds details for retrieving credentials by the modules from Vault store. |
true |
Blueprint.spec.flow.steps[index].arguments.copy.source.vault
Holds details for retrieving credentials by the modules from Vault store.
Name | Type | Description | Required |
---|---|---|---|
address | string |
Address is Vault address |
true |
authPath | string |
AuthPath is the path to the auth method, e.g. kubernetes |
true |
role | string |
Role is the Vault role used for retrieving the credentials |
true |
secretPath | string |
SecretPath is the path of the secret holding the Credentials in Vault |
true |
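
The copy arguments above nest two data stores, each with its own Vault details. The following is a minimal sketch, in Python, of what such an `arguments.copy` object might look like and how its required fields could be checked. All values (URLs, role, paths) and the keys inside `connection` are hypothetical; the helper `missing_copy_fields` is illustrative and is not part of Fybrik.

```python
# Required field names are taken from the tables above.
REQUIRED_STORE_FIELDS = ("connection", "format", "vault")
REQUIRED_VAULT_FIELDS = ("address", "authPath", "role", "secretPath")


def missing_copy_fields(copy_args):
    """Return dotted paths of required fields absent from a copy arguments object."""
    missing = []
    for side in ("source", "destination"):
        store = copy_args.get(side)
        if store is None:
            missing.append(side)
            continue
        for field in REQUIRED_STORE_FIELDS:
            if field not in store:
                missing.append(f"{side}.{field}")
        vault = store.get("vault")
        if vault is not None:
            for field in REQUIRED_VAULT_FIELDS:
                if field not in vault:
                    missing.append(f"{side}.vault.{field}")
    return missing


def example_store(url, secret_path):
    """Build a data store entry; every value here is a placeholder."""
    return {
        "connection": {"url": url},  # hypothetical connection keys
        "format": "parquet",
        "vault": {
            "address": "http://vault.example:8200",
            "authPath": "kubernetes",
            "role": "example-module-role",
            "secretPath": secret_path,
        },
    }


copy_args = {
    "source": example_store("https://db.example/table", "/example/source-secret"),
    "destination": example_store("https://s3.example/bucket", "/example/dest-secret"),
    # "transformations" is optional and omitted here
}

print(missing_copy_fields(copy_args))
```

The same store shape (`connection`, `format`, `vault`) recurs for the read `source` and write `destination` arguments, so a check like this could be reused there.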
Blueprint.spec.flow.steps[index].arguments.read[index]
ReadModuleArgs define the input parameters for modules that read data from location A
Name | Type | Description | Required |
---|---|---|---|
transformations | []object |
Transformations are different types of processing that may be done to the data |
false |
assetID | string |
AssetID identifies the asset to be used for accessing the data when it is ready. It is copied from the FybrikApplication resource. |
true |
source | object |
Source of the read path module |
true |
Blueprint.spec.flow.steps[index].arguments.read[index].source
Source of the read path module
Name | Type | Description | Required |
---|---|---|---|
connection | object |
Connection has the relevant details for accessing the data (url, table, ssl, etc.) |
true |
format | string |
Format represents data format (e.g. parquet) as received from catalog connectors |
true |
vault | object |
Holds details for retrieving credentials by the modules from Vault store. |
true |
Blueprint.spec.flow.steps[index].arguments.read[index].source.vault
Holds details for retrieving credentials by the modules from Vault store.
Name | Type | Description | Required |
---|---|---|---|
address | string |
Address is Vault address |
true |
authPath | string |
AuthPath is the path to the auth method, e.g. kubernetes |
true |
role | string |
Role is the Vault role used for retrieving the credentials |
true |
secretPath | string |
SecretPath is the path of the secret holding the Credentials in Vault |
true |
Blueprint.spec.flow.steps[index].arguments.write[index]
WriteModuleArgs define the input parameters for modules that write data to location B
Name | Type | Description | Required |
---|---|---|---|
transformations | []object |
Transformations are different types of processing that may be done to the data as it is written. |
false |
destination | object |
Destination is the data store to which the data will be written |
true |
Blueprint.spec.flow.steps[index].arguments.write[index].destination
Destination is the data store to which the data will be written
Name | Type | Description | Required |
---|---|---|---|
connection | object |
Connection has the relevant details for accessing the data (url, table, ssl, etc.) |
true |
format | string |
Format represents data format (e.g. parquet) as received from catalog connectors |
true |
vault | object |
Holds details for retrieving credentials by the modules from Vault store. |
true |
Blueprint.spec.flow.steps[index].arguments.write[index].destination.vault
Holds details for retrieving credentials by the modules from Vault store.
Name | Type | Description | Required |
---|---|---|---|
address | string |
Address is Vault address |
true |
authPath | string |
AuthPath is the path to the auth method, e.g. kubernetes |
true |
role | string |
Role is the Vault role used for retrieving the credentials |
true |
secretPath | string |
SecretPath is the path of the secret holding the Credentials in Vault |
true |
Blueprint.spec.templates[index]
ComponentTemplate is a copy of a FybrikModule Custom Resource. It contains the information necessary to instantiate a component in a FlowStep, which provides the functionality described by the module. There are 3 different module types.
Name | Type | Description | Required |
---|---|---|---|
chart | object |
Chart contains the location of the helm chart with info detailing how to deploy |
true |
kind | string |
Kind of k8s resource |
true |
name | string |
Name of the template |
true |
Blueprint.spec.templates[index].chart
Chart contains the location of the helm chart with info detailing how to deploy
Name | Type | Description | Required |
---|---|---|---|
values | map[string]string |
Values to pass to helm chart installation |
false |
name | string |
Name of helm chart |
true |
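
To show how `entrypoint`, `flow`, and `templates` fit together, here is a minimal sketch of a Blueprint manifest built as a Python dictionary and serialized to JSON (Kubernetes accepts JSON manifests as well as YAML). All names and values are hypothetical. Two points are assumptions rather than documented behavior: that `entrypoint` refers to the flow name, by analogy with Argo, and that a template's `kind` would be `FybrikModule`. The helper `unresolved_templates` is illustrative only.

```python
import json

# Hypothetical names throughout ("notebook", "read-module", the chart name).
blueprint = {
    "apiVersion": "app.fybrik.io/v1alpha1",
    "kind": "Blueprint",
    "metadata": {"name": "notebook-blueprint"},
    "spec": {
        "entrypoint": "notebook-flow",  # assumption: names the flow, as in Argo
        "flow": {
            "name": "notebook-flow",
            "steps": [
                # "arguments" is optional and omitted in this step
                {"name": "notebook-read", "template": "read-module"},
            ],
        },
        "templates": [
            {
                "name": "read-module",
                "kind": "FybrikModule",  # assumption: kind of the templated resource
                "chart": {
                    "name": "example/read-module-chart",
                    "values": {"logLevel": "info"},  # map[string]string
                },
            }
        ],
    },
}


def unresolved_templates(spec):
    """Return step template names that no entry in spec['templates'] defines."""
    defined = {template["name"] for template in spec["templates"]}
    return [
        step["template"]
        for step in spec["flow"]["steps"]
        if step["template"] not in defined
    ]


manifest_json = json.dumps(blueprint, indent=2)
print(manifest_json)
print(unresolved_templates(blueprint["spec"]))
```

Each step's `template` is looked up by name in `templates`, so a step that names an undefined template shows up in the returned list.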
Blueprint.status
BlueprintStatus defines the observed state of Blueprint. This includes readiness, error message, and indicators for the Kubernetes resources owned by the Blueprint for cleanup and status monitoring.
Name | Type | Description | Required |
---|---|---|---|
observedGeneration | integer |
ObservedGeneration is taken from the Blueprint metadata. This is used to determine during reconcile whether reconcile was called because the desired state changed, or whether status of the allocated resources should be checked. Format: int64 |
false |
observedState | object |
ObservedState includes information to be reported back to the FybrikApplication resource. It includes readiness and error indications, as well as user instructions. |
false |
releases | map[string]integer |
Releases map each release to the observed generation of the blueprint containing this release. At the end of reconcile, each release should be mapped to the latest blueprint version or be uninstalled. |
false |
Blueprint.status.observedState
ObservedState includes information to be reported back to the FybrikApplication resource. It includes readiness and error indications, as well as user instructions.
Name | Type | Description | Required |
---|---|---|---|
dataAccessInstructions | string |
DataAccessInstructions indicate how the data user or their application may access the data. Instructions are available upon successful orchestration. |
false |
error | string |
Error indicates that an error occurred while orchestrating the modules, and provides the error message |
false |
ready | boolean |
Ready represents that the modules have been orchestrated successfully and the data is ready for usage |
false |
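
The status fields above can be read together. The sketch below is one possible interpretation, not the controller's actual logic: it assumes the common Kubernetes convention of comparing `metadata.generation` with `status.observedGeneration`, and it treats a release as stale when it is mapped to an older generation. The function names are hypothetical.

```python
def reconcile_reason(blueprint):
    """Guess why a reconcile was triggered, based on observedGeneration."""
    generation = blueprint["metadata"]["generation"]
    observed = blueprint.get("status", {}).get("observedGeneration")
    # A mismatch means the desired state changed since the last reconcile.
    return "spec-changed" if observed != generation else "check-resources"


def stale_releases(blueprint):
    """Return releases not yet mapped to the current blueprint generation."""
    generation = blueprint["metadata"]["generation"]
    releases = blueprint.get("status", {}).get("releases", {})
    return sorted(name for name, seen in releases.items() if seen != generation)


def summarize(status):
    """Collapse observedState into a single word or an error string."""
    state = status.get("observedState", {})
    if state.get("error"):
        return "error: " + state["error"]
    if state.get("ready"):
        return "ready"
    return "pending"


# Hypothetical Blueprint with one release left over from generation 2.
example = {
    "metadata": {"generation": 3},
    "status": {
        "observedGeneration": 3,
        "releases": {"notebook-read": 3, "notebook-copy": 2},
        "observedState": {"ready": True},
    },
}

print(reconcile_reason(example), stale_releases(example), summarize(example["status"]))
```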
FybrikApplication
FybrikApplication provides information about the application being used by a Data Scientist, the nature of the processing, and the data sets that the Data Scientist has chosen for processing by the application. The FybrikApplication controller (aka pilot) obtains instructions regarding any governance related changes that must be performed on the data, identifies the modules capable of performing such changes, and finally generates the Blueprint which defines the secure runtime environment and all the components in it. This runtime environment provides the Data Scientist's application with access to the data requested in a secure manner and without having to provide any credentials for the data sets. The credentials are obtained automatically by the manager from an external credential management system, which may or may not be part of a data catalog.
Name | Type | Description | Required |
---|---|---|---|
apiVersion | string | app.fybrik.io/v1alpha1 | true |
kind | string | FybrikApplication | true |
metadata | object | Refer to the Kubernetes API documentation for the fields of the `metadata` field. | true |
spec | object |
FybrikApplicationSpec defines the desired state of FybrikApplication. |
false |
status | object |
FybrikApplicationStatus defines the observed state of FybrikApplication. |
false |
FybrikApplication.spec
FybrikApplicationSpec defines the desired state of FybrikApplication.
Name | Type | Description | Required |
---|---|---|---|
secretRef | string |
SecretRef points to the secret that holds credentials for each system the user has been authenticated with. The secret is deployed in FybrikApplication namespace. |
false |
selector | object |
Selector connects the resource to the application. Application labels should match the labels in the selector. For some flows the selector may not be used. |
false |
appInfo | map[string]string |
AppInfo contains information describing the reasons for the processing that will be done by the Data Scientist's application. |
true |
data | []object |
Data contains the identifiers of the data to be used by the Data Scientist's application, and the protocol used to access it and the format expected. |
true |
FybrikApplication.spec.selector
Selector connects the resource to the application. Application labels should match the labels in the selector. For some flows the selector may not be used.
Name | Type | Description | Required |
---|---|---|---|
clusterName | string |
Cluster name |
false |
workloadSelector | object |
WorkloadSelector connects the resource to the application. Application labels should match the labels in the selector. |
true |
FybrikApplication.spec.selector.workloadSelector
WorkloadSelector connects the resource to the application. Application labels should match the labels in the selector.
Name | Type | Description | Required |
---|---|---|---|
matchExpressions | []object |
matchExpressions is a list of label selector requirements. The requirements are ANDed. |
false |
matchLabels | map[string]string |
matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is "key", the operator is "In", and the values array contains only "value". The requirements are ANDed. |
false |
FybrikApplication.spec.selector.workloadSelector.matchExpressions[index]
A label selector requirement is a selector that contains values, a key, and an operator that relates the key and values.
Name | Type | Description | Required |
---|---|---|---|
values | []string |
values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch. |
false |
key | string |
key is the label key that the selector applies to. |
true |
operator | string |
operator represents a key's relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist. |
true |
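
The matching rules for `matchLabels` and `matchExpressions` can be sketched in a few lines of Python. This is an illustrative re-implementation of the standard Kubernetes label selector semantics described above, not code taken from Fybrik; the function name `selector_matches` and the sample labels are hypothetical.

```python
def selector_matches(selector, labels):
    """Return True if a workload's labels satisfy a workloadSelector.

    All matchLabels entries and all matchExpressions requirements are ANDed.
    """
    for key, value in selector.get("matchLabels", {}).items():
        if labels.get(key) != value:
            return False
    for requirement in selector.get("matchExpressions", []):
        key = requirement["key"]
        operator = requirement["operator"]
        values = requirement.get("values", [])
        if operator == "In":
            ok = key in labels and labels[key] in values
        elif operator == "NotIn":
            # A missing key also satisfies NotIn.
            ok = key not in labels or labels[key] not in values
        elif operator == "Exists":
            ok = key in labels
        elif operator == "DoesNotExist":
            ok = key not in labels
        else:
            raise ValueError(f"unknown operator: {operator}")
        if not ok:
            return False
    return True


# Hypothetical selector and workload labels.
workload_selector = {
    "matchLabels": {"app": "notebook"},
    "matchExpressions": [
        {"key": "tier", "operator": "In", "values": ["dev", "test"]},
        {"key": "deprecated", "operator": "DoesNotExist"},
    ],
}

print(selector_matches(workload_selector, {"app": "notebook", "tier": "dev"}))
```

Writing `matchLabels: {app: notebook}` is equivalent to a `matchExpressions` entry with key `app`, operator `In`, and values `["notebook"]`.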
FybrikApplication.spec.data[index]
DataContext indicates the data set chosen by the Data Scientist to be used by their application, and includes information about the data format and technologies used by the application to access the data.
Name | Type | Description | Required |
---|---|---|---|
catalogService | string |
CatalogService represents the catalog service for accessing the requested dataset. If not specified, the enterprise catalog service will be used. |
false |
dataSetID | string |
DataSetID is a unique identifier of the dataset chosen from the data catalog for processing by the data user application. |
true |
requirements | object |
Requirements from the system |
true |
FybrikApplication.spec.data[index].requirements
Requirements from the system
Name | Type | Description | Required |
---|---|---|---|
copy | object |
CopyRequirements include the requirements for copying the data |
false |
interface | object |
Interface indicates the protocol and format expected by the data user |
true |
FybrikApplication.spec.data[index].requirements.copy
CopyRequirements include the requirements for copying the data
Name | Type | Description | Required |
---|---|---|---|
catalog | object |
Catalog indicates that the data asset must be cataloged. |
false |
required | boolean |
Required indicates that the data must be copied. |
false |
FybrikApplication.spec.data[index].requirements.copy.catalog
Catalog indicates that the data asset must be cataloged.
Name | Type | Description | Required |
---|---|---|---|
catalogID | string |
CatalogID specifies the catalog where the data will be cataloged. |
false |
service | string |
CatalogService specifies the data catalog service that will be used for cataloging the data. |
false |
FybrikApplication.spec.data[index].requirements.interface
Interface indicates the protocol and format expected by the data user
Name | Type | Description | Required |
---|---|---|---|
dataformat | string |
DataFormat defines the data format type |
false |
protocol | string |
Protocol defines the interface protocol used for data transactions |
true |
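
Putting the FybrikApplication.spec tables together, a manifest might look like the following sketch, written as a Python dictionary. Every value is a placeholder: the `appInfo` key, the dataset identifier, and the `protocol` and `dataformat` strings are assumptions, since the valid values are not listed in this reference. The helper `missing_required` is hypothetical and only checks the fields marked as required above.

```python
import json

application = {
    "apiVersion": "app.fybrik.io/v1alpha1",
    "kind": "FybrikApplication",
    "metadata": {"name": "notebook", "namespace": "default"},
    "spec": {
        # Optional: connects this resource to workloads labelled app=notebook.
        "selector": {"workloadSelector": {"matchLabels": {"app": "notebook"}}},
        "appInfo": {"intent": "example-purpose"},  # hypothetical key and value
        "data": [
            {
                "dataSetID": "example-catalog/example-dataset",  # placeholder
                "requirements": {
                    "interface": {"protocol": "s3", "dataformat": "parquet"},
                },
            }
        ],
    },
}


def missing_required(spec):
    """Return dotted paths of required FybrikApplication.spec fields that are absent."""
    missing = [field for field in ("appInfo", "data") if field not in spec]
    for index, item in enumerate(spec.get("data", [])):
        if "dataSetID" not in item:
            missing.append(f"data[{index}].dataSetID")
        requirements = item.get("requirements")
        if requirements is None:
            missing.append(f"data[{index}].requirements")
        elif "protocol" not in requirements.get("interface", {}):
            missing.append(f"data[{index}].requirements.interface.protocol")
    return missing


print(json.dumps(application, indent=2))
print(missing_required(application["spec"]))
```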
FybrikApplication.status
FybrikApplicationStatus defines the observed state of FybrikApplication.
Name | Type | Description | Required |
---|---|---|---|
catalogedAssets | map[string]string |
CatalogedAssets provide the new asset identifiers after being registered in the enterprise catalog. It maps the original asset id to the cataloged asset id. |
false |
conditions | []object |
Conditions represent the possible error and failure conditions |
false |
dataAccessInstructions | string |
DataAccessInstructions indicate how the data user or their application may access the data. Instructions are available upon successful orchestration. |
false |
generated | object |
Generated resource identifier |
false |
observedGeneration | integer |
ObservedGeneration is taken from the FybrikApplication metadata. This is used to determine during reconcile whether reconcile was called because the desired state changed, or whether the Blueprint status changed. Format: int64 |
false |
provisionedStorage | map[string]object |
ProvisionedStorage maps a dataset (identified by AssetID) to the new provisioned bucket. It allows FybrikApplication controller to manage buckets in case the spec has been modified, an error has occurred, or a delete event has been received. ProvisionedStorage has the information required to register the dataset once the owned plotter resource is ready |
false |
readEndpointsMap | map[string]object |
ReadEndpointsMap maps a datasetID (after parsing from JSON to a string with dashes) to the endpoint spec from which the asset will be served to the application |
false |
ready | boolean |
Ready is true if a blueprint has been successfully orchestrated |
false |
FybrikApplication.status.conditions[index]
Condition describes the state of a FybrikApplication at a certain point.
Name | Type | Description | Required |
---|---|---|---|
message | string |
Message contains the details of the current condition |
false |
status | string |
Status of the condition: true or false |
true |
type | string |
Type of the condition |
true |
FybrikApplication.status.generated
Generated resource identifier
Name | Type | Description | Required |
---|---|---|---|
appVersion | integer |
Version of FybrikApplication that has generated this resource Format: int64 |
true |
kind | string |
Kind of the resource (Blueprint, Plotter) |
true |
name | string |
Name of the resource |
true |
namespace | string |
Namespace of the resource |
true |
FybrikApplication.status.provisionedStorage[key]
DatasetDetails contain dataset connection and metadata required to register this dataset in the enterprise catalog
Name | Type | Description | Required |
---|---|---|---|
datasetRef | string |
Reference to a Dataset resource containing the request to provision storage |
false |
details | object |
Dataset information |
false |
secretRef | string |
Reference to a secret where the credentials are stored |
false |
FybrikApplication.status.readEndpointsMap[key]
EndpointSpec is used both by the module creator and by the status of the fybrikapplication
Name | Type | Description | Required |
---|---|---|---|
hostname | string |
Always equals the release name. Can be omitted. |
false |
port | integer |
Format: int32 |
true |
scheme | string |
For example: http, https, grpc, grpc+tls, jdbc:oracle:thin:@ etc |
true |
FybrikModule
FybrikModule is a description of an injectable component: the parameters it requires, as well as the specification of how to instantiate such a component. It is used as metadata only. There is no status nor reconciliation.
Name | Type | Description | Required |
---|---|---|---|
apiVersion | string | app.fybrik.io/v1alpha1 | true |
kind | string | FybrikModule | true |
metadata | object | Refer to the Kubernetes API documentation for the fields of the `metadata` field. | true |
spec | object |
FybrikModuleSpec contains the info common to all modules, which are one of the components that process, load, write, audit, monitor the data used by the data scientist's application. |
true |
FybrikModule.spec
FybrikModuleSpec contains the info common to all modules, which are one of the components that process, load, write, audit, monitor the data used by the data scientist's application.
Name | Type | Description | Required |
---|---|---|---|
dependencies | []object |
Other components that must be installed in order for this module to work |
false |
statusIndicators | []object |
StatusIndicators allow checking the status of a non-standard resource that cannot be computed by helm/kstatus |
false |
capabilities | object |
Capabilities declares what this module knows how to do and the types of data it knows how to handle |
true |
chart | object |
Reference to a Helm chart that allows deployment of the resources required for this module |
true |
flows | []enum |
Flows is a list of the types of capabilities supported by the module - copy, read, write |
true |
FybrikModule.spec.dependencies[index]
Dependency details another component on which this module relies, i.e. a prerequisite
Name | Type | Description | Required |
---|---|---|---|
name | string |
Name is the name of the dependent component |
true |
type | enum |
Type provides information used in determining how to instantiate the component. Enum: module, connector, feature |
true |
FybrikModule.spec.statusIndicators[index]
ResourceStatusIndicator is used to determine the status of an orchestrated resource
Name | Type | Description | Required |
---|---|---|---|
errorMessage | string |
ErrorMessage specifies the resource field to check for an error, e.g. status.errorMsg |
false |
failureCondition | string |
FailureCondition specifies a condition that indicates the resource failure. It uses Kubernetes label selection syntax (https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/) |
false |
kind | string |
Kind provides information about the resource kind |
true |
successCondition | string |
SuccessCondition specifies a condition that indicates that the resource is ready. It uses Kubernetes label selection syntax (https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/) |
true |
FybrikModule.spec.capabilities
Capabilities declares what this module knows how to do and the types of data it knows how to handle
Name | Type | Description | Required |
---|---|---|---|
actions | []object |
Actions are the data transformations that the module supports |
false |
api | object |
API indicates to the application how to access/write the data |
false |
supportedInterfaces | []object |
Copy should have one or more instances in the list, and its content should have source and sink. Read should have one or more instances in the list, each with source populated. Write should have one or more instances in the list, each with sink populated. TODO: In the future, if we have a module type that doesn't interface directly with data, then this list could be empty. |
true |
FybrikModule.spec.capabilities.actions[index]
SupportedAction declares an action that the module supports (action identifier and its scope)
Name | Type | Description | Required |
---|---|---|---|
id | string |
|
false |
level | integer |
Format: int32 |
false |
FybrikModule.spec.capabilities.api
API indicates to the application how to access/write the data
Name | Type | Description | Required |
---|---|---|---|
dataformat | string |
DataFormat defines the data format type |
false |
endpoint | object |
EndpointSpec is used both by the module creator and by the status of the fybrikapplication |
true |
protocol | string |
Protocol defines the interface protocol used for data transactions |
true |
FybrikModule.spec.capabilities.api.endpoint
EndpointSpec is used both by the module creator and by the status of the fybrikapplication
Name | Type | Description | Required |
---|---|---|---|
hostname | string |
Always equals the release name. Can be omitted. |
false |
port | integer |
Format: int32 |
true |
scheme | string |
For example: http, https, grpc, grpc+tls, jdbc:oracle:thin:@ etc |
true |
FybrikModule.spec.capabilities.supportedInterfaces[index]
ModuleInOut specifies the protocol and format of the data input and output by the module - if any
Name | Type | Description | Required |
---|---|---|---|
sink | object |
Sink specifies the output data protocol and format |
false |
source | object |
Source specifies the input data protocol and format |
false |
flow | enum |
Flow for which this interface is supported. Enum: copy, read, write |
true |
FybrikModule.spec.capabilities.supportedInterfaces[index].sink
Sink specifies the output data protocol and format
Name | Type | Description | Required |
---|---|---|---|
dataformat | string |
DataFormat defines the data format type |
false |
protocol | string |
Protocol defines the interface protocol used for data transactions |
true |
FybrikModule.spec.capabilities.supportedInterfaces[index].source
Source specifies the input data protocol and format
Name | Type | Description | Required |
---|---|---|---|
dataformat | string |
DataFormat defines the data format type |
false |
protocol | string |
Protocol defines the interface protocol used for data transactions |
true |
FybrikModule.spec.chart
Reference to a Helm chart that allows deployment of the resources required for this module
Name | Type | Description | Required |
---|---|---|---|
values | map[string]string |
Values to pass to helm chart installation |
false |
name | string |
Name of helm chart |
true |
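
The rule stated for `supportedInterfaces` (copy entries need both source and sink, read entries need source, write entries need sink) can be sketched as a small check over a FybrikModule spec. The module below is a hypothetical example with placeholder protocol, format, and chart values, and `interface_problems` is an illustrative helper, not part of Fybrik.

```python
# Which sides each flow type must populate, per the supportedInterfaces description.
REQUIRED_SIDES = {"copy": ("source", "sink"), "read": ("source",), "write": ("sink",)}


def interface_problems(spec):
    """Return human-readable problems with a module's supportedInterfaces."""
    problems = []
    interfaces = spec["capabilities"]["supportedInterfaces"]
    for flow in spec["flows"]:
        entries = [entry for entry in interfaces if entry["flow"] == flow]
        if not entries:
            problems.append(f"{flow}: no supportedInterfaces entry")
        for entry in entries:
            for side in REQUIRED_SIDES[flow]:
                if side not in entry:
                    problems.append(f"{flow}: entry is missing {side}")
    return problems


# Hypothetical copy module; all values are placeholders.
module_spec = {
    "flows": ["copy"],
    "chart": {"name": "example/copy-module-chart"},
    "capabilities": {
        "supportedInterfaces": [
            {
                "flow": "copy",
                "source": {"protocol": "jdbc-db2", "dataformat": "table"},
                "sink": {"protocol": "s3", "dataformat": "parquet"},
            }
        ],
    },
}

print(interface_problems(module_spec))
```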
FybrikStorageAccount
FybrikStorageAccount defines a storage account used for copying data. Only S3-based storage is supported. It contains the endpoint, region, and a reference to the credentials. The owner of the asset is responsible for storing the credentials.
Name | Type | Description | Required |
---|---|---|---|
apiVersion | string | app.fybrik.io/v1alpha1 | true |
kind | string | FybrikStorageAccount | true |
metadata | object | Refer to the Kubernetes API documentation for the fields of the `metadata` field. | true |
spec | object |
FybrikStorageAccountSpec defines the desired state of FybrikStorageAccount |
false |
status | object |
FybrikStorageAccountStatus defines the observed state of FybrikStorageAccount |
false |
FybrikStorageAccount.spec
FybrikStorageAccountSpec defines the desired state of FybrikStorageAccount
Name | Type | Description | Required |
---|---|---|---|
endpoint | string |
Endpoint |
true |
regions | []string |
Regions |
true |
secretRef | string |
The name of a k8s secret deployed in the control plane. This secret includes the secretKey and accessKey credentials for the S3 bucket |
true |
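
As a minimal sketch, a FybrikStorageAccount and the secret it references might be written as the following Python dictionaries. The endpoint, region, and names are placeholders. The use of `accessKey` and `secretKey` as literal secret keys is an assumption based on the `secretRef` description above.

```python
import json

# Secret holding the S3 credentials; values are placeholders.
credentials_secret = {
    "apiVersion": "v1",
    "kind": "Secret",
    "metadata": {"name": "example-bucket-credentials"},
    "stringData": {"accessKey": "<access key>", "secretKey": "<secret key>"},
}

storage_account = {
    "apiVersion": "app.fybrik.io/v1alpha1",
    "kind": "FybrikStorageAccount",
    "metadata": {"name": "example-storage-account"},
    "spec": {
        "endpoint": "http://s3.example.com",  # placeholder S3 endpoint
        "regions": ["example-region"],        # placeholder region list
        # secretRef is the name of the secret defined above
        "secretRef": credentials_secret["metadata"]["name"],
    },
}

print(json.dumps([credentials_secret, storage_account], indent=2))
```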
Plotter
Plotter is the Schema for the plotters API
Name | Type | Description | Required |
---|---|---|---|
apiVersion | string | app.fybrik.io/v1alpha1 | true |
kind | string | Plotter | true |
metadata | object | Refer to the Kubernetes API documentation for the fields of the `metadata` field. | true |
spec | object |
PlotterSpec defines the desired state of Plotter, which is applied in a multi-clustered environment. Plotter installs the runtime environment (as blueprints running on remote clusters) which provides the Data Scientist's application with secure and governed access to the data requested in the FybrikApplication. |
false |
status | object |
PlotterStatus defines the observed state of Plotter. This includes readiness, error message, and indicators received from blueprint resources owned by the Plotter for cleanup and status monitoring. |
false |
Plotter.spec
PlotterSpec defines the desired state of Plotter, which is applied in a multi-clustered environment. Plotter installs the runtime environment (as blueprints running on remote clusters) which provides the Data Scientist's application with secure and governed access to the data requested in the FybrikApplication.
Name | Type | Description | Required |
---|---|---|---|
blueprints | map[string]object |
Blueprints structure represents remote blueprints mapped by the identifier of a cluster in which they will be running |
true |
selector | object |
Selector connects the resource to the application. It should match the selector of the owner, the FybrikApplication CRD. |
true |
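
The two Plotter.spec fields can be illustrated with a short sketch that assembles a Plotter from an owning FybrikApplication and a set of per-cluster blueprint specs. This is an assumption about how the pieces relate, not the controller's implementation: the function `plotter_for`, the `-plotter` naming, and the cluster names are all hypothetical, and "match" is interpreted here as copying the owner's selector unchanged.

```python
import copy


def plotter_for(application, blueprints_by_cluster):
    """Build a Plotter manifest whose selector mirrors the owning application."""
    return {
        "apiVersion": "app.fybrik.io/v1alpha1",
        "kind": "Plotter",
        "metadata": {"name": application["metadata"]["name"] + "-plotter"},
        "spec": {
            # Keys are cluster identifiers, values are BlueprintSpec objects.
            "blueprints": dict(blueprints_by_cluster),
            "selector": copy.deepcopy(application["spec"]["selector"]),
        },
    }


# Hypothetical owner and a single remote blueprint spec.
owner = {
    "metadata": {"name": "notebook"},
    "spec": {"selector": {"workloadSelector": {"matchLabels": {"app": "notebook"}}}},
}
remote_blueprint = {
    "entrypoint": "notebook-flow",
    "flow": {"name": "notebook-flow", "steps": []},
    "templates": [],
}

plotter = plotter_for(owner, {"cluster-east": remote_blueprint})
print(sorted(plotter["spec"]["blueprints"]))
```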
Plotter.spec.blueprints[key]
BlueprintSpec defines the desired state of Blueprint, the runtime environment that provides the Data Scientist's application with secure and governed access to the data requested in the FybrikApplication. The blueprint uses an Argo-like syntax that describes the components and the flow of data between them as steps. TODO: Add an indication of the communication relationships between the components.
Name | Type | Description | Required |
---|---|---|---|
entrypoint | string |
|
true |
flow | object |
DataFlow indicates the flow of the data between the components. Currently we assume this is linear and thus use steps, but more complex graphs could be defined, as is done in Argo Workflows. |
true |
templates | []object |
|
true |
Plotter.spec.blueprints[key].flow
DataFlow indicates the flow of the data between the components. Currently we assume this is linear and thus use steps, but more complex graphs could be defined, as is done in Argo Workflows.
Name | Type | Description | Required |
---|---|---|---|
name | string |
|
true |
steps | []object |
|
true |
Plotter.spec.blueprints[key].flow.steps[index]
FlowStep is one step in the flow and indicates an instance of a module in the blueprint. It includes the name of the module template (spec) and the parameters received by the component instance that is initiated by the orchestrator.
Name | Type | Description | Required |
---|---|---|---|
arguments | object |
Arguments are the input parameters for a specific instance of a module. |
false |
name | string |
Name is the name of the instance of the module. For example, if the application is named "notebook" and an implicitcopy module is deemed necessary, the FlowStep name would be notebook-implicitcopy. |
true |
template | string |
Template is the name of the specification in the Blueprint describing how to instantiate a component indicated by the module. It is the name of a FybrikModule CRD. For example: implicit-copy-db2wh-to-s3-latest |
true |
Plotter.spec.blueprints[key].flow.steps[index].arguments
Arguments are the input parameters for a specific instance of a module.
Name | Type | Description | Required |
---|---|---|---|
copy | object |
CopyArgs are parameters specific to modules that copy data from one data store to another. |
false |
read | []object |
ReadArgs are parameters that are specific to modules that enable an application to read data |
false |
write | []object |
WriteArgs are parameters that are specific to modules that enable an application to write data |
false |
Plotter.spec.blueprints[key].flow.steps[index].arguments.copy
CopyArgs are parameters specific to modules that copy data from one data store to another.
Name | Type | Description | Required |
---|---|---|---|
transformations | []object |
Transformations are different types of processing that may be done to the data as it is copied. |
false |
destination | object |
Destination is the data store to which the data will be copied |
true |
source | object |
Source is where the data currently resides |
true |
Plotter.spec.blueprints[key].flow.steps[index].arguments.copy.destination
Destination is the data store to which the data will be copied
Name | Type | Description | Required |
---|---|---|---|
connection | object |
Connection has the relevant details for accessing the data (url, table, ssl, etc.) |
true |
format | string |
Format represents data format (e.g. parquet) as received from catalog connectors |
true |
vault | object |
Holds details for retrieving credentials by the modules from Vault store. |
true |
Plotter.spec.blueprints[key].flow.steps[index].arguments.copy.destination.vault
Holds details for retrieving credentials by the modules from Vault store.
Name | Type | Description | Required |
---|---|---|---|
address | string |
Address is Vault address |
true |
authPath | string |
AuthPath is the path to auth method i.e. kubernetes |
true |
role | string |
Role is the Vault role used for retrieving the credentials |
true |
secretPath | string |
SecretPath is the path of the secret holding the Credentials in Vault |
true |
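The `vault` object has the same four required fields wherever it appears in this reference. The sketch below shows one way to check such an object before using it. Every value is a placeholder chosen for illustration, and `check_vault` is a hypothetical helper, not a Fybrik function.

```python
# The four fields of a `vault` object, all required per the table above.
VAULT_FIELDS = ("address", "authPath", "role", "secretPath")


def check_vault(vault: dict) -> None:
    """Raise ValueError if a required field is missing or empty (hypothetical helper)."""
    missing = [field for field in VAULT_FIELDS if not vault.get(field)]
    if missing:
        raise ValueError("missing vault fields: " + ", ".join(missing))


# Placeholder values for illustration only; none come from a real deployment.
vault = {
    "address": "http://vault.example.com:8200",
    "authPath": "/example/auth/path",
    "role": "example-role",
    "secretPath": "/example/secret/path",
}

check_vault(vault)  # Passes silently when all four fields are set.
```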
Plotter.spec.blueprints[key].flow.steps[index].arguments.copy.source
Source is where the data currently resides
Name | Type | Description | Required |
---|---|---|---|
connection | object |
Connection has the relevant details for accessing the data (url, table, ssl, etc.) |
true |
format | string |
Format represents data format (e.g. parquet) as received from catalog connectors |
true |
vault | object |
Holds details for retrieving credentials by the modules from Vault store. |
true |
Plotter.spec.blueprints[key].flow.steps[index].arguments.copy.source.vault
Holds details for retrieving credentials by the modules from Vault store.
Name | Type | Description | Required |
---|---|---|---|
address | string |
Address is Vault address |
true |
authPath | string |
AuthPath is the path to auth method i.e. kubernetes |
true |
role | string |
Role is the Vault role used for retrieving the credentials |
true |
secretPath | string |
SecretPath is the path of the secret holding the Credentials in Vault |
true |
Plotter.spec.blueprints[key].flow.steps[index].arguments.read[index]
ReadModuleArgs define the input parameters for modules that read data from location A
Name | Type | Description | Required |
---|---|---|---|
transformations | []object |
Transformations are different types of processing that may be done to the data |
false |
assetID | string |
AssetID identifies the asset to be used for accessing the data when it is ready. It is copied from the FybrikApplication resource |
true |
source | object |
Source of the read path module |
true |
Plotter.spec.blueprints[key].flow.steps[index].arguments.read[index].source
Source of the read path module
Name | Type | Description | Required |
---|---|---|---|
connection | object |
Connection has the relevant details for accessing the data (url, table, ssl, etc.) |
true |
format | string |
Format represents data format (e.g. parquet) as received from catalog connectors |
true |
vault | object |
Holds details for retrieving credentials by the modules from Vault store. |
true |
Plotter.spec.blueprints[key].flow.steps[index].arguments.read[index].source.vault
Holds details for retrieving credentials by the modules from Vault store.
Name | Type | Description | Required |
---|---|---|---|
address | string |
Address is Vault address |
true |
authPath | string |
AuthPath is the path to auth method i.e. kubernetes |
true |
role | string |
Role is the Vault role used for retrieving the credentials |
true |
secretPath | string |
SecretPath is the path of the secret holding the Credentials in Vault |
true |
Plotter.spec.blueprints[key].flow.steps[index].arguments.write[index]
WriteModuleArgs define the input parameters for modules that write data to location B
Name | Type | Description | Required |
---|---|---|---|
transformations | []object |
Transformations are different types of processing that may be done to the data as it is written. |
false |
destination | object |
Destination is the data store to which the data will be written |
true |
Plotter.spec.blueprints[key].flow.steps[index].arguments.write[index].destination
Destination is the data store to which the data will be written
Name | Type | Description | Required |
---|---|---|---|
connection | object |
Connection has the relevant details for accessing the data (url, table, ssl, etc.) |
true |
format | string |
Format represents data format (e.g. parquet) as received from catalog connectors |
true |
vault | object |
Holds details for retrieving credentials by the modules from Vault store. |
true |
Plotter.spec.blueprints[key].flow.steps[index].arguments.write[index].destination.vault
Holds details for retrieving credentials by the modules from Vault store.
Name | Type | Description | Required |
---|---|---|---|
address | string |
Address is Vault address |
true |
authPath | string |
AuthPath is the path to auth method i.e. kubernetes |
true |
role | string |
Role is the Vault role used for retrieving the credentials |
true |
secretPath | string |
SecretPath is the path of the secret holding the Credentials in Vault |
true |
Plotter.spec.blueprints[key].templates[index]
ComponentTemplate is a copy of a FybrikModule Custom Resource. It contains the information necessary to instantiate a component in a FlowStep, which provides the functionality described by the module. There are 3 different module types.
Name | Type | Description | Required |
---|---|---|---|
chart | object |
Chart contains the location of the helm chart with info detailing how to deploy |
true |
kind | string |
Kind of k8s resource |
true |
name | string |
Name of the template |
true |
Plotter.spec.blueprints[key].templates[index].chart
Chart contains the location of the helm chart with info detailing how to deploy
Name | Type | Description | Required |
---|---|---|---|
values | map[string]string |
Values to pass to helm chart installation |
false |
name | string |
Name of helm chart |
true |
Plotter.spec.selector
Selector connects the resource to the application. It should match the selector of the owner, the FybrikApplication CRD.
Name | Type | Description | Required |
---|---|---|---|
matchExpressions | []object |
matchExpressions is a list of label selector requirements. The requirements are ANDed. |
false |
matchLabels | map[string]string |
matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is "key", the operator is "In", and the values array contains only "value". The requirements are ANDed. |
false |
Plotter.spec.selector.matchExpressions[index]
A label selector requirement is a selector that contains values, a key, and an operator that relates the key and values.
Name | Type | Description | Required |
---|---|---|---|
values | []string |
values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch. |
false |
key | string |
key is the label key that the selector applies to. |
true |
operator | string |
operator represents a key's relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist. |
true |
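The selector semantics described above (ANDed requirements, and `matchLabels` entries acting as single-value `In` expressions) can be sketched in Python as follows. This is an illustrative reimplementation, not the Kubernetes code; the function name `matches` and the label keys and values are assumptions made for the example.

```python
def matches(selector: dict, labels: dict) -> bool:
    """Return True if `labels` satisfy every requirement in `selector`."""
    # Each matchLabels pair behaves like an "In" expression with one value.
    for key, value in selector.get("matchLabels", {}).items():
        if labels.get(key) != value:
            return False
    for req in selector.get("matchExpressions", []):
        key, op = req["key"], req["operator"]
        values = req.get("values", [])
        if op == "In":
            ok = key in labels and labels[key] in values
        elif op == "NotIn":
            # A missing key also satisfies NotIn.
            ok = key not in labels or labels[key] not in values
        elif op == "Exists":
            ok = key in labels
        elif op == "DoesNotExist":
            ok = key not in labels
        else:
            raise ValueError(f"unknown operator: {op}")
        if not ok:
            return False
    return True


# Hypothetical selector and labels.
selector = {
    "matchLabels": {"app": "notebook"},
    "matchExpressions": [{"key": "tier", "operator": "NotIn", "values": ["test"]}],
}
print(matches(selector, {"app": "notebook", "tier": "prod"}))
```

Because all requirements are ANDed, the function returns as soon as one of them fails.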
Plotter.status
PlotterStatus defines the observed state of Plotter. This includes readiness, error message, and indicators received from blueprint resources owned by the Plotter for cleanup and status monitoring
Name | Type | Description | Required |
---|---|---|---|
blueprints | map[string]object |
|
false |
observedGeneration | integer |
ObservedGeneration is taken from the Plotter metadata. This is used to determine during reconcile whether reconcile was called because the desired state changed, or whether status of the allocated blueprints should be checked. Format: int64 |
false |
observedState | object |
ObservedState includes information to be reported back to the FybrikApplication resource. It includes readiness and error indications, as well as user instructions |
false |
readyTimestamp | string |
Format: date-time |
false |
Plotter.status.blueprints[key]
MetaBlueprint defines blueprint metadata (name, namespace) and status
Name | Type | Description | Required |
---|---|---|---|
name | string |
|
true |
namespace | string |
|
true |
status | object |
BlueprintStatus defines the observed state of Blueprint. This includes readiness, error message, and indicators for the Kubernetes resources owned by the Blueprint for cleanup and status monitoring |
true |
Plotter.status.blueprints[key].status
BlueprintStatus defines the observed state of Blueprint. This includes readiness, error message, and indicators for the Kubernetes resources owned by the Blueprint for cleanup and status monitoring
Name | Type | Description | Required |
---|---|---|---|
observedGeneration | integer |
ObservedGeneration is taken from the Blueprint metadata. This is used to determine during reconcile whether reconcile was called because the desired state changed, or whether status of the allocated resources should be checked. Format: int64 |
false |
observedState | object |
ObservedState includes information to be reported back to the FybrikApplication resource. It includes readiness and error indications, as well as user instructions |
false |
releases | map[string]integer |
Releases map each release to the observed generation of the blueprint containing this release. At the end of reconcile, each release should be mapped to the latest blueprint version or be uninstalled. |
false |
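One way to read the `releases` map is that any release still recorded against an older generation than `observedGeneration` is no longer part of the latest blueprint. The Python sketch below lists such releases under that assumption. The function `stale_releases`, the release names, and the generation numbers are all hypothetical.

```python
def stale_releases(status: dict) -> list:
    """Return release names whose recorded generation lags `observedGeneration`."""
    latest = status.get("observedGeneration", 0)
    releases = status.get("releases", {})
    return sorted(name for name, gen in releases.items() if gen < latest)


# Hypothetical status; release names and generations are made up.
status = {
    "observedGeneration": 3,
    "releases": {"copy-release": 3, "read-release": 2},
}

print(stale_releases(status))  # ['read-release']
```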
Plotter.status.blueprints[key].status.observedState
ObservedState includes information to be reported back to the FybrikApplication resource. It includes readiness and error indications, as well as user instructions
Name | Type | Description | Required |
---|---|---|---|
dataAccessInstructions | string |
DataAccessInstructions indicate how the data user or their application may access the data. Instructions are available upon successful orchestration. |
false |
error | string |
Error indicates that an error occurred while orchestrating the modules and provides the error message |
false |
ready | boolean |
Ready represents that the modules have been orchestrated successfully and the data is ready for usage |
false |
Plotter.status.observedState
ObservedState includes information to be reported back to the FybrikApplication resource. It includes readiness and error indications, as well as user instructions
Name | Type | Description | Required |
---|---|---|---|
dataAccessInstructions | string |
DataAccessInstructions indicate how the data user or their application may access the data. Instructions are available upon successful orchestration. |
false |
error | string |
Error indicates that an error occurred while orchestrating the modules and provides the error message |
false |
ready | boolean |
Ready represents that the modules have been orchestrated successfully and the data is ready for usage |
false |
katalog.fybrik.io/v1alpha1
Resource Types:
Asset
Name | Type | Description | Required |
---|---|---|---|
apiVersion | string | katalog.fybrik.io/v1alpha1 | true |
kind | string | Asset | true |
metadata | object | Refer to the Kubernetes API documentation for the fields of the `metadata` field. | true |
spec | object |
|
true |
Asset.spec
Name | Type | Description | Required |
---|---|---|---|
assetDetails | object |
Asset details |
true |
assetMetadata | object |
|
true |
secretRef | object |
Reference to a Secret resource holding credentials for this asset |
true |
Asset.spec.assetDetails
Asset details
Name | Type | Description | Required |
---|---|---|---|
dataFormat | string |
|
false |
connection | object |
Connection information |
true |
Asset.spec.assetDetails.connection
Connection information
Name | Type | Description | Required |
---|---|---|---|
db2 | object |
|
false |
kafka | object |
|
false |
s3 | object |
Connection information for S3 compatible object store |
false |
type | enum |
Enum: s3, db2, kafka |
true |
Asset.spec.assetDetails.connection.db2
Name | Type | Description | Required |
---|---|---|---|
database | string |
|
false |
port | string |
|
false |
ssl | string |
|
false |
table | string |
|
false |
url | string |
|
false |
Asset.spec.assetDetails.connection.kafka
Name | Type | Description | Required |
---|---|---|---|
bootstrap_servers | string |
|
false |
key_deserializer | string |
|
false |
sasl_mechanism | string |
|
false |
schema_registry | string |
|
false |
security_protocol | string |
|
false |
ssl_truststore | string |
|
false |
ssl_truststore_password | string |
|
false |
topic_name | string |
|
false |
value_deserializer | string |
|
false |
Asset.spec.assetDetails.connection.s3
Connection information for S3 compatible object store
Name | Type | Description | Required |
---|---|---|---|
region | string |
|
false |
bucket | string |
|
true |
endpoint | string |
|
true |
objectKey | string |
|
true |
Asset.spec.assetMetadata
Name | Type | Description | Required |
---|---|---|---|
componentsMetadata | map[string]object |
metadata for each component in asset (e.g., column) |
false |
geography | string |
|
false |
namedMetadata | map[string]string |
|
false |
owner | string |
|
false |
tags | []string |
Tags associated with the asset |
false |
Asset.spec.assetMetadata.componentsMetadata[key]
Name | Type | Description | Required |
---|---|---|---|
componentType | string |
|
false |
namedMetadata | map[string]string |
Named terms that exist in the Catalog taxonomy, and the values for these terms. For columns we will have a "SchemaDetails" key that will include technical schema details for this column. TODO: Consider creating a special field for schema outside of metadata |
false |
tags | []string |
Tags - can be any free text added to a component (no taxonomy) |
false |
Asset.spec.secretRef
Reference to a Secret resource holding credentials for this asset
Name | Type | Description | Required |
---|---|---|---|
name | string |
Name of the Secret resource (must exist in the same namespace) |
true |
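To show how the Asset fields fit together, the following Python sketch assembles an Asset for an S3 compatible object store and checks that `connection.type` is one of the documented enum values. All names, endpoints, and keys are placeholders. Requiring a block that matches the declared type is an assumption of this sketch, since the tables only mark `type` as required.

```python
import json

CONNECTION_TYPES = ("s3", "db2", "kafka")  # Enum values of connection.type


def check_connection(connection: dict) -> None:
    """Check `type` against the enum and require the matching block (assumption)."""
    ctype = connection.get("type")
    if ctype not in CONNECTION_TYPES:
        raise ValueError(f"unsupported connection type: {ctype!r}")
    if ctype not in connection:
        raise ValueError(f"connection.{ctype} is missing")


# Placeholder values throughout; only the field names come from the tables.
asset = {
    "apiVersion": "katalog.fybrik.io/v1alpha1",
    "kind": "Asset",
    "metadata": {"name": "example-asset"},
    "spec": {
        "secretRef": {"name": "example-credentials"},
        "assetDetails": {
            "dataFormat": "parquet",
            "connection": {
                "type": "s3",
                "s3": {
                    "endpoint": "s3.example.com",
                    "bucket": "example-bucket",
                    "objectKey": "data/example.parquet",
                },
            },
        },
        "assetMetadata": {"tags": ["example"]},
    },
}

check_connection(asset["spec"]["assetDetails"]["connection"])
print(json.dumps(asset, indent=2))
```

The JSON printed at the end can be saved to a file and passed to `kubectl apply -f`, which accepts JSON as well as YAML manifests.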
motion.fybrik.io/v1alpha1
Resource Types:
BatchTransfer
BatchTransfer is the Schema for the batchtransfers API
Name | Type | Description | Required |
---|---|---|---|
apiVersion | string | motion.fybrik.io/v1alpha1 | true |
kind | string | BatchTransfer | true |
metadata | object | Refer to the Kubernetes API documentation for the fields of the `metadata` field. | true |
spec | object |
BatchTransferSpec defines the state of a BatchTransfer. The state includes source/destination specification, a schedule and the means by which data movement is to be conducted. The means is given as a kubernetes job description. In addition, the state also contains a sketch of a transformation instruction. In future releases, the transformation description should be specified in a separate CRD. |
false |
status | object |
BatchTransferStatus defines the observed state of BatchTransfer. This includes a reference to the job that implements the movement as well as the last schedule time. What is missing: extended status information such as the number of records moved and technical metadata |
false |
BatchTransfer.spec
BatchTransferSpec defines the state of a BatchTransfer. The state includes source/destination specification, a schedule and the means by which data movement is to be conducted. The means is given as a kubernetes job description. In addition, the state also contains a sketch of a transformation instruction. In future releases, the transformation description should be specified in a separate CRD.
Name | Type | Description | Required |
---|---|---|---|
failedJobHistoryLimit | integer |
Maximal number of failed Kubernetes job objects that should be kept. This property will be defaulted by the webhook if not set. Minimum: 0 Maximum: 20 |
false |
flowType | enum |
Data flow type that specifies if this is a stream or a batch workflow Enum: Batch, Stream |
false |
image | string |
Image that should be used for the actual batch job. This is usually a datamover image. This property will be defaulted by the webhook if not set. |
false |
imagePullPolicy | string |
Image pull policy that should be used for the actual job. This property will be defaulted by the webhook if not set. |
false |
maxFailedRetries | integer |
Maximal number of failed retries until the batch job should stop trying. This property will be defaulted by the webhook if not set. Minimum: 0 Maximum: 10 |
false |
noFinalizer | boolean |
If this batch job instance should have a finalizer or not. This property will be defaulted by the webhook if not set. |
false |
readDataType | enum |
Data type of the data that is read from source (log data or change data) Enum: LogData, ChangeData |
false |
schedule | string |
Cron schedule if this BatchTransfer job should run on a regular schedule. Values are specified like cron job schedules. A human-readable translation of a schedule can be found at https://crontab.guru/ |
false |
secretProviderRole | string |
Secret provider role that should be used for the actual job. This property will be defaulted by the webhook if not set. |
false |
secretProviderURL | string |
Secret provider url that should be used for the actual job. This property will be defaulted by the webhook if not set. |
false |
spark | object |
Optional Spark configuration for tuning |
false |
successfulJobHistoryLimit | integer |
Maximal number of successful Kubernetes job objects that should be kept. This property will be defaulted by the webhook if not set. Minimum: 0 Maximum: 20 |
false |
suspend | boolean |
If this batch job instance is run on a schedule the regular schedule can be suspended with this property. This property will be defaulted by the webhook if not set. |
false |
transformation | []object |
Transformations to be applied to the source data before writing to destination |
false |
writeDataType | enum |
Data type of how the data should be written to the target (log data or change data) Enum: LogData, ChangeData |
false |
writeOperation | enum |
Write operation that should be performed when writing (overwrite, append, update). Caution: Some write operations are only available for batch and some only for stream. Enum: Overwrite, Append, Update |
false |
destination | object |
Destination data store for this batch job |
true |
source | object |
Source data store for this batch job |
true |
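The bounds and enum values listed in the table can be checked before a BatchTransfer is submitted. The Python sketch below is a hypothetical client-side check, not the validation that Fybrik itself performs. The function `validate_spec` and all values in the example spec are assumptions.

```python
# Integer bounds and enum values copied from the table above.
BOUNDS = {
    "failedJobHistoryLimit": (0, 20),
    "successfulJobHistoryLimit": (0, 20),
    "maxFailedRetries": (0, 10),
}
ENUMS = {
    "flowType": {"Batch", "Stream"},
    "readDataType": {"LogData", "ChangeData"},
    "writeDataType": {"LogData", "ChangeData"},
    "writeOperation": {"Overwrite", "Append", "Update"},
}


def validate_spec(spec: dict) -> list:
    """Return a list of problems found in a BatchTransfer spec (hypothetical helper)."""
    errors = []
    for field in ("source", "destination"):
        if field not in spec:
            errors.append(f"{field} is required")
    for field, (low, high) in BOUNDS.items():
        if field in spec and not low <= spec[field] <= high:
            errors.append(f"{field} must be between {low} and {high}")
    for field, allowed in ENUMS.items():
        if field in spec and spec[field] not in allowed:
            errors.append(f"{field} must be one of {sorted(allowed)}")
    return errors


# Placeholder spec for illustration only.
spec = {
    "source": {"database": {"db2URL": "jdbc:db2://db2.example.com:50000/EXAMPLE",
                            "table": "EXAMPLE.TABLE"}},
    "destination": {"s3": {"endpoint": "s3.example.com",
                           "bucket": "example-bucket",
                           "objectKey": "example/"}},
    "flowType": "Batch",
    "writeOperation": "Overwrite",
    "maxFailedRetries": 3,
}

print(validate_spec(spec))  # []
```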
BatchTransfer.spec.spark
Optional Spark configuration for tuning
Name | Type | Description | Required |
---|---|---|---|
appName | string |
Name of the transaction. Mainly used for debugging and lineage tracking. |
false |
driverCores | integer |
Number of cores that the driver should use |
false |
driverMemory | integer |
Memory that the driver should have |
false |
executorCores | integer |
Number of cores that each executor should have |
false |
executorMemory | string |
Memory that each executor should have |
false |
image | string |
Image to be used for executors |
false |
imagePullPolicy | string |
Image pull policy to be used for executor |
false |
numExecutors | integer |
Number of executors to be started |
false |
options | map[string]string |
Additional options for Spark configuration. |
false |
shufflePartitions | integer |
Number of shuffle partitions for Spark |
false |
BatchTransfer.spec.transformation[index]
to be refined...
Name | Type | Description | Required |
---|---|---|---|
action | enum |
Transformation action that should be performed. Enum: RemoveColumns, EncryptColumns, DigestColumns, RedactColumns, SampleRows, FilterRows |
false |
columns | []string |
Columns that are involved in this action. This property is optional as for some actions no columns have to be specified. E.g. filter is a row based transformation. |
false |
name | string |
Name of the transaction. Mainly used for debugging and lineage tracking. |
false |
options | map[string]string |
Additional options for this transformation. |
false |
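A transformation list is an ordered sequence of entries, each naming an action from the enum above. The following Python sketch shows two entries and a check that flags actions outside the enum. The transformation names, column names, and the helper `unknown_actions` are hypothetical.

```python
# Enum values of `action`, copied from the table above.
ACTIONS = {
    "RemoveColumns", "EncryptColumns", "DigestColumns",
    "RedactColumns", "SampleRows", "FilterRows",
}


def unknown_actions(transformations: list) -> list:
    """Return the actions that are set but not part of the enum (hypothetical helper)."""
    return [t["action"] for t in transformations
            if "action" in t and t["action"] not in ACTIONS]


# Hypothetical transformations; names and columns are made up.
transformations = [
    {"name": "drop-identifiers", "action": "RemoveColumns", "columns": ["ssn", "email"]},
    {"name": "redact-names", "action": "RedactColumns", "columns": ["name"]},
]

print(unknown_actions(transformations))  # []
```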
BatchTransfer.spec.destination
Destination data store for this batch job
Name | Type | Description | Required |
---|---|---|---|
cloudant | object |
IBM Cloudant. Needs cloudant legacy credentials. |
false |
database | object |
Database data store. For the moment only Db2 is supported. |
false |
description | string |
Description of the transfer in human-readable form that is displayed in the output of kubectl get. If not provided, this will be filled in depending on the datastore that is specified. |
false |
kafka | object |
Kafka data store. The expected format within the given Kafka topic is a Confluent compatible format stored as Avro. A schema registry needs to be specified as well. |
false |
s3 | object |
An object store data store that is compatible with S3. This can be a COS bucket. |
false |
BatchTransfer.spec.destination.cloudant
IBM Cloudant. Needs cloudant legacy credentials.
Name | Type | Description | Required |
---|---|---|---|
password | string |
Cloudant password. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
secretImport | string |
Define a secret import definition. |
false |
username | string |
Cloudant user. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
vault | object |
Define secrets that are fetched from a Vault instance |
false |
database | string |
Database to be read from/written to |
true |
host | string |
Host of cloudant instance |
true |
BatchTransfer.spec.destination.cloudant.vault
Define secrets that are fetched from a Vault instance
Name | Type | Description | Required |
---|---|---|---|
address | string |
Address is Vault address |
true |
authPath | string |
AuthPath is the path to auth method i.e. kubernetes |
true |
role | string |
Role is the Vault role used for retrieving the credentials |
true |
secretPath | string |
SecretPath is the path of the secret holding the Credentials in Vault |
true |
BatchTransfer.spec.destination.database
Database data store. For the moment only Db2 is supported.
Name | Type | Description | Required |
---|---|---|---|
password | string |
Database password. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
secretImport | string |
Define a secret import definition. |
false |
user | string |
Database user. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
vault | object |
Define secrets that are fetched from a Vault instance |
false |
db2URL | string |
URL to Db2 instance in JDBC format. Supported SSL certificates are currently certificates signed with IBM Intermediate CA or cloud signed certificates. |
true |
table | string |
Table to be read |
true |
BatchTransfer.spec.destination.database.vault
Define secrets that are fetched from a Vault instance
Name | Type | Description | Required |
---|---|---|---|
address | string |
Address is Vault address |
true |
authPath | string |
AuthPath is the path to auth method i.e. kubernetes |
true |
role | string |
Role is the Vault role used for retrieving the credentials |
true |
secretPath | string |
SecretPath is the path of the secret holding the Credentials in Vault |
true |
BatchTransfer.spec.destination.kafka
Kafka data store. The expected format within the given Kafka topic is a Confluent compatible format stored as Avro. A schema registry needs to be specified as well.
Name | Type | Description | Required |
---|---|---|---|
createSnapshot | boolean |
Whether a snapshot of the topic should be created. Records in Kafka are stored as key-value pairs. Updates/Deletes for the same key are appended to the Kafka topic, and the last value for a given key is the valid value in a snapshot. When this property is true, only the last value will be written. If the property is false, all values will be written out. As a CDC example: if the property is true, a valid snapshot of the log stream will be created. If the property is false, the CDC stream will be dumped as is, like a change log. |
false |
dataFormat | string |
Data format of the objects in S3. e.g. parquet or csv. Please refer to struct for allowed values. |
false |
keyDeserializer | string |
Deserializer to be used for the keys of the topic |
false |
password | string |
Kafka user password. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
saslMechanism | string |
SASL Mechanism to be used (e.g. PLAIN or SCRAM-SHA-512). Default SCRAM-SHA-512 will be assumed if not specified |
false |
secretImport | string |
Define a secret import definition. |
false |
securityProtocol | string |
Kafka security protocol, one of (PLAINTEXT, SASL_PLAINTEXT, SASL_SSL, SSL). Default SASL_SSL will be assumed if not specified |
false |
sslTruststore | string |
A truststore or certificate encoded as base64. The format can be JKS or PKCS12. A truststore can be specified like this or in a predefined Kubernetes secret |
false |
sslTruststoreLocation | string |
SSL truststore location. |
false |
sslTruststorePassword | string |
SSL truststore password. |
false |
sslTruststoreSecret | string |
Kubernetes secret that contains the SSL truststore. The format can be JKS or PKCS12. A truststore can be specified like this or as |
false |
user | string |
Kafka user name. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
valueDeserializer | string |
Deserializer to be used for the values of the topic |
false |
vault | object |
Define secrets that are fetched from a Vault instance |
false |
kafkaBrokers | string |
Kafka broker URLs as a comma separated list. |
true |
kafkaTopic | string |
Kafka topic |
true |
schemaRegistryURL | string |
URL to the schema registry. The registry has to be Confluent schema registry compatible. |
true |
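The effect of `createSnapshot` can be illustrated with a small Python sketch: with the property set, only the last value per key survives, otherwise every record is written out in order. This is a simplified model that does not handle deletes, and it is not the datamover implementation. The function `write_out` and the sample records are hypothetical.

```python
def write_out(records: list, create_snapshot: bool) -> list:
    """Return the (key, value) records written for a given createSnapshot value."""
    if not create_snapshot:
        # Change-log behaviour: every record is kept in arrival order.
        return list(records)
    latest = {}
    for key, value in records:
        latest[key] = value  # Later records for a key overwrite earlier ones.
    return list(latest.items())


# Hypothetical topic contents: key "a" is updated once.
records = [("a", 1), ("b", 1), ("a", 2)]

print(write_out(records, create_snapshot=True))   # [('a', 2), ('b', 1)]
print(write_out(records, create_snapshot=False))  # [('a', 1), ('b', 1), ('a', 2)]
```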
BatchTransfer.spec.destination.kafka.vault
Define secrets that are fetched from a Vault instance
Name | Type | Description | Required |
---|---|---|---|
address | string |
Address is Vault address |
true |
authPath | string |
AuthPath is the path to auth method i.e. kubernetes |
true |
role | string |
Role is the Vault role used for retrieving the credentials |
true |
secretPath | string |
SecretPath is the path of the secret holding the Credentials in Vault |
true |
BatchTransfer.spec.destination.s3
An object store data store that is compatible with S3. This can be a COS bucket.
Name | Type | Description | Required |
---|---|---|---|
accessKey | string |
Access key of the HMAC credentials that can access the given bucket. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
dataFormat | string |
Data format of the objects in S3. e.g. parquet or csv. Please refer to struct for allowed values. |
false |
partitionBy | []string |
Partition by (for target data stores). Defines the columns by which the output is partitioned for a target data store. |
false |
region | string |
Region of S3 service |
false |
secretImport | string |
Define a secret import definition. |
false |
secretKey | string |
Secret key of the HMAC credentials that can access the given bucket. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
vault | object |
Define secrets that are fetched from a Vault instance |
false |
bucket | string |
Bucket of S3 service |
true |
endpoint | string |
Endpoint of S3 service |
true |
objectKey | string |
Object key of the object in S3. This is used as a prefix! Thus all objects that have the given objectKey as prefix will be used as input! |
true |
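Because `objectKey` is treated as a prefix, it can select more objects than intended. The Python sketch below models the prefix match with plain string comparison. The object keys are made up, and `select_objects` is a hypothetical helper rather than the code used by the transfer job.

```python
def select_objects(keys: list, object_key: str) -> list:
    """Return the keys that start with `object_key`, mimicking prefix selection."""
    return [key for key in keys if key.startswith(object_key)]


# Hypothetical bucket listing.
keys = [
    "data/2021/part-0.parquet",
    "data/2021/part-1.parquet",
    "data/2021-backup/part-0.parquet",
    "logs/run.txt",
]

# Without a trailing slash the backup folder is matched as well.
print(select_objects(keys, "data/2021"))
# A trailing slash restricts the match to the intended folder.
print(select_objects(keys, "data/2021/"))
```

In this model, ending the prefix with a separator is a simple way to avoid picking up sibling paths that share the same leading characters.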
BatchTransfer.spec.destination.s3.vault
Define secrets that are fetched from a Vault instance
Name | Type | Description | Required |
---|---|---|---|
address | string |
Address is Vault address |
true |
authPath | string |
AuthPath is the path to auth method i.e. kubernetes |
true |
role | string |
Role is the Vault role used for retrieving the credentials |
true |
secretPath | string |
SecretPath is the path of the secret holding the Credentials in Vault |
true |
BatchTransfer.spec.source
Source data store for this batch job
Name | Type | Description | Required |
---|---|---|---|
cloudant | object |
IBM Cloudant. Needs cloudant legacy credentials. |
false |
database | object |
Database data store. For the moment only Db2 is supported. |
false |
description | string |
Description of the transfer in human-readable form that is displayed in the output of kubectl get. If not provided, this will be filled in depending on the datastore that is specified. |
false |
kafka | object |
Kafka data store. The expected format within the given Kafka topic is a Confluent compatible format stored as Avro. A schema registry needs to be specified as well. |
false |
s3 | object |
An object store data store that is compatible with S3. This can be a COS bucket. |
false |
BatchTransfer.spec.source.cloudant
IBM Cloudant. Needs cloudant legacy credentials.
Name | Type | Description | Required |
---|---|---|---|
password | string |
Cloudant password. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
secretImport | string |
Define a secret import definition. |
false |
username | string |
Cloudant user. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
vault | object |
Define secrets that are fetched from a Vault instance |
false |
database | string |
Database to be read from/written to |
true |
host | string |
Host of cloudant instance |
true |
BatchTransfer.spec.source.cloudant.vault
Define secrets that are fetched from a Vault instance
Name | Type | Description | Required |
---|---|---|---|
address | string |
Address is Vault address |
true |
authPath | string |
AuthPath is the path to auth method i.e. kubernetes |
true |
role | string |
Role is the Vault role used for retrieving the credentials |
true |
secretPath | string |
SecretPath is the path of the secret holding the Credentials in Vault |
true |
BatchTransfer.spec.source.database
Database data store. For the moment only Db2 is supported.
Name | Type | Description | Required |
---|---|---|---|
password | string |
Database password. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
secretImport | string |
Define a secret import definition. |
false |
user | string |
Database user. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
vault | object |
Define secrets that are fetched from a Vault instance |
false |
db2URL | string |
URL to Db2 instance in JDBC format. Supported SSL certificates are currently certificates signed with IBM Intermediate CA or cloud signed certificates. |
true |
table | string |
Table to be read |
true |
BatchTransfer.spec.source.database.vault
Define secrets that are fetched from a Vault instance
Name | Type | Description | Required |
---|---|---|---|
address | string |
Address is Vault address |
true |
authPath | string |
AuthPath is the path to auth method i.e. kubernetes |
true |
role | string |
Role is the Vault role used for retrieving the credentials |
true |
secretPath | string |
SecretPath is the path of the secret holding the Credentials in Vault |
true |
BatchTransfer.spec.source.kafka
Kafka data store. The expected format within the given Kafka topic is a Confluent compatible format stored as Avro. A schema registry needs to be specified as well.
Name | Type | Description | Required |
---|---|---|---|
createSnapshot | boolean |
Whether a snapshot of the topic should be created. Records in Kafka are stored as key-value pairs. Updates/Deletes for the same key are appended to the Kafka topic, and the last value for a given key is the valid value in a snapshot. When this property is true, only the last value is written. If the property is false, all values are written out. As a CDC example: if the property is true, a valid snapshot of the log stream is created. If the property is false, the CDC stream is dumped as is, like a change log. |
false |
dataFormat | string |
Data format of the objects in S3. e.g. parquet or csv. Please refer to struct for allowed values. |
false |
keyDeserializer | string |
Deserializer to be used for the keys of the topic |
false |
password | string |
Kafka user password. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
saslMechanism | string |
SASL mechanism to be used (e.g. PLAIN or SCRAM-SHA-512). Defaults to SCRAM-SHA-512 if not specified. |
false |
secretImport | string |
Define a secret import definition. |
false |
securityProtocol | string |
Kafka security protocol, one of PLAINTEXT, SASL_PLAINTEXT, SASL_SSL, SSL. Defaults to SASL_SSL if not specified. |
false |
sslTruststore | string |
A truststore or certificate encoded as base64. The format can be JKS or PKCS12. A truststore can be specified like this or in a predefined Kubernetes secret. |
false |
sslTruststoreLocation | string |
SSL truststore location. |
false |
sslTruststorePassword | string |
SSL truststore password. |
false |
sslTruststoreSecret | string |
Kubernetes secret that contains the SSL truststore. The format can be JKS or PKCS12. A truststore can be specified like this or as |
false |
user | string |
Kafka user name. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
valueDeserializer | string |
Deserializer to be used for the values of the topic |
false |
vault | object |
Define secrets that are fetched from a Vault instance |
false |
kafkaBrokers | string |
Kafka broker URLs as a comma separated list. |
true |
kafkaTopic | string |
Kafka topic |
true |
schemaRegistryURL | string |
URL to the schema registry. The registry has to be Confluent schema registry compatible. |
true |
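The effect of `createSnapshot` described above can be sketched in a few lines of Python. This is only an illustration of the stated semantics (last value per key versus full change log), not the datamover's implementation; the function `materialize` and the sample records are hypothetical, and the handling of delete markers is not covered here.

```python
def materialize(records, create_snapshot):
    """Return the (key, value) pairs that would be written out.

    records: (key, value) pairs in the order they appear in the topic.
    """
    if not create_snapshot:
        # Change-log mode: every record is kept, in order.
        return list(records)
    latest = {}
    for key, value in records:
        # A later record for the same key replaces the earlier one.
        latest[key] = value
    return list(latest.items())


# Hypothetical topic content: "user-1" is updated once.
records = [("user-1", "a"), ("user-2", "b"), ("user-1", "c")]
print(materialize(records, create_snapshot=True))
print(materialize(records, create_snapshot=False))
```

With `create_snapshot=True` only the latest value of `user-1` survives; with `False` all three records are written.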
BatchTransfer.spec.source.kafka.vault
Define secrets that are fetched from a Vault instance
Name | Type | Description | Required |
---|---|---|---|
address | string |
Address is Vault address |
true |
authPath | string |
AuthPath is the path to the auth method, e.g. kubernetes |
true |
role | string |
Role is the Vault role used for retrieving the credentials |
true |
secretPath | string |
SecretPath is the path of the secret holding the Credentials in Vault |
true |
BatchTransfer.spec.source.s3
An object store data store that is compatible with S3. This can be a COS bucket.
Name | Type | Description | Required |
---|---|---|---|
accessKey | string |
Access key of the HMAC credentials that can access the given bucket. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
dataFormat | string |
Data format of the objects in S3. e.g. parquet or csv. Please refer to struct for allowed values. |
false |
partitionBy | []string |
Columns to partition the output by. Only applies to target data stores. |
false |
region | string |
Region of S3 service |
false |
secretImport | string |
Define a secret import definition. |
false |
secretKey | string |
Secret key of the HMAC credentials that can access the given bucket. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
vault | object |
Define secrets that are fetched from a Vault instance |
false |
bucket | string |
Bucket of S3 service |
true |
endpoint | string |
Endpoint of S3 service |
true |
objectKey | string |
Object key of the object in S3. This is used as a prefix! Thus all objects that have the given objectKey as prefix will be used as input! |
true |
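Because `objectKey` is treated as a prefix, a short key can select more objects than intended. The Python sketch below mimics plain prefix matching to show this; the helper `select_input_objects` and the bucket contents are hypothetical, and the actual listing is done by the S3 service.

```python
def select_input_objects(object_keys, object_key):
    """Return every key that starts with the configured objectKey."""
    return [key for key in object_keys if key.startswith(object_key)]


# Hypothetical bucket contents.
bucket_contents = [
    "sales/2021/part-0.parquet",
    "sales/2021/part-1.parquet",
    "sales-archive/2019/part-0.parquet",
    "inventory/part-0.parquet",
]

# "sales" also matches "sales-archive/..."; "sales/" does not.
print(select_input_objects(bucket_contents, "sales"))
print(select_input_objects(bucket_contents, "sales/"))
```

Ending the `objectKey` with a delimiter such as `/` is one way to avoid picking up sibling prefixes.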
BatchTransfer.spec.source.s3.vault
Define secrets that are fetched from a Vault instance
Name | Type | Description | Required |
---|---|---|---|
address | string |
Address is Vault address |
true |
authPath | string |
AuthPath is the path to the auth method, e.g. kubernetes |
true |
role | string |
Role is the Vault role used for retrieving the credentials |
true |
secretPath | string |
SecretPath is the path of the secret holding the Credentials in Vault |
true |
BatchTransfer.status
BatchTransferStatus defines the observed state of BatchTransfer. This includes a reference to the job that implements the movement as well as the last schedule time. What is missing: extended status information such as the number of records moved and technical metadata.
Name | Type | Description | Required |
---|---|---|---|
active | object |
A pointer to the currently running job (or nil) |
false |
error | string |
|
false |
lastCompleted | object |
ObjectReference contains enough information to let you inspect or modify the referred object. --- New uses of this type are discouraged because of difficulty describing its usage when embedded in APIs. 1. Ignored fields. It includes many fields which are not generally honored. For instance, ResourceVersion and FieldPath are both very rarely valid in actual usage. 2. Invalid usage help. It is impossible to add specific help for individual usage. In most embedded usages, there are particular restrictions like, "must refer only to types A and B" or "UID not honored" or "name must be restricted". Those cannot be well described when embedded. 3. Inconsistent validation. Because the usages are different, the validation rules are different by usage, which makes it hard for users to predict what will happen. 4. The fields are both imprecise and overly precise. Kind is not a precise mapping to a URL. This can produce ambiguity during interpretation and require a REST mapping. In most cases, the dependency is on the group,resource tuple and the version of the actual struct is irrelevant. 5. We cannot easily change it. Because this type is embedded in many locations, updates to this type will affect numerous schemas. Don't make new APIs embed an underspecified API type they do not control. Instead of using this type, create a locally provided and used type that is well-focused on your reference. For example, ServiceReferences for admission registration: https://github.com/kubernetes/api/blob/release-1.17/admissionregistration/v1/types.go#L533 . |
false |
lastFailed | object |
ObjectReference contains enough information to let you inspect or modify the referred object. --- New uses of this type are discouraged because of difficulty describing its usage when embedded in APIs. 1. Ignored fields. It includes many fields which are not generally honored. For instance, ResourceVersion and FieldPath are both very rarely valid in actual usage. 2. Invalid usage help. It is impossible to add specific help for individual usage. In most embedded usages, there are particular restrictions like, "must refer only to types A and B" or "UID not honored" or "name must be restricted". Those cannot be well described when embedded. 3. Inconsistent validation. Because the usages are different, the validation rules are different by usage, which makes it hard for users to predict what will happen. 4. The fields are both imprecise and overly precise. Kind is not a precise mapping to a URL. This can produce ambiguity during interpretation and require a REST mapping. In most cases, the dependency is on the group,resource tuple and the version of the actual struct is irrelevant. 5. We cannot easily change it. Because this type is embedded in many locations, updates to this type will affect numerous schemas. Don't make new APIs embed an underspecified API type they do not control. Instead of using this type, create a locally provided and used type that is well-focused on your reference. For example, ServiceReferences for admission registration: https://github.com/kubernetes/api/blob/release-1.17/admissionregistration/v1/types.go#L533 . |
false |
lastRecordTime | string |
Format: date-time |
false |
lastScheduleTime | string |
The last time the job was successfully scheduled. Format: date-time |
false |
lastSuccessTime | string |
Format: date-time |
false |
numRecords | integer |
Format: int64 Minimum: 0 |
false |
status | enum |
Enum: STARTING, RUNNING, SUCCEEDED, FAILED |
false |
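A client polling a BatchTransfer might interpret the `status` enum roughly as follows. The enum values are taken from the table above; treating SUCCEEDED and FAILED as terminal states is an assumption based on their names, and `TransferStatus` and `is_finished` are hypothetical helpers.

```python
from enum import Enum


class TransferStatus(Enum):
    STARTING = "STARTING"
    RUNNING = "RUNNING"
    SUCCEEDED = "SUCCEEDED"
    FAILED = "FAILED"


# Assumption: these two values mean the transfer will not progress further.
TERMINAL_STATES = {TransferStatus.SUCCEEDED, TransferStatus.FAILED}


def is_finished(status_block: dict) -> bool:
    """Return True if the status block reports a terminal state."""
    raw = status_block.get("status")
    if raw is None:
        # The field is optional, so it may not be set yet.
        return False
    return TransferStatus(raw) in TERMINAL_STATES


print(is_finished({"status": "RUNNING"}))
print(is_finished({"status": "SUCCEEDED", "numRecords": 1200}))
```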
BatchTransfer.status.active
A pointer to the currently running job (or nil)
Name | Type | Description | Required |
---|---|---|---|
apiVersion | string |
API version of the referent. |
false |
fieldPath | string |
If referring to a piece of an object instead of an entire object, this string should contain a valid JSON/Go field access statement, such as desiredState.manifest.containers[2]. For example, if the object reference is to a container within a pod, this would take on a value like: "spec.containers{name}" (where "name" refers to the name of the container that triggered the event) or if no container name is specified "spec.containers[2]" (container with index 2 in this pod). This syntax is chosen only to have some well-defined way of referencing a part of an object. TODO: this design is not final and this field is subject to change in the future. |
false |
kind | string |
Kind of the referent. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds |
false |
name | string |
Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names |
false |
namespace | string |
Namespace of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/namespaces/ |
false |
resourceVersion | string |
Specific resourceVersion to which this reference is made, if any. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#concurrency-control-and-consistency |
false |
uid | string |
UID of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#uids |
false |
BatchTransfer.status.lastCompleted
ObjectReference contains enough information to let you inspect or modify the referred object. New uses of this type are discouraged because of difficulty describing its usage when embedded in APIs.

1. Ignored fields. It includes many fields which are not generally honored. For instance, ResourceVersion and FieldPath are both very rarely valid in actual usage.
2. Invalid usage help. It is impossible to add specific help for individual usage. In most embedded usages, there are particular restrictions like, "must refer only to types A and B" or "UID not honored" or "name must be restricted". Those cannot be well described when embedded.
3. Inconsistent validation. Because the usages are different, the validation rules are different by usage, which makes it hard for users to predict what will happen.
4. The fields are both imprecise and overly precise. Kind is not a precise mapping to a URL. This can produce ambiguity during interpretation and require a REST mapping. In most cases, the dependency is on the group,resource tuple and the version of the actual struct is irrelevant.
5. We cannot easily change it. Because this type is embedded in many locations, updates to this type will affect numerous schemas. Don't make new APIs embed an underspecified API type they do not control.

Instead of using this type, create a locally provided and used type that is well-focused on your reference. For example, ServiceReferences for admission registration: https://github.com/kubernetes/api/blob/release-1.17/admissionregistration/v1/types.go#L533 .
Name | Type | Description | Required |
---|---|---|---|
apiVersion | string |
API version of the referent. |
false |
fieldPath | string |
If referring to a piece of an object instead of an entire object, this string should contain a valid JSON/Go field access statement, such as desiredState.manifest.containers[2]. For example, if the object reference is to a container within a pod, this would take on a value like: "spec.containers{name}" (where "name" refers to the name of the container that triggered the event) or if no container name is specified "spec.containers[2]" (container with index 2 in this pod). This syntax is chosen only to have some well-defined way of referencing a part of an object. TODO: this design is not final and this field is subject to change in the future. |
false |
kind | string |
Kind of the referent. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds |
false |
name | string |
Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names |
false |
namespace | string |
Namespace of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/namespaces/ |
false |
resourceVersion | string |
Specific resourceVersion to which this reference is made, if any. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#concurrency-control-and-consistency |
false |
uid | string |
UID of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#uids |
false |
BatchTransfer.status.lastFailed
ObjectReference contains enough information to let you inspect or modify the referred object. New uses of this type are discouraged because of difficulty describing its usage when embedded in APIs.

1. Ignored fields. It includes many fields which are not generally honored. For instance, ResourceVersion and FieldPath are both very rarely valid in actual usage.
2. Invalid usage help. It is impossible to add specific help for individual usage. In most embedded usages, there are particular restrictions like, "must refer only to types A and B" or "UID not honored" or "name must be restricted". Those cannot be well described when embedded.
3. Inconsistent validation. Because the usages are different, the validation rules are different by usage, which makes it hard for users to predict what will happen.
4. The fields are both imprecise and overly precise. Kind is not a precise mapping to a URL. This can produce ambiguity during interpretation and require a REST mapping. In most cases, the dependency is on the group,resource tuple and the version of the actual struct is irrelevant.
5. We cannot easily change it. Because this type is embedded in many locations, updates to this type will affect numerous schemas. Don't make new APIs embed an underspecified API type they do not control.

Instead of using this type, create a locally provided and used type that is well-focused on your reference. For example, ServiceReferences for admission registration: https://github.com/kubernetes/api/blob/release-1.17/admissionregistration/v1/types.go#L533 .
Name | Type | Description | Required |
---|---|---|---|
apiVersion | string |
API version of the referent. |
false |
fieldPath | string |
If referring to a piece of an object instead of an entire object, this string should contain a valid JSON/Go field access statement, such as desiredState.manifest.containers[2]. For example, if the object reference is to a container within a pod, this would take on a value like: "spec.containers{name}" (where "name" refers to the name of the container that triggered the event) or if no container name is specified "spec.containers[2]" (container with index 2 in this pod). This syntax is chosen only to have some well-defined way of referencing a part of an object. TODO: this design is not final and this field is subject to change in the future. |
false |
kind | string |
Kind of the referent. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds |
false |
name | string |
Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names |
false |
namespace | string |
Namespace of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/namespaces/ |
false |
resourceVersion | string |
Specific resourceVersion to which this reference is made, if any. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#concurrency-control-and-consistency |
false |
uid | string |
UID of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#uids |
false |
StreamTransfer
StreamTransfer is the Schema for the streamtransfers API
Name | Type | Description | Required |
---|---|---|---|
apiVersion | string | motion.fybrik.io/v1alpha1 | true |
kind | string | StreamTransfer | true |
metadata | object | Refer to the Kubernetes API documentation for the fields of the `metadata` field. | true |
spec | object |
StreamTransferSpec defines the desired state of StreamTransfer |
false |
status | object |
StreamTransferStatus defines the observed state of StreamTransfer |
false |
StreamTransfer.spec
StreamTransferSpec defines the desired state of StreamTransfer
Name | Type | Description | Required |
---|---|---|---|
flowType | enum |
Data flow type that specifies whether this is a stream or a batch workflow. Enum: Batch, Stream |
false |
image | string |
Image that should be used for the actual batch job. This is usually a datamover image. This property will be defaulted by the webhook if not set. |
false |
imagePullPolicy | string |
Image pull policy that should be used for the actual job. This property will be defaulted by the webhook if not set. |
false |
noFinalizer | boolean |
If this batch job instance should have a finalizer or not. This property will be defaulted by the webhook if not set. |
false |
readDataType | enum |
Data type of the data that is read from source (log data or change data). Enum: LogData, ChangeData |
false |
secretProviderRole | string |
Secret provider role that should be used for the actual job. This property will be defaulted by the webhook if not set. |
false |
secretProviderURL | string |
Secret provider url that should be used for the actual job. This property will be defaulted by the webhook if not set. |
false |
suspend | boolean |
If this batch job instance is run on a schedule the regular schedule can be suspended with this property. This property will be defaulted by the webhook if not set. |
false |
transformation | []object |
Transformations to be applied to the source data before writing to destination |
false |
triggerInterval | string |
Interval at which the micro-batches of this stream should be triggered. The default is '5 seconds'. |
false |
writeDataType | enum |
Data type of how the data should be written to the target (log data or change data). Enum: LogData, ChangeData |
false |
writeOperation | enum |
Write operation that should be performed when writing (overwrite, append, update). Caution: Some write operations are only available for batch and some only for stream. Enum: Overwrite, Append, Update |
false |
destination | object |
Destination data store for this batch job |
true |
source | object |
Source data store for this batch job |
true |
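To make the spec table more concrete, here is a minimal sketch that assembles a StreamTransfer resource as a Python dictionary and prints it as JSON, which `kubectl apply -f` accepts alongside YAML. The field names, enum values, and the `5 seconds` default come from the tables in this reference; the resource name, broker addresses, bucket, endpoint, and the helper `missing_required` are hypothetical. Credentials are omitted for brevity.

```python
import json

manifest = {
    "apiVersion": "motion.fybrik.io/v1alpha1",
    "kind": "StreamTransfer",
    "metadata": {"name": "example-stream"},  # hypothetical name
    "spec": {
        "flowType": "Stream",
        "readDataType": "ChangeData",
        "writeDataType": "LogData",
        "writeOperation": "Append",
        "triggerInterval": "5 seconds",  # documented default, set explicitly
        "source": {
            "kafka": {
                "kafkaBrokers": "broker-0.example.com:9093,broker-1.example.com:9093",
                "kafkaTopic": "example-topic",
                "schemaRegistryURL": "https://registry.example.com",
            }
        },
        "destination": {
            "s3": {
                "endpoint": "s3.example.com",
                "bucket": "example-bucket",
                "objectKey": "streams/example/",
            }
        },
    },
}


def missing_required(spec: dict) -> list:
    """Return the spec fields marked as required that are absent."""
    return [name for name in ("source", "destination") if name not in spec]


print(missing_required(manifest["spec"]))
print(json.dumps(manifest, indent=2))
```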
StreamTransfer.spec.transformation[index]
to be refined...
Name | Type | Description | Required |
---|---|---|---|
action | enum |
Transformation action that should be performed. Enum: RemoveColumns, EncryptColumns, DigestColumns, RedactColumns, SampleRows, FilterRows |
false |
columns | []string |
Columns that are involved in this action. This property is optional as for some actions no columns have to be specified. E.g. filter is a row based transformation. |
false |
name | string |
Name of the transformation. Mainly used for debugging and lineage tracking. |
false |
options | map[string]string |
Additional options for this transformation. |
false |
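The sketch below shows how a transformation entry with the `RemoveColumns` action could be read. It illustrates what the action presumably does, judging by its name; this reference does not describe the actual implementation. The function `remove_columns`, the entry's name, and the sample rows are hypothetical.

```python
def remove_columns(rows, transformation):
    """Drop the listed columns from every row (rows are plain dicts)."""
    if transformation.get("action") != "RemoveColumns":
        raise ValueError("this sketch only handles RemoveColumns")
    # 'columns' is optional in the schema, so default to an empty list.
    to_drop = set(transformation.get("columns", []))
    return [
        {name: value for name, value in row.items() if name not in to_drop}
        for row in rows
    ]


# Hypothetical transformation entry and sample data.
transformation = {
    "name": "drop-pii",
    "action": "RemoveColumns",
    "columns": ["email"],
}
rows = [
    {"id": 1, "email": "a@example.com", "country": "DE"},
    {"id": 2, "email": "b@example.com", "country": "FR"},
]
print(remove_columns(rows, transformation))
```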
StreamTransfer.spec.destination
Destination data store for this batch job
Name | Type | Description | Required |
---|---|---|---|
cloudant | object |
IBM Cloudant. Needs cloudant legacy credentials. |
false |
database | object |
Database data store. For the moment only Db2 is supported. |
false |
description | string |
Description of the transfer in human-readable form that is displayed by kubectl get. If not provided, this will be filled in depending on the data store that is specified. |
false |
kafka | object |
Kafka data store. The expected format within the given Kafka topic is a Confluent compatible format stored as Avro. A schema registry needs to be specified as well. |
false |
s3 | object |
An object store data store that is compatible with S3. This can be a COS bucket. |
false |
StreamTransfer.spec.destination.cloudant
IBM Cloudant. Needs cloudant legacy credentials.
Name | Type | Description | Required |
---|---|---|---|
password | string |
Cloudant password. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
secretImport | string |
Define a secret import definition. |
false |
username | string |
Cloudant user. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
vault | object |
Define secrets that are fetched from a Vault instance |
false |
database | string |
Database to be read from/written to |
true |
host | string |
Host of cloudant instance |
true |
StreamTransfer.spec.destination.cloudant.vault
Define secrets that are fetched from a Vault instance
Name | Type | Description | Required |
---|---|---|---|
address | string |
Address is Vault address |
true |
authPath | string |
AuthPath is the path to the auth method, e.g. kubernetes |
true |
role | string |
Role is the Vault role used for retrieving the credentials |
true |
secretPath | string |
SecretPath is the path of the secret holding the Credentials in Vault |
true |
StreamTransfer.spec.destination.database
Database data store. For the moment only Db2 is supported.
Name | Type | Description | Required |
---|---|---|---|
password | string |
Database password. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
secretImport | string |
Define a secret import definition. |
false |
user | string |
Database user. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
vault | object |
Define secrets that are fetched from a Vault instance |
false |
db2URL | string |
URL to Db2 instance in JDBC format. Supported SSL certificates are currently certificates signed with IBM Intermediate CA or cloud signed certificates. |
true |
table | string |
Table to be read |
true |
StreamTransfer.spec.destination.database.vault
Define secrets that are fetched from a Vault instance
Name | Type | Description | Required |
---|---|---|---|
address | string |
Address is Vault address |
true |
authPath | string |
AuthPath is the path to the auth method, e.g. kubernetes |
true |
role | string |
Role is the Vault role used for retrieving the credentials |
true |
secretPath | string |
SecretPath is the path of the secret holding the Credentials in Vault |
true |
StreamTransfer.spec.destination.kafka
Kafka data store. The expected format within the given Kafka topic is a Confluent compatible format stored as Avro. A schema registry needs to be specified as well.
Name | Type | Description | Required |
---|---|---|---|
createSnapshot | boolean |
Whether a snapshot of the topic should be created. Records in Kafka are stored as key-value pairs. Updates/Deletes for the same key are appended to the Kafka topic, and the last value for a given key is the valid value in a snapshot. When this property is true, only the last value is written. If the property is false, all values are written out. As a CDC example: if the property is true, a valid snapshot of the log stream is created. If the property is false, the CDC stream is dumped as is, like a change log. |
false |
dataFormat | string |
Data format of the objects in S3. e.g. parquet or csv. Please refer to struct for allowed values. |
false |
keyDeserializer | string |
Deserializer to be used for the keys of the topic |
false |
password | string |
Kafka user password. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
saslMechanism | string |
SASL mechanism to be used (e.g. PLAIN or SCRAM-SHA-512). Defaults to SCRAM-SHA-512 if not specified. |
false |
secretImport | string |
Define a secret import definition. |
false |
securityProtocol | string |
Kafka security protocol, one of PLAINTEXT, SASL_PLAINTEXT, SASL_SSL, SSL. Defaults to SASL_SSL if not specified. |
false |
sslTruststore | string |
A truststore or certificate encoded as base64. The format can be JKS or PKCS12. A truststore can be specified like this or in a predefined Kubernetes secret. |
false |
sslTruststoreLocation | string |
SSL truststore location. |
false |
sslTruststorePassword | string |
SSL truststore password. |
false |
sslTruststoreSecret | string |
Kubernetes secret that contains the SSL truststore. The format can be JKS or PKCS12. A truststore can be specified like this or as |
false |
user | string |
Kafka user name. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
valueDeserializer | string |
Deserializer to be used for the values of the topic |
false |
vault | object |
Define secrets that are fetched from a Vault instance |
false |
kafkaBrokers | string |
Kafka broker URLs as a comma separated list. |
true |
kafkaTopic | string |
Kafka topic |
true |
schemaRegistryURL | string |
URL to the schema registry. The registry has to be Confluent schema registry compatible. |
true |
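The defaults documented for `securityProtocol` and `saslMechanism` can be made explicit with a small helper. This is a sketch of how a client might resolve the effective settings; the default values and the allowed protocols are taken from the table above, while `with_kafka_defaults` and the sample values are hypothetical.

```python
ALLOWED_SECURITY_PROTOCOLS = ("PLAINTEXT", "SASL_PLAINTEXT", "SASL_SSL", "SSL")


def with_kafka_defaults(kafka: dict) -> dict:
    """Return a copy of the kafka block with the documented defaults applied."""
    resolved = dict(kafka)
    resolved.setdefault("securityProtocol", "SASL_SSL")
    resolved.setdefault("saslMechanism", "SCRAM-SHA-512")
    if resolved["securityProtocol"] not in ALLOWED_SECURITY_PROTOCOLS:
        raise ValueError(
            "unsupported securityProtocol: " + resolved["securityProtocol"]
        )
    return resolved


# Hypothetical kafka block with only the required fields set.
kafka = {
    "kafkaBrokers": "broker-0.example.com:9093",
    "kafkaTopic": "example-topic",
    "schemaRegistryURL": "https://registry.example.com",
}
resolved = with_kafka_defaults(kafka)
print(resolved["securityProtocol"], resolved["saslMechanism"])
```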
StreamTransfer.spec.destination.kafka.vault
Define secrets that are fetched from a Vault instance
Name | Type | Description | Required |
---|---|---|---|
address | string |
Address is Vault address |
true |
authPath | string |
AuthPath is the path to the auth method, e.g. kubernetes |
true |
role | string |
Role is the Vault role used for retrieving the credentials |
true |
secretPath | string |
SecretPath is the path of the secret holding the Credentials in Vault |
true |
StreamTransfer.spec.destination.s3
An object store data store that is compatible with S3. This can be a COS bucket.
Name | Type | Description | Required |
---|---|---|---|
accessKey | string |
Access key of the HMAC credentials that can access the given bucket. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
dataFormat | string |
Data format of the objects in S3. e.g. parquet or csv. Please refer to struct for allowed values. |
false |
partitionBy | []string |
Columns to partition the output by. Only applies to target data stores. |
false |
region | string |
Region of S3 service |
false |
secretImport | string |
Define a secret import definition. |
false |
secretKey | string |
Secret key of the HMAC credentials that can access the given bucket. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
vault | object |
Define secrets that are fetched from a Vault instance |
false |
bucket | string |
Bucket of S3 service |
true |
endpoint | string |
Endpoint of S3 service |
true |
objectKey | string |
Object key of the object in S3. This is used as a prefix! Thus all objects that have the given objectKey as prefix will be used as input! |
true |
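Since `accessKey` and `secretKey` are optional when a `vault` block is given, a client may want to check which credential source an `s3` block actually provides. The sketch below is one possible reading of the table above; preferring inline keys when both sources are present is an assumption, `secretImport` is not considered, and `s3_credentials_source` and all values are hypothetical.

```python
def s3_credentials_source(s3: dict):
    """Return 'inline', 'vault', or None if no credentials are configured."""
    if s3.get("accessKey") and s3.get("secretKey"):
        # Assumption: inline HMAC keys win when both sources are present.
        return "inline"
    if s3.get("vault"):
        return "vault"
    return None


# Hypothetical s3 block that relies on Vault for its credentials.
s3 = {
    "endpoint": "s3.example.com",
    "bucket": "example-bucket",
    "objectKey": "output/",
    "vault": {
        "address": "http://vault.example.com:8200",
        "authPath": "/v1/auth/kubernetes/login",
        "role": "example-role",
        "secretPath": "/v1/secret/data/example",
    },
}
print(s3_credentials_source(s3))
```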
StreamTransfer.spec.destination.s3.vault
Define secrets that are fetched from a Vault instance
Name | Type | Description | Required |
---|---|---|---|
address | string |
Address is Vault address |
true |
authPath | string |
AuthPath is the path to the auth method, e.g. kubernetes |
true |
role | string |
Role is the Vault role used for retrieving the credentials |
true |
secretPath | string |
SecretPath is the path of the secret holding the Credentials in Vault |
true |
StreamTransfer.spec.source
Source data store for this batch job
Name | Type | Description | Required |
---|---|---|---|
cloudant | object |
IBM Cloudant. Needs cloudant legacy credentials. |
false |
database | object |
Database data store. For the moment only Db2 is supported. |
false |
description | string |
Description of the transfer in human-readable form that is displayed by kubectl get. If not provided, this will be filled in depending on the data store that is specified. |
false |
kafka | object |
Kafka data store. The expected format within the given Kafka topic is a Confluent compatible format stored as Avro. A schema registry needs to be specified as well. |
false |
s3 | object |
An object store data store that is compatible with S3. This can be a COS bucket. |
false |
StreamTransfer.spec.source.cloudant
IBM Cloudant. Needs cloudant legacy credentials.
Name | Type | Description | Required |
---|---|---|---|
password | string |
Cloudant password. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
secretImport | string |
Define a secret import definition. |
false |
username | string |
Cloudant user. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
vault | object |
Define secrets that are fetched from a Vault instance |
false |
database | string |
Database to be read from/written to |
true |
host | string |
Host of cloudant instance |
true |
StreamTransfer.spec.source.cloudant.vault
Define secrets that are fetched from a Vault instance
Name | Type | Description | Required |
---|---|---|---|
address | string |
Address is Vault address |
true |
authPath | string |
AuthPath is the path to the auth method, e.g. kubernetes |
true |
role | string |
Role is the Vault role used for retrieving the credentials |
true |
secretPath | string |
SecretPath is the path of the secret holding the Credentials in Vault |
true |
StreamTransfer.spec.source.database
Database data store. For the moment only Db2 is supported.
Name | Type | Description | Required |
---|---|---|---|
password | string |
Database password. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
secretImport | string |
Define a secret import definition. |
false |
user | string |
Database user. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
vault | object |
Define secrets that are fetched from a Vault instance |
false |
db2URL | string |
URL to Db2 instance in JDBC format. Supported SSL certificates are currently certificates signed with IBM Intermediate CA or cloud signed certificates. |
true |
table | string |
Table to be read |
true |
StreamTransfer.spec.source.database.vault
Define secrets that are fetched from a Vault instance
Name | Type | Description | Required |
---|---|---|---|
address | string |
Address is Vault address |
true |
authPath | string |
AuthPath is the path to the auth method, e.g. kubernetes |
true |
role | string |
Role is the Vault role used for retrieving the credentials |
true |
secretPath | string |
SecretPath is the path of the secret holding the Credentials in Vault |
true |
StreamTransfer.spec.source.kafka
Kafka data store. The expected format within the given Kafka topic is a Confluent compatible format stored as Avro. A schema registry needs to be specified as well.
Name | Type | Description | Required |
---|---|---|---|
createSnapshot | boolean |
Whether a snapshot of the topic should be created. Records in Kafka are stored as key-value pairs. Updates/Deletes for the same key are appended to the Kafka topic, and the last value for a given key is the valid value in a snapshot. When this property is true, only the last value is written. If the property is false, all values are written out. As a CDC example: if the property is true, a valid snapshot of the log stream is created. If the property is false, the CDC stream is dumped as is, like a change log. |
false |
dataFormat | string |
Data format of the objects in S3. e.g. parquet or csv. Please refer to struct for allowed values. |
false |
keyDeserializer | string |
Deserializer to be used for the keys of the topic |
false |
password | string |
Kafka user password. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
saslMechanism | string |
SASL mechanism to be used (e.g. PLAIN or SCRAM-SHA-512). Defaults to SCRAM-SHA-512 if not specified. |
false |
secretImport | string |
Define a secret import definition. |
false |
securityProtocol | string |
Kafka security protocol, one of PLAINTEXT, SASL_PLAINTEXT, SASL_SSL or SSL. Defaults to SASL_SSL if not specified. |
false |
sslTruststore | string |
A truststore or certificate encoded as base64. The format can be JKS or PKCS12. A truststore can be specified like this or in a predefined Kubernetes secret. |
false |
sslTruststoreLocation | string |
SSL truststore location. |
false |
sslTruststorePassword | string |
SSL truststore password. |
false |
sslTruststoreSecret | string |
Kubernetes secret that contains the SSL truststore. The format can be JKS or PKCS12. A truststore can be specified like this or as |
false |
user | string |
Kafka user name. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
valueDeserializer | string |
Deserializer to be used for the values of the topic |
false |
vault | object |
Define secrets that are fetched from a Vault instance |
false |
kafkaBrokers | string |
Kafka broker URLs as a comma separated list. |
true |
kafkaTopic | string |
Kafka topic |
true |
schemaRegistryURL | string |
URL to the schema registry. The registry has to be Confluent schema registry compatible. |
true |
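
The difference between the two `createSnapshot` settings can be illustrated with a small Python sketch. This is not the operator's implementation; the `records` list and the `snapshot` and `change_log` helpers are hypothetical and only mimic the last-value-per-key behaviour described in the table above. How delete records are represented is not covered here.

```python
def snapshot(records):
    """Mimic createSnapshot: true by keeping only the last value per key."""
    latest = {}
    for key, value in records:
        # A later record for the same key overwrites the earlier one.
        latest[key] = value
    return latest


def change_log(records):
    """Mimic createSnapshot: false by keeping every record in arrival order."""
    return list(records)


# Hypothetical topic content: key "k1" is updated once.
records = [("k1", "a"), ("k2", "b"), ("k1", "c")]

print(snapshot(records))    # one entry per key
print(change_log(records))  # all three records
```

With the sample input, the snapshot holds two entries (the later value `"c"` wins for `k1`), while the change log keeps all three records.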
StreamTransfer.spec.source.kafka.vault
Define secrets that are fetched from a Vault instance
Name | Type | Description | Required |
---|---|---|---|
address | string |
Address is Vault address |
true |
authPath | string |
AuthPath is the path to the auth method, e.g. kubernetes |
true |
role | string |
Role is the Vault role used for retrieving the credentials |
true |
secretPath | string |
SecretPath is the path of the secret holding the Credentials in Vault |
true |
StreamTransfer.spec.source.s3
An object store data store that is compatible with S3. This can be a COS bucket.
Name | Type | Description | Required |
---|---|---|---|
accessKey | string |
Access key of the HMAC credentials that can access the given bucket. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
dataFormat | string |
Data format of the objects in S3. e.g. parquet or csv. Please refer to struct for allowed values. |
false |
partitionBy | []string |
Partition by (for target data stores). Defines the columns by which the output is partitioned for a target data store. |
false |
region | string |
Region of S3 service |
false |
secretImport | string |
Define a secret import definition. |
false |
secretKey | string |
Secret key of the HMAC credentials that can access the given bucket. Can be retrieved from vault if specified in vault parameter and is thus optional. |
false |
vault | object |
Define secrets that are fetched from a Vault instance |
false |
bucket | string |
Bucket of S3 service |
true |
endpoint | string |
Endpoint of S3 service |
true |
objectKey | string |
Object key of the object in S3. This is used as a prefix, so all objects whose key starts with the given objectKey are used as input. |
true |
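
Because `objectKey` acts as a prefix, a value without a trailing delimiter can match more objects than intended. The following Python sketch illustrates this under the assumption that matching is a plain string-prefix comparison; the `select_objects` helper and the bucket listing are hypothetical.

```python
def select_objects(keys, object_key):
    """Return the keys that start with `object_key` (plain prefix match)."""
    return [key for key in keys if key.startswith(object_key)]


# Hypothetical listing of object keys in a bucket.
keys = [
    "data/2021/part-0.parquet",
    "data/2021/part-1.parquet",
    "data/2021-backup/part-0.parquet",
    "other/file.csv",
]

print(select_objects(keys, "data/2021/"))  # only the two objects under data/2021/
print(select_objects(keys, "data/2021"))   # also matches data/2021-backup/...
```

Ending the prefix with the delimiter used in the bucket (here `/`) is one way to avoid picking up sibling prefixes by accident.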
StreamTransfer.spec.source.s3.vault
Define secrets that are fetched from a Vault instance
Name | Type | Description | Required |
---|---|---|---|
address | string |
Address is Vault address |
true |
authPath | string |
AuthPath is the path to the auth method, e.g. kubernetes |
true |
role | string |
Role is the Vault role used for retrieving the credentials |
true |
secretPath | string |
SecretPath is the path of the secret holding the Credentials in Vault |
true |
StreamTransfer.status
StreamTransferStatus defines the observed state of StreamTransfer
Name | Type | Description | Required |
---|---|---|---|
active | object |
A pointer to the currently running job (or nil) |
false |
error | string |
|
false |
status | enum |
Enum: STARTING, RUNNING, STOPPED, FAILING |
false |
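
A client that reads this status has to cope with all three fields being optional. The Python sketch below shows one way to do that, assuming the status has already been loaded into a dictionary; the `summarize` helper and the `UNKNOWN` fallback are hypothetical and not part of the API, while the enum values are the ones listed above.

```python
from enum import Enum


class TransferStatus(Enum):
    """Allowed values of StreamTransfer.status.status."""
    STARTING = "STARTING"
    RUNNING = "RUNNING"
    STOPPED = "STOPPED"
    FAILING = "FAILING"


def summarize(status):
    """Build a one-line summary from a status dict; every field may be absent."""
    raw = status.get("status")
    # TransferStatus(raw) raises ValueError for values outside the enum.
    state = TransferStatus(raw).value if raw else "UNKNOWN"  # hypothetical fallback
    active = status.get("active")
    if active:
        return f"{state}, job {active.get('namespace')}/{active.get('name')}"
    if status.get("error"):
        return f"{state}, error: {status['error']}"
    return state


print(summarize({"status": "RUNNING",
                 "active": {"name": "job-1", "namespace": "default"}}))
print(summarize({"status": "FAILING", "error": "connection refused"}))
```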
StreamTransfer.status.active
A pointer to the currently running job (or nil)
Name | Type | Description | Required |
---|---|---|---|
apiVersion | string |
API version of the referent. |
false |
fieldPath | string |
If referring to a piece of an object instead of an entire object, this string should contain a valid JSON/Go field access statement, such as desiredState.manifest.containers[2]. For example, if the object reference is to a container within a pod, this would take on a value like: "spec.containers{name}" (where "name" refers to the name of the container that triggered the event) or if no container name is specified "spec.containers[2]" (container with index 2 in this pod). This syntax is chosen only to have some well-defined way of referencing a part of an object. TODO: this design is not final and this field is subject to change in the future. |
false |
kind | string |
Kind of the referent. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds |
false |
name | string |
Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names |
false |
namespace | string |
Namespace of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/namespaces/ |
false |
resourceVersion | string |
Specific resourceVersion to which this reference is made, if any. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#concurrency-control-and-consistency |
false |
uid | string |
UID of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#uids |
false |