Quick Start Guide
Follow this guide to install Fybrik using default parameters that are suitable for experimentation on a single cluster.
For a One Click Demo of Fybrik and a read data scenario, refer to OneClickDemo.
Before you begin
Ensure that you have the following:
- Helm 3.3 or greater, installed and configured on your machine.
- Kubectl 1.20 or newer, installed on your machine.
- Cluster administrator access to a Kubernetes cluster such as Kind. The supported Kubernetes versions are 1.22 through 1.24, although older versions may also work.
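The version requirements above can be checked from a shell. The following is a minimal sketch, not part of Fybrik: the helper names normalize_version and version_ge are hypothetical, and the comparison assumes a sort implementation that supports the -V (version sort) option, as GNU coreutils does.

```shell
#!/usr/bin/env bash
# Sketch: compare dotted version strings using `sort -V` (assumed available).

# Strip a leading "v" and any "+build" suffix, e.g. "v3.12.0+gc9f554d" -> "3.12.0".
normalize_version() {
  local v="${1#v}"
  printf '%s\n' "${v%%+*}"
}

# Succeed when version $1 is greater than or equal to version $2.
version_ge() {
  local have want lowest
  have="$(normalize_version "$1")"
  want="$(normalize_version "$2")"
  # The smaller of the two versions sorts first; it must be the required one.
  lowest="$(printf '%s\n%s\n' "$have" "$want" | sort -V | head -n 1)"
  [ "$lowest" = "$want" ]
}

# Only query Helm when it is actually on the PATH.
if command -v helm >/dev/null 2>&1; then
  if version_ge "$(helm version --short)" "3.3"; then
    echo "Helm version is sufficient"
  else
    echo "Helm is older than 3.3" >&2
  fi
fi
```

The same version_ge helper can be applied to the kubectl client version once it has been extracted from the output of kubectl version.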
Add required Helm repositories
helm repo add jetstack https://charts.jetstack.io
helm repo add hashicorp https://helm.releases.hashicorp.com
helm repo add fybrik-charts https://fybrik.github.io/charts
helm repo update
Install cert-manager
Fybrik requires cert-manager to be installed on your cluster [1].
Many clusters already include cert-manager. Check whether the cert-manager namespace exists in your cluster, and run the following only if it does not:
helm install cert-manager jetstack/cert-manager \
--namespace cert-manager \
--version v1.6.2 \
--create-namespace \
--set installCRDs=true \
--wait --timeout 120s
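The namespace check and the conditional installation can be combined into one step. This is a sketch only: the function name install_cert_manager_if_missing is hypothetical, and it treats any failure of kubectl get namespace (including a connection problem) as "namespace missing".

```shell
# Sketch: install cert-manager only when its namespace is absent.
install_cert_manager_if_missing() {
  if kubectl get namespace cert-manager >/dev/null 2>&1; then
    echo "cert-manager namespace already exists; skipping installation"
    return 0
  fi
  # Same command as in the guide, run only when the namespace was not found.
  helm install cert-manager jetstack/cert-manager \
    --namespace cert-manager \
    --version v1.6.2 \
    --create-namespace \
    --set installCRDs=true \
    --wait --timeout 120s
}

# Usage: install_cert_manager_if_missing
```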
Install Hashicorp Vault and plugins
Hashicorp Vault and a secrets-kubernetes-reader plugin are used by Fybrik for credential management.
Install latest development version from GitHub
The published Helm charts are only available for released versions.
To install the dev version, install the charts from the source code.
For example, on Kubernetes:
git clone https://github.com/fybrik/fybrik.git
cd fybrik
helm dependency update charts/vault
helm install vault charts/vault --create-namespace -n fybrik-system \
--set "vault.injector.enabled=false" \
--set "vault.server.dev.enabled=true" \
--values charts/vault/env/dev/vault-single-cluster-values.yaml
kubectl wait --for=condition=ready --all pod -n fybrik-system --timeout=120s
On OpenShift:
git clone https://github.com/fybrik/fybrik.git
cd fybrik
helm dependency update charts/vault
helm install vault charts/vault --create-namespace -n fybrik-system \
--set "vault.global.openshift=true" \
--set "vault.injector.enabled=false" \
--set "vault.server.dev.enabled=true" \
--values charts/vault/env/dev/vault-single-cluster-values.yaml
kubectl wait --for=condition=ready --all pod -n fybrik-system --timeout=120s
To install from the published Helm charts, run the following to install Vault and the plugin in development mode. On Kubernetes:
helm install vault fybrik-charts/vault --create-namespace -n fybrik-system \
--set "vault.injector.enabled=false" \
--set "vault.server.dev.enabled=true" \
--values https://raw.githubusercontent.com/fybrik/fybrik/v1.2.1/charts/vault/env/dev/vault-single-cluster-values.yaml
kubectl wait --for=condition=ready --all pod -n fybrik-system --timeout=120s
On OpenShift:
helm install vault fybrik-charts/vault --create-namespace -n fybrik-system \
--set "vault.global.openshift=true" \
--set "vault.injector.enabled=false" \
--set "vault.server.dev.enabled=true" \
--values https://raw.githubusercontent.com/fybrik/fybrik/v1.2.1/charts/vault/env/dev/vault-single-cluster-values.yaml
kubectl wait --for=condition=ready --all pod -n fybrik-system --timeout=120s
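The Kubernetes and OpenShift variants of the Vault installation differ only in the vault.global.openshift=true setting. A small helper can make that difference explicit. This is a sketch under that observation; the function name vault_set_flags and its platform argument are hypothetical.

```shell
# Sketch: print the --set flags for the Vault chart, one per line.
# $1 is "kubernetes" (default) or "openshift".
vault_set_flags() {
  local platform="${1:-kubernetes}"
  if [ "$platform" = "openshift" ]; then
    echo '--set vault.global.openshift=true'
  fi
  echo '--set vault.injector.enabled=false'
  echo '--set vault.server.dev.enabled=true'
}

# Usage (word splitting of the printed flags is intentional):
#   helm install vault fybrik-charts/vault --create-namespace -n fybrik-system \
#     $(vault_set_flags openshift) \
#     --values https://raw.githubusercontent.com/fybrik/fybrik/v1.2.1/charts/vault/env/dev/vault-single-cluster-values.yaml
```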
Install data catalog
Fybrik assumes the existence of a data catalog that contains the metadata and connection information for data assets. Fybrik currently supports:
- OpenMetadata: An open-source end-to-end metadata management solution that includes data discovery, governance, data quality, observability, collaboration, and lineage.
- Katalog: a data catalog stub used for testing and evaluation purposes.
If you plan to use Katalog, you can skip to the next section, but keep in mind that Katalog is mostly suitable for development and testing.
To use OpenMetadata, you can either use an existing deployment, or run the following commands to deploy OpenMetadata in Kubernetes.
Note: OpenMetadata deployment requires a cluster storage provisioner whose PersistentVolumes support the ReadWriteMany access mode. Below we provide examples of OpenMetadata installations on a single-node kind cluster (for development and testing) and on an OpenShift cluster on IBM Cloud. For other deployments, please check the OpenMetadata Kubernetes deployment documentation.
On a single-node kind cluster:
export FYBRIK_BRANCH=v1.2.1
curl https://raw.githubusercontent.com/fybrik/fybrik/v1.2.1/third_party/openmetadata/install_OM.sh | bash -
The installation of OpenMetadata could take a long time (around 20 minutes on a VM running kind Kubernetes).
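If you prefer to inspect the installer before running it, rather than piping it straight into bash, you can download it first. This is a sketch: the helper names om_installer_url and fetch_om_installer are hypothetical, and it assumes the script behaves the same when run from a local file as when read from standard input.

```shell
# Sketch: build the URL of install_OM.sh for a given Fybrik branch or tag.
# Falls back to $FYBRIK_BRANCH, then to v1.2.1, when no argument is given.
om_installer_url() {
  local branch="${1:-${FYBRIK_BRANCH:-v1.2.1}}"
  echo "https://raw.githubusercontent.com/fybrik/fybrik/${branch}/third_party/openmetadata/install_OM.sh"
}

# Download the installer to a local file ($2, default install_OM.sh).
fetch_om_installer() {
  curl -fsSL "$(om_installer_url "$1")" -o "${2:-install_OM.sh}"
}

# Usage:
#   export FYBRIK_BRANCH=v1.2.1
#   fetch_om_installer "$FYBRIK_BRANCH" && less install_OM.sh && bash install_OM.sh
```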
Alternatively, if you want to change the OpenMetadata configuration parameters, run:
export FYBRIK_BRANCH=v1.2.1
curl https://raw.githubusercontent.com/fybrik/fybrik/v1.2.1/third_party/openmetadata/install_OM.sh | bash -s -- --operation getFiles
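This downloads the installation files to a temporary directory. Change the configuration as needed and then run make.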
Once the installation is over, be sure to remove the temporary directory.
On an OpenShift cluster on IBM Cloud:
export FYBRIK_BRANCH=v1.2.1
curl https://raw.githubusercontent.com/fybrik/fybrik/v1.2.1/third_party/openmetadata/install_OM.sh | bash -s -- --k8s-type ibm-openshift
The installation of OpenMetadata could take a long time (around 20 minutes on a VM running kind Kubernetes).
Alternatively, if you want to change the OpenMetadata configuration parameters, run:
export FYBRIK_BRANCH=v1.2.1
curl https://raw.githubusercontent.com/fybrik/fybrik/v1.2.1/third_party/openmetadata/install_OM.sh | bash -s -- --k8s-type ibm-openshift --operation getFiles
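This downloads the installation files to a temporary directory. Change the configuration as needed and then run make.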
Once the installation is over, be sure to remove the temporary directory.
If you want to use an existing OpenMetadata deployment, you have to configure it according to Fybrik requirements. Run the following commands to download the configuration files:
export FYBRIK_BRANCH=v1.2.1
curl https://raw.githubusercontent.com/fybrik/fybrik/v1.2.1/third_party/openmetadata/install_OM.sh | bash -s -- --operation getFiles
Set the configuration variables (OPENMETADATA_ENDPOINT, OPENMETADATA_USER and OPENMETADATA_PASSWORD) and then run make prepare-openmetadata-for-fybrik.
Running make installs OpenMetadata in the open-metadata namespace. To install OpenMetadata in another namespace, or to change the credentials of the different services used by OpenMetadata, edit the variables in the Makefile.env file.
Install control plane
When installing Fybrik with OpenMetadata as its data catalog, you need to specify the API endpoint for OpenMetadata. The default value for that endpoint is http://openmetadata.open-metadata:8585/api. If you are using a different OpenMetadata deployment, replace the openmetadataConnector.openmetadata_endpoint value in the helm installation command.
Install latest development version from GitHub
The published Helm charts are only available for released versions.
To install the dev version, install the charts from the source code.
For example, with OpenMetadata as the data catalog:
git clone https://github.com/fybrik/fybrik.git
cd fybrik
helm install fybrik-crd charts/fybrik-crd -n fybrik-system --wait
helm install fybrik charts/fybrik --set global.tag=master --set coordinator.catalog=openmetadata --set openmetadataConnector.openmetadata_endpoint=http://openmetadata.open-metadata:8585/api -n fybrik-system --wait
The control plane includes a manager service that connects to a data catalog and to a policy manager.
Install the Fybrik release with OpenMetadata as the data catalog and with Open Policy Agent as the policy manager:
helm install fybrik-crd fybrik-charts/fybrik-crd -n fybrik-system --version 1.2.1 --wait
helm install fybrik fybrik-charts/fybrik --set coordinator.catalog=openmetadata --set openmetadataConnector.openmetadata_endpoint=http://openmetadata.open-metadata:8585/api -n fybrik-system --version 1.2.1 --wait
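When OpenMetadata runs somewhere other than the default location, only the openmetadataConnector.openmetadata_endpoint value changes. The sketch below wraps the release command in a hypothetical function, install_fybrik_with_openmetadata, whose optional argument is the endpoint. The host om.example.com in the usage line is a placeholder, not a real deployment.

```shell
# Sketch: install the Fybrik 1.2.1 release against a chosen OpenMetadata endpoint.
# $1 is the OpenMetadata API endpoint; the guide's default is used when omitted.
install_fybrik_with_openmetadata() {
  local endpoint="${1:-http://openmetadata.open-metadata:8585/api}"
  helm install fybrik fybrik-charts/fybrik \
    --set coordinator.catalog=openmetadata \
    --set "openmetadataConnector.openmetadata_endpoint=${endpoint}" \
    -n fybrik-system --version 1.2.1 --wait
}

# Usage (placeholder endpoint):
#   install_fybrik_with_openmetadata "http://om.example.com:8585/api"
```

The fybrik-crd chart still has to be installed first, as in the commands above.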
Install latest development version from GitHub
The published Helm charts are only available for released versions.
To install the dev version, install the charts from the source code.
For example, with Katalog as the data catalog:
git clone https://github.com/fybrik/fybrik.git
cd fybrik
helm install fybrik-crd charts/fybrik-crd -n fybrik-system --wait
helm install fybrik charts/fybrik --set global.tag=master --set coordinator.catalog=katalog -n fybrik-system --wait
The control plane includes a manager service that connects to a data catalog and to a policy manager.
Install the Fybrik release with Katalog as the data catalog and with Open Policy Agent as the policy manager:
helm install fybrik-crd fybrik-charts/fybrik-crd -n fybrik-system --version 1.2.1 --wait
helm install fybrik fybrik-charts/fybrik --set coordinator.catalog=katalog -n fybrik-system --version 1.2.1 --wait
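After either installation path, it can be useful to confirm that the releases exist and the pods are ready. The sketch below reuses the kubectl wait command from the Vault step. The function name check_fybrik_control_plane is hypothetical, and the 120s timeout is carried over from that step rather than a documented requirement for the control plane.

```shell
# Sketch: list Helm releases in fybrik-system, then wait for all pods there.
check_fybrik_control_plane() {
  helm list -n fybrik-system &&
    kubectl wait --for=condition=ready --all pod -n fybrik-system --timeout=120s
}

# Usage: check_fybrik_control_plane
```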
Install modules
Install latest development version from GitHub
To apply the latest development version of arrow-flight-module:
kubectl apply -f https://raw.githubusercontent.com/fybrik/arrow-flight-module/master/module.yaml -n fybrik-system
Modules are plugins that the control plane deploys whenever required. The arrow flight module enables reading data through Apache Arrow Flight API.
Install the v0.10.0 release [2] of arrow-flight-module:
kubectl apply -f https://github.com/fybrik/arrow-flight-module/releases/download/v0.10.0/module.yaml -n fybrik-system
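To make the module version easy to change, the release URL can be built from a version string. This is a sketch: the function name arrow_flight_module_url is hypothetical, and it assumes other releases follow the same URL pattern as v0.10.0, which should be confirmed against the arrow-flight-module documentation.

```shell
# Sketch: build the module.yaml URL for an arrow-flight-module release tag.
# $1 is the release tag; defaults to the version used in this guide.
arrow_flight_module_url() {
  local version="${1:-v0.10.0}"
  echo "https://github.com/fybrik/arrow-flight-module/releases/download/${version}/module.yaml"
}

# Usage:
#   kubectl apply -f "$(arrow_flight_module_url v0.10.0)" -n fybrik-system
```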
[1] Fybrik version 0.6.0 and lower should use cert-manager 1.2.0.
[2] Refer to the documentation of arrow-flight-module for other versions.