Skip to content

Tools used by the actors

  • The data owner would typically register the dataset in a proprietary or open source catalog. We have provided katalog - a thin layer acting as a replacement for the data catalog for evaluation purposes. This simplifies the sample deployment.
  • The data owner stores credentials for accessing the dataset in kubernetes secrets.
  • Proprietary and open source data governance systems are available either as part of a data catalog or as stand-alone systems. This sample uses the open source OpenPolicyAgent. The data governance officer writes the policies in OPA's rego language.
  • Any editor can be used to write the FybrikApplication.yaml via which the data user expresses the data usage requirements.
  • A jupyter notebook is the workload from which the data is consumed by the data user.
  • A Web Browser

Prepare Fybrik environment

Typically, this would be done by an IT administrator.

  • Install Fybrik using the Quick Start guide. This sample assumes the use of the built-in catalog, Open Policy Agent (OPA) and flight module.

Create a namespace for the sample

Create a new Kubernetes namespace and set it as the active namespace:

kubectl create namespace fybrik-notebook-sample
kubectl config set-context --current --namespace=fybrik-notebook-sample

This enables easy cleanup once you're done experimenting with the sample.