Discovering an S3 asset through the OpenMetadata UI

This page explains how to discover an existing S3 asset through the OpenMetadata UI. This is useful when running workloads such as the one described in Notebook sample for the read flow.

The screenshots refer to the localstack cloud storage, but the explanations also apply to other S3 services.

Begin by opening the OpenMetadata UI in your browser. If you installed OpenMetadata in your Kubernetes cluster in the open-metadata namespace, go to http://localhost:8585 after running:

kubectl port-forward svc/openmetadata -n open-metadata 8585:8585 &
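
Before opening the browser, it can help to confirm that the port-forward is actually listening. The following is a minimal sketch, assuming the forwarded address localhost:8585 from the command above. The helper name `is_port_open` is hypothetical and is not part of OpenMetadata or kubectl. It only checks that something accepts TCP connections on the port, not that the UI has finished starting.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    # localhost:8585 is the address forwarded by the kubectl command above.
    if is_port_open("localhost", 8585):
        print("Something is listening on localhost:8585")
    else:
        print("Nothing is listening on localhost:8585 yet")
```

If the check reports that nothing is listening, the port-forward may have exited or the OpenMetadata service may not be ready yet.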

To create a connection to S3 and discover your CSV asset:

  1. Log in to OpenMetadata. The default username and password are: admin / admin
  2. On the left menu, choose Services
  3. Press Add new Database Service
  4. Choose Datalake and press Next
  5. Enter a name for your service, such as openmetadata-s3, and press Next
  6. Enter the connection information. That information includes the Access Key and Secret Key. The AWS Region is mandatory, but it is ignored if you enter an Endpoint URL. If your object storage is a local localstack deployment, enter its URL (e.g. http://localstack.fybrik-notebook-sample:4566). Optionally, you may enter a Bucket Name, thereby limiting the discovery process to a single bucket
  7. Scroll down and press Test Connection to make sure that the credentials you provided are correct. Once you see that the Connection test was successful, press Save
  8. Choose Add ingestion
  9. You need not change the ingestion configuration. Press Next
  10. Press Next
  11. Press Add & Deploy
  12. The Ingestion Pipeline is created. Press View Service
  13. Choose the Ingestions tab
  14. The status of the Ingestion Pipeline might be Queued...
  15. ... or Running.
  16. Wait until the ingestion process has completed successfully, and press the Explore tab
  17. From the list of all OpenMetadata tables, press the table you are interested in
  18. You can learn the name that OpenMetadata gave your table by looking at the URL. If, for instance, the URL is localhost:8585/table/openmetadata-s3.default.demo."PS_20174392719_1491204439457_log.csv", then your assetID is openmetadata-s3.default.demo."PS_20174392719_1491204439457_log.csv".
  19. To add tags, press Add tag
  20. Choose a tag for the dataset, such as Purpose.finance
  21. Press the check mark
  22. Next, you can add tags to some of the columns
  23. For instance, you may choose PII.Sensitive for columns that need to be redacted
  24. Finally, press the Custom Properties tab
  25. Set the asset properties as needed. If the dataFormat field is left blank, it is assumed to be csv. If the discovered asset is of a different format (e.g. parquet), set dataFormat accordingly. These asset properties will be returned to the Fybrik Manager and will be instrumental in the construction of a data path
    You are all set. OpenMetadata has discovered your asset, and you have added tags and metadata values. You can reference this asset using the asset ID from step 18.
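
The asset ID described in step 18 can also be derived from the table URL programmatically. The following is a minimal sketch, assuming the URL has the form shown in step 18, where the asset ID is everything after `/table/`. The function name `asset_id_from_table_url` is hypothetical. It also assumes that a copied URL may carry the double quotes percent-encoded as `%22`, and that the URL has no trailing path segments or query string after the asset ID.

```python
from urllib.parse import unquote


def asset_id_from_table_url(url: str) -> str:
    """Return the part of an OpenMetadata table URL that follows '/table/'."""
    marker = "/table/"
    _, found, asset_id = url.partition(marker)
    if not found or not asset_id:
        raise ValueError(f"not an OpenMetadata table URL: {url!r}")
    # A copied URL may encode the double quotes as %22; decode them back.
    return unquote(asset_id)


if __name__ == "__main__":
    # Example URL taken from step 18.
    url = 'localhost:8585/table/openmetadata-s3.default.demo."PS_20174392719_1491204439457_log.csv"'
    print(asset_id_from_table_url(url))
    # prints: openmetadata-s3.default.demo."PS_20174392719_1491204439457_log.csv"
```

Note that the double quotes around the file name are part of the asset ID and need to be preserved when the ID is used elsewhere.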