Discovering an S3 asset through the OpenMetadata UI
This page explains how to discover an existing S3 asset through the OpenMetadata UI. This is useful when running workloads such as the one described in Notebook sample for the read flow.
The screenshots refer to the localstack cloud storage, but the explanations also apply to other S3 services.
Begin by opening the OpenMetadata UI in your browser. If you installed OpenMetadata in your Kubernetes cluster in the open-metadata namespace, go to http://localhost:8585 after running:
kubectl port-forward svc/openmetadata -n open-metadata 8585:8585 &
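Before moving on to the UI steps, it can help to confirm that the port-forward is actually serving the UI. The following is a minimal sketch, not part of OpenMetadata or Fybrik: the helper names `ui_url` and `is_reachable` are hypothetical, and the host and port are simply the ones used in the command above.

```python
import urllib.error
import urllib.request


def ui_url(host: str = "localhost", port: int = 8585) -> str:
    """Build the base URL of the port-forwarded OpenMetadata UI."""
    return f"http://{host}:{port}"


def is_reachable(url: str, timeout: float = 3.0) -> bool:
    """Return True if any HTTP response comes back from `url`."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just not with a 2xx status.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, and similar.
        return False


if __name__ == "__main__":
    url = ui_url()
    print(url, "reachable" if is_reachable(url) else "not reachable yet")
```

If the check reports that the UI is not reachable, the port-forward may still be starting, or the service may live in a different namespace than the one assumed here.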
To create a connection to S3 and discover your CSV asset:
- Log in to OpenMetadata. The default username and password are admin/admin
- On the left menu, choose Services
- Press Add new Database Service
- Choose Datalake and press Next
- Enter a name for your service, such as openmetadata-s3, and press Next
- Enter the connection information, which includes the Access Key and Secret Key. The AWS Region is mandatory, but it is ignored if you enter an Endpoint URL. If your object storage is a local localstack deployment, enter its URL (e.g. http://localstack.fybrik-notebook-sample:4566). Optionally, you may enter a Bucket Name, thereby limiting the discovery process to a single bucket
- Scroll down and press Test Connection to make sure that the credentials you provided are correct. Once you see the message Connection test was successful, press Save
- Choose Add ingestion
- You need not change the ingestion configuration. Press Next
- Press Next
- Press Add & Deploy
- The Ingestion Pipeline is created. Press View Service
- Choose the Ingestions tab
- The status of the Ingestion Pipeline might be Queued or Running
- Wait until the ingestion process has completed successfully, and press the Explore tab
- From the list of all OpenMetadata tables, press the table in which you are interested
- You can learn the name that OpenMetadata gave your table by looking at the URL. If, for instance, the URL is localhost:8585/table/openmetadata-s3.default.demo."PS_20174392719_1491204439457_log.csv", then your assetID is openmetadata-s3.default.demo."PS_20174392719_1491204439457_log.csv"
- To add tags, press Add tag
- Choose a tag for the dataset, such as Purpose.finance
- Press the check mark
- Next, you can add tags to some of the columns
- For instance, you may choose PII.Sensitive for columns that need to be redacted
- Finally, press the Custom Properties tab
tab - Set the asset properties as needed. If the
dataFormat
field is left blank, it is assumed to becsv
. If the discovered asset is of a different format (e.g.parquet
), setdataFormat
accordingly. These asset properties will be returned to the Fybrik Manager and will be instrumental in the construction of a data pathYou are all set. OpenMetadata has discovered your asset, and you have added tags and metadata values. You can reference this asset using the asset ID
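Two rules from the steps above lend themselves to a short illustration: the asset ID is the part of the table URL that follows /table/, and a blank dataFormat is treated as csv. The sketch below is only an illustration under stated assumptions. The function names are hypothetical, it assumes nothing else follows the table name in the URL, and it assumes the custom properties are available as a plain dictionary, which is not how the OpenMetadata UI exposes them.

```python
from urllib.parse import unquote

TABLE_MARKER = "/table/"


def asset_id_from_url(url: str) -> str:
    """Return everything after '/table/' in an OpenMetadata table URL."""
    _, marker, tail = url.partition(TABLE_MARKER)
    if not marker or not tail:
        raise ValueError(f"not a table URL: {url!r}")
    # A browser may show the double quotes percent-encoded (%22); decode them.
    return unquote(tail)


def effective_data_format(custom_properties: dict) -> str:
    """Apply the rule from the last step: a blank dataFormat means csv."""
    return custom_properties.get("dataFormat") or "csv"


if __name__ == "__main__":
    url = (
        "localhost:8585/table/"
        'openmetadata-s3.default.demo."PS_20174392719_1491204439457_log.csv"'
    )
    print(asset_id_from_url(url))
    print(effective_data_format({}))
```

Note that the double quotes around the file name are part of the asset ID, so they need to be preserved (and escaped where necessary) wherever the ID is pasted.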