Protocol Documentation

credentials.proto

Credentials

Field Type Label Description
access_key string access credential for the bucket where the asset is stored
secret_key string
username string
password string
api_key string api key assigned to the bucket in which the asset is stored
resource_instance_id string resource instance id for the bucket

data_catalog_request.proto

CatalogDatasetRequest

Field Type Label Description
credential_path string Link to the Vault plugin for reading the Kubernetes secret that holds the user credentials.
dataset_id string Identifier of the asset; always required. Expected to be JSON. It is interpreted by the Connector and can carry any additional information inside the JSON.
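
Because dataset_id is expected to be a JSON string, a caller has to serialize the identifier before placing it in the request. The following is a minimal sketch of that step using plain dictionaries instead of generated protobuf classes. The key names asset_id and catalog_id, the helper names, and the credential path are hypothetical and only illustrate the shape; the actual keys depend on the Connector.

```python
import json


def make_dataset_id(asset_id: str, **extra: str) -> str:
    """Serialize an asset identifier plus optional extra fields to JSON.

    The key "asset_id" is a hypothetical name chosen for this sketch.
    """
    payload = {"asset_id": asset_id, **extra}
    return json.dumps(payload, sort_keys=True)


def make_catalog_request(credential_path: str, dataset_id: str) -> dict:
    """Build a dict shaped like CatalogDatasetRequest.

    Rejects a dataset_id that is not valid JSON (json.loads raises
    json.JSONDecodeError, a subclass of ValueError).
    """
    json.loads(dataset_id)
    return {"credential_path": credential_path, "dataset_id": dataset_id}


# Hypothetical values, for illustration only.
request = make_catalog_request(
    "/v1/hypothetical/secret-path",
    make_dataset_id("asset-123", catalog_id="cat-1"),
)
```

Keeping dataset_id as an opaque JSON string lets each Connector define its own identifier fields without changing the message definition.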

data_catalog_response.proto

CatalogDatasetInfo

Field Type Label Description
dataset_id string
details DatasetDetails

data_catalog_service.proto

DataCatalogService

Method Name Request Type Response Type Description
GetDatasetInfo CatalogDatasetRequest CatalogDatasetInfo
RegisterDatasetInfo RegisterAssetRequest RegisterAssetResponse

dataset_details.proto

CredentialsInfo

Field Type Label Description
vault_secret_path string Path to the Vault secret that is used to retrieve the dataset credentials from the catalog.

DataComponentMetadata

Field Type Label Description
component_type string e.g., column
named_metadata DataComponentMetadata.NamedMetadataEntry repeated Named terms that exist in the Catalog taxonomy, together with their values. For columns there will be a "SchemaDetails" key that holds the technical schema details of the column. TODO: consider creating a dedicated field for the schema outside of the metadata.
tags string repeated Tags - can be any free text added to a component (no taxonomy)

DataComponentMetadata.NamedMetadataEntry

Field Type Label Description
key string
value string

DataStore

Field Type Label Description
type DataStore.DataStoreType
name string For auditing and readability. Can be the same as the location type, or can carry more information if available from the catalog.
db2 Db2DataStore Intended to be a member of a "oneof location" group together with s3 and kafka. The oneof was removed for now for technical reasons (a problem translating it to JSON). The members should have been named local, db2, s3 without "location", but that did not compile in proto because of a collision with the proto name DataLocationDb2.
s3 S3DataStore
kafka KafkaDataStore
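
Since db2, s3 and kafka are plain fields rather than members of a oneof, nothing in the message itself guarantees that exactly one of them is set. A consumer can use the type field to decide which one to read. The sketch below models this with a Python dataclass rather than generated protobuf code; the sub-messages are represented as plain dicts, and the helper active_location is a hypothetical name introduced for illustration.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class DataStoreType(IntEnum):
    """Mirrors DataStore.DataStoreType as documented on this page."""
    UNKNOWN = 0
    LOCAL = 1
    S3 = 2
    DB2 = 3
    KAFKA = 4


@dataclass
class DataStore:
    """Stand-in for the DataStore message; sub-messages are plain dicts."""
    type: DataStoreType = DataStoreType.UNKNOWN
    name: str = ""
    db2: Optional[dict] = None
    s3: Optional[dict] = None
    kafka: Optional[dict] = None


# Which field carries the location for each store type.
_FIELD_BY_TYPE = {
    DataStoreType.DB2: "db2",
    DataStoreType.S3: "s3",
    DataStoreType.KAFKA: "kafka",
}


def active_location(store: DataStore) -> Optional[dict]:
    """Return the sub-message selected by store.type, emulating a oneof."""
    field_name = _FIELD_BY_TYPE.get(store.type)
    if field_name is None:
        # UNKNOWN and LOCAL have no location field in the documented message.
        return None
    location = getattr(store, field_name)
    if location is None:
        raise ValueError(
            f"type is {store.type.name} but field '{field_name}' is not set"
        )
    return location
```

Dispatching on type keeps the reader robust even if a producer accidentally populates more than one of the location fields.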

DatasetDetails

Field Type Label Description
name string name in Catalog
data_owner string information on the owner of data asset - can have different formats for different catalogs
data_store DataStore All info about the data store
data_format string
geo string Geographic location where the data resides (if this information is available).
metadata DatasetMetadata Note carried over from the proto source: a commented-out field, LocationType locationType = 10 (publicCloud/privateCloud etc.), should be filled in later, once it is better understood whether there is a closed set of values and how they are used.
credentials_info CredentialsInfo Information about how to retrieve the dataset credentials from the catalog.

DatasetMetadata

Field Type Label Description
dataset_named_metadata DatasetMetadata.DatasetNamedMetadataEntry repeated
dataset_tags string repeated Tags - can be any free text added to a component (no taxonomy)
components_metadata DatasetMetadata.ComponentsMetadataEntry repeated Metadata for each component in the asset. In tabular data each column is a component, so the map is: column name -> column metadata.
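
To make the column name -> column metadata mapping concrete, here is a small sketch that models DatasetMetadata and DataComponentMetadata as Python dataclasses and looks up the columns that carry a given tag. The dataclasses are stand-ins for the generated protobuf classes, and the column names, the tag "PII" and the helper columns_with_tag are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DataComponentMetadata:
    component_type: str = ""
    named_metadata: Dict[str, str] = field(default_factory=dict)
    tags: List[str] = field(default_factory=list)


@dataclass
class DatasetMetadata:
    dataset_named_metadata: Dict[str, str] = field(default_factory=dict)
    dataset_tags: List[str] = field(default_factory=list)
    # For tabular data: column name -> column metadata.
    components_metadata: Dict[str, DataComponentMetadata] = field(
        default_factory=dict
    )


def columns_with_tag(metadata: DatasetMetadata, tag: str) -> List[str]:
    """Return the sorted names of column components that carry the tag."""
    return sorted(
        name
        for name, component in metadata.components_metadata.items()
        if component.component_type == "column" and tag in component.tags
    )


# Hypothetical sample data, for illustration only.
sample_metadata = DatasetMetadata(
    dataset_tags=["finance"],
    components_metadata={
        "name": DataComponentMetadata("column", tags=["PII"]),
        "age": DataComponentMetadata("column"),
        "email": DataComponentMetadata("column", tags=["PII"]),
    },
)
```

With this sample, columns_with_tag(sample_metadata, "PII") selects the name and email columns and skips age.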

DatasetMetadata.ComponentsMetadataEntry

Field Type Label Description
key string
value DataComponentMetadata

DatasetMetadata.DatasetNamedMetadataEntry

Field Type Label Description
key string
value string

Db2DataStore

Field Type Label Description
url string
database string
table string reformat to SCHEMA.TABLE struct
port string
ssl string Note that a bool value set to "false" does not appear in the struct at all.
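
Because a "false" value may be missing from the serialized struct, a reader should treat an absent ssl entry as disabled instead of failing. The following sketch parses a JSON rendering of Db2DataStore under several assumptions that are not confirmed by this page: the JSON keys match the field names above, ssl is carried as the string "true" or "false", and table uses the SCHEMA.TABLE form. The function name parse_db2_store and the sample values are hypothetical.

```python
import json


def parse_db2_store(raw_json: str) -> dict:
    """Normalize a JSON rendering of Db2DataStore into a plain dict."""
    data = json.loads(raw_json)

    # A missing "ssl" key is treated the same as an explicit "false".
    ssl_enabled = str(data.get("ssl", "false")).lower() == "true"

    # Split "SCHEMA.TABLE"; if there is no dot, keep the whole value as table.
    schema, separator, table = data.get("table", "").partition(".")
    if not separator:
        schema, table = "", schema

    return {
        "url": data.get("url", ""),
        "database": data.get("database", ""),
        "port": data.get("port", ""),
        "schema": schema,
        "table": table,
        "ssl": ssl_enabled,
    }


# Hypothetical sample in which "ssl" is absent.
sample_store = parse_db2_store(
    '{"url": "db2.example.com", "database": "BLUDB",'
    ' "table": "SALES.ORDERS", "port": "50000"}'
)
```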

KafkaDataStore

Field Type Label Description
topic_name string
bootstrap_servers string
schema_registry string
key_deserializer string
value_deserializer string
security_protocol string
sasl_mechanism string
ssl_truststore string
ssl_truststore_password string

S3DataStore

Field Type Label Description
endpoint string
bucket string
object_key string Can be the object name or the prefix for the dataset.
region string WKC does not return it, so it stays empty in our case.

DataStore.DataStoreType

Name Number Description
UNKNOWN 0
LOCAL 1
S3 2
DB2 3
KAFKA 4

policy_manager_request.proto

AccessOperation

Field Type Label Description
type AccessOperation.AccessType
destination string Destination for transfer or write.

ApplicationContext

Field Type Label Description
credential_path string Link to the Vault plugin for reading the Kubernetes secret that holds the user credentials.
app_info ApplicationDetails
datasets DatasetContext repeated
general_operations AccessOperation repeated

ApplicationDetails

Field Type Label Description
processing_geography string
properties ApplicationDetails.PropertiesEntry repeated

ApplicationDetails.PropertiesEntry

Field Type Label Description
key string
value string

DatasetContext

Field Type Label Description
dataset DatasetIdentifier
operation AccessOperation

DatasetIdentifier

Field Type Label Description
dataset_id string Identifier of the asset; always required. Expected to be JSON. It is interpreted by the Connector and can carry any additional information inside the JSON.

AccessOperation.AccessType

Name Number Description
UNKNOWN 0
READ 1
COPY 2
WRITE 3
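
The request messages above nest as ApplicationContext -> DatasetContext -> DatasetIdentifier and AccessOperation. As a rough illustration, the sketch below assembles that structure as plain dictionaries, which is approximately what a JSON rendering of the request would look like. Enum values are written by name, as in the standard proto3 JSON mapping. The helper names and all sample values are hypothetical, and a real client would use the generated protobuf classes instead.

```python
from enum import IntEnum
from typing import Dict, List, Optional


class AccessType(IntEnum):
    """Mirrors AccessOperation.AccessType as documented on this page."""
    UNKNOWN = 0
    READ = 1
    COPY = 2
    WRITE = 3


def access_operation(access_type: AccessType, destination: str = "") -> dict:
    """Build an AccessOperation; destination is included only when given."""
    operation = {"type": access_type.name}
    if destination:
        operation["destination"] = destination
    return operation


def application_context(
    credential_path: str,
    processing_geography: str,
    dataset_ids: List[str],
    access_type: AccessType,
    destination: str = "",
    properties: Optional[Dict[str, str]] = None,
) -> dict:
    """Build an ApplicationContext with one DatasetContext per dataset id."""
    return {
        "credential_path": credential_path,
        "app_info": {
            "processing_geography": processing_geography,
            "properties": dict(properties or {}),
        },
        "datasets": [
            {
                "dataset": {"dataset_id": dataset_id},
                "operation": access_operation(access_type, destination),
            }
            for dataset_id in dataset_ids
        ],
        "general_operations": [],
    }


# Hypothetical sample values, for illustration only.
sample_context = application_context(
    "/v1/hypothetical/secret-path",
    "hypothetical-geo",
    ['{"asset_id": "a1"}', '{"asset_id": "a2"}'],
    AccessType.COPY,
    destination="hypothetical-destination",
)
```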

policy_manager_response.proto

ComponentVersion

Field Type Label Description
name string
id string
version string

DatasetDecision

Field Type Label Description
dataset DatasetIdentifier
decisions OperationDecision repeated

EnforcementAction

Field Type Label Description
name string
id string
level EnforcementAction.EnforcementActionLevel
args EnforcementAction.ArgsEntry repeated

EnforcementAction.ArgsEntry

Field Type Label Description
key string
value string

OperationDecision

Field Type Label Description
operation AccessOperation
enforcement_actions EnforcementAction repeated
used_policies Policy repeated

PoliciesDecisions

Field Type Label Description
component_versions ComponentVersion repeated
dataset_decisions DatasetDecision repeated one per dataset
general_decisions OperationDecision repeated

Policy

Field Type Label Description
id string
name string
description string
type string
hierarchy string repeated

EnforcementAction.EnforcementActionLevel

Name Number Description
UNKNOWN 0
DATASET 1
COLUMN 2
ROW 3
CELL 4
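
A PoliciesDecisions response is nested three levels deep: dataset_decisions holds one DatasetDecision per dataset, each with a list of OperationDecision entries, each of which lists its EnforcementAction entries. The sketch below walks that structure on a dictionary rendering of the response and collects, per dataset, the action names and levels. The function name actions_by_dataset, the action name "redact" and the sample payload are hypothetical.

```python
from typing import Dict, List, Tuple


def actions_by_dataset(decisions: dict) -> Dict[str, List[Tuple[str, str]]]:
    """Map each dataset_id to its (action name, level) pairs, in order."""
    result: Dict[str, List[Tuple[str, str]]] = {}
    for dataset_decision in decisions.get("dataset_decisions", []):
        dataset_id = dataset_decision["dataset"]["dataset_id"]
        collected = result.setdefault(dataset_id, [])
        for operation_decision in dataset_decision.get("decisions", []):
            for action in operation_decision.get("enforcement_actions", []):
                # A missing level is reported as UNKNOWN (enum value 0).
                collected.append(
                    (action["name"], action.get("level", "UNKNOWN"))
                )
    return result


# Hypothetical sample response, for illustration only.
sample_decisions = {
    "dataset_decisions": [
        {
            "dataset": {"dataset_id": '{"asset_id": "a1"}'},
            "decisions": [
                {
                    "operation": {"type": "READ"},
                    "enforcement_actions": [
                        {
                            "name": "redact",
                            "level": "COLUMN",
                            "args": {"column": "email"},
                        }
                    ],
                    "used_policies": [],
                }
            ],
        },
        {"dataset": {"dataset_id": '{"asset_id": "a2"}'}, "decisions": []},
    ],
    "general_decisions": [],
}
```

A dataset with no enforcement actions still appears in the result with an empty list, so callers can distinguish "no action required" from "dataset not evaluated".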

policy_manager_service.proto

PolicyManagerService

Method Name Request Type Response Type Description
GetPoliciesDecisions ApplicationContext PoliciesDecisions

register_asset_request.proto

RegisterAssetRequest

Field Type Label Description
creds Credentials
dataset_details DatasetDetails
destination_catalog_id string
credential_path string Link to the Vault plugin for reading the Kubernetes secret that holds the user credentials.

register_asset_response.proto

RegisterAssetResponse

Field Type Label Description
asset_id string Returns the id of the new asset registered in a catalog

Scalar Value Types

| .proto Type | Notes | C++ | Java | Python | Go | C# | PHP | Ruby |
|---|---|---|---|---|---|---|---|---|
| double | | double | double | float | float64 | double | float | Float |
| float | | float | float | float | float32 | float | float | Float |
| int32 | Uses variable-length encoding. Inefficient for encoding negative numbers; if your field is likely to have negative values, use sint32 instead. | int32 | int | int | int32 | int | integer | Bignum or Fixnum (as required) |
| int64 | Uses variable-length encoding. Inefficient for encoding negative numbers; if your field is likely to have negative values, use sint64 instead. | int64 | long | int/long | int64 | long | integer/string | Bignum |
| uint32 | Uses variable-length encoding. | uint32 | int | int/long | uint32 | uint | integer | Bignum or Fixnum (as required) |
| uint64 | Uses variable-length encoding. | uint64 | long | int/long | uint64 | ulong | integer/string | Bignum or Fixnum (as required) |
| sint32 | Uses variable-length encoding. Signed int value. These more efficiently encode negative numbers than regular int32s. | int32 | int | int | int32 | int | integer | Bignum or Fixnum (as required) |
| sint64 | Uses variable-length encoding. Signed int value. These more efficiently encode negative numbers than regular int64s. | int64 | long | int/long | int64 | long | integer/string | Bignum |
| fixed32 | Always four bytes. More efficient than uint32 if values are often greater than 2^28. | uint32 | int | int | uint32 | uint | integer | Bignum or Fixnum (as required) |
| fixed64 | Always eight bytes. More efficient than uint64 if values are often greater than 2^56. | uint64 | long | int/long | uint64 | ulong | integer/string | Bignum |
| sfixed32 | Always four bytes. | int32 | int | int | int32 | int | integer | Bignum or Fixnum (as required) |
| sfixed64 | Always eight bytes. | int64 | long | int/long | int64 | long | integer/string | Bignum |
| bool | | bool | boolean | boolean | bool | bool | boolean | TrueClass/FalseClass |
| string | A string must always contain UTF-8 encoded or 7-bit ASCII text. | string | String | str/unicode | string | string | string | String (UTF-8) |
| bytes | May contain any arbitrary sequence of bytes. | string | ByteString | str | []byte | ByteString | string | String (ASCII-8BIT) |