provenaclient.clients.prov_client

Created Date: Monday June 17th 2024 +1000
Author: Peter Baker
Last Modified: Monday June 17th 2024 4:45:39 pm +1000
Modified By: Peter Baker
Description: Prov API L2 Client.

HISTORY:
Date | By | Comments
18-06-2024 | Peter Baker | Note that this layer does not provide any file IO capabilities - see L3

Classes

ProvAPIEndpoints

An ENUM containing the prov api endpoints.

ProvAPIAdminEndpoints

An ENUM containing the prov api admin endpoints.

ProvAdminClient

This class interface just captures that the client has an instantiated auth manager which allows for helper functions abstracted for L2 clients.

ProvClient

This class interface just captures that the client has an instantiated auth manager which allows for helper functions abstracted for L2 clients.

Module Contents

class provenaclient.clients.prov_client.ProvAPIEndpoints[source]

Bases: str, enum.Enum

An ENUM containing the prov api endpoints.

POST_MODEL_RUN_REGISTER = '/model_run/register'
POST_MODEL_RUN_UPDATE = '/model_run/update'
POST_MODEL_RUN_REGISTER_BATCH = '/model_run/register_batch'
POST_GENERATE_REPORT = '/explore/generate/report'
GET_EXPLORE_UPSTREAM = '/explore/upstream'
GET_EXPLORE_DOWNSTREAM = '/explore/downstream'
GET_EXPLORE_SPECIAL_CONTRIBUTING_DATASETS = '/explore/special/contributing_datasets'
GET_EXPLORE_SPECIAL_EFFECTED_DATASETS = '/explore/special/effected_datasets'
GET_EXPLORE_SPECIAL_CONTRIBUTING_AGENTS = '/explore/special/contributing_agents'
GET_EXPLORE_SPECIAL_EFFECTED_AGENTS = '/explore/special/effected_agents'
GET_HEALTH_CHECK = '/'
GET_BULK_GENERATE_TEMPLATE_CSV = '/bulk/generate_template/csv'
POST_BULK_CONVERT_MODEL_RUNS_CSV = '/bulk/convert_model_runs/csv'
GET_BULK_REGENERATE_FROM_BATCH_CSV = '/bulk/regenerate_from_batch/csv'
GET_CHECK_ACCESS_CHECK_GENERAL_ACCESS = '/check-access/check-general-access'
GET_CHECK_ACCESS_CHECK_ADMIN_ACCESS = '/check-access/check-admin-access'
GET_CHECK_ACCESS_CHECK_READ_ACCESS = '/check-access/check-read-access'
GET_CHECK_ACCESS_CHECK_WRITE_ACCESS = '/check-access/check-write-access'
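
Because the enum mixes in `str`, each member's value is the endpoint path itself. The following is a minimal sketch of how such a member might be joined to a base URL. The `build_endpoint` helper and the base URL are hypothetical; this is not the library's `_build_endpoint` implementation, and the enum here is a local copy of two members for illustration only.

```python
from enum import Enum


class ProvAPIEndpoints(str, Enum):
    """Local copy of two members, for illustration only."""

    GET_EXPLORE_UPSTREAM = "/explore/upstream"
    GET_HEALTH_CHECK = "/"


def build_endpoint(base_url: str, endpoint: ProvAPIEndpoints) -> str:
    """Hypothetical helper: join a base URL and an endpoint path."""
    # Strip any trailing slash so the join never produces "//".
    return base_url.rstrip("/") + endpoint.value


# "https://prov-api.example.org/" is a placeholder, not a real deployment.
upstream_url = build_endpoint(
    "https://prov-api.example.org/", ProvAPIEndpoints.GET_EXPLORE_UPSTREAM
)
```
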
class provenaclient.clients.prov_client.ProvAPIAdminEndpoints[source]

Bases: str, enum.Enum

An ENUM containing the prov api admin endpoints.

GET_ADMIN_CONFIG = '/admin/config'
POST_ADMIN_STORE_RECORD = '/admin/store_record'
POST_ADMIN_STORE_RECORDS = '/admin/store_records'
POST_ADMIN_STORE_ALL_REGISTRY_RECORDS = '/admin/store_all_registry_records'
GET_ADMIN_SENTRY_DEBUG = '/admin/sentry-debug'
class provenaclient.clients.prov_client.ProvAdminClient(auth: provenaclient.clients.client_helpers.AuthManager, config: provenaclient.clients.client_helpers.Config)[source]

Bases: provenaclient.clients.client_helpers.ClientService

This class interface just captures that the client has an instantiated auth manager which allows for helper functions abstracted for L2 clients.

_auth
_config
_build_endpoint(endpoint: ProvAPIAdminEndpoints) str[source]
async generate_config_file(required_only: bool) str[source]

Generates a nicely formatted .env file of the current required/non-supplied properties. Used to quickly bootstrap a local environment or to understand the currently deployed API.

Parameters:

required_only (bool, optional) – By default True

async store_record(registry_record: ProvenaInterfaces.RegistryAPI.ItemModelRun, validate_record: bool) provenaclient.clients.client_helpers.StatusResponse[source]

An admin only endpoint which enables the reupload/storage of an existing completed provenance record.

Parameters:
  • registry_record (ItemModelRun) – The completed registry record for the model run.

  • validate_record (bool) – Optional. Whether the IDs in the payload should be validated, by default True

Returns:

A status response indicating the success of the request and any other details.

Return type:

StatusResponse

async store_multiple_records(registry_record: provenaclient.clients.client_helpers.List[ProvenaInterfaces.RegistryAPI.ItemModelRun], validate_record: bool) provenaclient.clients.client_helpers.StatusResponse[source]

An admin only endpoint which enables the reupload/storage of multiple existing completed provenance records.

Parameters:
  • registry_record (List[ItemModelRun]) – List of the completed registry records for the model runs.

  • validate_record (bool) – Optional. Whether the IDs in the payload should be validated, by default True

Returns:

A status response indicating the success of the request and any other details.

Return type:

StatusResponse

async store_all_registry_records(validate_record: bool) provenaclient.clients.client_helpers.StatusResponse[source]

Applies the store record endpoint action across a list of ItemModelRuns which is found by querying the registry model run list endpoint directly.

Parameters:

validate_record (bool) – Optional. Whether the IDs in the payload should be validated, by default True

Returns:

A status response indicating the success of the request and any other details.

Return type:

StatusResponse

class provenaclient.clients.prov_client.ProvClient(auth: provenaclient.clients.client_helpers.AuthManager, config: provenaclient.clients.client_helpers.Config)[source]

Bases: provenaclient.clients.client_helpers.ClientService

This class interface just captures that the client has an instantiated auth manager which allows for helper functions abstracted for L2 clients.

admin: ProvAdminClient
_auth
_config
_build_endpoint(endpoint: ProvAPIEndpoints) str[source]
async get_health_check() provenaclient.models.general.HealthCheckResponse[source]

Checks the health status of the PROV-API.

Returns:

Response containing the PROV-API health information.

Return type:

HealthCheckResponse

async post_update_model_run(model_run_id: str, reason: str, record: ProvenaInterfaces.ProvenanceAPI.ModelRunRecord) ProvenaInterfaces.ProvenanceAPI.PostUpdateModelRunResponse[source]

Updates an existing model run in the system.

Parameters:
  • model_run_id (str) – The ID of the model run to update

  • reason (str) – The reason for the update

  • record (ModelRunRecord) – The updated model run record

Returns:

The response containing the job session ID

Return type:

PostUpdateModelRunResponse

async explore_upstream(starting_id: str, depth: int) ProvenaInterfaces.ProvenanceAPI.LineageResponse[source]

Explores in the upstream direction (inputs/associations) starting at the specified node handle ID. The search depth is bounded by the depth parameter which has a default maximum of 100.

Parameters:
  • starting_id (str) – The ID of the entity to start at.

  • depth (int, optional) – The depth to traverse in the upstream direction, by default 100.

Returns:

A response containing the status, node count, and networkx serialised graph response.

Return type:

LineageResponse
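
The "networkx serialised graph" mentioned above can be inspected without networkx if it is in node-link form. The sketch below assumes that format (a dict with `nodes` and either `links` or `edges`); the attribute on `LineageResponse` that holds the graph is not shown on this page, so the sample payload and the `adjacency_from_node_link` helper are illustrative assumptions.

```python
from collections import defaultdict
from typing import Any, Dict, List


def adjacency_from_node_link(node_link: Dict[str, Any]) -> Dict[str, List[str]]:
    """Build {source: [targets]} from a node-link style dict (assumed format)."""
    adjacency: Dict[str, List[str]] = defaultdict(list)
    # Accept either "links" or "edges" as the edge-list key, since
    # serialisers differ on the name.
    edges = node_link.get("links") or node_link.get("edges") or []
    for edge in edges:
        adjacency[edge["source"]].append(edge["target"])
    return dict(adjacency)


# Hand-written sample payload; the IDs are made up.
sample_graph = {
    "directed": True,
    "nodes": [{"id": "model-run-1"}, {"id": "dataset-a"}, {"id": "dataset-b"}],
    "links": [
        {"source": "model-run-1", "target": "dataset-a"},
        {"source": "model-run-1", "target": "dataset-b"},
    ],
}
adjacency = adjacency_from_node_link(sample_graph)
```
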

async explore_downstream(starting_id: str, depth: int) ProvenaInterfaces.ProvenanceAPI.LineageResponse[source]

Explores in the downstream direction (inputs/associations) starting at the specified node handle ID. The search depth is bounded by the depth parameter which has a default maximum of 100.

Parameters:
  • starting_id (str) – The ID of the entity to start at.

  • depth (int, optional) – The depth to traverse in the downstream direction, by default 100

Returns:

A response containing the status, node count, and networkx serialised graph response.

Return type:

LineageResponse

async get_contributing_datasets(starting_id: str, depth: int) ProvenaInterfaces.ProvenanceAPI.LineageResponse[source]

Fetches datasets (inputs) which are involved in a model run, naturally in the upstream direction.

Parameters:
  • starting_id (str) – The ID of the entity to start at.

  • depth (int, optional) – The depth to traverse in the upstream direction, by default 100

Returns:

A response containing the status, node count, and networkx serialised graph response.

Return type:

LineageResponse

async get_effected_datasets(starting_id: str, depth: int) ProvenaInterfaces.ProvenanceAPI.LineageResponse[source]

Fetches datasets (outputs) which are derived from the model run naturally in the downstream direction.

Parameters:
  • starting_id (str) – The ID of the entity to start at.

  • depth (int, optional) – The depth to traverse in the downstream direction, by default 100.

Returns:

A response containing the status, node count, and networkx serialised graph response.

Return type:

LineageResponse

async get_contributing_agents(starting_id: str, depth: int) ProvenaInterfaces.ProvenanceAPI.LineageResponse[source]

Fetches agents (organisations or people) that are involved in or impacted by the model run, naturally in the upstream direction.

Parameters:
  • starting_id (str) – The ID of the entity to start at.

  • depth (int, optional) – The depth to traverse in the upstream direction, by default 100.

Returns:

A response containing the status, node count, and networkx serialised graph response.

Return type:

LineageResponse

async get_effected_agents(starting_id: str, depth: int) ProvenaInterfaces.ProvenanceAPI.LineageResponse[source]

Fetches agents (organisations or people) that are involved in or impacted by the model run, naturally in the downstream direction.

Parameters:
  • starting_id (str) – The ID of the entity to start at.

  • depth (int, optional) – The depth to traverse in the downstream direction, by default 100.

Returns:

A response containing the status, node count, and networkx serialised graph response.

Return type:

LineageResponse
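
Since these exploration methods are coroutines, an upstream and a downstream query can be awaited together. This is a sketch only: `StubProvClient` is a stand-in that mirrors two of the method names and parameters documented here and returns canned dicts, not real `LineageResponse` objects, and `datasets_around` is a hypothetical helper.

```python
import asyncio
from typing import Any, Dict, List


class StubProvClient:
    """Stand-in mirroring two ProvClient method names; returns canned data."""

    async def get_contributing_datasets(self, starting_id: str, depth: int) -> Dict[str, Any]:
        return {"direction": "upstream", "start": starting_id, "depth": depth}

    async def get_effected_datasets(self, starting_id: str, depth: int) -> Dict[str, Any]:
        return {"direction": "downstream", "start": starting_id, "depth": depth}


async def datasets_around(client: Any, model_run_id: str, depth: int = 2) -> List[Any]:
    """Hypothetical helper: fetch input and output datasets concurrently."""
    # gather() runs both coroutines concurrently and keeps argument order.
    return await asyncio.gather(
        client.get_contributing_datasets(starting_id=model_run_id, depth=depth),
        client.get_effected_datasets(starting_id=model_run_id, depth=depth),
    )


inputs, outputs = asyncio.run(datasets_around(StubProvClient(), "model-run-1"))
```

With a real client, the same helper would receive an instantiated `ProvClient` in place of the stub.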

async register_batch_model_runs(model_run_batch_payload: ProvenaInterfaces.ProvenanceAPI.RegisterBatchModelRunRequest) ProvenaInterfaces.ProvenanceAPI.RegisterBatchModelRunResponse[source]

This function allows you to register multiple model runs in one go (batch) asynchronously.

Note: You can utilise the returned session ID to poll on the JOB API to check status of the model run registration(s).

Parameters:

model_run_batch_payload (RegisterBatchModelRunRequest) – A list of model runs (ModelRunRecord objects)

Returns:

The job session id derived from job-api for the model-run batch.

Return type:

RegisterBatchModelRunResponse

async register_model_run(model_run_payload: ProvenaInterfaces.ProvenanceAPI.ModelRunRecord) ProvenaInterfaces.ProvenanceAPI.RegisterModelRunResponse[source]

Asynchronously registers a single model run.

Note: You can utilise the returned session ID to poll on the JOB API to check status of the model run registration.

Parameters:

model_run_payload (ModelRunRecord) – Contains information needed for the model run such as workflow template, inputs, outputs, description etc.

Returns:

The job session id derived from job-api for the model-run.

Return type:

RegisterModelRunResponse
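
The notes on both registration methods mention polling the Job API with the returned session ID. The Job API client is not documented on this page, so the sketch below takes the status lookup as a caller-supplied coroutine. `poll_until_done`, `fake_fetch_status`, and the status strings are all placeholders chosen for illustration, not values confirmed by the API.

```python
import asyncio
from typing import Awaitable, Callable, FrozenSet


async def poll_until_done(
    fetch_status: Callable[[str], Awaitable[str]],
    session_id: str,
    done_states: FrozenSet[str] = frozenset({"SUCCEEDED", "FAILED"}),
    interval_s: float = 0.01,
    max_attempts: int = 50,
) -> str:
    """Call fetch_status until it reports a terminal state or attempts run out."""
    for _ in range(max_attempts):
        status = await fetch_status(session_id)
        if status in done_states:
            return status
        # Wait before asking again so the service is not hammered.
        await asyncio.sleep(interval_s)
    raise TimeoutError(f"session {session_id} did not finish in time")


# Fake status source standing in for a real Job API lookup.
_canned_states = iter(["PENDING", "IN_PROGRESS", "SUCCEEDED"])


async def fake_fetch_status(session_id: str) -> str:
    return next(_canned_states)


final_status = asyncio.run(poll_until_done(fake_fetch_status, "session-123"))
```
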

async generate_csv_template(workflow_template_id: str) str[source]

Generates a model run csv template to be utilised for creating model runs through csv format.

Parameters:

workflow_template_id (str) – An ID of a created and existing model run workflow template.

async convert_model_runs_to_csv(csv_file_contents: str) ProvenaInterfaces.ProvenanceAPI.ConvertModelRunsResponse[source]

Reads a CSV file and its defined model run contents, and lodges a model run.

Parameters:

csv_file_contents (str) – Contains the model run contents.

Returns:

The model run information in an interactive Python datatype.

Return type:

ConvertModelRunsResponse

async regenerate_csv_from_model_run_batch(batch_id: str) str[source]

Regenerate/create a csv file containing model run information from a model run batch job.

The batch id must exist in the system.

Parameters:

batch_id (str) – Obtained from creating a batch model run.

async generate_report(report_request: ProvenaInterfaces.ProvenanceAPI.GenerateReportRequest) provenaclient.clients.client_helpers.ByteString[source]

Generates a provenance report from a Study or Model Run Entity containing the associated inputs, model runs and outputs involved.

The report is generated in .docx format by making a POST request to the API.

Parameters:

report_request (GenerateReportRequest) – The request object containing the parameters for generating the report, including the id, item_subtype, and depth.

Returns:

The raw byte content of the generated .docx file. The type of the returned content will be either bytes or bytearray, which can be directly saved to a file.

Return type:

ByteString

Raises:

AssertionError – If the response content is not found or is not in the expected bytes or bytearray format.
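
Because this layer performs no file IO, writing the returned bytes to disk is left to the caller. A minimal sketch, assuming the report content has already been obtained; `save_report` is a hypothetical helper and the bytes below are a placeholder, not a valid .docx document.

```python
import tempfile
from pathlib import Path
from typing import Union


def save_report(content: Union[bytes, bytearray], path: Path) -> int:
    """Write report bytes to path and return the number of bytes written."""
    # Mirror the documented contract: only bytes or bytearray are expected.
    if not isinstance(content, (bytes, bytearray)):
        raise TypeError("expected bytes or bytearray")
    return path.write_bytes(content)


# Placeholder bytes stand in for a real .docx payload.
fake_report = bytearray(b"PK\x03\x04 placeholder")
out_path = Path(tempfile.mkdtemp()) / "report.docx"
written = save_report(fake_report, out_path)
```
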