provenaclient.clients.prov_client

Created Date: Monday June 17th 2024 +1000
Author: Peter Baker
Last Modified: Monday June 17th 2024 4:45:39 pm +1000
Modified By: Peter Baker
Description: Prov API L2 Client.

HISTORY:
Date | By | Comments
18-06-2024 | Peter Baker | Note that this layer does not provide any file IO capabilities - see L3

Classes

`ProvAPIEndpoints`	An ENUM containing the prov api endpoints.
`ProvAPIAdminEndpoints`	An ENUM containing the prov api admin endpoints.
`ProvAdminClient`	This class interface just captures that the client has an instantiated auth manager.
`ProvClient`	This class interface just captures that the client has an instantiated auth manager.

Module Contents

class provenaclient.clients.prov_client.ProvAPIEndpoints[source]

Bases: str, enum.Enum

An ENUM containing the prov api endpoints.

POST_MODEL_RUN_REGISTER = '/model_run/register'

POST_MODEL_RUN_UPDATE = '/model_run/update'

POST_MODEL_RUN_REGISTER_BATCH = '/model_run/register_batch'

POST_GENERATE_REPORT = '/explore/generate/report'

GET_EXPLORE_UPSTREAM = '/explore/upstream'

GET_EXPLORE_DOWNSTREAM = '/explore/downstream'

GET_EXPLORE_SPECIAL_CONTRIBUTING_DATASETS = '/explore/special/contributing_datasets'

GET_EXPLORE_SPECIAL_EFFECTED_DATASETS = '/explore/special/effected_datasets'

GET_EXPLORE_SPECIAL_CONTRIBUTING_AGENTS = '/explore/special/contributing_agents'

GET_EXPLORE_SPECIAL_EFFECTED_AGENTS = '/explore/special/effected_agents'

GET_HEALTH_CHECK = '/'

GET_BULK_GENERATE_TEMPLATE_CSV = '/bulk/generate_template/csv'

POST_BULK_CONVERT_MODEL_RUNS_CSV = '/bulk/convert_model_runs/csv'

GET_BULK_REGENERATE_FROM_BATCH_CSV = '/bulk/regenerate_from_batch/csv'

GET_CHECK_ACCESS_CHECK_GENERAL_ACCESS = '/check-access/check-general-access'

GET_CHECK_ACCESS_CHECK_ADMIN_ACCESS = '/check-access/check-admin-access'

GET_CHECK_ACCESS_CHECK_READ_ACCESS = '/check-access/check-read-access'

GET_CHECK_ACCESS_CHECK_WRITE_ACCESS = '/check-access/check-write-access'
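Because the endpoint enums mix in `str`, each member's value is a plain path fragment that can be appended to a base URL. The following is a minimal sketch of that idea, not the library's actual `_build_endpoint` implementation (which is not shown on this page). `ExampleEndpoints`, `build_url` and the base URL are hypothetical names used only for illustration.

```python
from enum import Enum


class ExampleEndpoints(str, Enum):
    """Stand-in enum mirroring two of the documented endpoint paths."""

    GET_HEALTH_CHECK = "/"
    GET_EXPLORE_UPSTREAM = "/explore/upstream"


def build_url(base_url: str, endpoint: ExampleEndpoints) -> str:
    """Join a base URL and an endpoint path with exactly one slash between them.

    Assumption: the real client may join these differently.
    """
    # Strip any trailing slash from the base so the path's leading slash
    # is the only separator.
    return base_url.rstrip("/") + endpoint.value
```

Since each member is also a `str`, a comparison such as `ExampleEndpoints.GET_HEALTH_CHECK == "/"` evaluates to `True`.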

class provenaclient.clients.prov_client.ProvAPIAdminEndpoints[source]

Bases: str, enum.Enum

An ENUM containing the prov api admin endpoints.

GET_ADMIN_CONFIG = '/admin/config'

POST_ADMIN_STORE_RECORD = '/admin/store_record'

POST_ADMIN_STORE_RECORDS = '/admin/store_records'

POST_ADMIN_STORE_ALL_REGISTRY_RECORDS = '/admin/store_all_registry_records'

GET_ADMIN_SENTRY_DEBUG = '/admin/sentry-debug'

class provenaclient.clients.prov_client.ProvAdminClient(auth: provenaclient.clients.client_helpers.AuthManager, config: provenaclient.clients.client_helpers.Config)[source]

Bases: provenaclient.clients.client_helpers.ClientService

This class interface just captures that the client has an instantiated auth manager which allows for helper functions abstracted for L2 clients.

_auth

_config

_build_endpoint(endpoint: ProvAPIAdminEndpoints) → str[source]

async generate_config_file(required_only: bool) → str[source]

Generates a nicely formatted .env file of the current required/non-supplied properties. Used to quickly bootstrap a local environment or to understand the currently deployed API.

Parameters: required_only (bool, optional) – By default True

async store_record(registry_record: ProvenaInterfaces.RegistryAPI.ItemModelRun, validate_record: bool) → provenaclient.clients.client_helpers.StatusResponse[source]

An admin only endpoint which enables the reupload/storage of an existing completed provenance record.

Parameters:

registry_record (ItemModelRun) – The completed registry record for the model run.
validate_record (bool, optional) – Whether the IDs in the payload should be validated, by default True

Returns:

A status response indicating the success of the request and any other details.

Return type:

StatusResponse

async store_multiple_records(registry_record: provenaclient.clients.client_helpers.List[ProvenaInterfaces.RegistryAPI.ItemModelRun], validate_record: bool) → provenaclient.clients.client_helpers.StatusResponse[source]

An admin only endpoint which enables the reupload/storage of multiple existing completed provenance records.

Parameters:

registry_record (List[ItemModelRun]) – List of the completed registry records for the model runs.
validate_record (bool, optional) – Whether the IDs in the payload should be validated, by default True

Returns:

A status response indicating the success of the request and any other details.

Return type:

StatusResponse

async store_all_registry_records(validate_record: bool) → provenaclient.clients.client_helpers.StatusResponse[source]

Applies the store record endpoint action across a list of ItemModelRuns, which is found by querying the registry model run list endpoint directly.

Parameters: validate_record (bool, optional) – Whether the IDs in the payload should be validated, by default True
Returns: A status response indicating the success of the request and any other details.
Return type: StatusResponse

class provenaclient.clients.prov_client.ProvClient(auth: provenaclient.clients.client_helpers.AuthManager, config: provenaclient.clients.client_helpers.Config)[source]

Bases: provenaclient.clients.client_helpers.ClientService

This class interface just captures that the client has an instantiated auth manager which allows for helper functions abstracted for L2 clients.

admin: ProvAdminClient

_auth

_config

_build_endpoint(endpoint: ProvAPIEndpoints) → str[source]

async get_health_check() → provenaclient.models.general.HealthCheckResponse[source]

Checks the health status of the PROV-API.

Returns: Response containing the PROV-API health information.
Return type: HealthCheckResponse
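Every method on this client is a coroutine, so it has to be awaited from inside an event loop. The sketch below shows one way to call such a method from synchronous code with `asyncio.run`. `FakeProvClient`, `FakeHealthCheckResponse` and its `message` field are stand-ins invented for this example so that it runs without network access or credentials; only the method name `get_health_check` comes from the reference above.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class FakeHealthCheckResponse:
    # Hypothetical field; the real HealthCheckResponse may differ.
    message: str


class FakeProvClient:
    """Stand-in exposing one coroutine with the documented method name."""

    async def get_health_check(self) -> FakeHealthCheckResponse:
        await asyncio.sleep(0)  # yield to the event loop, as a real request would
        return FakeHealthCheckResponse(message="Health check successful.")


async def check_health(client) -> str:
    """Await the health check and return its message."""
    response = await client.get_health_check()
    return response.message


def main() -> None:
    # asyncio.run creates an event loop, runs the coroutine, then closes the loop.
    print(asyncio.run(check_health(FakeProvClient())))
```

With a real client, an already constructed `ProvClient` instance would be passed to `check_health` in place of the stand-in.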

async post_update_model_run(model_run_id: str, reason: str, record: ProvenaInterfaces.ProvenanceAPI.ModelRunRecord) → ProvenaInterfaces.ProvenanceAPI.PostUpdateModelRunResponse[source]

Updates an existing model run in the system.

Parameters:

model_run_id (str) – The ID of the model run to update
reason (str) – The reason for the update
record (ModelRunRecord) – The updated model run record

Returns:

The response containing the job session ID

Return type:

PostUpdateModelRunResponse

async explore_upstream(starting_id: str, depth: int) → ProvenaInterfaces.ProvenanceAPI.LineageResponse[source]

Explores in the upstream direction (inputs/associations) starting at the specified node handle ID. The search depth is bounded by the depth parameter which has a default maximum of 100.

Parameters:

starting_id (str) – The ID of the entity to start at.
depth (int, optional) – The depth to traverse in the upstream direction, by default 100.

Returns:

A response containing the status, node count, and networkx serialised graph response.

Return type:

LineageResponse
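To illustrate what a depth-bounded upstream search means, here is a small local sketch using a breadth-first traversal over a toy graph. It is not the PROV-API's implementation, which runs server side and is not described on this page. The function `explore_upstream_local`, the `EXAMPLE_PARENTS` mapping and all node names are assumptions made up for the example.

```python
from collections import deque
from typing import Dict, List, Set

# Toy lineage: each key maps to the nodes directly upstream of it.
EXAMPLE_PARENTS: Dict[str, List[str]] = {
    "output": ["run"],
    "run": ["input", "modeller"],
    "input": ["older_run"],
}


def explore_upstream_local(
    parents: Dict[str, List[str]], starting_id: str, depth: int
) -> Set[str]:
    """Return every node reachable within `depth` upstream hops of `starting_id`."""
    visited = {starting_id}
    queue = deque([(starting_id, 0)])
    while queue:
        node, distance = queue.popleft()
        if distance == depth:
            continue  # depth bound reached; do not expand this node further
        for parent in parents.get(node, []):
            if parent not in visited:
                visited.add(parent)
                queue.append((parent, distance + 1))
    return visited
```

With `depth=1` only the starting node and its direct parents are returned. A large bound such as 100 returns the full upstream set of this toy graph.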

async explore_downstream(starting_id: str, depth: int) → ProvenaInterfaces.ProvenanceAPI.LineageResponse[source]

Explores in the downstream direction (inputs/associations) starting at the specified node handle ID. The search depth is bounded by the depth parameter which has a default maximum of 100.

Parameters:

starting_id (str) – The ID of the entity to start at.
depth (int, optional) – The depth to traverse in the downstream direction, by default 100

Returns:

A response containing the status, node count, and networkx serialised graph response.

Return type:

LineageResponse

async get_contributing_datasets(starting_id: str, depth: int) → ProvenaInterfaces.ProvenanceAPI.LineageResponse[source]

Fetches datasets (inputs) which are involved in a model run, naturally in the upstream direction.

Parameters:

starting_id (str) – The ID of the entity to start at.
depth (int, optional) – The depth to traverse in the upstream direction, by default 100

Returns:

A response containing the status, node count, and networkx serialised graph response.

Return type:

LineageResponse

async get_effected_datasets(starting_id: str, depth: int) → ProvenaInterfaces.ProvenanceAPI.LineageResponse[source]

Fetches datasets (outputs) which are derived from the model run naturally in the downstream direction.

Parameters:

starting_id (str) – The ID of the entity to start at.
depth (int, optional) – The depth to traverse in the downstream direction, by default 100.

Returns:

A response containing the status, node count, and networkx serialised graph response.

Return type:

LineageResponse

async get_contributing_agents(starting_id: str, depth: int) → ProvenaInterfaces.ProvenanceAPI.LineageResponse[source]

Fetches agents (organisations or people) that are involved in or impacted by the model run, naturally in the upstream direction.

Parameters:

starting_id (str) – The ID of the entity to start at.
depth (int, optional) – The depth to traverse in the upstream direction, by default 100.

Returns:

A response containing the status, node count, and networkx serialised graph response.

Return type:

LineageResponse

async get_effected_agents(starting_id: str, depth: int) → ProvenaInterfaces.ProvenanceAPI.LineageResponse[source]

Fetches agents (organisations or people) that are involved in or impacted by the model run, naturally in the downstream direction.

Parameters:

starting_id (str) – The ID of the entity to start at.
depth (int, optional) – The depth to traverse in the downstream direction, by default 100.

Returns:

A response containing the status, node count, and networkx serialised graph response.

Return type:

LineageResponse

async register_batch_model_runs(model_run_batch_payload: ProvenaInterfaces.ProvenanceAPI.RegisterBatchModelRunRequest) → ProvenaInterfaces.ProvenanceAPI.RegisterBatchModelRunResponse[source]

This function allows you to register multiple model runs in one go (batch) asynchronously.

Note: You can utilise the returned session ID to poll on the JOB API to check status of the model run registration(s).

Parameters: model_run_batch_payload (RegisterBatchModelRunRequest) – A list of model runs (ModelRunRecord objects)
Returns: The job session id derived from job-api for the model-run batch.
Return type: RegisterBatchModelRunResponse

async register_model_run(model_run_payload: ProvenaInterfaces.ProvenanceAPI.ModelRunRecord) → ProvenaInterfaces.ProvenanceAPI.RegisterModelRunResponse[source]

Asynchronously registers a single model run.

Note: You can utilise the returned session ID to poll on the JOB API to check status of the model run registration.

Parameters: model_run_payload (ModelRunRecord) – Contains information needed for the model run such as workflow template, inputs, outputs, description etc.
Returns: The job session id derived from job-api for the model-run.
Return type: RegisterModelRunResponse
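The notes above mention polling the JOB API with the returned session ID. A generic polling loop could look like the sketch below. The JOB API client is not documented on this page, so `fetch_status` is a hypothetical coroutine supplied by the caller, and the status strings `"SUCCEEDED"` and `"FAILED"` are assumptions rather than documented values. `make_fake_fetcher` exists only so the sketch runs offline.

```python
import asyncio
from typing import Awaitable, Callable, List

# Assumed terminal status names; check the JOB API for the real values.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED"}


async def wait_for_job(
    session_id: str,
    fetch_status: Callable[[str], Awaitable[str]],
    poll_interval: float = 2.0,
    timeout: float = 60.0,
) -> str:
    """Poll `fetch_status` until a terminal status is seen or `timeout` elapses."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    while True:
        status = await fetch_status(session_id)
        if status in TERMINAL_STATUSES:
            return status
        if loop.time() >= deadline:
            raise TimeoutError(f"Job {session_id} did not finish within {timeout}s")
        await asyncio.sleep(poll_interval)


def make_fake_fetcher(statuses: List[str]) -> Callable[[str], Awaitable[str]]:
    """Build a fake status fetcher that replays `statuses`, repeating the last one."""
    remaining = list(statuses)

    async def fetch(session_id: str) -> str:
        return remaining.pop(0) if len(remaining) > 1 else remaining[0]

    return fetch
```

In real use, `fetch_status` would wrap whichever JOB API call reports the state of the session returned by the registration request.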

async generate_csv_template(workflow_template_id: str) → str[source]

Generates a model run csv template to be utilised for creating model runs through csv format.

Parameters: workflow_template_id (str) – An ID of a created and existing model run workflow template.

async convert_model_runs_to_csv(csv_file_contents: str) → ProvenaInterfaces.ProvenanceAPI.ConvertModelRunsResponse[source]

Reads a CSV file and its defined model run contents and lodges a model run.

Parameters: csv_file_contents (str) – Contains the model run contents.
Returns: Returns the model run information in an interactive python datatype.
Return type: ConvertModelRunsResponse

async regenerate_csv_from_model_run_batch(batch_id: str) → str[source]

Regenerate/create a csv file containing model run information from a model run batch job.

The batch id must exist in the system.

Parameters: batch_id (str) – Obtained from creating a batch model run.

async generate_report(report_request: ProvenaInterfaces.ProvenanceAPI.GenerateReportRequest) → provenaclient.clients.client_helpers.ByteString[source]

Generates a provenance report from a Study or Model Run Entity containing the associated inputs, model runs and outputs involved.

The report is generated in .docx format by making a POST request to the API.

Parameters: report_request (GenerateReportRequest) – The request object containing the parameters for generating the report, including the id, item_subtype, and depth.
Returns: The raw byte content of the generated .docx file. The type of the returned content will be either bytes or bytearray, which can be directly saved to a file.
Return type: ByteString
Raises: AssertionError – If the response content is not found or is not in the expected bytes or bytearray format.
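Since this layer performs no file IO, the caller is responsible for persisting the returned bytes. A minimal sketch of doing so is shown below. `save_report` is a hypothetical helper written for this example and is not part of the client.

```python
from pathlib import Path
from typing import Union


def save_report(content: Union[bytes, bytearray], destination: Union[str, Path]) -> Path:
    """Write raw report bytes to `destination` and return the resulting path."""
    if not isinstance(content, (bytes, bytearray)):
        raise TypeError(f"Expected bytes or bytearray, got {type(content).__name__}")
    path = Path(destination)
    # Create missing parent directories so the write does not fail on a new folder.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(content)
    return path
```

The value returned by `generate_report` could then be passed as `content`, with a destination ending in `.docx`.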