provenaclient.modules.prov

Created Date: Monday June 17th 2024 +1000
Author: Peter Baker
Last Modified: Friday November 29th 2024 4:21:39 pm +1000
Modified By: Parth Kulkarni
Description: Provenance API L3 module. Includes the ProvAPI sub module. Contains IO helper functions for writing/reading files.
HISTORY:
Date | By | Comments

29-11-2024 | Parth Kulkarni | Added generate-report functionality.

Attributes

PROV_API_DEFAULT_SEARCH_DEPTH

DEFAULT_CONFIG_FILE_NAME

DEFAULT_RELATIVE_FILE_PATH

Classes

ProvAPIAdminSubModule

This class interface just captures that the client has an instantiated auth manager which allows for helper functions abstracted for L3 clients.

Prov

This class interface just captures that the client has an instantiated auth manager which allows for helper functions abstracted for L3 clients.

Module Contents

provenaclient.modules.prov.PROV_API_DEFAULT_SEARCH_DEPTH = 3
provenaclient.modules.prov.DEFAULT_CONFIG_FILE_NAME = 'prov-api.env'
provenaclient.modules.prov.DEFAULT_RELATIVE_FILE_PATH = './'
class provenaclient.modules.prov.ProvAPIAdminSubModule(auth: provenaclient.modules.module_helpers.AuthManager, config: provenaclient.modules.module_helpers.Config, prov_api_client: provenaclient.clients.ProvClient)[source]

Bases: provenaclient.modules.module_helpers.ModuleService

This class interface just captures that the client has an instantiated auth manager which allows for helper functions abstracted for L3 clients.

_prov_api_client: provenaclient.clients.ProvClient
_auth
_config
async generate_config_file(required_only: bool = True, file_path: provenaclient.utils.exceptions.Optional[str] = None, write_to_file: bool = False) str[source]

Generates a nicely formatted .env file of the current required/non-supplied properties. Used to quickly bootstrap a local environment or to understand the currently deployed API.

Parameters:
  • required_only (bool, optional) – If True, only include required properties; by default True.

  • file_path (str, optional) – The path at which to save the config file, including the file name. If you don't specify a path, it will be saved in a relative directory.

  • write_to_file (bool, optional) – Whether to save the config response to a file, by default False.

Returns:

Response containing the config text.

Return type:

str
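The returned config text follows standard .env conventions (one KEY=VALUE pair per line). A minimal sketch of saving and re-reading such a config; the property names below are hypothetical examples, not real PROV-API settings:

```python
import os
import tempfile

# Stand-in for the text returned by generate_config_file();
# these property names are hypothetical, not real PROV-API settings.
config_text = "KEYCLOAK_ENDPOINT=https://example.org/auth\nSTAGE=DEV\n"

path = os.path.join(tempfile.gettempdir(), "prov-api.env")
with open(path, "w") as f:
    f.write(config_text)

# Parse the KEY=VALUE lines back into a dict, skipping blanks and comments.
settings = {}
with open(path) as f:
    for line in f:
        line = line.strip()
        if line and not line.startswith("#"):
            key, _, value = line.partition("=")
            settings[key] = value

print(settings["STAGE"])  # DEV
```

Passing write_to_file=True together with a file_path achieves the first half of this directly.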

async store_record(registry_record: ProvenaInterfaces.RegistryAPI.ItemModelRun, validate_record: bool = True) ProvenaInterfaces.SharedTypes.StatusResponse[source]

An admin-only endpoint which enables the re-upload/storage of an existing completed provenance record.

Parameters:
  • registry_record (ItemModelRun) – The completed registry record for the model run.

  • validate_record (bool, optional) – Should the IDs in the payload be validated? By default True.

Returns:

A status response indicating the success of the request and any other details.

Return type:

StatusResponse

async store_multiple_records(registry_record: List[ProvenaInterfaces.RegistryAPI.ItemModelRun], validate_record: bool = True) ProvenaInterfaces.SharedTypes.StatusResponse[source]

An admin-only endpoint which enables the re-upload/storage of multiple existing completed provenance records.

Parameters:
  • registry_record (List[ItemModelRun]) – A list of completed registry records for the model runs.

  • validate_record (bool, optional) – Should the IDs in the payload be validated? By default True.

Returns:

A status response indicating the success of the request and any other details.

Return type:

StatusResponse

async store_all_registry_records(validate_record: bool = True) ProvenaInterfaces.SharedTypes.StatusResponse[source]
Applies the store record endpoint action across a list of ItemModelRuns, which is found by querying the registry model run list endpoint directly.

Parameters:

validate_record (bool, optional) – Should the IDs in the payload be validated? By default True.

Returns:

A status response indicating the success of the request and any other details.

Return type:

StatusResponse

class provenaclient.modules.prov.Prov(auth: provenaclient.modules.module_helpers.AuthManager, config: provenaclient.modules.module_helpers.Config, prov_client: provenaclient.clients.ProvClient)[source]

Bases: provenaclient.modules.module_helpers.ModuleService

This class interface just captures that the client has an instantiated auth manager which allows for helper functions abstracted for L3 clients.

_prov_client: provenaclient.clients.ProvClient
_auth
_config
_prov_api_client
admin
async get_health_check() provenaclient.models.general.HealthCheckResponse[source]

Checks the health status of the PROV-API.

Returns:

Response containing the PROV-API health information.

Return type:

HealthCheckResponse

async update_model_run(model_run_id: str, reason: str, record: ProvenaInterfaces.ProvenanceAPI.ModelRunRecord) ProvenaInterfaces.ProvenanceAPI.PostUpdateModelRunResponse[source]

Updates an existing model run with new information.

This function triggers an asynchronous update of a model run. The update is processed as a job, and the job session ID is returned for tracking the update progress.

Parameters:
  • model_run_id (str) – The ID of the model run to update

  • reason (str) – The reason for updating the model run

  • record (ModelRunRecord) – The new model run record details

Returns:

Response containing the job session ID tracking the update

Return type:

PostUpdateModelRunResponse

Example

```python
response = await prov_api.update_model_run(
    model_run_id="10378.1/1234567",
    reason="Updating input dataset information",
    record=updated_model_run_record,
)
# Get the session ID to track progress
session_id = response.session_id
```

async explore_upstream(starting_id: str, depth: int = PROV_API_DEFAULT_SEARCH_DEPTH) provenaclient.models.general.CustomLineageResponse[source]

Explores in the upstream direction (inputs/associations) starting at the specified node handle ID. The search depth is bounded by the depth parameter, which is by default PROV_API_DEFAULT_SEARCH_DEPTH (3).

Parameters:
  • starting_id (str) – The ID of the entity to start at.

  • depth (int, optional) – The depth to traverse in the upstream direction, by default PROV_API_DEFAULT_SEARCH_DEPTH (3).

Returns:

A typed response containing the status, node count, and networkx serialised graph response.

Return type:

CustomLineageResponse
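The depth bound behaves like a breadth-first traversal cut-off over the lineage graph. A small self-contained sketch of how depth limits which upstream nodes are reached, using a plain dict in place of the real networkx-serialised response (the node handles here are invented for illustration):

```python
from collections import deque

# Toy upstream adjacency: node -> its direct upstream neighbours.
# These handles are made up for illustration only.
upstream = {
    "output-dataset": ["model-run"],
    "model-run": ["input-dataset", "workflow-template"],
    "input-dataset": ["source-dataset"],
    "workflow-template": [],
    "source-dataset": [],
}

def explore(start, depth):
    """Collect all nodes reachable within `depth` hops upstream."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, d = frontier.popleft()
        if d == depth:
            continue  # depth bound reached; do not expand further
        for nxt in upstream.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, d + 1))
    return seen

print(len(explore("output-dataset", 1)))  # 2: the start node plus model-run
print(len(explore("output-dataset", 3)))  # 5: the whole toy graph
```

A larger depth therefore returns a superset of the nodes found at a smaller depth, at the cost of a bigger response.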

async explore_downstream(starting_id: str, depth: int = PROV_API_DEFAULT_SEARCH_DEPTH) provenaclient.models.general.CustomLineageResponse[source]

Explores in the downstream direction (outputs/associations) starting at the specified node handle ID. The search depth is bounded by the depth parameter, which is by default PROV_API_DEFAULT_SEARCH_DEPTH (3).

Parameters:
  • starting_id (str) – The ID of the entity to start at.

  • depth (int, optional) – The depth to traverse in the downstream direction, by default PROV_API_DEFAULT_SEARCH_DEPTH (3).

Returns:

A typed response containing the status, node count, and networkx serialised graph response.

Return type:

CustomLineageResponse

async get_contributing_datasets(starting_id: str, depth: int = PROV_API_DEFAULT_SEARCH_DEPTH) provenaclient.models.general.CustomLineageResponse[source]

Fetches datasets (inputs) which are involved in a model run, naturally in the upstream direction.

Parameters:
  • starting_id (str) – The ID of the entity to start at.

  • depth (int, optional) – The depth to traverse in the upstream direction, by default PROV_API_DEFAULT_SEARCH_DEPTH (3).

Returns:

A typed response containing the status, node count, and networkx serialised graph response.

Return type:

CustomLineageResponse

async get_effected_datasets(starting_id: str, depth: int = PROV_API_DEFAULT_SEARCH_DEPTH) provenaclient.models.general.CustomLineageResponse[source]

Fetches datasets (outputs) which are derived from the model run, naturally in the downstream direction.

Parameters:
  • starting_id (str) – The ID of the entity to start at.

  • depth (int, optional) – The depth to traverse in the downstream direction, by default PROV_API_DEFAULT_SEARCH_DEPTH (3).

Returns:

A typed response containing the status, node count, and networkx serialised graph response.

Return type:

CustomLineageResponse

async get_contributing_agents(starting_id: str, depth: int = PROV_API_DEFAULT_SEARCH_DEPTH) provenaclient.models.general.CustomLineageResponse[source]

Fetches agents (organisations or people) that are involved in or impacted by the model run, naturally in the upstream direction.

Parameters:
  • starting_id (str) – The ID of the entity to start at.

  • depth (int, optional) – The depth to traverse in the upstream direction, by default PROV_API_DEFAULT_SEARCH_DEPTH (3).

Returns:

A typed response containing the status, node count, and networkx serialised graph response.

Return type:

CustomLineageResponse

async get_effected_agents(starting_id: str, depth: int = PROV_API_DEFAULT_SEARCH_DEPTH) provenaclient.models.general.CustomLineageResponse[source]

Fetches agents (organisations or people) that are involved in or impacted by the model run, naturally in the downstream direction.

Parameters:
  • starting_id (str) – The ID of the entity to start at.

  • depth (int, optional) – The depth to traverse in the downstream direction, by default PROV_API_DEFAULT_SEARCH_DEPTH (3).

Returns:

A typed response containing the status, node count, and networkx serialised graph response.

Return type:

CustomLineageResponse

async register_batch_model_runs(batch_model_run_payload: ProvenaInterfaces.ProvenanceAPI.RegisterBatchModelRunRequest) ProvenaInterfaces.ProvenanceAPI.RegisterBatchModelRunResponse[source]

This function allows you to register multiple model runs in one go (batch) asynchronously.

Note: You can utilise the returned session ID to poll the Job API to check the status of the model run registration(s).

Parameters:

batch_model_run_payload (RegisterBatchModelRunRequest) – A list of model runs (ModelRunRecord objects)

Returns:

The job session id derived from job-api for the model-run batch.

Return type:

RegisterBatchModelRunResponse

async register_model_run(model_run_payload: ProvenaInterfaces.ProvenanceAPI.ModelRunRecord) ProvenaInterfaces.ProvenanceAPI.RegisterModelRunResponse[source]

Asynchronously registers a single model run.

Note: You can utilise the returned session ID to poll the Job API to check the status of the model run registration.

Parameters:

model_run_payload (ModelRunRecord) – Contains information needed for the model run such as workflow template, inputs, outputs, description etc.

Returns:

The job session id derived from job-api for the model-run.

Return type:

RegisterModelRunResponse

async generate_csv_template(workflow_template_id: str, file_path: provenaclient.utils.exceptions.Optional[str] = None, write_to_csv: bool = False) str[source]

Generates a model run CSV template to be utilised for creating model runs through CSV format.

Parameters:
  • workflow_template_id (str) – An ID of a created and existing model run workflow template.

  • file_path (str, optional) – The path at which to save the CSV file, including the CSV file name. If you don't specify a path, it will be saved in a relative directory.

  • write_to_csv (bool, optional) – Whether to save the template to a CSV file, by default False.

Returns:

Response containing the CSV template text (encoded in CSV format).

Return type:

str
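The returned template is plain CSV text, so it can be inspected with the standard `csv` module before being filled in. A small sketch; the column names here are invented for illustration, as the real template's columns are derived from the workflow template:

```python
import csv
import io

# Stand-in for the text returned by generate_csv_template();
# the column names below are hypothetical, not the real template schema.
template_text = "workflow_template_id,input_dataset,output_dataset,description\n"

# Read the header row to see which columns each model run row must supply.
reader = csv.reader(io.StringIO(template_text))
header = next(reader)
print(header)
```

Rows appended under this header can later be lodged via the CSV conversion endpoints below.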

async convert_model_runs(model_run_content: str) ProvenaInterfaces.ProvenanceAPI.ConvertModelRunsResponse[source]

Converts model runs from the model_run_content provided as a string.

Parameters:

model_run_content (str) – The model run information containing the necessary parameters for model run lodge.

Returns:

Returns the model run information in an interactive python datatype.

Return type:

ConvertModelRunsResponse

Raises:

Exception – Raised when converting the string to bytes fails.

async convert_model_runs_to_csv_with_file(file_path: str) ProvenaInterfaces.ProvenanceAPI.ConvertModelRunsResponse[source]

Reads a CSV file and its defined model run contents, and lodges the model runs.

Parameters:

file_path (str) – The path of an existing created CSV file containing the necessary parameters for model run lodge.

Returns:

Returns the model run information in an interactive python datatype.

Return type:

ConvertModelRunsResponse

async regenerate_csv_from_model_run_batch(batch_id: str, file_path: provenaclient.utils.exceptions.Optional[str] = None, write_to_csv: bool = False) str[source]

Regenerates/creates a CSV file containing model run information from a model run batch job.

The batch id must exist in the system.

Parameters:
  • batch_id (str) – Obtained from creating a batch model run.

  • file_path (str, optional) – The path at which to save the CSV file, including the CSV file name. If you don't specify a path, it will be saved in a relative directory.

  • write_to_csv (bool, optional) – Whether to save the regenerated model run information to a CSV file, by default False.

Returns:

Response containing the model run information (encoded in CSV format).

Return type:

str

async generate_report(report_request: ProvenaInterfaces.ProvenanceAPI.GenerateReportRequest, file_path: str = DEFAULT_RELATIVE_FILE_PATH) None[source]

Generates a provenance report from a Study or Model Run Entity containing the associated inputs, model runs and outputs involved.

The report is generated as a .docx file and saved at the relative directory level.

Parameters:

report_request (GenerateReportRequest) – The request object containing the parameters for generating the report, including the id, item_subtype, and depth.