provenaclient.modules.prov
Created Date: Monday June 17th 2024 +1000
Author: Peter Baker

Last Modified: Friday November 29th 2024 4:21:39 pm +1000
Modified By: Parth Kulkarni

Description: Provenance API L3 module. Includes the ProvAPI sub module. Contains IO helper functions for writing/reading files.

HISTORY:
Date | By | Comments
---------- | --- | ---------------------------------------------------------
29-11-2024 | Parth Kulkarni | Added generate-report functionality.
Attributes

- PROV_API_DEFAULT_SEARCH_DEPTH
- DEFAULT_CONFIG_FILE_NAME
- DEFAULT_RELATIVE_FILE_PATH

Classes

- ProvAPIAdminSubModule – This class interface just captures that the client has an instantiated auth manager which allows for helper functions abstracted for L3 clients.
- Prov – This class interface just captures that the client has an instantiated auth manager which allows for helper functions abstracted for L3 clients.
Module Contents
- provenaclient.modules.prov.PROV_API_DEFAULT_SEARCH_DEPTH = 3
- provenaclient.modules.prov.DEFAULT_CONFIG_FILE_NAME = 'prov-api.env'
- provenaclient.modules.prov.DEFAULT_RELATIVE_FILE_PATH = './'
- class provenaclient.modules.prov.ProvAPIAdminSubModule(auth: provenaclient.modules.module_helpers.AuthManager, config: provenaclient.modules.module_helpers.Config, prov_api_client: provenaclient.clients.ProvClient)[source]
Bases:
provenaclient.modules.module_helpers.ModuleService
This class interface just captures that the client has an instantiated auth manager which allows for helper functions abstracted for L3 clients.
- _prov_api_client: provenaclient.clients.ProvClient
- _auth
- _config
- async generate_config_file(required_only: bool = True, file_path: provenaclient.utils.exceptions.Optional[str] = None, write_to_file: bool = False) str [source]
Generates a nicely formatted .env file of the current required/non-supplied properties. Used to quickly bootstrap a local environment or to understand the currently deployed API.
- Parameters:
required_only (bool, optional) – Whether to include only the required/non-supplied properties, by default True.
file_path (str, optional) – The path to save the config file at, including the file name. If no path is specified, the file is saved in the relative directory.
write_to_file (bool, optional) – Whether to save the config response to a file, by default False.
- Returns:
Response containing the config text.
- Return type:
str
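Example (illustrative sketch; assumes `prov_api` is an instantiated `Prov` module, as in the update example further below, and that the calling identity has admin access):

```python
# Fetch the config text and also write it out to a local file.
# The file name used here is the module default (prov-api.env).
config_text = await prov_api.admin.generate_config_file(
    required_only=True,
    file_path="./prov-api.env",
    write_to_file=True
)
print(config_text)
```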
- async store_record(registry_record: ProvenaInterfaces.RegistryAPI.ItemModelRun, validate_record: bool = True) ProvenaInterfaces.SharedTypes.StatusResponse [source]
An admin only endpoint which enables the reupload/storage of an existing completed provenance record.
- Parameters:
registry_record (ItemModelRun) – The completed registry record for the model run.
validate_record (bool, optional) – Whether the IDs in the payload should be validated, by default True.
- Returns:
A status response indicating the success of the request and any other details.
- Return type:
StatusResponse
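Example (illustrative sketch; `existing_model_run` is a hypothetical, previously fetched ItemModelRun):

```python
# Re-store an existing completed model run record (admin only).
status = await prov_api.admin.store_record(
    registry_record=existing_model_run,
    validate_record=True
)
print(status)
```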
- async store_multiple_records(registry_record: List[ProvenaInterfaces.RegistryAPI.ItemModelRun], validate_record: bool = True) ProvenaInterfaces.SharedTypes.StatusResponse [source]
An admin only endpoint which enables the reupload/storage of multiple existing completed provenance records.
- Parameters:
registry_record (List[ItemModelRun]) – List of the completed registry records for the model runs.
validate_record (bool, optional) – Whether the IDs in the payload should be validated, by default True.
- Returns:
A status response indicating the success of the request and any other details.
- Return type:
StatusResponse
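Example (illustrative sketch; `existing_model_runs` is a hypothetical list of ItemModelRun objects):

```python
# Re-store a batch of existing completed model run records (admin only).
status = await prov_api.admin.store_multiple_records(
    registry_record=existing_model_runs,
    validate_record=True
)
print(status)
```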
- async store_all_registry_records(validate_record: bool = True) ProvenaInterfaces.SharedTypes.StatusResponse [source]
Applies the store record endpoint action across a list of ItemModelRuns which is found by querying the registry model run list endpoint directly.
- Parameters:
validate_record (bool, optional) – Whether the IDs in the payload should be validated, by default True.
- Returns:
A status response indicating the success of the request and any other details.
- Return type:
StatusResponse
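Example (illustrative sketch; assumes admin access via `prov_api.admin`):

```python
# Re-store every model run record found via the registry model run list endpoint (admin only).
status = await prov_api.admin.store_all_registry_records(validate_record=True)
print(status)
```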
- class provenaclient.modules.prov.Prov(auth: provenaclient.modules.module_helpers.AuthManager, config: provenaclient.modules.module_helpers.Config, prov_client: provenaclient.clients.ProvClient)[source]
Bases:
provenaclient.modules.module_helpers.ModuleService
This class interface just captures that the client has an instantiated auth manager which allows for helper functions abstracted for L3 clients.
- _prov_client: provenaclient.clients.ProvClient
- _auth
- _config
- _prov_api_client
- admin
- async get_health_check() provenaclient.models.general.HealthCheckResponse [source]
Checks the health status of the PROV-API.
- Returns:
Response containing the PROV-API health information.
- Return type:
HealthCheckResponse
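Example (illustrative sketch; assumes `prov_api` is an instantiated `Prov` module):

```python
# Simple liveness check against the PROV-API.
health = await prov_api.get_health_check()
print(health)
```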
- async update_model_run(model_run_id: str, reason: str, record: ProvenaInterfaces.ProvenanceAPI.ModelRunRecord) ProvenaInterfaces.ProvenanceAPI.PostUpdateModelRunResponse [source]
Updates an existing model run with new information.
This function triggers an asynchronous update of a model run. The update is processed as a job, and the job session ID is returned for tracking the update progress.
- Parameters:
model_run_id (str) – The ID of the model run to update
reason (str) – The reason for updating the model run
record (ModelRunRecord) – The new model run record details
- Returns:
Response containing the job session ID tracking the update
- Return type:
PostUpdateModelRunResponse
Example
```python
response = await prov_api.update_model_run(
    model_run_id="10378.1/1234567",
    reason="Updating input dataset information",
    record=updated_model_run_record
)
# Get the session ID to track progress
session_id = response.session_id
```
- async explore_upstream(starting_id: str, depth: int = PROV_API_DEFAULT_SEARCH_DEPTH) provenaclient.models.general.CustomLineageResponse [source]
Explores in the upstream direction (inputs/associations) starting at the specified node handle ID. The search depth is bounded by the depth parameter, which defaults to PROV_API_DEFAULT_SEARCH_DEPTH (3) and has a maximum of 100.
- Parameters:
starting_id (str) – The ID of the entity to start at.
depth (int, optional) – The depth to traverse in the upstream direction, by default PROV_API_DEFAULT_SEARCH_DEPTH (3).
- Returns:
A typed response containing the status, node count, and networkx serialised graph response.
- Return type:
CustomLineageResponse
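Example (illustrative sketch; the handle ID is hypothetical):

```python
# Walk two levels of upstream lineage from a dataset or model run node.
lineage = await prov_api.explore_upstream(
    starting_id="10378.1/1234567",
    depth=2
)
print(lineage)
```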
- async explore_downstream(starting_id: str, depth: int = PROV_API_DEFAULT_SEARCH_DEPTH) provenaclient.models.general.CustomLineageResponse [source]
Explores in the downstream direction (outputs/results) starting at the specified node handle ID. The search depth is bounded by the depth parameter, which defaults to PROV_API_DEFAULT_SEARCH_DEPTH (3) and has a maximum of 100.
- Parameters:
starting_id (str) – The ID of the entity to start at.
depth (int, optional) – The depth to traverse in the downstream direction, by default PROV_API_DEFAULT_SEARCH_DEPTH (3).
- Returns:
A typed response containing the status, node count, and networkx serialised graph response.
- Return type:
CustomLineageResponse
- async get_contributing_datasets(starting_id: str, depth: int = PROV_API_DEFAULT_SEARCH_DEPTH) provenaclient.models.general.CustomLineageResponse [source]
Fetches datasets (inputs) which are involved in a model run, traversing naturally in the upstream direction.
- Parameters:
starting_id (str) – The ID of the entity to start at.
depth (int, optional) – The depth to traverse in the upstream direction, by default PROV_API_DEFAULT_SEARCH_DEPTH (3).
- Returns:
A typed response containing the status, node count, and networkx serialised graph response.
- Return type:
CustomLineageResponse
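Example (illustrative sketch; the handle ID is hypothetical):

```python
# Find the input datasets that contributed to this model run.
contributing = await prov_api.get_contributing_datasets(
    starting_id="10378.1/1234567",
    depth=2
)
print(contributing)
```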
- async get_effected_datasets(starting_id: str, depth: int = PROV_API_DEFAULT_SEARCH_DEPTH) provenaclient.models.general.CustomLineageResponse [source]
Fetches datasets (outputs) which are derived from the model run, traversing naturally in the downstream direction.
- Parameters:
starting_id (str) – The ID of the entity to start at.
depth (int, optional) – The depth to traverse in the downstream direction, by default PROV_API_DEFAULT_SEARCH_DEPTH (3).
- Returns:
A typed response containing the status, node count, and networkx serialised graph response.
- Return type:
CustomLineageResponse
- async get_contributing_agents(starting_id: str, depth: int = PROV_API_DEFAULT_SEARCH_DEPTH) provenaclient.models.general.CustomLineageResponse [source]
Fetches agents (organisations or people) that are involved in or impacted by the model run, traversing naturally in the upstream direction.
- Parameters:
starting_id (str) – The ID of the entity to start at.
depth (int, optional) – The depth to traverse in the upstream direction, by default PROV_API_DEFAULT_SEARCH_DEPTH (3).
- Returns:
A typed response containing the status, node count, and networkx serialised graph response.
- Return type:
CustomLineageResponse
- async get_effected_agents(starting_id: str, depth: int = PROV_API_DEFAULT_SEARCH_DEPTH) provenaclient.models.general.CustomLineageResponse [source]
Fetches agents (organisations or people) that are involved in or impacted by the model run, traversing naturally in the downstream direction.
- Parameters:
starting_id (str) – The ID of the entity to start at.
depth (int, optional) – The depth to traverse in the downstream direction, by default PROV_API_DEFAULT_SEARCH_DEPTH (3).
- Returns:
A typed response containing the status, node count, and networkx serialised graph response.
- Return type:
CustomLineageResponse
- async register_batch_model_runs(batch_model_run_payload: ProvenaInterfaces.ProvenanceAPI.RegisterBatchModelRunRequest) ProvenaInterfaces.ProvenanceAPI.RegisterBatchModelRunResponse [source]
This function allows you to register multiple model runs in one go (batch) asynchronously.
Note: You can utilise the returned session ID to poll on the JOB API to check status of the model run registration(s).
- Parameters:
batch_model_run_payload (RegisterBatchModelRunRequest) – A list of model runs (ModelRunRecord objects)
- Returns:
The job session id derived from job-api for the model-run batch.
- Return type:
RegisterBatchModelRunResponse
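Example (illustrative sketch; `batch_request` is a hypothetical RegisterBatchModelRunRequest built from a list of ModelRunRecord objects):

```python
# Submit a batch of model runs and keep the session id for Job API polling.
batch_response = await prov_api.register_batch_model_runs(
    batch_model_run_payload=batch_request
)
# Session id field assumed to mirror the update example above.
session_id = batch_response.session_id
```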
- async register_model_run(model_run_payload: ProvenaInterfaces.ProvenanceAPI.ModelRunRecord) ProvenaInterfaces.ProvenanceAPI.RegisterModelRunResponse [source]
Asynchronously registers a single model run.
Note: You can utilise the returned session ID to poll on the JOB API to check status of the model run registration.
- Parameters:
model_run_payload (ModelRunRecord) – Contains information needed for the model run such as workflow template, inputs, outputs, description etc.
- Returns:
The job session id derived from job-api for the model-run.
- Return type:
RegisterModelRunResponse
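Example (illustrative sketch; `model_run_record` is a hypothetical, fully populated ModelRunRecord):

```python
# Register a single model run and capture the Job API session id.
response = await prov_api.register_model_run(model_run_payload=model_run_record)
# Session id field assumed to mirror the update example above.
session_id = response.session_id
```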
- async generate_csv_template(workflow_template_id: str, file_path: provenaclient.utils.exceptions.Optional[str] = None, write_to_csv: bool = False) str [source]
Generates a model run CSV template to be utilised for creating model runs through CSV format.
- Parameters:
workflow_template_id (str) – An ID of a created and existing model run workflow template.
file_path (str, optional) – The path to save the CSV file at, including the CSV file name. If no path is specified, the file is saved in the relative directory.
write_to_csv (bool, optional) – Whether to save the template to a CSV file, by default False.
- Returns:
Response containing the CSV template text (encoded in CSV format).
- Return type:
str
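Example (illustrative sketch; the workflow template handle and output path are hypothetical):

```python
# Generate a CSV template for a workflow template and write it locally.
csv_template = await prov_api.generate_csv_template(
    workflow_template_id="10378.1/7654321",
    file_path="./model_run_template.csv",
    write_to_csv=True
)
```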
- async convert_model_runs(model_run_content: str) ProvenaInterfaces.ProvenanceAPI.ConvertModelRunsResponse [source]
Converts model runs using the model_run_content provided as a string.
- Parameters:
model_run_content (str) – The model run information containing the necessary parameters for model run lodge.
- Returns:
Returns the model run information in an interactive python datatype.
- Return type:
ConvertModelRunsResponse
- Raises:
Exception – Exception raised when converting string to bytes.
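Example (illustrative sketch; `csv_text` is a hypothetical string holding model run rows in the CSV template format):

```python
# Convert raw CSV text into model run information ready for lodgement.
converted = await prov_api.convert_model_runs(model_run_content=csv_text)
print(converted)
```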
- async convert_model_runs_to_csv_with_file(file_path: str) ProvenaInterfaces.ProvenanceAPI.ConvertModelRunsResponse [source]
Reads a CSV file and its defined model run contents, and lodges a model run.
- Parameters:
file_path (str) – The path of an existing created CSV file containing the necessary parameters for model run lodge.
- Returns:
Returns the model run information in an interactive python datatype.
- Return type:
ConvertModelRunsResponse
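Example (illustrative sketch; the file path is hypothetical):

```python
# Convert an on-disk CSV of model runs.
converted = await prov_api.convert_model_runs_to_csv_with_file(
    file_path="./my_model_runs.csv"
)
print(converted)
```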
- async regenerate_csv_from_model_run_batch(batch_id: str, file_path: provenaclient.utils.exceptions.Optional[str] = None, write_to_csv: bool = False) str [source]
Regenerate/create a csv file containing model run information from a model run batch job.
The batch id must exist in the system.
- Parameters:
batch_id (str) – Obtained from creating a batch model run.
file_path (str, optional) – The path to save the CSV file at, including the CSV file name. If no path is specified, the file is saved in the relative directory.
write_to_csv (bool, optional) – Whether to save the regenerated model run information to a CSV file, by default False.
- Returns:
Response containing the model run information (encoded in csv format).
- Return type:
str
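Example (illustrative sketch; the batch id and output path are hypothetical):

```python
# Rebuild the CSV for a previously submitted batch and save it locally.
csv_text = await prov_api.regenerate_csv_from_model_run_batch(
    batch_id="hypothetical-batch-id",
    file_path="./batch_model_runs.csv",
    write_to_csv=True
)
```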
- async generate_report(report_request: ProvenaInterfaces.ProvenanceAPI.GenerateReportRequest, file_path: str = DEFAULT_RELATIVE_FILE_PATH) None [source]
Generates a provenance report from a Study or Model Run Entity containing the associated inputs, model runs and outputs involved.
The report is generated as a .docx file and saved at the provided file path, by default the relative directory.
- Parameters:
report_request (GenerateReportRequest) – The request object containing the parameters for generating the report, including the id, item_subtype, and depth.
file_path (str, optional) – The path to save the generated report at, by default the relative directory ("./").
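Example (illustrative sketch; the field names on GenerateReportRequest follow the parameter description above, the handle ID is hypothetical, and the ItemSubType import path and enum member are assumptions):

```python
from ProvenaInterfaces.ProvenanceAPI import GenerateReportRequest
from ProvenaInterfaces.RegistryModels import ItemSubType  # import path assumed

# Request a .docx provenance report for a study, one level deep.
report_request = GenerateReportRequest(
    id="10378.1/1234567",            # hypothetical Study or Model Run handle
    item_subtype=ItemSubType.STUDY,  # assumed enum member for a Study entity
    depth=1
)
await prov_api.generate_report(report_request=report_request, file_path="./")
```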