cmflib.cmfquery.CmfQuery¶

`cmflib.cmfquery.CmfQuery(filepath='mlmd', is_server=False)` ¶

Bases: object

CMF Query communicates with the MLMD database and implements basic search and retrieval functionality.

This class is designed to work with the CMF framework. CMF alters the names of pipelines, stages and artifacts in various ways, so the names stored in the MLMD database differ from those originally provided by users via the CMF API. When methods in this class accept name parameters, the values are expected to be the fully-qualified names of the respective entities as stored in MLMD.

Parameters:

Name	Type	Description	Default
`filepath`	`str`	Path to the MLMD database file.	`'mlmd'`

`get_pipeline_names()` ¶

Return names of all pipelines.

Returns:

Type	Description
`List[str]`	List of all pipeline names.

`get_pipeline_id(pipeline_name)` ¶

Return the identifier of the pipeline named `pipeline_name`.

Parameters:

Name	Type	Description	Default
`pipeline_name`	`str`	Name of the pipeline.	required

Returns:

Type	Description
`int`	Pipeline identifier or -1 if one does not exist.
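Because a missing pipeline is reported through the `-1` sentinel rather than an exception, callers may want to check for it explicitly. The following is a minimal sketch, assuming `query` is any object that exposes `get_pipeline_id` as documented above. `require_pipeline_id` and `FakeQuery` are hypothetical names introduced only for illustration; the stand-in class is used so the snippet runs without an MLMD file.

```python
from typing import Dict


def require_pipeline_id(query, pipeline_name: str) -> int:
    """Return the pipeline id, raising if the documented -1 sentinel comes back."""
    pipeline_id = query.get_pipeline_id(pipeline_name)
    if pipeline_id == -1:
        raise KeyError(f"pipeline not found: {pipeline_name!r}")
    return pipeline_id


class FakeQuery:
    """Hypothetical stand-in for CmfQuery; not part of cmflib."""

    def __init__(self, ids: Dict[str, int]) -> None:
        self._ids = ids

    def get_pipeline_id(self, pipeline_name: str) -> int:
        # Mirrors the documented contract: -1 when the pipeline does not exist.
        return self._ids.get(pipeline_name, -1)


query = FakeQuery({"demo-pipeline": 1})
found_id = require_pipeline_id(query, "demo-pipeline")
```

With a real `CmfQuery` instance, the same helper could be called in place of the stand-in, since it relies only on the documented return value.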

`get_pipeline_stages(pipeline_name)` ¶

Return list of pipeline stages for the pipeline with the given name.

Parameters:

Name	Type	Description	Default
`pipeline_name`	`str`	Name of the pipeline whose stages are returned. Pipeline names are unique in CMF.	required

Returns:

Type	Description
`List[str]`	List of stage names associated with the given pipeline.

`get_all_exe_in_stage(stage_name)` ¶

Return list of all executions for the stage with the given name.

Parameters:

Name	Type	Description	Default
`stage_name`	`str`	Name of the stage. CMF modifies stage names before recording them in MLMD (e.g., the pipeline name becomes part of the stage name), so stage names from different pipelines do not collide.	required

Returns:

Type	Description
`List[Execution]`	List of executions for the given stage.
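Since stage names are stored in a modified form, one cautious approach is to obtain them from `get_pipeline_stages` and pass them on unchanged instead of building them by hand. The sketch below walks pipelines, stages and executions using only the three methods documented above. `executions_by_stage` and `FakeQuery` are hypothetical names, and the `demo/prepare` naming in the sample data is made up for illustration; the actual stored format is not specified here.

```python
from typing import Dict, List


def executions_by_stage(query) -> Dict[str, Dict[str, list]]:
    """Map pipeline name -> stage name -> executions, using only documented calls."""
    result: Dict[str, Dict[str, list]] = {}
    for pipeline_name in query.get_pipeline_names():
        stages: Dict[str, list] = {}
        # Stage names come back as stored in MLMD, so they are passed on unchanged.
        for stage_name in query.get_pipeline_stages(pipeline_name):
            stages[stage_name] = query.get_all_exe_in_stage(stage_name)
        result[pipeline_name] = stages
    return result


class FakeQuery:
    """Hypothetical stand-in for CmfQuery so the sketch runs without an MLMD file."""

    def __init__(self, stages: Dict[str, Dict[str, List[str]]]) -> None:
        self._stages = stages

    def get_pipeline_names(self) -> List[str]:
        return list(self._stages)

    def get_pipeline_stages(self, pipeline_name: str) -> List[str]:
        return list(self._stages[pipeline_name])

    def get_all_exe_in_stage(self, stage_name: str) -> List[str]:
        # Plain strings stand in for MLMD Execution objects.
        for stages in self._stages.values():
            if stage_name in stages:
                return stages[stage_name]
        return []


sample = {"demo": {"demo/prepare": ["exec-1"], "demo/train": ["exec-2", "exec-3"]}}
tree = executions_by_stage(FakeQuery(sample))
```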

`get_all_executions_by_ids_list(exe_ids)` ¶

Return executions for the given list of execution ids as a pandas data frame.

Parameters:

Name	Type	Description	Default
`exe_ids`	`List[int]`	List of execution identifiers.	required

Returns:

Type	Description
`DataFrame`	Data frame with all executions for the list of given execution identifiers.

`get_all_artifacts_by_context(pipeline_name)` ¶

Return artifacts for given pipeline name as a pandas data frame.

Parameters:

Name	Type	Description	Default
`pipeline_name`	`str`	Name of the pipeline.	required

Returns:

Type	Description
`DataFrame`	Data frame with all artifacts associated with given pipeline name.

`get_all_artifacts_by_ids_list(artifact_ids)` ¶

Return all artifacts for the given artifact ids list.

Parameters:

Name	Type	Description	Default
`artifact_ids`	`List[int]`	List of artifact identifiers.	required

Returns:

Type	Description
`DataFrame`	Data frame with all artifacts for the given artifact ids list.

`get_all_executions_in_stage(stage_name)` ¶

Return executions of the given stage as pandas data frame.

Parameters:

Name	Type	Description	Default
`stage_name`	`str`	Stage name. See the description of `stage_name` in `get_all_exe_in_stage`.	required

Returns:

Type	Description
`DataFrame`	Data frame with all executions associated with the given stage.

`get_artifact_df(artifact, d=None)` ¶

Return artifact's data frame representation.

Parameters:

Name	Type	Description	Default
`artifact`	`Artifact`	MLMD entity representing artifact.	required
`d`	`Optional[Dict]`	Optional initial content for data frame.	`None`

Returns:

Type	Description
`DataFrame`	A data frame with the single row containing attributes of this artifact.

`get_all_artifacts()` ¶

Return names of all artifacts.

Returns:

Type	Description
`List[str]`	List of all artifact names.

`get_artifact(name)` ¶

Return artifact's data frame representation using artifact name.

Parameters:

Name	Type	Description	Default
`name`	`str`	Artifact name.	required

Returns:

Type	Description
`Optional[DataFrame]`	Pandas data frame with one row containing attributes of this artifact.

`get_all_artifacts_for_execution(execution_id)` ¶

Return input and output artifacts for the given execution.

Parameters:

Name	Type	Description	Default
`execution_id`	`int`	Execution identifier.	required

Returns:

Type	Description
`DataFrame`	Data frame containing input and output artifacts for the given execution, one artifact per row.

`get_all_artifact_types()` ¶

Return names of all artifact types.

Returns:

Type	Description
`List[str]`	List of all artifact types.

`get_all_executions_for_artifact(artifact_name)` ¶

Return executions that consumed and produced given artifact.

Parameters:

Name	Type	Description	Default
`artifact_name`	`str`	Artifact name.	required

Returns:

Type	Description
`DataFrame`	Pandas data frame containing stage executions, one execution per row.

`get_one_hop_child_artifacts(artifact_name, pipeline_id=None)` ¶

Return artifacts produced by executions that consume the given artifact.

Parameters:

Name	Type	Description	Default
`artifact_name`	`str`	Name of an artifact.	required

Returns:

Type	Description
`DataFrame`	Output artifacts of all executions that consumed given artifact.

`get_all_child_artifacts(artifact_name)` ¶

Return all downstream artifacts starting from the given artifact.

Parameters:

Name	Type	Description	Default
`artifact_name`	`str`	Artifact name.	required

Returns:

Type	Description
`DataFrame`	Data frame containing all child artifacts.
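One way to read the relationship between the one-hop and the full downstream queries is as repeated one-hop expansion until no new artifacts appear. The sketch below illustrates that idea on a plain dictionary; it is not cmflib's implementation, and `all_children` and the sample lineage are hypothetical.

```python
from collections import deque
from typing import Dict, List


def all_children(one_hop: Dict[str, List[str]], start: str) -> List[str]:
    """Collect every artifact reachable from `start` by repeated one-hop steps."""
    seen = set()
    order: List[str] = []
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for child in one_hop.get(current, []):
            # Skip artifacts already visited so cycles and shared children terminate.
            if child not in seen and child != start:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


# Hypothetical lineage: each key maps to the artifacts one hop downstream of it.
lineage = {
    "raw.csv": ["features.csv"],
    "features.csv": ["model.pkl", "stats.json"],
    "model.pkl": ["metrics.json"],
}
downstream = all_children(lineage, "raw.csv")
```

Here `lineage["raw.csv"]` plays the role of a one-hop result, while `downstream` corresponds to the full set of descendants.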

`get_one_hop_parent_artifacts(artifact_name)` ¶

Return input artifacts for the execution that produced the given artifact.

Parameters:

Name	Type	Description	Default
`artifact_name`	`str`	Artifact name.	required

Returns:

Type	Description
`DataFrame`	Data frame containing the immediate parent artifacts of the given artifact.

`get_all_parent_artifacts(artifact_name)` ¶

Return all upstream artifacts.

Parameters:

Name	Type	Description	Default
`artifact_name`	`str`	Artifact name.	required

Returns:

Type	Description
`DataFrame`	Data frame containing all parent artifacts.

`get_all_parent_executions(artifact_name)` ¶

Return all executions that produced upstream artifacts for the given artifact.

Parameters:

Name	Type	Description	Default
`artifact_name`	`str`	Artifact name.	required

Returns:

Type	Description
`DataFrame`	Data frame containing all parent executions.

`get_metrics(metrics_name)` ¶

Return metric data frame.

Parameters:

Name	Type	Description	Default
`metrics_name`	`str`	Metrics name.	required

Returns:

Type	Description
`Optional[DataFrame]`	Data frame containing all metrics.

`dumptojson(pipeline_name, exec_uuid=None)` ¶

Return JSON-parsable string containing details about the given pipeline.

Parameters:

Name	Type	Description	Default
`pipeline_name`	`str`	Name of an AI pipeline.	required
`exec_uuid`	`Optional[str]`	Optional stage `execution_uuid`; when given, stages are filtered by it.	`None`

Returns:

Type	Description
`Optional[str]`	Pipeline in JSON format.
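The return type is `Optional[str]`, so a caller would typically guard against `None` before parsing. A minimal sketch, assuming the returned string is valid JSON as the description states; `load_pipeline_dump` is a hypothetical helper, and the sample string is a placeholder rather than the actual output schema.

```python
import json
from typing import Any, Optional


def load_pipeline_dump(dump: Optional[str]) -> Any:
    """Parse the string returned by dumptojson; None is passed through unchanged."""
    if dump is None:
        return None
    return json.loads(dump)


# Placeholder payload for illustration only; the real structure is not shown here.
sample_dump = '{"name": "demo-pipeline"}'
parsed = load_pipeline_dump(sample_dump)
missing = load_pipeline_dump(None)
```

With a real query object, the argument would be the result of `dumptojson(pipeline_name)`.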
