CLI Reference
Quick start with CMF Client
Common Metadata Framework (CMF) has the following components:
- cmflib: A Python library that captures and tracks metadata throughout your ML pipeline, including datasets, models, and metrics. It provides APIs for both logging metadata during execution and querying it later for analysis.
- CMF Client: A command-line tool that synchronizes metadata with the CMF Server, manages artifact transfers to and from storage repositories, and integrates with Git for version control.
- CMF Server with GUI: A centralized server that aggregates metadata from multiple clients and provides a web-based graphical interface for visualizing pipeline executions, artifacts, and lineage relationships, enabling teams to collaborate effectively.
- Central Artifact Repositories: Storage backends (such as AWS S3, MinIO, or SSH-based storage) that host your datasets, models, and other pipeline artifacts.
This tutorial walks you through the process of setting up the CMF Client.
Prerequisites
Before proceeding with the setup, ensure the following components are up and running:
Make sure there are no errors during their startup, as CMF Client depends on both of these components.
Set up a CMF Client
CMF Client is a command-line tool that facilitates metadata collaboration between teams or individual team members. It allows users to push metadata to, or pull metadata from, the CMF Server.
Note: The CMF Client is automatically installed when you install cmflib. No separate installation is required.
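Since the client ships with cmflib, a quick way to confirm the installation is to check that the `cmf` command is on your `PATH`. The following is a minimal sketch; the `have_cmd` helper is a hypothetical name introduced here for illustration and is not part of CMF.

```shell
#!/bin/sh
# Hypothetical helper (not part of CMF): succeed if a command is on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Report whether the cmf CLI is available in the current environment.
if have_cmd cmf; then
    echo "cmf found at: $(command -v cmf)"
else
    echo "cmf not found on PATH; install cmflib in this environment first" >&2
fi
```

If the command is not found, the usual cause is that cmflib was installed into a different Python environment than the one currently active.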
Follow the steps below for the end-to-end setup of CMF Client:
Configuration
- Create a working directory:
  mkdir <workdir>
- Execute cmf init to configure the Data Version Control (DVC) remote directory, Git remote URL, CMF Server, and Neo4j. See the cmf init reference for more details.
How to effectively use CMF Client?
Let's assume we are tracking the metadata for a pipeline named Test-env with a local artifact repository and a CMF Server.
Create a folder
mkdir example-folder
Initialize cmf
Initialization is the first step required before using any other CMF Client command. A single cmf init command completes the entire initialization. Execute cmf init in the example-folder directory created in the previous step.
Basic Usage (Required Parameters Only):
cmf init local --path /path/to/local-storage --git-remote-url https://github.com/user/experiment-repo.git
With Optional Parameters:
cmf init local --path /path/to/local-storage --git-remote-url https://github.com/user/experiment-repo.git --cmf-server-url http://x.x.x.x:80 --neo4j-user neo4j --neo4j-password password --neo4j-uri bolt://localhost:7687
Check cmf init local for more details.
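Because the optional parameters make the command long, it can help to assemble it from variables and review it before running. The sketch below only prints the command. `build_init_cmd` and its argument order are hypothetical names introduced for illustration, and it uses only flags that appear in the examples above. It assumes the values contain no spaces.

```shell
#!/bin/sh
# Hypothetical helper (not part of CMF): print a `cmf init local` command
# assembled from its arguments so it can be reviewed before running.
# Usage: build_init_cmd <storage-path> <git-remote-url> [cmf-server-url]
build_init_cmd() {
    cmd="cmf init local --path $1 --git-remote-url $2"
    # Append the optional server URL only when a third argument is given.
    if [ -n "${3:-}" ]; then
        cmd="$cmd --cmf-server-url $3"
    fi
    printf '%s\n' "$cmd"
}

# Required parameters only.
build_init_cmd /path/to/local-storage https://github.com/user/experiment-repo.git
# With the optional CMF Server URL (placeholder address from the example above).
build_init_cmd /path/to/local-storage https://github.com/user/experiment-repo.git http://x.x.x.x:80
```

Once the printed command looks right, copy it into the terminal and run it from inside example-folder.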
Check status of CMF initialization (Optional)
cmf init show
Track metadata using cmflib
Use Sample projects as a reference to create a new project to track metadata for ML pipelines.
More information is available in the Getting Started Tutorial.
Before pushing artifacts or metadata, ensure that the CMF Server is up and running.
Push artifacts
Push artifacts to the artifact repository initialized in the Initialize cmf step.
cmf artifact push -p 'Test-env'
Push metadata to CMF Server
cmf metadata push -p 'Test-env'
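The two push commands can be chained so that the metadata push runs only if the artifact push succeeds. This is a minimal sketch, following the order in which the steps are listed above. The `run` and `push_all` helpers and the `DRY_RUN` variable are hypothetical names introduced for illustration. By default the script only prints the commands; set `DRY_RUN=0` to execute them.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of CMF). With DRY_RUN=1 (the default here)
# commands are printed instead of executed.
PIPELINE="Test-env"
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Push artifacts first, then metadata; stop at the first failure.
push_all() {
    run cmf artifact push -p "$PIPELINE" &&
    run cmf metadata push -p "$PIPELINE"
}

push_all
```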
CMF Client with collaborative development
In the case of collaborative development, in addition to the above commands, users can follow the commands below to pull metadata and artifacts from a common CMF Server and a central artifact repository.
Pull metadata from the server
Execute the cmf metadata pull command in the example-folder directory.
cmf metadata pull -p 'Test-env'
Pull artifacts from the central artifact repository
Execute the cmf artifact pull command in the example-folder directory.
cmf artifact pull -p 'Test-env'
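A collaborator joining an existing pipeline could wrap the two pull steps in one function that takes the pipeline name as an argument. The sketch below is illustrative only. `run_step`, `pull_all`, and `DRY_RUN` are hypothetical names, and stopping before the artifact pull when the metadata pull fails is an assumption about the desired behavior rather than something CMF prescribes.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of CMF). With DRY_RUN=1 (the default here)
# each command is printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

run_step() {
    if [ "$DRY_RUN" = "1" ]; then
        printf 'would run: %s\n' "$*"
    else
        "$@"
    fi
}

# Usage: pull_all <pipeline-name>
# Pull metadata first; skip the artifact pull if that step fails.
pull_all() {
    run_step cmf metadata pull -p "$1" || return 1
    run_step cmf artifact pull -p "$1"
}

pull_all "Test-env"
```

Run it from inside example-folder after cmf init has completed, with `DRY_RUN=0` to execute the commands for real.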
Flow Chart for cmf
CMF Client is a command-line tool that facilitates metadata collaboration between teams or individual team members. It allows users to push metadata to, or pull metadata from, the CMF Server, and provides similar functionality for artifact repositories, along with other commands.