
CMF Installation & Setup Guide

This guide provides step-by-step instructions for installing, configuring, and using CMF (Common Metadata Framework) for ML pipeline metadata tracking.

Overview

The installation process consists of the following components:

  1. cmflib with CMF Client Installation: A Python library that captures and tracks metadata throughout your ML pipeline, including datasets, models, and metrics.
  2. CMF Server with GUI Installation: A centralized server that aggregates metadata from multiple clients and provides a web-based graphical interface for visualizing pipeline executions, artifacts, and lineage relationships.

Note: Every CMF setup requires a CMF Server instance. In collaborative environments, multiple users working on the same project can share a single CMF Server to centralize metadata and facilitate team coordination.


Common Prerequisites

Before installing cmflib and its components, ensure you have the following:

  • Operating system: Linux (for example, Ubuntu or Debian)

  • Python: Version 3.9 to 3.11 (3.10 recommended)

    Note: If you encounter issues with Python 3.9 on Ubuntu, refer to the Troubleshooting section at the end of this guide.
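The version requirement above can be checked before installing anything. The snippet below is a minimal sketch: the function name is_supported_python is made up for this example, and the bounds simply mirror the 3.9 to 3.11 range stated above.

```python
import sys

# Supported range taken from the prerequisites above (3.9 to 3.11 inclusive).
MIN_VERSION = (3, 9)
MAX_VERSION = (3, 11)


def is_supported_python(version=None):
    """Return True if (major, minor) lies within the supported range.

    `version` is any tuple starting with (major, minor); it defaults to the
    running interpreter. The name is hypothetical, not part of cmflib.
    """
    major, minor = (version or sys.version_info)[:2]
    return MIN_VERSION <= (major, minor) <= MAX_VERSION


if __name__ == "__main__":
    current = "%d.%d" % sys.version_info[:2]
    status = "supported" if is_supported_python() else "outside the supported range"
    print(f"Python {current} is {status}")
```

Run it with the interpreter you plan to use for the virtual environment, since that is the version the check reports on.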


cmflib with CMF Client Installation

Prerequisites

  • Git: Latest version for code versioning

    Make sure Git is configured with git config, because CMF requires it. At minimum, set your user identity:

    git config --global user.name "Your Name"
    git config --global user.email "you@example.com"
    
  • Storage Backend: Local, S3, MinIO S3, SSH, or OSDF storage for artifacts.
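To confirm that the Git identity from the prerequisite above is actually set, a small check along these lines may help. It is a sketch only: the helper names git_config_value and missing_git_identity are invented for this example, and it inspects only the global Git configuration.

```python
import subprocess


def git_config_value(key):
    """Return the global Git config value for `key`, or None if unset or Git is missing."""
    try:
        result = subprocess.run(
            ["git", "config", "--global", "--get", key],
            capture_output=True,
            text=True,
            check=False,
        )
    except FileNotFoundError:
        return None  # git executable not found on PATH
    value = result.stdout.strip()
    return value if result.returncode == 0 and value else None


def missing_git_identity(lookup=git_config_value):
    """List which of user.name / user.email have no value (hypothetical helper)."""
    return [key for key in ("user.name", "user.email") if not lookup(key)]


if __name__ == "__main__":
    missing = missing_git_identity()
    print("Git identity OK" if not missing else f"Missing: {', '.join(missing)}")
```

The lookup parameter exists so the check can be exercised without touching a real Git configuration.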

Installation Steps

Step 1: Set up Python Virtual Environment

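Create and activate an isolated environment using either Conda or virtualenv (choose one).

Using Conda:

conda create -n cmf python=3.10
conda activate cmf

Using virtualenv:

virtualenv --python=3.10 .cmf
source .cmf/bin/activate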

Step 2: Install cmflib

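Install the released package from PyPI:

pip install cmflib

Alternatively, install the latest code directly from the GitHub repository:

pip install git+https://github.com/HewlettPackard/cmf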
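After installation, you can confirm that the package is visible in the active environment. The sketch below uses the standard library's importlib.metadata; the helper name installed_version is an assumption made for this example, and the distribution name cmflib is taken from the pip command above.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("cmflib")
    print(f"cmflib {version}" if version else "cmflib is not installed in this environment")
```

If the script reports that cmflib is not installed, check that the virtual environment from Step 1 is activated.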

CMF Server with GUI Installation


Prerequisites

  • Docker: For containerized deployment of CMF Server and CMF UI

    1. Install Docker Engine with non-root user privileges.
    2. Install Docker Compose Plugin.

    Earlier versions of Docker Compose were distributed as a standalone tool, separate from Docker, and invoked as docker-compose. Since Docker Compose V2, Compose is available as a plugin for the Docker CLI and is invoked as docker compose. The recommended way to install Docker Compose is to install the Compose plugin alongside Docker Engine. For more information, see the Docker Compose Reference.

  • Docker Proxy Settings: Needed for some of the server packages

    Refer to the official Docker documentation for comprehensive instructions: Configure the Docker Client for Proxy.
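To find out which Compose invocation a given host supports (the plugin form docker compose or the older standalone docker-compose), a probe like the following can be used. It is a sketch under assumptions: the function name compose_command is invented here, and the which and run parameters exist only so the logic can be tested without Docker installed.

```python
import shutil
import subprocess


def compose_command(which=shutil.which, run=subprocess.run):
    """Return the Compose invocation available on this host as a list, or None."""
    if which("docker"):
        # `docker compose version` exits with 0 only when the Compose plugin is installed.
        probe = run(["docker", "compose", "version"], capture_output=True, check=False)
        if probe.returncode == 0:
            return ["docker", "compose"]
    if which("docker-compose"):
        return ["docker-compose"]  # legacy standalone binary
    return None


if __name__ == "__main__":
    command = compose_command()
    print(" ".join(command) if command else "Docker Compose was not found")
```

The returned list can be used as the prefix of any Compose command in this guide, for example followed by -f docker-compose-server.yml up.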

Installation Steps

Step 1: Clone the GitHub Repository

git clone https://github.com/HewlettPackard/cmf

Step 2: Navigate to the CMF Directory

cd cmf

Step 3: Create Environment Configuration

Create a .env file in the same directory as docker-compose-server.yml with the following environment variables:

CMF_DATA_DIR=./data                    
NGINX_HTTP_PORT=80                  
NGINX_HTTPS_PORT=443
REACT_APP_CMF_API_URL=http://your-server-ip:80

๐Ÿ“ Note: - CMF_DATA_DIR controls where all data (PostgreSQL, TensorBoard logs, etc.) is stored. Use an absolute path for better control. - REACT_APP_CMF_API_URL should point to your server's accessible address.

Step 4: Start the Containers

💡 Recommended Approach: Using docker compose starts the CMF Server, PostgreSQL database, and CMF UI together.

Note: It's essential to start the PostgreSQL database before the CMF Server.

docker compose -f docker-compose-server.yml up

๐Ÿ“ Note: Replace docker compose with docker-compose if you're using an older version of Docker.

This command starts all services:

  • PostgreSQL: Database backend for metadata storage
  • CMF Server: API server for metadata management
  • UI: Web interface for visualization
  • TensorBoard: For viewing ML training metrics
  • Nginx: Reverse proxy serving all components

Accessing the CMF UI

Once the containers are successfully started, the CMF UI will be available at the URL specified in your .env file:

http://your-server-ip:80

Replace your-server-ip with the actual IP address or hostname configured in the REACT_APP_CMF_API_URL environment variable.

๐Ÿ“ Note: Ensure that port 80 (or your configured NGINX_HTTP_PORT) is accessible and not blocked by firewall rules.

Step 5: Stop the Containers

docker compose -f docker-compose-server.yml stop

Important Notes

💡 Rebuild Required: Rebuild the images for CMF Server and CMF UI after a CMF version update or after pulling the latest changes from Git to ensure compatibility.

docker compose -f docker-compose-server.yml build --no-cache
docker compose -f docker-compose-server.yml up

Troubleshooting

Python 3.9 Installation Issues on Ubuntu

If you are using Python 3.9 on Ubuntu systems, you may encounter installation or virtual environment issues.

Issue: Creating a Python 3.9 virtual environment may fail with:

ModuleNotFoundError: No module named 'distutils.cmd'

Root Cause: Python 3.9 may be missing required modules like distutils or venv when installed on Ubuntu systems.

Resolution:

  1. Add the deadsnakes PPA (provides newer Python versions):

     sudo add-apt-repository ppa:deadsnakes/ppa
     sudo apt-get update

  2. Install Python 3.9 with required modules:

     sudo apt install python3.9 python3.9-dev python3.9-distutils python3.9-venv

  3. Verify the installation:

     python3.9 --version
     python3.9 -m venv test_env

This ensures Python 3.9 and its essential modules are fully installed and functional.
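To check from inside the interpreter whether the modules named in the error are available, a short script like this can be run with python3.9. It is a sketch: missing_modules is a name invented for this example, and the default module list only reflects the modules mentioned in this section.

```python
import importlib.util


def missing_modules(names=("distutils.cmd", "venv")):
    """Return the module names from `names` that this interpreter cannot find."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except ModuleNotFoundError:
            found = False  # raised when a parent package (e.g. distutils) is absent
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_modules()
    print("All required modules found" if not missing else f"Missing: {', '.join(missing)}")
```

An empty result suggests the packages from step 2 were installed for the interpreter that ran the script.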

💡 Recommendation: If you're starting fresh, use Python 3.10 to avoid these compatibility issues.