MLOps Pipeline Reference
About MLOps Pipeline Reference
The MLOps Pipeline Reference is a searchable guide to the most widely adopted tools in the machine learning operations ecosystem, organized into five practical categories: Experiment Tracking (MLflow, Weights & Biases, TensorBoard), Pipeline Orchestration (Kubeflow, Apache Airflow, Prefect), Model Serving (BentoML, Seldon Core, Ray Serve), Data Management (DVC, Feast, Great Expectations), and Data Labeling (Label Studio, CVAT). Each tool entry includes the tool name, what it does, and a working code example.
Machine learning engineers, data scientists, and platform engineers use MLOps tools to bridge the gap between model development and production deployment. Experiment tracking tools like MLflow and W&B let teams log hyperparameters, metrics, and model artifacts across hundreds of training runs. Pipeline orchestration tools like Kubeflow and Airflow automate the sequence of data preprocessing, model training, evaluation, and deployment steps. Model serving frameworks like BentoML and Seldon Core handle containerization, REST API generation, and Kubernetes deployment of trained models.
This reference is designed for teams building or evaluating their MLOps stack. Data versioning with DVC solves the reproducibility problem by treating datasets and model files like code in a Git repository. Feature stores like Feast provide consistent feature computation between training and online inference, eliminating training-serving skew. Data quality tools like Great Expectations automate validation checks on incoming data pipelines. Whether you are setting up your first ML pipeline or comparing serving frameworks for production, this reference gives you working examples for each tool.
Key Features
- Five categories: Experiment Tracking, Pipeline Orchestration, Model Serving, Data Management, Data Labeling
- Experiment tracking tools: MLflow (log_param, log_metric), W&B (wandb.init, wandb.log), TensorBoard (SummaryWriter)
- Pipeline orchestration: Kubeflow (@dsl.pipeline), Airflow (DAG with >> dependency operators), Prefect (@task/@flow)
- Model serving: BentoML (@bentoml.service), Seldon Core (SeldonDeployment YAML), Ray Serve (@serve.deployment)
- Data versioning with DVC: dvc init, dvc add, dvc push/pull for Git-based dataset management
- Feature store setup with Feast: feast init, feast apply, and online feature retrieval
- Data quality validation with Great Expectations checkpoint runs
- Annotation tools: Label Studio (web UI + ML backend) and CVAT (bounding boxes, segmentation, keypoints)
Frequently Asked Questions
What is MLflow and how do I use it for experiment tracking?
MLflow is an open-source platform for tracking machine learning experiments, packaging models, and deploying them. You start a run with mlflow.start_run(), log hyperparameters with mlflow.log_param("lr", 0.01), record metrics with mlflow.log_metric("accuracy", 0.95), and end the run with mlflow.end_run(). The MLflow UI lets you compare runs and visualize metric trends across experiments.
What is the difference between MLflow and Weights & Biases (W&B)?
Both tools track experiments, but W&B is a cloud-hosted SaaS with richer visualization, Sweeps (hyperparameter search), and team collaboration features. MLflow is open-source and self-hostable, making it preferred for organizations with data privacy requirements or on-premises infrastructure. Many teams use both — MLflow for model registry and deployment, W&B for interactive experiment visualization.
What is DVC and how does it solve data versioning?
DVC (Data Version Control) extends Git to handle large data files and ML models. You run dvc init in a Git repository, then dvc add data/train.csv to track a dataset (DVC stores the file in a cache and adds a small .dvc pointer file to Git). Teams share datasets via dvc push to remote storage (S3, GCS, Azure Blob) and teammates retrieve them with dvc pull, ensuring everyone trains on exactly the same data.
What is Kubeflow and when should I use it over Airflow?
Kubeflow is a Kubernetes-native ML platform that runs each pipeline step as a container on a K8s cluster. It is best for organizations already using Kubernetes who want native GPU scheduling, distributed training, and model serving in one platform. Airflow is a general-purpose workflow orchestrator that runs arbitrary Python tasks, making it more flexible but requiring more configuration for ML-specific features like experiment tracking and model registry integration.
What is a feature store and why does Feast matter?
A feature store is a centralized repository for storing, sharing, and serving machine learning features. Feast solves the training-serving skew problem: features computed during training often differ slightly from features computed at inference time due to different code paths. With Feast, you define feature views once, apply them with feast apply, and retrieve the same features both in training batch jobs and online inference with store.get_online_features().
What is the difference between BentoML and Seldon Core for model serving?
BentoML is a Python-first framework focused on developer experience — you save a model with bentoml.sklearn.save_model() and define a @bentoml.service class to serve it, and BentoML handles containerization automatically. Seldon Core is a Kubernetes-native operator where you deploy models by writing a SeldonDeployment YAML manifest. Seldon is better suited for enterprise Kubernetes environments needing canary deployments and A/B testing at scale.
What is Great Expectations used for in an MLOps pipeline?
Great Expectations (GX) is a data quality framework that automatically validates your data against expectations — for example, checking that a column has no nulls, that values fall within expected ranges, or that the schema matches the expected column types. You configure a context, define expectations, and run checkpoints in your pipeline. This catches data drift and data quality issues before they cause silent model degradation in production.
When should I use Label Studio vs CVAT for data annotation?
Label Studio is a general-purpose, open-source annotation tool that supports text, audio, images, video, and time-series data. It also supports ML backends that can pre-annotate data automatically. CVAT (Computer Vision Annotation Tool) is specialized for computer vision tasks — it excels at bounding box annotation, polygon segmentation, and keypoint labeling for object detection and pose estimation datasets.