MLOps capabilities
Experiment tracking
Hyperparameter tuning
Dataset and model versioning
Model training instrumentation
Experiment tracking
The platform tracks ML/AI and data engineering experiments, providing users with a holistic set of internal and external performance indicators and model-specific metrics, including CPU, GPU, and RAM utilization and inference time. These metrics help identify training bottlenecks, reveal opportunities for performance enhancement, and inform cost optimization recommendations.
Multiple tables and graphs visualize the metrics, enabling users to compare runs and experiments effectively and converge on the most efficient ML/AI model training setup.
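As a rough illustration of the kind of run-level telemetry described above, the Python sketch below samples CPU and RAM utilization and times inference using the third-party psutil library. The helper names (`sample_system_metrics`, `time_inference`) are hypothetical, not part of the OptScale API, and GPU counters would require an additional vendor library such as NVML bindings.

```python
import time

import psutil  # third-party: pip install psutil


def sample_system_metrics() -> dict:
    """Snapshot host-level resource metrics for the current training run."""
    return {
        "timestamp": time.time(),
        "cpu_percent": psutil.cpu_percent(interval=None),
        "ram_percent": psutil.virtual_memory().percent,
        # GPU utilization would come from a vendor library (e.g. NVML);
        # it is omitted here to keep the sketch dependency-light.
    }


def time_inference(model_fn, batch):
    """Measure wall-clock inference time for a single batch."""
    start = time.perf_counter()
    output = model_fn(batch)
    return output, time.perf_counter() - start
```

Sampling these metrics at a fixed interval during training and attaching them to each run is what makes cross-run comparison tables and graphs possible.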
Hyperparameter tuning
Dataset and model versioning
Dataset and model versioning tracks changes to datasets and models over time, at different points in the ML lifecycle, in order to support (a minimal sketch follows the list):
- Reproducibility. By capturing every pipeline step, users can compare model experiment results, find the best candidate, and reproduce the same result.
- Full observability. Dataset and model versioning makes it possible to track the dependencies that affect ML model performance, keep track of candidate models, and find the best parameters and hyperparameters.
- Easy rollback to a previous, stable version in case of errors or underperformance.
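A minimal sketch of the idea, assuming a flat JSON-lines file as the version registry; `dataset_version` and `record_model_version` are illustrative names, not OptScale functions:

```python
import hashlib
import json
import time


def dataset_version(path: str, chunk_size: int = 1 << 20) -> str:
    """Derive a reproducible version ID from the dataset file's contents."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()[:12]


def record_model_version(registry_path: str, model_name: str,
                         dataset_path: str, hyperparams: dict) -> dict:
    """Append an immutable record linking a model to its data and config."""
    entry = {
        "model": model_name,
        "dataset_version": dataset_version(dataset_path),
        "hyperparams": hyperparams,
        "created_at": time.time(),
    }
    with open(registry_path, "a") as f:
        f.write(json.dumps(entry) + "\n")
    return entry
```

Rolling back then amounts to reading an earlier registry entry and re-running training (or reloading the stored model) with the recorded dataset version and hyperparameters.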
Model training instrumentation
Model training instrumentation is essential for understanding model performance, diagnosing issues, ensuring reproducibility, and facilitating continuous improvement.
With OptScale, ML engineers log metrics such as accuracy, loss, precision, recall, and F1 score at regular intervals during training, and record all hyperparameters used in the training process, such as learning rate, batch size, number of epochs, and optimizer type.
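The sketch below shows one way such logging could look in plain Python; `ExperimentLogger` is a hypothetical stand-in, and OptScale's own client interface may differ:

```python
import json
import time


class ExperimentLogger:
    """Toy logger that records hyperparameters once and metrics per step."""

    def __init__(self, run_name: str, hyperparams: dict):
        self.run = {"name": run_name, "hyperparams": hyperparams, "metrics": []}

    def log(self, step: int, **metrics) -> None:
        self.run["metrics"].append({"step": step, "time": time.time(), **metrics})

    def save(self, path: str) -> None:
        with open(path, "w") as f:
            json.dump(self.run, f, indent=2)


logger = ExperimentLogger(
    "baseline",
    {"learning_rate": 1e-3, "batch_size": 32, "epochs": 3, "optimizer": "adam"},
)
for epoch in range(3):
    # A real training step would run here; the logged values below are
    # placeholders standing in for actual loss/accuracy measurements.
    logger.log(step=epoch, loss=1.0 / (epoch + 1), accuracy=0.5 + 0.1 * epoch)
logger.save("run_baseline.json")
```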
OptScale profiles machine learning models and deeply analyzes internal and external metrics to identify training issues and bottlenecks.
Cost and performance tracking for any API call to PaaS or external SaaS services
OptScale profiles machine learning models and deeply analyzes internal and external metrics for any API call to PaaS or external SaaS services. The platform continuously monitors the cost, performance, and output parameters of these calls for better ML visibility. This transparency helps identify bottlenecks and adjust the algorithm's parameters to maximize ML/AI training resource utilization and the outcome of experiments.
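As an illustration of per-call cost and performance tracking, the sketch below wraps an arbitrary API call to record latency and an assumed flat per-call price. Real SaaS pricing (per token, per request, per GB) would need a service-specific cost model, and none of these names come from OptScale:

```python
import time
from dataclasses import dataclass


@dataclass
class CallStats:
    """Accumulated cost/performance figures for one external service."""
    calls: int = 0
    total_seconds: float = 0.0
    total_cost_usd: float = 0.0


def tracked_call(stats: CallStats, fn, *args,
                 cost_per_call_usd: float = 0.0, **kwargs):
    """Invoke an external API call, recording latency and estimated cost."""
    start = time.perf_counter()
    try:
        return fn(*args, **kwargs)
    finally:
        stats.calls += 1
        stats.total_seconds += time.perf_counter() - start
        stats.total_cost_usd += cost_per_call_usd
```

Wrapping, say, each `requests.post` to an inference endpoint this way yields the aggregate latency and spend figures needed to spot bottlenecks and tune an algorithm's parameters.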
Supported platforms
News & Reports
MLOps open source platform
A full description of OptScale as an MLOps open source platform.
Enhance the ML process in your company with OptScale capabilities, including:
- ML/AI Leaderboards
- Experiment tracking
- Hyperparameter tuning
- Dataset and model versioning
- Cloud cost optimization
How to use OptScale to optimize Reserved Instance and Savings Plan (RI/SP) usage for ML/AI teams
Find out how to:
- enhance RI/SP utilization by ML/AI teams with OptScale
- see RI/SP coverage
- get recommendations for optimal RI/SP usage