Track Awesome Seml Updates Daily
A curated list of articles that cover the software engineering best practices for building machine learning applications.
SE-ML/awesome-seml · ⭐ 1K · 🏷️ Computer Science
Feb 11, 2022
Tooling
- Aim - Aim is an open-source experiment tracking tool.
Dec 06, 2021
Governance
Oct 19, 2021
Tooling
- REVISE: REvealing VIsual biaSEs (⭐91) - Automatically detect bias in visual data sets.
Oct 05, 2021
Governance
May 06, 2021
Tooling
- Alibi Detect (⭐1.5k) - Python library focused on outlier, adversarial and drift detection.
- PyTorch Lightning (⭐20k) - The lightweight PyTorch wrapper for high-performance AI research. Scale your models, not the boilerplate.
- Robustness Metrics (⭐418) - Lightweight modules to evaluate the robustness of classification models.
- Seldon Core (⭐3.4k) - An MLOps framework to package, deploy, monitor and manage thousands of production machine learning models on Kubernetes.
- TensorFlow Data Validation (TFDV) (⭐674) - Library for exploring and validating machine learning data. Similar to Great Expectations, but for TensorFlow data.
Apr 28, 2021
Governance
Mar 24, 2021
Model Training
Mar 18, 2021
Deployment and Operation
Jan 04, 2021
Governance
Nov 18, 2020
Deployment and Operation
Governance
Oct 21, 2020
Broad Overviews
Data Management
Oct 13, 2020
Tooling
- Archai (⭐373) - Neural architecture search.
- Fairlearn - A toolkit to assess and improve the fairness of machine learning models.
- Great Expectations (⭐7.4k) - Data validation and testing with integration in pipelines.
- LiFT (⭐159) - LinkedIn Fairness Toolkit.
- Model Card Toolkit (⭐314) - Streamlines and automates the generation of model cards; for model documentation.
Sep 14, 2020
Governance
Jul 30, 2020
Broad Overviews
Jun 24, 2020
Tooling
- Git Large File Storage (LFS) - Replaces large files such as datasets with text pointers inside Git.
- OpenML - An inclusive movement to build an open, organized, online ecosystem for machine learning.
- Spark Machine Learning - Spark’s ML library consisting of common learning algorithms and utilities.
Jun 22, 2020
Model Training
Tooling
- Airflow - Programmatically author, schedule and monitor workflows.
- Data Version Control (DVC) - DVC is a data and ML experiments management tool.
- Facets Overview / Facets Dive - Robust visualizations to aid in understanding machine learning datasets.
- HParams (⭐126) - A thoughtful approach to configuration management for machine learning projects.
- Kubeflow - A platform for data scientists who want to build and experiment with ML pipelines.
- Label Studio (⭐11k) - A multi-type data labeling and annotation tool with standardized output format.
- MLflow - Manage the ML lifecycle, including experimentation, deployment, and a central model registry.
- Neptune.ai - Experiment tracking tool bringing organization and collaboration to data science projects.
- Neuraxle (⭐543) - Sklearn-like framework for hyperparameter tuning and AutoML in deep learning projects.
- TensorBoard - TensorFlow's Visualization Toolkit.
- TensorFlow Extended (TFX) - An end-to-end platform for deploying production ML pipelines.
- Weights & Biases - Experiment tracking, model optimization, and dataset versioning.
May 15, 2020
Governance
Apr 03, 2020
Broad Overviews
Model Training
Deployment and Operation
Mar 29, 2020
Deployment and Operation
Feb 29, 2020
Data Management
Feb 28, 2020
Broad Overviews
Model Training
Feb 26, 2020
Model Training
Feb 20, 2020
Data Management
Model Training
Feb 06, 2020
Deployment and Operation
Feb 03, 2020
Data Management
Social Aspects
Jan 31, 2020
Model Training
Jan 30, 2020
Model Training
Social Aspects
Jan 29, 2020
Data Management
Model Training
Deployment and Operation
Jan 28, 2020
Data Management
Model Training
Deployment and Operation
Social Aspects