Unconventional splitting techniques for time-series datasets: Be careful when time is a dimension in your dataset (Sep 14, 2022)
‘It’s the data, silly!’ How data-centric AI is driving MLOps: The hidden link between data-centric machine learning and MLOps. (Apr 7, 2022)
Taking on the ML pipeline challenge: Why data scientists need to own their ML workflows in production. (Published in Towards Data Science, Oct 27, 2021)
Why ML should be written as pipelines from the get-go: Eliminate technical debt with iterative, reproducible pipelines. (Published in Towards Data Science, Mar 31, 2021)
“Spot” the difference in ML costs: Optimize ML training costs on the cloud using spot instances. (Published in Towards Data Science, Jan 28, 2021)
Avoiding technical debt with ML pipelines: Start thinking in terms of pipelines rather than scripts to avoid technical debt while developing ML systems. (Published in Towards Data Science, Jan 22, 2021)
Is your Machine Learning Reproducible? A breakdown of what reproducible machine learning is, why it’s important, and what goes into making your ML reproducible. (Published in Feature Stores for ML, Jan 20, 2021)
Why ML in production is (still) broken — [#MLOps2020]: Just a few days ago, I was able to share my thoughts on the state of Machine Learning in production, and why it’s (still) broken, on the… (Jun 26, 2020)
Can you do the splits? One attempt to ensure that ML models generalize in unknown settings is splitting data. This can be done in many ways, from 3-way (train… (Jun 11, 2020)
Why deep learning development in production is (still) broken: Around 87% of machine learning projects do not survive to make it to production. There is a disconnect between machine learning being done… (Mar 1, 2020)