Four "R"s to Machine Learning Software Development
Having developed and maintained data-driven products built on ML models in production, at companies of all shapes and sizes (including internationally!), I've found that there are four themes to ML software development.
Relevance
Do you understand the business question behind the promising POC you're developing? Will what you put into production meet both the business and the technical requirements?
Robustness
How robust is the data processing? How high do you score on the ML Test Score (Google, 2017)?
Design: Do you have an architecture diagram? Have you defined an input and output spec?
Is there a testing suite?
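To make the input/output spec and testing questions concrete, here is a minimal sketch of what a spec enforced in code might look like. The field names (age, income), the valid ranges, and the placeholder scoring rule are all hypothetical and stand in for whatever your model actually consumes and produces.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PredictionInput:
    """Hypothetical input spec: fields and ranges are illustrative only."""
    age: float
    income: float

    def __post_init__(self):
        # Reject records that violate the spec before they reach the model.
        if not 0 <= self.age <= 120:
            raise ValueError(f"age out of range: {self.age}")
        if self.income < 0:
            raise ValueError(f"income must be non-negative: {self.income}")


@dataclass(frozen=True)
class PredictionOutput:
    """Hypothetical output spec: a score constrained to [0, 1]."""
    score: float

    def __post_init__(self):
        if not 0.0 <= self.score <= 1.0:
            raise ValueError(f"score out of range: {self.score}")


def predict(x: PredictionInput) -> PredictionOutput:
    # Placeholder for a real model call (assumption, not a real scoring rule).
    raw = 0.01 * x.age + 0.00001 * x.income
    return PredictionOutput(score=min(1.0, raw))


def test_predict_respects_output_spec():
    # A test-suite entry in the style pytest would collect.
    out = predict(PredictionInput(age=30, income=50_000))
    assert 0.0 <= out.score <= 1.0
```

With the spec living in code, a malformed record fails loudly at the boundary instead of silently producing a prediction, and the test documents the contract for whoever modifies the model next.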
Reproducibility
Do you use version control?
Docker?
Do you connect to the data source(s) directly?
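Version control and Docker pin the code and the environment; a related habit, sketched below as one possible reading of the questions above, is to pin the data and the randomness as well. The helper names (fingerprint, split) and the toy rows are hypothetical, and only the Python standard library is used.

```python
import hashlib
import json
import random


def fingerprint(rows):
    """Return a SHA-256 hex digest of the data, to record next to the code version."""
    payload = json.dumps(rows, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def split(rows, seed, test_fraction=0.2):
    """Shuffle with an explicit seed so the train/test split can be reproduced."""
    rng = random.Random(seed)  # local generator: no hidden global state
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


if __name__ == "__main__":
    rows = [{"id": i, "value": i * i} for i in range(10)]  # toy data
    train, test = split(rows, seed=42)
    print("data fingerprint:", fingerprint(rows))
    print("train size:", len(train), "test size:", len(test))
```

Logging the fingerprint and the seed with each training run makes it possible to tell later whether two runs differ because the code changed, the data changed, or the shuffle changed.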
Readability
(Python) Do you follow PEP 8? Do you also follow a project style guide?
Does the code need refactoring? Or is it (relatively) easy to understand what the code is doing and how to modify it?
Is there logging?
Should I not ask about documentation? :)
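As a small illustration of the logging and documentation questions, here is a sketch using Python's standard logging module. The logger name (ml_pipeline) and the clean_rows step are hypothetical; the point is only that a data processing step says what it did.

```python
import logging

# Hypothetical logger name; in a real project this is often __name__.
logger = logging.getLogger("ml_pipeline")


def clean_rows(rows):
    """Drop rows containing missing (None) values and log how many were kept."""
    kept = [row for row in rows if None not in row.values()]
    logger.info("clean_rows: kept %d of %d rows", len(kept), len(rows))
    if not kept:
        logger.warning("clean_rows: no rows left after cleaning")
    return kept


if __name__ == "__main__":
    # Configure logging once, at the entry point, not inside library code.
    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
    )
    clean_rows([{"x": 1.0, "y": 2.0}, {"x": None, "y": 3.0}])
```

A docstring plus one log line per step is often enough for a new reader to follow what the pipeline does without stepping through it.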
Keywords: Data products, Machine Learning software development