Blueprint Series: Inextricably Linked – Reproducibility and Productivity in Machine Learning

Speaker: Mark Coleman, VP Marketing at dotscience and Marketing Chairperson for the Cloud Native Computing Foundation.

Talk Synopsis: Because it is more complex and has far more moving parts, machine learning is where software development was in 1999: for lack of appropriate tooling, people email and Slack notebooks to each other. There are few CI/CD pipelines, model health monitoring is scarce, much that could be automated is still manual, and teams are siloed. This causes problems for both productivity (it is hard to collaborate) and reproducibility (which affects governance and compliance). In this talk, Mark shares his team’s research comparing the evolution of software development and DevOps with that of machine learning. He then presents a proposal for an architecture and a set of open source tools to solve both the collaboration and the governance problems in machine learning.

Filmed at Skills Matter/Code Node London on 9th May 2019 as part of the Big Data LDN Meetup Blueprint Series.

Meetup sponsored by DataStax.
