

My previous article, on model governance and holistic model management, received a great response, along with some questions about how Data Science and Software Development workflows differ. In response, this piece highlights the key differences in processes, tools and behavior between Data Science and software engineering teams, as well as best practices we’ve learned from years of serving successful model-driven enterprises.

Why Understanding the Key Differences Between Data Science and Software Development Matters

As Data Science becomes a critical value driver for organizations of all sizes, business leaders who depend on both Data Science and Software Development teams need to know how the two differ and how they should work together. Although Software Development and Data Science have a lot in common, they differ in three main areas: processes, tooling and behavior. In practice, IT teams are typically responsible for providing Data Science teams with infrastructure and tools. Because Data Science looks similar to Software Development (they both involve writing code, right?), many well-intentioned IT leaders approach this problem with misguided assumptions and ultimately undermine the Data Science teams they are trying to support.

Data Science != Software Engineering

I. Process

Software engineering has well-established methods for tracking progress, such as agile story points and burndown charts, so managers can predict and control the process using clearly defined metrics. Data Science is different because the work is closer to research and more exploratory in nature. Data Science projects have goals, such as building a model that predicts something, but as in any research process, the desired end state isn’t known up front. This means Data Science projects do not progress linearly through a lifecycle. There is also no agreed-upon definition of the Data Science lifecycle; each organization uses its own. Just as a research lab would find it hard to predict the timing of a breakthrough drug discovery, the inherent uncertainty of research makes it hard to track progress and predict the completion of Data Science projects.