Agile Data Science: Kanban or Scrum?

Agility is everywhere in the enterprise nowadays. Most companies want to become more agile, and at C-level there are huge expectations of agility. However, I’ve seen many analytics (and Big Data) projects turn out to be the complete opposite: neither agile nor successful. The reasons varied: setting up the data lake on expensive hardware took years rather than months, and operating and maintaining these systems turned out to be very inefficient. Many companies also expressed their demand for agile analytics. But in fact, with analytics (and big data), we moved away from agility and towards a complex, waterfall-like approach. What was even worse was claiming to do agile analytics and then not sticking to it, ending up somewhere in between.

However, a lot of companies have also realised that agility can only be achieved with (Biz)DevOps and the Cloud. Really, there is hardly any way around this. It also takes close cooperation between data engineering and data science. Once these requirements are met, another important question arises: Kanban or Scrum?

I would say that this question is a “luxury” problem. If a company has to answer it, it is already at a very high level of data maturity. My ideas on this topic (which, again, is an “it depends” thing) are:

  • Complexity: If the data project is more complex, Scrum might be the better choice. A lot of data science projects are one-person projects (with support from data engineers and DevOps at some stages) that run for a few weeks and are not always full-time. In this case (lower complexity), Kanban is the most suitable approach. Often, the data scientist even works on several projects at once, as the load per project isn’t high. For other projects with higher complexity, I would recommend Scrum.
  • Integration/Productization: If the integration effort is high (e.g. into existing processes, systems and the like), I would rather recommend going with Scrum. More people are involved and the complexity is immediately higher. My experience is that the larger the data engineering part, the more likely it is that the project is delivered with Scrum.
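The two indicators above can be written down as a small rule of thumb. This is only a rough sketch of my reasoning, not a formal method: the names `DataProject` and `recommend_method` and the three yes/no flags are made up for illustration, and real projects will rarely be this clear-cut.

```python
from dataclasses import dataclass


@dataclass
class DataProject:
    # All three flags are illustrative assumptions; how you judge
    # "high" is up to you and depends on the company.
    high_complexity: bool          # e.g. more than a one-person, part-time project
    high_integration_effort: bool  # integration into existing processes and systems
    engineering_heavy: bool        # large data engineering share


def recommend_method(project: DataProject) -> str:
    """Return "Scrum" or "Kanban" following the rule of thumb above."""
    if (
        project.high_complexity
        or project.high_integration_effort
        or project.engineering_heavy
    ):
        return "Scrum"
    # Small, low-complexity projects, often run in parallel by one data scientist
    return "Kanban"


if __name__ == "__main__":
    small_analysis = DataProject(False, False, False)
    productized_model = DataProject(True, True, True)
    print(recommend_method(small_analysis))
    print(recommend_method(productized_model))
```

Treat the result as a starting point for the discussion rather than a decision: as said above, it remains an “it depends” thing.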

I guess there could be many more indicators, so I am looking forward to your comments 🙂

I lead a team of senior experts in Data & Data Science as Head of Data & Analytics and AI at A1 Telekom Austria Group. I also teach this topic at various universities and frequently speak at conferences. In 2010 I wrote a book about Cloud Computing, which is often used at German and Austrian universities. In my home country (Austria) I am part of several organisations on Big Data & Data Science.
