Agile methods are almost everywhere, and they are now making their way into other hyped domains such as Data Science. One thing I like in this respect is the combination with DevOps, since it simplifies the process and creates end-to-end responsibility. However, I strongly believe that it doesn’t make much sense to exclude the business. In the case of Analytics, I would argue that the right term is BizDevOps.

Data Science needs a lot of business integration and works across different domains and functions. I have outlined several times in different posts here that Data Science isn’t a job done by Data Scientists alone. It is teamwork, and thus needs different people. The concept of BizDevOps explains this well; let’s have a look at the following picture, and afterwards I will outline the interdependencies in it:

BizDevOps for Data Science

There must be exactly one person who takes end-to-end responsibility, ranging from business alignment, to translation into an algorithm, to making it productive by operating it. This is the typical workflow for BizDevOps. The person taking the end-to-end responsibility is typically a project or program manager working in the data domain. The three steps are outlined in the figure above; let’s now have a look at each of them.

Biz

The program manager for Data (you could also call this person the “Analytics Translator”) works closely with the business (marketing, fraud, risk, shop floor, …) to gather its requirements and needs. This person also has a great understanding of what is feasible with the internal data, in order to be capable of “translating a business problem to an algorithm”. This phase is mainly about the use case and not so much about tools and technologies; those come in the next step. Up to this point, Data Scientists aren’t necessarily involved yet.

Dev

This phase is all about implementing the algorithm and working with the data. The program manager mentioned above has already aligned with the business and written a detailed description. Data Scientists and Data Engineers now join the project. Data Engineers start to fetch and prepare the data, and they work with Data Scientists on finding the answer to the business question. There are several iterations and feedback loops back to the business as more and more answers arrive. This process should only take a few weeks, ideally 3-6 weeks. Once the results are satisfactory, the work moves to the next phase: bringing it into operation.

Ops

This phase is about operating the algorithms that were developed. The data engineer is in charge of integrating them into the live systems. The business unit wants to see the result as a (continuously) calculated KPI or as any other action that has some sort of impact. Continuous improvement of the models also happens here, since the business might come up with new ideas. In this phase, the data scientist isn’t involved anymore; the work is done by the data engineer or a dedicated DevOps engineer alongside the program manager.

Eventually, once the project is done (I dislike “done” because in my opinion a project is never done), this entire process moves into a CI process.

As mentioned in the last post, deployment is a very tricky thing in the Cloud. It can be easy if you run your applications on a PaaS service. However, it might be tricky in IaaS environments.

There is usually a need for an operations team that takes care of deployment and monitors the application for failures. In DevOps environments, we often see the following:

Dev:

“It’s not my code, it is your machines!”

Ops:

“It’s not my machines, it is your code!”

This means that there are often problems between the two teams. The development team points the finger at the operations team and vice versa. Flickr gave some best practices on how to overcome this issue. These recommendations are divided into two main points:

  1. Right Tools
  2. Culture

Right Tools consists of the following recommendations:

  • Automated Infrastructure: tools such as Puppet for IT automation, Cobbler for automated installation, CFEngine for configuration management, or other tools
  • Shared Version Control
  • One-Step Build: all build actions are done in a script, with no need for further command-line work
  • Branching
  • Shared Metrics (easy-to-read metrics to see improvements)
  • IRC and IM Robots to Communicate
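
To make the one-step build idea a bit more concrete, here is a minimal sketch in Python. It is only an illustration and not what Flickr used: the step names and the commands in `BUILD_STEPS` are placeholders that I made up, and a real project would put its own compile, test and package commands there.

```python
import subprocess
import sys

# Hypothetical build steps (assumption): each entry is a step name plus the
# command to run. The commands here only print a message as a stand-in.
BUILD_STEPS = [
    ("compile", [sys.executable, "-c", "print('compiling')"]),
    ("test", [sys.executable, "-c", "print('running tests')"]),
    ("package", [sys.executable, "-c", "print('packaging')"]),
]


def run_build(steps):
    """Run every step in order and stop at the first failing one.

    Returns the names of the steps that completed successfully.
    """
    completed = []
    for name, command in steps:
        result = subprocess.run(command)
        if result.returncode != 0:
            # A non-zero exit code means the step failed; skip the rest.
            print(f"build failed at step: {name}")
            break
        completed.append(name)
    return completed
```

Calling `run_build(BUILD_STEPS)` is then the single step: nobody has to remember the order of the commands, and a failing step stops everything that would come after it.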

Large websites such as Amazon, Google, eBay and Facebook face problems when it comes to deploying their applications. Often a new version needs to be deployed to some 10,000 servers or even more. On Cloud platforms, we often have similar issues. Even if we only need to deploy to 100 instances, it can be difficult to get everything right.

There is a significant difference between PaaS (Platform as a Service) and IaaS (Infrastructure as a Service) applications. If we use IaaS, we have to take care of deployment ourselves: each instance needs to have the newest version. There are different tools available to automate that task, or you can write your own tool for it. If you use PaaS, it is easy, since most platforms (such as Google App Engine and Windows Azure) usually take care of that.
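
If you go down the "write your own tool" road, the first building block is usually a check for which instances are not yet on the newest version. The following is a small sketch of that check only. The instance names and version strings are invented for the example, and in practice the inventory would have to come from your cloud provider or your configuration management tool.

```python
def outdated_instances(deployed_versions, target_version):
    """Return the sorted names of instances not running target_version.

    deployed_versions maps an instance name to the version it runs.
    """
    return sorted(
        name
        for name, version in deployed_versions.items()
        if version != target_version
    )


# Hypothetical inventory (assumption), written by hand for illustration.
fleet = {
    "web-1": "2.4.0",
    "web-2": "2.3.9",
    "web-3": "2.4.0",
    "web-4": "2.3.9",
}

print(outdated_instances(fleet, "2.4.0"))  # ['web-2', 'web-4']
```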

There are several issues and challenges if you use IaaS Platforms to run your applications:

  • You have to deploy to a lot of servers, usually 100 or more
  • It is not possible to allow a maintenance window. During that timeframe you lose money, and your customers will be angry if you have a “maintenance downtime” of 2 hours or more every week
  • Deployment goes to the production system, which probably millions of users interact with, so problems have immediate effects

This leads to some problems and the short answer is that there is no easy solution to that. Each deployment strategy must be tailored to the application. However, it is possible to use a specific iteration strategy, which is illustrated below:

Iteration strategies for Cloud deployment

There are 5 major phases in each deployment:

  1. Develop. The software or platform is in the development phase.
  2. Verify. Everything that was developed in phase 1 is now tested. This is basically done by the testing department. There are no user tests involved so far and the new release is not deployed to a real environment.
  3. Stage. The new release is put on a staging environment. This is a sandboxed environment where no damage can be done to the real system.
  4. Verify. The new release is verified on the staging environment. This is basically done by beta users or internal departments.
  5. Deploy. The new release is deployed to the final system and all users access it from now on.
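
The five phases above can be read as a simple gate: a release only moves on if the current phase succeeds. Here is a minimal sketch of that idea. The phase labels (in particular "verify on staging" for phase 4) and the `run_release` function are my own naming for this example, and the checks are passed in as plain functions because what "success" means depends entirely on your application.

```python
# Phase labels follow the five phases above; the exact strings are an
# assumption made for this sketch.
PHASES = ["develop", "verify", "stage", "verify on staging", "deploy"]


def run_release(checks):
    """Walk through the phases in order and stop at the first failure.

    checks maps each phase name to a zero-argument function that returns
    True if the phase succeeded. Returns a tuple of the phases that passed
    and a flag telling whether the release reached the end.
    """
    passed = []
    for phase in PHASES:
        if not checks[phase]():
            return passed, False
        passed.append(phase)
    return passed, True


# Example: the beta users find a problem on the staging environment,
# so the release never reaches the deploy phase.
checks = {phase: (lambda: True) for phase in PHASES}
checks["verify on staging"] = lambda: False
print(run_release(checks))
```

In the example the release stops after the staging environment has been set up, which is exactly the point of having a sandbox in front of the real system.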