The Datalake as driver for digital transformation & data centricity

Most companies today talk about digital transformation and treat data as a main asset for it. The question is where to store this data. In a traditional database? In a DWH? I think we should take a step back to answer this question. First of all, a Datalake is not a single piece of software. It consists of a large variety of platforms; Hadoop is a central one, but not the only one. The Datalake also includes tools such as Spark, Kafka, … and many more, as well as relational databases such as PostgreSQL. If we look at how truly digital companies such as Facebook, Google or Amazon solve these problems, the technology stack is also clear: they heavily use and contribute to Hadoop & similar technologies. So the answer is clear: you don’t need overly expensive DWHs any more. However, many C-Level executives might now say: “but we’ve


Open Source Storage Client for various platforms

Since there are many cloud providers out there and I often run into the problem of switching between different platforms (such as Google AppEngine, Amazon S3, …), I have decided to write a single client that will work with all of these platforms, or at least as many as possible. I’ve created a project on Google Code here and I will start by writing a first draft of the interfaces. As a first step, I will include Amazon S3. I hope that more people will join and help me create a great project 😉
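
To make the idea of a single client more concrete, here is a rough sketch of what a provider-neutral interface could look like. This is not the project's actual code: the names StorageClient, InMemoryStorageClient and copy_object, as well as the method signatures, are assumptions made up for illustration. The in-memory backend only stands in for a real provider; an Amazon S3 backend would implement the same four methods on top of the S3 API.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class StorageClient(ABC):
    """Hypothetical provider-neutral interface (illustration only)."""

    @abstractmethod
    def put(self, container: str, key: str, data: bytes) -> None:
        """Store data under key in the given container (bucket)."""

    @abstractmethod
    def get(self, container: str, key: str) -> bytes:
        """Return the data stored under key, or raise KeyError."""

    @abstractmethod
    def delete(self, container: str, key: str) -> None:
        """Remove key from the container; a missing key is ignored."""

    @abstractmethod
    def list_keys(self, container: str, prefix: str = "") -> List[str]:
        """Return all keys in the container that start with prefix, sorted."""


class InMemoryStorageClient(StorageClient):
    """Stand-in backend that keeps everything in a dict (assumption, for testing)."""

    def __init__(self) -> None:
        self._containers: Dict[str, Dict[str, bytes]] = {}

    def put(self, container: str, key: str, data: bytes) -> None:
        self._containers.setdefault(container, {})[key] = bytes(data)

    def get(self, container: str, key: str) -> bytes:
        try:
            return self._containers[container][key]
        except KeyError:
            raise KeyError(f"{container}/{key} not found") from None

    def delete(self, container: str, key: str) -> None:
        self._containers.get(container, {}).pop(key, None)

    def list_keys(self, container: str, prefix: str = "") -> List[str]:
        keys = self._containers.get(container, {})
        return sorted(k for k in keys if k.startswith(prefix))


def copy_object(src: StorageClient, dst: StorageClient, container: str, key: str) -> None:
    """Copy one object between two backends using only the shared interface."""
    dst.put(container, key, src.get(container, key))


if __name__ == "__main__":
    old_provider = InMemoryStorageClient()
    new_provider = InMemoryStorageClient()
    old_provider.put("backups", "notes/todo.txt", b"hello")
    copy_object(old_provider, new_provider, "backups", "notes/todo.txt")
    print(new_provider.list_keys("backups"))
```

The point of the sketch is that code such as copy_object only depends on the interface, so switching platforms would mean swapping the backend object and nothing else.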