The three data sources

To get the most out of a data strategy in an enterprise, it helps to cluster the different types of users that work with data. All of them use data, but with different needs and demands. In my opinion, they differ mainly in their level of expertise. I see three different user types for data access within a company.

Figure: Three degrees of Data Access

The user types differ in how deeply they work with data and in how many of them there are. Let’s start with the lower part of the pyramid: business users.

Business Users

The first layer is the business users. These are users who need data for their daily decisions but are mostly consumers of it. They look at different reports to make decisions on their business topics. They could be in Marketing, Sales or Technology, depending on the company. These users start with pre-defined reports, but in the long run they tend to want customized reports, and self-service BI is a great fit for that. They are experienced in interpreting data for their business goals and in asking questions of their data, for example when reviewing the performance of a campaign or reading weekly or monthly sales reports. They create a huge load on the underlying systems without understanding the implementation and complexity underneath, and they don’t have to. From time to time, they start digging deeper into their data and thus become power users, our next level.

Power Users

Power users often emerge from business users. A power user is typically a person who is close to the business and understands the needs and processes around it. However, they also have a solid technical understanding (or gained it in the process of becoming power users). They have some level of SQL know-how or know the basics of other scripting tools. They often work with the business users (even in the same department) on solving business questions. They also work closely with Data Engineers on accessing and integrating new data sources, and they use self-service analytics tools to get a basic level of data science done. They aren’t data scientists, but they might move in this direction if they invest significant time in it. This brings us to the next level: the data scientists.

Data access for Data Scientists

This is the top level of our pyramid. Data scientists are the smallest group; there are far more business users and power users. However, they work on more challenging topics than the previous two. They work closely with power users and business users and might be in the same department, but not necessarily. They work with advanced tools such as R and Python, and they fine-tune the models the power users built with self-service analytics tools or translate the business questions raised by the business users into algorithms.

Often, those three groups develop in different directions. However, all of them need to work together, as a team, to make data projects a success. Data access also needs to incorporate role-based access controls.
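To make the idea of role-based access control a bit more concrete, here is a minimal sketch in Python. It assumes the three user types from this post as roles; the permission names and the mapping are made up for illustration and are not taken from any specific product.

```python
from enum import Enum, auto


class Permission(Enum):
    VIEW_REPORTS = auto()     # read pre-defined and customized reports
    RUN_SQL = auto()          # query curated tables directly
    ACCESS_RAW_DATA = auto()  # read raw, unaggregated data sets
    DEPLOY_MODELS = auto()    # push models towards production


# Hypothetical mapping: each level of the pyramid keeps the rights
# of the level below it and adds a few more.
ROLE_PERMISSIONS = {
    "business_user": {Permission.VIEW_REPORTS},
    "power_user": {Permission.VIEW_REPORTS, Permission.RUN_SQL},
    "data_scientist": {
        Permission.VIEW_REPORTS,
        Permission.RUN_SQL,
        Permission.ACCESS_RAW_DATA,
        Permission.DEPLOY_MODELS,
    },
}


def is_allowed(role: str, permission: Permission) -> bool:
    """Return True if the role holds the permission; unknown roles get nothing."""
    return permission in ROLE_PERMISSIONS.get(role, set())
```

In a real company, the roles and permissions would most likely live in the data platform or the identity management system rather than in code, but the principle stays the same: access is granted to roles, not to individual people.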

This post is part of the “Big Data for Business” tutorial. In this tutorial, I explain various aspects of handling data right within a company.

Honestly, data scientists are doing a great job. Literally, they are saving all industries from a strong decline. And these heroes are doing all of that alone. Alone? Not quite.

The Data Scientist needs the Data Engineer

There are some poor guys who support their success: the ones called Data Engineers. A huge majority of the tasks are carried out by these guys (and girls), whom hardly anyone talks about. All the fame seems to go to the data scientists, while the data engineers receive no credit.

I remember one of the many meetings I had with C-level executives. When I explained the structure of a team dealing with data, everyone in the boardroom agreed: “we need data scientists”. Then one of the executives asked: “But what are these data engineers about? Do we really need them, or could we have more data scientists instead?”

I kept on explaining and they accepted it. But I had the feeling that they still wanted to go with more Data Scientists than Engineers eventually. This comes from the trend and hype around data scientists. Everyone knows that they are important. But data-driven projects only succeed when a team with mixed skills and know-how comes together.

A Data Science team needs at least the same number of Data Engineers

In all data-driven projects I have seen so far, nothing would have worked without data engineers. They are relevant for many different things, but mainly, and in an ideal world, they work in close cooperation with data scientists. If a company’s data maturity is high, the data engineer prepares the data for the data scientist and then works with the data scientist again on putting the algorithm into production. I saw a lot of projects where the latter wasn’t working: the first step (data preparation) was successful, but the later step (automation) was never done.

But there are more roles involved: one role, which is rather a specialization of the data engineer, is the data system engineer. This is often not a dedicated role but is carried out by data engineers. Here, we are talking about infrastructure preparation and set-up for the data scientists or engineers. Another role is the data architect, who ensures a company-wide approach to data, and of course there are data owners and data stewards.

I have stated it several times, but it is worth stating over and over again: data science isn’t a one-(wo)man show, it is ALWAYS a team effort.

Another interesting article about the data science team setup can be found here.

Data itself, and Data Science especially, is one of the drivers of digitalisation. Many companies have experimented with Data Science over the last years and gained significant insights from it. Often, people dealing with statistics started to do this magic thing called data science. Technical units also used machine learning and the like to further improve their businesses. However, for many other units within traditional companies, all of this seems like magic, and dangerous. So how do you include others who don’t deal with the topic in detail, and thus de-mystify it? What does it take to become data driven?

How to become data driven

First of all, Machine Learning and Data Science aren’t the revolution. Units started implementing them in order to gain new insights and improve their business results. Often, however, they are also acquired via business projects from consulting companies. The newer and more complex a topic is, the higher the risk that people will object to it. The reasons are fear and misunderstanding or a lack of understanding.

When you are deep in the topic of data and data science, you might be treated like a star by some, mainly by those who think that you are a magician. However, you will also be rejected by others. Both are poisonous in my opinion. The first group will try to get very close to you and expects a lot. However, you are often not capable of meeting their expectations, and after a while they get frustrated because those expectations were far too high.

In corporate environments, it is very important to filter this group at the very beginning. You need to clearly state what they can expect and what they cannot. It is also important to tell them what they won’t get; saying “No” matters for them as well. Being transparent with this group is essential in order to keep them as close supporters in a growing environment. You will depend a lot on those people if you want to succeed. So be clear with them.

People fear digitalisation

The other group, which in digitalisation is by far the larger one, will meet you with fears and doubts, and it is highly important that you address them well. You can easily recognise people in this group because they are not open towards your topics. Some probably refuse them actively; others are less active and just poison the climate. But be aware: they usually don’t do it because they hate you for some reason.

They are just acting human: they are afraid, feel that they are not included or have other doubts about you and your unit. It is essential to work on a communication strategy for this group and to pro-actively include them. Bringing clarity and de-mystifying your topic in easy terms is vital. It is important to draw a lot of comparisons to your traditional business and keep it simple. Once you have gained their trust and interest, you can go much deeper into your topic and provide learning paths and skill development for those people.

If you succeed in that, you have created strong supporters who will come up with great ideas to improve your business even further. Keep in mind: just because you are in a “hot topic” like big data and data science and might be treated like a rock star by some, others are also great at what they do, and it all boils down to this: we are just humans.

No digitalisation without a data strategy

Digitalisation needs trust to succeed. If you fail to build trust and don’t include the human aspect, your digitalisation and data strategy is bound to fail, regardless of the budget and C-level support you might have for your initiative. So make sure to work on that, with high focus! Becoming data driven is the driver for digitalisation in your company!

Another article I like about data driven organisations can be found on Forbes.

Strategy by Nick Youngson CC BY-SA 3.0 Alpha Stock Images

Digitalisation has been a key driver for companies over the last 2 years. However, many companies forget that the oil for the digitalisation engine is data. Most companies have no data strategy in place, or at best a very blurry one. A lot of digitalisation strategies fail, often due to the lack of proper treatment and management of data. In this blog post, I will write about the most common errors I have seen so far. Disclaimer: I won’t offer answers for now, but I want to give you an insight into what you should probably avoid doing. The following steps help you to destroy your data strategy.

Step 1: Hire Data Scientists. Really: you need them

Being a Data Scientist is a damn sexy job. It is even considered to be the sexiest job of the 21st century. So why should you not have one? Or two or three? Don’t worry, just hire them. They do the magic and solve almost all of your problems around data. Don’t think about it, just do it. If you have no Data Scientist for your digitalisation strategy, it isn’t complete. Think about what they can or should do later.

In my experience, this has happened a lot in recent years. Only a few industries (e.g. banking) have experience with data scientists, as the role comes naturally to them. Over the last years I saw Data Scientists joining companies without a clear strategy. These Data Scientists then had to deal with severe issues:

  • Lack of data availability. Often, they have issues getting to the data. Long processes, siloed systems and commodity systems prevent them from doing so.
  • Poor data quality. Once they get to the data and want to start doing things with it, it becomes even more complex: no governance, no description of the data, poor overall quality.
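To illustrate what “poor overall quality” can look like in practice, here is a small sketch of a first quality check a data scientist might run on a new data set. It is only an assumption of how such a check could look: the field names and sample rows are made up, and real projects usually rely on dedicated profiling tools.

```python
from collections import Counter

# Made-up sample rows, standing in for an extract from a siloed system.
SAMPLE_ROWS = [
    {"customer_id": 1, "country": "AT"},
    {"customer_id": 2, "country": ""},
    {"customer_id": 2, "country": ""},
    {"customer_id": None, "country": "DE"},
]


def profile_quality(rows, required_fields):
    """Count missing required values and exact duplicate rows in a list of dicts."""
    missing = Counter()
    for row in rows:
        for field in required_fields:
            value = row.get(field)
            if value is None or value == "":
                missing[field] += 1
    # Two rows count as duplicates when all of their key/value pairs match.
    seen = Counter(
        tuple(sorted(row.items(), key=lambda item: item[0])) for row in rows
    )
    duplicates = sum(count - 1 for count in seen.values())
    return {"rows": len(rows), "missing": dict(missing), "duplicates": duplicates}


report = profile_quality(SAMPLE_ROWS, ["customer_id", "country"])
```

Even a simple report like this makes the problem visible early and gives the team something concrete to discuss with the owners of the source system.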

So, what most companies miss is the counterpart each data scientist needs: a Data Engineer. Without one, data scientists often achieve nothing.

But what I have described is actually an almost advanced state; often, companies hire data scientists (at high salaries!) and then just let them do BI tasks like reporting. I saw this often, and people got frustrated. Frustration led to them leaving their jobs after just a few months. The company gained no learnings and no business benefits from it. So it clearly failed.

Step 2: Deliver & Work in silence. Let nobody know what you are doing

Digitalisation is dangerous and disruptive. It will lead to major changes in companies. This is a fact, not fiction. And you don’t need science to figure that out. So why should you talk about it? Just do it, let other units continue doing their job and don’t disrupt them.

Digitalisation is a complex topic and humans by nature tend to interpret. They will interpret things about this topic so that they fit their comfort zone. This leads to different strategies and approaches, creating even more failed projects and a lot of uncertainty.

The approach here should be to communicate consistently within the company and to take away fear from the different units. Digitalisation is by nature disruptive, but do it with the people, not against them!

Step 3: Build even more silos to destroy the data strategy

Step 2 will most likely lead to different silos. A digital company should be capable of building its digital products, services and solutions on its own. There is always a high risk that different business units will create data silos. As a result, there will never be a holistic view of all of your data. Integration later on is tough and will burn a lot of money. For businesses, it is often a quick win to implement one solution or another, but integrating these solutions afterwards, especially when it comes to data, is very tricky.

A lot of companies have no 360-degree view of their data. This is because business units often confront IT departments with “we need this tool now, please integrate”. This leads to issues, since IT departments are often understaffed anyway. So a swamp is created in the IT landscape, leading to an even bigger swamp of data. Integration then never really happens as it is too expensive. Will you become digital with this? Clearly no.

Step 4: Build a sophisticated structure when the company isn’t sophisticated with this topic yet

Data Scientists tend to sit in business units. For a data driven enterprise, this is exactly how it should be. However, only a small percentage of companies are data driven. I would argue that traditional companies aren’t data driven, only the Facebooks, Googles and Amazons of our world are.

However, traditional companies now tend to copy this system, and business units hire data scientists, who are then disconnected from other units and only loosely connected via internal communities. A distributed data organisation only makes sense once the company has reached a high level of maturity. In my opinion, it needs to be steered from a central unit first. As maturity improves, it can be decentralised step by step and then put back fully into the business units.

One thing: put digitalisation very close to the CEO of the company. It needs to have some fire power as there will always be obstacles.

I’ve seen quite a lot of failures when it comes to where to place data units. In my opinion, it only makes sense in a technical unit or, if available, in the digitalisation unit. However, it should never be in business functions. If you put it there, you will definitely succeed in destroying the data strategy.

Step 5: Don’t invest in people to destroy your data strategy

Last but not least, never invest in people. Especially not in Data Scientists: they should be really happy to have a job with you, so why would you also invest in them and give them education?

This is also a challenge I see a lot in companies. They simply don’t treat their employees well, and those who are in high demand (like Data Scientists) then tend to leave fast. This is one of the key failures in data-driven strategies. Keeping people is key to a successful strategy, and a lot of companies don’t manage this well. Not investing in people is probably one of the most effective ways to destroy a data strategy.

Now it is about time to twist it around and destroy your competitors with data.

Working in Big Data is considered to be the job you simply have to go for. Some call it sexy, some call it the best job of the future. But what exactly is a Data Scientist? Is it someone you can simply hire from university, or is it more complicated? Definitely the latter.
When we think about a Data Scientist, we often say that the perfect Data Scientist is a kind of hybrid between a Statistician and a Computer Scientist. I think this needs to be redefined, since much more knowledge is necessary. A Data Scientist should also be good at analysing business cases and talking to line executives to understand the problem and model an ideal solution. Furthermore, extensive knowledge of current (international) law is necessary. In a recent study we did, we defined 5 major challenges.
The 5 topics are:

  • Big Data Business Developer: The person needs to know what questions to ask, how to cooperate with line of business (LOB) decision makers and must have good social skills to cooperate with all of them.
  • Big Data Technologist: In case your company isn’t using the cloud for Big Data Analytics, you also need to be into infrastructure. The person must know a lot about system infrastructure, distributed systems, datacenter design and operating systems. Furthermore, it is also important to know how to run your software. Hadoop doesn’t install itself and there is some maintenance necessary.
  • Big Data Analyst: This is the fun part; here it is all about writing your queries, running Hadoop jobs, doing fancy MapReduce queries and so on! However, the person should know what to analyse and how to implement such algorithms. It is also about machine learning and more advanced topics.
  • Big Data Developer: Here it is more about writing extensions, add-ons and other stuff. It is also about distributed programming, which isn’t the easiest part itself.
  • Big Data Artist: Got the hardware/datacenter right? Know what to analyse? Wrote the algorithms? What about presenting them to your management? Exactly! This is also necessary! You simply shouldn’t forget about that. The best data is worth nothing if nobody is interested in it because of poor presentation. It is also necessary to know how to present your data.

As you can see, it is very hard to become a data scientist. Things are not as easy as they might seem. The Data Scientist should be a nerd in each of these fields, so the person should be some kind of a “super nerd”. This might be the super hero of the future.
Most likely, you won’t find one person who is good at all of these fields. Therefore, it is necessary to build an effective team.
Header Image Copyright: Chase Elliott Clark