Data science is often treated as something mystical: only a few people understand it, and finding people who can do it is very hard. The skill gap is everywhere, and companies struggle to staff their projects. At the same time, most companies want to become “data driven” and therefore need these skills available.
I think we are currently going about this somewhat wrong: we need to enable more people to do what data scientists do, without requiring them to work on complex algorithms. This is where “self-service analytics” comes into play: giving business users more power and enabling them to do “data science” with easy tools.
In an ideal world, each business user would have some basic data capabilities and would be fully capable of producing her own insights, driving all decisions with data rather than gut feelings. There are already some tools out there that enable exactly that: self-service analytics. This would also mean that processes in companies have to shift a lot, away from traditional processes and data ownership. Self-service analytics has several goals:
- Reducing the FTE (full-time equivalent) input for data science. At the moment, we need data scientists to do the job. However, those people aren’t available on the market at scale and are very hard to find. This shortage causes several problems when doing data science.
- Reducing the TTM (time to market). If every business question required the help of a data scientist, every question would become a project that takes weeks. Decisions often need to be made fast; otherwise they might no longer be relevant.
Above, I was writing about the “ideal world”. Now you might ask what the business reality out there looks like and what needs to be done to get there. Well, it is easier said than done. Basically, there are some organisational and technical measures that need to be applied:
- No silos. People can only work with data if they have a full view of all available data. There should be no “hidden” data, and everyone in the company should be able to check data for integrity. Knowledge means power, and if one unit possesses all the data, it is very powerful. Therefore, data should be “free” within the company.
- Self-service Data access. People and business units in the company should be capable of accessing data in an easy (and self-service) manner. It must be easy for them to find, search, retrieve and visualize data.
- Data thinking and mindset. Everyone in the company, from top managers to business users, needs to have a data-oriented mindset. This means that they should base their daily decisions on data rather than “gut feelings”. They should challenge their decisions with verifiable evidence and data.
All of this needs some technical enablers:
- Governance, Metadata Management and Data Catalogs: I keep repeating myself, but as long as these elementary things aren’t solved, the points above are impossible to reach. Most companies only do governance to the extent required by law and regulation, but they should do much more than that in order to enable a self-service environment.
- Data Abstraction / Virtualisation: This is one of the key things for enabling easy data access. All data sources should be reachable through one easy interface, ideally with a SQL-like feel. This gives business users an easy tool to access all data, not just parts of it.
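To make the idea of a SQL-like abstraction layer a bit more concrete, here is a minimal sketch in Python. It is only an illustration under several assumptions: SQLite's in-memory databases stand in for two separate source systems, and the names `crm`, `erp`, `customers`, `orders` and `customer_revenue` are hypothetical, as is the sample data. A real virtualisation product would connect to actual source systems, but the principle is the same: the business user queries one view and does not need to know where the data lives.

```python
import sqlite3


def build_virtual_layer() -> sqlite3.Connection:
    """Create two stand-in source systems and one unified view on top of them."""
    conn = sqlite3.connect(":memory:")

    # Assumption: each attached in-memory database plays the role of a
    # separate source system (a hypothetical CRM and a hypothetical ERP).
    conn.execute("ATTACH DATABASE ':memory:' AS crm")
    conn.execute("ATTACH DATABASE ':memory:' AS erp")

    conn.execute("CREATE TABLE crm.customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE erp.orders (customer_id INTEGER, amount REAL)")

    # Made-up sample rows, purely for illustration.
    conn.executemany(
        "INSERT INTO crm.customers VALUES (?, ?)",
        [(1, "Acme"), (2, "Globex")],
    )
    conn.executemany(
        "INSERT INTO erp.orders VALUES (?, ?)",
        [(1, 100.0), (1, 50.0), (2, 75.0)],
    )

    # The abstraction layer: a temporary view that joins both sources.
    # In SQLite, only TEMP views may reference tables in attached databases.
    conn.execute(
        """
        CREATE TEMP VIEW customer_revenue AS
        SELECT c.name AS customer, SUM(o.amount) AS revenue
        FROM crm.customers AS c
        JOIN erp.orders AS o ON o.customer_id = c.id
        GROUP BY c.name
        """
    )
    return conn


def revenue_by_customer(conn: sqlite3.Connection) -> list:
    """What a business user would run: plain SQL against a single view."""
    query = "SELECT customer, revenue FROM customer_revenue ORDER BY customer"
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for customer, revenue in revenue_by_customer(build_virtual_layer()):
        print(customer, revenue)
```

The business user only ever sees `customer_revenue`; the joins across source systems stay hidden behind the view and can be maintained centrally.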
You might now think that data scientists will be out of a job. I would argue the contrary. Self-service analytics isn’t made to handle the complex things; it is made for quick insights and for showing that a business hypothesis might work. Based on this, many more questions will arise and thus create more work for data scientists. Also, achieving self-service analytics will lead to a lot of work for data engineers, who ultimately have to integrate that data.