In one of my last posts, I wrote about the fact that Cloud is more PaaS/FaaS then IaaS already. In fact, IaaS doesn’t bring much value at all over traditional architectures. There still are some advantages, but they remain limited. If you want to go for a future-proove archtiecture, Analytics needs to be serverless analytics. In this article, I will explain why.
What is serverless analytics?
Just as similar with serverless technologies, serverless analytics also follows the same concept. Basically, the idea behind that is to significantly reduce the work on infrastructure and servers. Modern environments allow us to “only” bring the code and the cloud provider takes care about everything else. This is basically the dream of every developer. Do you know the statement “it works on my machine”? With serverless, this is ways easier. You only need to focus on the app itself, without any requirements on operating system and stack. Also, execution is task- or consumption-based. This means that eventually you only pay for what is used. If your service isn’t utilised, you don’t pay for it. You can also achieve this with IaaS, but with serverless it is part of the concept and not something you need to enable on.
With Analytics, we now also march towards the serverless approach. But why only now? Serverless is around for already some time? Well, if we look at the data analytics community, it always used to be a bit slower than the overall industry. When most tech stacks already migrated to the Cloud, analytics projects were still carried out with large Hadoop installations in the local data center. Also back then, the Cloud was already superior. However, a lot of people still insisted on it. Now, data analytics workloads are moving more and more into the Cloud.
What are the components of Serverless Analytics?
- Data Integration Tools: Most cloud providers provide easy to use tools to integrate data from different sources. A GUI makes the use of this easier.
- Data Governance: Data Catalogs and quality management tools are also often parts of any solution. This enables a ways better integration.
- Different Storage options: Basically, for serverless analytics, storage must always be decoupled from the analytics layer. Normally, there are different databases available. But most of the data is stored on object stores. Real-time data is consumed via a real-time engine.
- Data Science Labs: Data Scientists need to experiment with data. Major cloud providers have data science labs available, which enable this sort of work.
- API for integration: With the use of APIs, it is possible to bring back the results into production- or decision-making systems.
How is it different to Kubernetes or Docker?
At the moment, there is also a big discussion if Kubernetes or Docker will solve this job with Analytics. However, this again requires the usage of servers and thus increases the maintenance at some point. All cloud providers have different Kubernetes and Docker solutions available, which allows an easy migration later on. However, I would suggest to go immediately for serverless solutions and avoid the use of containers if avoidable.
What are the financial benefits?
It is challenging to measure the benefits. If the only comparison is price, then it is probably not the best way to do so. Serverless Analytics will greatly reduce the cost of maintaining your stack – this will go close to zero! The only thing you need to focus on from now on is your application(s) – and they should eventually produce value. Also, it is easier to measure IT on the business impact. You get a bill for the applications, not for maintaining a stack. If you run an analysis, you will get a quote for it and the business impact may or may not justify the investment.
If you want to learn more about Serverless Analytics, I can recommend you this tutorial. (Disclaimer: I am not affiliated with Udemy!)
Leave a Reply
Want to join the discussion?Feel free to contribute!