This is the last part of the interview with Mario Szpuszta, who works with Microsoft as Technical Evangelist for Windows Azure. Mario’s answers are stated with [MS], the questions are marked with [MMH]. The Interview is divided into several parts and will be published over the next weeks.
[MMH] Since you do a lot in Europe and your main partners are here as well, what problems do you see in Europe with Cloud Computing?
[MS] I think in Europe you can most often bring it down to legal and compliance. Everyone brings that to the table and very often uses it as an argument for not going to the cloud.
[MMH] Do you think that some of these problems will go away over time?
[MS] Well, I think it will get easier, and it has already started getting easier! Cloud vendors are investing a lot in certifications and the like to make sure they are more compliant with regional data regulation and compliance policies. For example, on Azure we recently finalized the ISO 27001 certification for our core services. There were some recent announcements on SSAE 16 (the successor of SAS 70) and even HIPAA. The best way to fully understand those is to take a look at the Windows Azure Trust Center http://www.windowsazure.com/en-us/support/trust-center/.
All of these steps also make it easier to drive discussions on cloud in Europe…
[MMH] Now let’s talk a little more about the technical details. I know you are more passionate about the technology. So could you give us some basic design considerations for software architecture with distributed systems?
[MS] We could fill a whole interview just with that topic:)
So I need to be short and precise. First of all when it comes to web apps and web services I think most people should start with simple yet effective things. Still I see so many stateful apps and services. It is really hard to scale with those in a load balanced environment. So the first step in my opinion is to make sure that you get to a stateless design and implementation or at least outsource state into a separate state server or cache, for example. In my opinion that is the first big thing to make sure it’s in the blood of your application. That way you really scale across machines and can improve your performance simply by adding additional servers with your bits deployed.
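To illustrate the stateless approach described above, here is a minimal Python sketch that is not part of the interview. It assumes a hypothetical `SessionStore` class standing in for an external state server or cache, and a hypothetical `WebNode` class standing in for one web server instance. The in-memory dictionary is only a placeholder; in a real deployment the store would be a separate service reachable from every node.

```python
from dataclasses import dataclass, field


@dataclass
class SessionStore:
    """Hypothetical stand-in for an external state server or cache."""
    _data: dict = field(default_factory=dict)

    def get(self, session_id: str) -> dict:
        # Return the stored state, or an empty state for unknown sessions.
        return self._data.get(session_id, {})

    def put(self, session_id: str, state: dict) -> None:
        self._data[session_id] = state


class WebNode:
    """Hypothetical web server instance that keeps no per-user state itself."""

    def __init__(self, name: str, store: SessionStore) -> None:
        self.name = name
        self.store = store

    def add_to_cart(self, session_id: str, item: str) -> list:
        # Load state from the shared store, change it, and write it back.
        state = self.store.get(session_id)
        cart = state.get("cart", []) + [item]
        self.store.put(session_id, {**state, "cart": cart})
        return cart


store = SessionStore()
nodes = [WebNode("node-a", store), WebNode("node-b", store)]

# A load balancer may route consecutive requests of one user to different nodes.
first = nodes[0].add_to_cart("session-42", "book")
second = nodes[1].add_to_cart("session-42", "pen")
print(second)
```

Because neither node holds the cart in its own memory, either node can serve the next request, and adding a third node requires no change to the request handling.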
Another design consideration is trying to think and design in a more asynchronous fashion. Leverage queues and outsource complex tasks to background processes whenever possible. That really helps boost the perceived performance of your application. And again, it truly helps to distribute load effectively across multiple nodes in your deployment.
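The queue-based pattern described above can be sketched in a few lines of Python. This is an illustration added to the interview, not Mario's code: it uses the standard library's in-process `queue.Queue` and a thread as a stand-in for a durable cloud queue and a separate worker process, and the names `handle_request`, `worker`, and `results` are hypothetical.

```python
import queue
import threading

tasks = queue.Queue()   # stand-in for a durable message queue
results = {}            # stand-in for wherever finished work is stored


def worker() -> None:
    """Background worker: pulls jobs off the queue and processes them."""
    while True:
        job = tasks.get()
        if job is None:              # sentinel value: shut the worker down
            tasks.task_done()
            break
        job_id, payload = job
        results[job_id] = payload.upper()   # placeholder for expensive work
        tasks.task_done()


def handle_request(job_id: str, payload: str) -> dict:
    """Front-end handler: enqueue the work and answer immediately."""
    tasks.put((job_id, payload))
    return {"status": "accepted", "job_id": job_id}


thread = threading.Thread(target=worker, daemon=True)
thread.start()

response = handle_request("job-1", "generate report")
print(response)          # the caller gets an answer before the work is done

tasks.join()             # wait until the queued job has been processed
tasks.put(None)          # ask the worker to stop
thread.join()
print(results["job-1"])
```

The handler returns as soon as the job is queued, which is where the improvement in perceived performance comes from. With a real queue service, more workers could be added on other machines without touching the handler.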
Distribution of load in your app and web server tiers is fine, but if you’re still running on one database in the backend, that is going to become your bottleneck and can destroy all the efforts you’ve made on the tiers above. So you should think about distributing load by partitioning/sharding your data across multiple databases. When it comes to scalability, having many small databases with load distributed across all of them (running on different servers, of course) is way more scalable than having one really big database that needs to deal with the whole load.
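One simple way to picture partitioning/sharding is hash-based routing: each record key is hashed, and the hash decides which database holds that record. The following Python sketch is an added illustration under stated assumptions; the shard names in `SHARDS` and the function `shard_for` are hypothetical, and real systems often use a lookup table or consistent hashing instead of a plain modulo.

```python
import hashlib
from collections import Counter

# Hypothetical database names; each would run on a different server.
SHARDS = ["customers-db-0", "customers-db-1", "customers-db-2", "customers-db-3"]


def shard_for(customer_id: str, shards=SHARDS) -> str:
    """Map a customer ID to one shard using a stable hash of the ID."""
    digest = hashlib.sha256(customer_id.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then pick a shard.
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


# The same customer always lands on the same shard.
print(shard_for("customer-1001"))

# Many customers spread across all shards.
distribution = Counter(shard_for(f"customer-{n}") for n in range(1000))
print(distribution)
```

A stable hash such as SHA-256 is used here instead of Python's built-in `hash()`, because the built-in string hash can differ between processes. One known limitation of the modulo approach is that changing the number of shards remaps most keys, so the shard count needs some planning up front.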
I think these are the practical things you can start thinking about immediately. Of course there are many other theories that are applied by the truly big internet companies such as Facebook and the like. Many of those large-scale, global players think about CAP and BASE instead of ACID transactions when it comes to writing back to the store. Just look at http://en.wikipedia.org/wiki/CAP_theorem if you want to learn more. I don’t cover them in detail because that’s (a) too complex and (b) in my opinion not relevant for most of the traditional ISVs, as it goes way too far for many of them. I think most of us take a big step forward by applying the principles I mentioned before: stateless, work in load balanced environments, distribute load across multiple databases and the like. These are practical, and most people can implement them sooner than they could completely rethink how they deal with transactions in their system. Of course, if you want to have millions of customers on a global basis with thousands of concurrent users, then you should think about CAP and BASE early rather than too late…
[MMH] That actually sounds like a huge effort to bring applications to Azure. How can people deal with that?
[MS] One simple thought I tend to follow: stay simple, be pragmatic and work on an architecture that is good enough for your business goals. If you want to address millions of customers, then you should think about all of the changes and principles I mentioned before sooner rather than later. But if you want to stay, let’s say, in your region, and your customer base should increase but not up to the millions, or your scenario is for specific target groups, then many of these CAP and BASE things are just over-engineering. As you can see – the decision on how far you need to go depends on your business goals and business plans;) And in the context of those I tend to stay pragmatic and simple…
[MMH] A final statement: what excites you most about Cloud Computing?
[MS] For me that is super-easy and comes down to one specific point: cloud computing and the principles that are being established with cloud computing bring business and technology closer together than I’ve ever seen before. Just to give you one example: from a pure technical point of view, in the past it was really hard, if not impossible, to differentiate an effective architecture from a less effective architecture. Of course I know many people will argue differently, but at the end of the day it’s very often all about opinions in the world of architecture. In the context of cloud I can do that much better: an effective architecture leads to lower monthly cost for operating an environment in the cloud as compared to a not so effective architecture. Of course that always has to be seen in the context of the business goals and is a bit simplified, but at the end of the day that’s what it is in my opinion. Breaking efficiency of architecture down to that level has been tremendously hard in the past – and now we’re moving in that direction. That excites me most!!