Posts

As described in an earlier post, another concept for cloud computing architectures is the concept of roles. But what exactly is a role? A role is something that shares common functionality. If we talk about a web shop, this web shop might consist of different roles such as:

  • Product presentation role
  • Shopping cart role
  • Order processing workflow role

An obvious first question is: isn’t this just SOA (service-oriented architecture)? The answer is no. It is very similar to SOA and derives many of its concepts from SOA, but it is rather a “SOA of SOA”. A role might contain one or more services, but a service is covered by exactly one role. Let us think about the shopping cart: an architecture for the shopping cart might hold different services. The shopping cart role would contain the following services:

  • Shopping Cart modification Service
  • Checkout Service
  • Payment Service
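The relationship between roles and services described above can be sketched in a few lines of Python. This is only an illustration: the `Role` and `Platform` classes and the service names are made up for this sketch and do not come from any particular framework. The one rule it enforces is that a service is covered by exactly one role.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Role:
    """A role groups services with similar tasks and scales as one unit."""
    name: str
    services: List[str] = field(default_factory=list)


class Platform:
    """Registry that enforces: a service is covered by exactly one role."""

    def __init__(self) -> None:
        self.roles: Dict[str, Role] = {}
        self._owner: Dict[str, str] = {}  # service name -> role name

    def add_service(self, role_name: str, service: str) -> None:
        # Reject a service that is already covered by a role.
        if service in self._owner:
            raise ValueError(
                f"{service!r} already belongs to role {self._owner[service]!r}"
            )
        role = self.roles.setdefault(role_name, Role(role_name))
        role.services.append(service)
        self._owner[service] = role_name

    def role_of(self, service: str) -> str:
        return self._owner[service]


# Hypothetical web shop: one role with three services, one with a single service.
shop = Platform()
for svc in ("cart-modification", "checkout", "payment"):
    shop.add_service("shopping-cart", svc)
shop.add_service("product-presentation", "catalogue")
```

Trying to register the payment service under a second role raises a `ValueError`, which mirrors the "exactly one role" rule.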

A role collects different services and makes it easier for us to understand how an application is built. With SOA, we can go into more detail and give a better overview to those who have to implement the platform in the end. Different roles should be able to scale independently of each other to achieve great elasticity in the cloud. Furthermore, a role should only be in charge of similar tasks. If we think about the shopping cart again, we can definitely say that the services in that role are somewhat similar. The illustration below shows how different roles can be built.

Application Role Separation in the cloud

Separating the platform into different roles gives us more flexibility to build a more reliable platform. If the platform experiences an outage, less important roles might not receive any compute, but the business-critical roles will still receive the available resources. Therefore, it is possible to build a matrix of how important each role is and adjust scaling/elasticity according to this matrix.
Some roles are more important than others. In the web shop sample, an important role is the shopping cart, since it is the place where the platform generates revenue. Not so important (in the case of an outage) is the comment functionality. If there is a shortage of resources or some serious issue with our platform, the platform can choose to disable the comment role and use the freed resources for the shopping cart. However, this approach requires a smart system that can deal with the prioritisation of roles.
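A very small version of such a prioritisation can be sketched as follows. The priority numbers, demand figures and the `allocate` function are assumptions made for this illustration; a real platform would derive them from monitoring data and business rules.

```python
from typing import Dict, List, Tuple


def allocate(capacity: int, roles: List[Tuple[str, int, int]]) -> Dict[str, int]:
    """Give capacity to roles in priority order.

    roles: (name, priority, demand), where a lower priority number means
    more business critical. Returns the granted units per role; a role
    that receives 0 is effectively disabled.
    """
    granted: Dict[str, int] = {}
    remaining = capacity
    for name, _priority, demand in sorted(roles, key=lambda r: r[1]):
        granted[name] = min(demand, remaining)
        remaining -= granted[name]
    return granted


# Hypothetical priority matrix for the web shop sample.
ROLES = [
    ("shopping-cart", 1, 6),
    ("product-presentation", 2, 4),
    ("comments", 3, 3),
]

normal = allocate(13, ROLES)    # enough capacity for every role
degraded = allocate(8, ROLES)   # shortage: the comment role gets nothing
```

With 8 units available, the shopping cart still receives its full 6 units, product presentation gets the remaining 2, and the comment role is switched off.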
Picture Copyright by Moyan Brenn

As described in our previous post on cloud computing architectures, I would like to continue with the architectural approach given with Archimate. Archimate comes with two more layers, the application and the technology layer. The Application layer has some core concepts we can use for cloud computing architectures:

  • Services. Services are provided by our platform and expose an endpoint for other applications or services. This is often implemented as a web service. A service usually exposes data to consumers or makes operations available. A typical example is a customer service that exposes customer data.
  • Interfaces. Interfaces are built for interoperability between different services or applications. A service typically provides an interface to its consumers. Consumers use the interface, but the implementation is done on the service side.
  • Application Component. An application component is a software application that provides some way of interaction, possibly via a service. We could use an application component to retrieve data, for example.
  • Data Object. This describes an object that contains data. This is typically a row in a database.

Of course, this list is not complete. For a more detailed description please refer to the Archimate Documentation. In the following sample, we will model a very simple application layer case.

Archimate application layer concepts

In the above sample, I modelled a very abstract customer service. This service is used by a shopping system. The service itself queries Salesforce.com to retrieve the data via the Salesforce API. Salesforce.com is represented as an application component in our architectural sample. The customer service itself uses some data, which is contained in the customer data object.
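To make the four application layer concepts more tangible, here is a minimal sketch of the same sample in Python. All class and function names are invented for this illustration. In particular, `CrmComponent` is an in-memory stand-in and does not call the real Salesforce API.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Protocol


@dataclass(frozen=True)
class Customer:
    """Data object: the data the customer service works with."""
    customer_id: str
    name: str


class CustomerInterface(Protocol):
    """Interface: what consumers such as the shopping system rely on."""

    def get_customer(self, customer_id: str) -> Optional[Customer]: ...


class CrmComponent:
    """Application component: stand-in for an external CRM system.

    A real implementation would call the CRM's API; here it is an
    in-memory dictionary so that the sketch runs on its own.
    """

    def __init__(self, records: Dict[str, str]) -> None:
        self._records = records

    def lookup(self, customer_id: str) -> Optional[str]:
        return self._records.get(customer_id)


class CustomerService:
    """Service: implements the interface by querying the component."""

    def __init__(self, crm: CrmComponent) -> None:
        self._crm = crm

    def get_customer(self, customer_id: str) -> Optional[Customer]:
        name = self._crm.lookup(customer_id)
        return Customer(customer_id, name) if name is not None else None


def greeting(service: CustomerInterface, customer_id: str) -> str:
    """Consumer: depends on the interface only, not on the implementation."""
    customer = service.get_customer(customer_id)
    return f"Hello, {customer.name}!" if customer else "Hello, guest!"


service = CustomerService(CrmComponent({"c-1": "Ada"}))
```

Because `greeting` only knows the interface, the CRM behind the service could be swapped without touching the shopping system.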
The lowest layer for Cloud architectures is the technology layer. In this layer, we can model different instances – or let’s say roles (I will describe the concept of roles later on). This is basically useful when using Infrastructure as a Service (IaaS) or Platform as a Service (PaaS). With the technology layer we model the overall components used by our platform in an infrastructural way. Core concepts are:

  • Nodes. A node is normally an instance in Archimate, e.g. a single server. However, we use the node for a complete role. Imagine Netflix modelling each of their servers as an individual node: they might end up with a model so complex that nobody would understand it. An individual role (such as the front ends) is represented by a node.
  • System Software. System software is software that fulfils elementary tasks. Examples of system software are the Apache web server, PHP or MySQL. System software usually runs on a node. It is also possible to have several pieces of system software installed on one node.
  • Network. Individual Roles communicate with each other – either within the Role or with other roles. Therefore, some type of transportation must be available. This is represented by the network.
  • Communication Path. The communication path is another way of communication. This is not the network itself but it is a more abstract way of communication. Normally, the communication path could be represented by a message queue.

In the following sample, I tried to model a very basic architecture for an IaaS-Platform based on the concepts introduced above.

Archimate technology layer concepts for cloud computing architectures

In the sample above, we have three major roles for our cloud platform architecture: front-end, database and backend. The front-end consists of one or more (if load balanced) virtual instances running Linux. Apache and PHP are installed on each instance. This is also called a stack, and the abbreviation for this stack is “LAP”. Normally, we would also use MySQL, which would make it “LAMP” (Linux, Apache, MySQL, PHP). However, since our architecture should be scalable, I decided to move the database to another role. This gives us the possibility to scale the front-end and the database independently. On the database role, we install MySQL and some sort of reporting services for business intelligence. These two roles are connected via a LAN. The third role is a backend role. On this role, a custom application using OpenXML is installed. This backend role basically processes data and returns documents. The communication isn’t done directly but via a message queue. I will explain in a later post why I don’t communicate directly.
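The queue-based communication between the front-end and the backend role can be sketched with Python's standard library. This is a single-process illustration under stated assumptions: `queue.Queue` stands in for a real message queue service, and the function names and the "document" strings are made up.

```python
import queue
import threading

STOP = object()  # sentinel telling the worker to shut down


def backend_worker(jobs: queue.Queue, results: queue.Queue) -> None:
    """Backend role: take jobs from the queue and return 'documents'."""
    while True:
        job = jobs.get()
        if job is STOP:
            break
        # Placeholder for the real document generation.
        results.put(f"document for order {job}")


def submit_orders(order_ids):
    """Front-end role: enqueue work instead of calling the backend directly."""
    jobs, results = queue.Queue(), queue.Queue()
    worker = threading.Thread(target=backend_worker, args=(jobs, results))
    worker.start()
    for order_id in order_ids:
        jobs.put(order_id)
    jobs.put(STOP)
    worker.join()
    # One worker and a FIFO queue, so results arrive in submission order.
    return [results.get() for _ in order_ids]


documents = submit_orders([1, 2, 3])
```

In this sketch the front-end never holds a reference to the backend worker's logic; it only knows the queue.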

As described in the last post, we will now take a look at how it is possible to architect cloud computing applications with the support of Archimate. Archimate is a standard of the Open Group, and its goal is to support great enterprise architecture. We will discuss each layer and figure out how we can apply it to cloud computing architectures.
A core concept of Archimate is the view. Views in Archimate allow the architect to show different levels of detail to different stakeholders. We won’t focus on that now. If you are interested, you can refer to the Archimate documentation. We also discussed the different layers: business layer, application layer and technology layer. To model each layer, you can use different tools. I personally prefer “Archi” for this purpose. In the screenshot below, you can see the main window of Archi.

Archimate design with Archi

Let us now start with some business layer concepts. In the image below you can find the business layer concepts available with Archimate.
Archimate Business Concepts

Actors are individuals or departments that interact with the application. It is very similar to the UML Actor. We can also describe a Role, which is similar to an Actor but with a very specific “job”. If we build a customer support system, the Actor would be a “Support Engineer” but the Role would be “Windows Support Role”. An Actor might have different Roles, since the Role is dedicated and specialised.
Archimate Role and Actor

Another important concept is the Process. In a business model, processes are key concepts since they give us an overview on the interactions associated with the Application or a Service. Processes also require triggers, which are modelled with the Event concept. A process may be realised by a Service.
 
Archimate Processes and Events

These Concepts are only some of the concepts available in Archimate. However, this is only a short tutorial. The next post will be about the application layer and technology layer in Archimate.

To find out which cloud architectures we want to use, it is necessary to know the characteristics of distributed systems. Another important aspect we should look at is the aims of cloud systems. If we know what the aims and characteristics are, it might be easier to build a software architecture for the cloud. However, each implementation requires a different approach, since every system is different.
Let us start with the characteristics of cloud systems.
One characteristic of a cloud computing system is that it is the sum of many systems (instances) but feels like a single system to the end user. This means that the distribution is transparent to end users, who typically don’t see how many servers are involved. End users might even believe that everything is running on a single server, although it is not. End users should not be confronted with the challenges architects and developers have to face when building a SaaS application. Just imagine Facebook, Twitter or Flickr: end users don’t really care about distribution, they simply want to use the platform. This means that our application has to look like a single piece of software and hide the challenges we face with distribution.
A cloud computing system typically consists of different components, such as instances and services. If we build a distributed system, we have to utilise different instances with different roles. Some roles might host the website, other roles might do some background work. The different components are often implemented with SOA (service-oriented architecture), which is described later.
Different components as described above have to communicate with each other. Communication can be done with various technologies such as messaging. This requires the use of a message buffer.
Users of a cloud computing system can use the system in a transparent and consistent way. If a user travels, the system should represent the same state as it did before traveling. If the user works on a presentation for a client and saves the presentation, it should look absolutely the same when the user arrives at the customer’s site. Consistency is a challenge we often have to deal with in the Cloud and we will focus on that in a later post.
Cloud computing systems have to be extensible. If we have a SaaS application, it might serve 90% of all use cases, but what happens with the 10% it can’t serve? It should be extended by additional services. If we look at a very important platform, Salesforce, it can be extended by a PaaS platform named “force.com”. To achieve extensibility, the platform must offer well-defined interfaces for potential consumers. These interfaces are typically served via an API.
The last characteristic we want to focus on is that distributed systems are often using a middleware. This middleware abstracts things from the operating system and serves interfaces to handle common things. We can see this with PaaS, where the middleware spans a large number of virtual instances. However, we simply don’t care about the operating system that is under the middleware.

As referenced in the last post, the book “The Art of Scalability” provides a very good approach to cloud architectures and to what has to be done. The authors also define some other great principles, resulting in 12 design principles for software architectures. Most of the principles are very interesting for cloud computing as well.
1. N+1 Design

There should always be at least one additional system. Normally you should have three systems. The premise: one system for me, one system for the customer and one system for errors.

2. Design for Rollback

It should always be possible to downgrade an application. Rolling a system back should be easy in case of an error.

3. Design to be Disabled

It must be possible to disable a system or parts of it if there are problems. However, the overall system should not be affected by this.

4. Design to be Monitored

Not only I/O or CPU performance is important; it is more about “intelligent” monitoring. We want to know the following:

  • When does the system act differently than normal?
  • What future load will I have?

5. Design for Multiple Live Sites

Backup and recovery centres should also carry part of the load. If you have additional datacenters, you should use them for load, not just for recovery.

6. Use Mature Technologies

Do not use beta or CTP versions; use versions that are stable.

7. Asynchronous Design

Asynchronous operations tend to be more fault tolerant under load than synchronous ones. However, they are harder to test.

8. Stateless Systems

Stateful systems affect system performance in a negative way and make scalability harder.

9. Scale Out, Not Up!

Scaling should be intelligent. Build a smart architecture that reduces load! Adding more servers doesn’t always solve the problem!

10. Design for at Least Two Axes of Scale

Divide systems into different parts. Scale horizontally and vertically!

11. Buy When Non Core

Only build in-house what is a core competency of the company. If it is not a core competency, buy it!

12. Use Commodity Hardware

High-end hardware is more expensive.
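As an illustration of principle 4, the question "when does the system act differently than normal?" can be approximated by comparing a new measurement against a baseline of earlier ones. The following is a minimal sketch under that assumption. The three-standard-deviations threshold and the sample numbers are arbitrary choices for this example and are not taken from the book.

```python
from statistics import mean, stdev
from typing import Sequence


def is_abnormal(history: Sequence[float], value: float, threshold: float = 3.0) -> bool:
    """Flag a value lying more than `threshold` standard deviations
    away from the mean of the historical measurements."""
    if len(history) < 2:
        return False  # not enough data to form a baseline
    spread = stdev(history)
    if spread == 0:
        # A perfectly flat baseline: any different value is a deviation.
        return value != history[0]
    return abs(value - mean(history)) / spread > threshold


# Hypothetical requests per second observed during normal operation.
baseline = [100, 104, 98, 101, 97, 103, 99, 102]

quiet = is_abnormal(baseline, 103)   # within the usual range
spike = is_abnormal(baseline, 180)   # far outside the usual range
```

A production monitoring system would also have to account for daily and weekly patterns, which this sketch ignores.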

The book “The Art of Scalability” describes a very interesting approach to software architectures for distributed systems. A key challenge is that a software architecture should be smart. But what exactly is “smart”? The book uses “smart” in a different sense, with the letters capitalised: we talk about a SMART architecture. Each of the letters represents an individual challenge for software architectures:

  • Specific
  • Measurable
  • Achievable
  • Realistic
  • Testable

Specific: The architecture should solve a problem. It doesn’t need to be the “coolest” one.
Measurable: Application basics must be measurable. Sample: the service must return the data within 1 second if 1 million people access it. WRONG: the service must be fast if a lot of people access it.
Achievable: The goals set by the architecture must be achievable. It makes no sense if the architecture allows everything but can’t be built by the developers because it is too complex.
Realistic: It is necessary to use the potential within an organisation. If the developers in a company use Java, it makes no sense to use another technology, since they might fail.
Testable: Results must be testable.
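The difference between a measurable and a vague requirement becomes clear once we try to write a check for it. The sketch below tests the "within 1 second" sample. The function names, the 95% quantile and the `fake_service` stand-in are assumptions for this illustration, and the sketch does not generate the load of 1 million users, which would need a dedicated load testing tool.

```python
import time
from typing import Callable, List


def measure_latencies(call: Callable[[], object], runs: int) -> List[float]:
    """Call `call` `runs` times and return each duration in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        call()
        durations.append(time.perf_counter() - start)
    return durations


def meets_requirement(durations: List[float], limit_s: float,
                      quantile: float = 0.95) -> bool:
    """True if the given quantile of the measured durations is within the limit."""
    if not durations:
        raise ValueError("no measurements to evaluate")
    ordered = sorted(durations)
    index = min(len(ordered) - 1, int(quantile * len(ordered)))
    return ordered[index] <= limit_s


def fake_service() -> str:
    """Hypothetical stand-in for the real service call."""
    return "data"


ok = meets_requirement(measure_latencies(fake_service, runs=100), limit_s=1.0)
```

"The service must be fast" cannot be turned into such a check, because there is no limit to compare against.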

Whenever we think about scaling our applications, we basically think about building a software architecture that supports scaling and selecting a technology such as an IaaS or PaaS platform to achieve that goal. But scaling is more complicated than it seems. It is not only something that needs to be achieved in technology or software architectures.
An important question is, what needs to be scaled in an enterprise. It is not only the software architecture but also other factors:

  • Websites
  • Applications
  • Teams
  • Organisations

When talking about scaling organizations, some questions may arise:

  • How easy is it to add a Person to a Company/Team or remove a Person?
  • How can the work force be measured within the organisational structure?
  • What effort has to be made if a new Person is added to a company?
  • Does the company structure allow rapid organisational growth?

The Output of a team is not proportional to the number of people in a team. This is similar with Applications!
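One commonly cited reason for this, which is an addition for illustration and not something stated in the post, is that the number of possible person-to-person communication paths grows much faster than the head count: a team of n people has n(n-1)/2 pairs. A short calculation shows the effect:

```python
def communication_paths(team_size: int) -> int:
    """Number of distinct person-to-person pairs in a team."""
    return team_size * (team_size - 1) // 2


# Team sizes chosen for illustration only.
for size in (5, 50, 1000):
    print(size, communication_paths(size))
```

A team of 5 has 10 such pairs, a team of 50 has 1,225, and a team of 1,000 has 499,500.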
 

Scaling teams in a software project

To achieve scalability, it is not only necessary to build an architecture that is made for scale but also to think about how to scale a team. Imagine you start with 5 employees and your start-up becomes super-famous. Your team might grow to 1,000 employees in some years. You need to think about how to solve this problem.
The following picture demonstrates how scaling problems might start in a company:
Productivity inhibitors in an IT project


For cloud solutions, scalability and elasticity are key requirements. Private cloud solutions should support them as well, even if scalability and elasticity might have a lower upper limit there than in the public cloud. Scaling applications means that we can add a new instance of Linux or a Windows Server. Elasticity is something “more advanced” than that, as described by Reuven Cohen, an opinion leader in cloud computing (Cohen, 2010). Cohen describes scalability as the possibility to “grow to the demands of the users on a platform”, whereas he states that elasticity is something that reflects real-time conditions. A platform might have millions of users, but if this platform is only available in the United States, there might be significantly less load on the servers during the night. The load will be much higher at peak times, and elasticity means that unnecessary instances are shut down if the load is lower and that new instances are started if the load is higher. Owens (2010) describes elasticity as “the golden nugget of Cloud Computing” and a key driver for moving to cloud environments. A definition very similar to Cohen’s is also provided by the National Institute of Standards and Technology (Mell & Grance, 2011):
 

“Capabilities can be elastically provisioned and released, in some cases automatically, to scale rapidly outward and inward commensurate with demand. To the consumer, the capabilities available for provisioning often appear to be unlimited and can be appropriated in any quantity at any time.” (Mell & Grance, 2011)

 

Elasticity is something that can lower costs, but requires a lot of up-front work. Elasticity has to be supported by the software architecture and resource automation has to be put in place.
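The core of such resource automation is a rule that maps the current load to a number of instances. The sketch below shows one simple rule. The capacity per instance, the minimum and maximum bounds and the load figures are assumptions for this example and are not values from any specific cloud provider.

```python
import math


def desired_instances(load_rps: float, capacity_per_instance: float,
                      minimum: int = 1, maximum: int = 20) -> int:
    """Instances needed for the current load, clamped to [minimum, maximum]."""
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    needed = math.ceil(load_rps / capacity_per_instance)
    return max(minimum, min(maximum, needed))


# Hypothetical load (requests per second) for a platform used in one region only.
night, peak = 40.0, 1250.0
night_instances = desired_instances(night, capacity_per_instance=100)
peak_instances = desired_instances(peak, capacity_per_instance=100)
```

With these numbers the platform would run a single instance at night and 13 instances at peak time. The upper bound keeps a load spike from starting an unlimited number of instances.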
