Posts

This post is part of the Open Source Cloud Computing series. For an Overview, please click on the Tag.

Networking with CloudStack

CloudStack offers two networking topologies. The first topology works much like networking in Amazon Web Services (AWS) and isolates guests via IP filtering. The “advanced” networking option offers more possibilities: it allows multiple networks in a zone, and each network in an advanced setup must have a specific type, which is one of guest, management, public or storage.

Multi-tenancy

CloudStack provides multi-tenancy through the concepts of Accounts, Domains and Users. An Account is typically a tenant, and each Account may contain several Users. A Domain allows the datacenter provider to group similar Accounts and manage them together. CloudStack can be integrated with LDAP services such as Active Directory. Another concept is the “Project”, a group of users working on similar tasks. A marketing department, for example, might have several projects such as a “product launch web site”, with several users working on each of them. Billing can be based either on each user’s consumption or on the project’s consumption, which allows more detailed billing on a project basis. Projects can also be limited in their resource usage.
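
The relationship between Domains, Accounts, Users and Projects can be sketched as a small data model. This is only an illustration of the concepts described above, not CloudStack's actual schema or API: all class names, the `max_instances` limit and the hour-based usage counter are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class User:
    name: str


@dataclass
class Account:
    """A tenant; it contains one or more users."""
    name: str
    users: List[User] = field(default_factory=list)


@dataclass
class Domain:
    """Groups similar accounts so the provider can manage them together."""
    name: str
    accounts: List[Account] = field(default_factory=list)


@dataclass
class Project:
    """Users working on a shared task, with an optional resource limit."""
    name: str
    members: List[User] = field(default_factory=list)
    max_instances: Optional[int] = None  # hypothetical limit; None means unlimited
    usage_hours: Dict[str, float] = field(default_factory=dict)

    def record_usage(self, user: User, hours: float) -> None:
        """Book consumption for a project member (hypothetical unit: hours)."""
        if user not in self.members:
            raise ValueError(f"{user.name} is not a member of {self.name}")
        self.usage_hours[user.name] = self.usage_hours.get(user.name, 0.0) + hours

    def total_usage(self) -> float:
        """Project-level consumption, the basis for per-project billing."""
        return sum(self.usage_hours.values())


alice, bob = User("alice"), User("bob")
marketing = Account("marketing", users=[alice, bob])
company = Domain("company", accounts=[marketing])
launch = Project("product launch web site", members=[alice, bob])
launch.record_usage(alice, 3.0)
launch.record_usage(bob, 1.5)
```

Keeping the usage per user inside the project allows both billing variants mentioned above: per user via `usage_hours`, or per project via `total_usage()`.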
Header Image Copyright by Horia Varlan


The Management Server

The Management Server is the entry point to the CloudStack cloud. It manages all nodes and exposes the API as well as the graphical user interface (GUI). Typically, the Management Server runs on a dedicated machine or virtual machine. It runs in Tomcat and uses a MySQL database for persistence. The Management Server also assigns public and private IP addresses and allocates storage to the guests as virtual disks. It also provides the management of snapshots, templates and ISO images.

Cloud Infrastructure

The Cloud Infrastructure consists of several layers. The lowest level is the host, a node on which virtual instances run. Nodes are usually added to a cluster, which contains several nodes and has primary storage attached. Clusters are part of a Pod, which is typically a hardware rack including a layer-2 switch and secondary storage. Pods in turn are part of a “Zone”, which represents a datacenter.

CloudStack Organisation



Zones are the largest entity in a CloudStack deployment. A zone normally represents a datacenter. Building several zones has the same benefits as building more datacenters: it enables replication and redundancy. CloudStack distinguishes between public and private zones. With this concept, it is possible to provide a public zone to all users and several private zones to specific users such as the marketing or accounting department. When a new instance is started, the user must select the zone in which it should be launched.

Clusters provide the ability to group similar nodes. The nodes in a cluster normally share the same or very similar hardware, run the same hypervisor, are in the same subnet and share primary storage. In large datacenters, clusters can be built for different hardware groups, such as nodes with high memory, nodes with high CPU capacity or GPU-based nodes. Clusters thus offer many ways to separate different kinds of hardware.

iSCSI or NFS servers provide the primary storage, which is shared within a cluster. It stores all disk images of the running virtual machines in that cluster. Secondary storage is associated with the zone, and its purpose is to store templates, snapshots and ISO images.
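
The hierarchy of zones, pods, clusters and hosts can be sketched as nested data structures. The following is a minimal illustration under stated assumptions, not CloudStack code: the class and field names, the storage URLs and the rule check in `add_host` are invented for the example, and secondary storage is attached to the zone as described in the paragraph above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Host:
    name: str
    hypervisor: str


@dataclass
class Cluster:
    name: str
    primary_storage: str  # e.g. an iSCSI or NFS export shared by the cluster
    hosts: List[Host] = field(default_factory=list)

    def add_host(self, host: Host) -> None:
        # Nodes in a cluster are expected to run the same hypervisor.
        if self.hosts and host.hypervisor != self.hosts[0].hypervisor:
            raise ValueError("all hosts in a cluster must use the same hypervisor")
        self.hosts.append(host)


@dataclass
class Pod:
    name: str  # typically one hardware rack
    clusters: List[Cluster] = field(default_factory=list)


@dataclass
class Zone:
    name: str  # typically one datacenter
    secondary_storage: str  # templates, snapshots and ISO images
    public: bool = True
    pods: List[Pod] = field(default_factory=list)

    def all_hosts(self) -> List[Host]:
        """Walk zone -> pod -> cluster -> host."""
        return [h for p in self.pods for c in p.clusters for h in c.hosts]


cluster = Cluster("kvm-highmem", primary_storage="nfs://storage-1/primary")
cluster.add_host(Host("node-01", "KVM"))
cluster.add_host(Host("node-02", "KVM"))
zone = Zone(
    "dc-1",
    secondary_storage="nfs://storage-1/secondary",
    pods=[Pod("rack-1", clusters=[cluster])],
)
```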
Header Image Copyright by marya

CloudStack is currently available in version 4.0 and was originally initiated by Cloud.com, which was later acquired by Citrix. The source code for CloudStack is open source and is maintained as an Apache project. The goal of CloudStack is similar to that of the other three projects described in this series: to provide Infrastructure as a Service software. CloudStack supports commercial as well as open source hypervisors. On the commercial side, CloudStack currently supports Citrix XenServer and VMware vSphere, and on the open source side there is support for Xen and KVM running on Ubuntu or CentOS.

CloudStack is built to run tens of thousands of virtual servers in geographically distributed regions. There is one management server for all clusters, which makes cluster-wide management servers unnecessary. CloudStack configures each node automatically with regard to storage and networking. Internally managed virtual appliances take care of firewalling, routing, DHCP, VPN access, console proxy, storage access and storage replication. CloudStack also offers a graphical user interface (GUI) to ease configuration. The CloudStack API also supports Amazon Web Services (AWS) EC2 and S3. In addition, CloudStack provides an extensibility API that allows solution providers to extend its capabilities.

CloudStack consists of two major components: the Management Server and the Cloud Infrastructure. The Management Server controls the Cloud Infrastructure, and there is typically only one. The Cloud Infrastructure consists of various nodes running virtual instances, each of which is managed by the Management Server. The Cloud Infrastructure consists of one or more dedicated servers, but in a minimal installation it can also run on the same machine as the Management Server.

CloudStack Overview



Header Image Copyright by Alexandre Dulaunoy

OpenNebula has the following main components: Interfaces & API, Groups and Users, Networking, Storage, Hosts and Clusters.

  • Interfaces & API. The two main interfaces for interacting with OpenNebula are the Command Line Interface (CLI) and the Graphical User Interface (GUI), which is also known as “Sunstone”. OpenNebula offers different APIs for developers to extend its functionality or build on top of it. These APIs are currently available as Amazon Web Services (AWS) APIs and an OCCI implementation.
  • Groups and Users. OpenNebula supports different groups and users. It is also possible to integrate with directory services such as LDAP and Microsoft’s Active Directory. Multi-tenancy is possible by default, which eases billing and accounting. OpenNebula comes with the following standard user types: administrators, regular users, public users and service users. Administrators are in charge of administrative tasks within OpenNebula. Regular users can use the functionality of OpenNebula in the self-service portal. Public users are restricted users that may only use a subset of the functionality. Service users are users that can use the APIs or interfaces in OpenNebula.
  • Networking. The networking interface in OpenNebula is fully extensible, which allows integration into almost any existing data center. There is also support for VLAN and Open vSwitch.
  • Storage. OpenNebula supports different storage systems such as file system storage, distributed network storage or block storage.
  • Hosts. OpenNebula supports the following hypervisors on the host: Xen, KVM and VMware. A host has three main components: the host management, the cluster management and the host-monitoring component. The host management is implemented by the “onehost” command and allows common operations on hosts such as the initial setup or the machine lifecycle management. The cluster management allows placing a host in a specific cluster and is implemented by the “onecluster” command. Host monitoring is done with the information driver (IM) and allows administrators to gather information about the health of a host.
  • Clusters. Clusters are pools of hosts that share networking and data stores. A cluster can be compared to a zone. Clusters typically serve different needs, such as separating production from testing.

OpenNebula allows the grouping of different Hosts into a virtual data center (VDC) within a cluster. Different Hosts can also be grouped into zones that allow better administration for similar hosts.
Header Image Copyright by European Southern Observatory

OpenNebula is open source software for Infrastructure as a Service solutions, which started as a research project in 2005. The first public release was available in 2008. Ubuntu, Debian and openSUSE currently support OpenNebula. The project is funded by European institutions. OpenNebula provides Amazon Web Services (AWS) EC2 and Elastic Block Store (EBS) APIs, as well as OGF OCCI (Open Cloud Computing Interface) APIs. OpenNebula also provides a self-service portal to its users. OpenNebula has several third-party tools for software stack automation, and it is easy to integrate a marketplace for applications into OpenNebula platforms. Administrators have their own portal, which is called “Sunstone”, and OpenNebula provides a Unix-inspired command line interface (CLI). The OpenNebula Marketplace allows virtual appliances to be managed and run in OpenNebula environments.
Billing is straightforward, as a fine-grained accounting and monitoring system is available. Account controls and quota management allow administrators to set limits on compute, storage and network utilization. To enable this, multi-tenancy is built into OpenNebula. OpenNebula can be integrated with popular directory services such as LDAP or Active Directory.
OpenNebula distinguishes between clusters and virtual data centers. Clusters are a pool of hosts that share data stores. Clusters also support virtual networks dedicated to load balancing, high availability and high performance computing. Virtual data centers are isolated virtual infrastructures where an administrator can manage the compute, storage and network capacity. OpenNebula is built for high availability with a persistent database as a backend.
A key challenge for OpenNebula is to allow the management of large enterprise data centers. To fulfill these needs, a complete life cycle for virtual resource management is possible and can be extended with a hooking system. The virtual infrastructure can be controlled, monitored and accounted to the correct tenants.
Header Image Copyright by Bob Familiar


Walrus

Walrus, also called “WS3”, is the storage service provided by Eucalyptus. It provides simple storage functionality, which is exposed through RESTful and SOAP APIs. Walrus takes care of storing virtual machine images, storing snapshots and serving files. As with all other public-facing services in Eucalyptus, its interface is based on the Amazon Web Services API.
Containers in Walrus storage are called “Buckets”, and their names have to be unique across accounts, just as with Amazon Web Services (AWS). Some naming restrictions are:

  • Container names can contain lowercase letters, numbers, periods (.), underscores (_) and dashes (-)
  • Container names must start with a number or letter
  • A name must be between 3 and 255 characters long
  • A name must not be formatted as an IP address (e.g., 192.168.5.4)
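
The naming restrictions listed above can be expressed as a small validation function. This is a sketch of the rules as stated in this post, not the check that Walrus itself performs; the function name is hypothetical, and "formatted as an IP address" is interpreted here as four dot-separated groups of one to three digits.

```python
import re

# Lowercase letters, digits, periods, underscores and dashes;
# the first character must be a letter or a digit.
_NAME_RE = re.compile(r"[a-z0-9][a-z0-9._-]*")

# Anything shaped like a dotted quad counts as "formatted as an IP address".
_IP_LIKE_RE = re.compile(r"\d{1,3}(\.\d{1,3}){3}")


def is_valid_bucket_name(name: str) -> bool:
    """Return True if `name` satisfies the four rules listed above."""
    if not 3 <= len(name) <= 255:
        return False
    if not _NAME_RE.fullmatch(name):
        return False
    if _IP_LIKE_RE.fullmatch(name):
        return False
    return True
```

For example, `is_valid_bucket_name("backups.2013_q1")` is accepted, while `"My-Bucket"` (uppercase letters), `"-logs"` (starts with a dash) and `"ab"` (too short) are rejected.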

The maximum file size in a Walrus container is 5 terabytes, and files can be either public or private. A container must be empty before it can be deleted, which means that all of its files have to be deleted first. Files are identified via unique keys represented by Uniform Resource Identifiers (URIs).
Common actions performed on Walrus storage are creating containers, storing data in containers, downloading data and granting or denying permissions. These actions can be performed via the RESTful or SOAP interfaces. Walrus distinguishes two major read options: consistent reads and eventually consistent reads. The latter is faster but might serve inconsistent data, whereas the former might have higher latency but always returns consistent data.

Storage Controller

The Storage Controller is comparable to the Elastic Block Store (EBS) in Amazon Web Services. Elastic Block Store is fast storage for virtual image files. The Storage Controller takes care of creating persistent EBS devices. Block storage devices are typically provided to the instances over the ATA over Ethernet (AoE) or iSCSI protocol.
The header image is provided by  jar (away for a while) under the creative commons licence.


Cloud Controller

The Cloud Controller – also known as CLC – is the highest level in Eucalyptus. There is one Cloud Controller per infrastructure. The Cloud Controller is in charge of the following tasks:

  • Connect to virtual instances via SSH
  • Provide a front end for the EC2- and S3-compatible web services
  • Act as a meta scheduler for the cloud infrastructure and determine which infrastructure to use
  • Collect resource information from the Cluster Controllers

By default, the Cloud Controller runs on the same machine as Walrus and the Storage Controller.

Eucalyptus architecture



The Cloud Controller is the central element of a Eucalyptus cloud, and each Eucalyptus-based cloud starts with it. Different zones or regions are realized with Cluster Controllers. There is exactly one Cloud Controller.

Cluster Controller

The Cluster Controller (CC) comes next in hierarchy after the Cloud Controller (CLC). There is exactly one Cluster Controller per location. A location could be compared to an Availability Zone within a Region in Amazon Web Services. The Cluster Controller is basically in charge of receiving requests from the Cloud Controller to deploy new virtual Instances. The Cluster Controller decides which Node is used for the new virtual Instance. The Cluster Controller also maintains virtual Networks available to the instances and collects information about the Node Controllers registered. This information is reported to the Cloud Controller. Each Cluster can have exactly one Cluster Controller.

Eucalyptus process



When a new instance is started, the Cloud Controller receives the image, the instance type and the number of instances. The Cloud Controller looks for Cluster Controllers with enough available resources and selects one to start the instance. The Cluster Controller in turn looks up Node Controllers with enough available resources and instructs one of them to launch the new virtual instance. If the requested image is not available on the node, the Node Controller asks the Cloud Controller for it, and the Cloud Controller provides the image to the node via Walrus.
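
The launch process can be illustrated with a toy model of the three controller levels. This is a simplified sketch under several assumptions and not Eucalyptus code: resources are reduced to a hypothetical count of free slots, node selection is placed in the Cluster Controller (following the Cluster Controller section above), the "most free capacity" policy is invented, and Walrus is represented by a plain set of image names.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class NodeController:
    name: str
    free_slots: int  # hypothetical stand-in for CPU, memory and disk
    cached_images: Set[str] = field(default_factory=set)

    def launch(self, image: str, walrus: Set[str]) -> str:
        if image not in self.cached_images:
            # The image is missing on this node, so it is fetched via Walrus.
            if image not in walrus:
                raise LookupError(f"image {image!r} not found in Walrus")
            self.cached_images.add(image)
        self.free_slots -= 1
        return f"{image}@{self.name}"


@dataclass
class ClusterController:
    name: str
    nodes: List[NodeController] = field(default_factory=list)

    def free_slots(self) -> int:
        """Resource information reported up to the Cloud Controller."""
        return sum(n.free_slots for n in self.nodes)

    def pick_node(self) -> NodeController:
        # Hypothetical policy: choose the node with the most free capacity.
        return max(self.nodes, key=lambda n: n.free_slots)


@dataclass
class CloudController:
    clusters: List[ClusterController]
    walrus: Set[str]

    def run_instances(self, image: str, count: int) -> List[str]:
        started = []
        for _ in range(count):
            # Select the first cluster that still reports free resources.
            cluster = next((c for c in self.clusters if c.free_slots() > 0), None)
            if cluster is None:
                raise RuntimeError("no cluster has enough free resources")
            started.append(cluster.pick_node().launch(image, self.walrus))
        return started


cloud = CloudController(
    clusters=[
        ClusterController("cc-a", [NodeController("nc-1", 1), NodeController("nc-2", 2)])
    ],
    walrus={"emi-example"},
)
placements = cloud.run_instances("emi-example", 2)
```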

Node Controller

The Node Controller is the lowest level in the Eucalyptus stack. A Node Controller runs on each physical machine on which virtual machines should run. Node Controllers support Xen and KVM for virtualization. A Node Controller is in charge of collecting data on the resources available on its machine. It also reports the utilization and availability of the node to the Cluster Controller. The Node Controller also takes care of instance life cycle management.
The header image is provided by  jar (away for a while) under the creative commons licence.

Eucalyptus was developed at the University of California, Santa Barbara (UCSB) and is provided under the GNU GPL v3 open source license. The name Eucalyptus stands for “Elastic Utility Computing Architecture for Linking Your Programs To Useful Systems”. Its main goal is to enable the execution and control of virtual instances with Xen or KVM under Linux and to provide an API that is compatible with Amazon Web Services (AWS). Since Eucalyptus is built upon the Amazon APIs, it is well suited for hybrid cloud solutions. The first version of Eucalyptus was released in 2008.

Platform Description

Each Eucalyptus component runs as a UNIX service, and communication between the components is based on SOAP web services. A Eucalyptus infrastructure may consist of one or more locations, which represent different datacenters. Eucalyptus consists of the Cloud Controller, Cluster Controller, Node Controller, Walrus and the Storage Controller. The platform provides tools called “Euca2ools”, which are written in Python and inspired by the command-line tools distributed by Amazon Web Services. There are two major tools:

  • api-tools (Command Line interface to EC2)
  • ami-tools (Command Line interface to work with Amazon Machine Image)

With Euca2ools, it is possible to:

  • Do queries on availability zones (i.e. clusters in Eucalyptus)
  • Do SSH key management (add, list, delete)
  • Manage virtual Instances (start, list, stop, reboot, get console output)
  • Configure and Manage Security groups
  • Configure and Manage Volumes and snapshots (attach, list, detach, create, bundle, delete)
  • Manage Images (bundle, upload, register, list, deregister)
  • Manage IP addresses (allocate, associate, list, release)

All configuration elements for Eucalyptus are stored in a config file as key-value pairs. The configuration must be complete before Eucalyptus can be started. Eucalyptus needs to connect to clients (end users) and cloud components (CC, Walrus, etc.), so network management is essential. Eucalyptus supports the following networking modes:

  • Managed Mode. With Managed Mode, Eucalyptus provides all networking features such as VM network isolation, Security Groups, Elastic IPs and the Metadata Service. In Managed Mode, a Cluster Controller must be in the same broadcast domain as its Node Controllers. Furthermore, all Cluster and Node Controllers must be configured.
  • Managed Mode without VLAN. This is basically the same as Managed Mode, but no VLAN is used. Connectivity must be provided over Ethernet, and all Cluster Controllers and Node Controllers must be in the same broadcast domain.
  • System Mode. Eucalyptus mostly stays out of the way in terms of VM networking and relies on a DHCP service to configure VM networks. VNET_MODE="SYSTEM" must be set on all Cluster Controllers, and a bridge must be specified on each Node Controller.
  • Static Mode. The Eucalyptus DHCP server issues the network configuration. Nodes must be configured with VNET_MODE="STATIC".
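
Since the configuration is stored as key-value pairs, reading the networking mode from it can be sketched in a few lines. This is an illustrative parser and not part of Eucalyptus: the function names are hypothetical, the mode names MANAGED and MANAGED-NOVLAN for the two managed modes and the VNET_BRIDGE key in the sample are assumptions, and only SYSTEM and STATIC are taken from the list above.

```python
from typing import Dict

# SYSTEM and STATIC appear in the text above; the two MANAGED values
# are assumed names for the managed modes.
KNOWN_MODES = {"MANAGED", "MANAGED-NOVLAN", "SYSTEM", "STATIC"}


def parse_config(text: str) -> Dict[str, str]:
    """Parse KEY="value" lines, skipping blank lines and # comments."""
    settings: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip('"')
    return settings


def network_mode(settings: Dict[str, str]) -> str:
    """Return the configured networking mode or fail if it is unknown."""
    mode = settings.get("VNET_MODE", "")
    if mode not in KNOWN_MODES:
        raise ValueError(f"unsupported VNET_MODE: {mode!r}")
    return mode


# Hypothetical config fragment for a Node Controller in System Mode.
sample = """
# networking
VNET_MODE="SYSTEM"
VNET_BRIDGE="br0"
"""
settings = parse_config(sample)
mode = network_mode(settings)
```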

The header image is provided by  jar (away for a while) under the creative commons licence.

In the next blog posts, I will describe some major Open Source Cloud Computing platforms. I will cover the 4 major platforms, including:

  • OpenStack
  • Eucalyptus
  • OpenNebula
  • CloudStack

This series will run alongside the self-service IT series. By the end of the series, I will compare these 4 platforms against the self-service attributes that I will evaluate along the way. So keep on reading all of them 🙂