This reference architecture is specifically for users who require a production Eucalyptus cloud environment composed of a relatively small set of both commodity and enterprise grade hardware components, and for which the ultimate size of the deployment is limited to the capacity described later in the document.
Note: This reference architecture has a Scalable Web Services: Small (non-HA) variation.
Reference Architecture Sections
Jump to a specific section below:
- Use Case: Scalable Web Services
- Physical Resources
- Deployment Topology
- Data Center Management
- Summary Considerations
The benefits of choosing this architecture are:
- Composed of a small collection of servers, networks, and storage devices for low-cost deployments
- Provides self-service environment for fast, stable allocation and execution of scalable Web services workloads and applications
- Designed with few independent components to keep the overhead of system monitoring/management low
- Highly available, so that no single point of resource failure will disrupt Eucalyptus cloud services
This document is intended for readers who are familiar with AWS terminology and the Eucalyptus Installation, Admin, and User Guides, and who have experience implementing production data center solutions in Linux environments.
The following diagrams outline the general logical view, as well as the layout of physical servers, networks, and the Eucalyptus components of this reference architecture.
The purpose of this deployment is to cover a large scalable Web services (SWS) use case (or one that starts small and is capable of growing larger), using enterprise-grade hardware. Generally, for scalable Web services, the deployment will support a number of production, largely independent web-service applications running in parallel within the Eucalyptus cloud.
The SWS use case is one where several applications run simultaneously within the cloud, each composed of potentially many different varieties of instances (a LAMP stack, for example). These applications may be designed to scale up and down as application workload demand changes over time, and each application stack is expected to belong to a relatively small number of cloud accounts. While a fair amount of 'churn' is expected for each application, with individual virtual machine instances spun up and torn down as workload fluctuates, the application itself (the minimum number of VM instances needed to consider the application 'available') is expected to be long-lived. In other words, this deployment is designed to permit long-running application services of fluctuating size (the number of VMs composing the entire application).
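The scale-up/scale-down behavior described above can be sketched as a simple threshold policy. This is an illustrative sketch only; the thresholds, bounds, and function name are hypothetical and not part of the reference architecture, which leaves the scaling policy to the application.

```python
# Illustrative sketch: a minimal threshold-based scaling policy of the kind
# an SWS application might use to grow and shrink its instance pool.
# All thresholds and bounds below are hypothetical example values.

def desired_instance_count(current, avg_load, min_instances=2, max_instances=8,
                           scale_up_at=0.75, scale_down_at=0.25):
    """Return the target pool size given average per-instance load (0.0-1.0)."""
    if avg_load > scale_up_at:
        target = current + 1          # add one instance under heavy load
    elif avg_load < scale_down_at:
        target = current - 1          # retire one instance when mostly idle
    else:
        target = current              # hold steady inside the comfort band
    # never drop below the minimum that keeps the application 'available'
    return max(min_instances, min(max_instances, target))
```

The clamp to `min_instances` reflects the use case's notion of availability: the pool may shrink, but never below the minimum set of VMs that constitutes the application.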
This use case is expected to have a 'medium to large' number of virtual machine images (EMIs), a 'medium to large' reliance on EBS volumes to store/provide access to static application data, and a 'small' number of Boot from EBS instances, which should only be used in cases where a certain aspect of the application demands a static server that is not intended to be a part of the dynamic 'scaling' of applications themselves (i.e. static servers in support of several applications, etc.).
The following list describes the workload capacity that deployments based upon this reference architecture can support. Note that some of these capacity boundaries can be exceeded by deviating from the architecture; readers are encouraged to contact Eucalyptus for information on designing production Eucalyptus deployments.
- Max of 128 running virtual machine instances
- One cluster composed of maximum 16 nodes
- Max of 8 virtual machines per node
- Max of 128 simultaneous attached elastic block storage volumes
- Max of 256 independent active users (ex: max of 32 accounts, each with max of 8 users)
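The capacity bounds above are internally consistent, as the following arithmetic sketch shows (the numbers come directly from the list; nothing here is new):

```python
# Sanity check of the capacity bounds listed above.

NODES = 16                 # one cluster, max 16 nodes
VMS_PER_NODE = 8           # max 8 virtual machines per node
MAX_VMS = NODES * VMS_PER_NODE
assert MAX_VMS == 128      # matches the 128 running-instance ceiling

ACCOUNTS = 32
USERS_PER_ACCOUNT = 8
MAX_USERS = ACCOUNTS * USERS_PER_ACCOUNT
assert MAX_USERS == 256    # matches the 256 independent active users

MAX_ATTACHED_EBS = 128     # one attached volume per running VM, on average
```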
The following are the minimum resource requirements for the physical servers, networks, and storage needed to support this architecture. For each category of physical resource, exceeding the minimum will not have a negative impact on the deployment (more cores, more RAM, more local disks, faster interfaces, higher bandwidth networking, more disk capacity, etc.).
Minimum Front-end/Middle-tier Server Configuration
- 4 or more modern cores
- 16 or more GB of RAM
- 80 or more GB RAID 1/5/6 local disk for OS/Eucalyptus (see below for special storage requirements, based on component)
- Network: (see below for special network requirements, based on component)
- Network 1: Cloud Controller/Walrus/Cluster Controllers/Storage Controllers on 'public' user network. Cluster Controller and Walrus are connected at 10Gb; others are connected at 1Gb. If more 10Gb networking is available, it should be given to the Storage Controller, Node Controllers, and other components (in that order).
- Network 2: Cluster Controller/Node Controllers on 'private' 1Gb cluster network
- Network 3: Node Controller/SAN network is 1Gb
- Local RAID disk storage for Walrus, Storage Controller, and Node Controllers
- Walrus: 500 or more GB RAID1/5/6
- Walrus capacity impacts the total number of template images and S3 accessible data that is available
- Node Controller: 200 or more GB RAID1/5/6. On all servers with Eucalyptus-accessible RAID volumes, those volumes are separate from the OS disk (partition/volume)
- Node Controller capacity impacts the total size of all instance store images that can run concurrently on a single node and the total number of images that can be cached on a single node
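Because Node Controller disk must hold both the cached images and the instance stores of running VMs, a back-of-envelope check against the 200 GB minimum is useful. All per-image and per-VM figures in this sketch are hypothetical examples, not requirements from the architecture:

```python
# Back-of-envelope Node Controller disk sizing. The image counts and sizes
# below are assumed example values; only NODE_DISK_GB and RUNNING_VMS come
# from the reference architecture.

NODE_DISK_GB = 200     # minimum Node Controller RAID capacity (from this doc)
RUNNING_VMS = 8        # max instances per node (from this doc)

CACHED_IMAGES = 10     # assumed number of distinct EMIs cached on the node
AVG_IMAGE_GB = 5       # assumed size of each cached image
ROOT_FS_GB = 10        # assumed instance-store root size per running VM

needed = CACHED_IMAGES * AVG_IMAGE_GB + RUNNING_VMS * ROOT_FS_GB
assert needed <= NODE_DISK_GB   # 50 GB cache + 80 GB instance store = 130 GB
```

Under these assumptions a fully loaded node uses roughly 130 GB, leaving headroom within the 200 GB minimum; larger images or more cached EMIs consume that headroom quickly.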
The following is a description of the Eucalyptus platform topology atop physical resources. For this use case, the topology is designed to use a minimal number of servers for the Eucalyptus platform while providing enough capacity for acceptable performance up to the maximums defined at the beginning of this document.
Eucalyptus Component Topology
Each server in the above physical model diagram will be running one or more Eucalyptus software components which together form the Eucalyptus platform. Listed here are the mappings of physical server to Eucalyptus component, where each server is configured to conform to at least the minimum requirements for servers defined previously.
- Front-end server 1: Cloud Controller/Walrus/User Console
- Front-end server 2: Cloud Controller/Walrus/User Console
- Cluster server 1: Cluster Controller/Storage Controller
- Cluster server 2: Cluster Controller/Storage Controller
- Cluster SAN: access to EMC/NetApp/EqualLogic
- Node server 1-16: Node Controller x 16
Eucalyptus Configuration Options
The Eucalyptus platform is highly configurable, covering a wide variety of data center topologies, devices, software management systems, and network/security policies. For this reference architecture, we list certain fundamental configuration options that provide the services this architecture requires while keeping performance and management overhead minimal. Please refer to the Eucalyptus Installation and Admin Guides for information on how to implement these configurations.
- Networking mode: MANAGED-NOVLAN
- Public addresses: 192 (maximum number of virtual machines + 64 for allocatable elastic IPs)
- Security group size: minimum 32, maximum 128
- Storage Controller driver: SAN
- High Availability: yes
- Linux Distribution: CentOS 6 + KVM
- Java components (Cloud Controller, Walrus, Storage Controller): configured to run with increased heap size (60% of total available memory)
- Walrus DRBD configured to sync over the 10Gb network between Walruses
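Several of the options above map onto the `eucalyptus.conf` file. The fragment below is an illustrative sketch in Eucalyptus 3.x syntax; all addresses, ranges, and sizes are placeholder values that must be adapted to the local network plan, and should be checked against the Installation Guide for the version in use.

```shell
# Illustrative excerpt from /etc/eucalyptus/eucalyptus.conf (values are
# placeholders, not recommendations).

VNET_MODE="MANAGED-NOVLAN"

# Private address space carved into per-security-group networks;
# VNET_ADDRSPERNET bounds each security group's size (32-128 per this doc).
VNET_SUBNET="172.16.0.0"
VNET_NETMASK="255.255.0.0"
VNET_ADDRSPERNET="32"

# Pool of 192 public addresses (128 instances + 64 elastic IPs);
# the range below is an example only.
VNET_PUBLICIPS="10.1.1.1-10.1.1.192"

# Increased JVM heap for the Java components (Cloud Controller/Walrus/SC);
# roughly 60% of a 16 GB front-end server in this example.
CLOUD_OPTS="-Xmx9g"
```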
Eucalyptus includes a number of features that are in place to support specific aspects of production deployments that may or may not be required based on the user's preferences and constraints. Listed here are descriptions of some of these features as they apply to this particular reference architecture. Please refer to the Eucalyptus Installation and Admin Guides for information on how to implement these features, if required.
- Reporting feature should be lightly used (configured either to be disabled or to poll at infrequent intervals) for this architecture. If the deployment requires fine-grained or long-term reporting information, a data warehouse (an extra machine) should be added to the deployment, with tooling in place to enforce periodic export/flush of reporting data to the warehouse.
- LDAP integration should be implemented only if required.
The sections up to this point have outlined the design of a Eucalyptus software deployment, along with minimum physical resource capacities and configurations. Next, we address additional technologies and techniques surrounding the Eucalyptus software/hardware itself that are required to run a complete Eucalyptus private cloud in production.
Services provided by the Eucalyptus private cloud software platform:
- EC2-compatible private cloud virtual machine management platform
- S3-compatible storage platform
- Eucalyptus end-user Web based GUI console
- Eucalyptus end-user and admin CLI tools
- Creation, management, and cleanup of virtual machines and related resource artifacts (EBS volumes, virtual networks, etc.)
- Eucalyptus service troubleshooting and problem resolution
Additional required services:
- Data-center server, network, storage, OS installation system
- Physical machine health and status monitoring
- Automatic resource performance monitoring and load-balancing
- Virtual machine, storage, network performance optimization
- Linux Distribution OS software and configuration management
- Dynamic deployment topology/physical infrastructure re-configuration
The Eucalyptus cloud platform software provides AWS-compatible infrastructure as a service, and for production use it must be integrated with standard data center configuration, management, and monitoring software. Each Eucalyptus component runs as a Linux process that must be configured through both configuration files and run-time configuration parameters, and must additionally be monitored along with physical resource health and status. A variety of user interfaces are available for use with a Eucalyptus deployment, including those included as part of the Eucalyptus platform as well as third-party AWS-compatible API, command-line, and graphical interface software.
While the Eucalyptus software does not currently include configuration management or system health/status monitoring solutions itself, there are several third-party solutions that existing production deployments rely upon to perform these functions.
Production deployments based on this reference architecture should include the use of a third-party configuration management system in order to ensure that Eucalyptus configuration is correct both for initial deployment as well as under cases where a particular Eucalyptus server and software must be re-deployed.
Several options exist, and here we list those which are produced by organizations who have partnered with Eucalyptus to provide high quality integrations.
- Ansible configuration management and orchestration tool (find examples here on Github)
- Puppet Labs configuration management system
- Opscode Chef configuration management system
For an example of how to use/integrate Eucalyptus with Puppet, please refer to the following resources:
In addition to automated/controlled configuration management, a production Eucalyptus deployment based on this reference architecture should also be monitored via a third party solution to watch the health and status of the deployment, as well as to notify the cloud administrator when unexpected conditions are occurring. Basic monitoring includes but is not limited to:
- Physical resource availability (network ping and/or ssh access to physical servers running Eucalyptus components)
- Physical resource load
- Physical resource faults (as indicated by Linux fault notification mechanisms)
- Eucalyptus component faults (please refer to the Eucalyptus Admin Guide for information on monitoring for Eucalyptus faults)
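The basic checks above all reduce to the same pattern: gather per-host status, then raise alerts for unreachable or overloaded machines. The sketch below is illustrative only; the field names and load threshold are hypothetical, and in practice a tool such as Nagios performs this evaluation.

```python
# Illustrative sketch of aggregating basic health checks into alerts.
# Field names ('host', 'reachable', 'load') and the threshold are assumed.

def evaluate_checks(checks, load_threshold=8.0):
    """checks: list of dicts with 'host' (str), 'reachable' (bool),
    and 'load' (float, e.g. 1-minute load average).
    Returns a list of alert strings for problem hosts."""
    alerts = []
    for c in checks:
        if not c["reachable"]:
            alerts.append(f"{c['host']}: unreachable (ping/ssh failed)")
        elif c["load"] > load_threshold:
            alerts.append(f"{c['host']}: load {c['load']} exceeds {load_threshold}")
    return alerts
```

A real deployment would feed this kind of logic from Nagios plugins or equivalent probes and route the resulting alerts to the cloud administrator.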
For an example on how to set up an integrated Nagios with Eucalyptus environment, please refer to the following resource:
There are several other solutions for monitoring physical and software components of a data center, and here we list those which are developed by Eucalyptus partners:
As an AWS-compatible platform, Eucalyptus offers both a variety of user interface tools as well as the option to use third party AWS-compatible interfaces that interoperate with AWS and Eucalyptus. For information on installing and using the interfaces that are included by default with Eucalyptus, please refer to the Eucalyptus Install, Admin and User Guides.
- Eucalyptus Admin CLI tools (included with Eucalyptus, see the Admin Guide, and Command-line Reference Guide)
- Euca2ools CLI Guide (included with Eucalyptus)
- Eucalyptus User Console Guide (included with Eucalyptus)
- Eucalyptus and Enstratius (included with Eucalyptus subscription)
There are many other AWS-compatible user interface tools, targeted at specific feature sets, that are compatible with Eucalyptus:
- Scalr
- Example of using Scalr with Eucalyptus
- HybridFox
- s3cmd - for managing AWS S3 and CloudFront, and Eucalyptus Walrus services
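AWS-compatible tools generally just need to be pointed at the Eucalyptus endpoints instead of AWS. The fragment below sketches the environment variables that Eucalyptus normally generates in its `eucarc` credentials file; the hostname, port paths, and credential values here are placeholders only.

```shell
# Illustrative environment for pointing EC2/S3-compatible tools at a
# Eucalyptus cloud. All values below are placeholders; Eucalyptus generates
# the real ones in the 'eucarc' file downloaded with user credentials.

export EC2_URL="http://cloud.example.com:8773/services/Eucalyptus"
export S3_URL="http://cloud.example.com:8773/services/Walrus"
export EC2_ACCESS_KEY="<your-access-key>"
export EC2_SECRET_KEY="<your-secret-key>"

# With these set, euca2ools commands target the private cloud, e.g.:
#   euca-describe-instances
#   euca-describe-availability-zones
```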
In addition to monitoring and managing the deployment's physical resources, application workload images and workflows must also be managed and configured. The Eucalyptus platform offers AWS-compatible APIs and services which allow external workload management systems to interoperate with AWS and Eucalyptus, and works to ensure that VM image environments between AWS and Eucalyptus are interoperable.
- Eucalyptus User Guide (see the Using Images section)
- Eucalyptus Starter Images
- AMI2EMI Project: AWS image to Eucalyptus image conversion tools
The reference architecture presented here is meant to encapsulate a bounded production Eucalyptus system. As with all use cases, there are variations that cannot reasonably be generalized, but we add here some comments and observations that will help tune individual use-case variations to achieve efficient, stable performance within Eucalyptus.
- Keep individual system load low. If physical systems are over-provisioned with virtual machines (whether too many VMs run on a single system, or a few resource-intensive VMs interfere with one another), the underlying operating system and Linux dependencies can become fragile and difficult to debug. Eucalyptus has many features designed to function even if the underlying system is underperforming and/or misbehaving, but it is always best to give Eucalyptus and your workload environments enough resources to function smoothly.
- Consider bottlenecks. When designing a deployment, deciding on the capacity to provide to your applications, and making hardware capacity and performance decisions, it is best to consider the data paths that Eucalyptus either provides or works in concert with at run time. Please refer to the Datapaths series of diagrams to aid in identifying potential shared-resource bottlenecks.