This reference architecture is for users who require a production Eucalyptus cloud environment to serve a medium to large dev/test use case, built from low-cost commodity hardware augmented with enterprise-grade components where necessary. The size of the deployment is bounded by the capacity limits described later in this document; expanding beyond them requires careful deployment and hardware placement planning.
Reference Architecture Sections
- Use Case: Dev/Test
- Physical Resources
- Deployment Topology
- Data Center Management
- Summary Considerations
The benefits of choosing this architecture are:
- Composed of a combination of commodity and enterprise grade servers, networks, and storage devices for high performance, robust deployments
- Provides self-service environment for fast, stable allocation and execution of dev/test plans
- Designed to be able to scale up as workload/application demands increase
This document is intended for readers who are familiar with AWS terminology and the Eucalyptus Installation, Admin, and User Guides, and who have experience implementing production data center solutions based on Linux environments.
The following diagrams outline the general logical view, as well as the layout of physical servers, networks, and the Eucalyptus components of this reference architecture.
This particular reference architecture is intended to implement a Eucalyptus cloud for a dev/test use case. For dev/test, there are a variety of virtual machines used as development environments (dev) as well as virtual machines that are instantiated in order to perform efficient testing from a known environment (test).
For the dev/test use case, so-called 'dev' virtual machine servers (instances) are expected to have short to medium lifetimes, while 'test' instances are expected to have relatively short lifetimes. Generally, this architecture is intended to provide fast self-service instantiation of development environments, followed by instantiation of test environments against the software produced by that development work. Under this workflow, static data (data that needs to persist beyond the lifetimes of instantiated virtual machines) should be identified and decoupled from the instances themselves as much as possible, resulting in low to medium usage of the static data storage facilities of Eucalyptus. Boot from EBS (bfEBS) instances are expected to be rarely used in deployments based on this architecture, although the deployment does support limited use of bfEBS. For example, servers that rely on static data and must remain available to serve the more transient dev/test environments are candidates for bfEBS instances.
The following list describes the workload capacity that deployments based on this reference architecture can support. Some of these capacity boundaries can be exceeded by deviating from the architecture; readers are encouraged to contact Eucalyptus for information on designing production Eucalyptus deployments.
- Max of 2048 running virtual machine instances
- Up to 4 clusters composed of maximum 32 nodes each
- Max of 16 virtual machines per node
- Max of 128 simultaneous attached elastic block storage volumes per cluster
- Max of 1024 independent active users (max of 64 accounts, each with max of 16 users)
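The limits above are consistent with one another, which a few lines of arithmetic can confirm. The following Python sketch only multiplies the per-cluster, per-node, and per-account figures from the list. The variable names are illustrative and are not Eucalyptus settings, and the deployment-wide attached-volume figure is derived here rather than stated in the list.

```python
# Capacity figures taken from the list above; the names are illustrative only.
CLUSTERS = 4
NODES_PER_CLUSTER = 32
VMS_PER_NODE = 16
ACCOUNTS = 64
USERS_PER_ACCOUNT = 16
EBS_ATTACHED_PER_CLUSTER = 128

total_nodes = CLUSTERS * NODES_PER_CLUSTER        # node servers across all clusters
max_instances = total_nodes * VMS_PER_NODE        # running instances across the cloud
max_users = ACCOUNTS * USERS_PER_ACCOUNT          # independent active users
# Derived, not stated in the list: attached volumes if every cluster is at its limit.
max_attached_volumes = CLUSTERS * EBS_ATTACHED_PER_CLUSTER

print(total_nodes, max_instances, max_users, max_attached_volumes)
```

The instance and user totals match the 2048 and 1024 maximums given in the list.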
The following are the minimum resource requirements for the physical servers, networks, and storage needed to support this architecture. For each category of physical resource, exceeding the minimum will not have a negative impact on the deployment (more cores, more RAM, more local disks, faster interfaces, higher bandwidth networking, more disk capacity, etc.).
Minimum Front-end/Middle-tier Server Configuration
- 4 or more modern cores
- 32 or more GB of RAM
- 80 or more GB RAID 1/5/6 local disk for OS/Eucalyptus (see below for special storage requirements, based on component)
- Network: (see below for special network requirements, based on component)
Minimum Node Server Configuration
- 8 or more modern cores
- 32 or more GB of RAM
- 80 or more GB RAID 1/5/6 local disk for OS/Eucalyptus (see below for special storage requirements, based on component)
- Network: (see below for special network requirements, based on component)
- Network 1: Cloud Controller/Walrus/Cluster Controllers/Storage Controllers on 'public' user network. Cluster Controller and Walrus are connected at 10 Gb/s; the others are connected at 1 Gb/s. If more 10 Gb/s networking is available, it should be assigned to the Storage Controller, then the Node Controllers, then other components (in that order).
- Network 2: Cluster Controller/Node Controllers on 'private' 1 Gb/s cluster network, which is 'VLAN clean'
- Network 3: Node Controller/SAN network at 10 Gb/s
- Local RAID disk storage for Walrus, Storage Controller, and Node Controllers
- Walrus: 1000 or more GB RAID1/5/6
- Walrus capacity impacts the total number of template images and S3 accessible data that is available
- Node Controller: 400 or more GB RAID1/5/6. RAID volumes are separate from OS disk (partition/volume) for all servers with Eucalyptus accessible RAID volumes
- Node Controller capacity impacts the total size of all instance store images that can run concurrently on a single node and the total number of images that can be cached on a single node
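As a rough illustration of how these minimums might be checked before installation, the sketch below compares a server's specification against the figures listed above. The `HostSpec` type, the role names, and the `shortfalls` helper are hypothetical and are not part of Eucalyptus; no Storage Controller disk minimum is encoded because none is stated above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HostSpec:
    cores: int
    ram_gb: int
    os_disk_gb: int
    data_disk_gb: int = 0  # separate RAID volume, for roles that need one


# Minimums transcribed from the lists above, keyed by a hypothetical role name.
MINIMUMS = {
    "front-end": HostSpec(cores=4, ram_gb=32, os_disk_gb=80),
    "walrus": HostSpec(cores=4, ram_gb=32, os_disk_gb=80, data_disk_gb=1000),
    "node": HostSpec(cores=8, ram_gb=32, os_disk_gb=80, data_disk_gb=400),
}


def shortfalls(role, host):
    """Return the names of the fields where `host` is below the minimum for `role`."""
    minimum = MINIMUMS[role]
    return [
        name
        for name in ("cores", "ram_gb", "os_disk_gb", "data_disk_gb")
        if getattr(host, name) < getattr(minimum, name)
    ]


# Example: a candidate node server with too few cores and too small a RAID volume.
print(shortfalls("node", HostSpec(cores=4, ram_gb=64, os_disk_gb=80, data_disk_gb=200)))
```

An empty list means the server meets every minimum recorded for that role. Network requirements are not modeled in this sketch.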
The following is a description of the Eucalyptus platform topology atop physical resources. For this use case, the topology is designed to allow for a minimum of servers used for the Eucalyptus platform, while providing enough capacity to give acceptable performance up to the specified maximums defined at the beginning of this document.
Eucalyptus Component Topology
Each server in the above physical model diagram will be running one or more Eucalyptus software components which together form the Eucalyptus platform. Listed here are the mappings of physical server to Eucalyptus component, where each server is configured to conform to at least the minimum requirements for servers defined previously.
- Front-end server 1: Cloud Controller
- Front-end server 2: Walrus
- Front-end server 3: User Console
- Cluster server 1: Cluster Controller
- Cluster server 2: Storage Controller
- Cluster accessible SAN: EMC, NetApp, Equallogic
- Node server 1-128 (32 nodes x 4 clusters): Node Controller x 128
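The mapping above can be written down as a simple inventory, which is a convenient starting point for configuration management. The Python sketch below is illustrative only: the host names are invented, and it assumes that each of the four clusters has its own Cluster Controller server and Storage Controller server, which the list implies but does not state explicitly.

```python
CLUSTERS = 4
NODES_PER_CLUSTER = 32


def build_inventory(clusters=CLUSTERS, nodes_per_cluster=NODES_PER_CLUSTER):
    """Map hypothetical host names to the Eucalyptus component each one runs."""
    inventory = {
        "frontend-1": "Cloud Controller",
        "frontend-2": "Walrus",
        "frontend-3": "User Console",
    }
    for c in range(1, clusters + 1):
        # Assumption: one Cluster Controller and one Storage Controller per cluster.
        inventory[f"cluster{c}-server-1"] = "Cluster Controller"
        inventory[f"cluster{c}-server-2"] = "Storage Controller"
        for n in range(1, nodes_per_cluster + 1):
            inventory[f"cluster{c}-node-{n:02d}"] = "Node Controller"
    return inventory


inventory = build_inventory()
node_count = sum(1 for role in inventory.values() if role == "Node Controller")
print(len(inventory), node_count)
```

Under that assumption the full deployment comes to 139 servers, 128 of which run the Node Controller. The SAN is a shared device and is not counted as a server here.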
Eucalyptus Configuration Options
The Eucalyptus platform is highly configurable, covering a wide variety of data center topologies, devices, software management systems and network/security policies. For this reference architecture, we list here certain fundamental configuration options which will provide the necessary service of the reference architecture balanced against minimal performance and management overhead. Please refer to the Eucalyptus Installation and Admin Guides for information on how to implement these configurations.
- Networking mode: MANAGED
- Public addresses: 2304 (maximum number of virtual machines + 64/cluster for allocatable elastic IPs)
- Security group size: minimum 32, maximum 512
- Storage Controller driver: SAN
- High Availability: no
- Linux Distribution: CentOS 6 + KVM
- Java components (Cloud Controller, Walrus, Storage Controller): configured to run with increased heap size (60% of total available memory)
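Two of the values above are derived figures, and the short Python sketch below reproduces them. The function names are hypothetical, and the heap calculation assumes that "total available memory" means the physical RAM of the server, shown here for the 32 GB minimum; how the heap size is actually applied is covered in the Eucalyptus guides.

```python
MAX_INSTANCES = 2048
CLUSTERS = 4
ELASTIC_IPS_PER_CLUSTER = 64
HEAP_PERCENT = 60  # share of total memory given to the Java heap


def public_addresses(max_instances=MAX_INSTANCES, clusters=CLUSTERS,
                     elastic_per_cluster=ELASTIC_IPS_PER_CLUSTER):
    """One address per possible instance, plus the allocatable elastic IPs."""
    return max_instances + clusters * elastic_per_cluster


def heap_size_mb(total_ram_gb, percent=HEAP_PERCENT):
    """Heap size in MB, rounded down, as a percentage of total RAM."""
    return total_ram_gb * 1024 * percent // 100


print(public_addresses())   # 2048 + 4 * 64
print(heap_size_mb(32))     # 60% of a 32 GB server
```

The first result matches the 2304 public addresses listed above; the second gives 19660 MB for a server with the minimum 32 GB of RAM.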
Eucalyptus includes a number of features that are in place to support specific aspects of production deployments that may or may not be required based on the user's preferences and constraints. Listed here are descriptions of some of these features as they apply to this particular reference architecture. Please refer to the Eucalyptus Installation and Admin Guides for information on how to implement these features, if required.
- The reporting feature should be used lightly (either disabled or configured to poll at infrequent intervals) for this architecture. If the deployment must supply fine-grained or long-term reporting information, a data warehouse topology (extra machine) should be added to the deployment, with tooling in place to enforce periodic export/flush of reporting data to the warehouse.
- LDAP integration should be implemented only if required.
The sections up to this point have outlined the design of a Eucalyptus software deployment, along with the minimum physical resource capacity and configurations. Next, we address the additional technologies and techniques surrounding the Eucalyptus software and hardware that are required to run a complete Eucalyptus private cloud in production.
Services provided by the Eucalyptus private cloud software platform:
- EC2-compatible private cloud virtual machine management platform
- S3-compatible storage platform
- Eucalyptus end-user Web based GUI console
- Eucalyptus end-user and admin CLI tools
- Service of creating, managing, and cleaning up virtual machines and related resource artifacts (EBS volumes, virtual networks, etc.)
- Eucalyptus service troubleshooting and problem resolution
Additional required services:
- Data-center server, network, storage, OS installation system
- Physical machine health and status monitoring
- Automatic resource performance monitoring and load-balancing
- Virtual machine, storage, network performance optimization
- Linux Distribution OS software and configuration management
- Dynamic deployment topology/physical infrastructure re-configuration
The Eucalyptus cloud platform software that provides AWS-compatible infrastructure as a service must be integrated with standard data center configuration, management, and monitoring software for production use. Each Eucalyptus component runs as a Linux process that must be configured through both configuration files and run-time configuration parameters, and must additionally be monitored along with physical resource health and status characteristics. A variety of user interfaces are available for use with your Eucalyptus deployment, including those included as part of the Eucalyptus platform as well as third-party API, command-line, and graphical interface software that is AWS compatible.
While the Eucalyptus software does not currently include the deployment of configuration management or system health/status monitoring solutions itself, there are several third party solutions that existing production deployments rely upon to perform these functions.
Production deployments based on this reference architecture should include the use of a third-party configuration management system in order to ensure that Eucalyptus configuration is correct both for initial deployment as well as under cases where a particular Eucalyptus server and software must be re-deployed.
Several options exist, and here we list those which are produced by organizations who have partnered with Eucalyptus to provide high quality integrations.
- Ansible configuration management and orchestration tool (examples are available on GitHub)
- Puppet Labs configuration management system
- Opscode Chef configuration management system
For an example of how to use/integrate Eucalyptus with Puppet, please refer to the following resources:
In addition to automated/controlled configuration management, a production Eucalyptus deployment based on this reference architecture should also be monitored via a third party solution to watch the health and status of the deployment, as well as to notify the cloud administrator when unexpected conditions are occurring. Basic monitoring includes but is not limited to:
- Physical resource availability (network ping and/or ssh access to physical servers running Eucalyptus components)
- Physical resource load
- Physical resource faults (as indicated by Linux fault notification mechanisms)
- Eucalyptus component faults (please refer to the Eucalyptus Admin Guide for information on monitoring for Eucalyptus faults)
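The first item in the list, checking that the servers running Eucalyptus components can be reached over the network, can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a replacement for a monitoring system: the function names are hypothetical, and a successful TCP connection to the SSH port (22 is assumed here) shows only that the port accepts connections, not that the Eucalyptus component on that server is healthy.

```python
import socket


def tcp_reachable(host, port=22, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def unreachable_hosts(hosts, port=22, timeout=3.0):
    """Return the subset of `hosts` that did not accept a TCP connection."""
    return [host for host in hosts if not tcp_reachable(host, port, timeout)]
```

In practice a call such as `unreachable_hosts(["frontend-1", "cluster1-node-01"])`, with host names taken from your own inventory, would be run on a schedule and any non-empty result reported to the cloud administrator.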
For an example of how to set up a Nagios environment integrated with Eucalyptus, please refer to the following resource:
There are several other solutions for monitoring physical and software components of a data center, and here we list those which are developed by Eucalyptus partners:
As an AWS-compatible platform, Eucalyptus offers both a variety of user interface tools as well as the option to use third party AWS-compatible interfaces that interoperate with AWS and Eucalyptus. For information on installing and using the interfaces that are included by default with Eucalyptus, please refer to the Eucalyptus Install, Admin and User Guides.
- Eucalyptus Admin CLI tools (included with Eucalyptus, see the Admin Guide, and Command-line Reference Guide)
- Euca2ools CLI Guide (included with Eucalyptus)
- Eucalyptus User Console Guide (included with Eucalyptus)
- Eucalyptus and Enstratius (included with Eucalyptus subscription)
There are many other AWS-compatible user interface tools targeted at specific feature sets that are compatible with Eucalyptus:
- s3cmd - for managing AWS S3 and CloudFront, and Eucalyptus Walrus services.
In addition to monitoring and managing the deployment's physical resources, application workload images and workflows must also be managed and configured. The Eucalyptus platform offers AWS-compatible APIs and services which allow external workload management systems to interoperate with AWS and Eucalyptus, and works to ensure that VM image environments between AWS and Eucalyptus are interoperable.
- Eucalyptus User Guide (see the Using Images section)
- Eucalyptus Starter Images
- AMI2EMI Project: AWS image to Eucalyptus image conversion tools
The reference architecture presented here is meant to encapsulate a bounded production Eucalyptus system. As with all use cases, there are variations that cannot reasonably be generalized, but we add here some comments and observations that will help to tune individual use case variations to achieve efficient, stable performance within Eucalyptus.
- Keep individual system load low. If physical systems are over-provisioned with virtual machines (whether it be too many VMs running on a single system or few but resource intensive VMs that interfere with one another), the underlying operating system and Linux dependencies can become fragile and difficult to debug. Eucalyptus has many features that are designed to function even if the underlying system is underperforming and/or misbehaving, but it is always best to provide Eucalyptus and your workload environments enough resources to function smoothly.
- Consider bottlenecks. When designing a deployment, deciding on capacity to be provided to your applications and making capacity and performance hardware decisions, it is best to consider the data paths that Eucalyptus either provides or works in concert with at run-time. Please refer to the Datapaths series of diagrams to aid in identification of potential shared resource bottlenecks.
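One simple way to reason about the first point is to compare what the instances planned for a node request against what the node physically has. The Python sketch below computes CPU and memory overcommit ratios for a single node. The function name and the instance sizes in the example are hypothetical, and no particular ratio is recommended by this document; the figures are only a prompt for closer inspection.

```python
def overcommit(node_cores, node_ram_gb, instances):
    """Return (cpu_ratio, ram_ratio) for a node.

    `instances` is a list of (vcpus, ram_gb) tuples for the VMs placed on the node.
    A ratio above 1.0 means the instances request more than the node physically has.
    """
    total_vcpus = sum(vcpus for vcpus, _ in instances)
    total_ram = sum(ram_gb for _, ram_gb in instances)
    return total_vcpus / node_cores, total_ram / node_ram_gb


# Example: a minimum-spec node (8 cores, 32 GB) running the maximum of 16 instances,
# each assumed to use 1 vCPU and 1 GB of RAM.
cpu_ratio, ram_ratio = overcommit(8, 32, [(1, 1)] * 16)
print(cpu_ratio, ram_ratio)
```

In this example the node's CPUs are requested twice over while memory is half used, which suggests that CPU contention is the resource to watch on such a node.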