I have been beating the Maintainability drum lately, and highlighting what the latest Eucalyptus did in that regard. I'm not done yet. This time I want to change the angle of approach, focusing more on the Map Workload to Cloud Resources step, using examples and some back-of-the-envelope calculations.
Will my Internet be Faster?
Back in the day, I helped a user through the installation of an ancient Eucalyptus (I think it was 1.4), and after some hurdles (there was no Faststart back then), he finally got instances up and running on his two-node home setup. Then he asked "Will my Internet connection be faster now?".
I think the question says more about how Cloud Computing has become a buzzword, perceived as a panacea for all IT problems. But it is also a reminder of the need to understand the underlying hardware Infrastructure in order to achieve the desired performance. An on-premise Cloud can only go as fast as the underlying Infrastructure, and will have only as much capacity as the hardware supporting it. The Cloud provides tremendous flexibility, yet there is also the risk of under-performing if the Physical Infrastructure is not prepared for the workload.
Case In Point: Cloud Storage
At the end of the day, all Cloud Resources map to physical resources: RAM, CPU, network, disks. I will now focus on the Cloud Storage story because of its importance (anyone cares about their data?), and because historically it is where there have been some interesting issues. In particular Eucalyptus 3.1.2 was needed because high disk load caused seemingly random failures.
From my chicken scratch of the first figure, you should see how Ephemeral Storage resides on the NCs (Node Controllers), while EBS (Elastic Block Store) is handled by the SC (Storage Controller). Let's quickly rehash when the two are used:
- Ephemeral: both instance-store and bfEBS (boot from EBS) instances use Ephemeral, although instance-store instances also use Ephemeral Storage for root and swap;
- EBS: any instance with an attached Volume uses EBS, and bfEBS instances use a Volume for root and swap.
and which kind of contention there is:
- Ephemeral: any instance running on the same NC will compete for the same storage;
- EBS: all instances using EBS within the same availability zone (Eucalyptus parlance for cluster) will access and use the same SC.
I used a simple spreadsheet to aid my examples. Feel free to copy it, play with it, enhance it, but please consider it a learning tool and not a real calculator: way too many variables have been left behind for the sake of simplicity.
In my examples I will measure the underlying storage speed with IOPS.
[Figure: IOPS values may vary dramatically. The above may be used only to have an indication of the expected performance.]
In the following examples, I will make the very unreasonable assumption that instances access all their storage equally (both Ephemeral and EBS), and that they use it either 20% or 100% of the time. Moreover, in the 20% case, an oracle minimizes the concurrent disk access of all instances (i.e. if there are fewer than five running instances, they will not compete at all and will see the full speed of the storage).
Thus one is a very light scenario, where the instances are mainly idle, while the other (100%) assumes the instances are running benchmarks. Starting instances is a fairly disk-intensive process, first because Eucalyptus needs to prepare the image to boot (which involves copying multi-GB files), and then because the OS has to read the disk while booting. I added a column to the spreadsheet to show the impact of starting instances on the light workload.
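The oracle assumption can be written down in a couple of lines. This is only my reading of it, not the actual spreadsheet formula: the function name `effective_iops` and the rule "average concurrency = consumers x duty cycle, never less than one" are assumptions made for illustration.

```python
def effective_iops(raw_iops: float, consumers: int, duty_cycle: float) -> float:
    """Rough IOPS seen by each consumer of a shared disk (assumed model).

    raw_iops   -- what the backing storage can deliver in total
    consumers  -- instances/Volumes sharing that storage
    duty_cycle -- fraction of time each consumer does I/O (0.2 or 1.0 here)
    """
    if consumers <= 0:
        return float(raw_iops)
    # The oracle spreads the accesses out perfectly, so on average only
    # consumers * duty_cycle of them touch the disk at the same time.
    # With fewer than five consumers at 20% nobody competes at all.
    concurrent = max(1.0, consumers * duty_cycle)
    return raw_iops / concurrent


if __name__ == "__main__":
    for n, duty in [(4, 0.2), (8, 0.2), (8, 1.0)]:
        print(f"{n} consumers at {duty:.0%}: "
              f"{effective_iops(150, n, duty):.1f} IOPS each")
```

With a 150 IOPS disk, four light consumers still see the whole disk, while eight consumers running flat out are left with a small fraction of it each.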
A small Cloud installation will most likely have the SC backed by local storage. Let's use an IOPS calculator to estimate the performance. Here I will use two Seagate Cheetah 15K rpm disks in RAID 0, which gives about 343 IOPS (I will round it to 350). For the NC, I will assume 150 IOPS, which should be a reasonably fast single disk (non-SSD).
For a Home setup, three NCs seems a good number to me. Each NC should have enough cores and RAM to allow more than ten running instances (12-24 cores and 12-24 GB RAM should do). If I run one instance-store instance, one bfEBS instance, and have one Volume per NC, the very unrealistic calculator gives:
[Figure: Light load on the home setup: slowest storage is still comparable to a 5400RPM disk.]
Not bad for my Home setup. Even if the instances were to run iozone on all the disks, I would still see the performance of a slow 5400RPM disk. Now, let me create more load: four instance-store instances, four bfEBS instances, and two Volumes in use per NC:
[Figure: The home setup with a heavier load doesn't do that well: instances may see performance as slow as a floppy drive.]
That's a bit more interesting. If the instances are very light in disk usage, they will see the performance of a 7200RPM disk, but under heavier load they will be using something barely faster than a floppy. Ouch!
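The two Home setup loads can be reproduced with a small script. Take it as a sketch under my own assumptions, not as the spreadsheet itself: the `Scenario` class and the `share` and `report` helpers are hypothetical names, every bfEBS root and every Volume is counted as one consumer of the single SC, and every instance on an NC is counted as one consumer of its local disk.

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    ncs: int             # number of Node Controllers in the zone
    nc_iops: float       # local disk of each NC (Ephemeral)
    sc_iops: float       # storage behind the single SC (EBS)
    instance_store: int  # instance-store instances per NC
    bfebs: int           # boot-from-EBS instances per NC
    volumes: int         # extra attached Volumes per NC

    def ephemeral_consumers(self) -> int:
        # Both kinds of instance use the local disk of their own NC.
        return self.instance_store + self.bfebs

    def ebs_consumers(self) -> int:
        # Every bfEBS root and every Volume in the zone lands on the same SC.
        return self.ncs * (self.bfebs + self.volumes)


def share(raw_iops: float, consumers: int, duty: float) -> float:
    # Assumed oracle model: perfect spreading of the accesses.
    return raw_iops / max(1.0, consumers * duty)


def report(s: Scenario) -> dict:
    """Map duty cycle -> (Ephemeral IOPS, EBS IOPS) per consumer."""
    return {
        duty: (share(s.nc_iops, s.ephemeral_consumers(), duty),
               share(s.sc_iops, s.ebs_consumers(), duty))
        for duty in (0.2, 1.0)
    }


home_light = Scenario(ncs=3, nc_iops=150, sc_iops=350,
                      instance_store=1, bfebs=1, volumes=1)
home_heavy = Scenario(ncs=3, nc_iops=150, sc_iops=350,
                      instance_store=4, bfebs=4, volumes=2)

if __name__ == "__main__":
    for name, s in [("light", home_light), ("heavy", home_heavy)]:
        for duty, (eph, ebs) in report(s).items():
            print(f"{name} @ {duty:.0%}: "
                  f"Ephemeral {eph:.0f} IOPS, EBS {ebs:.0f} IOPS")
```

Under these assumptions the heavier load at 100% leaves each consumer with roughly 19 IOPS on both Ephemeral and EBS, which is the floppy-like territory described above, while at 20% the same load stays in 7200RPM range.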
A More Enterprisy setup
From the previous example, it is fairly obvious why bigger installations tend to use a SAN for their SC storage back-end. For this example I will use a Dell Equallogic, in a setup that gives 5000 IOPS. Correspondingly, the number of NCs is increased to 10.
Let's start with a light load: one instance-store, one bfEBS, and one Volume per NC (similarly to the Home setup, although now there is a total of 20 running instances).
[Figure: A SAN backed cloud with a light load: pretty good all around.]
The results are pretty good with access to EBS around 250 IOPS under heavy load, and very fast access on the light load. Even Ephemeral compares well with a 3.5" desktop-class disk.
Now I will run more instances: four instance-store instances, four bfEBS instances, and four Volumes per NC.
[Figure: A SAN backed cloud with a heavier load: EBS is now comparable to a 5400 RPM disk under heavy load.]
Ephemeral still takes a beating: as in the Home setup case, there are eight different instances using the same local disk (bfEBS instances have access to Ephemeral too, and in my simplistic approach all disks are used at the same rate). EBS slowed down quite a bit, and now it compares to a slow desktop-class disk. Although the instances should still have enough IOPS to access storage, perhaps it is time to start thinking about adding a second availability zone to this setup.
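To get a feeling for what a second availability zone would buy, here is a rough sketch. It assumes (my assumptions, not measurements) that each zone gets its own SC backed by an equally fast 5000 IOPS SAN, and that bfEBS instances and Volumes split evenly between the zones; `ebs_iops_per_consumer` is a hypothetical helper.

```python
import math


def ebs_iops_per_consumer(total_consumers: int, zones: int,
                          sc_iops: float, duty_cycle: float = 1.0) -> float:
    """EBS IOPS per consumer when the load is split evenly across zones."""
    if zones < 1:
        raise ValueError("need at least one availability zone")
    # Each zone has its own SC, so only the consumers of that zone compete.
    per_zone = math.ceil(total_consumers / zones)
    return sc_iops / max(1.0, per_zone * duty_cycle)


if __name__ == "__main__":
    # 10 NCs, each with four bfEBS instances and four Volumes.
    consumers = 10 * (4 + 4)
    for zones in (1, 2):
        iops = ebs_iops_per_consumer(consumers, zones, sc_iops=5000)
        print(f"{zones} zone(s): about {iops:.1f} IOPS per EBS consumer")
```

Halving the number of consumers per SC doubles the share of each one, which would move EBS under heavy load from slow-desktop-disk range back to something more comfortable. Ephemeral is untouched by this change, since it depends only on how many instances share each NC.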
The above examples didn't consider Snapshots at all. Snapshots allow you to back up Volumes, and to move them across availability zones (Volumes can be created from a Snapshot in any availability zone). Snapshots reside on Walrus, which means that every time a Snapshot is created, a full copy of the Volume is taken on the SC and sent to Walrus. If Snapshots are frequent on this Cloud, it is easy to see how the SC, Walrus, and the Network can become taxed serving them.
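A quick estimate shows how fast Snapshot traffic adds up. The numbers below (100 GB Volumes, 24 Snapshots a day, a 1 Gbit/s link between SC and Walrus) are made up for illustration, and the sketch assumes the full Volume is transferred every time with no compression and no other traffic on the link; `snapshot_cost` is a hypothetical helper.

```python
def snapshot_cost(volume_gb: float, snapshots_per_day: int,
                  link_gbit_per_s: float) -> tuple[float, float]:
    """Return (GB sent to Walrus per day, hours the link is kept busy)."""
    # Every Snapshot is a full copy of the Volume.
    data_gb = volume_gb * snapshots_per_day
    # 1 GB = 8 Gbit (decimal units), link speed is in Gbit/s.
    seconds = data_gb * 8 / link_gbit_per_s
    return data_gb, seconds / 3600


if __name__ == "__main__":
    data_gb, hours = snapshot_cost(volume_gb=100, snapshots_per_day=24,
                                   link_gbit_per_s=1)
    print(f"{data_gb:.0f} GB/day to Walrus, link busy about {hours:.1f} h/day")
```

On top of the network time, each of those copies is also read and written on the SC disks, which eats into the IOPS left for running instances.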
I would take all the above numbers as a best-case scenario for their respective cases. A lot of variables have been ignored, starting with the network, as well as other disk accesses. For example, Eucalyptus provides swap by default to instance-store instances, and the typical Linux installation creates swap (i.e. bfEBS instances will most likely have swap), hence any instance running out of RAM will start bogging down the respective disk.
There was also the assumption that not only is the load independent, but the instances cooperate to make sure they play nice with the disk. Finally, in a production Cloud a certain mix of operations is to be expected: starting and terminating instances, creating Volumes, and creating Snapshots will all increase the load on both kinds of Storage (Ephemeral and EBS) accordingly.
As I mentioned in my Maintainability post, having a proper workload example will allow you to properly test and tune the cloud to satisfy your users.
Making Internet Faster
In the above examples, I pulled off some back-of-the-envelope calculations which do not consider the software Infrastructure at all (i.e. they don't consider Eucalyptus overhead). The impact of Eucalyptus on the Physical Infrastructure has been constantly decreasing. Just to mention a few of the improvements: before Eucalyptus 3, the NC would make straight copies of multi-GB files, whereas now it uses device mapper to keep disk access to the bare minimum. And the SC, alongside the SAN plugins, now has DASManager (Direct Attached Storage, i.e. a partition or a disk), which allows it to bypass the file system when dealing with Volumes.
There has been a nice performance boost with Eucalyptus 3, but there is still room for improvement, and no option has been left unexplored, from using distributed file systems as back-end to employing SDN. Although Eucalyptus may not be able to make the Internet faster yet, it is for sure trying hard.