Measuring latency in the cloud

- January 25, 2013

Hey,

I started 10 cc2.8xlarge instances on Amazon EC2 using the spot instance market.

The cc2.8xlarge spot rate sits at 0.270 $/h. The on-demand rate is 2.400 $/h.
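To put the two rates in perspective, here is a small sketch of the hourly bill for the whole group. It assumes the 0.270 $/h figure is the spot price actually paid for each of the 10 instances; spot prices fluctuate, so treat the numbers as a snapshot. The names `hourly_cost`, `SPOT_RATE`, and `ON_DEMAND_RATE` are mine.

```python
# Hourly prices quoted in the post, in US dollars per instance-hour.
SPOT_RATE = 0.270        # assumed to be the spot price paid
ON_DEMAND_RATE = 2.400
INSTANCES = 10           # number of cc2.8xlarge instances started


def hourly_cost(rate, instances=INSTANCES):
    """Total cost per hour for the whole group at a per-instance rate."""
    return rate * instances


spot_total = hourly_cost(SPOT_RATE)
on_demand_total = hourly_cost(ON_DEMAND_RATE)
# Fraction saved by using spot instead of on-demand.
discount = 1 - SPOT_RATE / ON_DEMAND_RATE

print(f"spot:      {spot_total:.2f} $/h for {INSTANCES} instances")
print(f"on-demand: {on_demand_total:.2f} $/h for {INSTANCES} instances")
print(f"discount:  {discount:.0%}")
```

So the whole 10-instance group costs about as much per hour on the spot market as a single instance and a bit does on demand.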

Results

Table 1: Cloud latencies. Number of instances was 10 and the instance type was cc2.8xlarge.

MPI ranks	MPI ranks per instance	Average roundtrip latency (microseconds)
10	1	235.200785
20	2	313.960899
40	4	365.403384
80	8	473.863469
120	12	563.779653
160	16	322.942884
200	20	258.164757
250	25	220.151894
320	32	280.532563
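As a quick sanity check on Table 1, the sketch below re-enters the rows by hand, verifies that each rank count equals 10 instances times the ranks per instance, and picks out the lowest and highest averages. Nothing here comes from latency_checker itself; the variable names are mine.

```python
# Rows of Table 1:
# (MPI ranks, MPI ranks per instance, average roundtrip latency in microseconds)
TABLE_1 = [
    (10, 1, 235.200785),
    (20, 2, 313.960899),
    (40, 4, 365.403384),
    (80, 8, 473.863469),
    (120, 12, 563.779653),
    (160, 16, 322.942884),
    (200, 20, 258.164757),
    (250, 25, 220.151894),
    (320, 32, 280.532563),
]
INSTANCES = 10

# Every row should satisfy: total ranks = instances * ranks per instance.
consistent = all(
    ranks == INSTANCES * per_instance for ranks, per_instance, _ in TABLE_1
)

# Rows with the lowest and highest average roundtrip latency.
lowest = min(TABLE_1, key=lambda row: row[2])
highest = max(TABLE_1, key=lambda row: row[2])

print("rows consistent:", consistent)
print("lowest: ", lowest)
print("highest:", highest)
```

The latency is not monotonic in the number of ranks: it peaks at 12 ranks per instance and is lowest at 25 ranks per instance.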

Instances

The specification of one cc2.8xlarge instance is:

Cluster Compute Eight Extra Large Instance

60.5 GiB of memory
88 EC2 Compute Units (2 x Intel Xeon E5-2670, eight-core)
3370 GB of instance storage
64-bit platform
I/O Performance: Very High (10 Gigabit Ethernet)
EBS-Optimized Available: No*
API name: cc2.8xlarge

(That's 32 vCPUs with hyperthreading!)

These are the 10 instances in my placement group:

i-b078b9c0: ec2-174-129-96-90.compute-1.amazonaws.com
i-ba78b9ca: ec2-54-234-6-223.compute-1.amazonaws.com
i-b278b9c2: ec2-107-21-145-1.compute-1.amazonaws.com
i-b678b9c6: ec2-204-236-254-68.compute-1.amazonaws.com
i-b878b9c8: ec2-50-16-171-238.compute-1.amazonaws.com
i-bc78b9cc: ec2-54-234-75-41.compute-1.amazonaws.com
i-a678b9d6: ec2-23-22-107-180.compute-1.amazonaws.com
i-a878b9d8: ec2-23-22-77-54.compute-1.amazonaws.com
i-aa78b9da: ec2-23-21-6-134.compute-1.amazonaws.com
i-9678b9e6: ec2-50-16-135-211.compute-1.amazonaws.com

Software used to measure latency

I will test the latency using latency_checker.

Latency checker uses an any-to-any communication pattern, and the default message size is 4000 bytes (excluding the envelope).
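To give a feel for how fast an any-to-any pattern grows, here is a rough sketch that counts the ordered (source, destination) pairs for the rank counts used in this post. It assumes every rank exchanges with every rank, itself included; it does not model how latency_checker actually schedules its messages. The function names are hypothetical.

```python
from itertools import product


def any_to_any_pairs(ranks):
    """All ordered (source, destination) pairs, self-pairs included."""
    return list(product(range(ranks), repeat=2))


def pair_count(ranks):
    """Number of ordered pairs without enumerating them: ranks squared."""
    return ranks * ranks


# Rank counts used in the runs reported in this post.
RANK_COUNTS = [10, 20, 40, 80, 120, 160, 200, 250, 320]

for n in RANK_COUNTS:
    print(f"{n:4d} ranks -> {pair_count(n):6d} ordered pairs")
```

Going from 10 to 320 ranks multiplies the number of ranks by 32 but the number of pairs by 1024, which is why this pattern is a decent stress test.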

First, we have to set up SSH keys.

I then tested the 10 links:

[ec2-user@ip-10-156-160-75 ~]$ cat hosts.txt
ec2-174-129-96-90.compute-1.amazonaws.com
ec2-54-234-6-223.compute-1.amazonaws.com
ec2-107-21-145-1.compute-1.amazonaws.com
ec2-204-236-254-68.compute-1.amazonaws.com
ec2-50-16-171-238.compute-1.amazonaws.com
ec2-54-234-75-41.compute-1.amazonaws.com
ec2-23-22-107-180.compute-1.amazonaws.com
ec2-23-22-77-54.compute-1.amazonaws.com
ec2-23-21-6-134.compute-1.amazonaws.com
ec2-50-16-135-211.compute-1.amazonaws.com

[ec2-user@ip-10-156-160-75 ~]$ for i in $(cat hosts.txt ); do ssh $i hostname; done
ip-10-156-160-75
ip-10-156-210-34
ip-10-156-164-15
ip-10-156-210-142
ip-10-156-209-237
ip-10-156-210-243
ip-10-156-160-214
ip-10-156-161-58
ip-10-156-224-53
ip-10-156-225-18

That's great, let's install packages on each instance:

[ec2-user@ip-10-156-160-75 ~]$ sudo yum install -y git openmpi openmpi-devel gcc make; exit

[ec2-user@ip-10-156-160-75 ~]$ mkdir software; cd software
[ec2-user@ip-10-156-160-75 ~]$ git clone git://github.com/sebhtml/latency_checker.git
Cloning into latency_checker...
remote: Counting objects: 51, done.
remote: Compressing objects: 100% (26/26), done.
remote: Total 51 (delta 27), reused 49 (delta 25)
Receiving objects: 100% (51/51), 25.14 KiB, done.
Resolving deltas: 100% (27/27), done.

[ec2-user@ip-10-156-160-75 ~]$ cd latency_checker/
[ec2-user@ip-10-156-160-75 latency_checker]$ load_openmpi
[ec2-user@ip-10-156-160-75 latency_checker]$ make
mpicc -O3 -Wall -ansi -DASSERT -c main.c -o main.o
mpicc -O3 -Wall -ansi -DASSERT -c process.c -o process.o
mpicc -O3 -Wall -ansi -DASSERT main.o process.o -o latency_checker

The last step before the actual tests is to copy the executable to all 10 instances.

[ec2-user@ip-10-156-160-75 ~]$ for i in $(cat hosts.txt ); do scp software/latency_checker/latency_checker $i:; done

Measuring latency

[ec2-user@ip-10-156-160-75 ~]$ /usr/lib64/openmpi/bin/mpiexec -hostfile hosts.txt -n 10 ./latency_checker -exchanges 10000 -message-size 4000|grep "average of average roundtrip latencies"
Rank 0 -> average of average roundtrip latencies: 235.200785 microseconds
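If you collect several runs, pulling the number out of that summary line by hand gets tedious. The sketch below parses the line shown above with a regular expression and also computes a rough one-way estimate by halving the round trip, which assumes both directions take about the same time. The pattern is based only on the single output line in this post, and `parse_roundtrip_us` is a hypothetical helper, not part of latency_checker.

```python
import re

# Pattern for the summary line printed by rank 0 in the run above.
SUMMARY = re.compile(
    r"average of average roundtrip latencies: ([0-9.]+) microseconds"
)


def parse_roundtrip_us(output):
    """Return the reported average roundtrip latency in microseconds, or None."""
    match = SUMMARY.search(output)
    return float(match.group(1)) if match else None


line = "Rank 0 -> average of average roundtrip latencies: 235.200785 microseconds"
roundtrip_us = parse_roundtrip_us(line)

# Rough one-way estimate, assuming symmetric paths (an assumption, not a measurement).
one_way_us = roundtrip_us / 2

print(f"roundtrip: {roundtrip_us} us, one-way estimate: {one_way_us:.1f} us")
```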

Software versions:

latency_checker v1.0.0
gcc (GCC) 4.6.2 20111027 (Red Hat 4.6.2-2)
openmpi 1.5.4 mockbuild@gobi-build-31004.sea31.amazon.com
Linux 3.2.30-49.59.amzn1.x86_64

Comments

Unknown said…

This is measuring latency between all 10 instances and themselves right? Are they all in the same availability zone? Seems pretty high numbers if that's the case.

Friday, January 25, 2013 at 10:43:00 PM EST

sebhtml said…

Hey,

> This is measuring latency between
> all 10 instances and themselves
> right?

Yes. However, there is more than one MPI rank on each instance, and each MPI rank talks to all the others, including itself (see the table).

This any-to-any communication pattern is known to be a bad design, so it's a good test to stress instances.

For my research project, I route my messages using a polytope to avoid the insane any-to-any communication pattern except when I am using good interconnects such as Cray XE6, Blue Gene/Q, or Intel QLogic.

Finally, the messages here are not empty shells -- they are 4000-byte messages in both directions.

The latency is for roundtrip transit, so I suppose you have to divide by 2 to get the one-way latency.

> Are they all in the same availability zone?

I assume they were, because I provided a placement group in my spot requests. But hey, all I know is that it's running in the cloud.

> Seems pretty high numbers if that's the case.

Yes, I think that these are high numbers too. Maybe I did something wrong.

What numbers were you expecting?

Monday, January 28, 2013 at 10:08:00 AM EST
