2012-11-06

Cost Effectiveness Analysis (CEA) of running Ray on Amazon EC2

Sample: SRA001125
URL: http://trace.ddbj.nig.ac.jp/DRASearch/submission?acc=SRA001125
DNA reads: 34911784 (2 * 17455892)
Read length (nt): 36
Technology: Illumina Genome Analyzer

API name: m1.large
2 Ray processes
Running time: 05:28:46
Pricing: 0.260 $ / h
Cost: 1.560 $

API name: m3.xlarge
4 Ray processes
Running time: 02:31:34
Pricing: 0.580 $ / h
Cost: 1.730 $

API name: cc2.8xlarge
32 Ray processes
Running time: 00:54:06
Pricing: 2.400 / h 
Cost: 2.400 $

Conclusions:

1. You get your results faster if you pay more.

2. For cc2.8xlarge, 33% (00:19:40) of the time was loading sequences from EBS.
That's a lot !

3. The scalability on this problem is not that good because the
problem size is not very large.

4. Amazon EC2 is really affordable for de novo assemblies of bacterial genomes. 
 
 
 
If you want to try these tests yourself => http://github.com/sebhtml/Ray-in-Amazon-EC2-CLOUD

No comments:

There was an error in this gadget