2015-08-03

The Spark Notebook from creator Kate Matsudaira

I received my Spark Notebook. What is the Spark Notebook, you may ask?
It is a notebook that combines form and function, and the project also raised its funding on Kickstarter.

The way I see it, the Spark Notebook is an agenda (a 6-month agenda) with additional features. These features include (from the guide) the yearly planning pages, accomplishments, the monthly planning pages, the weekly planning pages, the inspiration pages, and the note pages. The monthly planning pages include something called the 30-day challenge. According to creator Kate Matsudaira, the 30-day challenge feature is useful to help start (or break) a habit.

The Spark Notebook comes with an online guide. On the guide's website, some of the text is white over a background image, which makes it hard to read. For example, consider the screenshot below.

[Screenshot of the guide website]

How did I learn about the existence of the Spark Notebook?

One of my hobbies is watching videos on the Amazon Web Services (AWS) YouTube channel. I like these videos because they are usually easy to understand even when the subject is a bit abstract.

In particular, I watch the videos about AWS products. One example of this is the video called Introduction to Amazon S3 which explains in 3 minutes the general idea of the Internet storage service called S3 (Simple Storage Service).

The other type of videos that I like on the AWS YouTube channel are those about satisfied AWS customers. The customer experience videos are usually hosted by AWS Chief Evangelist Jeff Barr.

One of the videos I watched featured Kate Matsudaira, CTO of Decide.com. I like the customer experience videos because they usually mention what the customer is doing with AWS (value proposition / business model) and also list the AWS building blocks that the customer is using to make things happen.

So, in that video, I got to know more about Decide.com. Then, I continued my adventure. I searched for Decide.com to try their offering. However, I was not able to do so because Decide.com was acquired in 2013.

Since I could not try Decide.com, the next logical step was to look for Kate Matsudaira's next accomplishment. That's when I found popforms.

The company popforms provides "bite-size career development for the modern leader" through online courses. These courses are called sparks. I found this interesting, because I think self-reflection is important for improving who we want to be.

This company (popforms) was also acquired by a bigger fish (Safari Books Online).

At that point, I was thinking that this entrepreneur probably has a secret sauce, and that perhaps she shared it in the form of a book. This is how I discovered the Spark Notebook.

Also, when I discussed the Spark Notebook concept with my significant other, she said that she had been reading Kate Matsudaira's blog for years. So, Kate Matsudaira is an entrepreneur, technologist, creator, and also a role model.

I think that the Spark Notebook is a great product.

Product: Spark Notebook
Price: US$28.00
Purchase links: Manufacturer or Amazon
Score: 9/10

Pros:
- innovative form, impressive function
- compact size
- self-contained / self-explanatory
- online guide at http://www.thesparknotebook.com/guide
Cons:
- expensive for a 6-month agenda
- only 6 months

2015-07-24

The convergence of HPC and cloud computing

An exascale computer is a hypothetical system that can sustain one exaflops (10^18 floating point operations per second). Such a system is needed in science and engineering, mostly for simulating virtual versions of objects found in the real world, such as proteins, planes, and cities. Important requirements for such a computer are 1) memory bandwidth, 2) floating point operation throughput, 3) low network latency, and so on.

Two of the many challenges on the way to possibly having exascale supercomputers by 2020 are 1) improving fault tolerance and 2) lowering energy consumption (see "No Exascale for You!" An Interview with Berkeley Lab's Horst Simon).

One typical way to implement fault tolerance in HPC is the checkpoint/restart cycle, whereas most cloud technologies implement fault tolerance using different principles and abstractions, such as load balancing and replication (see the CAP theorem). Checkpoint/restart cannot work at exascale because, at this scale, there will almost always be a failing component. So, an exascale computation would need to survive such failures. In that regard, Facebook is a very large system that is fault-tolerant and that is based on cloud technologies rather than HPC.
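
To see why failures become the norm at very large scale, here is a minimal back-of-the-envelope sketch in Python. It assumes that component failures are independent and happen at a constant rate (an idealization), and the component count, the per-component MTBF, and the checkpoint duration are hypothetical values picked only for illustration.

```python
import math


def system_mtbf_hours(component_mtbf_hours, n_components):
    """Mean time between failures of the whole system.

    Assumes independent components, each failing at the constant
    rate 1 / component_mtbf_hours, so the failure rates simply add up.
    """
    return component_mtbf_hours / n_components


def prob_failure_during(duration_hours, component_mtbf_hours, n_components):
    """Probability that at least one component fails within duration_hours."""
    system_rate = n_components / component_mtbf_hours
    return 1.0 - math.exp(-system_rate * duration_hours)


# Hypothetical numbers, for illustration only.
COMPONENT_MTBF_HOURS = 5 * 365 * 24  # each component fails once every 5 years on average
N_COMPONENTS = 1_000_000             # number of components in the machine
CHECKPOINT_HOURS = 0.5               # time needed to write one checkpoint

if __name__ == "__main__":
    mtbf_minutes = 60 * system_mtbf_hours(COMPONENT_MTBF_HOURS, N_COMPONENTS)
    p = prob_failure_during(CHECKPOINT_HOURS, COMPONENT_MTBF_HOURS, N_COMPONENTS)
    print(f"System MTBF: {mtbf_minutes:.1f} minutes")
    print(f"Probability of a failure during one checkpoint: {p:.5f}")
```

With these made-up numbers, the machine as a whole fails every few minutes on average, so a checkpoint that takes half an hour to write would almost never complete.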

The fact that fault tolerance has been figured out for a while now in cloud technologies allowed the cloud community to solve other important problems. One active area of development in cloud computing in 2015 has been without a doubt that of orchestration and provisioning. HPC is still making progress on solving the fault-tolerance problem in the HPC context.

Abstractions


A significant body of research output is steadily coming from UC Berkeley's AMPLab and other research groups, and also from Internet companies (Google, Facebook, Amazon, Microsoft, and others). The "cloud stack" (see all the Apache projects, like Spark, Mesos, ZooKeeper, Cassandra) covers a significant part of today's market needs (datacenter abstraction, distributed databases, map-reduce abstractions at scale). What I mean here is that anyone can get started very quickly with all these off-the-shelf components, typically using high-level abstractions (such as Spark's Resilient Distributed Datasets, or RDDs). Further, these off-the-shelf building blocks can be deployed very easily in various cloud environments, whereas this is rarely the case in HPC.

One observation that can be made is that HPC always wants the highest processing speed, usually on bare metal. This low level of abstraction comes with the convenience that things are built on very few abstractions (typically MPI libraries and job schedulers).

On the other hand, abstractions abound in the cloud world. Things are evolving much faster in the cloud than in HPC (see "HPC is dying, and MPI is killing it").



But... I need a fast network for my HPC workflow

One thing that is typically associated with HPC and not with the cloud is the concept of having a very fast network. But this fast-network gap is closing, and the cloud is catching up in that regard. Recently, Microsoft added RDMA in Windows Azure. Thus, the cloud now technically offers low latency (in microseconds) and high bandwidth (40 Gbps). This is no longer an exclusive feature of HPC.
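
To give a feel for what these two numbers mean for a workflow, here is a small sketch in Python using the simple "latency plus size over bandwidth" cost model. The 40 Gbps bandwidth is the figure quoted above; the 2-microsecond latency and the message sizes are assumptions chosen only for illustration, and real networks have additional overheads that this model ignores.

```python
def transfer_time_seconds(message_bytes, latency_seconds, bandwidth_bits_per_second):
    """Estimated time to send one message: fixed latency plus serialization time."""
    return latency_seconds + (message_bytes * 8) / bandwidth_bits_per_second


LATENCY_SECONDS = 2e-6   # assumption: "microseconds" taken as 2 microseconds
BANDWIDTH_BPS = 40e9     # 40 Gbps, as quoted in the text

if __name__ == "__main__":
    # Hypothetical message sizes: a single double, 1 MB, and 1 GB.
    for size in (8, 1_000_000, 1_000_000_000):
        t = transfer_time_seconds(size, LATENCY_SECONDS, BANDWIDTH_BPS)
        print(f"{size:>13} bytes: {t:.9f} s")
```

In this model, small messages are dominated by latency and large messages by bandwidth, which is why tightly coupled HPC codes that exchange many small messages care so much about latency.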

The network is the computer

In the end, as Sun Microsystems's John Gage said, "The Network is the Computer." The HPC stack is already converging toward what is found in the web/cloud/big data stack (see this piece). There are significant advances in cloud networking too (such as software-defined networks, convenient/automated network provisioning, and performance improvements). So, the prediction that can perhaps be made today is that HPC and cloud will no longer be two solitudes in the not-so-distant future. HPC will benefit from the cloud and vice versa.

What the future holds in this ongoing convergence will be very exciting.


References
-----------------

Daniel A. Reed, Jack Dongarra
Exascale Computing and Big Data
Communications of the ACM, Vol. 58 No. 7, Pages 56-68, doi:10.1145/2699414
http://cacm.acm.org/magazines/2015/7/188732-exascale-computing-and-big-data/fulltext
This survey paper is very comprehensive and highlights how HPC (called exascale computing even though there is no operational exascale computer as of today) and cloud can meet at the crossroads.

Tiffany Trader
Fastest Supercomputer Runs Ubuntu, OpenStack
HPCwire, May 27, 2014
http://www.hpcwire.com/2014/05/27/fastest-supercomputer-runs-ubuntu-openstack/

This article reports on a very large supercomputer that is running OpenStack instead of the classic HPC schedulers (like MOAB, SGE, Cobalt, Maui).

Jonathan Dursi
HPC is dying, and MPI is killing it
R&D computing at scale, 2015-04-03
http://www.dursi.ca/hpc-is-dying-and-mpi-is-killing-it/

This piece is a provocative, yet realistic, depiction of the current popularity of various HPC and cloud technologies (surveyed using Google Trends).