2012-10-31

Ray on the Raspberry Pi

Jean-Francis Roy (Funtoo Core Team member) compiled and tested Ray on the Raspberry Pi.

The patch against Open-MPI 1.6.2 (based on this blog post):

Raspberry-Pi-openmpi-1.6.2.patch



Ray version: 2.1.0

RayPlatform 1.1.0


Command:

mpiexec -n 8 Ray \
 -test-network-only \
 -o \
 popo


Result:

# average and mode round trip latency in microseconds (10^-6 seconds) when requesting a reply for a message of 4000 bytes
# MessagePassingInterfaceRank    Name    ModeLatencyInMicroseconds    AverageLatencyInMicroseconds    NumberOfExchanges
# AverageForAllRanks: 1747
# StandardDeviation: 30.749
0    funtoo-pi    149    1775    1000
1    funtoo-pi    149    1734    1000
2    funtoo-pi    155    1798    1000
3    funtoo-pi    151    1779    1000
4    funtoo-pi    160    1738    1000
5    funtoo-pi    153    1706    1000
6    funtoo-pi    155    1717    1000
7    funtoo-pi    155    1729    1000
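
The two summary lines can be cross-checked against the per-rank averages. Here is a small Python sketch that does so, assuming the summary uses the population standard deviation (dividing by the number of ranks). That assumption is mine, but it reproduces the reported figure.

```python
import math

# AverageLatencyInMicroseconds column from the result above, ranks 0 to 7.
average_latencies = [1775, 1734, 1798, 1779, 1738, 1706, 1717, 1729]


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def population_std_dev(values):
    """Standard deviation dividing by N (assumed, not N - 1)."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))


print(mean(average_latencies))                          # 1747.0
print(round(population_std_dev(average_latencies), 3))  # 30.749
```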


2012-10-24

Seminar program of the infectious and immune diseases axis of the CHUQ


2012-2013

Seminar program of

the infectious and immune diseases axis of the CHUQ

Every Thursday at 3 pm (15:00) in the Bloc T amphitheatre
Coordinator: Sachiko Sato, extension 48647
______________________________________________________________________
October 18, 2012 Jean Sévigny's group

Dr. Jean Sévigny: The latest member of the ectonucleotidase family,
NTPDase8, from its identity to the KO mouse
______________________________________________________________________
October 25, 2012 Denis Leclerc's group

Dr. Denis Leclerc: The papaya that prevents, vaccinates against, and treats infectious diseases, or
the papaya that cures all ills
______________________________________________________________________
November 1, 2012
Dr. Adnane Sellam
Institut de Recherche en Immunologie et en Cancérologie (IRIC), Montréal

Transcriptional regulation of the infectious processes
of the polymorphic yeast Candida albicans
Invited by AMII
______________________________________________________________________
November 8, 2012
Dr. Réjean Lapointe
Centre de recherche du CHUM (CRCHUM Notre-Dame) and
Institut du Cancer de Montréal (ICM)

Control of the anti-tumor response: a question of balance?
Invited by Dr. D. Leclerc
______________________________________________________________________
November 14, 2012 WEDNESDAY
2 pm to 3 pm
Dr. Didier Mazel (Paul H. Roy)
Director of the Département Génomes et Génétique
Unité "Plasticité du Génome Bactérien" - CNRS UMR3525
Institut Pasteur
Integrons, the SOS response, and horizontal transfers: intimate connections
Invited by Dr. P. H. Roy
______________________________________________________________________
November 15, 2012
Cancelled (for the CECRI career day)
______________________________________________________________________
November 22, 2012 Marc Pouliot's group

Dr. Marc Pouliot: The neutrophil and inflammation: identifying endogenous resolution pathways
______________________________________________________________________
November 29, 2012
Dr. Martin Pelletier
Laboratory of Dr. Richard Siegel, NIAMS/NIH

Study of bioenergetic metabolism and cellular interactions
during the inflammatory response
Invited by AMII
______________________________________________________________________
December 6, 2012

Dr. Mohammad-Ali Jenabian
Chronic Viral Illness Service, McGill University Health Centre

Role of regulatory T lymphocytes in HIV infection: pathogenic or protective?
Invited by AMII
______________________________________________________________________
December 13, 2012
Dr. Jasna Kriz
Dept. Psychiatry and Neuroscience, CHUL-CHUQ Québec
Live imaging of innate immune response: galectins as emerging immunomodulatory molecules in the injured brain
Invited by Dr. S. Sato
______________________________________________________________________

2012-10-20

Message passing, MPI ranks, threads, and mini-ranks

Hey,

The name's Sébastien Boisvert (Sebastian GreenWood in English).

It's been a while since my last significant post. That's because I was busy.

What I have been up to

I have been busy for the last 4 months (July, August, September, October 2012) with these major tasks:

- preparing a manuscript about scalable metagenomics;
- submitting (and resubmitting owing to editorial rejections) my manuscript;
- coding (Ray plugins, RayPlatform engine);
- participating in a one-week workshop in Utah, U.S.A. in August 2012;
- visiting researchers at Argonne National Laboratory in October 2012;
- preparing the 2013 Compute Canada proposal for my director;
- helping with a Genome Canada grant application based on Ray plugins and RayPlatform;
- buying a new computer (my Samsung NP-NF210 lost some keys);
- working on a contract for the CLUMEQ supercomputing center (they are picky about confidential information).


Manuscript publication means more coding

I am about to submit my revised manuscript to the editor. This should give me more coding time once the manuscript ships.

New computer

My new computer is a Lenovo ThinkPad X230 running Fedora 17. I ditched Gentoo because I wanted systemd, the latest gcc (4.7.2) and the latest buggy Linux kernel (3.6.2). Unlike Unity in Ubuntu, which I dislike, I really like GNOME 3. One important thing for me is having applications grouped into categories instead of a brain-dead endless list.

Hardware is hierarchical like society

Most of today's processor architectures are not as flat as a slice of bread. Indeed, these electronic blueprints exhibit nested, hierarchical designs. For example, a supercomputer node usually has one or more sockets. Each of these sockets accommodates a single processor. In turn, a processor has one or more cores. And finally, a core has one or more hardware execution threads.

Software is (or should be) hierarchical

Any modern operating system abstracts 3 major components:

processors => processes;
physical memory => virtual memory;
storage appliances => virtual file systems.

When running Linux, each hardware execution thread is reported as a virtual processor in /proc/cpuinfo. A process more or less runs on one hardware thread at a time, but the software ecosystem also provides a nested way of organizing computing tasks: within any process managed by the operating system, there can be any number of threads running.
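
To make the /proc/cpuinfo remark concrete, here is a minimal Python sketch that counts the virtual processors listed in cpuinfo-style text. The function name and the sample excerpt are made up for illustration; real files contain many more fields, and the layout varies between architectures.

```python
def count_virtual_processors(cpuinfo_text):
    """Count the 'processor : N' entries in /proc/cpuinfo-style text."""
    count = 0
    for line in cpuinfo_text.splitlines():
        key, _, _ = line.partition(":")
        if key.strip() == "processor":
            count += 1
    return count


# Hypothetical two-thread excerpt (not copied from a real machine).
sample = """processor\t: 0
model name\t: Example CPU
processor\t: 1
model name\t: Example CPU
"""

print(count_virtual_processors(sample))  # 2
```

On a Linux machine, the same function can be fed the contents of the real file, read with `open("/proc/cpuinfo").read()`.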

MPI (message-passing interface) is not hierarchical

MPI (message-passing interface) is a standard for passing messages between processes. These processes need not be on the same machine; they can be remote processes. But because messages pass between processes, which all sit at the same level, there is no room for hierarchy.

Figure 1: The MPI programming model.

            +--------------------+
            |   MPI_COMM_WORLD   |           MPI communicator
            +---------+----------+
                      |
    +------+------+---+--+------+------+
    |      |      |      |      |      |
  +---+  +---+  +---+  +---+  +---+  +---+
  | 0 |  | 1 |  | 2 |  | 3 |  | 4 |  | 5 |    MPI ranks
  +---+  +---+  +---+  +---+  +---+  +---+


Meeting Professor Rick Stevens at Argonne National Laboratory

The one single hour spent in the office of Rick Stevens was very productive.

My opinion is that explicit hybrid models (MPI+OpenMP, MPI+pthreads, MPI+Windows threads) are nice and all, but they fall short because they lack uniformity. The programmer has to deal with two programming models (MPI and threads), both of which are arguably difficult on their own.

A pure thread-only application cannot scale beyond one node. And a pure MPI application does not use threads.

So the question was: is there a way to write your application with only message passing in mind (that's easier than with threads because it's lockless), while still running some of the computation in threads instead of in processes?

Thus was born the concept of mini-ranks in software!

Mini-ranks

The hierarchical design of mini-ranks was introduced in 2008 in the field of hardware memory subsystem design ("Mini-rank: Adaptive DRAM architecture for improving memory power efficiency", IEEE, 2008).

I did not get everything in this paper as I am no expert in that field.

But for mini-ranks in distributed programming, the idea is fairly simple. The application is coded as usual, using only one message inbox and one message outbox per rank. But instead of mapping each rank to a process, each rank is actually a mini-rank running in a thread. See the figure below.

Figure 2: The MPI programming model, with mini-ranks.

            +--------------------+
            |   MPI_COMM_WORLD   |           MPI communicator
            +---------+----------+
                      |
    +------+------+---+--+------+------+
    |      |      |      |      |      |
  +---+  +---+  +---+  +---+  +---+  +---+ 
  | 0 |  | 1 |  | 2 |  | 3 |  | 4 |  | 5 |    MPI ranks   (1 VirtualMachine.cpp instance per rank)
  +---+  +---+  +---+  +---+  +---+  +---+                    with the main for MPI calls
  |   |  |   |  |   |  |   |  |   |  |   |
  | 0 |  | 4 |  | 8 |  |12 |  |16 |  |20 |  |                
  | 1 |  | 5 |  | 9 |  |13 |  |17 |  |21 |  | => mini ranks
  | 2 |  | 6 |  |10 |  |14 |  |18 |  |22 |  |
  | 3 |  | 7 |  |11 |  |15 |  |19 |  |23 |  |  (1 Minirank instance per minirank (in 1 pthread))
  |   |  |   |  |   |  |   |  |   |  |   |
  +---+  +---+  +---+  +---+  +---+  +---+       (will wrap Machine.cpp and ComputeCore.cpp)
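
The numbering drawn in Figure 2 follows a simple block layout: with 4 mini-ranks per MPI rank, mini-rank 13 lives in MPI rank 3 at local position 1. The Python sketch below spells out that arithmetic. The function names are hypothetical, and I am only assuming that the layout generalizes the way the figure suggests. It is not taken from the RayPlatform source.

```python
MINI_RANKS_PER_RANK = 4  # as drawn in Figure 2; not a fixed constant


def to_global(mpi_rank, local_index, per_rank=MINI_RANKS_PER_RANK):
    """Global mini-rank number from (MPI rank, position inside that rank)."""
    return mpi_rank * per_rank + local_index


def to_local(mini_rank, per_rank=MINI_RANKS_PER_RANK):
    """(MPI rank, position inside that rank) from a global mini-rank number."""
    return divmod(mini_rank, per_rank)


print(to_local(13))     # (3, 1)
print(to_global(5, 3))  # 23
```

A message addressed to a mini-rank can then be routed in two steps: first to the MPI rank that hosts it, then to the right thread inside that rank.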



First, I tested the ability of spinlocks to synchronize everything.

Yesterday, I completed the port of RayPlatform to this mini-ranks programming model.

Porting Ray plugins to the new RayPlatform will be straightforward.

With this model, there is one MPI rank per node. One of the hardware threads does the communication, and each of the remaining hardware threads runs one mini-rank.
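
Here is a toy Python sketch of that division of labour inside a single process. Each mini-rank is a thread that only touches its own inbox and outbox, and the main thread plays the role of the communication thread. Everything in it is an illustration under my own assumptions: the class and function names are invented, `queue.Queue` stands in for the spinlock-protected buffers mentioned above, and all destinations are local. In the real design, messages for mini-ranks on other nodes would go through MPI.

```python
import queue
import threading

MINI_RANKS = 4   # hypothetical number of mini-ranks in this one process
STOP = object()  # sentinel telling a mini-rank to shut down


class MiniRank(threading.Thread):
    """One mini-rank: a thread with one inbox and one outbox."""

    def __init__(self, number, total):
        super().__init__()
        self.number = number
        self.total = total
        self.inbox = queue.Queue()
        self.outbox = queue.Queue()
        self.received = []

    def run(self):
        # Send one greeting to the next mini-rank (ring pattern).
        destination = (self.number + 1) % self.total
        self.outbox.put((destination, f"hello from {self.number}"))
        # Then consume the inbox until told to stop.
        while True:
            message = self.inbox.get()
            if message is STOP:
                break
            self.received.append(message)


def communicate(mini_ranks, expected_messages):
    """Communication role: move messages from outboxes to inboxes."""
    delivered = 0
    while delivered < expected_messages:
        for mini_rank in mini_ranks:
            try:
                destination, payload = mini_rank.outbox.get(timeout=0.01)
            except queue.Empty:
                continue
            mini_ranks[destination].inbox.put(payload)
            delivered += 1
    for mini_rank in mini_ranks:
        mini_rank.inbox.put(STOP)


def run_demo():
    mini_ranks = [MiniRank(i, MINI_RANKS) for i in range(MINI_RANKS)]
    for mini_rank in mini_ranks:
        mini_rank.start()
    communicate(mini_ranks, expected_messages=MINI_RANKS)
    for mini_rank in mini_ranks:
        mini_rank.join()
    return [mini_rank.received for mini_rank in mini_ranks]


print(run_demo())
```

The point of the sketch is that the mini-rank code never takes a lock itself. It only puts messages in its outbox and gets them from its inbox, which is the same view of the world a plain MPI rank has.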

