BSC's MareNostrum (2013)

Recently I visited the Barcelona Supercomputing Center (BSC) and UPC for the Computing Systems Week and the block review of the Euroserver project.

During my visit, I had the opportunity to see the MareNostrum supercomputer. MareNostrum is a supercomputer at the Barcelona Supercomputing Center, initially built in 2004. The initial setup contained around 3564 cores based on IBM 64-bit PowerPC 970MP processors running at 2.3 GHz. In November 2006 its capacity was increased to meet the large demand from scientific projects, reaching a calculation capacity of 94.21 Teraflops. That system was based on 2560 JS21 blade computing nodes built on the same processor, for 10,240 CPUs in total. The machine uses a Myrinet interconnect for communication.

In 2013 MareNostrum was upgraded and its CPU architecture changed. With the latest upgrade, MareNostrum has a peak performance of 1.1 Petaflops, with 48,896 Intel Sandy Bridge processor cores in 3056 nodes and 84 Xeon Phi 5110P coprocessors in 42 nodes, more than 100.8 TB of main memory, and 2 PB of GPFS disk storage. As of May 2014, MareNostrum held 34th place in the TOP500 list of the fastest supercomputers in the world. The machine provides 925.1 usable TFlops and consumes 1016 kW of electrical power.
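To put the last two figures in perspective, here is a small back-of-the-envelope sketch in Python. It only divides the numbers quoted above; the variable and function names are my own, and it assumes that the 48,896 figure counts processor cores spread evenly over the 3056 nodes.

```python
# Figures quoted in the post for the 2013 configuration of MareNostrum.
usable_tflops = 925.1   # usable performance, in TFlops
power_kw = 1016.0       # electrical power, in kW
cpu_cores = 48896       # assumption: this figure counts cores
nodes = 3056


def gflops_per_watt(tflops: float, kilowatts: float) -> float:
    """Convert a TFlops / kW pair into GFlops per watt."""
    gflops = tflops * 1e3    # 1 TFlops = 1000 GFlops
    watts = kilowatts * 1e3  # 1 kW = 1000 W
    return gflops / watts


efficiency = gflops_per_watt(usable_tflops, power_kw)
cores_per_node = cpu_cores / nodes

print(f"Energy efficiency: {efficiency:.2f} GFlops/W")
print(f"Cores per node: {cores_per_node:.0f}")
```

With the numbers above this works out to roughly 0.91 GFlops per watt and 16 cores per node.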

The beauty of the machine lies not in its computational power or its power efficiency, but in its location: it is installed inside a chapel! Here are some pictures from inside:


Computation Node based on Intel processors.

Power and cooling distribution.

View from the top of the Supercomputer.

View from the side. Orange and yellow cables are Myrinet optical links; blue cables are management Gigabit Ethernet.

View from the top.

View of Barcelona Supercomputer from the top. Note the cooling pumps at the end.
