
BSC's MareNostrum (2013)

I recently visited the Barcelona Supercomputing Center (BSC) and UPC for Computing Systems Week and the Block Review of the Euroserver project.

During my visit, I had the opportunity to see the MareNostrum supercomputer. MareNostrum is a supercomputer at the Barcelona Supercomputing Center, initially built in 2004. The initial setup contained around 3564 cores based on IBM 64-bit PowerPC 970MP processors running at 2.3 GHz. In November 2006 its capacity was increased to meet the large demand from scientific projects, bringing its calculation capacity to 94.21 Teraflops. The upgraded system was built from 2560 JS21 blade computing nodes using the same processor, for 10,240 CPUs in total. The machine uses a Myrinet interconnect for communication.

In 2013 MareNostrum was upgraded and its CPU architecture changed. With the latest upgrade, MareNostrum has a peak performance of 1.1 Petaflops, with 48,896 Intel Sandy Bridge processor cores in 3056 nodes and 84 Xeon Phi 5110P in 42 nodes, more than 100.8 TB of main memory, and 2 PB of GPFS disk storage. As of May 2014, MareNostrum held 34th place in the TOP500 list of the fastest supercomputers in the world. The machine provides 925.1 usable TFlops and consumes 1016 kW of electrical power.
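As a quick sanity check on the numbers above, here is a small Python sketch that derives cores per node, energy efficiency, and the usable fraction of peak. The constants are the figures quoted in this post; the function names are my own, and the peak value is the rounded 1.1 Petaflops, so the last ratio is only approximate.

```python
# Figures quoted in the post for MareNostrum after the 2013 upgrade.
CORES = 48_896            # Intel Sandy Bridge cores
NODES = 3_056             # compute nodes
PEAK_TFLOPS = 1_100.0     # "1.1 Petaflops" (rounded, so treat as approximate)
USABLE_TFLOPS = 925.1     # usable TFlops
POWER_KW = 1_016.0        # electrical power in kW


def cores_per_node(cores, nodes):
    """Average number of cores in each node."""
    return cores / nodes


def gflops_per_watt(tflops, kilowatts):
    """Energy efficiency in GFlops per watt.

    TFlops -> GFlops and kW -> W both scale by 1000, so the factors cancel.
    """
    return (tflops * 1e3) / (kilowatts * 1e3)


def usable_fraction(usable_tflops, peak_tflops):
    """Share of the peak performance that is actually usable."""
    return usable_tflops / peak_tflops


if __name__ == "__main__":
    print("cores per node:", cores_per_node(CORES, NODES))
    print("GFlops per watt: %.3f" % gflops_per_watt(USABLE_TFLOPS, POWER_KW))
    print("usable / peak:   %.1f%%" % (100 * usable_fraction(USABLE_TFLOPS, PEAK_TFLOPS)))
```

With these inputs the core count works out to exactly 16 per node, the efficiency to roughly 0.91 GFlops per watt, and the usable performance to about 84% of the quoted peak.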

The beauty of the machine lies not in its computational power or its power efficiency, but in its location: it is housed inside a chapel! Here are some pictures from inside:


Computation Node based on Intel processors.

Power and cooling distribution.

View from the top of the Supercomputer.

View from the side. Orange and yellow cables are Myrinet optical links; blue cables are management Gbit Ethernet.

View from the top.

View of Barcelona Supercomputer from the top. Note the cooling pumps at the end.
