Go fast: memory-backed storage and how to use it

Hard drives are slow. Whether we are talking about HDDs or SSDs matters less than what we compare them to. In the context of high-performance computing, that is memory (RAM). To put this into perspective, here are some common approximate timings alongside a more intuitive “human scale” in which a fast lookup (an L1 cache access) is stretched to one second, roughly the time a fast human lookup takes.

What                              Actual    Human scale
Time for light to travel 30 cm    1 ns      0.5 s
L1 cache                          2 ns      1 s
One core to another               70 ns     35 s
Memory random access              100 ns    50 s
InfiniBand random access          1 µs      8 min
SSD random access                 100 µs    14 h
Read 1 MB from memory             250 µs    1.4 d
HDD random access                 8 ms      46 d
Read 1 MB from Lustre             8 ms      46 d
Read 1 MB from network            10 ms     57 d
Read 1 MB from disk               30 ms     0.5 years
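
The human-scale column follows from stretching the 2 ns L1 cache access to 1 s, which amounts to dividing the time in nanoseconds by two. A minimal sketch of that arithmetic, where the function name human_scale_seconds is made up for this example:

```shell
# Convert a duration in nanoseconds to "human scale" seconds.
# Assumption read off the table: 2 ns (L1 cache) corresponds to 1 s.
# human_scale_seconds is a hypothetical helper name.
human_scale_seconds() {
    # LC_ALL=C keeps the decimal point independent of the user's locale
    LC_ALL=C awk -v ns="$1" 'BEGIN { printf "%.1f\n", ns / 2 }'
}

human_scale_seconds 100        # memory random access: prints 50.0
human_scale_seconds 8000000    # HDD random access (8 ms): prints 4000000.0, about 46 days
```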

Since many codes have a long history, they may expect to be able to write to disk without repercussions. With multicore machines becoming standard, however, it becomes exceedingly difficult (and expensive!) to provide a disk-based filesystem that is fast enough. Since not all computing centres have the funding to set up a hierarchical file system composed of layers of memory (fast IO), SSD (fast persistent) and HDD (cheap persistent), an alternative approach is needed that scales with the number of compute cores involved.

The solution is called a ramdisk or a memory-backed filesystem. This is like a regular disk, with the difference that the contents are stored in memory only, so they are never persisted to disk. The ephemeral nature of this storage means that it is mostly useful for temporary files that are frequently read from or written to. This could be e.g. the checkpoint files written by Gaussian or the wavefunction files written by CP2K or Molpro. Even fast semiempirical codes like MOPAC can be sped up this way. For IO-heavy codes, the speedup can in principle reach several orders of magnitude. In practice, one often obtains a roughly 20-fold speedup and significantly more reliable performance.

How to use it

Modern Linux machines should offer /dev/shm, a memory-backed filesystem. Using a ramdisk for your operations is easy: create a folder on /dev/shm/ with e.g. your username and use it as a temporary directory, just like you would use a scratch directory. Please make sure to delete any files you no longer need, because otherwise they will occupy memory until the machine is rebooted. If you use this mechanism in a compute cluster environment, please keep in mind that you need to allocate the memory of the ramdisk on top of the memory requirements of your code. Only then can you be sure that no code crashes as a side effect of your actions.
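
As a rough sketch of this workflow, the helper below creates a private scratch directory on the ramdisk. The names RAMDISK_ROOT and make_ram_scratch are made up for this example, and the commented usage lines assume a job script that should clean up after itself:

```shell
# Sketch: per-job scratch directory on a ramdisk.
# RAMDISK_ROOT and make_ram_scratch are hypothetical names for this example.
RAMDISK_ROOT="${RAMDISK_ROOT:-/dev/shm}"

make_ram_scratch() {
    # Refuse to continue if the ramdisk is missing or not writable.
    if [ ! -d "$RAMDISK_ROOT" ] || [ ! -w "$RAMDISK_ROOT" ]; then
        echo "no writable ramdisk at $RAMDISK_ROOT" >&2
        return 1
    fi
    # mktemp -d creates a unique directory and prints its path.
    mktemp -d "$RAMDISK_ROOT/${USER:-job}.XXXXXX"
}

# Possible usage inside a job script:
#   scratch=$(make_ram_scratch) || exit 1
#   trap 'rm -rf "$scratch"' EXIT    # free the memory when the script ends
#   df -h "$RAMDISK_ROOT"            # how much memory-backed space is left
#   export TMPDIR="$scratch"         # many programs honour TMPDIR
```

The trap line is what guards against files lingering in memory after the job has finished. Whether a given code honours TMPDIR depends on that code.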

Should you work with legacy code that requires static paths for temporary storage, you can use symlinks. This is a Unix feature that lets individual files or directories live on a different file system. In this case, it means that to userspace applications, the folder appears as if it were on disk while it actually resides in memory. To do that, first create a directory in memory:

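$ mkdir /dev/shm/DEMO                    # directory that lives in memory
$ cd /path/to/folder/containing/tempdir
$ rm -rf name_of_tempdir                 # remove the old on-disk directory and its contents
$ ln -s /dev/shm/DEMO name_of_tempdir    # make the static path point to the ramdisk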

The last line actually creates the symlink. Note that deleting the symlink does not delete the contents held in memory; those you need to delete from /dev/shm separately.
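
Because removing the link and freeing the memory are two separate steps, it can help to bundle them. The following is only a sketch: remove_ram_link is a made-up name, and it assumes the symlink was created with an absolute target, as in the commands above.

```shell
# Sketch: remove a tempdir symlink AND the in-memory directory it points to.
# remove_ram_link is a hypothetical helper name.
remove_ram_link() {
    local link="$1" target
    if [ ! -L "$link" ]; then
        echo "$link is not a symlink" >&2
        return 1
    fi
    # Read the target before the link is gone.
    target=$(readlink "$link") || return 1
    # Only handle absolute targets such as /dev/shm/DEMO.
    case "$target" in
        /*) ;;
        *) echo "refusing relative target $target" >&2; return 1 ;;
    esac
    rm -- "$link"          # removes only the link, the data stays in memory
    rm -rf -- "$target"    # frees the memory held by the directory itself
}

# Possible usage:
#   remove_ram_link /path/to/folder/containing/tempdir/name_of_tempdir
```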

Many small writes or reads are a problem

The rate of individual read or write operations is measured in IOPS (input/output operations per second). The following numbers give an idea of why hard drives get so slow when accessing many files:

Consumer-grade HDD    50 IOPS
Enterprise HDD        200 IOPS
Consumer-grade SSD    20,000 IOPS
Enterprise SSD        400,000 IOPS
Memory                10,000,000 IOPS

This shows that such cases (for example post-processing of QM data or parsing of log files) work best when the data is kept in memory. This is particularly true if you plan to follow a random access pattern (read: jumping between log lines), because disks are much better suited for sequential reads (all lines in order). Therefore, in many cases it is significantly faster to copy the data to a ramdisk first, analyse it there and write the results back to disk for persistence.
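
The copy, analyse, write-back pattern could look roughly like the sketch below. The names analyse_in_memory and RAMDISK_ROOT are made up for this example, and the grep call that counts matching lines is only a stand-in for whatever analysis tool is actually used.

```shell
# Sketch: stage a file onto the ramdisk, analyse it there, persist the result.
# analyse_in_memory and RAMDISK_ROOT are hypothetical names for this example.
RAMDISK_ROOT="${RAMDISK_ROOT:-/dev/shm}"

analyse_in_memory() {
    local input="$1" result="$2" pattern="$3" work status
    work=$(mktemp -d "$RAMDISK_ROOT/analysis.XXXXXX") || return 1
    if cp -- "$input" "$work/input.log"; then          # stage in
        # Stand-in analysis: count the lines that match the pattern.
        grep -c -- "$pattern" "$work/input.log" > "$work/result.txt"
        status=$?
        # grep exits with 1 when nothing matches; only larger values are errors.
        if [ "$status" -le 1 ]; then
            cp -- "$work/result.txt" "$result"         # write back to disk
            status=$?
        fi
    else
        status=1
    fi
    rm -rf -- "$work"                                  # free the memory again
    return "$status"
}

# Possible usage (file names are placeholders):
#   analyse_in_memory job.log scf_count.txt "SCF Done"
```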

Caveat: All numbers reported on this page are guidance only, not hard numbers. Benchmark numbers depend heavily on the environment and the problem at hand. The numbers are, however, typical for many cases and serve as an example.
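
To get a feeling for your own environment, a crude comparison is to time the creation of many small files on disk and on the ramdisk. This is a sketch rather than a proper benchmark: write_small_files is a made-up name, the paths in the usage lines are placeholders, and the final sync is there because the kernel otherwise caches writes and makes the disk look faster than it is.

```shell
# Sketch: create many small files in one directory, for timing comparisons.
# write_small_files is a hypothetical helper name.
write_small_files() {
    local dir="$1" count="$2" i=0
    mkdir -p -- "$dir" || return 1
    while [ "$i" -lt "$count" ]; do
        printf 'sample line %d\n' "$i" > "$dir/file_$i.txt" || return 1
        i=$((i + 1))
    done
    sync    # flush cached writes so that disk timings include the actual write
}

# Possible usage, comparing a disk location with the ramdisk:
#   time write_small_files /path/on/disk/iotest 10000
#   time write_small_files /dev/shm/iotest 10000
```

Remember to delete both test directories afterwards, in particular the one on /dev/shm.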
