Cloud-Efficient Modelling and Simulation of Magnetic Nano Materials

This project applies cloud technology to materials science, with a strong emphasis on algorithmic changes. It deals with the modelling and numerical simulation of magnetic nano materials on a compute cloud, which cannot yet be done efficiently. The target application is the development and construction of non-volatile bit memories.

Magnetic materials can be described by the Maxwell equations and by the Landau-Lifshitz-Gilbert equation (LLGE). The LLGE models the static and dynamic behaviour of small magnetic dipoles, which are created and manipulated in a material under the influence of an external magnetic write field. The key feature of the project is that the algorithms and solvers needed for the LLGE are explicitly tailored to the characteristic properties of compute clouds: high intra-server and low inter-server communication performance. Consequently, new algorithmic approaches must be devised, implemented and tested so that the LLGE can be solved efficiently in a cloud. The goal is that, thanks to substantial algorithmic improvements in the LLGE solvers, all simulations needed for research on non-volatile nano-scale computer memories can be carried out in a standard compute cloud. The project will thus open up the simulation and construction of future computer memories to a large community of researchers, because the need for sophisticated tools such as scanning tunnelling microscopes, supercomputers or parallel computers is reduced.
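For orientation, the LLGE is commonly written in the Gilbert form shown below. This is the textbook form, given here as an assumption; the project description does not state which formulation or sign convention its solvers use. The symbols are likewise assumed notation: m is the unit magnetisation vector, H_eff the effective magnetic field (including the external write field), gamma the gyromagnetic ratio and alpha the dimensionless Gilbert damping constant.

```latex
\frac{\partial \mathbf{m}}{\partial t}
  = -\gamma \, \mathbf{m} \times \mathbf{H}_{\mathrm{eff}}
    + \alpha \, \mathbf{m} \times \frac{\partial \mathbf{m}}{\partial t},
\qquad |\mathbf{m}| = 1
```

The first term describes the precession of the magnetisation around the effective field, the second its damping towards the field direction.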

Motivation and Technical Background

Magnetic nano materials have attracted significant interest in computer design and organisation, because bit memories based on such materials promise extremely high storage densities and retain data permanently without an electric power supply. Magnetic nano materials are considered candidates for the future main memories of computers, replacing contemporary silicon memories, which in turn replaced core memories four decades ago.

If computer models of magnetic nano materials are precise enough, and if their simulation is easy and quick to conduct, the need for expensive laboratory set-ups, such as scanning tunnelling microscopes, can be significantly reduced. The time and money spent in labs can then be focused on those questions that simulations cannot answer. Additionally, the results obtained from simulations will point experimental researchers in the most promising directions and will thus accelerate progress in this field. One of the biggest advantages of simulations, however, is that any material, initial and boundary condition and physical geometry can be realised simply by varying the input data for the simulator.

In parallel to this promising development in computer design and organisation, cloud computing has become another ubiquitous paradigm for data storage and processing, because of its pay-as-you-go accounting, its inherent elasticity with respect to the number of users served by the cloud, and its flexibility in customising virtual machines. Companies, institutions and individuals profit from cloud computing by off-loading their computing and storage needs to commercial cloud service providers (CSPs). Accordingly, CSP services range from simple data backups to entire virtual data and computing centres. These advantages make cloud computing attractive for scientists as well: researchers do not need to provide and maintain their own IT infrastructure, but can outsource it to a CSP, who fulfils their needs by means of virtualised IT. For the same reasons, scientists also want to use clouds for HPC and simulation.

Combining magnetic nano materials with cloud technology appears promising, because it would allow a large community of researchers to simulate and thus construct future computer memories with the means they already have at hand. Otherwise, research on this topic would be restricted to those scientists who have access to sophisticated laboratory equipment and to a supercomputer, or at least to a parallel computer. Unfortunately, nano materials and cloud computing both have their own problems, which must be solved before a broader research community can enter the field of nano-scale bit memories. This is why this project exists.

High Performance Computing and Simulation with Cloud Computing

Standard clouds become problematic when used for High Performance Computing (HPC) and simulation. The reason is that many existing HPC and simulation codes are not sufficiently efficient when executed in a cloud instead of on a supercomputer or a parallel computer. Additionally, many codes do not scale: their speed-up remains low even when the number of computing elements is significantly increased. As a consequence, efficient HPC code execution is not easily possible on a standard cloud, and in many cases it is simply not worth running the code on a cloud. The main reason for this behaviour is the cloud's poor interconnect with respect to inter-server bandwidth and latency. A cloud is basically a distributed system, consisting of a set of servers that are coupled by Ethernet, i.e. by a local area network. As a consequence, a standard cloud is not a supercomputer and not even a parallel computer.

Furthermore, we found that upgrading a cloud's communication system from Ethernet to InfiniBand or Myrinet, for example, is insufficient to turn the cloud into a parallel computer, because the new hardware cannot be integrated efficiently into the cloud operating system. We measured that even a 40 Gbit/s InfiniBand inter-server coupling has the same latency as 1 Gbit/s Ethernet and an effective data rate of only about 3 Gbit/s. InfiniBand alone therefore does not close the performance gap between a cloud and a parallel computer or supercomputer. Instead, other measures must be applied in order to make clouds suited for HPC, as described in the SimPaas predecessor project.

One of these measures results from the insight that a cloud has a three-level hierarchy with respect to bandwidth and latency for inter-process communication. The fastest communication exists between the cores of the same CPU, because it is accomplished by shared memory via the level 2 or level 3 cache of the CPU. It allows 16 bytes to be transferred in about 1 ns. The second fastest communication exists between the CPUs inside the same server, because it is performed via the shared main memory of the server. It allows 1 byte to be transferred in about 1 ns, which is 1/16 of the top-level speed. The slowest communication takes place between servers, because it requires about 1 ns for 1 bit only. Thus, the fastest intra-server communication is approximately 128 times faster than inter-server communication. This gap is about one order of magnitude larger than in a parallel computer and about two orders of magnitude larger than in a supercomputer. Numerical algorithms must take this into account to be cloud-efficient: they must keep inter-server communication low and shift as much data exchange as possible to intra-server communication.
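The arithmetic behind the factor of 128 can be checked with a few lines of Python. This is a minimal sketch that uses only the three transfer rates quoted above; the level names (`cache`, `memory`, `network`) and the function names are illustrative, and latency is deliberately ignored, so the times are idealised lower bounds rather than measurements.

```python
# Approximate transfer rates in bits per nanosecond, taken from the text.
# The keys are illustrative labels, not terms used by the project.
RATE_BITS_PER_NS = {
    "cache": 16 * 8,   # cores of the same CPU: 16 bytes per ns
    "memory": 1 * 8,   # CPUs of the same server: 1 byte per ns
    "network": 1,      # between servers: 1 bit per ns
}


def transfer_time_ns(num_bytes, level):
    """Idealised time to move num_bytes on the given level (latency ignored)."""
    return num_bytes * 8 / RATE_BITS_PER_NS[level]


def slowdown_vs_fastest(level):
    """How many times slower a level is than the fastest level."""
    fastest = max(RATE_BITS_PER_NS.values())
    return fastest / RATE_BITS_PER_NS[level]


if __name__ == "__main__":
    for level in RATE_BITS_PER_NS:
        print(level,
              transfer_time_ns(1024, level), "ns per KiB,",
              slowdown_vs_fastest(level), "x slower than cache")
```

Under these assumptions, moving 1 KiB takes about 8 ns via the cache, 1024 ns via main memory and 8192 ns between servers.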

Finally, the computers participating in a cloud are, to some extent, high-performance multi-core servers. This is an important difference from distributed computing, where no statement can be made about the achievable intra-server and inter-server data-exchange performance, because the computers are heterogeneous and are coupled via the Internet. Thus, bandwidth and latency within the computers and between them can vary considerably.

Because of the three-level performance hierarchy of a standard compute cloud described above, existing algorithms and solvers for mathematical, physical, chemical and biological problems should be revisited and substantially modified so that they exploit local inter-process communication as much as possible and require only a few remote data exchanges between servers. In general, an existing HPC code that was written for a supercomputer or a parallel computer can be adapted to this three-level hierarchy only by substantial conceptual and algorithmic changes. In this project, the process of adaptation will be demonstrated by means of the LLGE.
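The kind of restructuring meant here can be illustrated with a global sum, a building block of many iterative solvers. The following is a minimal sketch, not the project's LLGE solver: it only counts how many values cross server boundaries when every process sends its value to a root process directly, compared with summing inside each server first. The function names and the nested-list layout (one inner list per server) are hypothetical.

```python
def flat_reduce(values_per_server):
    """Every process sends its value directly to the root on server 0."""
    total = 0
    inter_server_msgs = 0
    for server_id, values in enumerate(values_per_server):
        for value in values:
            total += value
            if server_id != 0:
                # Each remote process causes one inter-server message.
                inter_server_msgs += 1
    return total, inter_server_msgs


def hierarchical_reduce(values_per_server):
    """Sum inside each server first, then send one partial sum per server."""
    total = 0
    inter_server_msgs = 0
    for server_id, values in enumerate(values_per_server):
        local_sum = sum(values)  # intra-server step via shared memory
        total += local_sum
        if server_id != 0:
            # Only the partial sum crosses the slow inter-server link.
            inter_server_msgs += 1
    return total, inter_server_msgs


if __name__ == "__main__":
    # Three servers with four processes each (made-up example data).
    data = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
    print(flat_reduce(data))
    print(hierarchical_reduce(data))
```

Both variants return the same sum, but in this example the hierarchical version sends 2 inter-server messages instead of 8. The saving grows with the number of cores per server, which is the effect a cloud-efficient algorithm aims for.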

Reference List

  1. H. Richter, About the Suitability of Clouds in High-Performance Computing, invited talk at the invitation of the organizers (Dr. Gentzsch), International Supercomputer Conference Cloud&Big Data, Sept. 28–30, 2015, Frankfurt, Germany.
  2. H. Richter, A. Keidel, R. Ledyayev, Über die Eignung von Clouds für das Hochleistungsrechnen (HPC), IfI Technical Report Series, ISSN 1860-8477, IfI-15-03, www.in.tu-clausthal.de/forschung/technical-reports/ifi1503richter.pdf, editor: Department of Computer Science, Clausthal University of Technology, Germany, 2015.
  3. P. Ivanovic, H. Richter, A. Bozorgmehr, Cloud-Efficient Modelling and Simulation of Magnetic Nano Materials, Bericht 2015-2016, Simulationswissenschaftliches Zentrum Clausthal-Göttingen, editors: A. Herzog, T. Hanschke, www.simzentrum.de, 2017.
  4. P. Ivanovic, H. Richter, High-Performance Computing and Simulation in Clouds, Clausthal-Göttingen International Workshop on Simulation Science (SimScience 2017), Apr 27, 2017, Göttingen, Germany.

Involved Scientists