This page lists software and hardware components that are produced by CoE / FETHPC projects. This page is updated regularly.
|CoE||Centre of Excellence for Global Systems Science (CoeGSS, http://www.coegss-project.eu/)|
|Components||Applications & Services|
The Centre of Excellence for Global Systems Science encompasses three pilot applications that develop synthetic information systems for three different kinds of case studies (Green Growth in the Automotive Sector,
The initial prototype of the CoeGSS Portal was deployed in June 2016; an extended version is expected in May 2018. The pilots will release first versions in March 2018 to demonstrate their results.
|Description||Nu+ is an open-source GPU-like compute core, being developed by CeRICT in the framework of the MANGO FETHPC project. The main objective of Nu+ is to enable resource-efficient HPC based on special-purpose customized hardware. In MANGO, the GPU-like core is meant to be used to support architecture-level exploration for low-end, massively parallel manycore systems, but as one of its primary objectives, Nu+ also targets FPGA-accelerated HPC systems. In that respect, Nu+ will provide an FPGA overlay solution, used to readily build tailored processing elements preserving software support, guaranteeing improved resource efficiency, yet avoiding the development of a dedicated accelerator from scratch through the support for familiar programming models.
The core is developed in the SystemVerilog hardware description language (HDL) and will be released as an open-source project. The release of the first version of Nu+ is scheduled for late 2016. On top of the customized hardware core, CeRICT is also developing a Nu+ compiler backend relying on the LLVM infrastructure. This branch of the activity is also carrying out preliminary evaluations to define the most viable choices at the level of programming model support. The ultimate aim is to comply with standard models borrowed from the software domain, particularly OpenCL. Furthermore, to support the hardware development, CeRICT is also developing a software emulator. This was initially developed internally for validation purposes, but CeRICT now plans to release a stable ISA simulator, to be integrated in the gem5 framework, at a later stage of the project.
The first release of Nu+ is scheduled for late 2016. The core will be released as an open-source project freely available for research and non-commercial purposes.
As soon as the first version of the open-source project is released, development will be open to the external community. Contributions and feedback are welcome at all levels of the project: hardware design, development of the emulator, compiler infrastructure, application development and testing.
We are particularly interested in HPC developers providing use cases and specific kernels that may demonstrate the potential of the customized GPU-like acceleration support.
|Description||Algorithmic improvements for increased performance, scalability and exascale readiness of major high order, open source computational fluid dynamics codes.|
|Availability||Algorithms will be released as part of open source, community CFD codes towards the end of the project.|
|Comments||The project is looking for potential users.|
|Description||Software tools for generating synthetic populations as well as their agent-based execution and visualisation will be provided as an outcome of the project.|
|Availability||Beginning of 2018, Open Source licenses.|
|Description||A series of Best Practice Guides for writing interoperable programs, focusing on GASPI + MPI; MPI + OpenMP; MPI + OmpSs; OpenMP/OmpSs/StarPU + Multi-threaded Libraries. INTERTWinE has also produced a brief Best Practice Guide for Minimising Gender Bias in HPC Training, which is equally applicable to training in most other fields.|
|Comments||The project is looking for potential application developers.|
|Description||High-Order discontinuous-Galerkin (Finite-Element-type) solver for hyperbolic PDE systems, working on tree-structured adaptive Cartesian meshes. In particular, applications from astrophysics (merging of rotating binary neutron stars) and seismology (regional earthquake simulation, etc.) are developed.
Hybrid MPI+TBB parallelisation; C++ framework with C++, Fortran and Assembler kernels; designed to be a compute-bound code (innermost kernels tailored towards small matrix multiplications).
|Availability||First release of the engine planned for spring 2017; production-ready applications expected for 2018/19.|
|Comments||The project is looking for contributors on task-based programming models that offer resilience/dynamic scheduling/…
The project is also looking for applications based on hyperbolic systems of PDEs, in particular in computational astrophysics (Einstein equations) and seismology (earthquake simulation, exploration).
|Description||OpenStream is an Open Source, task-parallel, data-flow programming model implemented as an extension to OpenMP, designed for efficient and scalable data-driven execution. Arbitrary dependence patterns can be used to exploit task, pipeline and data parallelism. The OpenStream runtime system (library, also Open Source) can be re-used and provides optimized implementations of various performance-critical algorithms for scheduling and synchronization (e.g., work-stealing, work-pushing, barriers) and for NUMA-aware memory allocation.
Compared to the more restrictive data-parallel and fork-join concurrency models, task-parallel models enable improved scalability through load balancing, memory latency hiding, mitigation of the pressure on memory bandwidth, and as a side effect, reduced power consumption. In the OpenStream programming model, each data-flow dependence is semantically equivalent to a communication and synchronization event within an unbounded FIFO queue. OpenStream takes advantage of the information provided by programmers on task dependences to aggressively optimize memory locality through dynamic task and data placement.
|Comments||The project is looking for users of Open Source shared-memory parallel applications (e.g., OpenMP) who are experiencing scalability issues, in particular with memory-bound applications with non-trivial parallelism, or who are looking to improve performance.|
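The FIFO semantics described for OpenStream dependences can be illustrated with a minimal sketch. It is written in Python purely as notation and does not use OpenStream or OpenMP syntax; `run_pipeline`, `producer` and `consumer` are hypothetical names. It assumes one producer task and one consumer task linked by a single unbounded queue, so each `put` is the communication event and each blocking `get` is the synchronization event.

```python
import queue
import threading


def run_pipeline(values):
    """Run two tasks linked by one unbounded FIFO (a stand-in for a stream)."""
    stream = queue.Queue()      # maxsize=0 (the default) means unbounded
    done = object()             # sentinel marking the end of the stream
    results = []

    def producer():
        for v in values:
            stream.put(v * v)   # communication: publish one value
        stream.put(done)

    def consumer():
        while True:
            item = stream.get() # synchronization: blocks until data exists
            if item is done:
                break
            results.append(item + 1)

    tasks = [threading.Thread(target=producer),
             threading.Thread(target=consumer)]
    for t in tasks:
        t.start()
    for t in tasks:
        t.join()
    return results
```

Because the queue is first-in, first-out and there is a single consumer, results arrive in production order, whichever task happens to run first.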
|Description||Firmware and operating system support for exascale systems based on the Unimem memory architecture.
OS: Linux (esp. drivers to support platform functionality). Language: mostly C/C++ (several programming models, e.g. MPI, task-parallel, dataflow).
|Availability|| End of 2018 (early prototype already available and in use)
Conditions: [ subject to each partner’s stipulations in the ExaNoDe DoW ]
|Description||Fast KVM Virtual Machine memory snapshot based on post-copy and incremental checkpoints.
Within the ExaNoDe project, the KVM memory snapshot capabilities will be extended with a snapshot mechanism based on post-copy. This technique will reduce the overall stall time of a VM during its snapshot, thus minimizing the effect on guest applications. The second feature to be implemented is automatic incremental checkpointing, which periodically saves the state of a VM incrementally to reduce the amount of data saved at each checkpoint. Together, these two features will improve the resiliency of the system by enabling finer-grained control over the state of VMs.
|Availability||The two KVM extensions will be released as open source and published to the relevant communities (i.e., KVM, QEMU). Interested users will find the code patches either in the qemu-devel mailing list threads or in the main QEMU source tree once upstream.|
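The idea behind incremental checkpointing can be sketched with a toy model. This is not QEMU or KVM code and says nothing about how the ExaNoDe patches work; `PagedMemory` and `restore` are hypothetical names, and the model assumes that guest memory is a list of fixed-size pages and that every write is observed so the page can be marked dirty.

```python
class PagedMemory:
    """Toy guest memory that remembers which pages changed."""

    def __init__(self, num_pages, page_size=4):
        self.pages = [bytes(page_size) for _ in range(num_pages)]
        # Before the first checkpoint every page counts as dirty,
        # so the first checkpoint is a full snapshot.
        self.dirty = set(range(num_pages))

    def write(self, page_no, data):
        self.pages[page_no] = bytes(data)
        self.dirty.add(page_no)

    def checkpoint(self):
        """Return only the pages written since the previous checkpoint."""
        delta = {n: self.pages[n] for n in sorted(self.dirty)}
        self.dirty.clear()
        return delta


def restore(num_pages, page_size, checkpoints):
    """Rebuild memory by replaying checkpoints from oldest to newest."""
    pages = [bytes(page_size) for _ in range(num_pages)]
    for delta in checkpoints:
        for page_no, data in delta.items():
            pages[page_no] = data
    return pages
```

In this model the size of each checkpoint after the first is proportional to the number of pages written in the interval, not to the total memory size, which is the saving the description refers to.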
|Description||E-CAM will produce a repository of generic software “modules” and training materials for use by computational scientists interested in advanced material science simulations.|
|Availability||Under development; available as open source under a Creative Commons licence or similar.|
|Comments||Subject to quality controls, all relevant contributions will be gratefully received.
Looking for anyone interested in using or developing modules and/or training materials, but particularly industrial users who are interested in pilot projects with academia.
|Description||Programming environment for C/C++ programs to exploit nested recursive parallelism with a unique API based on C++ templates.
The programming environment supports C/C++ only. Input programs must not contain any existing parallelization such as MPI or OpenMP. Parallelism is expressed with the AllScale API (C++ templates).
|Comments||We are looking for contributors to extend and maintain the C++ AllScale compiler with novel analyses, optimizations and transformations. For instance, we are looking for someone to extend the AllScale compiler with a polyhedral library.
We are also looking for users who can provide additional use cases for the AllScale API, that is, applications with potential for nested recursive task parallelism.
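Nested recursive parallelism, the pattern the AllScale API targets, can be illustrated with a minimal sketch. It does not use the AllScale API, which is a C++ template interface not reproduced here; Python serves only as notation, and `parallel_reduce` with its `depth` and `cutoff` parameters is a hypothetical helper. Each call splits its range in two, runs one half as a nested task and recurses into the other half itself.

```python
import threading


def parallel_reduce(data, lo, hi, combine, base, depth=3, cutoff=4):
    """Recursively split [lo, hi) and reduce both halves concurrently."""
    if hi - lo <= cutoff or depth == 0:
        return base(data[lo:hi])          # sequential leaf
    mid = (lo + hi) // 2
    left = {}

    def run_left():
        left["value"] = parallel_reduce(
            data, lo, mid, combine, base, depth - 1, cutoff)

    worker = threading.Thread(target=run_left)
    worker.start()                        # left half runs as a nested task
    right = parallel_reduce(data, mid, hi, combine, base, depth - 1, cutoff)
    worker.join()                         # wait for the nested task
    return combine(left["value"], right)
```

The `depth` limit bounds the number of tasks created, and `cutoff` keeps leaves large enough to be worth running sequentially. In CPython the threads here show the task structure only, since the interpreter lock prevents a real speed-up for pure Python work.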
|Description||The SAGE public deliverables give an overview of our co-design activities and our updates on programming models, the analytics stack and runtimes for SAGE. A technical white paper will also be available to the community.
The research software outcomes are still being developed and we will have more updates on this in the second year.
|Availability||Software outcomes are expected towards the end of 2017|
|Comments||More information will follow in 2017.|
|Description||GPI-2 is an open source implementation of the GASPI standard, freely available to application developers and researchers. GASPI stands for Global Address Space Programming Interface and is a Partitioned Global Address Space (PGAS) API. It aims at extreme scalability, high flexibility and failure tolerance for parallel computing environments. GASPI aims to initiate a paradigm shift from bulk-synchronous two-sided communication patterns towards an asynchronous communication and execution model. To that end, it leverages remote completion and one-sided RDMA driven communication in a Partitioned Global Address Space. The asynchronous communication allows computation and communication to overlap. The main design idea of GPI is to have a lightweight API ensuring high performance, flexibility and failure tolerance.
Supported network technologies: InfiniBand, Cray, Ethernet. Supported hardware: x86, Intel Xeon Phi, GPUs. Features: PGAS, asynchronous, one-sided communication, thread-safe, fault-tolerance, standardized API.
|Availability||Available now at http://www.gpi-site.com/gpi2/ as open source, freely available to application developers and researchers. For commercial users we offer a commercial license and support.|
|Comments||We are interested in users who want to reach scalability of their application with a light-weight communication API.|
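The combination of one-sided writes and remote completion described for GASPI can be sketched as a single-process simulation. The sketch does not call GPI-2 or reproduce the GASPI API; `Segment`, `write_notify`, `wait_notification` and `exchange` are hypothetical names, and a thread stands in for the remote rank. The target never posts a receive: it only waits for a notification that is raised after the data is in place.

```python
import threading


class Segment:
    """Toy stand-in for a remotely writable memory segment."""

    def __init__(self, size, num_notifications=4):
        self.data = bytearray(size)
        self.flags = [threading.Event() for _ in range(num_notifications)]

    def write_notify(self, offset, payload, notification_id):
        """One-sided write followed by a completion notification."""
        self.data[offset:offset + len(payload)] = payload
        self.flags[notification_id].set()   # raised only after the data landed

    def wait_notification(self, notification_id, timeout=5.0):
        """Block until the notification arrives, then reset it."""
        arrived = self.flags[notification_id].wait(timeout)
        self.flags[notification_id].clear()
        return arrived


def exchange(payload):
    seg = Segment(64)
    sender = threading.Thread(target=seg.write_notify, args=(8, payload, 0))
    sender.start()
    local_work = sum(range(1000))           # computation overlapping the transfer
    if not seg.wait_notification(0):
        raise TimeoutError("notification did not arrive")
    sender.join()
    return bytes(seg.data[8:8 + len(payload)]), local_work
```

The receiver reads the segment only after the notification, so it never observes a partially written payload, and the local computation proceeds while the write is in flight.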
|Description||This is a chemogenomics dataset which can be used for building models to predict compound activity.|
|Availability||March 2017. It will be free to download.|
|Comments||We would like to establish collaborations with researchers working in chemoinformatics or bioinformatics, or more generally with machine learning researchers who are looking for a challenging data set to work with.|
|Description||OmpSs is an effort to integrate features from the StarSs programming model developed by BSC into a single programming model. In particular, OmpSs aims to extend OpenMP with new directives to support asynchronous parallelism and heterogeneity (devices like GPUs). However, it can also be understood as new directives extending other accelerator-based APIs like CUDA or OpenCL. Our OmpSs environment is built on top of our Mercurium compiler and Nanos++ runtime system.
In ExaNoDe, the cluster version of OmpSs is being extended with optimisations for the Unimem global shared memory architecture and to manage data locality and placement in distributed memory systems.
|Availability||Open source software (GPL)|
|Comments||For OmpSs, the project is looking for C/C++ runtime system developers and PhD students.
The project is also looking for collaboration with application owners.
|Components||INTERTWinE Resource Manager|
|Description||The INTERTWinE Resource Manager coordinates access to CPU and GPU resources between different runtime systems and APIs to avoid both oversubscription and undersubscription. The INTERTWinE project team is developing three different APIs:
· An offloading API (based on OpenCL) to call parallel kernels (e.g. a piece of code written in OpenMP or StarPU) from one runtime system to another
· A dynamic resource sharing API (also based on OpenCL) to indicate which specific CPUs can be used to execute a parallel kernel
· A task pause/resume API so that different runtimes in the same process can lend CPUs when they are not using all of their resources and borrow CPUs when they need more.
|Availability||Available to potential collaborators|
|Comments||The INTERTWinE Resource Manager is a technology preview, intended to aid research and development of Exascale software and applications. It is not a production-ready tool.|
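The lend-and-borrow behaviour described for the pause/resume API can be sketched as a small bookkeeping model. This is not the INTERTWinE Resource Manager API; `CpuBroker` and its methods are hypothetical names, and the sketch assumes that each CPU has one owning runtime and is used by at most one runtime at a time.

```python
class CpuBroker:
    """Toy broker through which runtimes lend and borrow CPU ids."""

    def __init__(self, assignment):
        # assignment maps a runtime name to the CPU ids it owns.
        self.owner = {cpu: name
                      for name, cpus in assignment.items() for cpu in cpus}
        self.in_use = {name: set(cpus) for name, cpus in assignment.items()}
        self.pool = set()   # CPUs lent out and not yet borrowed

    def lend(self, runtime, cpu):
        """The runtime pauses work on `cpu` and offers it to others."""
        self.in_use[runtime].remove(cpu)    # KeyError if it is not using it
        self.pool.add(cpu)

    def borrow(self, runtime, wanted):
        """Take up to `wanted` lent CPUs; return the ids actually granted."""
        granted = sorted(self.pool)[:wanted]
        self.pool.difference_update(granted)
        self.in_use[runtime].update(granted)
        return granted

    def give_back(self, runtime, cpu):
        """Return a borrowed CPU so that its owner can resume work on it."""
        self.in_use[runtime].remove(cpu)
        self.in_use[self.owner[cpu]].add(cpu)
```

Because every CPU id sits either in the pool or in exactly one runtime's `in_use` set, the model cannot oversubscribe a CPU, and lending idle CPUs is what counters undersubscription.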
|Components||INTERTWinE Directory/ Cache|
|Description||The INTERTWinE Directory/ Cache allows task-based runtime systems to efficiently run distributed applications, while being able to consistently manage data stored in distributed memory or in local caches. The Directory/ Cache API allows runtimes to be completely independent from the physical representation of data and from the type of storage used. Access to data is through a common interface to an extendable list of memory segment implementations based on different communication libraries (GASPI, MPI).
For runtime-based applications, switching to a fully distributed version is almost transparent with the help of the Directory/ Cache.
|Availability||Available to potential collaborators|
|Comments||The INTERTWinE Directory/ Cache is a technology preview, intended to aid research and development of Exascale software and applications. It is not a production-ready tool.|