The SOLHAR project is organized around four Work Packages.
Work Package 1 is devoted to the development of direct solvers for GPU-equipped computing systems. Specifically, this task focuses on the development and analysis of sparse factorization algorithms that are suited to the architectural features of modern computing platforms and to the programming model and API of modern runtime systems. Achieving these objectives entails a number of challenges, mostly related to (i) the mapping of extremely irregular computational patterns onto heterogeneous architectures; (ii) the granularity of computational tasks and the movement of data between the CPU and GPU memories; and (iii) the development of GPU-efficient linear algebra kernels suited to the type of computations performed in sparse direct algorithms.
One of the main innovative aspects of the SOLHAR proposal lies in the use of a generic runtime system to relieve the sparse direct solvers of part of the scheduling work. The key to the success of such an approach is to devise the right level of generality for the runtime system, which is the purpose of Task 2. This problem can be broken down into three questions: (Subtask 2.1) How can the runtime perform its job efficiently enough? (Subtask 2.2) How can the runtime be given enough freedom of action to provide the adaptiveness needed for effective performance portability? (Subtask 2.3) How can the runtime obtain the necessary information from the solver codes? In an ideal world, the runtime system would have unlimited information and freedom of action and could do its job perfectly. In practice, however, the limiting factor is the impact on the solver codes themselves, which should be kept low.
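As an illustration of Subtask 2.3, one common way for a runtime to obtain information from an application is to have the application declare, for each submitted task, which data it reads and which it writes; the runtime can then infer the dependencies itself. The sketch below is a toy model of that idea under this assumption. The class `ToyRuntime`, its `submit` method and the data names are hypothetical and are not part of any SOLHAR software or existing runtime API.

```python
from collections import defaultdict


class ToyRuntime:
    """Hypothetical toy runtime: infers task dependencies from declared data accesses."""

    def __init__(self):
        self.tasks = []                      # task names, indexed by task id
        self.deps = defaultdict(set)         # task id -> ids of tasks it must wait for
        self._last_writer = {}               # data name -> id of the last task writing it
        self._readers = defaultdict(list)    # data name -> ids of readers since last write

    def submit(self, name, reads=(), writes=()):
        """Register a task and compute its dependencies from its access modes."""
        tid = len(self.tasks)
        self.tasks.append(name)
        for d in reads:
            # Read-after-write: wait for the task that produced the data.
            if d in self._last_writer:
                self.deps[tid].add(self._last_writer[d])
        for d in writes:
            # Write-after-write: wait for the previous writer.
            if d in self._last_writer:
                self.deps[tid].add(self._last_writer[d])
            # Write-after-read: wait for every task still reading the old value.
            self.deps[tid].update(self._readers[d])
        for d in writes:
            self._last_writer[d] = tid
            self._readers[d] = []
        for d in reads:
            if d not in writes:
                self._readers[d].append(tid)
        return tid


rt = ToyRuntime()
a = rt.submit("factor", writes=["A"])
b = rt.submit("solve1", reads=["A"], writes=["B"])
c = rt.submit("solve2", reads=["A"], writes=["C"])
d = rt.submit("update", reads=["B", "C"], writes=["A"])
```

In this model the solver code only has to state its data accesses, which keeps the impact on the solver low while leaving the runtime free to run `solve1` and `solve2` in any order or concurrently.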
This work package is devoted to the scheduling challenges that need to be addressed during the SOLHAR project. The specific features of the project give rise to very particular scheduling problems. First, contrary to what is common in scheduling research, we have a precise description of the application at our disposal, and we have to come up with a new and precise model: the application may be described by a tree of large malleable tasks, and each of these malleable tasks may be split into smaller tasks corresponding to dense linear algebra kernels for a given block size. Furthermore, we will have actual data sets at our disposal to check the assumptions made on the task graphs and to validate the proposed scheduling algorithms. Second, the target platforms impose many constraints on scheduling algorithms: they are heterogeneous, cores and nodes have limited memory, and complex communication networks are used to transfer data between computing cores. Third, the project aims at designing a working software prototype, so the scheduling work cannot be purely theoretical: it has to produce usable and, if possible, guaranteed scheduling algorithms or heuristics to be included in the final prototype.
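The application model described above (a tree of large tasks, each of which can be split into dense kernel tasks for a given block size) can be sketched as follows. This is a minimal illustration, not the project's model: it assumes that each tree node works on a dense square matrix of order `n` and that the node is split according to a tiled Cholesky factorization; the class `MalleableTask` and the functions `split_into_kernels` and `postorder` are hypothetical names introduced here.

```python
from dataclasses import dataclass, field
from math import ceil
from typing import List


@dataclass
class MalleableTask:
    """One node of the task tree (assumed to work on a dense n x n matrix)."""
    name: str
    n: int
    children: List["MalleableTask"] = field(default_factory=list)


def split_into_kernels(task, block_size):
    """List the tile-level kernel tasks of a tiled Cholesky factorization.

    The choice of Cholesky is an assumption made for illustration only.
    """
    nt = ceil(task.n / block_size)           # number of tile rows/columns
    kernels = []
    for k in range(nt):
        kernels.append(("POTRF", k, k))      # factorize the diagonal tile
        for i in range(k + 1, nt):
            kernels.append(("TRSM", i, k))   # triangular solve on the panel tile
        for i in range(k + 1, nt):
            kernels.append(("SYRK", i, i, k))         # update a diagonal tile
            for j in range(k + 1, i):
                kernels.append(("GEMM", i, j, k))     # update an off-diagonal tile
    return kernels


def postorder(task):
    """Visit children before their parent, as the tree dependencies require."""
    for child in task.children:
        yield from postorder(child)
    yield task


root = MalleableTask("root", 8, [MalleableTask("left", 4), MalleableTask("right", 6)])
order = [t.name for t in postorder(root)]
root_kernels = split_into_kernels(root, block_size=2)
```

A smaller block size produces more, finer-grained kernel tasks for the same node, which is the granularity trade-off a scheduler has to handle on heterogeneous platforms.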
The development of production-quality scientific code lies at the crossroads of research on numerical algorithms, mathematics and software engineering. Besides being highly efficient and numerically reliable, scientific software must possess a number of other properties that make it suitable for users who are generally not computer science experts. Numerical software has to be easy to deploy on a production machine, must have a clear and easy-to-use interface, must have detailed and understandable documentation, must be portable across different computing platforms, must be robust and reliable, and must provide clear error messages. In order to assess the efficiency of the tools developed in the previous tasks, this task evaluates the impact of the proposed approaches on problems arising in applications designed by our industrial partners, CEA-CESTA and EADS Innovation Works. We will evaluate our solvers on problems coming from electromagnetics (CEA-CESTA) and aeroacoustics (EADS IW).