ASCCED: Asynchronous Scientific Continuous Computations Exploiting Disaggregation


The design of efficient and scalable scientific simulation software is reaching a critical point whereby continued advances are increasingly harder, more labour-intensive, and thus more expensive to achieve. This challenge emanates from the constantly evolving design of large-scale high-performance computing systems. World-leading (pre-)exascale systems, as well as their successors, are characterised by multi-million-scale parallel computing activities and a highly heterogeneous mix of processor types such as high-end many-core processors, Graphics Processing Units (GPU), machine learning accelerators, and various accelerators for compression, encryption and in-network processing. To make efficient use of these systems, scientific simulation software must be decomposed in various independent components and make simultaneous use of the variety of heterogeneous compute units.

Developing efficient, scalable scientific simulation software for these systems becomes increasingly harder as the limits of parallelism available in the simulation codes is approached. Moreover, the limit of parallelism cannot be reached in practice due to heterogeneity, system imbalances and synchronisation overheads. Scientific simulation software often persists over several decades. The software is optimised and re-optimised repeatedly as the design and scale of the target hardware evolves at a much faster pace, as impactful changes in the hardware may occur every few years. One may thus find that the guiding principles that underpin such software are outdated.

The ASCCED project will fundamentally change the status quo in the design of scientific simulation software by simplifying the design to reduce software development and maintenance effort, to facilitate performance optimisation, and to make software more robust to future evolution of computing hardware. The key distinguishing factor of our approach is to structure scientific simulation software as a collection of loosely coupled parallel activities. We will explore the opportunities and challenges of applying techniques previously developed for Parallel Discrete Event Simulation (PDES) to orchestrate these loosely coupled parallel activities. This radically novel approach will enable runtime system software to extract unprecedented scales of parallelism and to minimise performance inefficiencies due to synchronisation. Additionally, based on a speculative execution mechanism, it will uncover parallelism that has not been feasible to extract before.

The computational model proposed by ASCCED will, if successful, initiate a new direction of research within programming models for high-performance computing that may dramatically impact not only the performance of scientific simulation software, but can also reduce the engineering effort required to produce efficient scientific simulation software. It will have a profound impact on the sciences that are highly dependent on leadership computing capabilities, such as climate modeling and cancer research.

RAPID: ReAl-time Process ModellIng and Diagnostics: Powering Digital Factories

This projects aims to develop algorithms for real-time processing for analytics in digital factories. A particular use case is the design of hardware circuits, where silicon can easily be damaged during manufacturing. The production defects that arise negatively affect yield and/or quality of the devices. Uncovering these defects, and how they may be mitigated through tunable process parameters, is a demanding process, especially considering the high production rate and voluminous metrics that are collected.

The project considers the design of sketching algorithms, transprecise computing and their efficient implementation on modern high-throughput hardware such as graphics processing units.

RAPID is sponsored by EPSRC.

Project members:

SoftNum: Software-Defined Number Formats

Computing devices implement computer arithmetic as basic functionality, and they implement the same, standardized number formats in order to support software portability. However, with Moore’s Law ending, we question whether it remains the best approach to achieve high performance and low energy consumption by applying the same standardized number formats for all applications. We explore how to make number formats, generally considered to be hard-wired functionality, software-defined. Software-defined number formats have the advantage of high performance, low energy consumption, and ensure sufficient while not excessive precision.

SoftNum is Amir’s individual Marie Sklodowska Curie Fellowship, sponsored by the European Commission.

DiPET: Distributed Stream Processing on Fog and Edge via Transprecise Computing


The DiPET project investigates models and techniques that enable distributed stream processing applications to seamlessly span and redistribute across fog and edge computing systems.

The goal is to utilize devices dispersed through the network that are geographically closer to users to reduce network latency and to increase the available network bandwidth. However, the network that user devices are connected to is dynamic. For example, mobile devices connect to different base stations as they roam, and fog devices may be intermittently unavailable for computing.

In order to maximally leverage the heterogeneous compute and network resources present in these dynamic networks, the DiPET project pursues a bold approach based on transprecise computing.

Transprecise computing states that computation need not always be exact and proposes a disciplined trade-off of precision against accuracy, which impacts on computational effort, energy efficiency, memory usage and communication bandwidth and latency. Transprecise computing allows to dynamically adapt the precision of computation depending on the context and available resources.

This creates new dimensions to the problem of scheduling distributed stream applications in fog and edge computing environments and will lead to schedules with superior performance, energy efficiency and user experience.

The DiPET project will demonstrate the feasibility of this unique approach by developing a transprecise stream processing application framework and transprecision-aware middleware. Use cases in video analytics and network intrusion detection will guide the research and underpin technology demonstrators.

Please refer to the website for the details of the project:

This project is sponsored by CHIST-ERA and EPSRC.