Invited Talk – Efficient Computation through Tuned Approximation by David Keyes

21 February 2024

Abstract

Numerical software is being reinvented to provide opportunities to tune dynamically the accuracy of computation to the requirements of the application, resulting in savings of memory, time, and energy.  Floating point computation in science and engineering has a history of “oversolving” relative to expectations for many models. So often are real datatypes defaulted to double precision that GPUs did not gain wide acceptance until they provided in hardware operations not required in their original domain of graphics.  Computational science is now reverting to employ lower precision arithmetic where possible. Many matrix operations allow for lower precision considered at a blockwise level without loss of accuracy, adapting to the magnitude of the norm of the block. Furthermore, many blocks can be approximated with low-rank near equivalents to a prescribed accuracy, adapting to the smoothness of the coefficients of the block.  This leads to smaller memory footprint, which implies higher residency on memory hierarchies, leading in turn to less time and energy spent on data copying, which may even dwarf the savings from fewer and cheaper flops.  We provide examples from several application domains, including Gordon Bell Prize-nominated research in 2022 in environmental statistics and in 2023 in seismic processing.
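As a rough illustration of block-wise precision selection, the sketch below assigns each tile of a matrix the cheapest floating-point format whose rounding error, scaled by the tile's Frobenius norm, stays within a budget relative to the norm of the whole matrix. The selection rule, the `tol` parameter, and the function names are assumptions made for this example and are not taken from the talk; the sketch also ignores the limited exponent range of half precision.

```python
import math

# Unit roundoff of the IEEE 754 binary formats.
UNIT_ROUNDOFF = {"fp16": 2.0 ** -11, "fp32": 2.0 ** -24, "fp64": 2.0 ** -53}


def frobenius(block):
    """Frobenius norm of a block given as a list of rows."""
    return math.sqrt(sum(x * x for row in block for x in row))


def choose_precisions(tiles, tol):
    """Return a grid of precision names, one per tile.

    tiles: 2D list of blocks, each block a list of rows.
    tol:   hypothetical relative accuracy budget per tile.
    """
    total = math.sqrt(sum(frobenius(b) ** 2 for row in tiles for b in row))
    plan = []
    for row in tiles:
        plan_row = []
        for block in row:
            norm = frobenius(block)
            chosen = "fp64"  # fall back to double precision
            for name in ("fp16", "fp32"):  # cheapest format first
                if UNIT_ROUNDOFF[name] * norm <= tol * total:
                    chosen = name
                    break
            plan_row.append(chosen)
        plan.append(plan_row)
    return plan


# Illustrative data: dominant diagonal tiles, tiny off-diagonal tiles.
big = [[1.0, 0.0], [0.0, 1.0]]
small = [[1e-6, 1e-6], [1e-6, 1e-6]]
plan = choose_precisions([[big, small], [small, big]], tol=1e-8)
```

In this toy case the small-norm off-diagonal tiles can be stored in half precision while the diagonal tiles stay in double precision, which is the kind of memory saving the abstract describes.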

Bio

David Keyes directs the Extreme Computing Research Center at the King Abdullah University of Science and Technology (KAUST), where he was a founding Dean in 2009 and currently serves in the Office of the President. He is a professor in the programs of Applied Mathematics, Computer Science, and Mechanical Engineering. He is also an Adjunct Professor of Applied Mathematics and Applied Physics at Columbia University, where he formerly held the Fu Foundation Chair. He works at the interface between parallel computing and PDEs and statistics, with a focus on scalable algorithms that exploit data sparsity. Before joining KAUST, Keyes led multi-institutional scalable solver software projects in the SciDAC and ASCI programs of the US Department of Energy (DoE), ran university collaboration programs at US DoE and NASA institutes, and taught at Columbia, Old Dominion, and Yale Universities. He is a Fellow of SIAM, the AMS, and the AAAS. He has been awarded the Gordon Bell Prize from the ACM, the Sidney Fernbach Award from the IEEE Computer Society, and the SIAM Prize for Distinguished Service to the Profession. He earned a B.S.E. in Aerospace and Mechanical Sciences from Princeton in 1978 and a Ph.D. in Applied Mathematics from Harvard in 1984.

Invited Talk – Fine-Grained and Phase-Aware Frequency Scaling for Energy-efficient Computing on Heterogeneous Multi-GPU Systems by Lorenzo Carpentieri

9 May 2025

Abstract

As computing power demands continue to grow, achieving energy efficiency in high-performance systems has become a key challenge. One of the most promising software techniques for energy efficiency is Dynamic Voltage and Frequency Scaling (DVFS), which optimizes the energy-performance trade-off by changing hardware frequencies.

This presentation introduces two complementary approaches that advance the state-of-the-art in energy-efficient heterogeneous computing through fine-grained and phase-aware frequency tuning.

The first approach, SYnergy, leverages a novel compiler- and runtime-integrated methodology built upon the SYCL programming model to enable fine-grained frequency scaling on heterogeneous hardware. SYnergy allows developers to specify energy goals for each individual kernel such as minimizing Energy-Delay Product (EDP) or achieving predefined energy-performance tradeoffs. Through compiler integration and a machine learning model, the frequency of each kernel is statically optimized based on the specific energy goal. To extend this fine-grained control to large-scale systems, SYnergy includes a custom SLURM plugin that enables execution across all available devices in a cluster, ensuring scalable energy savings.
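To make the per-kernel energy goals concrete, here is a minimal sketch that picks a frequency for one kernel from a table of measurements. The Energy-Delay Product is energy multiplied by execution time. The profile numbers, the goal names, and the `pick_frequency` function are hypothetical; this is not the SYnergy API, and where the abstract describes a machine learning model choosing the frequency, this sketch simply reads measured values.

```python
def pick_frequency(measurements, goal="min_edp"):
    """Return the frequency (MHz) that best meets the energy goal.

    measurements: list of (freq_mhz, time_s, energy_j) tuples for one kernel.
    goal: "min_edp", "min_energy" or "min_time" (illustrative names).
    """
    if goal == "min_edp":
        score = lambda m: m[2] * m[1]  # EDP = energy * time
    elif goal == "min_energy":
        score = lambda m: m[2]
    elif goal == "min_time":
        score = lambda m: m[1]
    else:
        raise ValueError(f"unknown goal: {goal}")
    return min(measurements, key=score)[0]


# Made-up profile of a single kernel at four core frequencies.
profile = [
    (600, 2.0, 90.0),
    (900, 1.4, 100.0),
    (1200, 1.1, 120.0),
    (1500, 1.0, 150.0),
]

edp_choice = pick_frequency(profile, "min_edp")
energy_choice = pick_frequency(profile, "min_energy")
```

With these numbers the lowest frequency minimizes energy alone, while the EDP goal settles on an intermediate frequency because it also penalizes the longer run time.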

While fine-grained frequency scaling at the kernel level can significantly improve energy efficiency, it also introduces overhead due to frequent frequency changes, an overhead that can, in some cases, outweigh the potential benefits. To address this, we propose a novel phase-aware method that detects different phases through application profiling and DAG analysis and sets an optimal frequency for each phase. Our methodology also considers MPI programs, where the overhead can be hidden by overlapping frequency changes with communication.

Bio

Lorenzo Carpentieri received his master’s degree from the University of Salerno, Italy, in 2022. He is now a PhD student in the Department of Computer Science at the University of Salerno, Italy, under the supervision of Prof. Biagio Cosenza. His research interests include high-performance computing, compiler technology, and programming models, with a particular interest in energy-efficient and approximate computing.

Invited Talk – A Constraint Programming Solver You Can Trust (But Don’t Have To) by Ciaran McCreesh

28 August 2025

Abstract

Constraint programming is a declarative way of solving hard combinatorial, scheduling, resource allocation, and logistics problems. We specify a problem in a high-level language, give it to a solver, and the solver thinks for a while and then gives us the optimal answer. Unfortunately, even the best commercial and academic solvers contain bugs, and will occasionally give a wrong answer, potentially with devastating effects. One way of avoiding this situation is through proof logging, where solvers are modified to output a mathematical proof of correctness alongside their solution. This proof can then be independently audited by a very simple (and potentially even formally verified) proof checking tool, giving us complete confidence in the correctness of solutions (although not the solvers themselves). I’ll explain how proof logging works in general, and give an overview of the challenges and fun involved in bringing it to constraint programming. Ultimately, the aim here is to make algorithms something people can trust with their lives and livelihoods, just as engineers have already done with bridges, planes, and lifts.
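The abstract does not name a proof format, so as an illustration of the general idea the sketch below borrows a technique from Boolean satisfiability, where proof logging is well established: reverse unit propagation (RUP). The solver emits a list of derived clauses ending in the empty clause, and a small checker confirms that each one follows from the clauses seen so far by unit propagation alone. The function names and the example formula are assumptions made for this sketch; proofs for constraint programming need richer reasoning than this.

```python
def propagates_to_conflict(clauses, assumptions):
    """Unit-propagate from `assumptions`; return True if a clause is falsified.

    Literals are non-zero integers; -x is the negation of x.
    """
    assignment = set(assumptions)
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            if any(lit in assignment for lit in clause):
                continue  # clause already satisfied
            open_lits = [lit for lit in clause if -lit not in assignment]
            if not open_lits:
                return True  # every literal is false: conflict
            if len(open_lits) == 1:
                assignment.add(open_lits[0])  # forced (unit) literal
                changed = True
    return False


def check_rup_proof(formula, proof):
    """Check that `proof` is a valid RUP refutation of `formula`."""
    database = [list(clause) for clause in formula]
    for clause in proof:
        # A clause is accepted if assuming its negation propagates to a conflict.
        if not propagates_to_conflict(database, [-lit for lit in clause]):
            return False
        database.append(list(clause))
    return any(len(clause) == 0 for clause in proof)


# An unsatisfiable formula over x1, x2 and a two-step proof of that fact.
formula = [[1, 2], [-1, 2], [1, -2], [-1, -2]]
proof = [[2], []]
verified = check_rup_proof(formula, proof)
```

The checker is far simpler than any solver, which is the point: trust moves from the solver to a small program that can be audited or formally verified.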

Bio

Ciaran McCreesh is a Royal Academy of Engineering Research Fellow working in the Formal Analysis, Theory and Algorithms group in the School of Computing Science at the University of Glasgow. His research looks at practical parallel algorithms, particularly in relation to hard subgraph problems. His publications cover combinatorial search, parallel algorithms, and constraint programming.

Research Fellow Position in Kelvin Living Lab for Sustainability

We have a research fellow position.

Please visit the following link for the full description and to apply: https://hrwebapp.qub.ac.uk/tlive_webrecruitment/wrd/run/ETREC107GF.open?VACANCY_ID=098532NfiX&WVID=6273090Lgx&LANG=USA

Deadline: 15/07/2024

For more information about the Kelvin Living Lab project, please refer to: https://blogs.qub.ac.uk/dipsa/the-kelvin-living-lab/

QClique: Optimizing Performance and Accuracy in Maximum Weighted Clique – Euro-Par 2024

30th International European Conference on Parallel and Distributed Computing (Euro-Par 2024)

DOI: 10.1007/978-3-031-69583-4_7
PDF Version

Abstract

The Maximum Weighted Clique (MWC) problem remains challenging due to its unfavourable time complexity. In this paper, we analyze the execution of exact search-based MWC algorithms and show that high-accuracy weighted cliques can be discovered in the early stages of the execution if searching the combinatorial space is performed systematically.

Based on this observation, we introduce QClique as an approximate MWC algorithm that processes the search space as long as better cliques are expected. QClique uses a tunable parameter to trade off accuracy against execution time and delivers a 4.7-82.3x speedup over previous state-of-the-art MWC algorithms while providing 91.4% accuracy. It also achieves a parallel speedup of up to 56x on 128 threads.

Additionally, QClique accelerates the exact MWC computation by replacing the initial clique of the exact algorithm. For WLMC, an exact state-of-the-art MWC algorithm, this results in a 3.3x speedup on average.
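The two ideas in this abstract, stopping a systematic search early and seeding an exact search with a good initial clique, can be sketched with a small branch-and-bound routine. This is not QClique's algorithm (see the repository below for that): the stopping rule, the `patience` parameter, and the function name are assumptions made purely for illustration.

```python
def max_weight_clique(adj, weight, patience=None, initial=()):
    """Branch-and-bound search for a heavy clique.

    adj:      dict mapping each vertex to the set of its neighbours.
    weight:   dict mapping each vertex to a positive weight.
    patience: hypothetical knob; stop after this many consecutive search
              nodes without improvement (None runs the exact search).
    initial:  a known clique used as the starting lower bound.
    """
    best_clique = list(initial)
    best_weight = sum(weight[v] for v in initial)
    stale = 0
    stopped = False

    def expand(clique, clique_weight, candidates):
        nonlocal best_clique, best_weight, stale, stopped
        if clique_weight > best_weight:
            best_clique, best_weight = list(clique), clique_weight
            stale = 0
        else:
            stale += 1
            if patience is not None and stale >= patience:
                stopped = True
                return
        # Heaviest candidates first, so good cliques tend to appear early.
        order = sorted(candidates, key=lambda v: weight[v], reverse=True)
        remaining = sum(weight[v] for v in order)
        for i, v in enumerate(order):
            # Bound: even adding every remaining candidate cannot beat the best.
            if clique_weight + remaining <= best_weight:
                return
            later = set(order[i + 1:])
            expand(clique + [v], clique_weight + weight[v], later & adj[v])
            if stopped:
                return
            remaining -= weight[v]

    expand([], 0, set(adj))
    return best_clique, best_weight


# Toy graph: triangle 0-1-2 (weight 10) joined by edge 2-3 to edge 3-4 (weight 11).
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4}, 4: {3}}
weight = {0: 4, 1: 3, 2: 3, 3: 6, 4: 5}

exact_clique, exact_weight = max_weight_clique(adj, weight)
quick_clique, quick_weight = max_weight_clique(adj, weight, patience=2)
seeded_clique, seeded_weight = max_weight_clique(adj, weight, initial=(0, 1, 2))
```

A smaller `patience` ends the search sooner at the risk of a lighter clique, while passing a good `initial` clique raises the lower bound so the exact search prunes more branches.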

Code

https://github.com/DIPSA-QUB/QClique

Four Lecturer/Senior Lecturer Positions

Deadline: 26/02/2024

Lecturer/Senior Lecturer in Distributed Computing

https://hrwebapp.qub.ac.uk/tlive_webrecruitment/wrd/run/ETREC107GF.open?VACANCY_ID=844083M0Nn&WVID=6273090Lgx&LANG=USA

Lecturer/Senior Lecturer in Emerging Computing Technologies

https://hrwebapp.qub.ac.uk/tlive_webrecruitment/wrd/run/ETREC107GF.open?VACANCY_ID=566107M0HD&WVID=6273090Lgx&LANG=USA

Lecturer/Senior Lecturer in High Performance Computing

https://hrwebapp.qub.ac.uk/tlive_webrecruitment/wrd/run/ETREC107GF.open?VACANCY_ID=128409M0GW&WVID=6273090Lgx&LANG=USA

Lecturer/Senior Lecturer in Programming Languages & Compilers

https://hrwebapp.qub.ac.uk/tlive_webrecruitment/wrd/run/ETREC107GF.open?VACANCY_ID=125161M0Gm&WVID=6273090Lgx&LANG=USA

Accelerating scientific discovery using domain adaptive language modelling (PhD Thesis)

Thesis on QUB Pure Portal
Thesis in PDF Format

Author: Dimitrios Christofidellis

Research has been conducted for centuries, but recent advances in technology have facilitated and accelerated the process, keeping the research budget and the required effort at manageable levels. Scientific and technical corpora, such as papers and patents, are rich written sources of existing research knowledge and information. The abundance of such documents, together with their exponential growth, makes them a unique source of knowledge and offers a great opportunity to push the research boundaries even further. Yet this information’s volume and growth rate are so large that researchers cannot study all of it. Recognizing the potential of incorporating this knowledge efficiently into the discovery process, and that recent advances in NLP provide a powerful methodological base, our work aims to establish methods that speed up parts of the discovery process by relying on scientific and technical corpora. We focus on, but do not limit our work to, patent corpora, as methods to leverage such documents in discovery pipelines are limited so far. Our contributions focus on three specific cases: the domain definition of a given corpus in the form of a metagraph; the domain definition of a given corpus in the form of keywords, focusing on the patent classification case; and the semi-automated reporting of a discovery artifact in the form of a patent. In all three cases, we rely on transformer-based language models and domain-adaptive techniques to provide methods that are efficient in terms of both performance and training/inference requirements. Finally, we discuss the importance of our contributions, demonstrate how our proposed methods can be incorporated into discovery pipelines, combined, and used to complement existing methods, and outline promising future directions derived from our work.

Three PhD Positions in Data Analytics – RELAX Doctoral Network

We have 3 recruitment opportunities in a Marie Curie Doctoral Training Network on data analytics. These are PhD opportunities with a research assistant contract:

(1) Application-Aware Relaxed Synchronisation for Distributed Graph Processing (offered),
(2) Interactive and intelligent exploration of big complex data (offered), and
(3) Efficient and Responsible Analytics for Urban Mobility and Allied Applications (offered).

Application Deadlines
7 May 2023

RELAX Doctoral Network
The RELAX Doctoral Network brings together 5 cross-disciplinary research groups working across data science, data management, distributed computing and computing systems to pursue a fundamentally new approach to this problem by leveraging the semantics or correctness conditions of applications, with the goal of enhancing scalability, response times, and availability. The Doctoral Network provides a bespoke technical and non-technical training programme and fosters cross-disciplinary and third-party collaborations.

Funding Information
This project is funded by the Engineering and Physical Sciences Research Council grant number EP/X029174/1.

To be eligible for consideration for a RELAX Doctoral Candidate position (covering tuition fees and a basic salary with pension of approx. £33,001 per annum), a candidate must satisfy all the eligibility criteria based on transnational mobility and academic qualifications. The Studentship is open to all nationalities.

Applicants MUST be doctoral candidates, i.e. not already in possession of a doctoral degree at the date of the recruitment (understood as the recruitment call deadline), and must undertake transnational mobility (see mobility rule below). Researchers who have successfully defended their doctoral thesis but who have not yet formally been awarded the doctoral degree will not be considered eligible.

Mobility Rule
Researchers must not have resided or carried out their main activity (work, studies, etc.) in the United Kingdom for more than 12 months in the 36 months immediately before their date of recruitment. Compulsory national service, short stays such as holidays, and time spent as part of a procedure for obtaining refugee status under the Geneva Convention are not taken into account.

Academic Requirements
The minimum academic requirement for admission is normally an Upper Second Class Honours degree from a UK or ROI Higher Education provider in a relevant discipline, or an equivalent qualification acceptable to the University.

More Information
Applicants may additionally consider applying to positions with the partner universities of the network: http://www.relax-dn.eu/