Integrated Data Analysis Pipelines for Large-Scale Data Management, HPC, and Machine Learning

DAPHNE

Integrated data analysis pipelines combine tasks from data management, high-performance computing, and machine learning, and have grown in prevalence and importance over the last decades. Although these domains share many compilation and runtime management techniques, as well as requirements such as efficiently utilizing modern heterogeneous hardware, they largely diverge with regard to programming paradigms, resource management systems, and data formats. DAPHNE aims to provide a unified interface for these domains via either DaphneLib, a Python library, or DaphneDSL. The LLVM-based DAPHNE compiler and the DAPHNE runtime aim to provide efficient scheduling and distribution across modern hardware such as SIMD-capable multicore CPUs, GPUs, FPGAs, and computational storage devices.

Project runtime: 11/2020 - 11/2024

Funder: EU

Partners: TUD, ITU, KNOW-Center, Intel