CEDAR’s mission is to create high-value datasets that enhance and combine the existing Common European Data Spaces (CEDS), and to develop solutions for more transparent public governance in Europe.
CEDAR is a 36-month Horizon Europe-funded project that started in January 2024. It brings together 31 partners with interdisciplinary knowledge, and its key goal is to promote transparent and accountable public governance in Europe. By sharing high-quality datasets, developing secure connectors for European data repositories, and employing innovative technologies for efficient big data management and analysis, CEDAR aims to support better, evidence-based decision-making, combat corruption, and reduce fraud in public administration.
What the CEDAR project will do
CEDAR will identify, collect, fuse, harmonise, protect, and share new high-quality datasets. This will involve digitising data from public administration archives and generating synthetic data to improve real-world data quality. The project also aims to harmonise and standardise different public and private data sources into new unified datasets. Furthermore, it seeks to enable fair and secure access to these datasets and to integrate them with the Common European Data Spaces.
CEDAR will develop methods, tools, and guidelines to digitise, protect, and integrate data to address significant issues like corruption, in line with the European Strategy for Data, the development of Common European Data Spaces (CEDS), and the European Data Act. This will improve transparency in public governance, promote European values and rights in the digital world, and enrich the European data ecosystem and economy.
CEDAR technologies
The CEDAR project aims to use state-of-the-art artificial intelligence (AI) and big data technologies to counter corruption and improve transparency in the European sectors involved. At its core, CEDAR implements advanced machine learning pipelines, including large language models (LLMs) and natural language processing (NLP), to analyse multilingual text data. Additionally, CEDAR incorporates state-of-the-art multimedia processing technologies for video understanding and deepfake detection, paired with advanced audio processing for speech enhancement and keyword spotting in noisy environments. Innovative graph-based analysis and econometric methods further improve CEDAR’s multi-modal approach, enabling the detection of complex corruption patterns in financial transactions and socioeconomic data.
CEDAR technologies are also integrated within a robust DataOps and MLOps infrastructure, with the aim of developing interoperable and secure connectors and APIs to use and enrich CEDS.
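To give a flavour of what graph-based analysis of financial transactions can look like, the sketch below searches a payment graph for short circular money flows, one pattern that analysts sometimes treat as a signal worth reviewing. It is an illustration only and not CEDAR's actual method: the `(payer, payee, amount)` record layout, the function name `find_payment_cycles`, and the choice of circular flows as an indicator are all assumptions made for this example.

```python
from collections import defaultdict


def find_payment_cycles(transactions, max_length=4):
    """Return circular payment chains of at most max_length steps.

    transactions: iterable of (payer, payee, amount) tuples.
    This record layout is hypothetical and chosen for illustration.
    """
    # Build a directed graph: payer -> set of payees.
    graph = defaultdict(set)
    for payer, payee, _amount in transactions:
        graph[payer].add(payee)

    cycles = set()

    def walk(start, node, path):
        for nxt in graph.get(node, ()):
            if nxt == start and len(path) >= 2:
                # Rotate the cycle so it starts at its smallest name;
                # the same loop found from another start is then deduplicated.
                i = path.index(min(path))
                cycles.add(tuple(path[i:] + path[:i]))
            elif nxt not in path and len(path) < max_length:
                walk(start, nxt, path + [nxt])

    for start in list(graph):
        walk(start, start, [start])
    return sorted(cycles)


# Made-up example data: A pays B, B pays C, and C pays A back.
example = [("A", "B", 100.0), ("B", "C", 95.0), ("C", "A", 90.0), ("C", "D", 10.0)]
suspicious_loops = find_payment_cycles(example)
```

A cycle found this way is only a prompt for human review, since circular payments also occur in legitimate business relationships.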
The CEDAR use cases
CEDAR focuses its validation activities on three specific use cases in three different European countries. The three pilots will be run on vast volumes of complex data, provided by end users and gathered through various open data platforms; involve several CEDS and related ecosystems; include all data life cycle phases, from collection to sharing; and generate a variety of positive impacts for Europe.
1. Monitoring national RRP funds in Italy
Despite the digitisation of public procurement and the checks in force through the eAppaltiFVG platform in Italy, the risk of corruption and mafia infiltration in procurement processes remains high. The infiltration of organised crime in the management of public funds poses a significant problem for both the economy and society.
Ultimately, it is crucial that steps are taken to prevent organised crime from infiltrating the management and use of recovery funds to protect the well-being of society as a whole and to support sustainable recovery and economic prosperity.
The Italian pilot’s objectives
The CEDAR project will upgrade the eAppaltiFVG platform with:
1. A data space containing relevant data coming from different areas (e.g., tenders)
2. A set of AI-powered tools to enable efficient and diligent monitoring of activities during all phases of the procurement process, supporting digital and physical control and providing preventive intelligence for the early detection of anomalies.
Validation
The eAppaltiFVG platform will be integrated with CEDAR through dedicated connectors and APIs, and thereby extended with analytics-ready datasets and custom AI-powered services.
2. Transparent management of Slovenian public healthcare funds
In Slovenian public procurement, low-value tenders are particularly problematic. Such tenders are less regulated and are consequently handled in a less standardised manner.
Two aspects are problematic, namely the preparation of the tenders and the preparation of the bids. Without high-quality datasets in electronic form, and without advanced, cost-effective, and user-friendly technologies to manage and process them, it is not possible to ensure accountable governance of healthcare funds, nor to provide public healthcare services of the highest possible quality.
The Slovenian pilot’s objectives
The pilot aims to digitise the current archive of past tenders and bids, which comprises documents in different formats held in different locations, transform them into rich metadata, integrate them with external sources, and thereby enable their analysis to identify patterns that may indicate fraudulent activities. With this, CEDAR will digitise the procurement process for low-value tenders in the healthcare sector, ensuring more transparent governance of public funds.
This will enable real-time monitoring of public procurement, enhancing the ability to detect any events that may suggest fraudulent or corrupt practices before the tenders are even published, and after the associated bids are received.
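As a rough illustration of the steps described above, the sketch below maps digitised records with differing field names onto one unified record type and then computes a simple indicator: a supplier that wins an unusually large share of one buyer's tenders. It is a minimal sketch under stated assumptions and not the pilot's pipeline. The field aliases, the `TenderRecord` type, the thresholds, and the indicator itself are hypothetical, and all data is made up.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TenderRecord:
    """Unified (harmonised) view of one awarded tender; hypothetical schema."""
    buyer: str
    supplier: str
    value_eur: float


def first_present(raw, *keys):
    """Return the value of the first key that exists in the raw record."""
    for key in keys:
        if key in raw:
            return raw[key]
    raise KeyError(f"none of {keys} found in record")


def harmonise(raw):
    """Map a source-specific record (dict) onto TenderRecord.

    The alias names below are assumptions standing in for whatever field
    names the digitised archives actually use.
    """
    buyer = first_present(raw, "buyer", "contracting_authority")
    supplier = first_present(raw, "supplier", "winner")
    value = first_present(raw, "value_eur", "amount")
    # Accept a decimal comma such as "4500,00" as well as a decimal point.
    return TenderRecord(buyer.strip(), supplier.strip(),
                        float(str(value).replace(",", ".")))


def dominant_suppliers(records, min_awards=3, share_threshold=0.6):
    """Flag (buyer, supplier, awards, total) where one supplier dominates."""
    per_buyer = defaultdict(Counter)
    for record in records:
        per_buyer[record.buyer][record.supplier] += 1
    flags = []
    for buyer, counts in per_buyer.items():
        total = sum(counts.values())
        for supplier, awards in counts.items():
            if awards >= min_awards and awards / total >= share_threshold:
                flags.append((buyer, supplier, awards, total))
    return sorted(flags)


# Made-up records arriving in two different source layouts.
raw_records = [
    {"buyer": "Hospital A", "supplier": "S1", "value_eur": 4500.0},
    {"contracting_authority": "Hospital A ", "winner": "S1", "amount": "3900,50"},
    {"buyer": "Hospital A", "supplier": "S1", "value_eur": 4100.0},
    {"buyer": "Hospital A", "supplier": "S2", "value_eur": 2500.0},
    {"buyer": "Hospital B", "supplier": "S3", "value_eur": 1200.0},
    {"buyer": "Hospital B", "supplier": "S3", "value_eur": 1300.0},
]
records = [harmonise(raw) for raw in raw_records]
flags = dominant_suppliers(records)
```

In this sketch a flag means only that a case deserves a closer look; a dominant supplier can have legitimate explanations, such as being the only provider of a specialised product.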
Validation
We will digitise the data from Slovenian archives and store them in local systems. In parallel, we will utilise CEDAR connectors, APIs, and other data technologies to further consume data from other private and public sources, and pre-process and analyse the data with advanced data management, data analytics, and machine learning (ML) tools.
3. Transparent management of foreign aid for rebuilding Ukraine
Ukraine is currently resisting an aggressive Russian invasion, and is a recipient of unprecedented amounts of foreign aid for infrastructure restoration and rebuilding projects. For the success of the restoration goals and donor support, it is of utmost importance to ensure the integrity of the foreign aid distribution and prevent corrupt practices in rebuilding projects.
Ukraine has the political will to fight corruption and has used the electronic system Prozorro for public procurement since 2016. However, unless this system is extended with tools for efficient control by civil society and donors, it has limited potential to further reduce corruption.
The Ukrainian pilot’s objectives
The pilot aims to help the Ukrainian government and donors make better use of their own data and adopt secure practices for managing public funds and foreign aid, including eliminating potential corruption risks in procurement procedures. CEDAR will work on solutions for multi-factor risk analysis of legal entities, and the key people behind them, to search for potential links with Russia and identify bids that carry a high risk of corruption. Moreover, we will use advanced data technologies and ML algorithms to monitor active projects after they have been approved.
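One way to picture multi-factor risk analysis of legal entities is sketched below: a breadth-first search over ownership and management links measures how many steps separate a bidder from any entity on a watchlist, and a weighted average combines several risk factors into one score. This is an illustrative sketch, not the pilot's design. The link structure, the factor names, the weights, and the watchlist are all hypothetical, and the entities are invented.

```python
from collections import deque


def hops_to_flagged(entity, links, flagged, max_hops=3):
    """Return the number of link steps from entity to the nearest flagged
    entity, or None if none is reachable within max_hops.

    links: dict mapping an entity to the owners, directors, or subsidiaries
    it is connected to (hypothetical structure).
    """
    seen = {entity}
    queue = deque([(entity, 0)])
    while queue:
        node, hops = queue.popleft()
        if node != entity and node in flagged:
            return hops
        if hops == max_hops:
            continue  # do not expand beyond the search radius
        for neighbour in links.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, hops + 1))
    return None


def risk_score(factors, weights):
    """Weighted average of factor values, each expected in the range 0 to 1.

    Factors missing from the input count as 0.
    """
    total_weight = sum(weights.values())
    weighted = sum(w * factors.get(name, 0.0) for name, w in weights.items())
    return weighted / total_weight


# Hypothetical weights and made-up entities, for illustration only.
WEIGHTS = {"flagged_link": 0.5, "single_bidder": 0.3, "new_company": 0.2}
links = {"BidderCo": ["Owner X"], "Owner X": ["HoldingCo"], "HoldingCo": []}
watchlist = {"HoldingCo"}

distance = hops_to_flagged("BidderCo", links, watchlist)
factors = {
    "flagged_link": 1.0 if distance is not None else 0.0,
    "single_bidder": 0.0,
    "new_company": 1.0,
}
score = risk_score(factors, WEIGHTS)
```

Keeping the link search separate from the scoring step means each factor can be inspected on its own, which matters when a human reviewer has to explain why a bid was flagged.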
Validation
Two existing platforms from Ukrainian partners will be utilised, one for the analysis of legal entities and one for the analysis of politically exposed persons (PEPs). These will be integrated with new data sources, extended with new data analytics and ML algorithms, and improved with CEDAR.
Please note, this article will also appear in the 19th edition of our quarterly publication.