Detecting Dataset Reuse and Modification in Data Spaces via Structure-Aware Similarity Analysis

Once data leaves your building, it’s almost impossible to verify whether the people who receive it are actually following the rules. This new research, authored by our partners Christos Panagiotou and Kyriakos Stefanidis from Athena Research Center, introduces a ‘Digital Private Eye’ for data sharing.

Using a mix of structural analysis and AI, this tool can scan a dataset and tell whether it’s a stolen or slightly altered version of your original work, even if the labels have been changed. In tests on real-world energy and automotive data, it achieved over 90% accuracy.
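
To give a feel for how a dataset can be recognised after its labels have been changed, here is a minimal Python sketch. It is not the method from the paper: it leaves out the AI component entirely and assumes a much simpler approach, in which each numeric column is summarised by its decile cut points and columns are then paired by how closely those summaries agree. The function names, the sample data, and the similarity formula are all hypothetical and chosen only for illustration.

```python
from statistics import quantiles


def column_profile(values, n=10):
    """Summarise a numeric column by its decile cut points (ignores the column name)."""
    return quantiles(values, n=n)


def profile_similarity(p, q):
    """Return a score in [0, 1]: 1 minus the mean relative gap between cut points."""
    gaps = []
    for a, b in zip(p, q):
        scale = max(abs(a), abs(b), 1e-12)  # avoid division by zero
        gaps.append(min(abs(a - b) / scale, 1.0))
    return 1.0 - sum(gaps) / len(gaps)


def dataset_similarity(original, candidate):
    """Greedily pair each original column with its best unused candidate column.

    Both arguments are dicts mapping a column name to a list of numbers.
    Column names are never compared, so renaming has no effect on the score.
    """
    cand_profiles = {name: column_profile(col) for name, col in candidate.items()}
    unused = list(cand_profiles)
    scores = []
    for col in original.values():
        if not unused:
            scores.append(0.0)  # no candidate column left to pair with
            continue
        prof = column_profile(col)
        best = max(unused, key=lambda c: profile_similarity(prof, cand_profiles[c]))
        scores.append(profile_similarity(prof, cand_profiles[best]))
        unused.remove(best)
    return sum(scores) / len(scores)


# Hypothetical sample data, invented for this sketch.
original = {
    "voltage": [230 + i * 0.1 for i in range(50)],
    "load_kw": [i ** 1.5 for i in range(1, 51)],
}
# Same data with renamed, reordered, and slightly perturbed columns.
renamed_copy = {
    "col_b": [v * 1.01 for v in original["load_kw"]],
    "col_a": [v + 0.05 for v in original["voltage"]],
}
# A dataset that has nothing to do with the original.
unrelated = {
    "x": [(-1) ** i * i for i in range(50)],
    "y": [1000.0] * 50,
}

print(dataset_similarity(original, renamed_copy))
print(dataset_similarity(original, unrelated))
```

In this toy setup the renamed copy scores far higher than the unrelated dataset, because the comparison looks at the shape of the values rather than the labels attached to them. A real system would need to handle non-numeric columns, resampled or truncated rows, and deliberate obfuscation, none of which this sketch attempts.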

Essentially, we’re providing the ‘teeth’ for data contracts, ensuring that when companies share information, they finally have a way to prove whether someone is playing foul.