FDI & EPO Research Project

Data Collection Principles

Our project team has established the following principles as a framework for bridging the data-trust gap. They are more than guidelines: they express our commitment to transparent and reliable data management, and we publish them as part of our aim to deliver public benefits.

Sourcing from Primary Sources
We strictly source foundational data from official primary sources, avoiding opaque third-party data. The definition of a primary source can vary depending on context and domain—for example, a company register might be a primary source for legal-entity existence but not for details like asset lists or ownership structures. Clear source attribution is essential, as datasets without this clarity are unreliable.

Transparency in Data Processes
Hidden biases in data—arising from selection, mapping, or transformation—can significantly distort analysis. We commit to making all data processes transparent, including the assumptions and decisions involved in data handling, and openly discussing any identified biases in our source data.

Comprehensive Audit Trails
Trust, risk mitigation, quality, and utility in data use—especially within compliance, legal, master data, or investigative contexts—depend critically on full provenance. Our project ensures complete end-to-end, attribute-level audit trails, providing the necessary assurance and context for our users.
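
As an illustration only, an attribute-level audit trail can be sketched as a record that keeps the provenance of every value it has ever held. This is a minimal sketch, not a description of the project's actual system: the class names, field names, and the source label "company-register" are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Provenance:
    """Where one attribute value came from (all field names are hypothetical)."""
    source: str            # label of the primary source, e.g. an official register
    retrieved_at: datetime  # when the value was read from that source
    transformation: str    # how the raw value was changed; "none" if copied as-is


@dataclass
class Record:
    """A record in which every attribute carries its own history."""
    record_id: str
    values: dict = field(default_factory=dict)
    trail: dict = field(default_factory=dict)  # attribute -> [(value, Provenance), ...]

    def set(self, attribute, value, provenance):
        # Append instead of overwriting, so earlier values remain auditable.
        self.trail.setdefault(attribute, []).append((value, provenance))
        self.values[attribute] = value

    def history(self, attribute):
        # Return a copy so callers cannot alter the stored trail.
        return list(self.trail.get(attribute, []))


record = Record("entity-001")
record.set(
    "name",
    "Example Ltd",
    Provenance("company-register", datetime(2024, 1, 5, tzinfo=timezone.utc), "none"),
)
record.set(
    "name",
    "Example Limited",
    Provenance(
        "company-register",
        datetime(2024, 3, 2, tzinfo=timezone.utc),
        "expanded abbreviation",
    ),
)
```

Here `record.values["name"]` holds the current value, while `record.history("name")` returns both values together with the source, retrieval time, and transformation for each, which is the kind of context a compliance or investigative user would need.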

Embracing Standards
We use open and established standards wherever possible to minimize friction for users and ensure consistency. In cases where standards are lacking or insufficient, our project engages with others to develop and promote suitable standards, leveraging our expertise and insights.

Real-Time Data Collection
In a fast-paced global environment, collecting data at infrequent intervals is inadequate. We gather data and make it available in near real time, keeping the latency between a change at the source and the corresponding database update to a minimum.

Dataset Perspective on Data Quality
Modern data-driven approaches require viewing data quality from a dataset perspective, not just at the individual record level. We continuously evaluate our dataset as a whole, identifying duplicates, contextual discrepancies, and systematic quality issues that are not apparent from individual records alone.
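
To make the dataset perspective concrete, the sketch below finds likely duplicates that no record-level check would reveal, because each record is valid on its own. It is a simplified example under stated assumptions: the record fields (`id`, `name`, `country`) and the matching rule (same normalized name and country) are hypothetical, and real matching would usually need more careful logic.

```python
from collections import defaultdict


def normalize(name):
    """Lowercase and collapse whitespace so trivially different spellings match."""
    return " ".join(name.lower().split())


def find_duplicates(records):
    """Group record IDs that share a normalized name and country.

    The issue only becomes visible when the dataset is examined as a whole.
    """
    groups = defaultdict(list)
    for rec in records:
        key = (normalize(rec["name"]), rec["country"])
        groups[key].append(rec["id"])
    # Keep only the keys that more than one record maps to.
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


records = [
    {"id": "a1", "name": "Example Holdings", "country": "GB"},
    {"id": "b2", "name": "example  holdings", "country": "GB"},
    {"id": "c3", "name": "Example Holdings", "country": "DE"},
]
duplicates = find_duplicates(records)
```

In this sample, records `a1` and `b2` are grouped as probable duplicates, while `c3` is kept apart because it is registered in a different country.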

Openness as a Quality Metric
We advocate for transparency by default. Using proprietary IDs or opaque data models limits the quality of data by introducing biases and restricting its utility. Our project is committed to open data principles, enhancing accessibility and usefulness for a broader audience.
