Data provenance refers to the ability to reconstruct the full history of a specific calculation or scientific result: every step that led to it and every parameter used in the intermediate calculations.
In AiiDA, we implemented a data repository designed to overcome these limitations and to give the user both flexibility and query efficiency. It is based on a hybrid relational and non-relational database tailored to materials science computations. In it, each calculation, data object, code, and computer is stored as a node in a graph, with an arbitrary number of key=value pairs describing its properties and with links describing the logical relationships between data and calculations (inputs or outputs).
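The node-and-link model described above can be sketched in a few lines of plain Python. This is only an in-memory illustration, not AiiDA's storage layer or API: the `Node` and `ProvenanceGraph` classes, the `pk` identifiers, the link labels, and all attribute names and values are assumptions made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A graph node: a calculation, data object, code, or computer."""
    pk: int
    kind: str  # e.g. "data", "calculation", "code", "computer"
    attributes: dict = field(default_factory=dict)  # arbitrary key=value pairs


class ProvenanceGraph:
    """Hypothetical in-memory stand-in for the repository."""

    def __init__(self):
        self.nodes = {}  # pk -> Node
        self.links = []  # (source_pk, target_pk, label) triples

    def add_node(self, kind, **attributes):
        pk = len(self.nodes) + 1
        self.nodes[pk] = Node(pk, kind, dict(attributes))
        return pk

    def add_link(self, source, target, label):
        """Store a directed link; inputs point into a calculation,
        outputs point away from it."""
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must already be stored")
        self.links.append((source, target, label))


# Illustrative values only; none of them come from a real calculation.
graph = ProvenanceGraph()
structure = graph.add_node("data", formula="Si2")
code = graph.add_node("code", name="example_code")
calc = graph.add_node("calculation", cutoff=30, kpoints=(4, 4, 4))
energy = graph.add_node("data", quantity="total_energy")

graph.add_link(structure, calc, "input")
graph.add_link(code, calc, "input")
graph.add_link(calc, energy, "output")
```

Keeping the attributes as a free-form dictionary mirrors the "arbitrary number of key=value pairs" idea, while the links carry only the logical relationship between data and calculations.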
These nodes form a directed acyclic graph that stores the entire history of any data object and any calculation. This happens transparently, with no user intervention beyond the definition of the input parameters. Finally, the provenance information in the AiiDA database can be browsed and queried without any knowledge of SQL, using only a high-level Python user interface.
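To make the idea of browsing provenance concrete, the following sketch walks a toy directed acyclic graph backwards from a result to recover everything it depends on, and filters nodes by their attributes without any SQL. It is a simplified stand-in, not AiiDA's query interface: the `nodes` and `links` structures and the `ancestors` and `query` helpers are hypothetical names chosen for the example.

```python
# Hypothetical toy graph: pk -> (kind, attributes). Values are illustrative.
nodes = {
    1: ("data", {"formula": "Si2"}),
    2: ("calculation", {"task": "relax"}),
    3: ("data", {"formula": "Si2", "relaxed": True}),
    4: ("calculation", {"task": "scf"}),
    5: ("data", {"quantity": "total_energy"}),
}
# Directed links (source, target): data -> calculation is an input,
# calculation -> data is an output.
links = [(1, 2), (2, 3), (3, 4), (4, 5)]


def ancestors(pk):
    """Return every node that pk depends on, directly or indirectly."""
    parents = {}
    for source, target in links:
        parents.setdefault(target, []).append(source)
    seen, stack = set(), [pk]
    while stack:
        for parent in parents.get(stack.pop(), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def query(kind, **filters):
    """Select nodes of a given kind whose attributes match all filters."""
    return sorted(
        pk for pk, (node_kind, attrs) in nodes.items()
        if node_kind == kind
        and all(attrs.get(key) == value for key, value in filters.items())
    )


# Full history of the final result (node 5), then only the calculations in it.
history = ancestors(5)
calculations_used = sorted(pk for pk in history if nodes[pk][0] == "calculation")
```

Because the graph is acyclic, the backward walk always terminates, and the set it returns is the complete history of the chosen result.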