Automated Interactive Infrastructure and Database for Computational Science

AiiDA is a flexible and scalable informatics infrastructure to manage, preserve, and disseminate the simulations, data, and workflows of modern-day computational science. Able to store the full provenance of each object, and based on a tailored database built for efficient data mining of heterogeneous results, AiiDA lets the user interact seamlessly with any number of remote HPC resources and codes, thanks to its flexible plugin interface and its workflow engine for automating complex sequences of simulations.

Journal ref: G. Pizzi, A. Cepellotti, R. Sabatini, N. Marzari, and B. Kozinsky, AiiDA: automated interactive infrastructure and database for computational science, Comp. Mat. Sci. 111, 218-230 (2016)

Open access link: arXiv:1504.0116

Data provenance

Data provenance refers to the ability to reconstruct the full history of a specific calculation or scientific result: all the steps that led to it and all the parameters used in the intermediate calculations.

In AiiDA, we implemented a data repository that gives the user both flexibility and query efficiency, based on a hybrid relational and non-relational database tailored for materials science computations. In it, each calculation, data object, code, and computer is stored as a node in a graph, with an arbitrary number of key=value pairs describing its properties, and links describing the logical relationships between data and calculations (inputs or outputs).
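
As a rough illustration of this data model, the following minimal Python sketch stores nodes with key=value attributes and labelled links between them. It does not use AiiDA's actual classes or database schema: the names Node, Link, and ProvenanceGraph, as well as all attribute names and values, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Node:
    """A calculation, data object, code, or computer (hypothetical model)."""
    pk: int
    node_type: str  # e.g. "data", "calculation", "code", "computer"
    attributes: Dict[str, Any] = field(default_factory=dict)


@dataclass(frozen=True)
class Link:
    """A directed edge from source to target, labelled "input" or "output"."""
    source: int
    target: int
    label: str


class ProvenanceGraph:
    """In-memory stand-in for the repository described above (an assumption)."""

    def __init__(self) -> None:
        self.nodes: Dict[int, Node] = {}
        self.links: List[Link] = []

    def add_node(self, node_type: str, **attributes: Any) -> Node:
        # Arbitrary key=value pairs are stored as the node's attributes.
        node = Node(pk=len(self.nodes) + 1, node_type=node_type,
                    attributes=attributes)
        self.nodes[node.pk] = node
        return node

    def add_link(self, source: Node, target: Node, label: str) -> None:
        self.links.append(Link(source.pk, target.pk, label))

    def inputs_of(self, node: Node) -> List[Node]:
        # Nodes with a link pointing into the given node.
        return [self.nodes[l.source] for l in self.links if l.target == node.pk]

    def outputs_of(self, node: Node) -> List[Node]:
        # Nodes that the given node points to.
        return [self.nodes[l.target] for l in self.links if l.source == node.pk]


# Made-up example: two data nodes feed a calculation, which produces a result.
graph = ProvenanceGraph()
structure = graph.add_node("data", formula="Si", num_atoms=2)
params = graph.add_node("data", cutoff=30.0)
calc = graph.add_node("calculation", code_name="example-code")
result = graph.add_node("data", converged=True)

graph.add_link(structure, calc, "input")
graph.add_link(params, calc, "input")
graph.add_link(calc, result, "output")
```

With this toy graph, graph.inputs_of(calc) returns the structure and parameter nodes, and graph.outputs_of(calc) returns the result node, mirroring the input and output links described above.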

These nodes form a directed acyclic graph that stores the entire history of every data object and calculation. This happens transparently, with no user intervention beyond defining the input parameters. Finally, the provenance information in the AiiDA database can be browsed and queried without knowing SQL, using only a high-level Python user interface.
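
To make the idea of recovering a result's history concrete, here is a small Python sketch that walks the links of a toy graph backwards to collect everything a result depends on, and filters nodes by their key=value attributes without any SQL. It is not the AiiDA query interface: the in-memory graph, the node names and attributes, and the functions ancestors and find_nodes are all hypothetical.

```python
from collections import deque
from typing import Any, Dict, List, Set, Tuple

# Toy provenance graph; every name and value here is made up for illustration.
nodes: Dict[str, Dict[str, Any]] = {
    "structure": {"type": "data", "formula": "Si"},
    "relax": {"type": "calculation", "code": "example-code"},
    "relaxed_structure": {"type": "data", "formula": "Si"},
    "bands": {"type": "calculation", "code": "example-code"},
    "band_gap": {"type": "data", "quantity": "band_gap"},
}

# Directed links (source, destination): inputs point into calculations,
# calculations point to their outputs.
links: List[Tuple[str, str]] = [
    ("structure", "relax"),
    ("relax", "relaxed_structure"),
    ("relaxed_structure", "bands"),
    ("bands", "band_gap"),
]


def ancestors(target: str) -> Set[str]:
    """Return every node the target depends on, directly or indirectly."""
    parents: Dict[str, List[str]] = {}
    for source, dest in links:
        parents.setdefault(dest, []).append(source)

    seen: Set[str] = set()
    queue = deque([target])
    while queue:  # breadth-first walk against the link direction
        current = queue.popleft()
        for parent in parents.get(current, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


def find_nodes(**filters: Any) -> List[str]:
    """Return the names of nodes whose attributes match all key=value filters."""
    return [
        name
        for name, attrs in nodes.items()
        if all(attrs.get(key) == value for key, value in filters.items())
    ]


history = ancestors("band_gap")
calculations = find_nodes(type="calculation")
```

Here history contains the two calculations and the two structures that led to the band gap, while calculations lists the calculation nodes. Because the graph is acyclic, the backward walk always terminates at nodes with no inputs.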