Welcome to the Curious Containers framework for reproducible research.
Data, code and environment are the crucial components required to reproduce a computational experiment. Usually the code consists of scripts or executables that have been set up to work on a researcher’s computer and that load data from the local filesystem. In this situation, it is difficult to store the experiment for future use, share it with coworkers, reference it in a publication or distribute it in a compute cluster for parallel batch processing.
Curious Containers (CC) is a framework for computational reproducibility that allows researchers to define each component of an experiment as a network resource. These resources are referenced by unique identifiers in the Reproducible Experiment Description (RED) file format. An execution engine provided by CC interprets the RED file and combines all resources to run the experiment. This concept makes resources interchangeable and allows each resource to have individual access restrictions, based on the chosen network transmission and authentication protocol.
If you are new to the project, we advise you to work through the RED Beginner’s Guide. As an introduction (in German), watch the following talk at the deRSE 2019 conference for research software engineering (PDF Slides).
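The idea of describing an experiment purely through resource identifiers can be sketched in a few lines of Python. This is an illustration only: the dictionary keys, the example URLs and both helper functions are hypothetical and do not reproduce the actual RED schema.

```python
from urllib.parse import urlparse

# Hypothetical, simplified experiment description. The keys and URLs are
# illustrative assumptions and do not follow the actual RED schema.
experiment = {
    "code": "https://registry.example.org/images/my-experiment:1.0",
    "input_data": "ssh://storage.example.org/data/scan.nii",
    "output_data": "https://results.example.org/upload/run-42",
}


def protocols_used(description):
    """Map each resource name to the protocol implied by its identifier."""
    return {name: urlparse(uri).scheme for name, uri in description.items()}


def swap_resource(description, name, new_uri):
    """Return a copy of the description with one resource exchanged."""
    if name not in description:
        raise KeyError(f"unknown resource: {name}")
    updated = dict(description)
    updated[name] = new_uri
    return updated
```

Because every component is only an identifier, exchanging the input data for a different source, for example with `swap_resource(experiment, "input_data", "https://mirror.example.org/scan.nii")`, leaves the rest of the description untouched.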
Issues and Feedback
If you have problems or want to give feedback, please open an issue in the curious-containers project on GitHub. The issue trackers of all other repositories are closed.
How it works
CC employs Docker to install an experiment runtime environment, including code and dependencies, in a container image. Through a container registry, this image becomes a network resource that can be deployed on any Linux computer with Docker installed.
In contrast to existing research platforms, CC as a framework does not provide a tightly integrated storage solution. Instead, it connects to any data storage that is already present in a research institution. For this to work, CC provides connector programs for common storage interfaces like SSH, HTTP or XNAT. These connectors are shipped as part of an experiment’s container image. They download input data into the running container, upload results to a defined destination or mount directories via FUSE network filesystems. If a connector for a certain storage solution does not yet exist, the user can provide a custom connector by implementing an interface specification.
CC implements two execution engines that run experiments defined in RED files. CC-FAICE is simple to install on a local computer and can run one experiment at a time. CC-Agency is a server-side execution engine that connects to Docker engines in a compute cluster for parallel execution, tracks the execution state of scheduled experiments in a database and provides a REST web interface.
CC and RED support the FAIR principles for reproducible research, which require all experiment resources to be findable, accessible, interoperable and reusable. Therefore, the command-line interface (CLI) description of the experiment’s script or executable, which is embedded in a RED file, follows the Command Line Tool Description of the Common Workflow Language (CWL). This enables portability between CC execution engines and a CWL runtime.
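To make the connector idea more concrete, the following Python sketch shows what a pluggable connector interface could look like. It is not the connector specification used by CC: the class names, the method names `receive` and `send`, and the `access` dictionary layout are assumptions chosen for illustration, and the example connector only copies files on the local filesystem.

```python
import shutil
from abc import ABC, abstractmethod
from pathlib import Path


class Connector(ABC):
    """Hypothetical connector interface (not the actual CC specification)."""

    @abstractmethod
    def receive(self, access: dict, local_path: Path) -> None:
        """Fetch the resource described by `access` into `local_path`."""

    @abstractmethod
    def send(self, access: dict, local_path: Path) -> None:
        """Store the file at `local_path` at the destination in `access`."""


class LocalCopyConnector(Connector):
    """Toy connector that treats another local path as the remote storage."""

    def receive(self, access: dict, local_path: Path) -> None:
        shutil.copyfile(access["path"], local_path)

    def send(self, access: dict, local_path: Path) -> None:
        shutil.copyfile(local_path, access["path"])
```

A connector for a different storage system would implement the same two methods, so the code that runs the experiment would not need to know where the data actually lives.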
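As a rough illustration of what such a CLI description expresses, the sketch below turns a small CWL-style CommandLineTool description into an argument list. The fields `baseCommand`, `inputs`, `inputBinding`, `position` and `prefix` come from the CWL standard, but the `grep` example and the `build_argv` function are hypothetical. The function covers only a small subset of CWL (scalar inputs, file inputs given as plain path strings) and is not how a CC engine or a CWL runtime is implemented.

```python
# A small CWL-style description of a command-line tool, written as a Python
# dictionary. The grep example itself is an illustrative assumption.
cli = {
    "cwlVersion": "v1.0",
    "class": "CommandLineTool",
    "baseCommand": "grep",
    "inputs": {
        "after_context": {
            "type": "int?",
            "inputBinding": {"position": 0, "prefix": "-A"},
        },
        "pattern": {"type": "string", "inputBinding": {"position": 1}},
        "text_file": {"type": "File", "inputBinding": {"position": 2}},
    },
}


def build_argv(cli, values):
    """Build an argument list from a description and concrete input values.

    Simplified: handles scalar values only and expects file inputs as
    plain path strings instead of CWL File objects.
    """
    base = cli["baseCommand"]
    argv = [base] if isinstance(base, str) else list(base)

    bound = []
    for name, spec in cli["inputs"].items():
        binding = spec.get("inputBinding")
        # Skip inputs that are not bound to the command line or not provided.
        if binding is None or values.get(name) is None:
            continue
        bound.append((binding.get("position", 0), name, binding, values[name]))

    # Order arguments by position, then by input name for a stable result.
    for _, _, binding, value in sorted(bound, key=lambda item: item[:2]):
        if "prefix" in binding:
            argv.append(binding["prefix"])
        argv.append(str(value))
    return argv
```

Calling `build_argv(cli, {"pattern": "ERROR", "text_file": "log.txt", "after_context": 2})` yields `["grep", "-A", "2", "ERROR", "log.txt"]`. Since the description is plain data, the same command line can be derived by any engine that understands the format.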
Machine Learning Workloads
CC explicitly supports machine learning and other high-performance workloads. Available NVIDIA graphics processing units are accessible in Docker containers using the NVIDIA Container Toolkit (or its predecessor nvidia-docker). Large training data directories that are multiple terabytes in size can be mounted via FUSE.
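For orientation, the following sketch assembles a plain `docker run` command line that requests GPUs through Docker's `--gpus` flag, which is available when the NVIDIA Container Toolkit is installed. It only shows the underlying Docker mechanism and is not how CC invokes Docker internally. The function name and the image name in the usage note are hypothetical.

```python
def docker_gpu_argv(image, command, gpus="all"):
    """Assemble a `docker run` argument list that requests GPU access.

    `gpus` is passed to Docker's `--gpus` flag, for example "all" or a
    GPU count. Pass None to run without GPUs.
    """
    argv = ["docker", "run", "--rm"]
    if gpus is not None:
        argv += ["--gpus", str(gpus)]
    argv.append(image)
    argv.extend(command)
    return argv
```

For example, `docker_gpu_argv("registry.example.org/my-experiment:1.0", ["nvidia-smi"])` produces an argument list that could be passed to `subprocess.run` on a host with Docker and the toolkit installed.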
The Road to RED 10
In the past, Curious Containers was mainly a research project to develop ideas and try new concepts in the context of computational reproducibility. Our existing file formats and software components are considered beta releases.
It is now time to stabilize the ecosystem as a major step towards long-term reproducibility. RED 10 will be the first stable release: experiments defined in the RED 10 format will be supported in all future Curious Containers software releases.
RED 10 Pre-Releases
We have released RED 8, which includes the RED Connector CLI 1 specification. Container images and their installed RED connectors that are prepared to work with RED 8 will therefore be compatible with all future RED versions.
With RED 9, we will move from the CWL 1.0 to the CWL 1.1 standard. RED 9 will be tested extensively without many additional changes, hopefully leading to a rock-solid RED 10 release.
The Curious Containers software is developed at CBMI (HTW Berlin - University of Applied Sciences). The work is supported by the German Federal Ministry of Economic Affairs and Energy (ZIM project BeCRF, grant number KF3470401BZ4), the German Federal Ministry of Education and Research (project deep.TEACHING, grant number 01IS17056 and project deep.HEALTH, grant number 13FH770IX6) and HTW Berlin Booster.