Contributing to GMCI

The GMCI initiative relies on topically related community submissions. We accept two types of submissions: datasets (or dataset collections) and analysis notebooks.

Log into Zenodo via GitHub, ORCID, OpenAIRE, or create a new account.

To upload a dataset or dataset collection, [Step 1] click on the ‘+’ to create a new record and [Step 2] select the Graphical Modelling and Causal Inference community. [Step 3] Upload the desired files and provide as much additional information as possible in the fields below.

Please upload a dataset as a .csv file, or a collection of datasets as a .zip file. If applicable, auxiliary files (graph, ground truth, license, extended description) should be uploaded as separate files. We require the description of the Zenodo record to have the following format:

A description of the dataset/dataset collection.

# always two blank rows between information blocks

Task: A description of the Task.

Summary:

  • Size of dataset: Nr. of samples x Nr. of dimensions

  • Task: Causal Discovery Problem / Causal Inference Problem

  • Data Type: Continuous Data / Mixed Data / Discrete Data / Binary Data / Categorical Data

  • Dataset Scope: Collection of Datasets / Standalone Dataset

  • Ground Truth: Known Graph / Partial Graph / Unknown Graph

  • Temporal Structure: Static Data / Time Series Data

  • License: CC0 / CC BY / CC BY-NC / …

  • Missing Values: Existing Missing Values / No Missing Values

Missingness Statement: Missingness Statement, if applicable.

Collection: # If applicable.

  • Dataset1: Description Dataset1

  • Dataset2: Description Dataset2

Features:

  • Feature1: Description Feature1

  • Feature2: Description Feature2

Files:

  • File1: Description File1

  • File2: Description File2

License: # If applicable. Please indicate the main license (that of the dataset, not of supporting material) in the Summary above

  • File1: License1

  • File2: License2
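
The description format above can also be assembled programmatically. The following is a minimal sketch, not an official GMCI or Zenodo tool: the function names (check_summary, render_block, render_description) are hypothetical, the allowed values are copied from the Summary list above, and we assume that "two blank rows" means two empty lines between information blocks. The example values at the bottom are placeholders.

```python
# Hypothetical helpers for drafting a record description; not part of GMCI.
ALLOWED = {
    "Task": {"Causal Discovery Problem", "Causal Inference Problem"},
    "Data Type": {"Continuous Data", "Mixed Data", "Discrete Data",
                  "Binary Data", "Categorical Data"},
    "Dataset Scope": {"Collection of Datasets", "Standalone Dataset"},
    "Ground Truth": {"Known Graph", "Partial Graph", "Unknown Graph"},
    "Temporal Structure": {"Static Data", "Time Series Data"},
    "Missing Values": {"Existing Missing Values", "No Missing Values"},
}


def check_summary(summary):
    """Return a list of problems in the Summary fields (empty if none)."""
    problems = []
    for field, allowed in ALLOWED.items():
        if field not in summary:
            problems.append(f"missing field: {field}")
        elif summary[field] not in allowed:
            problems.append(f"unexpected value for {field}: {summary[field]}")
    # Free-text fields: only check that they are present and non-empty.
    for field in ("Size of dataset", "License"):
        if not summary.get(field):
            problems.append(f"missing field: {field}")
    return problems


def render_block(title, items):
    """Render one bulleted information block, e.g. Summary or Features."""
    lines = [f"{title}:"]
    lines += [f"  • {name}: {text}" for name, text in items.items()]
    return "\n".join(lines)


def render_description(intro, task, summary, features, files):
    """Join the information blocks, separated by two empty lines (assumption)."""
    blocks = [
        intro,
        f"Task: {task}",
        render_block("Summary", summary),
        render_block("Features", features),
        render_block("Files", files),
    ]
    return "\n\n\n".join(blocks)


if __name__ == "__main__":
    # Placeholder values for illustration only.
    summary = {
        "Size of dataset": "1000 x 5",
        "Task": "Causal Discovery Problem",
        "Data Type": "Continuous Data",
        "Dataset Scope": "Standalone Dataset",
        "Ground Truth": "Known Graph",
        "Temporal Structure": "Static Data",
        "License": "CC BY",
        "Missing Values": "No Missing Values",
    }
    print(check_summary(summary))
    print(render_description("A description of the dataset.", "A description of the Task.",
                             summary, {"Feature1": "Description Feature1"},
                             {"File1": "Description File1"}))
```

The optional blocks (Missingness Statement, Collection, per-file License) are left out of the sketch for brevity; they could be added with further calls to render_block.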

For a good reference, see the ALARM (A Logical Alarm Reduction Mechanism) dataset and the Sachs: Protein and Phospholipids Expressions dataset collection.
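
When preparing a dataset collection, the individual .csv files need to be bundled into a single .zip file before upload. One possible way to do this is sketched below, using only the Python standard library. The function name zip_collection and the flat archive layout are our assumptions; auxiliary files are deliberately left out, since they should be uploaded separately.

```python
import zipfile
from pathlib import Path


def zip_collection(csv_dir, archive_path):
    """Bundle every .csv file directly inside csv_dir into one .zip archive.

    Returns the list of file names that were added.
    """
    csv_dir = Path(csv_dir)
    csv_files = sorted(csv_dir.glob("*.csv"))
    if not csv_files:
        raise ValueError(f"no .csv files found in {csv_dir}")
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for csv_file in csv_files:
            # arcname keeps the archive flat: only the file name is stored.
            archive.write(csv_file, arcname=csv_file.name)
    return [csv_file.name for csv_file in csv_files]
```

A call such as zip_collection("my_collection", "my_collection.zip") (both names are placeholders) would produce the archive to upload in Step 3.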

Prepare a git repository from the GMCI notebook template. Follow the Analysis Notebook Template (notebook.qmd) and save the picture to be shown in the library as gallery_picture.png.

Log into Zenodo via GitHub, ORCID, OpenAIRE, or create a new account. Then connect your repository with Zenodo as shown in the picture below.

This automatically creates a record of the zipped repository along with a unique DOI. Because the whole repository is zipped, we advise restricting its content to the necessary files. The associated DOI forwards you to the Zenodo record, which you can submit to the Graphical Modelling and Causal Inference community by editing the record’s page. Although the record’s description is taken from the git release, the record’s metadata can be edited manually afterwards, e.g., to reference a publication, link the associated Zenodo dataset, or specify an R package.

We connect the accepted repository with git submodule add, render notebook.qmd and showcase the analysis on this website’s library.
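
For illustration, the maintainer-side step can be sketched as a small script. This is not the actual GMCI tooling: the function names, the submodule path, and the use of the Quarto command line (quarto render) to render notebook.qmd are assumptions on our part. By default the script only prints the commands.

```python
import subprocess


def integration_commands(repo_url, submodule_path, notebook="notebook.qmd"):
    """Build the commands to add a repository as a submodule and render its notebook."""
    return [
        ["git", "submodule", "add", repo_url, submodule_path],
        # Assumption: the notebook is rendered with the Quarto CLI.
        ["quarto", "render", f"{submodule_path}/{notebook}"],
    ]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for command in commands:
        print(" ".join(command))
        if not dry_run:
            # check=True stops at the first failing command.
            subprocess.run(command, check=True)
```

Keeping the command construction separate from execution makes it possible to review the commands before anything is changed in the GMCI repository.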

Please contact us when you publish a new release of the git repository. A new release automatically triggers a new version of the Zenodo record. This website, however, has to be rendered manually. Note that the Zenodo record, as a copy of the git repository, is permanent, and previous record versions (that is, previous git releases) remain accessible.

For a good reference, see the PC Algorithm for Causal Discovery.

For both options, a Zenodo record has to be created according to the structure above and submitted to the GMCI Zenodo community. The community moderators process each submission, create the necessary embedding links, request revisions where needed, and assist with any questions. Accepted datasets appear in the GMCI Zenodo community; for notebook submissions, the git repository is archived on Zenodo and the rendered analysis is featured in this website’s library.

Central instruments of FAIR (Wilkinson et al. 2016) research are providing open data access to researchers and transparently communicating design decisions via metadata in a findable and well-structured place. Therefore, please provide as much additional information as possible in the submission procedure.

Wilkinson, Mark D, Michel Dumontier, IJsbrand Jan Aalbersberg, Gabrielle Appleton, Myles Axton, Arie Baak, Niklas Blomberg, et al. 2016. “The FAIR Guiding Principles for Scientific Data Management and Stewardship.” Scientific Data 3 (1): 1–9.

The ownership after acceptance will stay with the authors. Git repositories hosting the notebook will become a submodule of the GMCI git repository, and Zenodo records will become part of the GMCI community.

There are no associated costs.

About Zenodo

The Zenodo platform, hosted by CERN, provides a long-term storage solution with unique digital object identifiers (DOIs), which can be cited in publications. Datasets and notebooks can be uploaded after publication, and metadata (e.g., descriptions, licenses, references) can be changed after the initial record creation. Note that the created record and data cannot be deleted except under special circumstances. If you are unsure, you can test the mechanics of Zenodo in Zenodo’s Sandbox Instance without creating permanent digital objects.
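
As a small illustration of how a DOI relates to a record, the sketch below maps a Zenodo DOI to the address of the record in Zenodo's REST API. It rests on assumptions taken from Zenodo's public documentation rather than from GMCI: production DOIs use the prefix 10.5281, Sandbox DOIs use the test prefix 10.5072, and records are served under /api/records/. The function name record_api_url is hypothetical.

```python
import re

# Assumed DOI prefixes and hosts; see the lead-in above.
BASE_URLS = {
    "10.5281": "https://zenodo.org",          # production instance
    "10.5072": "https://sandbox.zenodo.org",  # Sandbox Instance
}

DOI_PATTERN = re.compile(r"^(10\.\d+)/zenodo\.(\d+)$")


def record_api_url(doi):
    """Return the REST API address of the record behind a Zenodo DOI."""
    match = DOI_PATTERN.match(doi.strip())
    if match is None:
        raise ValueError(f"not a Zenodo DOI: {doi!r}")
    prefix, record_id = match.groups()
    if prefix not in BASE_URLS:
        raise ValueError(f"unknown DOI prefix: {prefix}")
    return f"{BASE_URLS[prefix]}/api/records/{record_id}"
```

Trying a submission in the Sandbox first and checking the resulting record this way avoids creating a permanent record by mistake.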