Contributing to GMCI

The GMCI initiative is backed by topically related community submissions. We allow two types of submissions:

Dataset (Dataset Collection)

Log into Zenodo via GitHub, ORCID, OpenAIRE, or create a new account.

To upload a dataset or dataset collection, [Step 1] click on the ‘+’ to create a new record, [Step 2] select the Graphical Modelling and Causal Inference community, and [Step 3] upload the desired files and provide as much additional information as possible in the fields below.

Please upload a dataset as a .csv file, or a collection of datasets as a .zip file. If applicable, auxiliary files (graph, ground truth, license, extended description) should be uploaded as separate files. We require the description of the Zenodo record to have the following format:

A description of the dataset/dataset collection.

# always two blank rows between information blocks

Task: A description of the Task.

Summary:

Size of dataset: Nr. of samples x Nr. of dimensions
Task: Causal Discovery Problem / Causal Inference Problem
Data Type: Continuous Data / Mixed Data / Discrete Data / Binary Data / Categorical Data
Dataset Scope: Collection of Datasets / Standalone Dataset
Ground Truth: Known Graph / Partial Graph / Unknown Graph
Temporal Structure: Static Data / Time Series Data
License: CC0 / CC BY / CC BY-NC / …
Missing Values: Existing Missing Values / No Missing Values

Missingness Statement: Missingness Statement, if applicable.

Collection: # If applicable.

Dataset1: Description Dataset1
Dataset2: Description Dataset2
…

Features:

Feature1: Description Feature1
Feature2: Description Feature2
…

Files:

File1: Description File1
File2: Description File2
…

License: # If applicable. Please indicate the main license (of the dataset, not of supporting material) in the Summary above

File1: License1
File2: License2
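The format above can also be assembled programmatically. The following is a minimal Python sketch, not an official GMCI tool: the function name build_description, its parameters, and all example values are hypothetical placeholders, and it assumes that every headed block (description, Task, Summary, Missingness Statement, Collection, Features, Files, License) counts as one information block separated from the next by two blank lines.

```python
def build_description(intro, task, summary, sections, missingness=None):
    """Assemble a record description with two blank lines between blocks.

    All names here are illustrative; the block boundaries are an assumption.
    """
    blocks = [intro.strip(), f"Task: {task.strip()}"]

    # Summary block: one "Key: Value" entry per line.
    summary_lines = [f"{key}: {value}" for key, value in summary.items()]
    blocks.append("Summary:\n\n" + "\n".join(summary_lines))

    # Optional missingness statement, only if one was supplied.
    if missingness:
        blocks.append(f"Missingness Statement: {missingness.strip()}")

    # Optional blocks (Collection, Features, Files, License) are skipped when empty.
    for title, entries in sections.items():
        if not entries:
            continue
        lines = [f"{name}: {text}" for name, text in entries.items()]
        blocks.append(f"{title}:\n\n" + "\n".join(lines))

    # Two blank lines between blocks means three consecutive newlines.
    return "\n\n\n".join(blocks)


# Placeholder values only; replace them with the real metadata of the record.
description = build_description(
    intro="A placeholder description of the dataset.",
    task="A placeholder description of the task.",
    summary={
        "Size of dataset": "100 x 5",
        "Task": "Causal Discovery Problem",
        "Data Type": "Continuous Data",
        "Dataset Scope": "Standalone Dataset",
        "Ground Truth": "Known Graph",
        "Temporal Structure": "Static Data",
        "License": "CC BY",
        "Missing Values": "No Missing Values",
    },
    sections={
        "Collection": {},  # not applicable for a standalone dataset
        "Features": {"Feature1": "Description Feature1"},
        "Files": {"File1": "Description File1"},
    },
)
print(description)
```

The resulting text can be pasted into the description field of the Zenodo record and adjusted by hand where needed.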

For a good reference, see the ALARM (A Logical Alarm Reduction Mechanism) dataset and the Sachs: Protein and Phospholipids Expressions dataset collection.
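The Size of dataset and Missing Values entries of the summary can be derived from the .csv file itself. The sketch below is one possible way to do this with the Python standard library. It assumes that the first row is a header, that every further non-empty row is one sample, and that missing values appear as empty cells or as one of the markers in MISSING_TOKENS; the function name and the marker set are hypothetical and should be adapted to the dataset at hand.

```python
import csv
import io

# Assumed markers for missing values; adjust to the conventions of the dataset.
MISSING_TOKENS = {"", "NA", "NaN", "nan", "NULL"}


def summarize_csv(handle):
    """Return the 'Size of dataset' and 'Missing Values' summary entries."""
    reader = csv.reader(handle)
    header = next(reader)  # assumption: the first row holds the feature names
    n_samples = 0
    has_missing = False
    for row in reader:
        if not row:  # skip completely empty lines
            continue
        n_samples += 1
        too_short = len(row) < len(header)
        if too_short or any(cell.strip() in MISSING_TOKENS for cell in row):
            has_missing = True
    return {
        "Size of dataset": f"{n_samples} x {len(header)}",
        "Missing Values": (
            "Existing Missing Values" if has_missing else "No Missing Values"
        ),
    }


# Tiny in-memory example; for a real file use open("data.csv", newline="").
sample = io.StringIO("x,y,z\n1,2,3\n4,,6\n")
summary = summarize_csv(sample)
print(summary)
```

For this in-memory example, the function reports two samples, three dimensions, and existing missing values because of the empty cell in the second row.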

Analysis Notebook

Prepare a git repository from the GMCI notebook template. Follow the Analysis Notebook Template (notebook.qmd) and set the picture for the library as gallery_picture.png.
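Before connecting the repository to Zenodo, it can help to check that the two files named above are present. The following Python sketch does only that. It assumes that both notebook.qmd and gallery_picture.png live in the repository root, which the template may or may not require, and the helper name missing_files is hypothetical.

```python
from pathlib import Path
import tempfile

# File names taken from the text above; their location in the root is an assumption.
REQUIRED_FILES = ("notebook.qmd", "gallery_picture.png")


def missing_files(repo_dir):
    """Return the required files that are not present in repo_dir."""
    root = Path(repo_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


# Demonstration on a temporary directory that only contains the notebook.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "notebook.qmd").write_text("---\ntitle: placeholder\n---\n")
    result = missing_files(tmp)

print(result)  # the gallery picture is reported as missing
```

Calling missing_files(".") from inside the repository lists what still needs to be added; an empty list means both files were found.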

Log into Zenodo via GitHub, ORCID, OpenAIRE, or create a new account. Then, connect your repository with Zenodo as shown in the picture below.

Source: https://zenodo.org/account/settings/github/

This will automatically create a record of the zipped repository along with a unique DOI. As the whole repository is zipped, we advise restricting the repository’s content to the necessary files. The associated DOI will forward you to the Zenodo record, which you can submit to the Graphical Modelling and Causal Inference community by editing the record’s page. Although the record’s description is taken from the git release, the record’s metadata can be edited manually afterwards, e.g., to reference a publication, link the associated Zenodo dataset, or specify an R package.

Once a repository is accepted, we connect it with git submodule add, render notebook.qmd, and showcase the analysis in this website’s library.

Please contact us when publishing a new release of the git repository. A new release automatically triggers a new version of the Zenodo record. This website, however, has to be rendered manually. Note that the record on Zenodo, as a copy of the git repository, is permanent, and previous record versions (and the corresponding git releases) can still be accessed.

For a good reference, see the PC Algorithm for Causal Discovery.

For both options, a Zenodo record has to be created according to the structure above and submitted to the GMCI Zenodo community. The community moderators process each submission, create the necessary embedding links, request revisions where needed, and assist with any questions. While accepted datasets appear in the GMCI Zenodo community, the git repositories of notebook submissions are backed up on Zenodo, and the analysis is rendered and featured in this website’s library.

Central instruments of FAIR (Wilkinson et al. 2016) research are providing open data access to researchers and transparently communicating design decisions via metadata in a findable and well-structured place. Therefore, please provide as much additional information as possible in the submission procedure.

Wilkinson, Mark D, Michel Dumontier, IJsbrand Jan Aalbersberg, Gabrielle Appleton, Myles Axton, Arie Baak, Niklas Blomberg, et al. 2016. “The FAIR Guiding Principles for Scientific Data Management and Stewardship.” Scientific Data 3 (1): 1–9.

Ownership stays with the authors after acceptance. Git repositories hosting the notebook will become a submodule of the GMCI git repository, and Zenodo records will become part of the GMCI community.

There are no associated costs.

About Zenodo

The Zenodo platform, hosted by CERN, provides a long-term storage solution with unique digital object identifiers (DOIs), which can be referred to in publications. Uploading datasets and notebooks after publication is possible, and metadata (e.g., descriptions, licenses, references) can also be changed after the initial record creation. Note that the created record and data cannot be deleted except under special circumstances. If in doubt, you can test the mechanics of Zenodo in Zenodo’s Sandbox Instance without creating permanent digital objects.