Using CausalMGM is easy. Just follow the steps below.

CausalMGM is a data analysis tool for exploring large, complex datasets. The method learns a graphical model of the data in which the nodes are variables and the edges represent dependencies among variables. The graphical model allows users to query their data to find the direct influences of a target variable of interest, or to find novel associations between pairs of variables.

Introduction Video

User Guide

Manage Your Data

First, let's upload your dataset

Once uploaded, your dataset format will be verified.
Need help with the data format? Already have your task ID?
Upload Data Explain Data Format

Don't have a dataset?

Get Sample Data Use Sample Data Retrieve Results

Configure Feature Selection

Do you wish to perform automatic clustering of redundant features?     

Number of Variables to be selected:

Name of Target Variable:

Variables to keep (optional):


Configure Your Experiment

Please specify the value for each parameter

Lambda 1:
Lambda 2:
Lambda 3:

Enable PC-Stable?




Results will be available here


CausalMGM Implementation:

Vineet Raghu

Interface & Cloud Construction:

Xiaoyu Ge, Daniel Petrov

Supervised By:

Panos K. Chrysanthis
Panayiotis V. Benos

Data Format:

The CausalMGM method expects a text file in tab-separated format with variables in the columns and samples in the rows.

  • The first row of the file should have the variable names, and each row following should have numerical or categorical data.
  • Numeric columns should contain only digits and a single decimal point.
  • Categorical columns may have at most 5 unique categories, and each category may be encoded with any combination of numbers and characters.
  • The current implementation of MGM does not support data with missing values, so users should handle missing values before submitting.
  • If missing values remain, complete case analysis or median imputation will be performed automatically. To use these automated methods, encode each missing entry with an *.
  • Please download the sample data to see an example of the properly formatted dataset.
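
The rules above can be checked locally before uploading. The following is a minimal sketch, not the validator the CausalMGM server runs. The function name `check_format`, the regular expression, and the rule that any column with a non-numeric entry counts as categorical are all assumptions made for illustration.

```python
import csv
import io
import re

# Assumption: "digits along with a single decimal point" is read as
# one or more digits, optionally followed by a point and more digits.
NUMERIC = re.compile(r"^\d+(\.\d+)?$")
MISSING = "*"          # missing-value marker described in the format rules
MAX_CATEGORIES = 5     # category limit described in the format rules


def check_format(text):
    """Return a list of problems found in tab-separated text (empty if none)."""
    rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
    if len(rows) < 2:
        return ["file needs a header row and at least one sample row"]
    header, samples = rows[0], rows[1:]

    # Every sample row must have one field per variable name.
    problems = [
        f"line {i}: expected {len(header)} fields, found {len(row)}"
        for i, row in enumerate(samples, start=2)
        if len(row) != len(header)
    ]
    if problems:
        return problems

    for j, name in enumerate(header):
        values = [row[j] for row in samples if row[j] != MISSING]
        if "" in values:
            problems.append(f"column {name!r}: empty field (use {MISSING} for missing data)")
            continue
        if all(NUMERIC.match(v) for v in values):
            continue  # hypothetical rule: all-numeric entries mean a numeric column
        categories = set(values)
        if len(categories) > MAX_CATEGORIES:
            problems.append(
                f"column {name!r}: {len(categories)} categories, maximum is {MAX_CATEGORIES}"
            )
    return problems
```

To check a file, read it as text and pass the contents to `check_format`; an empty list means no problems were detected under these assumed rules. The sample data remains the authoritative example of a correctly formatted dataset.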

Explanation of Feature Selection:

CausalMGM's feature selection is based on the PrefDiv algorithm, a method that identifies the features most associated with a target variable yet not associated with one another.

PrefDiv requires the following input:

  • The number of features to be selected.
  • Features that should be kept no matter their relevance.
  • The target variable.

Important Notes:
  • PrefDiv only operates on continuous features, so all categorical features will automatically be included.
  • The target variable may be continuous or categorical.
  • The method is computationally expensive on large datasets, so please use it with caution.
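
To illustrate the idea of picking features that are relevant to the target but not redundant with each other, here is a simplified greedy sketch. It is not the PrefDiv algorithm itself. It swaps in a plain Pearson-correlation filter, handles continuous columns only, and the names `select_features` and `max_redundancy` (with its default of 0.8) are assumptions made for this example.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def select_features(columns, target, k, keep=(), max_redundancy=0.8):
    """Greedily pick up to k feature names from `columns` (a dict of name -> values).

    Features listed in `keep` are always included. Remaining candidates are
    visited in order of decreasing absolute correlation with the target, and a
    candidate is accepted only if its absolute correlation with every
    already-selected feature is below `max_redundancy` (a hypothetical threshold).
    """
    selected = list(keep)
    candidates = sorted(
        (name for name in columns if name != target and name not in selected),
        key=lambda name: abs(pearson(columns[name], columns[target])),
        reverse=True,
    )
    for name in candidates:
        if len(selected) >= k:
            break
        if all(abs(pearson(columns[name], columns[s])) < max_redundancy for s in selected):
            selected.append(name)
    return selected
```

The three arguments `k`, `keep`, and `target` mirror the three inputs listed above. Because each candidate is compared against every selected feature, the cost of this sketch grows with both the number of features and the number requested, which gives a rough sense of why selection on large datasets can be slow.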

Introduction to CausalMGM