What is CKT?
CKT (Causal Knowledge Trace) is an interactive web application for exploring and analyzing causal relationships in knowledge graphs.
It provides tools for visualizing graphs (which may contain cycles), performing causal inference analysis, and understanding
complex relationships between variables extracted from biomedical literature.
Key Features
- Interactive Graph Visualization: Explore causal graphs with zoom, pan, and physics-based layouts
- Graph Editing: Remove nodes and edges, undo changes, and save modified graphs
- Causal Analysis: Calculate adjustment sets, find instrumental variables, and analyze causal paths
- Graph Configuration: Generate knowledge graphs from biomedical databases with customizable parameters
- Multiple Export Options: Save graphs as R files or HTML reports
User Guide
Step 1: Graph Configuration (Optional)
If you want to generate a new knowledge graph from the SemMedDB database:
- Navigate to the Graph Configuration tab
- Enter the Exposure CUI (Concept Unique Identifier) - you can search for a word and select from available CUIs
- Enter the Outcome CUI - you can search for a word and select from available CUIs
- Configure optional parameters:
  - Squelch Threshold: Minimum number of unique PMIDs (publications) required for an edge
  - Publication Year Cutoff: Only include publications from this year onwards
  - Degree: Maximum distance from exposure/outcome nodes to include
  - SemMedDB Version: Select the database version to use
- Click Generate Graph to create your knowledge graph
- Wait for the process to complete - this may take several minutes
Tip: If you already have a graph file, you can skip this step and go directly to Data Upload.
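As a rough illustration of how the Squelch Threshold and Publication Year Cutoff interact, here is a minimal Python sketch. It is not CKT's actual implementation: the function name `filter_edges`, the record layout, and the choice to apply the year cutoff before counting PMIDs are all assumptions made for this example, and the records are made up.

```python
from collections import defaultdict

def filter_edges(records, squelch_threshold, year_cutoff):
    """Keep edges backed by at least `squelch_threshold` unique PMIDs
    published in `year_cutoff` or later (hypothetical helper)."""
    pmids_by_edge = defaultdict(set)
    for subject, obj, pmid, year in records:
        if year >= year_cutoff:
            # A set ensures each publication is counted once per edge.
            pmids_by_edge[(subject, obj)].add(pmid)
    return sorted(edge for edge, pmids in pmids_by_edge.items()
                  if len(pmids) >= squelch_threshold)

# Made-up records: (subject, object, PMID, publication year)
records = [
    ("Smoking", "Lung_Cancer", 101, 2015),
    ("Smoking", "Lung_Cancer", 102, 2018),
    ("Smoking", "Lung_Cancer", 102, 2018),   # duplicate PMID, counted once
    ("Smoking", "Tar_Deposits", 103, 1995),  # older than the cutoff
    ("Age", "Lung_Cancer", 104, 2020),       # only one supporting PMID
]
kept = filter_edges(records, squelch_threshold=2, year_cutoff=2000)
print(kept)  # [('Smoking', 'Lung_Cancer')]
```

Raising the threshold trades coverage for confidence: fewer edges survive, but each is supported by more publications.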
Step 2: Data Upload
Load a graph file into the application:
- Navigate to the Data Upload tab
- Choose one of two methods:
  - Method 1 - Select Existing File: Choose a file from the dropdown (files in graph_creation/result directory) and click 'Load Selected Graph'
  - Method 2 - Upload New File: Use the file upload interface to upload an R file containing your DAG definition
- Optional: Apply filtering to remove leaf nodes (nodes with only one connection)
- Wait for the graph to load - you'll see a progress indicator
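The optional leaf-node filter mentioned above can be pictured with the following Python sketch. It assumes that "one connection" means exactly one incident edge and that the filter runs as a single pass; neither detail is confirmed by this guide, and `remove_leaf_nodes` is a hypothetical name.

```python
from collections import Counter

def remove_leaf_nodes(edges):
    """Drop every edge that touches a node with exactly one incident edge.
    Single pass only: nodes that become leaves afterwards are kept."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    leaves = {node for node, d in degree.items() if d == 1}
    return [(a, b) for a, b in edges if a not in leaves and b not in leaves]

edges = [
    ("Age", "Smoking"),
    ("Age", "Lung_Cancer"),
    ("Smoking", "Tar_Deposits"),
    ("Tar_Deposits", "Lung_Cancer"),
    ("Coffee", "Smoking"),  # Coffee has a single connection
]
filtered = remove_leaf_nodes(edges)
print(filtered)
```

In this example only `Coffee` is a leaf, so only its edge is removed.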
Required File Format: Your R file must contain a dagitty graph definition assigned to variable 'g'
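To make the required format concrete, the sketch below holds a small example R file as a Python string and runs a heuristic check that it assigns a `dagitty(...)` call to `g`. The check is an assumption made for illustration, not how CKT validates uploads; it would miss other valid spellings such as `dagitty::dagitty(...)`. The variable names in the example graph are invented.

```python
import re

# Heuristic: a line that starts with `g <- dagitty(` or `g = dagitty(`.
ASSIGNS_G = re.compile(r"^\s*g\s*(?:<-|=)\s*dagitty\s*\(", re.MULTILINE)

def defines_dagitty_g(r_source):
    """Return True if the R source appears to assign a dagitty graph to `g`."""
    return ASSIGNS_G.search(r_source) is not None

example_r_file = """\
library(dagitty)
g <- dagitty('dag {
  Smoking [exposure]
  Lung_Cancer [outcome]
  Smoking -> Tar_Deposits
  Tar_Deposits -> Lung_Cancer
  Age -> Smoking
  Age -> Lung_Cancer
}')
"""

print(defines_dagitty_g(example_r_file))                        # True
print(defines_dagitty_g("graph <- dagitty('dag { A -> B }')"))  # False
```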
Step 3: Graph Visualization
Explore and modify your causal graph:
- Navigate to the Graph Visualization tab
- Interact with the graph:
  - Zoom: Use mouse wheel or pinch gesture
  - Pan: Click and drag the background
  - Select Nodes/Edges: Click on them to view details
  - Move Nodes: Drag nodes to reposition them
- View edge information in the table below the graph
- Modify the graph:
  - Select a node and click Remove Selected Node
  - Select an edge and click Remove Selected Edge
  - Click Undo Last Removal to revert changes
- Adjust physics settings for better layout
- Save your work:
  - Save DAG: Download modified graph as an R file
  - Save HTML: Export as a readable HTML report
Node Colors:
● Red = Exposure
● Cyan = Outcome
● Gray = Other variables
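The remove-and-undo workflow in this step can be modelled with a simple stack, as in the Python sketch below. This is only one plausible design, assumed for illustration: the class `EditableGraph` and its methods are hypothetical, and the graph is reduced to a list of edges.

```python
class EditableGraph:
    """Edge list with removals that can be undone in reverse order."""

    def __init__(self, edges):
        self.edges = list(edges)
        self._undo_stack = []  # each entry: edges removed by one action

    def remove_edge(self, edge):
        if edge in self.edges:
            self.edges.remove(edge)
            self._undo_stack.append([edge])

    def remove_node(self, node):
        # Removing a node also removes every edge that touches it.
        removed = [e for e in self.edges if node in e]
        self.edges = [e for e in self.edges if node not in e]
        self._undo_stack.append(removed)

    def undo_last_removal(self):
        if self._undo_stack:
            self.edges.extend(self._undo_stack.pop())

graph = EditableGraph([
    ("Age", "Smoking"),
    ("Smoking", "Tar_Deposits"),
    ("Tar_Deposits", "Lung_Cancer"),
])
graph.remove_node("Tar_Deposits")       # drops two edges in one action
graph.remove_edge(("Age", "Smoking"))
graph.undo_last_removal()               # restores Age -> Smoking
graph.undo_last_removal()               # restores both Tar_Deposits edges
print(sorted(graph.edges))
```

Storing the removed edges per action means one undo restores a whole node removal, not just a single edge.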
Step 4: Causal Analysis
Perform graph-based causal inference analysis:
- Navigate to the Causal Analysis tab
- Select your Exposure Variable from the dropdown
- Select your Outcome Variable from the dropdown
- Choose the Effect Type (Total Effect or Direct Effect)
- Run analyses:
  - Calculate Adjustment Sets: Find variables to control for to estimate causal effects
  - Find Instrumental Variables: Identify variables for instrumental variable analysis
  - Analyze Causal Paths: Examine all paths between exposure and outcome
  - Run Complete Analysis: Execute all analyses at once
- Review results in the tabbed panels
Understanding Causal Concepts
Adjustment Sets
A set of variables that, when controlled for in your analysis, blocks all confounding (non-causal) paths between exposure and outcome while leaving the causal paths open. Adjusting for such a set lets you estimate the causal effect without confounding bias, assuming the graph is correct.
Example: If you want to know whether smoking causes lung cancer, you might need to adjust for age and genetics to block confounding paths.
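One well-known sufficient test for a valid adjustment set is Pearl's backdoor criterion. The Python sketch below checks it on a small acyclic graph using the moralized ancestral graph method for d-separation. It illustrates the concept only: it is not necessarily the procedure CKT or dagitty use, it assumes the graph has no cycles, and the function names and example graph are invented for this example.

```python
from itertools import combinations

def _reachable(start, neighbours):
    """All nodes reachable from `start` by following `neighbours`."""
    seen, stack = set(start), list(start)
    while stack:
        node = stack.pop()
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def d_separated(edges, x, y, z):
    """True if z d-separates x and y in the DAG given by `edges`."""
    parents = {}
    for a, b in edges:
        parents.setdefault(b, set()).add(a)
    # 1. Restrict to x, y, z and their ancestors.
    keep = _reachable({x, y} | set(z), parents)
    # 2. Moralize: link each node to its parents, and co-parents to each other.
    adj = {n: set() for n in keep}
    for child in keep:
        ps = parents.get(child, set())
        for p in ps:
            adj[p].add(child)
            adj[child].add(p)
        for p, q in combinations(ps, 2):
            adj[p].add(q)
            adj[q].add(p)
    # 3. Delete z and test whether x can still reach y.
    open_nodes = _reachable({x}, {n: adj[n] - set(z) for n in adj})
    return y not in open_nodes

def satisfies_backdoor(edges, x, y, z):
    """True if z meets the backdoor criterion for the effect of x on y."""
    children = {}
    for a, b in edges:
        children.setdefault(a, set()).add(b)
    descendants = _reachable({x}, children) - {x}
    if descendants & set(z):
        return False  # z must not contain descendants of x
    without_x_out = [(a, b) for a, b in edges if a != x]
    return d_separated(without_x_out, x, y, z)

edges = [
    ("Age", "Smoking"),
    ("Age", "Lung_Cancer"),
    ("Smoking", "Tar_Deposits"),
    ("Tar_Deposits", "Lung_Cancer"),
]
print(satisfies_backdoor(edges, "Smoking", "Lung_Cancer", set()))             # False
print(satisfies_backdoor(edges, "Smoking", "Lung_Cancer", {"Age"}))           # True
print(satisfies_backdoor(edges, "Smoking", "Lung_Cancer", {"Tar_Deposits"}))  # False
```

Here `Age` is a common cause of smoking and lung cancer, so the empty set fails, `{Age}` succeeds, and `{Tar_Deposits}` fails because it lies on the causal path.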
Instrumental Variables
A variable that (1) affects the exposure, (2) does not directly affect the outcome except through the exposure, and (3) is not associated with confounders.
Example: Distance to a smoking cessation clinic might be an instrument for smoking behavior when studying health outcomes.
Causal Paths
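Directed paths from exposure to outcome in the graph, along which the causal effect flows. Non-causal paths between the two (for example, backdoor paths through a common cause) can be 'open' (creating confounding) or 'blocked' (closed by a collider or by adjustment).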
Example: Smoking → Tar Deposits → Lung Cancer is a causal path.
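Listing the directed paths from exposure to outcome is a plain graph traversal. The Python sketch below uses a depth-first search that never revisits a node on the current path, so it also terminates on graphs with cycles. The function name and the example edges are assumptions made for illustration.

```python
def directed_paths(edges, source, target):
    """Return every simple directed path from `source` to `target`."""
    children = {}
    for a, b in edges:
        children.setdefault(a, []).append(b)

    paths = []

    def walk(node, path):
        if node == target:
            paths.append(path)
            return
        for nxt in children.get(node, []):
            if nxt not in path:  # skip nodes already on this path
                walk(nxt, path + [nxt])

    walk(source, [source])
    return paths

edges = [
    ("Smoking", "Tar_Deposits"),
    ("Tar_Deposits", "Lung_Cancer"),
    ("Smoking", "Lung_Cancer"),
    ("Age", "Smoking"),
    ("Age", "Lung_Cancer"),
]
for path in directed_paths(edges, "Smoking", "Lung_Cancer"):
    print(" -> ".join(path))
# Smoking -> Tar_Deposits -> Lung_Cancer
# Smoking -> Lung_Cancer
```

The path through `Age` is not listed because it is not directed from exposure to outcome; it is a backdoor path.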
Causal Graph
A graphical representation of causal relationships where nodes represent variables and directed edges represent causal effects. The graph may contain cycles representing feedback loops.
Example: Education → Income → Health Status
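Because the graph may contain feedback loops, it can be useful to check whether a given edge list is cyclic before running analyses that assume a DAG. The Python sketch below does this with a depth-first search. It is a generic technique, not a description of CKT's internals, and the recursive form is only suitable for graphs of moderate size.

```python
def has_cycle(edges):
    """True if the directed graph given by `edges` contains a cycle."""
    children = {}
    for a, b in edges:
        children.setdefault(a, []).append(b)
        children.setdefault(b, [])

    UNVISITED, IN_PROGRESS, DONE = 0, 1, 2
    state = dict.fromkeys(children, UNVISITED)

    def visit(node):
        state[node] = IN_PROGRESS
        for nxt in children[node]:
            if state[nxt] == IN_PROGRESS:
                return True  # back edge: nxt is on the current path
            if state[nxt] == UNVISITED and visit(nxt):
                return True
        state[node] = DONE
        return False

    return any(state[n] == UNVISITED and visit(n) for n in children)

chain = [("Education", "Income"), ("Income", "Health_Status")]
print(has_cycle(chain))                                  # False
print(has_cycle(chain + [("Health_Status", "Income")]))  # True
```

The second call adds a feedback edge from health status back to income, which turns the chain into a cyclic graph.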