EpiXplain

CIIE — Counterfactual CpG Inverse Impact Engine

Local SHAP-style attributions + in silico CpG “neutralization” to test how perturbations shift a model’s decision. From disease → explanatory CpGs → genomic context.


Why counterfactuals?

Predictions alone do not convince clinicians. CIIE augments a classifier with what-if reasoning: if we attenuate or suppress specific CpG signals, does the predicted label change?

This provides actionable evidence that a subset of CpGs causally drives the model’s output rather than merely correlating with it. CIIE also links those CpGs to regulatory elements and genes to suggest plausible mechanisms.

  • Local: per-patient
  • Counterfactual: neutralize CpGs
  • Mechanistic: CpG → gene

Inputs & outputs

  • Input: trained model + feature set (CpGs) + a patient sample.
  • Compute: local attributions (SHAP-like) + counterfactual trials.
  • Output: ranked CpGs & minimal subsets that flip/shift the decision; genomic mapping & annotations.

How it works

1) Local explainability: SHAP-style attributions on CpGs.
2) Counterfactual trials: neutralize/attenuate CpGs → Δprob.
3) Minimal subsets: identify the smallest CpG sets that flip the decision.
4) Mechanistic mapping: CpG → TFBS/promoters → genes.

Model-agnostic: RF · SVM · XGBoost · transparent score-based
Resources: Gencode · ReMap · EWAS · STRING · Orphanet
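The first two steps can be illustrated on a toy case. The sketch below is not the CIIE implementation: it assumes a hypothetical transparent score-based model (a logistic function over weighted CpG beta values), made-up CpG IDs and weights, and a neutralization rule that resets a CpG to an assumed reference value such as a control-cohort mean. For a linear score with features treated independently, w_i * (x_i - reference_i) is the exact SHAP value on the logit scale, which is why it stands in here for "SHAP-style" attributions.

```python
import math

# Hypothetical toy model: weights, bias and CpG IDs are illustrative only.
WEIGHTS = {"cg0001": 4.0, "cg0002": -3.0, "cg0003": 2.0, "cg0004": 0.5}
BIAS = -1.5
# Assumed reference methylation (e.g. control-cohort mean beta values).
REFERENCE = {"cg0001": 0.30, "cg0002": 0.60, "cg0003": 0.40, "cg0004": 0.50}


def score(sample):
    """Linear score (logit) of the toy model."""
    return BIAS + sum(WEIGHTS[c] * sample[c] for c in WEIGHTS)


def predict_proba(sample):
    """Probability of the positive class."""
    return 1.0 / (1.0 + math.exp(-score(sample)))


def local_attributions(sample):
    """Per-CpG contribution to the logit, relative to the reference."""
    return {c: WEIGHTS[c] * (sample[c] - REFERENCE[c]) for c in WEIGHTS}


def neutralize(sample, cpgs):
    """Assumed neutralization rule: reset the chosen CpGs to the reference."""
    out = dict(sample)
    for c in cpgs:
        out[c] = REFERENCE[c]
    return out


def delta_prob(sample, cpgs):
    """Change in predicted probability caused by neutralizing `cpgs`."""
    return predict_proba(neutralize(sample, cpgs)) - predict_proba(sample)


patient = {"cg0001": 0.85, "cg0002": 0.20, "cg0003": 0.45, "cg0004": 0.55}
attributions = local_attributions(patient)
# Rank CpGs by absolute local impact, largest first.
ranked = sorted(attributions, key=lambda c: abs(attributions[c]), reverse=True)

for cpg in ranked:
    print(cpg, round(attributions[cpg], 3), round(delta_prob(patient, [cpg]), 3))
```

For a non-linear model such as a random forest or XGBoost, the attribution step would need a real SHAP estimator; the neutralize-and-re-predict step stays the same because it only requires a prediction function.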

Visual example

For a given patient, CIIE ranks CpGs by local impact. We progressively neutralize top-k CpGs and track the predicted class probability. If a small subset flips the decision, we show it along with the genomic context.
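The top-k procedure described above can be sketched as follows. This is an illustration under assumptions, not CIIE's code: the toy logistic model, the CpG IDs, the ranking, the reference values and the 0.5 flip threshold are all hypothetical, and "neutralizing" is taken to mean resetting a CpG to its reference value.

```python
import math


def neutralization_curve(predict_proba, sample, ranked_cpgs, reference):
    """Return [(k, prob)] after neutralizing the top-k ranked CpGs, k = 0..n."""
    current = dict(sample)
    curve = [(0, predict_proba(current))]
    for k, cpg in enumerate(ranked_cpgs, start=1):
        current[cpg] = reference[cpg]  # cumulative: top-k stay neutralized
        curve.append((k, predict_proba(current)))
    return curve


def first_flip(curve, threshold=0.5):
    """Smallest k whose probability crosses the threshold, or None."""
    started_positive = curve[0][1] >= threshold
    for k, prob in curve[1:]:
        if (prob >= threshold) != started_positive:
            return k
    return None


# Hypothetical toy model and inputs, for illustration only.
weights = {"cg0001": 4.0, "cg0002": -3.0, "cg0003": 2.0, "cg0004": 0.5}
bias = -1.5
reference = {"cg0001": 0.30, "cg0002": 0.60, "cg0003": 0.40, "cg0004": 0.50}
patient = {"cg0001": 0.85, "cg0002": 0.20, "cg0003": 0.45, "cg0004": 0.55}
ranked = ["cg0001", "cg0002", "cg0003", "cg0004"]  # assumed local-impact order


def toy_predict_proba(sample):
    logit = bias + sum(weights[c] * sample[c] for c in weights)
    return 1.0 / (1.0 + math.exp(-logit))


curve = neutralization_curve(toy_predict_proba, patient, ranked, reference)
flip_k = first_flip(curve)
for k, prob in curve:
    print(k, round(prob, 3))
print("first flip at k =", flip_k)
```

Plotting `curve` with k on the x-axis and a horizontal line at the threshold gives the kind of chart this section describes.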

Demo dataset and full interactive plots will be available in the public beta.

[Figure: predicted class probability after neutralizing the top-k CpGs. x-axis: k, starting at 0; the flip threshold is marked.]

Planned API & uploads

Users will be able to upload raw IDAT or cleaned beta matrices and custom signature lists (CpG sets). The pipeline validates, normalizes and runs CIIE against selected models.

  • POST /upload - dataset (IDAT/CSV/Parquet) + metadata
  • POST /run-ciie - choose model, k strategy, neutralization rule
  • GET /result/{id} - attributions, counterfactual subsets, genomic mapping
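Since the API is only planned, the following is a speculative client sketch rather than documentation. Only the three paths and HTTP methods come from the list above; the base URL, the JSON field names (`dataset_id`, `model`, `k_strategy`, `neutralization_rule`) and their example values are hypothetical placeholders. The code only builds request objects and sends nothing.

```python
import json
from urllib.request import Request

# Placeholder host; the real service address is not published in this document.
BASE_URL = "https://ciie.example.invalid"


def build_run_request(dataset_id, model, k_strategy, neutralization_rule):
    """Build (but do not send) a POST /run-ciie request.

    All payload field names are assumptions made for this sketch.
    """
    payload = {
        "dataset_id": dataset_id,
        "model": model,
        "k_strategy": k_strategy,
        "neutralization_rule": neutralization_rule,
    }
    return Request(
        f"{BASE_URL}/run-ciie",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_result_request(run_id):
    """Build (but do not send) a GET /result/{id} request."""
    return Request(f"{BASE_URL}/result/{run_id}", method="GET")


run_req = build_run_request("ds-001", "xgboost", "top_k", "reference_mean")
result_req = build_result_request("run-42")
print(run_req.get_method(), run_req.full_url)
print(result_req.get_method(), result_req.full_url)
```

Passing a built request to `urllib.request.urlopen` would send it once a real endpoint and authentication scheme are known.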

Security & privacy

Runs execute on secured on-premises compute reached over VPN; the public site acts only as a thin front end. Run logs and versioning support reproducibility.

FAQ

Is CIIE model-specific?
No. It is model-agnostic and works with tree ensembles, margin-based models, and transparent score-based baselines.
Does counterfactual neutralization correspond to a biological edit?
It is an in silico probe: it tests the model’s sensitivity to local changes. Mechanistic links come from genomic annotations and literature, not direct editing.
Can I use my own CpG signatures?
Yes. Provide a CpG list per phenotype; CIIE will combine it with local attributions to test minimal subsets that drive decisions.
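As a rough illustration of testing minimal subsets within a user-supplied signature, the sketch below searches subsets of increasing size and returns the smallest ones that flip the decision. It is a brute-force stand-in, not CIIE's search strategy: the toy logistic model, CpG IDs, reference values, 0.5 threshold and `max_size` cap are assumptions, and ranking candidates by local attribution to prune the search is omitted for brevity.

```python
import math
from itertools import combinations


def minimal_flipping_subsets(predict_proba, sample, candidates, reference,
                             threshold=0.5, max_size=3):
    """Return all smallest candidate subsets whose neutralization flips the label.

    Exhaustive search by increasing subset size; exponential in general, so
    only suitable for short candidate lists.
    """
    base_positive = predict_proba(sample) >= threshold
    for size in range(1, max_size + 1):
        hits = []
        for subset in combinations(candidates, size):
            trial = dict(sample)
            for cpg in subset:
                trial[cpg] = reference[cpg]  # assumed neutralization rule
            if (predict_proba(trial) >= threshold) != base_positive:
                hits.append(subset)
        if hits:
            return hits
    return []


# Hypothetical toy model and inputs, for illustration only.
weights = {"cg0001": 4.0, "cg0002": -3.0, "cg0003": 2.0, "cg0004": 0.5}
bias = -1.5
reference = {"cg0001": 0.30, "cg0002": 0.60, "cg0003": 0.40, "cg0004": 0.50}
patient = {"cg0001": 0.85, "cg0002": 0.20, "cg0003": 0.45, "cg0004": 0.55}


def toy_predict_proba(sample):
    logit = bias + sum(weights[c] * sample[c] for c in weights)
    return 1.0 / (1.0 + math.exp(-logit))


# User signature; CpGs the model does not use are dropped before the search.
signature = ["cg0001", "cg0002", "cg0004", "cg9999"]
candidates = [c for c in signature if c in weights]
subsets = minimal_flipping_subsets(toy_predict_proba, patient, candidates, reference)
print(subsets)
```

Returning every subset of the minimal size, rather than just the first one found, keeps alternative explanations visible when several CpG combinations flip the decision equally cheaply.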