# How to Choose a Conditional Independence Test
Choosing the right conditional independence (CI) test depends on the type of your data and the assumptions you are willing to make about the underlying dependence. citk ships 19 tests: 15 organised under the six survey families, plus four adapter strategies. This guide gives a practical mapping from data type and assumptions to tests.
## Key Considerations

### 1. Data Type
All continuous:

- `fisherz_citk`: linear, Gaussian; fastest baseline (a minimal sketch of the technique follows this list).
- `spearman`: monotonic but not necessarily linear; robust non-parametric alternative.
- `kci`: kernel-based, captures arbitrary non-linear dependence.
- `rcit`, `rcot`: random-Fourier-feature approximations to KCI; faster on larger samples.
- `cmiknn`: kNN-based conditional mutual information with local-permutation p-values.
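For intuition, here is a minimal Fisher-Z partial-correlation check written directly against `numpy`/`scipy`. It is a sketch of the underlying technique under a linear-Gaussian assumption, not citk's `fisherz_citk` implementation:

```python
import numpy as np
from scipy.stats import norm

def fisher_z_test(x, y, z):
    """Sketch of a partial-correlation CI test: residualise x and y on z
    (plus an intercept) by least squares, Fisher-transform the residual
    correlation, and compare against a standard normal."""
    n = len(x)
    Z = np.column_stack([np.ones(n), z])
    rx = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
    ry = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    r = np.corrcoef(rx, ry)[0, 1]
    stat = np.sqrt(n - (Z.shape[1] - 1) - 3) * np.arctanh(r)
    return 2 * norm.sf(abs(stat))  # two-sided p-value

rng = np.random.default_rng(0)
z = rng.normal(size=(500, 2))
x = z @ np.array([1.0, -0.5]) + rng.normal(size=500)
y = z @ np.array([0.3, 0.8]) + rng.normal(size=500)
print(fisher_z_test(x, y, z))  # large p-value: x is independent of y given z
```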
All discrete (categorical):

- `gsq` (G-test) or `chisq` (Chi-Square): classical contingency-table tests (a stratified sketch of the idea follows this list).
- `dummy_fisherz`: one-hot encoding adapter that aggregates Fisher-Z calls; competitive when categorical cardinalities are moderate.
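To make the contingency-table idea concrete, the sketch below stratifies on a single discrete conditioning variable and pools per-stratum G-statistics. It uses only `numpy`/`scipy` and is an illustration, not citk's `gsq`/`chisq` code:

```python
import numpy as np
from scipy.stats import chi2

def g_test_ci(x, y, z):
    """Conditional G-test sketch: within each stratum of a single discrete z,
    compare the observed x/y contingency table with the independence
    expectation, then pool G statistics and degrees of freedom."""
    g_stat, dof = 0.0, 0
    for z_val in np.unique(z):
        xs, ys = x[z == z_val], y[z == z_val]
        x_levels, y_levels = np.unique(xs), np.unique(ys)
        obs = np.array([[np.sum((xs == a) & (ys == b)) for b in y_levels]
                        for a in x_levels], dtype=float)
        if obs.shape[0] < 2 or obs.shape[1] < 2:
            continue  # a degenerate stratum carries no dependence information
        exp = obs.sum(1, keepdims=True) * obs.sum(0, keepdims=True) / obs.sum()
        nz = obs > 0  # convention: 0 * log(0) contributes nothing
        g_stat += 2 * np.sum(obs[nz] * np.log(obs[nz] / exp[nz]))
        dof += (obs.shape[0] - 1) * (obs.shape[1] - 1)
    return chi2.sf(g_stat, dof) if dof > 0 else 1.0
```

A joint conditioning set would be handled by forming strata over the combinations of its values; the adequate-cell-count caveat in the summary table refers to exactly these per-stratum tables.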
Mixed continuous + discrete:

- `cmiknn_mixed`: mixed-type kNN CMI estimator (tigramite).
- `mcmiknn`: another mixed-type kNN CMI implementation (vendored from upstream `hpi-epic/mCMIkNN`).
- `regci`: parametric likelihood-ratio test using GLM regression chosen per response type (continuous → linear, discrete → logistic).
- `ci_mm`: symmetric likelihood-ratio test from R `MXM` that runs both regression directions and combines them.
- `gcm`, `wgcm`, `pcm`: ML-residualisation tests using random forest regression (via `pycomets`); flexible, asymptotically calibrated, and the RF nuisance regressions handle continuous, discrete, or mixed inputs natively (a residualisation sketch follows this list).
- `disc_chisq`, `disc_gsq`: equal-frequency discretisation adapters around classical discrete tests.
- `hartemink_chisq`: information-preserving Hartemink discretisation (via R `bnlearn`) + Chi-Square; better dependence preservation than equal-frequency binning.
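The residualisation idea behind `gcm` can be sketched with scikit-learn random forests: regress each variable on the conditioning set, then test whether the residual products have mean zero. This is an illustrative approximation of the generalised covariance measure, not the `pycomets` implementation that citk wraps (for a discrete response one would swap in a classifier):

```python
import numpy as np
from scipy.stats import norm
from sklearn.ensemble import RandomForestRegressor

def gcm_test(x, y, z, seed=0):
    """Generalised-covariance-measure sketch: residualise x and y on z with a
    flexible regressor, then apply a normal approximation to the mean of the
    residual products."""
    rx = x - RandomForestRegressor(random_state=seed).fit(z, x).predict(z)
    ry = y - RandomForestRegressor(random_state=seed + 1).fit(z, y).predict(z)
    prod = rx * ry
    stat = np.sqrt(len(prod)) * prod.mean() / prod.std(ddof=1)
    return 2 * norm.sf(abs(stat))
```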
### 2. Relationship Type

Linear:

- `fisherz_citk` is the computationally efficient choice when both Gaussianity and linearity hold.

Monotonic:

- `spearman` works on ranks; robust to non-linearities as long as the relationship is monotonic (see the rank-based sketch after this list).

Non-linear / complex:

- Kernel tests (`kci`, `rcit`, `rcot`), kNN-based tests (`cmiknn`, `cmiknn_mixed`, `mcmiknn`), and ML-residualisation tests (`gcm`, `wgcm`, `pcm`) are all designed to detect arbitrary dependence at higher computational cost. `wgcm` and `pcm` add power on alternatives where the dependence is localised in the conditioning space or where the predictor is weakly identified.
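The rank trick is easy to see in code: rank-transform the variables, then reuse the Fisher-Z machinery from the sketch above. Monotone but non-linear relationships become approximately linear in rank space. Again an illustration, not citk's `spearman` test:

```python
import numpy as np
from scipy.stats import norm, rankdata

def spearman_partial_test(x, y, z):
    """Rank-transform x, y, and each column of z, then run a Fisher-Z style
    partial-correlation test on the ranks."""
    xr, yr = rankdata(x), rankdata(y)
    zr = np.column_stack([rankdata(col) for col in np.atleast_2d(z.T)])
    n = len(xr)
    Z = np.column_stack([np.ones(n), zr])
    rx = xr - Z @ np.linalg.lstsq(Z, xr, rcond=None)[0]
    ry = yr - Z @ np.linalg.lstsq(Z, yr, rcond=None)[0]
    r = np.corrcoef(rx, ry)[0, 1]
    stat = np.sqrt(n - zr.shape[1] - 3) * np.arctanh(r)
    return 2 * norm.sf(abs(stat))
```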
### 3. Sample Size

Small samples:

- Classical tests (`fisherz_citk`, `spearman`, `chisq`, `gsq`) are most reliable; non-parametric and ML-based tests need more data for stable estimation.

Large samples:

- Kernel tests (especially exact `kci`) become expensive; prefer `rcit`/`rcot` for random-feature approximations, or `gcm`/`wgcm`/`pcm` for ML-residualisation with linear cost.

Very large samples:

- `kci` is roughly quadratic in \(n\); consider capping the sample size or switching to a faster family.
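If you want to encode these rules of thumb in your own pipeline, a small dispatch helper is enough. The `choose_test` function below and its sample-size cut-offs are purely hypothetical illustrations of the guidance above, not part of citk's API:

```python
def choose_test(n_samples: int, data_type: str) -> str:
    """Hypothetical helper mapping (sample size, data type) to a citk test
    name, following the rules of thumb in this guide. The thresholds are
    illustrative only."""
    if data_type == "discrete":
        return "gsq"
    if data_type == "mixed":
        return "regci" if n_samples < 1_000 else "gcm"
    # continuous data
    if n_samples < 500:
        return "fisherz_citk"  # classical tests are most stable on small samples
    if n_samples < 5_000:
        return "kci"           # exact kernel test is still affordable here
    return "rcot"              # random-feature approximation scales better

print(choose_test(20_000, "continuous"))  # -> "rcot"
```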
## Summary Table

| Test Name | Family | Data Type | Relationship Type | Key Assumption(s) |
|---|---|---|---|---|
| `fisherz_citk` | Partial Correlation | Continuous | Linear | Approximate Gaussianity |
| `spearman` | Partial Correlation | Continuous | Monotonic | Monotonicity |
| `gsq` | Contingency Table | Discrete | Any | Adequate cell counts |
| `chisq` | Contingency Table | Discrete | Any | Adequate cell counts |
| `regci` | Regression | Mixed or continuous | Any (within model class) | Correct GLM specification per variable type |
| `ci_mm` | Regression | Mixed | Any (within model class) | Correct linear/logistic per variable; requires R `MXM` |
| `cmiknn` | Nearest Neighbor | Continuous | Any | Sample size adequate for kNN density estimation |
| `cmiknn_mixed` | Nearest Neighbor | Mixed | Any | Variable types must be declared; requires `tigramite` |
| `mcmiknn` | Nearest Neighbor | Mixed | Any | Vendored upstream `hpi-epic/mCMIkNN` implementation |
| `kci` | Kernel | Continuous | Any | Suitable kernel choice; cost is at least quadratic in \(n\) |
| `rcit` | Kernel | Continuous | Any | Random-feature approximation |
| `rcot` | Kernel | Continuous | Any | Random-feature approximation with reduced-dim conditioning |
| `gcm` | Machine-Learning-Based | Mixed or continuous | Any | Consistent nuisance regression; requires `pycomets` |
| `wgcm` | Machine-Learning-Based | Mixed or continuous | Any (esp. localised) | Consistent nuisance regression + sample splitting; requires `pycomets` |
| `pcm` | Machine-Learning-Based | Mixed or continuous | Any (assumption-lean) | Consistent residualisation; requires `pycomets` |
| `disc_chisq` | Adapter Strategies | Mixed or continuous | Any | Discretisation preserves dependence; ChiSq cell-count rule |
| `disc_gsq` | Adapter Strategies | Mixed or continuous | Any | Discretisation preserves dependence; GSq cell-count rule |
| `dummy_fisherz` | Adapter Strategies | Mixed or discrete | Any (encoded space) | One-hot encoding fidelity; combined p-values approximation |
| `hartemink_chisq` | Adapter Strategies | Mixed or continuous | Any | Information-preserving discretisation; requires R `bnlearn` |