Chi-Squared Test

The Pearson chi-squared (\(\chi^2\)) test is the classical contingency-table CI test for categorical data. It compares observed and expected cell counts in a contingency table built from \(X\), \(Y\), and the (categorical) conditioning set \(Z\), and is calibrated against an asymptotic \(\chi^2\) distribution (Agresti, 2013).

Intuition. Conditional independence in a contingency table corresponds to multiplicative factorisation of conditional cell probabilities; the Pearson statistic measures the squared, expected-frequency-normalised discrepancy between observed counts and the counts implied by that factorisation (Agresti, 2013).

Mathematical Formulation

The test compares observed frequencies \(O_i\) with the frequencies \(E_i\) that would be expected if \(X \perp Y \mid Z\) held, summed over all cells of the (stratified) contingency table (Agresti, 2013):

\[\chi^2 = \sum_{i} \frac{(O_i - E_i)^2}{E_i}\]

Under the null and standard regularity conditions, this statistic is asymptotically \(\chi^2\)-distributed with degrees of freedom

\[df = (|X| - 1)(|Y| - 1) \prod_{z \in Z} |z|\]

where \(|V|\) is the number of distinct categories of variable \(V\) (Agresti, 2013). The Pearson \(\chi^2\) statistic is asymptotically equivalent to the likelihood-ratio \(G^2\) statistic (Agresti, 2013).

Assumptions

  • Categorical data. Both the variables of interest and the conditioning set must be categorical (Agresti, 2013).

  • Independent observations. Standard multinomial sampling with independent draws is assumed (Agresti, 2013).

  • Adequate cell counts. Cochran (1954) recommends that the test may be inappropriate if more than 20% of cells have an expected count below 5 or any cell has expected count below 1; these rules of thumb remain the standard finite-sample diagnostic for the test.

  • Power decay with conditioning-set size. As \(|Z|\) grows the number of contingency-table cells grows multiplicatively; sparse-table effects degrade power well before computational limits are reached (Agresti, 2013).

  • Dtype validation is opt-in. Passing data outside the declared dtype produces undefined results; call Test.validate_data(data) to check. citk does not enforce supported_dtypes at construction.

Code Example

import numpy as np
from citk.tests import ChiSq

# Generate discrete data representing a collider: X -> Y <- Z
n = 500
X = np.random.randint(0, 2, size=n)
Z = np.random.randint(0, 2, size=n)
Y = (X + Z + np.random.randint(0, 2, size=n)) % 2
data = np.vstack([X, Y, Z]).T

# Initialize the test
chisq_test = ChiSq(data)

# Test for unconditional independence (X and Z are independent)
p_value_unconditional = chisq_test(0, 2)
print(f"P-value for X _||_ Z: {p_value_unconditional:.4f}")

# Test for conditional dependence on the collider Y
p_value_conditional = chisq_test(0, 2, [1])
print(f"P-value for X _||_ Z | Y: {p_value_conditional:.4f}")

API Reference

For a full list of parameters, see the API documentation: :class:citk.tests.contingency_table_tests.ChiSq.

References

Agresti, A. (2013). Categorical Data Analysis (3rd ed.). Wiley.

Cochran, W. G. (1954). Some methods for strengthening the common \(\chi^2\) tests. Biometrics, 10(4), 417-451.