Comparative metrics¶
This page summarises the mathematical content of the comparative
metrics implemented in bnm. The treatment follows Tsamardinos,
Brown, & Aliferis (2006) and the survey in de Jongh & Druzdzel
(2009); readers who need full proofs should consult those sources.
Setting¶
Let \(G_1, G_2\) be two mixed graphs over the same vertex set \(V\)
with \(|V| = d\). Each graph is encoded by a pair of \(d \times d\)
endpoint-mark matrices following the convention of decision D5
(docs/design/api_v0.py §D): endpoints[i, j] is the mark at
vertex \(j\) of the edge \(\{i, j\}\), taking values in
\(\{0, 1, 2, 3\}\) (no edge / TAIL / ARROW / CIRCLE).
For comparative metrics it is convenient to classify each unordered pair \(\{i, j\}\) with \(i < j\) into one of five disjoint edge kinds:
kind |
endpoint \((i, j)\) |
endpoint \((j, i)\) |
|---|---|---|
no edge |
NO_EDGE |
NO_EDGE |
directed forward \(i \to j\) |
ARROW |
TAIL |
directed backward \(i \leftarrow j\) |
TAIL |
ARROW |
undirected \(i - j\) |
TAIL |
TAIL |
bidirected \(i \leftrightarrow j\) |
ARROW |
ARROW |
(CIRCLE-marked edges, used by PAGs, are classified as a sixth kind and treated below.)
Adjacency mask and edge classification¶
Let \(A_k(i, j) = 1\) iff the unordered pair \(\{i, j\}\) is adjacent (edge kind \(\neq\) no edge) in \(G_k\). Let \(C_k(i, j)\) be the kind above (one of five for CPDAGs, one of six for PAGs). The skeleton-level metrics depend only on \(A_k\); the orientation-aware metrics depend on the full \(C_k\).
Hamming distance¶
The (skeleton-only) Hamming distance counts pairs where adjacency differs:
It ignores edge orientation and edge kind. Implemented as
bnm.hd.
Structural Hamming distance¶
The structural Hamming distance counts pairs where the kind differs (Tsamardinos et al., 2006):
where
additions count pairs adjacent in \(G_2\) but not in \(G_1\);
deletions count pairs adjacent in \(G_1\) but not in \(G_2\);
reversals count pairs adjacent in both but with different edge kind.
The decomposition is exposed by bnm.count_additions,
bnm.count_deletions, bnm.count_reversals for fine-grained
diagnostics. Implemented as bnm.shd.
Convention
A directed-vs-undirected mismatch (e.g. \(i \to j\) in \(G_1\) versus \(i - j\) in \(G_2\)) is counted as a single reversal — equivalently, SHD treats every distinct edge kind as equally far apart. de Jongh & Druzdzel (2009) discuss alternative conventions; bnm v0.2 follows the Tsamardinos et al. (2006) definition.
Precision, recall, F1¶
For exact-match orientation accounting, define:
Then
By convention, \(\mathrm{precision} = \mathrm{recall} = F_1 = 0\)
when both graphs have no adjacencies. Implemented as
bnm.precision, bnm.recall, bnm.f1.
On the strictness of TP
bnm defines a true positive as an exact-mark match — same
adjacency and same edge kind. Some references count adjacency-
only TPs (i.e. either edge kind matches) and report SHD-by-skeleton
separately. Practitioners citing F1 from bnm should be explicit
about the orientation-aware definition.
References¶
de Jongh, M., & Druzdzel, M. J. (2009). A comparison of structural distance measures for causal Bayesian network models. In Recent Advances in Intelligent Information Systems, 443–456.
Peters, J., & Bühlmann, P. (2015). Structural intervention distance for evaluating causal graphs. Neural Computation, 27(3), 771–799.
Tsamardinos, I., Brown, L. E., & Aliferis, C. F. (2006). The max-min hill-climbing Bayesian network structure learning algorithm. Machine Learning, 65(1), 31–78.