Imagine you have 3 binary events (x1, x2, x3) and you want to find the joint probability p(x1, x2, x3), but you only know the pairwise joints p(x1, x2), p(x1, x3) and p(x2, x3). Each of these distributions sums to 1. We also know that summing the triple joint over any one variable yields the corresponding pairwise joint.
This is useful when we have cross tabs for each pair of attributes but no cross tab across all 3 variables at once.
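As a small illustration of that marginalization (a hypothetical numpy sketch, not the code used for the experiments):

```python
import numpy as np

# Hypothetical example: a triple joint stored as a 2x2x2 array
# (axes correspond to x1, x2, x3), drawn at random and summing to 1.
p_triple = np.random.dirichlet(np.ones(8)).reshape(2, 2, 2)

# Summing out one variable recovers the corresponding pairwise joint.
p_12 = p_triple.sum(axis=2)   # p(x1, x2)
p_13 = p_triple.sum(axis=1)   # p(x1, x3)
p_23 = p_triple.sum(axis=0)   # p(x2, x3)

# Each pairwise joint still sums to 1.
assert np.isclose(p_12.sum(), 1.0) and np.isclose(p_23.sum(), 1.0)
```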
So we have 2**3 = 8 unknowns (or 7 once we use the constraint that the triple joint sums to 1) and 3 * 4 = 12 known numbers. However, the knowns are redundant: each pairwise joint sums to 1, and the single-variable marginals are shared between pairs, which leaves only 6 independent constraints, one short of the 7 we need. So there is no unique solution. A standard way to resolve this is a maximum entropy approach. In this example we compare the max ent approach to 2 competing approaches.
In the max ent approach we also impose an inequality constraint: the error between the pairwise joints implied by the candidate triple joint and the observed pairwise joints must stay below a threshold.
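A minimal sketch of how such a constrained max-entropy fit might look with scipy. The tolerance `eps`, the SLSQP solver, and the function name `maxent_triple` are assumptions for illustration, not details taken from the experiments:

```python
import numpy as np
from scipy.optimize import minimize

def maxent_triple(p12, p13, p23, eps=1e-4):
    """Sketch: find a 2x2x2 joint with maximum entropy whose pairwise
    marginals stay within `eps` of the observed pairwise joints."""

    def neg_entropy(q):
        q = np.clip(q, 1e-12, None)          # guard against log(0)
        return np.sum(q * np.log(q))         # minimizing this maximizes entropy

    def marginal_gap(q):
        p = q.reshape(2, 2, 2)
        err = np.concatenate([
            (p.sum(axis=2) - p12).ravel(),
            (p.sum(axis=1) - p13).ravel(),
            (p.sum(axis=0) - p23).ravel(),
        ])
        return eps - np.abs(err)             # >= 0 when every cell is within eps

    cons = [
        {"type": "eq",   "fun": lambda q: q.sum() - 1.0},  # probabilities sum to 1
        {"type": "ineq", "fun": marginal_gap},             # pairwise joints nearly match
    ]
    x0 = np.full(8, 1.0 / 8)                 # start from the uniform joint
    res = minimize(neg_entropy, x0, bounds=[(0, 1)] * 8,
                   constraints=cons, method="SLSQP")
    return res.x.reshape(2, 2, 2)
```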
Using 200 synthetic examples with known double and triple joint probabilities, we test which of the 3 methods gives the smallest error between the true and estimated triple joint.
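A sketch of that evaluation loop, reusing the `maxent_triple` sketch above. Dirichlet sampling of the true joints and mean absolute error per cell are my assumptions about the setup; the write-up does not state the exact generator or error metric:

```python
errors = []
for _ in range(200):
    # Sample a random "true" triple joint, keep only its pairwise marginals.
    true_joint = np.random.dirichlet(np.ones(8)).reshape(2, 2, 2)
    est = maxent_triple(true_joint.sum(axis=2),
                        true_joint.sum(axis=1),
                        true_joint.sum(axis=0))
    # Score the estimate against the true joint (MAE per cell, as an example metric).
    errors.append(np.abs(est - true_joint).mean())

print(np.mean(errors))
```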
We find that the max ent approach does best and brute force does worst. We also checked whether an averaged blend of the three estimates could do better; it does not. This suggests that max ent is quite robust here.
Per-method error summary over the 200 examples (bf = brute force, blend = averaged blend of the three):

           maxent          bf         sim       blend
count  200.000000  200.000000  200.000000  200.000000
mean     0.014970    0.024962    0.019601    0.015578
std      0.008622    0.017489    0.009930    0.009223
min      0.004477    0.006336    0.002137    0.002692
25%      0.007793    0.012597    0.011709    0.009132
50%      0.012393    0.017957    0.018866    0.013213
75%      0.019964    0.033672    0.026529    0.019691
max      0.045763    0.116291    0.051152    0.061373