
provides information about the flow out of the funnel

, because

provides information about the

flow into the funnel

. But given the flow out of the faucet

, the setting of the handle

provides

no extra information about the flow from the funnel

. That is the same dependency profile that is

associated with the common cause. By symmetry, the result is the same if we swap

for

. So the

common effect pattern of conditional and non-conditional dependencies differs from the other three cases. Recognizing the common effect pattern in the data can potentially yield genuine causal knowledge from cheap, abundant, moral, non-experimental data, as long as one examines at least three variables.
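The contrast between these dependency profiles can be illustrated with a small simulation. The following is only a sketch under stated assumptions: the variable names, the linear Gaussian mechanisms, the unit coefficients, and the use of partial correlation as a stand-in for conditional dependence are all illustrative choices, not part of the original argument.

```python
import math
import random


def pearson(a, b):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def partial_corr(a, b, c):
    """Correlation of a and b after linearly controlling for c."""
    r_ab, r_ac, r_bc = pearson(a, b), pearson(a, c), pearson(b, c)
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))


rng = random.Random(0)  # fixed seed so the run is repeatable
N = 20_000              # assumed sample size


def noise():
    """Independent standard normal disturbances (an assumption)."""
    return [rng.gauss(0, 1) for _ in range(N)]


# Chain (hypothetical mechanisms): handle -> faucet -> funnel
handle = noise()
faucet = [h + e for h, e in zip(handle, noise())]
funnel = [f + e for f, e in zip(faucet, noise())]

# Common effect: x -> z <- y, with x and y generated independently
x, y = noise(), noise()
z = [a + b + e for a, b, e in zip(x, y, noise())]

chain_marginal = pearson(handle, funnel)
chain_conditional = partial_corr(handle, funnel, faucet)
collider_marginal = pearson(x, y)
collider_conditional = partial_corr(x, y, z)

print(f"chain:         marginal {chain_marginal:+.2f}, "
      f"given faucet {chain_conditional:+.2f}")
print(f"common effect: marginal {collider_marginal:+.2f}, "
      f"given z {collider_conditional:+.2f}")
```

In this linear setup the population values are roughly 0.58 and 0 for the chain, and 0 and -0.5 for the common effect, so the sample estimates should show the chain's dependence vanishing under conditioning while the common effect's dependence appears only under conditioning.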

Of course, all of that depends on the variables in question providing a causal description of the situation, a condition that itself requires some heavy lifting from Ockham. It also assumes that causal paths do not cancel perfectly; no inductive method can win against a

illusion. But even given those assumptions, dependence is an

―it is verifiable from

data, but not refutable (because the dependence could be arbitrarily small), so the above logic concerning the problem of induction, simplicity, Ockham’s razor, and reversals of opinion applies.

It can be shown (Kelly and Mayo-Wilson 2010) that inferring causal directionality from non-experimental data is subject to the kind of forcible reversals of opinion that were discussed above, in connection with the polynomial degree problem. No matter how strong a given causal connection happens to be, you can never really guard against discovering new, arbitrarily small effects that cause you to flip the orientation of the connection in question any finite number of times, provided that your method converges to the true orientation of the cause at all. Skepticism is one response to our argument, but it is a luxury: sometimes, one must make a policy decision and experimental data will not be forthcoming. The right response is that unavoidable reversals of opinion are justified because they are unavoidable and avoidable reversals are not justified because they are avoidable. The best possible methods for causal discovery from non-experimental data are, therefore, those that minimize causal reversals. And which methods are those? The Ockham efficiency theorem says: the Ockham ones.

Our analysis raises some real machine learning issues for causal discovery algorithms. There are myriad causal theories, and the simplicity order over such theories branches massively. Ockham’s horizontal razor is prohibitive to implement in that setting. However, the Ockham efficiency theorems have some flexibility in application, because efficiency is relative to the underlying simplicity concept, which can be understood more or less coarsely. It turns out that the simplicity order over causal theories is ranked by the total number of individual causal connections, in the sense that each step along a path in the order amounts to the addition of one more causal connection between variables (Chickering and Meek 2002). If one thinks of simplicity degrees as ranks in that ranking (i.e., as the total number of causal connections), then Ockham’s horizontal razor allows one to return the disjunction of the theories of least rank that are consistent with the data, rather than all theories that are minimal in the order (which could include many more). Moreover, that strategy is optimal in terms of worst-case reversals over each rank level (efficiency is relative to what one takes simplicity to be). Finally, there is an attractive trade-off.
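The difference between returning the least-rank theories and returning every order-minimal theory can be sketched in a few lines. This is a toy illustration only: theories are represented as sets of hypothetical causal connections, proper-subset inclusion stands in for the simplicity order (a simplifying assumption, not the order studied by Chickering and Meek), and the list of candidates consistent with the data is made up.

```python
def least_rank(theories):
    """Rank version: keep only the theories with the fewest connections."""
    if not theories:
        return []
    best = min(len(t) for t in theories)
    return [t for t in theories if len(t) == best]


def order_minimal(theories):
    """Order version: keep every theory with no strictly simpler candidate.

    Proper-subset inclusion is used here as an assumed stand-in for the
    simplicity order over causal theories.
    """
    return [t for t in theories if not any(s < t for s in theories)]


# Hypothetical candidates that are assumed to be consistent with the data.
consistent = [
    frozenset({("X", "Y")}),
    frozenset({("X", "Z"), ("Z", "Y")}),
    frozenset({("X", "Y"), ("Y", "Z")}),
]

by_rank = least_rank(consistent)
by_order = order_minimal(consistent)

print("least rank:   ", [sorted(t) for t in by_rank])
print("order-minimal:", [sorted(t) for t in by_order])
```

Under these assumptions the least-rank answer is a subset of the order-minimal answer, so the rank version yields a shorter disjunction and needs only a count of connections rather than pairwise comparisons in the order.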

The rank version of horizontal Ockham’s razor licenses one to say more and is also easier to
