
My attempt at demystifying causality

[Image: Causation ladder]



“Three morning routines of successful CEOs”, said a random article on the internet, implying that if you follow this morning routine too, you might also become a successful CEO. But can something as complicated as being a great leader of a company be distilled down to a morning routine?

Articles like this claim that a causal relationship exists between the type of morning routine one has and one’s success as a CEO (whatever that may be). The causality is established on the grounds that many of them (let’s say) wake up at 5 a.m.

Look, example 1 wakes up at 5 a.m., as does example 2, example 3, and so on and so forth, where example 1, …, example n are very rich and famous people. How can you not see it? Waking up at 5 a.m. is important!

Of course, it’s not the only reason, the article continues to claim, as hard work is essential of course, but waking up at 5 a.m. “gives you the peace of mind to plan your day” or “5 a.m. to 7 a.m. are 2 hours just for you”.

In this article I will try to explain, in a simple way, why claims like this, whether they come from online articles that don’t perform actual research or from single research papers that haven’t been independently reproduced, are most likely wrong. This is my first attempt at putting in writing my way of understanding causality, a very complex topic that many far more intelligent people have written about.

The first reason why claims like these are most likely wrong is also the simplest one. We don’t know how many people who wake up at 5 a.m. end up becoming unsuccessful or mediocre CEOs. How can we claim that this specific morning routine is important for success when it works for some people but not for others? The story goes much deeper though, as causation cannot be boiled down to a single percentage. Surely, if more than 50% of successful CEOs wake up at 5 a.m., there must exist a causal relationship? No, unfortunately, it’s not as simple as that either.
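To see why a single percentage is not enough, here is a toy calculation (with entirely made-up numbers) contrasting “the fraction of successful CEOs who wake up at 5 a.m.” with “the probability of being a successful CEO given that you wake up at 5 a.m.”. Bayes’ theorem shows the two can be wildly different:

```python
# Made-up numbers for illustration only.
p_success = 0.001            # P(successful CEO) in the population
p_5am_given_success = 0.90   # P(wakes at 5 a.m. | successful CEO)
p_5am_given_other = 0.20     # P(wakes at 5 a.m. | everyone else)

# Bayes' theorem: P(successful CEO | wakes at 5 a.m.)
p_5am = (p_5am_given_success * p_success
         + p_5am_given_other * (1 - p_success))
p_success_given_5am = p_5am_given_success * p_success / p_5am

print(f"{p_success_given_5am:.2%}")  # ~0.45%
```

Even in a toy world where 90% of successful CEOs wake up at 5 a.m., waking up at 5 a.m. leaves you with less than half a percent chance of being one of them.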

Observational vs Experimental study

Observational studies

An observational study is a kind of study where the researcher (in our case, the one who noticed the relationship between morning routines and the success of CEOs) observes the outcome of a treatment. In a more controlled example, you might observe whether a patient, after being treated with a specific medication, manages to overcome the disease. We don’t have a say in who got the treatment and who didn’t; we just happen upon a group of people, each of whom either got or didn’t get the treatment.

The problems with establishing causality with this method are numerous, but the most important one is that we cannot eliminate all biases from the study. In reality, we can never eliminate all biases, but we will see later how we can circumvent this problem (randomisation).

Selection bias

The simplest bias lurking in this case is called selection bias. Simply put, you don’t have any guarantees that the people you chose to study are representative of the entire population. The researcher might accidentally (in our case, intentionally) choose a subset of people who all share some specific characteristic that makes them better candidates for the result. If that is true, how could you generalise results drawn from this non-representative group of people to the whole population? In addition to that, how could you know whether really unsuccessful people also wake up at 5 a.m. if you don’t include them in your study!? I think it’s easy to agree that when trying to identify whether a strategy (treatment) works on a situation (patient), you have to select a sample that is “good” by some metric. The reason is simple: if your sample is not “good”, then it cannot be used to infer anything about the whole population.
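A minimal simulation (with a hypothetical population invented for the purpose) makes this concrete. Below, waking up early is independent of success by construction, yet a study that samples only successful people still produces an impressive-sounding number, because the baseline is never shown:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical population: the two traits are independent by construction.
wakes_at_5 = rng.random(n) < 0.20   # 20% of everyone rises at 5 a.m.
successful = rng.random(n) < 0.01   # 1% of everyone is "successful"

# A representative sample recovers the truth: early rising is equally
# common among the successful and the unsuccessful.
print(f"successful:   {wakes_at_5[successful].mean():.1%}")   # ~20%
print(f"unsuccessful: {wakes_at_5[~successful].mean():.1%}")  # ~20%
```

If the study never looks at the unsuccessful group, the first number (“20% of the CEOs we interviewed wake up at 5 a.m.!”) has nothing to be compared against, and it is easy to mistake it for a pattern.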

Survivorship bias

A second very important bias that exists in these cases is called survivorship bias. It is just as important as the previous one, yet less obvious.

[Image: Survivorship bias comic]

When we have to judge whether a strategy works or not, we need to see not only positive examples but negative examples as well. Nobody would say that playing the lottery is a good strategy for becoming a millionaire, even though we have all heard stories about people who won the lottery and changed their lives (albeit temporarily). Why aren’t we persuaded by these positive examples? Well, the answer is simple. I claim that we intuitively understand that for every one winner there is a large number of losers, and you are more likely to be a loser than a winner when you play these odds. Hence the name survivorship bias: we cannot draw conclusions about the general population by analysing only the survivors; we have to account for “the dead” as well. This is very difficult precisely because they are “dead”, and so we tend to hear the stories of the survivors again and again. Nobody interviews unsuccessful CEOs. We hear the vocal minority over and over until at some point we believe them to be the majority. In a last attempt to explain this concept (a small simulation of the lottery example follows the joke), I will quote a macabre, not so funny, joke I make from time to time:

  • Do you know why all foreigners who come to London have a near-death story about looking the wrong way when crossing the road and almost dying?
  • Why?
  • Because the ones who don’t are already dead.
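Back to the lottery: a quick simulation (with an invented prize structure) shows how different the picture looks when we average over the winners only versus everyone who bought a ticket:

```python
import numpy as np

rng = np.random.default_rng(1)

# Invented lottery: a $2 ticket with a one-in-a-million jackpot of $1M.
n_players = 10_000_000
won = rng.random(n_players) < 1e-6
profit = np.where(won, 1_000_000 - 2, -2)

# Interviewing only the survivors (the winners) is wildly misleading...
print(f"winners' average profit:   ${profit[won].mean():,.0f}")  # ~$999,998
# ...while accounting for "the dead" tells the real story.
print(f"everyone's average profit: ${profit.mean():,.2f}")       # ~-$1.00
```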

These are two easy-to-explain examples of how things can go wrong before we even begin to analyse the data. All these biases could be characterised as “common sense”, but in science they have names. This is because we have to identify them and account for them on a daily basis, in every experiment we set up to study causal relationships. It’s not that an observational study can never indicate a causal relationship; it’s that it has to be carefully planned and analysed in order to remove those biases from the data, something that more often than not is overlooked by reporters in magazines or influencers in blog posts.

Experimental studies

The simplest way to deal with all those biases is to perform an experimental study. In an experimental study, we get to choose beforehand who gets the treatment and who doesn’t. This is good because it lets us eliminate the biases we discussed above. However, the way you decide who gets the treatment and who doesn’t is very important: we have to allocate subjects randomly. This sounds very counterintuitive at first, but it makes sense if you make an effort to think about it. We usually go the extra mile by also giving one group the treatment and the other group a placebo, so the subjects don’t know which group they are in. We then go the extra-extra mile and conceal this information from the doctors and the analysts as well! Even the doctors who give the medication to the subjects don’t know whether they are handing out the actual pill or a sugar pill.
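In code, random allocation is nothing more than a shuffle followed by a split. This is only a sketch (the subject labels are hypothetical); the point is that no human judgement enters the assignment:

```python
import numpy as np

rng = np.random.default_rng(7)
subjects = [f"subject_{i:02d}" for i in range(20)]

# Random allocation: shuffle, then split down the middle.
shuffled = rng.permutation(subjects)
treatment, placebo = shuffled[:10], shuffled[10:]

# For double blinding, only an independent third party keeps this
# mapping; doctors and analysts see coded labels at most.
assignment = {name: "treatment" for name in treatment}
assignment.update({name: "placebo" for name in placebo})
```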

Why does randomisation work?

But why is randomisation such a good thing? It all comes down to what we want to estimate, and I will try to explain it with an example drawn from agriculture and the book Causality by Judea Pearl.

Suppose that we have fertilisers A and B and we wish to know which one works better on our crops. The simplest thing we can do is draw a line through the middle of our field, put fertiliser A on one half and fertiliser B on the other, and then measure the yields: if the first half produces more, we say that fertiliser A is better than fertiliser B. As simple as this sounds, it is unfortunately too naive.

If the farmer naively puts fertiliser A on the top half of the field and fertiliser B on the bottom half, they introduce drainage as a confounding variable, since when the field is watered the bottom half will get more water. If they choose to apply fertiliser A for one year and fertiliser B for the next, they introduce the weather as a confounder!

What we care about is:

[Diagram: the intended model, treatment (fertiliser) → outcome (yield)]

But what we might end up estimating if we choose treatments naively is:

[Diagram: the confounded model, with drainage influencing both the fertiliser assignment and the yield]

This is where randomisation comes in: when we leave the selection of who gets the treatment to chance, the model that we are estimating becomes:

[Diagram: the model under randomisation, with no arrow from drainage to the fertiliser assignment]

Notice that there is no arrow from the confounder (drainage) to the treatment (fertiliser). Of course, in this very simple example where we only have two halves, it’s not possible to eliminate the confounder; we need more than two subjects to draw a safe conclusion.
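Here is a minimal simulation of the field, with made-up numbers and, crucially, two fertilisers that are deliberately identical, so the true difference is zero. The naive top/bottom split “discovers” a large effect that is really just drainage, while random assignment recovers the truth:

```python
import numpy as np

rng = np.random.default_rng(3)
n_plots = 1_000

# The bottom half of the field retains more water, and water (not the
# fertiliser) is what drives yield: the two fertilisers do nothing.
bottom_half = np.arange(n_plots) >= n_plots // 2
water = np.where(bottom_half, 2.0, 1.0) + rng.normal(0, 0.1, n_plots)
yields = 5.0 * water + rng.normal(0, 1.0, n_plots)

# Naive assignment: A on the top half, B on the bottom half.
gets_a = ~bottom_half
print(f"naive (A - B):      "
      f"{yields[gets_a].mean() - yields[~gets_a].mean():+.2f}")  # ~-5.0

# Randomised assignment: drainage no longer influences who gets A.
gets_a = rng.random(n_plots) < 0.5
print(f"randomised (A - B): "
      f"{yields[gets_a].mean() - yields[~gets_a].mean():+.2f}")  # ~0.0
```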

So what now? Let’s say we did everything correctly: we selected a representative sample, performed a double-blind experiment, and accounted for all the possible confounders. Can we now say that our results are safe and sound? If all those steps were carried out carefully, in other words, if we ran our experiment according to the scientific method, could we now finally say that waking up at 5 a.m. makes you a successful CEO?! Unfortunately, no. There is one more step before you can safely make such a claim, and it has to do with the reproducibility of your results.

Different people doing the same thing

Academic papers usually contain a section at the end called “Conclusion”, where the authors provide a summary of the findings that are worth mentioning. If you read those conclusions carefully, you will see that the language they use is very specific. A bold and direct claim such as “Eating garlic makes you immune to COVID-19” is seldom made, even if the data from the experiment suggest something like that. The reason is that scientists know that one experiment on its own doesn’t prove a positive statement. In fact, positive statements are notoriously difficult to prove. It can be easy to find evidence suggesting that a statement might be true, but it’s almost impossible to definitively prove it.

Sometimes the best we can do in the real world is to gather as much evidence as possible and go with that until newer evidence arises. In our case, the experiment has to be reproduced by many different scientists, around the globe, independently of one another. If all of them point towards the same result, while covering a diverse sample of different ethnicities, ages, genders, and so on, then we can say (but again, not definitively) that “the literature suggests that …” as a general truth. This is as close to the truth as we can get in the complex real world, in most cases.
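One common way of combining such independent replications into a single statement is a meta-analysis. As a rough sketch (the numbers below are invented), a fixed-effect meta-analysis pools the estimates weighted by how precise each one is:

```python
import numpy as np

# Invented effect estimates and standard errors from five
# independent replications of the same experiment.
estimates = np.array([0.12, 0.08, 0.15, 0.05, 0.11])
std_errs = np.array([0.05, 0.04, 0.06, 0.05, 0.03])

# Fixed-effect meta-analysis: inverse-variance weighted average.
weights = 1.0 / std_errs**2
pooled = np.sum(weights * estimates) / np.sum(weights)
pooled_se = np.sqrt(1.0 / np.sum(weights))

print(f"pooled estimate: {pooled:.3f} +/- {1.96 * pooled_se:.3f}")
```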

Back to reality

These things might sound self-evident to some people, especially in a silly example like the one I chose. In reality, though, many people get confused when the scenario is more complicated.

  • Consultants are the best at management positions.
  • Chess players are better problem solvers.
  • PhDs are smart.

There is always some justification that could potentially work, and that’s how people are persuaded that their belief is true.

  • Consultants learn specific frameworks that other people don’t. That’s why they are better managers.
  • Chess players can concentrate for prolonged periods. That’s why they are better problem solvers.
  • PhDs do a lot of math. That’s why they are smarter.

Some of these claims could be true and others could be false; I don’t claim to know the answer to any of them. What I do claim is that every time someone makes such a powerful claim, I expect it to be justified with scientific literature. Otherwise, I don’t take it too seriously.

What we have to remember is that this isn’t something that happens only to less smart people. It happens to everyone, ourselves included, and it’s very tricky to manage when it happens with things that are close to us.

Conclusion

The job of a scientist is not to create theories but to fight them. If the whole scientific community is trying to disprove a theory by conducting experiments and keeps failing, then we have evidence that the theory could be correct. If, in addition to that, the theory makes some positive claims that are confirmed experimentally, then we have even more evidence that the theory is correct. It might be the case that after some time we find a better theory, one that fits the experimental data we have collected even better and makes more accurate predictions. Then the cycle starts all over again.

I started this article with an attempt at demystifying causality, and it ended up being about the scientific method. This is no accident. This method is the best tool we have to understand “how” things work. As Richard Dawkins famously puts it:

If you base medicine on science you cure people. If you base the design of rockets on science they reach the moon. It works … Bitches!
