Examples for teaching: Correlation does not mean causation
A causal relation between two events exists if the occurrence of the first brings about the second. The first event is called the cause and the second the effect. Correlation, by contrast, is a statistical association between two variables: it is one of the criteria used to fit a prediction equation (a regression model), an algebraic equation expressing one variable in terms of others. A regression equation by itself carries no causal relationship; any causal reading requires prior knowledge of the causal relations, which the algebra alone does not supply.
Not obvious to me whether bacon is more or less healthy than downing a bunch of syrup or Froot Loops or whatever else. But we'll let that be right here. Regular breakfast eaters seemed more physically active than the breakfast skippers. So the implication here is that breakfast makes you more active. And then this last sentence right over here, they say, "Over time, researchers found teens who regularly ate breakfast tended to gain less weight and had a lower body mass index than breakfast skippers."
So the entire narrative here, from the title all the way through every paragraph, is look, breakfast prevents obesity.
Breakfast makes you active. Breakfast skipping will make you obese. So you just say then, boy, I have to eat breakfast. And you should always think about the motivations and the industries around things like breakfast. But the more interesting question is does this research really tell us that eating breakfast can prevent obesity? Does it really tell us that eating breakfast will cause someone to become more active?
Does it really tell us that breakfast skipping can make you overweight or obese? Or, as is more likely, are they showing that these two things just tend to go together?
And this is a really important difference. And let me state it in slightly technical words here. They sound fancy, but they really aren't that fancy. Are they pointing out causality? That's what it seems like they're implying. Eating breakfast causes you to not be obese. Breakfast causes you to be active.
Breakfast skipping causes you to be obese. So it looks like they are kind of implying causality. They're implying cause and effect, but really what the study looked at is correlation.
The whole point of this is to understand the difference between causality and correlation because they're saying very different things. And, as I said, causality says A causes B. Well, correlation just says A and B tend to be observed at the same time. Whenever I see B happening, it looks like A is happening at the same time.
Whenever A is happening, it looks like it also tends to happen with B. And the reason why it's super important to notice the distinction between these is you can come to very, very, very, very, very different conclusions.
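One way to see why correlation by itself carries no direction is to compute one. A Pearson correlation coefficient comes out exactly the same whichever variable you call A and whichever you call B; the toy numbers below are invented purely for illustration, not taken from the study:

```python
# Pearson correlation is symmetric: corr(A, B) == corr(B, A).
# It records co-occurrence, not direction, which is why it can
# never, by itself, tell us which variable causes which.
# (Toy data invented for illustration; not from any study.)

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

breakfast_days = [7, 5, 6, 2, 0, 3, 7, 1]    # days/week eating breakfast
activity_hours = [9, 7, 8, 4, 2, 5, 10, 3]   # weekly hours of activity

r_ab = pearson(breakfast_days, activity_hours)
r_ba = pearson(activity_hours, breakfast_days)
print(r_ab, r_ab == r_ba)  # same number either way
```

The formula is symmetric in its two arguments, so swapping them cannot possibly tell you which one is the cause.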
So the one thing that this research does do, assuming that it was performed well, is it does show a correlation. So the study does show a correlation. It does show, if we believe all of their data, that breakfast skipping correlates with obesity and obesity correlates with breakfast skipping. We're seeing it at the same time. Activity correlates with breakfast and breakfast correlates with activity-- that all of these correlate. What they don't say-- and there's no data here that lets me know one way or the other-- what is causing what or maybe you have some underlying cause that is causing both.
So for example, they're saying breakfast causes activity, or they're implying breakfast causes activity. They're not saying it explicitly. But maybe activity causes breakfast. The study doesn't rule out that people who are active are just more likely to be hungry in the morning. And then you start having a different takeaway. Then you might say, wait, maybe if you're active and you skip breakfast-- and I'm not telling you that you should.
I have no data one way or the other-- maybe you'll lose even more weight. Maybe it's even a healthier thing to do. So they're trying to say, look, if you have breakfast it's going to make you active, which is a very positive outcome.
But maybe you can have the positive outcome without breakfast. Likewise they say breakfast skipping, or they're implying breakfast skipping, can cause obesity.
But maybe it's the other way around. Maybe people who have high body fat-- maybe, for whatever reason, they're less likely to get hungry in the morning.
So maybe it goes this way. Maybe there's a causality there. Or even more likely, maybe there's some underlying cause that causes both of these things to happen.
And you could think of a bunch of different examples of that. One could be the physical activity. And these are all just theories. I have no proof for it. But I just want to give you different ways of thinking about the same data and maybe not just coming to the same conclusion that this article seems like it's trying to lead us to conclude. That we should eat breakfast if we don't want to become obese. So maybe if you're physically active, that leads to you being hungry in the morning, so you're more likely to eat breakfast.
And obviously being physically active also makes it so that you burn calories.

So how should we reason carefully about cause and effect? It might seem that the answers to such fundamental questions would have been settled long ago. In fact, they turn out to be surprisingly subtle questions.
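The confounding story sketched above, with physical activity driving both breakfast-eating and body mass, can be checked with a quick simulation. Everything below is hypothetical, my own sketch rather than the study's data: breakfast has no direct effect on BMI at all, yet a clear difference between eaters and skippers still appears.

```python
import random

# Hypothetical simulation (illustrative only, not the study's data):
# activity level drives BOTH breakfast-eating and body mass, while
# breakfast itself has NO direct effect on body mass.
random.seed(0)

people = []
for _ in range(10_000):
    activity = random.gauss(0, 1)                       # hidden common cause
    eats_breakfast = activity + random.gauss(0, 1) > 0  # active -> hungrier in the morning
    bmi = 25 - 2 * activity + random.gauss(0, 1)        # active -> lower BMI; no breakfast term
    people.append((eats_breakfast, bmi))

eaters = [bmi for eats, bmi in people if eats]
skippers = [bmi for eats, bmi in people if not eats]

mean_eaters = sum(eaters) / len(eaters)
mean_skippers = sum(skippers) / len(skippers)
print(mean_eaters, mean_skippers)  # eaters average a lower BMI despite no causal link
```

The breakfast-eaters come out noticeably leaner even though, by construction, eating breakfast does nothing to BMI. A study that only records breakfast habits and BMI could not distinguish this world from one where breakfast really is protective.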
Over the past few decades, a group of scientists have developed a theory of causal inference intended to address these and other related questions. This theory can be thought of as an algebra or language for reasoning about cause and effect. Many elements of the theory have been laid out in a famous book by one of the main contributors to the theory, Judea Pearl.
Although the theory of causal inference is not yet fully formed, and is still undergoing development, what has already been accomplished is interesting and worth understanding.
In this post I will describe one small but important part of the theory of causal inference, a causal calculus developed by Pearl. This causal calculus is a set of three simple but powerful algebraic rules which can be used to make inferences about causal relationships.
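To give a flavor of the notation involved — my own gloss, not a quotation from Pearl — the key move is to distinguish observing that a variable takes a value from intervening to set it:

```latex
% "Seeing": how often Y = y among cases where X = x happens to hold.
P(y \mid x) \neq P(y)
  \quad \text{(correlation: learning } X = x \text{ changes our beliefs about } Y\text{)}

% "Doing": how often Y = y when we force X = x by intervention.
P(y \mid \operatorname{do}(x)) \neq P(y)
  \quad \text{(causation: setting } X = x \text{ changes the distribution of } Y\text{)}
```

The three rules then give conditions under which expressions containing the do-operator can be rewritten in terms of ordinary conditional probabilities, which can be estimated from observational data.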
The post is a little technically detailed at points. However, the first three sections are non-technical, and I hope will be of broad interest. The later sections include exercises and problems; you may find it informative to work through them.
Before diving in, one final caveat: I am not an expert on causal inference, nor on statistics. Occasionally, one finds a presentation of a technical subject which is beautifully clear and illuminating, a presentation where the author has seen right through the subject and is able to convey that crystallized understanding to others. This is not one of those presentations; I wrote this post to help me internalize the ideas of the causal calculus. Nonetheless, I hope others will find my notes useful, and that experts will speak up to correct any errors or misapprehensions on my part.
Consider the voting record on the US Civil Rights Act of 1964. Overall, a higher percentage of Republicans than Democrats voted for the Act, so you might think we could conclude that being Republican, rather than Democrat, was an important factor in causing someone to vote for it. However, the picture changes if we include an additional factor in the analysis, namely, whether a legislator came from a Northern or Southern state.
If we include that extra factor, the situation completely reverses: in both the North and the South, a higher percentage of Democrats than Republicans voted for the Act.

North: Democrat 94 percent, Republican 85 percent
South: Democrat 7 percent, Republican 0 percent

Yes, you read that right. You might wonder how this can possibly be true.
The numbers above are for the House of Representatives. (You can skip the arithmetic if you trust me.) The explanation of the reversal is that, at the time, the Southern states were represented by 94 Democrats and only 10 Republicans, and Southern legislators of both parties overwhelmingly opposed the Act; the South's lopsided Democratic representation dragged the overall Democratic percentage down.
The numbers were different in the Senate, but the same overall phenomenon occurred. If we take a naive causal point of view, this result looks like a paradox.
As I said above, the overall voting pattern seems to suggest that being Republican, rather than Democrat, was an important causal factor in voting for the Civil Rights Act. So two variables which appear correlated can become anticorrelated when another factor is taken into account. You might wonder if results like those we saw in voting on the Civil Rights Act are simply an unusual fluke. But, in fact, this is not that uncommon.
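The reversal can be verified with a few lines of arithmetic. The Southern head-count (94 Democrats, 10 Republicans) comes from the text above; the Northern head-counts below are illustrative guesses, chosen only so that the quoted percentages come out right:

```python
# Verifying Simpson's-paradox-style reversal in the Civil Rights Act vote.
# Southern head-counts (94 D, 10 R) are from the text; Northern head-counts
# are illustrative, chosen to reproduce the quoted percentages.
north = {"Democrat": (145, 154), "Republican": (138, 162)}  # (yes votes, members)
south = {"Democrat": (7, 94), "Republican": (0, 10)}

for party in ("Democrat", "Republican"):
    ny, nn = north[party]
    sy, sn = south[party]
    print(party,
          f"North {ny / nn:.0%}",
          f"South {sy / sn:.0%}",
          f"Overall {(ny + sy) / (nn + sn):.0%}")
# Democrats lead Republicans within each region, yet trail overall,
# because most Democrats sat in the South, where support was tiny.
```

Because the Democratic total mixes a large low-support group (the South) with the high-support North, the aggregate can point the opposite way from every subgroup.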
In case after case, understanding the causal relationships turns out to be much more complex than one might at first think. Imagine you suffer from kidney stones, and your doctor offers you a choice between two treatments, A and B. Your doctor tells you that the two treatments have been tested in a trial, and treatment A was effective for a higher percentage of patients than treatment B.
Keep in mind that this really happened. Suppose you divide patients in the trial up into those with large kidney stones and those with small kidney stones. Then even though treatment A was effective for a higher overall percentage of patients than treatment B, treatment B was effective for a higher percentage of patients in both groups, i.e., among patients with large stones and among patients with small stones alike.

I find it more than a little mind-bending that my heuristics about how to behave on the basis of statistical evidence are not just a little wrong, but utterly, horribly wrong.
Or, to put it another way, they have not the first clue about statistics. Partial evidence may be worse than no evidence if it leads to an illusion of knowledge, and so to overconfidence and certainty where none is justified.
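The kidney-stone reversal can be checked numerically. The counts below are the figures usually cited for this example (from the Charig et al. 1986 study), relabeled to match the text above — treatment A better overall, treatment B better within each group — and included here purely for illustration:

```python
# Classic kidney-stone trial counts: (successes, patients) by stone size.
# Labeled to match the text: A wins overall, B wins in each subgroup.
a = {"small": (234, 270), "large": (55, 80)}    # treatment A
b = {"small": (81, 87), "large": (192, 263)}    # treatment B

for size in ("small", "large"):
    sa, na = a[size]
    sb, nb = b[size]
    print(size, f"A {sa / na:.0%}", f"B {sb / nb:.0%}")  # B leads in both groups

tot_a = (234 + 55) / (270 + 80)   # A's overall success rate
tot_b = (81 + 192) / (87 + 263)   # B's overall success rate
print(f"Overall: A {tot_a:.0%}, B {tot_b:.0%}")  # yet A leads overall
```

The trick is the same as in the voting example: treatment B was disproportionately given the hard cases (large stones), so its overall rate is dragged down even though it does better on every kind of patient.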