# Correlation analysis can be misused to explain a cause-and-effect relationship

### Evidence in Medicine: Correlation and Causation – Science-Based Medicine


All of these claims are based on one of the most widely misused and misunderstood mathematical operations in existence: correlation. This chapter aims to demystify this relatively simple technique and to provide you with some ground rules for assessing such claims.

Several decades ago in western Pennsylvania, a study collected the records of both air pollution and deaths due to pulmonary disease from several counties around Pittsburgh when that city was the nation's leading center of steel production.

## Correlation and Causation

The authors reported a correlation between poor air quality and the death rate. In particular, Allegheny County had the worst air and the greatest number of deaths. Pollution control laws were just being implemented at the time, yet, as one wag pointed out, there were sites exempted by the County's laws: Perhaps, then, the study's authors had their conclusions backwards; really, it is deaths that cause air pollution. Although the true relation between air pollution and pulmonary disease may seem obvious, a correlation between two varying quantities can never be taken as prima facie evidence that one causes the other.
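One way to see why a correlation alone cannot settle the direction of cause is that the usual correlation coefficient is symmetric in its two inputs. The sketch below computes Pearson's r by hand; the `pearson_r` helper and both data lists are assumptions made up for illustration, not figures from the Pittsburgh study.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical numbers, invented for illustration only.
pollution_index = [12, 18, 25, 31, 40, 47]
deaths_per_100k = [30, 34, 41, 45, 55, 58]

# Swapping the arguments gives the same coefficient, so r carries
# no information about which quantity (if either) is the cause.
r_pollution_deaths = pearson_r(pollution_index, deaths_per_100k)
r_deaths_pollution = pearson_r(deaths_per_100k, pollution_index)
print(round(r_pollution_deaths, 3), round(r_deaths_pollution, 3))
```

Both calls print the same value, which is the wag's point in numerical form: the statistic is equally happy with "pollution causes deaths" and "deaths cause pollution".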

A co-relation is just that: two quantities varying together. It is not a statement about cause and effect. Viewed in isolation, it is not possible to tell what the relationship between two correlated variables is: A could cause B, B could cause A, or a third factor could drive both.

It's common sense that air pollution causes disease.

Why do we need to worry about statistics and uncertainties? The cause is obvious.

The essayist Philip Slater once wrote: "How many dogs do you know who believe the stars control their lives?"

Another statistical trap is the Will Rogers effect, named after the US comedian who reportedly quipped: "When the Okies left Oklahoma and moved to California, they raised the average intelligence level in both states." If diagnostic methods improve, some very slightly unhealthy patients may be recategorised from the healthy group to the ill group, so the average health outcomes of both groups improve regardless of how effective the treatment is.
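The arithmetic behind the Will Rogers effect can be checked with a few lines of Python. This is a minimal sketch using invented health scores (higher means healthier); the group names and numbers are assumptions for illustration only.

```python
def mean(values):
    return sum(values) / len(values)

# Hypothetical health scores, invented for illustration.
healthy = [90, 85, 80, 60]
ill = [50, 40, 30]

before = (mean(healthy), mean(ill))

# Better diagnostics reclassify the least healthy "healthy" patient as ill.
moved = min(healthy)
healthy_after = [score for score in healthy if score != moved]
ill_after = ill + [moved]

after = (mean(healthy_after), mean(ill_after))

print("before (healthy, ill):", before)
print("after  (healthy, ill):", after)
```

The patient who moves is below the average of the group they leave and above the average of the group they join, so both group averages rise even though no individual score has changed.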

Picking and choosing among the data can lead to the wrong conclusions. Skeptics see periods of cooling (blue) where the data really shows long-term warming (green).

This is bad statistical practice, but if done deliberately can be hard to spot without knowledge of the original, complete data set.
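To make the cherry-picking problem concrete, here is a small sketch that fits a straight line to a whole series and then to one hand-picked window. The series is synthetic and invented for illustration (it is not real temperature data), and `least_squares_slope` is a hypothetical helper written for this example.

```python
def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    return cov / var_x

# Synthetic "staircase" series: each decade starts 0.5 units higher
# than the last, then drifts down by 0.02 per year within the decade.
years = list(range(50))
values = [0.5 * (t // 10) - 0.02 * (t % 10) for t in years]

full_slope = least_squares_slope(years, values)
# Cherry-picked window: years 10..19, which sit inside a single decade.
window_slope = least_squares_slope(years[10:20], values[10:20])

print(f"full record slope:   {full_slope:+.3f} per year")
print(f"cherry-picked slope: {window_slope:+.3f} per year")
```

The full record trends upward while the selected window trends downward, so anyone shown only the window would draw the opposite conclusion from the same data.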

### Correlation is not causation | Nathan Green's S word | Science | The Guardian

Consider the above graph showing two interpretations of global warming data, for instance. Or fluoride — in small amounts it is one of the most effective preventative medicines in history, but the positive effect disappears entirely if one only ever considers toxic quantities of fluoride. For similar reasons, it is important that the procedures for a given statistical experiment are fixed in place before the experiment begins and then remain unchanged until the experiment ends.

Consider a medical study examining how a particular disease, such as cancer or multiple sclerosis, is geographically distributed. If the disease strikes at random and the environment has no effect, we would expect to see numerous clusters of patients as a matter of course. If patients are spread out perfectly evenly, the distribution would be most un-random indeed!
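A quick simulation illustrates how purely random placement produces uneven counts. This is a sketch under stated assumptions: the 5 x 5 grid, the 100 cases, and the fixed seed are all arbitrary choices made for the example, not values from any study.

```python
import random
from collections import Counter

random.seed(42)  # fixed seed so the sketch is repeatable

GRID = 5     # 5 x 5 = 25 equal-sized regions (hypothetical)
CASES = 100  # cases dropped uniformly at random, with no environmental effect

counts = Counter(
    (random.randrange(GRID), random.randrange(GRID)) for _ in range(CASES)
)
per_region = [counts.get((row, col), 0) for row in range(GRID) for col in range(GRID)]

expected = CASES / (GRID * GRID)
print("expected cases per region:", expected)
print("fewest in any region:", min(per_region))
print("most in any region:  ", max(per_region))
```

Even though every region is equally likely to receive each case, some regions end up with noticeably more than the expected count and others with fewer, which is what an apparent "cluster" looks like when nothing is causing it.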

So the presence of a single cluster, or a number of small clusters of cases, is entirely normal.

Some attack the validity of science itself, usually with post-modernist philosophy.

Pseudoscientific proponents, on the other hand, praise science; they just do it wrong. In reality there is a continuum from complete pseudoscience to pristine science, with no clear demarcation in the middle. Individual studies vary along this spectrum as well: there are different kinds of evidence, each with its own strengths and weaknesses, and there are no perfect studies.

Further, when evaluating any question in medicine, the literature (the totality of all those individual studies) rarely points uniformly to a single answer. These multiple overlapping continua of scientific quality create the potential to make just about any claim seem scientific simply by how the evidence is interpreted. Also, even a modest bias can lead to emphasizing certain pieces of evidence over others, leading to conclusions which seem scientific but are unreliable.

Also, proponents can easily begin with a desired conclusion and then backfill the evidence to suit their needs, rather than allowing the evidence to lead them to a conclusion. For example, the anti-vaccine movement systematically endorses any piece of evidence that seems to support the conclusion that there is some correlation between vaccines and neurological injury.

### Statistical Language - Correlation and Causation

Meanwhile, they find ways to dismiss any evidence which fails to show such a connection. They, of course, accuse the scientific community of doing the same thing, and each side cites biases and conflicts in the other to explain the discrepancy. It is no wonder the public is confused. How, then, do we use the evidence to arrive at reliable scientific conclusions?

That is what I will be discussing in this series of posts, beginning with a discussion of correlation and causation, but here is a quick overview: science-based medicine (SBM) is achieved through a consideration of scientific plausibility and a systematic review of the clinical evidence. In other words, all scientific evidence is considered in a fair and thorough manner, including basic science and clinical evidence, and placed in the context of what we know about how the world works.

This leads us to the final continuum: the consensus of expert opinion based upon systematic reviews can result in a solid and confident unanimous opinion, a reliable opinion with serious minority objections, a genuine controversy with no objective resolution, or simply the conclusion that we currently lack sufficient evidence and do not know the answer.

It can also lead, of course, to a solid consensus of expert opinion combined with a fake controversy manufactured by a group driven by ideology or greed and not science.

## Correlation and Causation

Much of scientific evidence is based upon a correlation of variables: they tend to occur together. Scientists are careful to point out that correlation does not necessarily mean causation. The assumption that A causes B simply because A correlates with B is a logical fallacy; it is not a legitimate form of argument.
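A small constructed example shows how two quantities can correlate strongly without either one influencing the other. In this sketch a hidden factor `c` drives both `a` and `b`; all three series and the `pearson_r` helper are assumptions invented for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical hidden factor C.
c = list(range(1, 11))

# A and B are each computed from C alone (plus a small fixed wobble);
# neither formula refers to the other variable.
a = [2 * ci + (1 if ci % 2 else -1) for ci in c]
b = [3 * ci + (ci % 3 - 1) for ci in c]

r_ab = pearson_r(a, b)
print(f"correlation between A and B: {r_ab:.3f}")
```

By construction A has no effect on B and B has no effect on A, yet the coefficient is close to 1, because both simply track C.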

However, sometimes people commit the opposite fallacy: dismissing correlation entirely, as if it could never be evidence of causation. This would dismiss a large swath of important scientific evidence.