Are epidemiological studies almost worthless?

20 February, 2007

Browsing through Nassim Nicholas Taleb’s diary I noticed this quote:

At the AAAS conference in San Francisco I was a discutant of session in which John Ioannidis showed that 4 out of 5 epidemiological “statistically significant” studies fail to replicate in controlled experiments.

NNT crows that this is what he has already come to describe as the narrative fallacy. If you look hard enough at enough data, you will see a pattern emerge.

Anyway, I have looked up John Ioannidis’s research and found this interesting paper, Why Most Published Research Findings Are False, which unfortunately I haven’t had the chance to read in full.

The outline of his idea is simple enough: if you look at enough data (particularly small data sets) you will find statistically significant relationships by chance alone. The part I thought was interesting was this:

As has been shown previously, the probability that a research finding is indeed true depends on the prior probability of it being true (before doing the study), the statistical power of the study, and the level of statistical significance.

Which is kind of obvious. If I correlate enough astrological data with some disease I will inevitably find some correlation, but because the prior probability of it being true is essentially zero, there is still very little chance of the finding being true.
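To put rough numbers on that, here is a minimal sketch using plain Bayes' rule. It ignores the bias and multiple-team effects that the paper also discusses, and the prior, power and significance values are made up purely for illustration.

```python
def positive_predictive_value(prior, power, alpha):
    """Probability that a statistically significant finding is actually true.

    prior: probability the hypothesis is true before the study
    power: probability the study detects a true effect
    alpha: significance level, i.e. the false positive rate
    """
    true_positives = power * prior
    false_positives = alpha * (1.0 - prior)
    return true_positives / (true_positives + false_positives)


# Hypothetical inputs, chosen only to contrast a plausible hypothesis
# with an astrology-style one that has a near-zero prior.
plausible = positive_predictive_value(prior=0.5, power=0.8, alpha=0.05)
astrology = positive_predictive_value(prior=0.001, power=0.8, alpha=0.05)

print(f"plausible hypothesis:       {plausible:.3f}")
print(f"astrology-style hypothesis: {astrology:.3f}")
```

With the same power and significance level, the near-zero prior drags the chance of a significant result being true down to a percent or two.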

A cool little tool

9 June, 2006

I’ve just discovered this cool tool from Google called Gapminder. It lets you graph various development indices against each other over time for the past 25 to 30 years or so.

One thing I found interesting is the plot of CO2 emissions per capita versus GDP per capita. In 1975, plotting it log/log gives you a pretty straight line (excluding China) with an exponent of around 2, so whenever GDP doubled, CO2 emissions typically quadrupled. These numbers were picked off the screen very roughly, so don’t trust them too much. Now if I roll it forward to 2002, it seems that there are two regimes: a low GDP regime where CO2 emissions increase as fast as ever with GDP (exponent of around 2), and a “high” GDP regime (greater than about $4500) where the exponent would seem to be less than one, i.e. when we double GDP, CO2 emissions less than double. I would imagine this is in part due to the oil shocks transforming behaviour in western countries.

Anyway this type of thing gives me some hope we can make further progress and get that slope down to zero or less and increase GDP without increasing CO2 output. Current world oil prices will be helping.
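To make the exponent arithmetic concrete, here is a rough sketch of how the slope of a log/log plot could be estimated and turned into a "what happens when GDP doubles" factor. The data points are invented to lie exactly on a power law with exponent 2; they are not Gapminder figures, and the function name is just mine.

```python
import math


def fit_exponent(gdp, co2):
    """Least-squares slope of log(co2) against log(gdp)."""
    xs = [math.log(g) for g in gdp]
    ys = [math.log(c) for c in co2]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Made-up points on co2 = constant * gdp**2 (illustration only).
gdp = [500, 1000, 2000, 4000]
co2 = [1e-6 * g ** 2 for g in gdp]

exponent = fit_exponent(gdp, co2)
doubling_factor = 2 ** exponent  # how much CO2 multiplies when GDP doubles

print(f"exponent: {exponent:.2f}, CO2 multiplies by {doubling_factor:.2f}")
```

An exponent of 2 gives a factor of 4 (quadrupling), any exponent below 1 gives a factor below 2, and an exponent of 0 would mean GDP can grow with no change in CO2 at all.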

My Sudoku Solver

5 May, 2006

Some time ago I did a few of these Sudoku puzzles and then thought to myself, “all I’m doing here is manually applying about three or four rules”. So like many others I wrote myself a little Visual Basic Sudoku solver. It seems to do most of them, although I’m sure there are some it can’t solve. I haven’t found any yet, but then the testing wasn’t what you would call rigorous or lengthy.

Anyhow, my plans came to nought when I found that I couldn’t upload an .exe file, I’m guessing because of the virus danger such files pose. Anyway, if someone reads this and knows a good way to make a small file available, let me know.
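In the meantime, for anyone curious what the rule-based approach looks like, here is a rough sketch in Python. It is not the Visual Basic program, and the two rules shown (a cell with only one possible digit, and a digit with only one possible cell in a row, column or box) are my assumption of the sort of rules involved. Like the original, it simply gives up on puzzles these rules can't crack.

```python
def peers_of(r, c):
    """Cells sharing a row, column, or 3x3 box with (r, c)."""
    cells = {(r, k) for k in range(9)} | {(k, c) for k in range(9)}
    br, bc = 3 * (r // 3), 3 * (c // 3)
    cells |= {(br + i, bc + j) for i in range(3) for j in range(3)}
    cells.discard((r, c))
    return cells


# All 27 units: 9 rows, 9 columns, 9 boxes.
UNITS = (
    [[(r, c) for c in range(9)] for r in range(9)]
    + [[(r, c) for r in range(9)] for c in range(9)]
    + [[(br + i, bc + j) for i in range(3) for j in range(3)]
       for br in (0, 3, 6) for bc in (0, 3, 6)]
)


def candidates(grid, r, c):
    """Digits not already used by any peer of (r, c); 0 means empty."""
    used = {grid[pr][pc] for pr, pc in peers_of(r, c)}
    return set(range(1, 10)) - used


def solve_by_rules(grid):
    """Fill the grid in place using two rules; return True if it ends up full."""
    progress = True
    while progress:
        progress = False
        # Rule 1: a cell with exactly one candidate must take it.
        for r in range(9):
            for c in range(9):
                if grid[r][c] == 0:
                    cands = candidates(grid, r, c)
                    if len(cands) == 1:
                        grid[r][c] = cands.pop()
                        progress = True
        # Rule 2: a digit with only one possible cell in a unit goes there.
        for unit in UNITS:
            for digit in range(1, 10):
                if any(grid[r][c] == digit for r, c in unit):
                    continue
                spots = [(r, c) for r, c in unit
                         if grid[r][c] == 0 and digit in candidates(grid, r, c)]
                if len(spots) == 1:
                    r, c = spots[0]
                    grid[r][c] = digit
                    progress = True
    return all(0 not in row for row in grid)


# Hypothetical demo: a valid grid built from a shifting pattern,
# with the nine diagonal cells blanked out.
solution = [[(3 * (r % 3) + r // 3 + c) % 9 + 1 for c in range(9)]
            for r in range(9)]
puzzle = [row[:] for row in solution]
for i in range(9):
    puzzle[i][i] = 0

solved = solve_by_rules(puzzle)
print("solved" if solved else "stuck")
```

The demo puzzle is deliberately trivial; harder puzzles may need more rules, or a fallback to guessing and backtracking, which this sketch leaves out.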