Public Health

#FOAMPubMed 3: Type I Error


First things first, no piece of research is perfect.  Every study will have its limitations. 

One way we try to make research better is through understanding error.  

If we find that the new drug works when it doesn’t that’s called a false positive.  We can’t eliminate false positives; some patients will get better even if given placebo.  But too many false positives and we will find an effect when one doesn’t actually exist. We will wrongly reject our null hypothesis.  

Type I Error comes about when we wrongly reject our null hypothesis. 

This will mean that we will find our new drug is better than the standard treatment (or placebo) when it actually isn't.

The probability of making a Type I Error is called alpha (α).

A way I like to look at Type I Error is the influence of chance on your study. Some patients will get better just through chance. You need to reduce the impact of chance on your study.

For instance, I may want to investigate how psychic I am. My null hypothesis would be ‘I am not psychic.’

I toss a coin once. I guess tails. I’m right. I therefore reject my null hypothesis and conclude I’m psychic.

You don’t need to be an expert in research to see how open to chance that study is and how one coin toss can’t be enough proof. We’d need at least hundreds of coin tosses to see if I could predict each one.
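To put rough numbers on the coin toss example, here is a minimal sketch in Python. It assumes a fair coin and independent tosses. The function names and the 1 in 20 cut-off are illustrative assumptions, not part of the original example.

```python
from fractions import Fraction


def chance_of_lucky_streak(n_tosses: int) -> Fraction:
    """Chance of guessing every one of n fair coin tosses correctly by luck alone."""
    if n_tosses < 0:
        raise ValueError("n_tosses must be zero or more")
    return Fraction(1, 2) ** n_tosses


def tosses_needed(threshold: Fraction) -> int:
    """Smallest number of tosses for which a perfect lucky streak is rarer than `threshold`."""
    if threshold <= 0:
        raise ValueError("threshold must be greater than zero")
    n = 0
    while chance_of_lucky_streak(n) >= threshold:
        n += 1
    return n


if __name__ == "__main__":
    # One toss leaves a 1 in 2 chance of looking 'psychic' by luck.
    for n in (1, 5, 10, 20):
        p = chance_of_lucky_streak(n)
        print(f"{n:>2} tosses: 1 in {p.denominator} chance of a perfect score by luck")

    # Illustrative cut-off (an assumption): luck alone should explain
    # a perfect score less than 1 time in 20.
    print("Tosses needed:", tosses_needed(Fraction(1, 20)))
```

With a single toss, chance alone gives the 'right' answer half the time. Every extra toss that has to be called correctly halves the room for luck, which is the sense in which sample size helps reduce Type I Error.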

You can see how understanding Type I Error influences how you design your study, including your sample size.

More of that later. The next blog will look at how we actually statistically show that we’ve reduced Type I Error in our study.

#FOAMPubMed 2: The null hypothesis


When we do research in Medicine it’s usually to test whether a new treatment works (by testing it against placebo) or whether it is better than the established treatment we’re already using.

At the beginning of our study we have to come up with a null hypothesis (denoted as H0).

The null hypothesis is a statement that assumes no measurable difference between whatever you’re studying.  

The null hypothesis is therefore usually something along the lines of: 

‘Drug A won’t be better than Drug B at treating this condition.’  

We then set out to test this null hypothesis.  If we find Drug A is better than B then we reject the null hypothesis and conclude Drug A is the superior treatment. If Drug A is found to be no better (i.e. the same or worse) than Drug B then we accept our null hypothesis and conclude that Drug A is non-superior (or inferior).

Error comes when we either wrongly reject or wrongly accept the null hypothesis.

Error means we come to the wrong conclusion. There are two types of error. The next blog will look at the first, Type I Error.

#FOAMPubMed 1: Lemons and Limes, the first clinical trial and how to make a research question


Before we conduct any research we first need to construct a research question. This can be a difficult step. Our question needs to be precise and easy to understand. To do this we can use the ‘PICO’ criteria:


We need a population of interest. These will be subjects who share particular demographics, and they need to be clearly documented.


The intervention is something you’re going to do to your population. This could be treatment or education or an exposure such as asbestos. The effect of the intervention is what you’re interested in.


If we’re going to study an intervention we need to compare it. We can use people without the exposure (control) or compare the treatment to another or placebo.


The outcome is essentially what we are going to measure in our study. This could be mortality, it could be an observation such as blood pressure or a statistic such as length of stay in hospital. Whatever it is, we need to be very clear that this is our main outcome, otherwise known as our primary outcome. The outcome decides our sample size so it has to be explicit.

PICO therefore allows us to form a research question.

To demonstrate this let’s look at the first ever clinical trial and see how we use PICO to write a research question.

It’s the 18th century. An age of empires, war and exploration. Britain, an island nation in competition with its neighbours for hegemony, relies heavily on her navy as the basis of her expansion and conquest. This is the time of Rule Britannia. Yet Britain, as with all sea going nations, was riddled with one scourge amongst its sailors: scurvy.

Scurvy is a disease caused by a lack of Vitamin C. Vitamin C, or ascorbic acid, is essential in the body to help catalyse a variety of different functions including making collagen, a protein which forms the building blocks of connective tissue, and wound healing. A lack of Vitamin C therefore causes a breakdown of connective tissue as well as impaired healing; this is scurvy, a disease marked by skin changes, bleeding, loss of teeth and lethargy. Hardly the state you want your military to be when you’re trying to rule the waves.

James Lind was born in Edinburgh in 1716. In 1731, he registered as an apprentice at the College of Surgeons in Edinburgh and in 1739 became a surgeon's mate, seeing service in the Mediterranean, Guinea and the West Indies, as well as the English Channel. In 1747, whilst serving on HMS Salisbury he decided to study scurvy and a potential cure.

James Lind 1716-1794

Lind, as with medical opinion at the time, believed that scurvy was caused by a lack of acid in the body which made the body rot or putrefy. He therefore sought to treat sailors suffering with scurvy with a variety of acidic substances to see which was the best treatment. He took 12 sailors with scurvy and divided them into six pairs. One pair were given cider on top of their normal rations, another sea water, another vinegar, another sulphuric acid, another a mix of spicy paste and barley with another pair receiving two oranges and one lemon (citrus fruits containing citric acid).

Although they ran out of fruit after five days, by that point one of the pair receiving citrus fruits had returned to active duty whilst the other was nearly recovered. Lind published his findings in his 1753 work, A treatise on scurvy. Despite this outcome, Lind himself and the wider medical community did not recommend citrus fruits to be given to sailors. This was partly due to the impossibility of keeping fresh fruit on a long voyage and the belief that other, easier-to-store acids could cure the disease. Lind recommended a condensed juice called ‘rob’ which was made by boiling fruit juice. Boiling destroys vitamin C and so subsequent research using ‘rob’ showed no benefit. Captain James Cook managed to circumnavigate the globe without any loss of life to scurvy. This is likely due to his regular replenishment of fresh food along the way as well as the rations of sauerkraut he provided.

It wasn’t until 1794, the year that Lind died, that senior officers on board HMS Suffolk overruled the medical establishment and insisted on lemon juice being provided on their twenty-three-week voyage to India to mix with the sailors’ grog. The lemon juice worked. The organisation responsible for the health of the Navy, the Sick and Hurt Board, recommended that lemon juice be included on all voyages in the future.

Although his initial assumption was wrong, that scurvy was due to a lack of acid and it was the acidic quality of citrus fruits that was the solution, James Lind had performed what is now recognised as the world’s first clinical trial. Using PICO we can construct Lind’s research question.


Population: Sailors in the Royal Navy with scurvy

Intervention: Giving sailors citrus fruits on top of their normal rations

Comparison: Seawater, vinegar, spicy paste and barley water, sulphuric acid and cider

Outcome: Patient recovering from scurvy to return to active duty

So James Lind’s research question would be:

Are citrus fruits better than seawater, vinegar, spicy paste and barley water, sulphuric acid and cider at treating sailors in the Royal Navy with scurvy so they can recover and return to active duty?

After HMS Suffolk arrived in India without scurvy the Naval establishment began to give citrus fruits in the form of juice to all sailors. This arguably helped swing superiority the way of the British as health amongst sailors improved. It became common for citrus fruits to be planted across Empires by the Imperial countries in order to help their ships stop off and replenish. The British planted a particularly large stock in Hawaii. Whilst lemon juice was originally used the British soon switched to lime juice. Hence the nickname, ‘limey’.

A factor which had made the cause of scurvy hard to find was the fact that most animals, unlike humans, can actually make their own Vitamin C and so don’t get scurvy. A team in 1907 was studying beriberi, a disease caused by the lack of Thiamine (Vitamin B1), in sailors by giving guinea pigs their diet of grains. Guinea pigs by chance also don’t synthesise Vitamin C and so the team were surprised when, rather than developing beriberi, the animals developed scurvy. In 1912 Vitamin C was identified. In 1928 it was isolated and by 1933 it was being synthesised. It was given the name ascorbic (against scurvy) acid.

James Lind didn’t know it but he had effectively invented the clinical trial. He had a hunch. He tested it against comparisons. He had a clear outcome. As rudimentary as it was this is still the model we use today. Whenever we come up with a research question we are following the tradition of a ship’s surgeon and his citrus fruit.

Thanks for reading.

- Jamie

Medicine and Game Theory: How to win


You have to learn the rules of the game; then learn to play better than anyone else - Albert Einstein

Game theory is a field of mathematics which emerged in the 20th century looking at how players in a game interact. In game theory any interaction between two or more people can be described as a game. In this musing I’m looking at how game theory can influence healthcare both in the way we view an individual patient as well as future policy.

There are at least two kinds of games. One could be called finite, the other infinite. A finite game is played for the purpose of winning, an infinite game for the purpose of continuing the play.     

James P. Carse, author of Finite and Infinite Games

Game theory is often mentioned in sports and business

In a finite game all the players and all the rules are known. The game also has a known end point. A football match would therefore be an example of a finite game. There are two teams of eleven players with their respective coaches. There are two halves of 45 minutes and clear laws of football officiated by a referee. After 90 minutes the match is either won, lost or drawn and is definitely over.

Infinite games have innumerable players and no end points. Players can stop playing or join or be absorbed by other teams. The goal is not an endpoint but to keep playing. A football season or even several football seasons could be described as an infinite game. Key to infinite games then is a vision and principles. A team may lose one match but success is viewed by the team remaining consistent to that vision; such as avoiding relegation every season or promoting young talent. Athletic Club in Spain are perhaps the prime example of this. Their whole raison d'être is that they only use players from the Basque Region of Spain. This infinite game of promoting local talent eschews any short term game. In fact their supporters regularly report they’d rather get relegated than play non-Basque players.

Problems arise when finite and infinite games are confused. When Sir Alex Ferguson retired as Manchester United manager after 27 years in 2013 the club attempted to play an infinite game. They chose as his replacement David Moyes, a manager with a similar background and ethics to Ferguson, giving him a six-year contract. Ten months into that he was fired and since then United have been playing a finite game, choosing more short-term appointments, Louis van Gaal and Jose Mourinho, rather than following a vision.

It’s easy to see lessons for business from game theory. You may get a deal or not. You may have good quarters or bad quarters. But whilst those finite games are going on you have your overall business plan, an infinite game. You’re playing to keep playing by staying in business.

What about healthcare?

So a clinician and patient could be said to be players in a finite game competing against whatever illness the patient has. In this game the clinician and patient have to work together and use their own experiences to first diagnose and then treat the illness. The right diagnosis is made and the patient gets better. The game is won and over. Or the wrong diagnosis is made and the patient doesn’t get better. The game is lost and over. But what about if the right diagnosis is made but for whatever reason the patient doesn’t get better? That finite game is lost. But what about the infinite game?

Let’s say our patient has an infection. That infection has got worse and now the patient has sepsis. In the United Kingdom we have very clear guidelines on how to manage sepsis from the National Institute for Health and Care Excellence (NICE). Management is usually summed up as the ‘Sepsis Six’. There are clear principles about how to play this game. So we follow these principles as we treat our patient. We follow the Sepsis Six. But they aren’t guarantees. We use them because they give us the best possible chance to win this particular finite game. Sometimes it will work and the patient will get better and we win. Sometimes it won’t and the patient may die, even if all the ‘rules’ are followed, due to reasons beyond the control of any of the players. But whilst each individual patient may be seen as a finite game there is a larger infinite game being played. By making sure we approach each patient with these same principles we not only give them the best chance of winning their finite game but we also keep the infinite game going: ensuring each patient with sepsis is managed in the same optimum way. By playing the infinite game well we have a better chance of winning finite games.

This works at the wider level too. For example, if we look at pneumonia we know that up to 70% of patients develop sepsis. We know that smokers who develop chronic obstructive pulmonary disease (COPD) have up to 50% greater risk of developing pneumonia. We know that the pneumococcal vaccine has reduced pneumonia rates especially amongst patients in more deprived areas. Reducing smoking and ensuring vaccination are infinite game goals and they work. This is beyond the control of one person and needs a coordinated approach across healthcare policy.


Are infinite games the future of healthcare?

In March 2015 just before the UK General Election the Faculty of Public Health published their manifesto called ‘Start Well, Live Better’ for improving general health. The manifesto consisted of 12 points:

The Start Well, Live Better 12 priorities from Lindsey Stewart, Liz Skinner, Mark Weiss, John Middleton, Start Well, Live Better—a manifesto for the public's health, Journal of Public Health, Volume 37, Issue 1, March 2015, Pages 3–5,

There’s a mixture of finite goals here, establishing a living wage for example, and some infinite goals as well, such as universal healthcare. The problem is that finite game success is much more short-term and easier to measure than success in infinite games. We can put a certain policy in place and then measure impact. However, infinite games aimed at improving a population’s general health take years if not decades to show tangible benefit. Politicians who control healthcare policy and heads of department have a limited time in office and need to show benefits immediately. The political and budgetary cycles are short. It is therefore tempting to choose to play finite games only rather than infinite.

The National Health Service Long Term Plan is an attempt to commit to playing an infinite game. The NHS England Chief Simon Stevens laid out five priorities for the NHS, focusing health spending over the next 5 years: mental health, cardiovascular disease, cancer, child services and reducing inequalities. This comes after a succession of NHS plans since 2000 which all focused on increasing competition and choice. The King’s Fund have been ambivalent about the benefit those plans made.

Since its inception the National Health Service has been an infinite game, changing how we view illness and the relationship between the state and patients. Yet if we chase finite games that are incongruous with our infinite game we risk that infinite game. There is a very clear link between the effect of the UK government’s austerity policy on social care and its impact on the NHS.

We all need to identify the infinite game we want to play and make sure it fits our principles and vision. We have to accept that benefits will often be intangible and appreciate the difficulties and scale we’re working with. We then have to be careful with the finite games we choose to play and make sure they don’t cost us the infinite game.

Playing an infinite game means committing to values at both a personal and institutional level. It says a lot about us and where we work. It means those in power putting aside division and ego. Above all it would mean honesty.

Thanks for reading

- Jamie

"Obviously a major malfunction" - how unrealistic targets, organisational failings and misuse of statistics destroyed Challenger


There is a saying commonly misattributed to Gene Kranz, the Apollo 13 flight director: failure is not an option. In a way that’s true. Failure isn’t an option. I would say it’s inevitable in any complicated system. Most of us work in one organisation or another. All of us rely on various organisations in our day to day lives. I work in the National Health Service, one of 1.5 million people. A complex system doing complex work.

In a recent musing I looked at how poor communication through PowerPoint had helped destroy the space shuttle Columbia in 2003. That, of course, was the second shuttle disaster. In this musing I’m going to look at the first.

This is the story of how NASA was arrogant; of unrealistic targets, of disconnect between seniors and those on the shop floor and of the misuse of statistics. It’s a story of the science of failure and how failure is inevitable. This is the story of the Challenger disaster.

”An accident rooted in history”

It’s January 28th 1986 at Cape Canaveral, Florida. 73 seconds after launching, the space shuttle Challenger explodes. All seven of its crew are lost. Over the tannoy a distraught audience hears the words, “obviously a major malfunction.” After the horror come the questions.

The Rogers Commission is formed to investigate the disaster. Amongst its members are astronaut Sally Ride, Air Force General Donald Kutyna, Neil Armstrong, the first man on the moon, and Professor Richard Feynman; legendary quantum physicist, bongo enthusiast and educator.

The components of the space shuttle system

The shuttle programme was designed to be as reusable as possible. Not only was the orbiter itself reused (this was Challenger’s tenth mission) but the two solid rocket boosters (SRBs) were also retrieved and re-serviced for each launch. The cause of the Challenger disaster was found to be a flaw in the right SRB. The SRBs were not one long section but rather several which connected with two rubber O-rings (a primary and a secondary) sealing the join. The commission discovered longstanding concerns regarding the O-rings.

In January 1985, following a launch with the shuttle Discovery, soot was found between the O-rings, indicating that the primary ring hadn’t maintained a seal. At that time the launch had been the coldest yet at about 12 degrees Celsius. At that temperature the rubber contracted and became brittle, making it harder to maintain a seal. On other missions the primary ring was nearly completely eroded through. The flawed O-ring design had been known about since 1977, leading the commission to describe Challenger as “an accident rooted in history.”

The forecast for the launch of Challenger would break the cold temperature record of Discovery: -1 degrees Celsius. On the eve of the launch engineers from Morton Thiokol alerted NASA managers of the danger of O-ring failure. They advised waiting for a warmer launch day. NASA however pushed back and asked for proof of failure rather than proof of safety. An impossibility.

“My God Thiokol, when do you want me to launch? Next April?”

Lawrence Mulloy, SRB Manager at NASA

NASA pressed Morton Thiokol managers to go over their engineers and approve launch. On the morning of the 28th the forecast was proved right and the launch site was covered with ice. Reviewing launch footage the Rogers Commission found that in the cold temperature O-rings on the right SRB had failed to maintain a seal. 0.678 seconds into the launch grey smoke was seen escaping the right SRB. Due to ignition the SRB casing expanded slightly and the rings should have moved with the casing to maintain the seal. However, at minus one degrees Celsius they were too brittle and failed to do so. This should have caused Challenger to explode on the launch pad but aluminium oxides from the rocket fuel filled the damaged joint and did the job of the O-rings by sealing the site. This temporary seal allowed the Challenger to lift off.

This piece of good fortune might have allowed Challenger and its crew to survive. Sadly, 58.788 seconds into the launch Challenger hit a strong wind shear which dislodged the aluminium oxide. This allowed hot gas to escape and ignite. The right SRB burned through its joint to the external tank, coming loose and colliding with it. This caused a fireball which ignited the whole stack.

Challenger disintegrated and the crew cabin was sent into free fall before crashing into the sea. When the cabin was retrieved from the sea bed the personal safety equipment of three of the crew had been activated suggesting they survived the explosion but not the crash into the sea. The horrible truth is that it is possible they were conscious for at least a part of the free fall. Two minutes and forty five seconds.

So why the push back from NASA? Why did they proceed when there were concerns about the safety of the O-rings? This is where we have to look at NASA as an organisation that arrogantly assumed it could guarantee safety. This included its own unrealistic targets.

NASA’s unrealistic targets

NASA had been through decades of boom and bust. The sixties had begun with them lagging behind the Soviets in the space race and finished with the stars and stripes planted on the moon. Yet the political enthusiasm triggered by President Kennedy and the Apollo missions had dried up and with it the public’s enthusiasm also waned. The economic troubles of the seventies were now followed by the fiscal conservatism of President Reagan. The money had dried up. NASA managers looked to shape the space programme in a way to fit the new economic order.

First, space shuttles would be reusable. Second, NASA made bold promises to the government. Their space shuttles would be so reliable and easy to use there would be no need to spend money on any military space programme; instead give the money to NASA to launch spy satellites. In between any government mission the shuttles would be a source of income as the private sector paid to use them. In short, the shuttle would be a dependable bus service to space. NASA promised that they could complete sixty missions a year with two shuttles at any one time ready to launch. This promise meant the pressure was immediately on to perform.

Four shuttles were initially built: Atlantis, Challenger, Columbia and Discovery. The first shuttle to launch was Columbia on 12th April 1981, one of two missions that year. In 1985 nine shuttle missions were completed. This was a peak that NASA would never exceed. By 1986 the target of sixty flights a year was becoming a monkey on the back of NASA. STS-51-L’s launch date had been pushed back five times due to bad weather and the previous mission itself being delayed seven times. Delays in that previous mission were even more embarrassing as Congressman Bill Nelson was part of the crew. Expectation was mounting and not just from the government.

Partly in order to inspire public interest in the shuttle programme the ‘Teacher in Space Project’ had been created in 1984 to carry teachers into space as civilian members of future shuttle crews. From 11,000 completed applications one teacher, Christa McAuliffe from New Hampshire was chosen to fly on Challenger as the first civilian in space. She would deliver two fifteen minute lessons from space to be watched by school children in their classrooms. The project worked. There was widespread interest in the mission with the ‘first teacher in space’ becoming something of a celebrity. It also created more pressure. McAuliffe was due to deliver her lessons on Day 4 of the mission. Launching on 28th January meant Day 4 would be a Friday. Any further delays and Day 4 would fall on the weekend; there wouldn’t be any children in school to watch her lessons. Fatefully, the interest also meant 17% of Americans would watch Challenger’s launch on television.

NASA were never able to get anywhere close to their target of sixty missions a year. They were caught out by the amount of refurbishment needed after each shuttle flight to get the orbiter and solid rocket boosters ready to be used again. They were hamstrung immediately from conception by an unrealistic target they never should have made. Their move to inspire public interest arguably increased demand to perform. But they had more problems including a disconnect between senior staff and those on the ground floor.

Organisational failings

During the Rogers Commission NASA managers quoted that the risk of a catastrophic accident (one that would cause loss of craft and life) befalling their shuttles was 1 in 100,000. Feynman found this figure ludicrous. A risk of 1 in 100,000 meant that NASA could expect to launch a shuttle every day for 274 years before they had a catastrophic accident. The figure of 1 in 100,000 was found to have been calculated as a necessity; the risk had to be that low. It had been used to reassure both the government and astronauts. It had also helped encourage a civilian to agree to be part of the mission. Once that figure was agreed NASA managers had worked backwards to make sure that the safety figures for all the shuttle components combined to make an overall risk of 1 in 100,000. NASA engineers knew this to be the case and formed their own opinion of risk. Feynman spoke to them directly. They perceived the risk at somewhere between 1 in 50 and 1 in 200. Assuming NASA managed to launch sixty missions a year, that meant their engineers expected a catastrophic accident somewhere between once a year and once every three years. As it turned out the Challenger disaster would occur on the 25th shuttle mission. There was a clear disengagement between the perceptions of managers and those with hands-on experience regarding the shuttle programme’s safety. But there were also fundamental errors when it came to calculating how safe the shuttle programme was.
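The arithmetic behind those figures can be checked with a short sketch. It makes a simplifying assumption that every launch carries the same, independent risk, so that a risk of 1 in N means one accident is expected roughly every N launches. The function name is purely illustrative.

```python
def years_between_accidents(one_in_n: int, launches_per_year: float) -> float:
    """Expected years between catastrophic accidents for a per-launch risk of 1 in `one_in_n`.

    Simplifying assumption: every launch carries the same, independent risk,
    so one accident is expected every `one_in_n` launches on average.
    """
    if one_in_n <= 0 or launches_per_year <= 0:
        raise ValueError("both arguments must be positive")
    return one_in_n / launches_per_year


if __name__ == "__main__":
    # The managers' figure, with one launch every day of the year.
    print(f"1 in 100,000 at 365 launches a year: {years_between_accidents(100_000, 365):.0f} years")

    # The engineers' range, at the promised sixty missions a year.
    for one_in_n in (50, 200):
        years = years_between_accidents(one_in_n, 60)
        print(f"1 in {one_in_n} at 60 launches a year: one accident every {years:.1f} years")
```

On these assumptions the managers' figure gives about 274 years of daily launches, while the engineers' range gives an accident somewhere between every ten months and every three and a bit years.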

Misusing statistics

One of those safety figures NASA included in their 1 in 100,000 figure involved the O-rings responsible for the disaster. NASA had given the O-rings a safety factor of 3. This was based on test results which showed that the O-rings could maintain a seal despite being burnt a third of the way through. Feynman again tore this argument apart. A safety factor of 3 actually means that something can withstand conditions three times those it’s actually designed for. He used the analogy of a bridge built to only hold 1000 pounds being able to hold a 3000 pound load as showing a safety factor of 3. If a 1000 pound truck drove over the bridge and it cracked a third of the way through then the bridge would be defective, even if it managed to still hold the truck. The O-rings shouldn’t have burnt through at all. Regardless of them still maintaining a seal, the test results actually showed that they were defective. Therefore the safety factor for the O-rings was not 3. It was zero. NASA misused the definitions and values of statistics to ‘sell’ the space shuttle as safer than it was. There was an assumption of total control. No American astronaut had ever been killed on a mission. Even when a mission went wrong, like Apollo 13, the astronauts were brought home safely. NASA were drunk on their reputation.


The Rogers Commission Report was published on 9th June 1986. Feynman was concerned that the report was too lenient to NASA and so insisted his own thoughts were published as Appendix F. The investigation into Challenger would be his final adventure; he was terminally ill with cancer during the hearing and died in 1988. Sally Ride would also be part of the team investigating the Columbia disaster; the only person to serve on both investigations. After she died in 2012 Kutyna revealed she had been the person discreetly pointing the commission in the correct direction of the faulty O-rings. The shuttle programme underwent a major redesign and it would be two years before there was another mission.

Sadly, the investigation following the Columbia disaster found that NASA had failed to learn lessons from Challenger with similar organisational dysfunction. The programme was retired in 2011 after 30 years and 133 successful missions and 2 tragedies. Since then NASA has been using the Russian Soyuz rocket programme to get to space.

The science of failure

Failure isn’t an option. It’s inevitable. By its nature the shuttle programme was always experimental at best. It was wrong to pretend otherwise. Feynman would later compare NASA’s attitude to safety to a child believing that running across the road is safe because they didn’t get run over. In a system of over two million parts to have complete control is a fallacy.

We may not all work in spaceflight but Challenger and then Columbia offer stark lessons in human factors we should all learn from. A system may seem perfect because its imperfection is yet to be found, or has been ignored or misunderstood.

The key lesson is this: We may think our systems are safe, but how will we really know?

"For a successful technology, reality must take precedence over public relations, for Nature cannot be fooled."

Professor Richard Feynman

Bullet Holes & Bias: The Story of Abraham Wald

“History is written by the victors”

Sir Winston Churchill

It is some achievement if we can be acknowledged as succeeding in our field of work. If that field of work happens to be helping to win the most bloody conflict in history then our achievement deserves legendary status. What then do you say of a man who not only succeeded in his field and helped the Allies win the Second World War but whose work continues to resonate throughout life today? Abraham Wald was a statistician whose unique insight echoes in areas as diverse as clinical research, finance and the modern celebrity obsession. This is his story and the story of survivorship bias. This is the story of why we must take a step back and think.

Abraham Wald and Bullet Holes in Planes

Wald was born in 1902 in the then Austro-Hungarian Empire. After graduating in Mathematics he lectured in Economics in Vienna. Following the Anschluss between Nazi Germany and Austria in 1938, Wald and his family faced persecution as Jews and so they emigrated to the USA after he was offered a university position at Yale. During World War Two Wald was a member of the Statistical Research Group (SRG) as the US tried to approach military problems with research methodology.

One problem the US military faced was how to reduce aircraft casualties. They researched the damage sustained by their planes returning from conflict. By mapping out the damage they found their planes were receiving most bullet holes to the wings and tail. The engine was spared.


Abraham Wald

The US military’s conclusion was simple: the wings and tail are obviously vulnerable to receiving bullets. We need to increase armour to these areas. Wald stepped in. His conclusion was surprising: don’t armour the wings and tail. Armour the engine.

Wald’s insight and reasoning were based on understanding what we now call survivorship bias. Bias is any factor in the research process which skews the results. Survivorship bias describes the error of looking only at subjects who’ve reached a certain point without considering the (often invisible) subjects who haven’t. The US military were only studying the planes which had returned to base following combat, i.e. the survivors. In other words, what their diagram of bullet holes actually showed was the areas where their planes could sustain damage and still be able to fly and bring their pilots home.

No matter what you’re studying if you’re only looking at the results you want and not the whole then you’re subject to survivorship bias.

Wald surmised that it was actually the engines which were vulnerable: if these were hit the plane and its pilot went down and didn’t return to base to be counted in the research. The military listened and armoured the engine not the wings and tail.
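To see Wald's reasoning in numbers, here is a minimal sketch in Python. Every figure in it is invented for illustration (none comes from Wald's actual data), and it assumes each plane takes exactly one hit. It compares where hits really land with where they appear to land if you only count the planes that made it home.

```python
# Hypothetical numbers for illustration only; not Wald's or the military's data.
# Assumption: every plane takes exactly one hit.

# Share of all hits landing on each section of the plane (made up).
HIT_SHARE = {"wings": 0.35, "tail": 0.25, "fuselage": 0.20, "engine": 0.20}

# Probability a plane still gets home after a hit to that section (made up).
SURVIVAL_GIVEN_HIT = {"wings": 0.95, "tail": 0.90, "fuselage": 0.85, "engine": 0.30}


def share_among(hit_share, weight_given_hit):
    """Distribution of hit locations within a subgroup of planes."""
    counts = {s: hit_share[s] * weight_given_hit[s] for s in hit_share}
    total = sum(counts.values())
    return {s: counts[s] / total for s in counts}


# What the researchers saw: hit locations on the planes that returned.
observed = share_among(HIT_SHARE, SURVIVAL_GIVEN_HIT)

# What they could not see: hit locations on the planes that were lost.
loss_given_hit = {s: 1 - p for s, p in SURVIVAL_GIVEN_HIT.items()}
lost = share_among(HIT_SHARE, loss_given_hit)

for section in HIT_SHARE:
    print(
        f"{section:9s} true {HIT_SHARE[section]:.0%}  "
        f"seen on survivors {observed[section]:.0%}  "
        f"among lost planes {lost[section]:.0%}"
    )
```

With these made-up inputs the engine looks like the least-hit section on the returning planes, even though it accounts for most of the planes that never came back. The sections with the fewest holes on the survivors are the ones to worry about.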

The US Army Air Forces suffered over 88,000 casualties during the Second World War. Without Wald’s research this would undoubtedly have been higher. His insight continues to resonate to this day and has become an issue in clinical research, financial markets and the people we choose to look up to.

Survivorship Bias in Clinical Research

In 2010 in Boston, Massachusetts a trial was conducted at Harvard Medical School and Beth Israel Deaconess Medical Center (BIDMC) into improving patient survival following trauma. A major problem following trauma is if the patient develops abnormal blood clotting or coagulopathy. This hinders them in stemming any bleeding they have and increases their chances of bleeding to death. Within our blood are naturally occurring proteins called factors which act to encourage blood clotting. The team at Harvard and BIDMC investigated whether giving trauma patients one of these factors would improve survival. The study was aimed at patients who had received 4-8 blood transfusions within 12 hours of their injury. They hoped to recruit 1502 patients but abandoned the trial after recruiting only 573.

Why? Survivorship bias. The trial only included patients who survived their initial accident and then received care in the Emergency Department before going to Intensive Care with enough time passed to have been given at least 4 bags of blood. Those patients who died prior to hospital or in the Emergency Department were not included. The team concluded that due to rising standards in emergency care it was actually very difficult to find patients suitable for the trial. It was therefore pointless to continue with the research.

This was not the only piece of trauma research to report survivorship bias. Does this matter? Yes. Trauma is the biggest cause of death worldwide in people under 45. About 5.8 million people die worldwide due to trauma each year. That’s more than the annual total of deaths due to malaria, tuberculosis and HIV/AIDS. Combined. Or, to put it another way, one third of the total number of deaths in combat during the whole of the Second World War. Every year. Anything that impedes research into trauma has to be understood. Otherwise it costs lives. But 90% of injury deaths occur in less economically developed countries. Yet we perform research in Major Trauma Units in the West. Survivorship bias again.

As our understanding of survivorship bias grows so we are realising that no area of Medicine is safe. It clouds outcomes in surgery and anti-microbial research. It touches cancer research. Cancer survival rates are usually expressed as 5 year survival: the percentage of patients alive 5 years after diagnosis. But this doesn’t include the patients who died of something other than cancer and so may be falsely optimistic. However, Medicine is only a part of the human experience survivorship bias touches.

Survivorship Bias in Financial Markets & our Role Models

Between 1950 and 1980 Mexico industrialised at an amazing rate achieving an average of 6.5% growth annually. The ‘Mexico Miracle’ was held up as an example of how to run an economy as well as encouraging investment into Latin American markets. However, since 1980 the miracle has run out and never returned. Again, looking only at the successes and not the failures can cost investors a lot of money.

Say I’m a fund manager and I approach you asking for investment. I quote an average of 1.8% growth across my funds. Sensibly you do your research and request my full portfolio:


It is common practice in the fund market to only quote active funds. Poorly performing funds, especially those with negative growth, are closed. If we only look at my active funds in this example then yes, my average growth is 1.8%. You might invest in me. If however you look at all of my portfolio then actually my average performance is -0.2% growth. You probably wouldn’t invest then.
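The arithmetic behind this is simple enough to sketch in a few lines of Python. The growth figures below are invented purely for illustration, chosen so that they happen to give the same two averages quoted above. They are not the actual portfolio.

```python
from statistics import mean

# Hypothetical annual growth figures (%), made up for this sketch.
# They are NOT the real portfolio; they were picked only so that the
# averages match the 1.8% and -0.2% quoted in the text.
active_funds = [2.5, 1.0, 3.0, 0.7]       # funds still open, so still quoted
closed_funds = [-3.0, -1.5, -2.3, -2.0]   # poor performers, quietly closed

# The figure in the sales pitch: survivors only.
quoted_average = mean(active_funds)

# The figure you actually care about: every fund that was ever run.
true_average = mean(active_funds + closed_funds)

print(f"Average of active funds only: {quoted_average:.1f}%")
print(f"Average of all funds:         {true_average:.1f}%")
```

Nothing about the manager's skill changes between the two lines. The only difference is which funds are allowed into the average.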

Yet survivorship bias also has a slightly less tangible effect on modern life. How often is Mark Zuckerberg held up as an example for anyone working in business? We focus on the one self-made billionaire who dropped out of education before making their fortune and not the thousands who followed the same path but failed. A single actor or sports star is used as a case study on how to succeed and we are encouraged to follow their path, never mind that many who do so fail. Think as well about how we look at other aspects of life. How often do we look at one car still in use after 50 years or one building still standing after centuries and say, “we don’t make them like they used to”? We overlook how many cars or buildings of a similar age have now rusted or crumbled away. All of this is the same thought process that went through the minds of the US Military as they counted bullet holes in their planes.

To the victor belong the spoils but we must always remember the danger of only looking at the positive outcomes and ignoring those often invisible negatives. We must be aware of the need to see the whole picture and notice when we are not. With our appreciation of survivorship bias must also come an appreciation of Abraham Wald. A man whose simple yet profound insight shows us the value of stepping back and thinking.

Thanks for reading

- Jamie

The Most Famous Case Report Ever

Case reports are nothing new. We’ve all told colleagues about interesting cases we’ve seen. I’ve presented a couple at RCEM. They tend to focus on the weird and wonderful, cases with surprising twists and turns but with limited actual learning. That’s why case reports are at the bottom of the table when it comes to levels of evidence. However, one in particular could be said to have marked a turning point in modern medical practice.

The Morbidity and Mortality Weekly Report (MMWR) has been published weekly by the Centers for Disease Control and Prevention (CDC) since 1950. Each week they release public health information, possible exposures, outbreaks and other health risks for health workers to be aware of. One case report in particular stands out from their entire back catalogue. It was written by various doctors from the University of California, Los Angeles and Cedars-Mt Sinai Hospital, Los Angeles. It was published on June 5th 1981:

The MMWR June 5th 1981

Reported by MS Gottlieb, MD, HM Schanker, MD, PT Fan, MD, A Saxon, MD, JD Weisman, DO, Div of Clinical Immunology-Allergy, Dept of Medicine, UCLA School of Medicine; I Pozalski, MD, Cedars-Mt. Sinai Hospital, Los Angeles; Field Services Div, Epidemiology Program Office, CDC.

Pneumocystis pneumonia (PCP) is a rare form of pneumonia caused by the yeast-like fungus Pneumocystis jirovecii. The fungus can live in the lungs of healthy people without causing any problems, so to see it causing pneumonia in 5 otherwise healthy young people (the oldest was 36) was odd.

Less than a month later the MMWR published further cases of PCP as well as Kaposi sarcoma in 26 previously well homosexual men in Los Angeles and New York since 1978. Kaposi sarcoma is a very rare form of cancer previously seen usually in older men of Jewish/Mediterranean descent. Again it was virtually unheard of in young men. It was suspected that something was affecting their immune systems, preventing them from fighting off infections and malignancy.

At the time there were many theories as to what was causing the immune systems of patients to shut down. It was felt that it was linked to the ‘gay lifestyle’ in some way leading to the stigmatising description in the media of GRID (Gay-related immunodeficiency) first used in 1982. By 1983 the disease was linked also to injecting drug users, haemophiliacs who’d received blood products and Haitians. This led to another stigmatising phrase ‘the 4H club’ (Homosexuals, Heroin addicts, Haemophiliacs and Haitians).

In 1982, however, the CDC had actually given it a proper name: ‘Acquired Immune Deficiency Syndrome’ or ‘AIDS’.

The fact it was being transmitted to blood product recipients suggested the cause had to be viral as only a virus could pass the filtration process. In 1983 two rival teams, one American and one French, both announced they had found the virus causing AIDS with ongoing debate as to who got there first. Each team gave it a different name. In 1986 a third one was chosen: ‘Human Immunodeficiency Virus’ or ‘HIV’. By that time the virus had spread not just in America but in Canada, South America, Australia, China, the Middle East and Europe. Since 1981 worldwide more than 70 million people have been infected with HIV and about 35 million people have died of AIDS.

The MMWR of 5th June 1981 is now recognised both as the beginning of the HIV/AIDS pandemic and as the first publication on HIV/AIDS. Although only a case report it shows the value of these publications at the front line. Only by recording and publishing the ‘weird and wonderful’ can we start to share practice, appreciate patterns and spot emergent diseases.

Thanks for reading

- Jamie


We Need to Talk About Kevin: Is Kevin McCallister a Psychopath?

It’s Christmas time, there’s no need to be afraid. At Christmas time we let in light and we banish shade. And we usually sit down to watch a number of Christmas films including one or both of Home Alone (1990) or Home Alone 2: Lost in New York (1992)*. Both films were written and produced by John Hughes, directed by Chris Columbus and star Macaulay Culkin as Kevin McCallister, a young boy left home alone at Christmas in the first film and who ends up in New York in the second. In both he has to defend himself against a couple of bumbling burglars, ‘The Wet Bandits’: Harry (Joe Pesci) and Marv (Daniel Stern). Home Alone remains the highest grossing live action comedy in the US and only lost the worldwide title in 2011 to The Hangover Part II. Both films are firm fixtures for Christmas watching.

Yet, on a recent viewing there was an uneasy question in my mind. Not the sad decline of Macaulay Culkin from childhood star to example for any child who becomes famous or even the fact that the McCallister family would clearly have social services swarming over them. No, this question was about the character of Kevin himself. Beyond all the jokes and slapstick, is Kevin McCallister a psychopath?

I’m not a psychiatrist so I first had to look up the criteria to make a diagnosis. Turns out psychopathy doesn’t really exist anymore as a diagnosis and has been largely replaced by anti-social personality disorder (ASPD). So this changed my question for this musing immediately; does Kevin McCallister fulfil the criteria for ASPD?

As Kevin McCallister is American it seemed right to base any diagnosis on the criteria of the American Psychiatric Association. Fortunately, they publish their diagnostic criteria in the Diagnostic and Statistical Manual of Mental Disorders (DSM), now in its 5th iteration, published in 2013. The DSM defines the essential features of a personality disorder as “impairments in personality (self and interpersonal) functioning and the presence of pathological personality traits.” The DSM 5 details very clear criteria to make a diagnosis of ASPD.

First there needs to be significant impairments in self functioning AND in interpersonal functioning.

DSM defines impairments in self functioning as either identity or self-direction:

a. Identity: Ego-centrism; self-esteem derived from personal gain, power, or pleasure.

b. Self-direction: Goal-setting based on personal gratification; absence of prosocial internal standards associated with failure to conform to lawful or culturally normative ethical behaviour.

Kevin certainly has high self-esteem. In Home Alone he genuinely believes that through his own power he has made his own family disappear. It is a belief that prompts celebration.

In terms of conforming to legal or ethical behaviour, at no point in either film does he seek to tell the police or authorities that he’s home alone or at risk from the Wet Bandits. Indeed, he shoplifts, albeit inadvertently, in the first film and uses his father’s credit card to book into a luxury hotel in the second. He certainly uses his freedom for personal gratification, spending $967 ($1,742.98 in today’s money) on room service alone in Home Alone 2.

So far it seems like he’s ticking the boxes without us even mentioning the vigilante justice. More of that violence later.

On to interpersonal functioning, defined by the DSM as either in empathy or intimacy:

a. Empathy: Lack of concern for feelings, needs, or suffering of others; lack of remorse after hurting or mistreating another.

b. Intimacy: Incapacity for mutually intimate relationships, as exploitation is a primary means of relating to others, including by deceit and coercion; use of dominance or intimidation to control others.

Speaking as a doctor in Emergency Medicine let’s get this straight: Kevin would have killed the Wet Bandits several times over. Especially in the second film where, to start, he throws four bricks on Marv’s head from height. Marv would be dead. No question. Kevin McCallister is attempting murder.

Later on, he sets up elaborate traps and stays around rather than running away (like most would do) merely to supervise Marv getting electrocuted and Harry setting fire to his head.

At no stage does he show any remorse and actually celebrates what he’s doing. Prior to meeting Kevin the Wet Bandits were cartoon villains, non-violent and stupid. Did they deserve to die? Kevin obviously felt it was worth risking it and enjoyed it.

He does form friendships in both films with people he previously feared: Old Man Marley and the Pigeon Lady. He inspires the former to re-connect with his family and gives the latter a present. This does suggest that he can form bonds with people. However, both were useful to him by helping him escape the Wet Bandits so it could be argued he was exploiting and rewarding them for his own benefit. This bit is open to debate but for the benefit of the blog let’s assume this was Machiavellian manipulation and move on.

The patient then needs to have pathological personality traits in antagonism and disinhibition.

The DSM defines antagonism as:

a. Manipulativeness: Frequent use of subterfuge to influence or control others; use of seduction, charm, glibness, or ingratiation to achieve one’s ends.

b. Deceitfulness: Dishonesty and fraudulence; misrepresentation of self; embellishment or fabrication when relating events.

c. Callousness: Lack of concern for feelings or problems of others; lack of guilt or remorse about the negative or harmful effects of one’s actions on others; aggression; sadism.

d. Hostility: Persistent or frequent angry feelings; anger or irritability in response to minor slights and insults; mean, nasty, or vengeful behaviour.

We’ve already looked at Kevin’s violence and cruelty. He’s also certainly a master of deception. Throughout both films he is adept at speaking to adults and painting stories with great ease. Lying comes very easily to him as does using props and music whether to pretend to be his dad in the shower, a gangster with a gun or even a house full of celebrating people.

This is a crafty kid who is willing to lie and smile while doing it.

Disinhibition is defined by the DSM as:

a. Irresponsibility: Disregard for – and failure to honor – financial and other obligations or commitments; lack of respect for – and lack of follow through on – agreements and promises.

b. Impulsivity: Acting on the spur of the moment in response to immediate stimuli; acting on a momentary basis without a plan or consideration of outcomes; difficulty establishing and following plans.

c. Risk taking: Engagement in dangerous, risky, and potentially self-damaging activities, unnecessarily and without regard for consequences; boredom proneness and thoughtless initiation of activities to counter boredom; lack of concern for one’s limitations and denial of the reality of personal danger.

In both films Kevin shows poor regard for his own safety, whether climbing down a rope soaked in kerosene, zip-lining from the roof of his house or climbing his brother’s shelves.

Early on in both films he is shown impulsively lashing out in anger when he feels frustrated at having his pizza eaten or embarrassed in public during his choir solo. Kevin is not inhibited.

And violence is clearly natural to him. So far…seems to be meeting all the criteria.

Finally these factors must be consistent across time and place. They must not be due to intoxication or head injury.

As Kevin behaves the same in Home Alone (1990, set in Chicago) and Home Alone 2 (1992, set in New York) we can assume his behaviour is consistent across time and place. At no point is he seen taking drugs or drinking alcohol so we can rule those out as a cause. He does hit his head slipping on ice in Home Alone 2 but that’s very late on and isn’t shown to affect his behaviour in any way. Once again he’s meeting criteria.

They must not be better understood as “normative for the individual’s developmental stage or socio-cultural environment.”

Kevin is a remarkable child acting in a way we wouldn’t expect of a boy his age. He definitely has, at best, a chaotic family and there’s no doubt that after Home Alone 2 social services would have come down on the McCallister family like a tonne of bricks.

However, the house is pristine and all the children look well nourished and dressed. While there are plenty of questions about what kind of job the McCallisters must do in order to fund this lifestyle there’s no indication that this is a family where violence is the norm. Box ticked again.

Finally, the individual is at least age 18 years.

Ah, here it falls down right at the last. Kevin is 8 in Home Alone and so well below the age where we can diagnose ASPD. NICE does have a Quality Standard (QS59) first published in 2014 aimed at identifying children at risk of ASPD. This includes interventions for the whole family. But in no way can a conclusive diagnosis be made in a child.
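For anyone who likes their logic spelled out, the walk-through above boils down to a checklist where every box has to be ticked. Here is a toy sketch of that in Python. The field names and wording are my own shorthand for the headings used in this blog, not anything taken from the DSM, and this is very much a bit of fun rather than a clinical tool.

```python
from dataclasses import dataclass

MINIMUM_AGE = 18  # the final criterion: the individual is at least 18


@dataclass
class Assessment:
    """Hypothetical record of this blog's walk-through; not a DSM data model."""
    self_functioning_impaired: bool
    interpersonal_functioning_impaired: bool
    antagonism: bool
    disinhibition: bool
    consistent_across_time_and_place: bool
    not_due_to_intoxication_or_head_injury: bool
    not_normative_for_stage_or_environment: bool
    age_years: int


def unmet_criteria(a: Assessment) -> list[str]:
    """Return the criteria that are NOT met; an empty list means all boxes ticked."""
    checks = {
        "impaired self functioning": a.self_functioning_impaired,
        "impaired interpersonal functioning": a.interpersonal_functioning_impaired,
        "antagonism": a.antagonism,
        "disinhibition": a.disinhibition,
        "consistent across time and place": a.consistent_across_time_and_place,
        "not due to intoxication or head injury": a.not_due_to_intoxication_or_head_injury,
        "not normative for stage or environment": a.not_normative_for_stage_or_environment,
        "aged 18 or over": a.age_years >= MINIMUM_AGE,
    }
    return [name for name, met in checks.items() if not met]


# The blog's own verdicts on Kevin: every box ticked except his age (8).
kevin = Assessment(True, True, True, True, True, True, True, age_years=8)
print(unmet_criteria(kevin))
```

Because every criterion has to be met, a single unticked box is enough to rule the diagnosis out, however many of the others are satisfied.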

- - -

So is Kevin McCallister a psychopath? By the DSM diagnostic criteria the answer is no. Does he show some traits that might set off alarm bells? The answer is yes. However, there is a debate about whether ‘psychopathy’ is an evolved trait which has been able to survive through natural selection as it has benefited human society to have individuals without morals prepared to do whatever it takes to achieve their goals (Glenn, Kurzban and Raine, 2011). Maybe we should celebrate Kevin’s innate traits as he uses them to defend himself and his family. After all, it means the bad guys get caught and it wouldn’t really make good films if he just rang the police like a good citizen.

- - -

Time to be serious now. This blog does highlight a big issue I find with mental health. More than with almost any other specialty, people are all too willing to play ‘keyboard psychiatrist’ and diagnose public figures such as Donald Trump with a mental illness. Whilst this blog is meant as a bit of fun it shows how mental health has very clear diagnostic criteria to be met before we use loaded terms such as ‘psychopath’. That can be my Christmas message: before you make a stigmatising diagnosis make sure you know what you’re talking about. In fact in general: research first, speak later. Let’s be nice people.

Merry Christmas, you filthy animals

*Yes, I am aware there is a ‘Home Alone 3’ and even things called Home Alone 4: Taking Back the House and Home Alone: The Holiday Heist. I just choose to ignore them as we all should.


Glenn, A., Kurzban, R. and Raine, A. (2011). Evolutionary theory and psychopathy. Aggression and Violent Behavior, 16(5), pp.371-380.

Hopwood, C., Thomas, K., Markon, K., Wright, A. and Krueger, R. (2012). DSM-5 personality traits and DSM-IV personality disorders. Journal of Abnormal Psychology, 121(2), pp.424-432.