Finding the Truth – methods in biomedical research. II – Observational and epidemiological methods

Types of observational approaches

In this first section there is a brief explanation of some of the observational approaches used by biomedical scientists with simple illustrative examples.

Geographical comparisons

Differences in some aspect of the environment, lifestyle or diet of several populations are related to their death rates or incidence of a particular disease or health-related characteristic. For example, average sugar consumption in populations might be related to a measure of their children’s dental health or levels of tobacco consumption might be related to death rates from lung cancer. One would need to adjust the death rates in the latter study to correct for any differences in the age structures of the population. One could compare deaths within a specific group e.g. males aged 45-50 years or calculate what is known as a standardised mortality ratio (SMR). SMR is an age-corrected measure that allows mortality rates in populations with different age structures to be directly compared.
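As a rough sketch of the arithmetic behind an SMR, the snippet below applies reference death rates for each age band to a study population to get the number of deaths that would be "expected", then compares this with the number observed. All of the figures and names here (`age_bands`, `observed_deaths`) are invented for illustration and do not come from any real study.

```python
# Hypothetical figures for illustration only; none come from a real study.
# Age band -> (reference death rate per 1,000 per year, people in the study population)
age_bands = {
    "45-54": (5.0, 2000),
    "55-64": (12.0, 1500),
    "65-74": (30.0, 1000),
}
observed_deaths = 70  # deaths actually recorded in the study population in one year


def standardised_mortality_ratio(observed, bands):
    """Return the SMR as a percentage: observed deaths / expected deaths x 100."""
    # Expected deaths: apply each reference rate to the matching age band.
    expected = sum(rate / 1000 * people for rate, people in bands.values())
    return 100 * observed / expected


smr = standardised_mortality_ratio(observed_deaths, age_bands)
print(f"SMR = {smr:.0f}")
```

An SMR above 100 means that more deaths were observed than would be expected if the study population had the same age-specific death rates as the reference population.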

Back in 1973, Lillian Gleiberman published a paper in the rather obscure journal Ecology of Food and Nutrition in which she correlated average blood pressure with estimated average daily salt intake in 27 populations around the world. Figure 1 was plotted using some of Gleiberman’s male data and shows a highly significant positive correlation which suggests that around 37% (r²) of the variation in blood pressure can be explained by variation in average salt intake. These findings are consistent with the now widely accepted belief that high salt intake is a causative factor in the development of hypertension. Gleiberman would have been aware of the many limitations in her original study but the data were convincing enough to suggest that the hypothesis that high salt intake is a cause of high blood pressure was worthy of further investigation.
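To make the idea of "variation explained" concrete, here is a minimal sketch that fits a straight line by least squares and reports r² as the share of the variation in blood pressure accounted for by the line. The five data points are made up and are not Gleiberman's; for reference, an r² of 0.37 corresponds to a correlation coefficient r of about 0.61.

```python
# Hypothetical population averages, NOT Gleiberman's data.
salt_g_per_day = [2, 4, 6, 8, 10]
systolic_bp = [110, 125, 118, 135, 130]


def r_squared(xs, ys):
    """Share of the variation in ys explained by a least-squares line on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Least-squares slope and intercept.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Variation left over after fitting the line, and total variation.
    ss_residual = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_residual / ss_total


r2 = r_squared(salt_g_per_day, systolic_bp)
print(f"r squared = {r2:.2f}")
```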

Figure 1           Relationship between average blood pressure and estimated salt intake in 27 male populations around the world.


Anomalous populations

Some population groups may deviate markedly from what is an otherwise strong and fairly consistent trend across different populations. Mortality rates for coronary heart disease tend to be higher in populations consuming higher amounts of animal fats. However, Danish scientists Hans Bang and Jorn Dyerberg reported in the 1970s that traditional Greenland Eskimos (Inuit) had relatively low levels of heart disease despite eating large amounts of animal-derived fats. Most of the fat consumed by these Inuit was of marine origin i.e. from fish and marine mammals like seals and whales. Such observations triggered the current interest in fish oils which differ from the fat of land animals because they are rich in omega-3 fatty acids and relatively low in saturated fats. More recently, Elizabeth Preston has suggested that Dyerberg and Bang used flawed underestimates of the rates of heart disease amongst the Inuit. If, as she suggests, heart disease rates amongst the Inuit were much higher than assumed by Bang and Dyerberg then decades of work in this major research area may have been triggered by flawed data!

Special groups

Some groups have lifestyle characteristics that are very different from the bulk of the population living in the same region. Seventh Day Adventists in the USA tend to abstain from alcohol and tobacco use and about half do not eat meat. They are intensively studied because they also have lower rates of some cancers and heart disease than other Americans, and those Adventists who do not eat meat have different mortality and illness patterns to those who do. Loma Linda University in Southern California is a Seventh Day Adventist educational institute which specialises in the health sciences. Studies conducted at Loma Linda using Adventists have generated many papers about aspects and characteristics of vegetarian diets and health and this information has implications for the population as a whole.

Time trends

Changes in patterns of behaviour or environmental changes can be related to changes in the frequency of specific diseases. A sharp increase in sugar consumption e.g. in an island population when sugar first starts to be imported in bulk might be associated with a sharp increase in rates of tooth decay in the island’s children. The increase in dental disease with the increased use of sugary foods on the remote South Atlantic island of Tristan da Cunha after 1937 is widely used as evidence for the key role of sugar in promoting dental caries. More recently it has been claimed that the introduction of a fluoride supplementation programme in 1982 was associated with a subsequent large increase in the number of caries-free children on the island. Increases in levels of alcohol consumption over time could be related to changes in the number of people affected by liver disease. According to Robert Mann and two colleagues in a 2003 paper, the introduction of prohibition in the USA was associated with a huge decline in death rates from liver cirrhosis from the high levels seen at the beginning of the 20th century. Rates of liver disease rose again after prohibition ended in 1933.

Figure 2 is taken from my previous post on cot death and shows how UK cot death rates rose in the period after 1971 as parents were persuaded to use the front sleeping position for their infants. They peaked in the mid-1980s when evidence that front sleeping increased cot death risk started to emerge and be discussed in the media. Rates edged downwards after peaking in around 1987, then a “back to sleep” campaign began in 1991 and the rate halved in 1992 and has continued to edge down since then.

Figure 2           Cot death rates in England and Wales over the period 1971-2004 (all 5 year intervals except red bars)


Migration studies

When people migrate from one place to another then they are immediately exposed to a new environment and they also tend to gradually adopt aspects of the diet and lifestyle of the native population in their new homeland – this gradual adoption of aspects of the new lifestyle and diet is called acculturation. As migrants acculturate there is also a tendency for their health characteristics and disease patterns to become more similar to those in the new country and to move away from those seen in their native country. This suggests that many of the differences in disease and mortality patterns between countries are due to differences in environment, lifestyle and diet rather than due to genetic differences between races. Ethnic Japanese in the USA have much lower levels of strokes than those in Japan. Salt intakes in Japan were amongst the highest in the world because of the high consumption of salted and pickled foods; this observation is consistent with the suggested causal link between high salt intake and high blood pressure because hypertension is a major cause of stroke. Hypertension, gout and type 2 diabetes were once uncommon amongst Polynesian inhabitants of certain Pacific islands but were much higher in people of Polynesian descent who had migrated to New Zealand (examples from Epidemiology in Medical Practice by the late David Barker and Geoffrey Rose).

Studies upon multiple sclerosis have occupied the attentions of epidemiologists for many years. When people migrate from high incidence areas like the UK to low incidence areas rates tend to fall. This fall seems to be more pronounced in those who migrate before 15 years old than in those who migrate later in life. When people migrate from low to high areas then they tend to retain their low risk. One suggestion is that multiple sclerosis may be triggered by delayed exposure to an infectious agent like the Epstein-Barr virus (causative agent for glandular fever).

In addition to movements between countries one might also look at the health changes that accompany migration within countries e.g. from villages to cities in developing countries. The villages may be typical agricultural communities where peasant farmers are required to do a lot of physical work and where the population eat a high starch diet based upon what they grow themselves. The city life may be a more sedentary wage-based existence where there is a greater reliance upon imported and processed foods with higher levels of fat, sugar and salt. Back in 1969 it was noted by Gerry (AG) Shaper and his colleagues that when Samburu men were recruited into the army from Kenyan villages, there was a rise in their average blood pressure within weeks of recruitment. The village diet was much lower in salt than the army rations and thus these observations are consistent with a link between higher salt intake and higher blood pressure. In a more recent study Professor Neil Porter and colleagues looked at members of the Kenyan Luo tribe who were either living in their villages or had migrated to Nairobi. Once again they found higher blood pressure and evidence of higher salt intake in the migrants.

Cross sectional surveys

When large-scale cross-sectional surveys are conducted then it is possible to look for links or correlations between the variables measured. For example, the National Diet and Nutrition Survey (NDNS) programme is an ongoing series of large surveys of representative samples of the UK population that began more than thirty years ago. These surveys record details of the diets of the sample, make physical measurements (like weight, height and blood pressure), survey their social and behavioural characteristics with a questionnaire and record the results of blood and urine analyses of nutrients and risk markers (like blood cholesterol). In these and similar studies it has been consistently shown that when individuals are put into categories according to their activity level then the proportion who are overweight (BMI 25+) or obese (BMI 30+) rises in the lower activity categories. This is illustrated in figure 3 which was constructed using NDNS data. This is consistent with the hypothesis that inactivity is a cause of excessive weight gain and obesity.

Figure 3           The effect of estimated activity level upon the risk of being overweight or obese.


Despite the findings of Gleiberman in figure 1 above, single population surveys like the NDNS do not usually find clear relationships between individual salt intake and blood pressure (or between saturated fat intake and blood cholesterol). This is thought to be because:

  • Many other factors like activity, body weight and alcohol intake affect blood pressure.
  • Genetics affects blood pressure and individuals may vary considerably in their genetic susceptibility to salt.
  • The measure of salt intake often used is a “24 hour snapshot” that is a poor measure of habitual salt intake.

Case-control studies

In case-control studies, the exposure to a suspected causative factor is compared in matched groups of people with or without a disease or disease indicator. For example, a sample of people diagnosed with mesothelioma (a normally rare form of lung cancer) could be matched with a sample of similar age, social group, geographical location and sex who are free of the disease. The past workplace exposure to asbestos could be compared in the cases and in the unaffected group (controls). One would almost certainly find evidence of workplace exposure to asbestos in almost all of the cases but in far fewer of the controls because exposure to asbestos causes mesothelioma.

In 1950 Richard Doll and A Bradford Hill reported a study in which 700 people with lung cancer from 20 London hospitals were interviewed and asked about their previous smoking behaviour. Their responses were compared to those of patients with cancers of the stomach and bowel and with cancer-free patients of the same age and sex. They found that lung cancer patients were more likely to have smoked than the subjects without lung cancer. They suggested that above the age of 45, the risk of developing lung cancer in those who smoked more than 25 cigarettes a day might be as much as 50 times as high as those who did not smoke. Even allowing for some inaccuracy in this estimate it would be difficult not to believe that this was probably a causal association.

As another example, one could ask parents about the past sleeping position of infants who have died of cot death (cases) and those who have not died (controls). An expert report published by the UK Department of Health in 1993 reported that 20 studies of this type from different countries had found that use of the front sleeping position was between 2 and 12 times more common among infants who had died of cot death than among those who had not. These results are consistent with the hypothesis that the front sleeping position increases the risk of cot death.
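Case-control results are commonly summarised as an odds ratio, a measure that the text above does not name but which is the usual way of comparing exposure in cases and controls. The sketch below uses invented counts for front sleeping in cases and controls; the function name `odds_ratio` and all of the numbers are assumptions made purely for illustration.

```python
def odds_ratio(cases_exposed, cases_unexposed, controls_exposed, controls_unexposed):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    odds_in_cases = cases_exposed / cases_unexposed
    odds_in_controls = controls_exposed / controls_unexposed
    return odds_in_cases / odds_in_controls


# Hypothetical counts: 100 cot death cases and 100 matched controls.
front_sleeping_or = odds_ratio(
    cases_exposed=60,       # cases usually put to sleep on their front
    cases_unexposed=40,     # cases usually put to sleep in other positions
    controls_exposed=20,    # controls usually put to sleep on their front
    controls_unexposed=80,  # controls usually put to sleep in other positions
)
print(f"Odds ratio = {front_sleeping_or:.1f}")
```

An odds ratio of 1 would mean that front sleeping was equally common in cases and controls; values well above 1 are what one would expect if front sleeping increases risk.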

When such studies are used to try to identify dietary links to disease, it must be borne in mind how difficult it is to get reliable estimates of even current nutrient intakes. Trying to assess past diets increases that difficulty very substantially and means that these estimates are almost inevitably prone to substantial error and uncertainty. Current diets, especially in people suffering from a serious disease (cases), may be a poor guide to what the diet was like when the disease was initiated. People are much more likely to give an accurate indication of, for example, their past smoking habits, past employment and the usual sleeping position of their infants than to give an accurate estimate of their past dietary intakes.

Cohort studies

Initial measurements or other data are recorded from a large sample or cohort of people; these might be lifestyle characteristics (e.g. tobacco use, activity level, alcohol consumption), dietary habits (e.g. dietary fibre, saturated fat or salt intake) or risk factor levels (e.g. body mass index, blood cholesterol level or blood pressure). These individuals are then monitored over a period of years and the occurrence of serious illnesses and deaths from particular causes is recorded. One might then relate disease or death rates to the initial measurements. Thus, for example, one might relate the risk of having a (fatal) heart attack to the initial blood cholesterol level and show that high blood cholesterol level is strongly associated with increased risk. One might relate initial activity and physical fitness indicators to the risk of becoming overweight or obese in the following years e.g. do children who watch the most hours of television have a higher risk of becoming overweight or obese?

In order to get a large enough number of cases or deaths from a particular disease to perform statistical analysis, the initial sample size often needs to be large (maybe thousands or tens of thousands of subjects) and these subjects need to be monitored for several years or even decades. For example, in order to get 150 cases of colon cancer in a 5 year study, one would need a sample of 100,000 middle-aged northern Europeans. Even a doubling of risk in a large cohort study, especially of a relatively uncommon disease, may represent just a handful of “extra” cases and so the possibility that it is due to bias or the residual effects of confounding variables seems easier to accept. An increase from 5 to 10 cases per 10,000 over 5 years is a doubling of the risk (relative risk = 2).
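The arithmetic in this paragraph can be sketched in a few lines. The figures are the ones quoted above (5 versus 10 cases per 10,000 and 150 cases in 100,000 people over 5 years); the function and variable names are simply illustrative choices.

```python
def relative_risk(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """Risk in the exposed group divided by risk in the unexposed group."""
    return (cases_exposed / n_exposed) / (cases_unexposed / n_unexposed)


# 10 cases per 10,000 in one group versus 5 per 10,000 in the other, over 5 years.
rr = relative_risk(10, 10_000, 5, 10_000)
extra_cases_per_10000 = 10 - 5

# 150 colon cancer cases among 100,000 people followed for 5 years,
# expressed as new cases per 100,000 people per year.
cases, people, years = 150, 100_000, 5
annual_cases_per_100000 = cases * 100_000 / (people * years)

print(f"Relative risk = {rr:.1f}")
print(f"Extra cases per 10,000 over 5 years = {extra_cases_per_10000}")
print(f"Implied incidence = {annual_cases_per_100000:.0f} per 100,000 per year")
```

The same relative risk of 2 can therefore sit alongside a very small absolute difference: here only 5 additional cases in every 10,000 people over 5 years.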

Cohort studies are often expensive and labour intensive and may take several years to generate any useful data but they are regarded as the most powerful of the observational/epidemiological methods.

Another paper about smoking and lung cancer by Richard Doll and A Bradford Hill published in 1956 is a classic example of an early cohort study. Around 40,000 British physicians filled in questionnaires about their current and previous smoking habits. Doll and Hill then recorded the number and causes of death amongst this sample for the next four and a half years. In men over 35 years, the death rate from lung cancer was 13 times higher in current smokers than in non-smokers and increased progressively with the amount smoked (see figure 4). This 13 fold increase in risk in smokers compared to non-smokers is called the relative risk and the relative risk in the highest smoking category was 24!

Figure 4           The relationship between tobacco usage and death rate from lung cancer (data of Doll and Hill, 1956). 1 g of tobacco is about 1 cigarette


A number of famous cohort studies have recruited very large samples and have followed these subjects for decades. The Framingham study based in the town of that name in Massachusetts, USA started in 1948 with an initial sample of 5,000 adults. The Nurses’ Health Study was started in 1976 when 120,000 married American female nurses were recruited and asked to fill in a lifestyle questionnaire; it was initially aimed at investigating the possible adverse effects of oral contraceptive use. Both studies are still ongoing and have been widened to include additional measurements and new cohorts over the years. The European Prospective Investigation into Cancer and Nutrition (EPIC) recorded detailed information about diet and lifestyle, took physical measurements and collected and analysed blood samples from well over 500,000 people living in 23 separate locations in 10 different European countries. These dietary and lifestyle characteristics were then related to the subsequent risk of developing cancer. These cohort studies and others like them have identified or confirmed many associations between aspects of lifestyle and diet and disease risk.

Back in 1980, a cohort study was begun using 18,000 middle-aged civil servants (the Whitehall Study). A questionnaire was administered on a Monday morning that asked about the amount and intensity of activity engaged in over the preceding week-end. Jerry Morris and his colleagues reported that, over the course of the next 8 years, those who had reported engaging in vigorous leisure time activity had only about half the incidence of heart attacks of those who had not, even when allowance was made for the effects of smoking. This is consistent with a hypothesis that vigorous leisure time activity reduces the risk of coronary heart disease.

Association in observational studies does not prove cause and effect

All of these observational, epidemiological methods discussed above are used to demonstrate association between two variables such as a dietary or lifestyle characteristic and risk of a disease. The most important limitation of this epidemiological type of research is that demonstration of an association between two variables does not prove cause and effect. This flaw remains no matter how many subjects are used.

When two variables A and B are found to be associated then there are several possible explanations of this association. It could be that A causes B or that B causes A. So looking back at figure 3, the association between inactivity and obesity risk could mean that inactivity is a cause of obesity or it could be that overweight and obese people find exercise more stressful and unpleasant and so curtail their activity. If one could show that activity level in lean people predicted their future risk of becoming obese then this would strengthen the hypothesis that inactivity causes weight gain.

It may be that some third variable C is linked to both B and A, but that A and B are not causally linked. In this case, C would be called a confounding variable and the association between A and B is due to both of them being related to C.  It is also possible that the apparent association between A and B is just a chance observation or caused by bias in the study especially if it is only a relatively weak association.

If one found a relatively weak but significant association between the amount of alcohol consumed and the risk of developing lung cancer then this might mean that alcohol is a direct contributory cause of lung cancer. Smoking is now known to be a major cause of lung cancer and so an alternative explanation is that both alcohol consumption and lung cancer risk are independently linked to this third variable, the level of exposure to cigarette smoke. Perhaps people who drink a lot of alcohol are also more likely to smoke or have been more exposed to passive cigarette smoke e.g. by spending time in smoke-filled bars or rooms. In this case, the association between alcohol consumption and risk of lung cancer could be because heavy drinkers get more exposure to carcinogenic cigarette smoke than light or non-drinkers. Both alcohol and lung cancer risk are positively linked to this third, confounding variable, the level of exposure to cigarette smoke.
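A small worked example may help to show how a confounder can manufacture an association. In the invented cohort below, lung cancer risk depends only on smoking, but heavy drinkers are more likely to be smokers. Comparing all heavy drinkers with all light drinkers gives a raised relative risk, yet within smokers and within non-smokers the relative risk is exactly 1. Every count here is hypothetical.

```python
# Hypothetical cohort: (lung cancer cases, number of people) in each group.
cohort = {
    "smokers": {"heavy_drinkers": (80, 800), "light_drinkers": (20, 200)},
    "non_smokers": {"heavy_drinkers": (2, 200), "light_drinkers": (8, 800)},
}


def risk_ratio(exposed, unexposed):
    """Risk in the exposed group divided by risk in the unexposed group."""
    exposed_cases, exposed_people = exposed
    unexposed_cases, unexposed_people = unexposed
    return (exposed_cases / exposed_people) / (unexposed_cases / unexposed_people)


# Crude comparison: ignore smoking and pool everyone by drinking category.
heavy_total = tuple(
    sum(group["heavy_drinkers"][i] for group in cohort.values()) for i in (0, 1)
)
light_total = tuple(
    sum(group["light_drinkers"][i] for group in cohort.values()) for i in (0, 1)
)
crude_rr = risk_ratio(heavy_total, light_total)

# Stratified comparison: look at smokers and non-smokers separately.
stratum_rr = {
    name: risk_ratio(group["heavy_drinkers"], group["light_drinkers"])
    for name, group in cohort.items()
}

print(f"Crude relative risk for heavy drinking = {crude_rr:.2f}")
for name, rr in stratum_rr.items():
    print(f"Relative risk within {name} = {rr:.2f}")
```

In this made-up cohort the whole of the apparent effect of alcohol comes from the uneven distribution of smokers between the two drinking categories.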

Some examples of associations unlikely to be due to cause and effect:

  • The more firemen sent to a fire the more damage is done – could be wrongly used to imply that the firemen make the fire worse but a more likely explanation is that more firemen are sent to deal with the worse fires (effect and cause).
  • People who visit their doctor most regularly have the worst health records.
  • Figure 5 is a graph showing a very strong negative correlation between the number of fresh lemons imported into the USA from Mexico and the number of fatalities in road accidents in the USA over a five-year period. Surely this cannot be cause and effect or even effect and cause? Figure 5 suggests that both highway fatalities and lemon imports happened to be falling at the same time and so they appear to be related i.e. time would be a confounding variable in this instance.
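The lemon example in the last bullet can be imitated with invented numbers. In the sketch below, two series that both drift steadily over time are strongly correlated, but the correlation vanishes when one looks at year-to-year changes instead of the raw levels. The figures are hypothetical and are not the data plotted in figure 5; taking first differences is just one simple way of removing a shared time trend.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / sqrt(spread_x * spread_y)


def first_differences(values):
    """Change from each year to the next."""
    return [later - earlier for earlier, later in zip(values, values[1:])]


# Hypothetical yearly figures, NOT the data behind figure 5.
lemon_imports = [10, 12, 16, 18, 22]  # rising over five years
road_deaths = [50, 48, 46, 42, 38]    # falling over the same five years

r_levels = pearson_r(lemon_imports, road_deaths)
r_changes = pearson_r(
    first_differences(lemon_imports), first_differences(road_deaths)
)

print(f"Correlation of yearly levels  = {r_levels:.2f}")
print(f"Correlation of yearly changes = {r_changes:.2f}")
```

With these made-up numbers the yearly levels correlate at about -0.98, while the year-to-year changes do not correlate at all.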

Figure 5           The correlation between lemon imports from Mexico and the fatality rate on US highways between 1996 and 2000


When people design and analyse epidemiological studies, they try to correct for the effects of confounding variables and so if any hypothetical association between alcohol consumption and lung cancer risk was corrected for the effects of smoking then the association would certainly diminish and might well disappear completely. The problem is that it may be very difficult in some studies to know what all of the likely confounding variables are. For example, there are many suspected risk factors for coronary heart disease so to look at the impact of one of these separated from all of the others is almost impossible. It may also be difficult to accurately allow for the effect of some confounding variables especially where it is difficult to measure the confounding variable accurately. For example, variation in activity level and physical fitness may be a likely confounding variable in many studies looking at lifestyle or dietary factors as possible causes of disease. It is very difficult to measure activity level with any precision or certainty and many epidemiological studies in the past have not attempted to correct for this potential confounder. So, for example, if one were looking at the association between a dietary variable and heart attack risk one would try to correct for things like social class, smoking behaviour and other dietary variables but many past studies would not have measured or corrected for physical activity levels. The choice of which potential confounding variables to correct for and how this correction is done may determine whether the association under test remains statistically significant or not (see multiple modelling in an earlier post).
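One standard way of "correcting" a relative risk for a single measured confounder is the Mantel-Haenszel pooled estimate, which combines the comparisons made within each level of the confounder. The post does not say which correction methods the studies it describes used, so this is offered only as one illustration of the idea, with invented counts for the alcohol, smoking and lung cancer example.

```python
# Hypothetical strata of the confounder (smoking status). Each entry holds:
# (cases in heavy drinkers, number of heavy drinkers,
#  cases in light drinkers, number of light drinkers)
strata = {
    "smokers": (90, 800, 20, 200),
    "non_smokers": (3, 200, 8, 800),
}


def crude_risk_ratio(tables):
    """Relative risk ignoring the confounder (all strata pooled together)."""
    exposed_cases = sum(t[0] for t in tables.values())
    exposed_people = sum(t[1] for t in tables.values())
    unexposed_cases = sum(t[2] for t in tables.values())
    unexposed_people = sum(t[3] for t in tables.values())
    return (exposed_cases / exposed_people) / (unexposed_cases / unexposed_people)


def mantel_haenszel_risk_ratio(tables):
    """Mantel-Haenszel pooled relative risk across the strata."""
    numerator = 0.0
    denominator = 0.0
    for exposed_cases, exposed_people, unexposed_cases, unexposed_people in tables.values():
        stratum_size = exposed_people + unexposed_people
        numerator += exposed_cases * unexposed_people / stratum_size
        denominator += unexposed_cases * exposed_people / stratum_size
    return numerator / denominator


crude_rr = crude_risk_ratio(strata)
adjusted_rr = mantel_haenszel_risk_ratio(strata)
print(f"Crude relative risk    = {crude_rr:.2f}")
print(f"Adjusted relative risk = {adjusted_rr:.2f}")
```

Note that this only adjusts for the confounder that was measured and stratified on; an unmeasured or poorly measured confounder would still distort the adjusted figure.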

Many reports of associations in scientific papers say that the association remained after correction for a variety of likely confounding variables. However, there is no statistical magic wand that precisely corrects for all of the possible confounders. The process of correction is an imprecise, sometimes very imprecise, process especially if the measurement of the confounder is only a crude or unreliable estimate. The choice of which variables to correct for is a decision of the authors who may be limited by data availability. In many studies there may be no data on one or more potentially confounding variables because the data was collected for a different purpose e.g. as part of large national surveys intended to monitor the health, diet and general well-being of the population.

Re-demonstrating existing, well-documented associations with ever larger numbers of subjects does not really help in deciding whether the association is causal unless better information on likely confounding variables is also available.

Criteria for establishing cause and effect

There are several characteristics of an epidemiological association that increase the likelihood of the association being causal. These tests of causality are sometimes known as the Bradford Hill Criteria or Hill’s criteria for causation and were originally set out in the 1960s by the English epidemiologist Austin Bradford Hill.

Strength of the association

The stronger an association is, the more likely it is to be due to cause and effect rather than due to some unforeseen confounding variable. Thus the 13 fold increase in lung cancer death rate in smokers compared to non-smokers found by Doll and Hill in 1956 is highly likely to be due to a cause and effect relationship i.e. smoking is an important cause of lung cancer. Many of the associations between dietary characteristics and disease that regularly attract media attention have relative risks that are much lower than this: often in the range of 1-2, or even 1-1.5. It is much more likely that some of these may be caused by bias or due to the residual effects of some confounding variable(s).

The odd but strong association shown in figure 5 demonstrates that strength of association is not an infallible guide; the negative association between lemon imports and road fatalities in the USA is close to perfect and yet seems almost certainly not due to cause and effect.

Temporality and reversibility

The suspected cause must precede its effect. For example, any rise in tooth decay in an island population must occur shortly after supplies of sugar start arriving on the island. The repeal of prohibition in the USA in 1933 was, after a lag period, followed by a rise in the levels of alcoholic liver diseases as alcohol consumption rose.

One could also include an extra dimension to this criterion, reversibility; if reduced exposure to the suspected cause is associated with reduced incidence of the disease then it is more likely to be a causal association. If sugar causes tooth decay then interruption of supplies to an island (e.g. during wartime) should be followed by a decrease in rates of tooth decay. Epidemiological evidence from case-control and cohort studies suggested that front sleeping increased the risk of cot death in infants and led to big increases in the incidence of cot death in the UK in the 1970s and early 1980s. A “back to sleep” campaign to persuade parents to let babies sleep on their backs was started in the UK in 1991 and in 1992 the rate of cot deaths in the UK was halved (see figure 2).

Health promotion initiatives are based upon this belief in reversibility, for example:

  • If smokers give up smoking their risk of developing lung cancer will diminish over time
  • If heavy drinkers moderate their alcohol consumption then their risk of liver disease will diminish
  • If high blood cholesterol levels are reduced then the risk of cardiovascular disease will fall.

Of course, the expectation of reversibility has to be qualified and can only be expected if the better behaviour starts before irreversible damage has been done; it is too late to give up smoking when advanced lung cancer has been diagnosed.

Specificity of variables

A single specific cause has a single specific effect. The more specific an association between a factor and an outcome the more likely it is that the factor is causative. Correction for confounding variables should not eliminate the association but as already noted decisions about which confounders to correct for and how that correction is done may affect whether the association under test remains significant or not.

Is it graded or dose dependent?

There is usually a graded effect of true causes rather than a threshold effect. If the population of the UK is classified into several different activity categories, then levels of obesity and overweight increase progressively with decreasing activity as shown in figure 3. In figure 4 there is a very clear progressive increase in lung cancer mortality as the “dose” of tobacco increases.

Consistency of the findings

If an association is found in a variety of studies using different investigative approaches then it is more likely to be due to cause and effect. Many different types of study indicate a link between smoking and lung cancer and it is now almost universally accepted that this association is causal.

Several different types of epidemiological studies suggest that there is a negative association between intake of dietary antioxidants and risk of cancer and heart disease. Despite this, clinical trials of several antioxidant supplements in adults have failed to demonstrate any holistic benefits of taking them and they are more likely, in these trials, to do net harm than net good (see previous post on antioxidants).


Plausibility

An association is more likely to be due to cause and effect if there is a plausible explanation or mechanism which is supported by laboratory studies. A causal link between inactivity and excessive weight gain is readily explicable because inactivity reduces energy expenditure making it more likely that energy intake will exceed expenditure and the surplus energy stored as extra body fat. A note of caution is warranted here; scientists are very good at producing intellectually satisfying mechanisms to explain how any particular association could be due to cause and effect. Sometimes equally plausible mechanisms can be produced to explain the opposite finding. The most plausible and intellectually satisfying explanations are not always those that prove to be correct.


Coherence

The suggested cause and effect relationship should not conflict with other relevant knowledge whether it is other epidemiological evidence or the results of experiments. In the antioxidant case-study there is just such a conflict:

  • Epidemiological evidence of several types indicates that high antioxidant intakes are associated with lower mortality and reduced risk of cancer and cardiovascular disease
  • Several large controlled trials of some antioxidant supplements have found no benefits or, in some cases, increased mortality in the supplemented group.


Analogy

The case is strengthened if the proposed cause and effect relationship is analogous to another known cause and effect relationship. It is well known that type 2 diabetes is the result of decreased tissue sensitivity to insulin rather than primarily due to a failure of insulin production. In most obese people there are high levels of leptin, a hormone produced by adipose tissue that is thought to regulate body fat stores by reducing food intake and perhaps increasing energy expenditure. By analogy perhaps most obesity is driven by reduced leptin sensitivity rather than a failure of production.

Experimental evidence

Evidence from whatever controlled experiments are possible should also be consistent with the cause and effect hypothesis. This is clearly not the case with the antioxidant example discussed earlier.

