Sir Cyril Burt (1883-1971) and the heritability of intelligence debate

“Everything about the man – his fine sturdy appearance: his aura of vitality; his urbane manner; his unflagging enthusiasm for research, analysis and criticism; even such a small detail as his firm meticulous handwriting; and, of course, especially his notably sharp intellect and vast erudition – all together leave a total impression of immense quality, of a born nobleman.”

Professor Arthur Jensen, from an obituary of Cyril Burt, 1972

This is a case study from my unpublished book about error and fraud in science. It is the longest case study in the book because, in addition to Cyril Burt’s fraudulent work, it discusses many other examples of fraudulent or very biased research that contributed to the belief that intelligence is a largely inherited characteristic, only modestly affected by environmental influences. This belief provided the scientific justification for the eugenics movement and for forced sterilisation programmes in the USA, parts of Europe and ultimately Nazi Germany; it also influenced educational and immigration policies.

Burt Overview

Cyril Burt was an English educational psychologist who became Professor and Head of the Department of Psychology at University College, London (1932-1950), one of Britain’s foremost educational and research establishments. He wrote many papers, reports and books and was for many years editor of what is now called the Journal of Mathematical and Statistical Psychology. In 1942 he was elected President of the British Psychological Society, and in 1946 he became the first psychologist to be knighted for services to psychology. In 1968 he became the first non-American to be awarded the prestigious Thorndike prize by the American Psychological Society for his contributions to the field of educational psychology.

Burt was a major figure in educational psychology in Britain and has even been considered the father of the discipline in the UK. He was a strong supporter of the view that intelligence is largely the result of genetic make-up and that environmental factors have a relatively small influence. He subscribed to the notion that general intelligence is an essentially innate and measurable entity. His published data and his writings helped to create a climate that led to the establishment of the eleven plus examination system in England, in which children were graded at around eleven years of age and sent to different types of school depending upon their performance in the examination. Those who performed well in this 11+ exam went to academically focused grammar schools; those who performed less well went to secondary modern schools, which were less academic and supposedly more practical in focus. It was an implicit assumption of this system that the academic abilities of children were essentially fixed and measurable at this age, so that the less academically able children could be channelled into practical jobs and trades rather than those requiring higher academic abilities. Burt used his research to argue the case for continuing selective education in England, but in 1969, two years before his death, the national eleven plus examination was abolished and almost entirely replaced by the current non-selective comprehensive system, although the old selective system persists in specific localities. His work and views were also influential in the USA, where they were used to argue that because IQ was largely determined by genetics, diverting large amounts of resources to improving the education of poor black and white children was unlikely to be effective.

Early in his career Burt published studies which seemed to indicate that there was a strong correlation between the intelligence of parents and that of their offspring. He also published a series of papers, ending in 1966, in which he looked at the IQs of identical twins who were reared apart, i.e. twins with identical genetic make-up but subject to different environmental influences. Burt compared the correlation between the IQs of twin pairs reared together with that of pairs reared apart in order to test the influence of a different environment on the latter group. He reported a very high correlation between the IQ scores of twins reared apart which, although lower than that for twins reared together, was nonetheless high enough for him to claim that IQ was 75-80% determined by inheritance. Several of his most controversial and most quoted papers were published towards the end of his life. In 1961 he published a classic paper in the British Journal of Statistical Psychology entitled Intelligence and Social Mobility. In this paper he states that the correlation between the IQs of fathers and sons is only about 0.5 and that the correlation between social class (judged by parental occupation) and intelligence is considerably lower in children than in adults. He argues that if the distribution of IQ within social classes remains constant then a considerable amount of social mobility must be taking place, i.e. the most intelligent of the lower class children move into a higher class and those with the lowest IQs in the higher classes move down, and he presents data that seem to confirm his hypotheses. In one of his last papers, published in The Irish Journal of Education in 1969, he argued that measured intelligence and certain scholastic abilities (reading, spelling and arithmetic) were worse in England in 1965 than they had been in 1914.

This paper includes a table which charts the scores of children for intelligence and five separate tests of these key scholastic abilities at seven points between 1914 and 1965. It shows a very substantial decline in the scores for all the scholastic tests between 1914 and 1945 and a small drop in intelligence score; there is only partial recovery of these scores between 1945 and 1965. All of his work on intelligence seemed to support his view that nature (genetics) was far more important in determining IQ than environment (nurture).
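
To make the twin-study reasoning concrete: the correlation between the scores of identical twins reared apart is commonly read as a rough estimate of the inherited share of the variation, on the assumption that the separated twins' environments are unrelated. The following is a minimal Python sketch of that calculation. The scores are invented purely for illustration and are not Burt's data, and `pearson_r` is a hypothetical helper name.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Invented IQ scores for six hypothetical twin pairs (NOT real data).
twin_a = [95, 102, 110, 88, 120, 99]
twin_b = [97, 100, 108, 92, 115, 104]

r = pearson_r(twin_a, twin_b)
print(f"correlation between co-twins: {r:.2f}")
```

The interpretation only holds if the separated twins really did grow up in unrelated environments, which is exactly the kind of claim that was later questioned in Burt's papers.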

Soon after Burt’s death, Professor Leon Kamin, in a 1974 book entitled The Science and Politics of IQ, produced compelling evidence which suggested that Burt’s data on identical twins were either fabricated or improperly manipulated, although Kamin never actually made a direct accusation of research fraud. This theme was taken up by the journalist Oliver Gillie, who wrote an article published in the Sunday Times in October 1976 in which he openly suggested that Burt had faked his twin data and invented two of his research collaborators. Kamin and others have also suggested that much of the data Burt published after his retirement in 1950 was similarly manipulated, fabricated or selected to support his point of view. For example, Professor Donald Dorfman of the University of Iowa published a paper in Science in 1978 which made a very persuasive case that the data in Burt’s 1961 paper mentioned above had been fabricated. Much of Burt’s early work on the heritability of IQ has also been suggested to be fraudulent, or so poorly executed and/or described as to be of little scientific value. Kamin summed up his conclusions on Burt’s data thus:

“The numbers left behind by Professor Burt are simply not worthy of our current scientific attention.”

In recent years there have been several attempts to rehabilitate Burt or at least to give him the benefit of the doubt. Many of these seem to me like attempts by an expensive lawyer to give a sympathetic jury an excuse for acquittal on the grounds of reasonable doubt when presented with an offender who seems to have been caught red-handed. Whilst it is just about feasible that any single accusation against him might be considered not proven in a criminal court, there are so many accusations in this category that, taken together, the case for him either fabricating or dishonestly manipulating his data is, in my opinion, overwhelming. His rehabilitators have attributed some of the flaws in his data or its presentation to carelessness or to the memory lapses of an old man. It has been argued that the lack of rigour in his experimental design, execution and reporting of data was typical of a different era, but this does a disservice to the many scientists of past generations who collected and reported their data in exemplary fashion. Others seem to be implicitly suggesting that because, for example, his twin study data agree quite closely with those produced by some other groups, they should be accepted as valid even though there are impossibilities in the statistics and probably impossible claims about the environments in which the separated twins were brought up. The underlying message seems to be that even though he may have fiddled his data he probably got approximately the right answer! Two of his former PhD students, Professor Alan Clarke and his wife Ann (née Gravely), who were probably the first to raise doubts that some of Burt’s data “appear suspiciously perfect”, summed up his scientific career when they concluded that Burt was:

“Either a fraudulent scientist or a fraud as a scientist”

I think it would be quite reasonable to remove the “either” from this quotation and to replace the “or” with “and”.

The Social and Scientific Context for Burt’s work

General acceptance of the heritability of intelligence

Burt’s work on the heritability of intelligence and his reports of large differences in IQ between social classes provided powerful ammunition to those, particularly in the USA, who believed that certain racial groups, the lower social classes and even women were intellectually inferior to white men of (Northern) European origin. Burt’s assertions and data seemed to show that this was largely a consequence of their genetic make-up. Many (most?) 19th and early 20th century scientists believed this to be a largely unalterable state of affairs. Some of the more liberal may have openly wished that this were not the case, but this was what scientific data had conclusively shown that God or, in later years, Evolution had produced. Some of the more ardent believers in this view of the heritability of intelligence tried to apply it to policies on education, immigration and even control of human reproduction, and they sometimes succeeded in generating political programmes that seem outrageous by modern standards, e.g. the passing of laws permitting the forced sterilisation of criminals and the “feeble-minded”.

From measuring heads and brains to measuring IQ

Scientific measurements of various different types were used over the years to support the view that intelligence was largely inherited leading to inevitable and largely unalterable differences between races, social classes and the sexes. For example:

  • Measurements of skull capacity as a proxy measure of brain size either in living or dead subjects
  • Measuring brain weight directly at autopsy
  • Measuring the size of parts of the brain thought to be involved in higher intellectual function relative to those involved in more basic animal functions, and finally
  • The measurement of intellectual ability with so-called IQ tests.

This whole field is the subject of a famous book by Stephen Jay Gould entitled The Mismeasure of Man in which numerous examples are discussed and analysed in considerable detail and some of them are summarised here.

Let us make the very dubious assumption that the size of a skull indicates the brain size of the living individual and that this in turn is a reliable proxy measure of the innate intellectual capability of that individual. This was the prevailing view amongst the eminent scientists, doctors and anthropologists of the 19th century. Alfred Binet, the inventor of the IQ test, concluded in 1898 that because measurements had been made on several hundred subjects the correlation between head size and intelligence was incontestable. When he later went on to measure head size in live children, however, he became much less convinced, as he found only very small differences between the brightest and the least intelligent children. Binet also found that when making measurements he tended subconsciously to bias them in favour of the answer he expected, i.e. to slightly overestimate the head volume of the brightest and slightly underestimate that of the least bright; this is why making measurements blind (i.e. not knowing which group is which until after the data have been collected) is now considered such a vital part of good experimental design. (Pierre) Paul Broca was an eminent 19th century anatomist and physician who discovered the area in the brain responsible for speech (known to this day as Broca’s area). He was also a meticulous measurer of heads and brains, and he concluded that the brains of men were larger than those of women, that eminent men had larger brains than those of lesser abilities and that the brains of superior races were larger than those of inferior races. He was in no doubt that brain size and intelligence were strongly linked:

“Other things equal, there is a remarkable relationship between the development of intelligence and the volume of the brain”

If one wanted to use this assumption that skull capacity indicates intellectual ability (which now seems inherently naïve and highly dubious) to objectively compare the intellectual ability of two groups then one would want to ensure that:

  • One had large representative samples of the two groups, and/or
  • Other factors known to affect skull capacity like body size, age or sex were either eliminated (by matching of the two groups) or that one could effectively correct for any differences in such variables between the two groups.

In an ideal scenario one would select large matched samples of the two groups of people to have their skull capacity measured at that point in time; something that can presumably be accomplished with some precision using modern imaging techniques. Binet tried to do this by estimating cranial volume from measurements of skull dimensions in living children.

If one simply wanted to confirm one’s existing belief that group A were intellectually superior to group B, then one could use a sample of skulls for group B that, compared with the sample from group A, contained more skulls from small or poorly nourished individuals, from (smaller) women or from people who were very elderly when they died. No matter how meticulous and accurate the subsequent measurements of cranial capacity (e.g. by filling the skull with small lead shot), the results would be invalid as an indication of the true difference in cranial capacity between the two groups because of the initial bias in the selection of the samples. As an analogy, if one were surveying the voting intentions of a British town one would get a very different prediction of the outcome of the election if one sampled only an impoverished working class estate or only an affluent middle class estate. Inadvertently biased sampling could give a false impression of overall voting intentions, but it would also be easy to bias the prediction deliberately by skewing the selection of the sample towards particular areas of the town, e.g. if one wanted to boost the morale of a particular candidate’s campaign team or generate media headlines suggesting momentum for one of the candidates. Opinion pollsters go to great lengths to try to make their samples representative but, despite many decades of experience, inadvertent bias in their sampling is probably responsible for several of the errors made in predicting the outcomes of recent elections.
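
The effect of an unmatched sample can be illustrated with a small simulation. This is only a sketch under stated assumptions: the male and female mean capacities and the spread are made-up illustrative values, not measurements, and `mean_capacity` is a hypothetical helper. Both "groups" are drawn from the same underlying population; only the proportion of male skulls in each sample differs.

```python
import random

def mean_capacity(n, frac_male, rng, male_mean=1450.0, female_mean=1300.0, sd=100.0):
    """Mean capacity (cc) of a sample of n skulls with the given share of male skulls.

    The means and spread are made-up illustrative values, not measurements.
    """
    total = 0.0
    for _ in range(n):
        is_male = rng.random() < frac_male
        total += rng.gauss(male_mean if is_male else female_mean, sd)
    return total / n

rng = random.Random(1)  # fixed seed so the run is repeatable

# Both samples come from the SAME population; only the sex ratio differs.
biased_gap = mean_capacity(5000, 0.9, rng) - mean_capacity(5000, 0.3, rng)
matched_gap = mean_capacity(5000, 0.5, rng) - mean_capacity(5000, 0.5, rng)

print(f"gap with unmatched sex ratios: {biased_gap:.0f} cc")
print(f"gap with matched sex ratios:   {matched_gap:.0f} cc")
```

With these assumed values the unmatched samples show a sizeable gap even though no real group difference exists, while the matched samples show a gap close to zero. However carefully each skull is measured, the gap in the first comparison is produced entirely by how the samples were chosen.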

Likewise many factors could affect the weight of the brain if this were being used as an indicator of the innate intellectual ability of its owner such as:

  • Body size (and therefore sex) – bigger people would be expected to have bigger brains.
  • Age – there is a progressive decline in brain mass in the elderly.
  • Cause of death – protracted illness before death leads to a decline in brain mass so acute causes of death e.g. judicial execution might be expected to be associated with higher brain mass.
  • The patient’s state of nutrition and hydration at the time of death and more generally nutritional state throughout life.
  • The time and circumstances of brain removal e.g. how quickly it is removed after death and how quickly after removal it is weighed, and the exact protocol used to excise the brain from the cadaver would affect weight.

There are probably other factors that would be apparent to an anatomist, but even from this short list it is clear that anyone comparing the brain weights of two groups would need a very rigorous procedure to ensure that there was no bias in the selection of the samples. There would also need to be a very rigorous protocol for collecting and weighing brains post mortem.

Binet invented his IQ tests in order to identify which French children, whose school performance was poor, needed some special educational provision. These tests have been modified and developed over the years and these successors have been widely used to assess the innate intelligence of children and adults. Binet specifically warned that his tests should not be used as a measure of innate intelligence and also that they should not be used to rank normal children according to intelligence. He argued that low scores should indicate that children needed special help to improve rather than be considered as innately incapable.

The letters IQ stand for intelligence quotient because the measure was originally used to compare a child’s mental age with their chronological age: mental age was divided by chronological age and the resulting figure was called the intelligence quotient. Charles Spearman proposed the idea of a general factor or g factor in intelligence because children’s performance on a battery of tests seemed to be correlated. He proposed that this underlying general intelligence factor (g) contributed to their performance in all other intellectual activities as measured by specific tests of things like numeracy, language skills, abstract reasoning, memory and general knowledge. In very simplistic terms, performance on any specific test was partly due to general intelligence and partly the result of specific ability in the area being tested. Some even went so far as to suggest that g might be sited in a specific location within the brain or that it might be inherited as a simple Mendelian genetic characteristic.
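
The quotient described at the start of the paragraph above is simple to state as a calculation. This is a minimal sketch, assuming the later convention of multiplying the ratio by 100 so that a child whose mental age equals their chronological age scores 100. The function name `ratio_iq` is hypothetical.

```python
def ratio_iq(mental_age, chronological_age):
    """Classic ratio IQ: mental age over chronological age, scaled by 100."""
    if chronological_age <= 0:
        raise ValueError("chronological age must be positive")
    return 100.0 * mental_age / chronological_age

print(ratio_iq(10, 10))  # mental age equal to chronological age
print(ratio_iq(12, 10))  # mental age two years ahead
print(ratio_iq(8, 10))   # mental age two years behind
```

Because the score is a ratio of ages, it only makes sense while mental age is still rising with chronological age, which is one reason it was devised for children rather than adults.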

An IQ test or more likely a battery of tests that really purport to measure innate ability should ideally be independent of educational, cultural, environmental and family experience and must assess ability on a wide range of skills and abilities that collectively make up what we call intelligence. The heritability of IQ has been debated for as long as the term has existed and the heritability of intelligence per se well before that. Tests of IQ have been used in much the same way as skull capacity or brain weights to provide evidence that women are intellectually inferior to men, certain racial groups intellectually inferior to others and that social class is largely determined by the innate intellectual inferiority of those in the lower classes.

For almost all human and animal characteristics it is generally accepted that nature (i.e. genetics) and nurture (environment and upbringing) contribute to the final characteristics of any individual although there is often considerable disagreement about their relative importance. It seems inevitable that a person’s genetic make-up plays some part in determining their intellectual capabilities or intelligence.

An adult’s height is partly determined by genetics and partly by things like their diet, health and even their emotional care during childhood. Observations made on institutionalised children during the 1940s showed that children deprived of love and emotional care were stunted and psychologically underdeveloped despite being given adequate nutrition and medical care; a condition referred to as deprivation dwarfism. The average height of British adults has been increasing with each generation since at least the early 20th century. Average height has traditionally been quite class dependent although this gap seems to be narrowing in more recent generations presumably as the health and nutrition of the lower social classes has improved towards that of the higher classes. The Japanese have traditionally been considered short in stature but the average height of Japanese boys increased by 20% between 1950 and 1990. None of these large changes can be attributed to genetic change. All of these obvious and substantial secular changes have been the result of changes in environment (nurture) despite an undoubted heritable component to height.

I am not qualified even to speculate on the politically and emotionally charged question of the degree of heritability of intelligence. I would just make the point that surely the aim of society and its educational systems should be to try to ensure that every individual has the opportunity to develop intellectually to their genetic potential just as the aim of nutritional care is to enable them to grow to their genetic potential. Large increases in height have largely been brought about by environmental and lifestyle influences upon essentially genetically stable populations.

There are numerous examples of very poor scientific investigation that could be discussed under this heading and I have selected three examples to briefly review; all three examples and many others are discussed in greater depth in Stephen Gould’s book The Mismeasure of Man and/or by Leon Kamin in his book The Science and Politics of IQ.

Morton’s Ranking of Races by Cranial Capacity (after Gould, 1978)

By the time he died in 1851 Samuel Morton, an eminent physician from Philadelphia, had amassed a collection of over 600 skulls and measured their capacity by filling them with mustard seed or small gauge lead shot. Morton believed in these pre-Darwin times that the major races of mankind (e.g. Negros and Caucasians) were different species. In several of his books he published tables that showed the average skull sizes of a number of different races and of subdivisions of larger racial groups, e.g. different tribes of Native Americans. His summary tables supported his belief, and that of most of his scientific contemporaries, that Whites had the largest skull capacities (and so also the largest brains and intellect) and Blacks the smallest, with Native Americans in between these two extremes. Amongst the Caucasian group White Europeans came out on top, Jews in the middle and Hindoos at the bottom. These tables were widely re-published and referred to for decades after their production, and Morton himself was regarded as a major American scientist who used measurement and data rather than just relying on rhetoric to support his theories. Stephen Gould uses a quote from the New York Tribune at the time of Morton’s death as one of several indicators of the high esteem in which he was held as a scientist:

“Probably no scientific man in America enjoyed a higher reputation amongst scholars throughout the world than Dr Morton”

Morton not only published summary tables and illustrations in his books but also published his raw data with explanations of how it was obtained; surely an indication that he had great faith in his own analysis and conclusions? The availability of this raw data enabled Stephen Gould to re-analyse the data and to publish his findings in the prestigious journal Science in 1978. In this re-analysis Gould found that the skull sizes (all measured in cubic inches) of six different racial groups were very similar ranging from 83 in Africans to 86 in Native Americans with modern Caucasians coming in at 85. In Morton’s original summary table the range was from 96 in English Caucasians to 75 in Hottentots and Australian Aborigines; he recorded a combined average of 83 for African born and American born Negroes.

Gould did not believe that Morton had deliberately falsified his findings but that a series of lapses and major biases had all tended to favour the results that he had anticipated:

  • He excluded or included subgroups in his summary data in directions that favoured his anticipated outcome.
  • He never made any adjustment for body size or sex, so differences between groups would be distorted by differences in the body sizes and sex ratios of the samples. It was clear from Morton’s own data that there were substantial sex differences in skull size within individual racial groups. All of Morton’s English group, which had the highest skull size ranking, were male and all of the Hottentots, who came out bottom, were female.
  • He made some uncorrected errors and small rounding errors, which always favoured his anticipated outcomes.
  • Convenient omissions of large or small skulls that could have altered his rank order.

This paper by Gould is a sobering read for anyone who doubts whether blinding is necessary for making accurate and objective measurements in research.

Dr Robert Bennett Bean and the two halves of the corpus callosum

The corpus callosum is a flat brain structure that lies beneath the cerebral cortex and contains very large numbers of neural fibres that connect the left and right cerebral hemispheres. It can be separated anatomically into a front part known as the genu, which is divided by a narrow isthmus from a rear portion known as the splenium. Robert Bean was an anatomist who in 1906 published a long paper in the American Journal of Anatomy entitled Some Racial Peculiarities of the Negro Brain. One of the things that Bean did in this paper was to compare the relative sizes of the front (genu) and back (splenium) portions of the corpus callosum in a large sample of brains from unclaimed bodies that had been given to medical schools. He reasoned that as the front parts of the brain were generally concerned with higher intellectual functions and the hind parts with the more basic functions, one would expect the front part of the corpus callosum to be relatively larger in those of higher intellectual ability. He constructed a graph (scattergram) in which he plotted the length of the front part of the corpus callosum against the length of the back part in a large sample of brains from both black and white cadavers. This graph shows very clear differences between black and white brains, with Blacks having a relatively large hind portion and Whites a relatively large front portion, which he believed reflected a higher level of intelligence. It was possible for Bean to draw a diagonal line on this scattergram which almost completely separated the white and black brains. There are several other analyses in this long anatomical paper which all point towards larger, more developed higher centres of the brain in white than in black brains. This paper and Bean’s other writings were very influential at the time as they seemed to provide clear scientific evidence to support many people’s existing prejudices.

Gould uses a quote from an editorial in the journal American Medicine which suggested that Bean had provided:

“The anatomical basis for the complete failure of the negro schools to impart the higher studies”

This editorial went on to suggest that politicians must now accept the error of assuming human equality, and it seems to indicate that this evidence provided justification for disenfranchising black people because of their lack of intelligence.

In 1909 another anatomist, Franklin Mall, who was based at the world-famous Johns Hopkins University, also published a paper in the American Journal of Anatomy, in which he refuted Bean’s findings on the differences between Whites and Blacks in the two parts of the corpus callosum. Mall obtained a sample of 106 brains (including 18 of Bean’s original sample) and measured the size of the two parts of the corpus callosum using the method described by Bean. Mall, however, made his measurements blind and did not know until after the measurements were complete whether the brains were from white or black or male or female subjects. Mall found no difference between the brains of Blacks and Whites or of males and females, and his scattergram shows brains of males and females, Blacks and Whites all clustered together and intermingled. Of the 18 brains from Bean’s original sample that Mall re-measured, he obtained lower values than Bean for the front portion of the corpus callosum in 7 of the 10 white brains and lower values for the back portion in 7 of the 8 black brains.

There is no way of knowing at this distance in time whether Bean deliberately manipulated his measurements, but it is perfectly possible that he made them honestly and that his own expectations biased his results.

The misuse of IQ Tests in the USA 1910-1930

The names of three eminent psychologists stand out for discussion under this heading: H.H. Goddard, L.M. Terman and R.M. Yerkes.

Henry Herbert Goddard (1866-1957) was from 1906-1918 Research Director at the Vineland Training School for Feeble-Minded Girls and Boys in New Jersey. From 1922-1938 he was a professor of psychology at Ohio State University. Early in his career he asserted that intelligence was a single measurable characteristic; this is the so-called reification of intelligence from an abstract multifaceted concept to a single, almost physical entity, perhaps even a simple Mendelian trait. Goddard translated Binet’s work on IQ testing into English and was the first to use Binet’s IQ tests in the USA. He believed that the feeble-minded, or morons as he designated them, should be confined and cared for in benign institutions like his own at Vineland, where they could be effectively prevented from breeding and polluting the American race. Others at this time went even further and advocated the forced sterilisation of criminals and mental defectives. In 1912 he published a famous book, The Kallikak Family: A Study in the Heredity of Feeble-Mindedness. In this book he traced the descendants of a normal ex-soldier who fathered two lines: one from his wife, a good Quaker woman, whose members all turned out to be normal worthy citizens, and one from an affair with a feeble-minded woman who worked in a tavern, whose members all ended up as paupers and criminals. Goddard gave this man the name Kallikak after the Greek for beauty (kallos) and bad (kakos) because he had fathered a good and a bad line of offspring. When the photographs of the descendants of Mr Kallikak in Goddard’s book were examined in 1980, an expert in photography found that all of the faces of the bad line had been crudely re-touched to give them a more sinister appearance but not the three photographs of the member of the good line who worked at Goddard’s institution. Goddard believed that with experience he, and women he had trained, could gauge the mental deficiency of people simply by their visual appearance.

Later in his life, Goddard changed his views and no longer believed that feeble-mindedness was necessarily incurable or that the feeble-minded generally needed to be confined to institutions.

Lewis Madison Terman (1877-1956) was professor of psychology at Stanford University in California from 1910-1945 and head of the psychology department for more than 20 years. Terman developed and extended Binet’s IQ tests so that they could be used to test adults as well as children. The Stanford-Binet tests that he helped to develop became the standard for IQ testing and for validating the results of other IQ tests. Terman also believed that his tests would allow high grade defectives to be identified and their reproduction curtailed. In his later writing he also changed his stance. He pointed out in 1937 that the average differences between the IQs of different social groups were relatively small with greatly overlapping distributions and he also said that it was difficult to decide on the relative contribution of genetic and environmental influences to these inter class differences or the differences between the scores of rural and urban children.

Robert Mearns Yerkes (1876-1956) obtained a first degree and PhD in psychology from Harvard University and joined the faculty there in 1902. When America entered World War I he was given an army officer’s commission and in 1917, together with Goddard, Terman and others, set up a programme to test the intelligence of the huge number of men conscripted into the American army because of the war. As a result of this programme, the intelligence of 1.75 million American men was assessed. This was undoubtedly an enormous, almost unprecedented, data set but there were very serious questions about the quality and robustness of much of this data. There were two types of test: the Alpha test for those recruits who were literate and the Beta test for those who were not. Some who performed poorly on the Beta test were subject to recall and further tests. It is clear from this basic description that there would be problems in grading individuals on a single continuous scale, given that they were not all assessed in the same way. These tests were used to grade the recruits on an A to E scale (with additional pluses and minuses). The original purpose of this grading was to help make decisions about what type of role the recruits were suitable for although, in practice, it seems to have had little impact on the military role assigned to recruits.

The results of these tests did have a major impact outside of the army. Some of the summary results seem quite absurd nowadays and were even seen as rather unexpected and bizarre by some of those involved in their collection. The average mental age of white American conscripts was just over 13 years. Any adult with a mental age of less than 12 years was classified at the time as a moron and 37% of Whites and 89% of Blacks fell into this category. This was a shocking finding for American scientists and politicians. European immigrants could be graded according to their country of origin, with those of Southern European origin or Slavic races from Eastern Europe performing more poorly than those of Northern European origin. Negroes came out at the bottom of the classification. Some examples of the average mental ages (in years) recorded for different groups in these tests:

White average             13.08
Russians                  11.34
Italians                  11.01
Poles                     10.74
Negro average             10.41

These results influenced the US Immigration Act of 1924, which set quotas for immigrants from each nation at 2% of those recorded in the 1890 census. The practical impact of using a 34-year-old census was to greatly reduce the quota of those allowed in from Eastern and Southern Europe because their numbers were relatively small in 1890.

In his famous book Gould gives a very detailed critique of these tests and the ways in which they were administered, analysed and interpreted. I am just going to pick on one specific piece of analysis to illustrate how, whatever numbers were generated, they seemed to be analysed and interpreted to support the existing beliefs of the analyser about the superiority or inferiority of particular racial or social groups. The average mental age of pooled immigrants to the USA increased progressively as their duration of residence in America increased. The averages for those who had been resident for different lengths of time, broken up into five-year blocks, were:

Up to 5 years             11.29
6-10 years                11.70
11-15 years               12.53
16-20 years               13.50
20+ years                 13.74

The most obvious explanation for this progressive change is that the improvement in test scores was due to increased familiarity with American customs, culture and language. This explanation was effectively dismissed by Carl Brigham, a psychologist at Princeton University, who instead suggested that it was caused by changes in the racial mix of immigrants over time, leading to a progressive decline in the average intellectual capabilities of new immigrants. In 1930 Brigham, just like Goddard and Terman, recanted his earlier beliefs and even apologised for his past mistakes. The most significant of his admissions were that:

 

  • Test scores could not be reified into some measure of an entity called intelligence
  • The scores could not properly be used to compare nations
  • The tests measured familiarity with American language and culture rather than innate intelligence.

This extremely large but imperfectly collected and often poorly analysed and interpreted data set was used for many years to support the existing views of many about the heritability of intelligence and of the innate and largely immutable racial and class differences in intelligence. They also influenced American immigration policy in a way that selectively restricted the immigration of people from Southern and Eastern Europe.

Cyril Burt – career timeline

Cyril Burt was born in London in March 1883. He was the son of a family doctor who moved to a Warwickshire village when Burt was ten years old. Burt was awarded a scholarship to study Classics at Jesus College, Oxford. Despite studying Classics, in his spare time at Oxford Burt mixed with and worked with some of the most prominent of the early British psychologists. During these undergraduate years he even undertook a project aimed at standardising some of the then current psychological tests. Burt obtained a modest degree in his nominated subject and his biographer Leslie Hearnshaw suggested that this was probably because he was distracted from his subject studies by his psychology work. Burt obtained a Teachers’ Diploma and undertook some teaching experience, and his later career was focused on educational psychology. Burt did not have any formal training in science and never, for example, obtained a PhD, the standard postgraduate research degree, although he supervised almost a hundred PhDs in his later career. Despite his lack of a formal science qualification and modest degree result, no one who has read Hearnshaw’s biography and perused his list of publications can be in any doubt that this was a man of formidable academic ability with an amazingly wide range and depth of knowledge and interests across many different academic disciplines. He also had a great capacity with language to present forceful and convincing expositions of his views in print and probably also orally. I can sympathise with the description of Burt as a “polymath of Renaissance proportions” as well as with the much less flattering quote from the Clarkes given earlier.

In 1908 Burt’s first academic post was as Assistant Lecturer in Physiology and Lecturer in Experimental Psychology at the University of Liverpool where he worked with the eminent neurophysiologist and Nobel Prize winner, Professor Charles Sherrington. It is here that Burt began his research into the heritability of intelligence and concluded in a paper written in 1912 that:

“Among individuals mental capacity is inherited. Of this the evidence is conclusive.”

Psychologists who have critiqued this paper point out that Burt actually generated and published very little experimental data to support his very definite conclusions. The paper purported to show that there was a good correlation between parental and offspring intelligence. Parental intelligence seems to have been largely determined by occupation and the assessor’s impressions during an interview although a sample were said to have been tested formally to standardise the more informal grading. This seems to be a characteristic of much of Burt’s writing i.e. very forceful and eloquent exposition of views supported by limited experimental data or data where the precise details of the data and of how it was generated are sketchy or glossed over or, less charitably, deliberately hidden to prevent readers from realising just how weak the evidence was.

In 1913, Burt was appointed to a part-time position as educational psychologist for the London County Council (LCC). This post involved routine clinical work, particularly with subnormal and delinquent children. He also undertook research and survey work on methods of testing children’s academic abilities and aptitudes and on the distribution of abilities amongst different groups of children. In 1924 he also became a part-time professor at the London Day Training College, which later became The Institute of Education, part of the University of London. In 1932 Burt was appointed Professor of Psychology at University College London and head of what was perhaps the premier psychology department in Britain at that time. At UCL he succeeded one of his mentors, Professor Charles Spearman.

Burt wrote numerous academic papers, reports and book chapters, as well as many books including:

The Distribution of Educational Abilities (1917)

Mental and Scholastic Tests (1921) – a standard work for decades

Handbook of Tests for Use in Schools (1923)

The Young Delinquent (1925)

The Subnormal Mind (1935)

The Backward Child (1937)

The Factors of the Mind (1940)

A Psychological Study of Typography (1959)

The Gifted Child (1975)

ESP and Psychology (1975)

Several of Burt’s books became classics in their field and went through multiple editions and translations. Burt contributed to the more popular media as well as the academic: he gave several series of talks on BBC radio and even appeared on BBC television after his retirement; he also contributed dozens of articles to The Listener, a magazine published by the BBC from 1929-1991. In 1946 Burt became the first psychologist to be knighted. He retired as Professor of Psychology at UCL in 1950 but continued to supervise his many postgraduate students after retirement and tried to maintain his influence over the department and its new American head. His uncomfortable relationship with his replacement led to him eventually being told by the Provost that his links with the department were ended. Retirement was not the end of Burt’s psychological career; he continued to mark examinations, to review manuscripts for books and articles, to edit a journal and to write prolifically. The extent of his post-retirement publications can be crudely illustrated by the 18-page list of his publications given at the end of his 1979 biography by Leslie Hearnshaw; more than half of this list is for publications after he officially retired from UCL. As noted earlier, many of his most controversial papers were published after he had retired and he claimed in several of these to be still actively generating research data long after his retirement from UCL in 1950.

In 1947 the British Journal of Psychology (Statistical Section) was founded with Burt as co-editor; he became sole editor in 1954 when it was renamed The British Journal of Statistical Psychology (in 1966 it became known by its current title, The British Journal of Mathematical and Statistical Psychology). In 1960, a new editor was appointed with Burt as his assistant although in practice Burt continued to run the journal until 1963, when another new editor was appointed, again with Burt nominally designated as his assistant. Burt’s relationship with this new editor was uncomfortable, like that with the new head of psychology at UCL, because Burt still tried to control his journal. The new editor side-lined Burt and effectively took full control of the journal in 1964 although Burt remained as nominal assistant editor until 1968. A brief review of Burt’s publications shows that he was a major contributor to this journal over the period 1948-1964 and the 66 articles he published in it made up more than half of his scientific papers over this period. After relinquishing control of the journal in 1964 he published only one further paper in it, in 1971. Many of his articles in this journal look very lengthy by modern standards (e.g. a 44-page article on experiments in telepathy in children) and they sometimes seem to cover topics beyond what one would consider the normal area for this journal, e.g. an 18-page article on the psychology of typography in 1955 and a 10-page article on Hebrew Psychology in 1960. His biographer also suggests that he was making many contributions under assumed names. This gives the strong impression that for a time this journal was so firmly controlled by him that he could effectively publish anything he wanted.

In 1968 Burt was the first non-American to receive the Thorndike award given annually by the American Psychological Society for outstanding contributions to the field of educational psychology. His Thorndike lecture was published in March 1972, 5 months after his death.

Burt worked with, was influenced by, or influenced many of the famous names in 20th century psychology and statistics: the neurophysiologist and Nobel Prize winner Charles Sherrington, Charles Spearman, Alan Clarke (a PhD student of Burt’s), Francis Galton, Hans Eysenck (also one of Burt’s PhD students), Arthur Jensen and Raymond Cattell (Burt supervised his MA in Education thesis). The last four on this list have all been controversial figures because of their advocacy of the belief that genetic factors largely determine IQ, thus directly or indirectly supporting a causal link between social class and/or race and intelligence.

Assessing the allegations against Sir Cyril Burt

Burt died in 1971 and three years later, in 1974, Dr Leon Kamin, a Professor of Psychology at Princeton University, published his book The Science and Politics of IQ. Kamin suggested in this book that Burt’s papers lacked much of the essential detail about the subjects and the methodology that would normally be reported in a scientific paper. He also noted that the correlations reported in Burt’s three papers relating to the IQ of identical twins were identical to three decimal places, both for the twins reared apart (0.771) and for those reared together (0.944), despite apparently large increases in sample size. It seems essentially impossible that such a result could have occurred by chance and Kamin concluded that Burt had manipulated his data to support his beliefs. In his 1995 assessment of some of Burt’s work, Professor Nicholas Mackintosh found that correlation coefficients that remain invariant despite apparently differing sample sizes are a regular feature of Burt’s papers.

Oliver Gillie, a medical journalist, added considerable fuel to the controversy surrounding the quality and integrity of Burt’s research publications. In a Sunday Times article published in October 1976, Gillie repeated Kamin’s findings and made the accusation of research fraud explicit; the title of his article was Crucial Data was Faked by Eminent Psychologist. Gillie added a further accusation, which was that two of the women who assisted Burt in his data collection and wrote articles with him or independently were in fact phantoms. Burt used the names of these two women to publish articles and book reviews consistent with his views and also to add credibility to claims that he was continuing to amass data long after his retirement from UCL. Before discussing the details of any of these investigations it is worth noting the effect they had on Leslie Hearnshaw, who wrote a biography of Burt that was originally commissioned by Burt’s sister, Dr Marion Burt. Hearnshaw had given a favourable Memorial Address at a memorial service for Burt and wrote a favourable obituary/appreciation of Burt in 1972. Yet in the year before the biography was completed and published (1979), Hearnshaw felt compelled to write to Marion Burt to say that he was forced to accept the accusations made against her brother and that these conclusions would be in the biography. Hearnshaw was a professor of psychology at Liverpool University (1947-1975) and was selected for the memorial address, obituary and biography because of his interest and expertise in the history of psychology rather than because of any great personal involvement with Burt.

I am going to essentially try and answer four questions in my assessment of Burt’s work:

  • Can we trust Burt’s data on identical twins reared apart and together as summarised in his paper of 1966?
  • Did Margaret Howard and Jane Conway really collaborate and assist Burt after WW2 and did they actually write the articles and book reviews published in their names?
  • Can we trust the data in Burt’s classic 1961 paper on “Intelligence and social mobility”?
  • Can we trust the data in Burt’s 1969 paper in the Irish Journal of Education entitled Intelligence and Heredity: Some Common Misconceptions?

If a simple yes/no answer was all that was required then I could give a firm and confident NO to each of these questions. This is not quite the same as saying definitively that Burt deliberately fabricated or falsified the data in these papers or that the Misses Conway and Howard never existed. In a summing up of the evidence against Burt in 1995, Professor Nicholas Mackintosh FRS from Cambridge University said that in order to give Burt the benefit of the doubt and accept at face value the data in the three papers mentioned above, one would have to accept that:

“He or his assistants did actually collect some information about the IQ scores of 53 pairs of separated MZ twins, of some 1000-2000 fathers and their sons, and of successive generations of London schoolchildren between 1914 and 1965. But his accounts of these data are so woefully inadequate and riddled with error, and some of the data must be based on such grossly inadequate methods, that no reliance can be placed on the numbers he presents.”

This is essentially the same conclusion that Kamin came to in 1974 when making the judgement that Burt’s data was worthless and Mackintosh goes on to confirm his agreement with Kamin’s conclusion.

I am first going to consider the second question, relating to the shadowy Misses Howard and Conway, because these women were supposed to be collecting and collating data for Burt long after his retirement and this is what gives credibility to his claims of reporting new data in his later papers. If these women were not actually working with Burt after his retirement then this very seriously undermines the credibility of much of the data reported in his later papers. His housekeeper and secretary from 1950 never met either woman and she was told that both had emigrated by 1950, i.e. long before they were supposed to be still actively collecting data for Burt and writing papers both with him and independently. Around 1978, Leslie Hearnshaw did a search of the psychology literature. He found that Margaret Howard is named as a co-author on 4 of Burt’s articles between 1952 and 1960 and is listed as the sole author of a paper published in 1958 and of four book reviews. J Conway is first acknowledged by Burt in a 1943 paper and between 1958 and 1962 she is credited with writing 2 articles, a note and three book reviews. These women were thus apparently working with Burt for around 20 years. Everything that they published was in the Journal of Statistical Psychology and all of it appeared during the time when Burt was the editor. Hearnshaw was clearly convinced that Howard and Conway were pseudonyms that Burt used for articles and book reviews that he had written himself. Hearnshaw suggests that the style and content of their writings had all the hallmarks of being written by Burt; perhaps modern computer analysis could confirm this. Hearnshaw also quotes a 1962 entry from Burt’s diary which implies that he was writing a response on behalf of Miss Howard.

Hearnshaw also suggested that this pattern of behaviour, i.e. writing articles and then publishing them in his own journal under a pseudonym, was something that Burt did regularly. He notes that of the more than forty people who contributed reviews, notes and letters to the Journal of Statistical Psychology during Burt’s tenure as editor, more than half were not identifiable. In 1978 Oliver Gillie traced a Miss Elizabeth Molteno who was acknowledged in Burt’s 1943 paper on twins along with Conway and Howard; she told him that not only had she never met Conway or Howard but that she had never assisted Burt in his work. Thus a real person’s name was being used even though she had not been a collaborator, which means that just giving (scant) evidence of the existence of either of these two elusive assistants does not show that they actually assisted Burt as he claimed.

Oliver Gillie began his search for these two women in 1976, i.e. just 14 years after the last publication by Conway, when it would have been probable that either or both women would still be alive. He put an advertisement in the personal column of The Times requesting that these women who had worked with Sir Cyril Burt, or anyone who knew them, should contact him. He also contacted many of Burt’s closest associates who had worked with him at various stages of his career and none of them seemed to know these women. There is no record of either of them as students or staff at UCL, the London Day Training College or Senate House (the administrative centre for London University), or of them being employed as teachers within state schools in London. J Conway gave her address for a 10-page article published in May 1959 as Psychology Department, University College London, which clearly implies that she had some official affiliation with UCL as a student or staff member, but UCL has no record of her. Gillie initially tried to trace these women through the British Psychological Society and it was suggested to him by two officials that these were in fact, as Hearnshaw also concluded, pen names used by Burt. Gillie was told that previous attempts to contact these women to obtain permission to cite their work had been referred to Burt, who had granted permission on their behalf. Hearnshaw also had access to Burt’s diaries and some of his papers and these gave no indication that Burt had corresponded with or met these women after his retirement.

There have been some claims from more than one source that Margaret Howard did exist and there seems to be no need to doubt the authenticity of these claims or at least that they were made in good faith because it is clear beyond any reasonable doubt that Burt was not actively collaborating with these women in the collection and collation of data after his retirement or in writing articles in collaboration with them.

Burt’s 1966 paper on identical twins reared apart and together reported by far the largest sample of twins reared apart (53 pairs) ever to have been published anywhere in the world and so was clearly an influential piece of work. If you take his published accounts at face value then he accumulated data on an unbelievable extra 32 pairs of separated identical twins between 1955 and 1964! The other thing that makes Burt’s data stand out from the others is that he states that the twins were separated before they were six months old and that there was no correlation between the socio-economic circumstances of the two twins’ homes. Even in the five pairs who were both brought up by relatives, in each case one lived in the country and one lived in the town. The correlation between the social classes of the homes of his twins reared apart has been calculated as -0.04, i.e. almost exactly zero! Normal adoption practice was to place children in homes similar to those in which they were born, so the lack of any hint of a correlation between birth home and adoptive home is unusual. Several other studies of adopted twins reported that in a high proportion of cases one stayed at their home and the other was brought up by relatives; often the separation occurred after some years of being reared together. So this is a pretty unique and extraordinarily large data set that could hardly have been better if it had been designed as an experiment or if the data were at least partly theoretical, idealised data. Burt was asked repeatedly by other researchers for copies of his raw data, which he generally failed to supply, but he did supply Professor Arthur Jensen with some data on the intelligence estimates and social class of the homes of the twins reared apart, from which the correlation of -0.04 between the social classes of their homes was estimated.

Burt published three papers in which he reported on the IQ of identical twins reared apart and together, in 1955, 1958 and 1966, with another paper published in 1958 under the name of Conway. Leon Kamin noted that the correlations between the IQs of twins reared apart and of those reared together were exactly the same to 3 decimal places in all three studies published by Burt, despite some very substantial changes in the number of twin pairs reared apart (see Table 1).

Table 1            Correlation coefficients (r) for identical twins reared apart and reared together in three papers by Burt

Year        Reared apart              Reared together
            Number (n)    r           Number (n)    r
1955        21            0.771       83            0.944
1958        30+           0.771       ?             0.944
1966        53            0.771       95            0.944

In a later analysis of Burt’s publications Mackintosh notes that invariant correlations cropped up very frequently and very improbably in Burt’s other papers.
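
A rough feel for how unlikely an unchanged correlation is can be had from a small simulation. The sketch below is not a reconstruction of anything Burt or Kamin did. It assumes, purely for illustration, that twin-pair scores follow a bivariate normal distribution with a true correlation of 0.771 and that 32 new pairs from the same population are added to an original 21. The function names (pearson_r, draw_pair, share_unchanged) are hypothetical.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def draw_pair(rho, rng):
    """One pair of IQ-like scores (mean 100, SD 15) with true correlation rho.

    Assumption: scores are bivariate normal. This is an illustrative model only.
    """
    z1 = rng.gauss(0.0, 1.0)
    z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
    return 100 + 15 * z1, 100 + 15 * z2


def share_unchanged(rho=0.771, n_start=21, n_end=53, trials=5000, seed=1):
    """Share of simulated studies whose r is the same to 3 decimal places
    before and after the sample grows from n_start to n_end pairs."""
    rng = random.Random(seed)
    unchanged = 0
    for _ in range(trials):
        pairs = [draw_pair(rho, rng) for _ in range(n_end)]
        xs, ys = zip(*pairs)
        r_small = pearson_r(xs[:n_start], ys[:n_start])  # first 21 pairs only
        r_large = pearson_r(xs, ys)                      # all 53 pairs
        if round(r_small, 3) == round(r_large, 3):
            unchanged += 1
    return unchanged / trials


if __name__ == "__main__":
    print("Share of runs with r unchanged to 3 d.p.:", share_unchanged())
```

Under these assumptions an exact match to three decimal places should turn up in only a small share of runs for a single enlargement of the sample. Burt’s papers require it to happen at two successive enlargements and, at the same time, in the reared-together group.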

Apart from the coincidence of the identical correlation coefficients, Kamin makes a number of criticisms of these papers, for example:

  • The age at which twins were tested and their sexes are not given
  • It is unclear exactly what testing procedure was used and for one test in particular “the group test of intelligence” no description could be found.
  • The implication in the papers is that a standard testing procedure was used over several decades.

Kamin also did an analysis of the average IQs and the variability (variance) of the separated twins, divided into 4 categories (plus 7 fostered into residential institutions):

  • Upper class children (social groups 1-3) reared in their own home (19)
  • Lower class children (groups 4-6) reared in their own home (34)
  • Children fostered into lower class homes (32)
  • Children fostered into upper class homes (14)

Kamin noted that the variability of the IQs of children fostered into upper class homes was much smaller than that of children fostered into lower class homes. He noted other highly statistically improbable differences in variances when he pursued this analysis. His conclusion was that one or both of the following applied:

  • The IQs of children adopted into upper-class homes were systematically under-assessed in order to preserve the very high correlation between the twins
  • The social class of these homes had been systematically overrated to support the claim that children had been fostered into uncorrelated environments.

Kamin’s analysis of Burt’s cumulative data on the intelligence of identical twins reared apart or together leaves little room for doubt that the data was not collected and analysed in a way that conforms to even the vague and woefully inadequate description of his methodology. The assessments by Hearnshaw and Gillie of the extent of his post-retirement data collection and collaboration with his untraced assistants add further to doubts about the authenticity of this data. This leads to the conclusion that much of the data was fabricated, borrowed surreptitiously from other sources and/or manipulated to conform better to Burt’s views on the heritability of intelligence.

In 1978 Professor Donald Dorfman of the University of Iowa published in Science a forensic examination of Burt’s 1961 paper on Intelligence and Social Mobility. It is not immediately clear exactly what Burt is claiming to have done but the consensus seems to be that it purports to be an analysis of data collected over a period of almost fifty years (from 1913) on what Dorfman believed to be many thousands of father/son pairs. The data relates to the distributions of intelligence scores in different occupational groups and how social mobility helped to maintain average scores in the different social groupings. As with many of Burt’s papers, the exact details of how the data was collected, what tests were used and how they were administered are sketchy or (deliberately) vague. Burt does acknowledge that the assessments of adult intelligence were “less thorough and less reliable”. The primary purpose of the data collection was to assess whether backwardness was a family characteristic and the collection of data to look at the heritability of intelligence was incidental. He describes the results as a pilot study (with thousands of pairs of subjects!) and states that the data “are too crude and limited for a detailed examination by analysis of variance”.

Dorfman reproduces tables 1 and 2 from Burt’s paper, in which the numbers of people in 10-point IQ bands (from 50 to 140+) are listed for each of six different occupational classes: the fathers in table 1 and their sons in table 2. I will give just three suspicious features of this data identified by Dorfman.

  • When he used a formula (published in a paper by one of Burt’s alter egos Miss Conway) to calculate regression coefficients for the relationship between intelligence scores of father and sons in the six occupational classes, he got a regression coefficient of 0.50 each time; a statistically highly improbable occurrence.
  • When he constructed frequency distribution curves for the pooled IQ scores of all fathers and of all sons he got two perfect and identical normal distributions (bell shaped curves). When he plotted on the same graph, a theoretical curve for a normal distribution where the mean was 100 and the standard deviation was 15, it also matched perfectly with the two graphs for fathers and sons (see figure below). Dorfman pointed out that others had found that IQ distributions tended not to be perfect bell-shaped curves but were usually somewhat skewed. Even in one of Burt’s own later (1963) papers he makes the point that the distribution of IQ is plainly skewed. Dorfman then compares the normal distributions of these two graphs produced from Burt’s data with other sets of data with highly normal distributions (e.g. height and weight). He found that Burt’s data was some of “the most normally distributed in the history of anthropometric measurement”. This conclusion was made despite the parameter being measured (intelligence) often not conforming well to a normal distribution and despite the crudeness of the data as acknowledged by Burt. Dorfman concludes that the data in tables 1 and 2 of Burt’s paper were carefully constructed to produce a normal curve with a mean of 100 and a standard deviation of 15.

 

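[Figure: frequency distributions of the IQ scores of fathers and of sons from Burt (1961), plotted with a theoretical normal curve of mean 100 and standard deviation 15. Reproduced from Dorfman (1978)]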

  • He then looked at tables 3 and 4 in Burt’s paper which Burt used to compute social mobility across occupational classes i.e. the way in which class differences are maintained by some of the brightest children in the lower classes moving upwards and some of the least intelligent in the upper classes moving downwards. He concludes from a relatively complex analysis that Burt did not re-classify actual data but constructed data to meet the requirements of supporting his hypothesis. Just one relatively simple piece of analysis illustrates some of the striking coincidences and irregularities in the data. Dorfman looked at the percentage of fathers in each of the six occupational categories used in this 1961 paper and compared these with data published by Spielman and Burt in 1926. In the 1926 report, there were 8 occupational categories rather than 6 so Dorfman combined the results of categories 6, 7 and 8 used in the 1926 version. When he compared the two versions he got identical values from the 2 sources as shown in the table below.

Percentage of fathers in each occupational category in a 1926 report by Spielman and Burt and in Burt (1961)

Vocational category    % of fathers 1926    % of fathers 1961
1                      0 (0.1)              0
2                      3                    3
3                      12                   12
4                      26                   26
5                      33                   33
6                      26 (19+7+0.2)        26

Dorfman concludes that beyond any reasonable doubt the row totals in the 1961 paper were taken from the 1926 study even though the 1961 data was supposed to represent data collected over nearly 50 years from 1913 onwards.
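The arithmetic behind this comparison is simple enough to check by hand, but a minimal sketch makes the steps explicit. It uses only the percentages quoted in the table above; the function names and the round-half-up rule are my own assumptions for illustration and are not taken from Dorfman's paper.

```python
# Percentages of fathers per occupational category, as quoted in the table above.
# 1926 (Spielman and Burt) used eight categories; Burt (1961) used six.
pct_1926 = [0.1, 3, 12, 26, 33, 19, 7, 0.2]
pct_1961 = [0, 3, 12, 26, 33, 26]


def collapse_to_six(pct_eight):
    """Merge categories 6, 7 and 8 into a single sixth category."""
    return pct_eight[:5] + [sum(pct_eight[5:])]


def round_half_up(x):
    """Round a non-negative percentage to the nearest whole number (assumed rule)."""
    return int(x + 0.5)


collapsed = [round_half_up(p) for p in collapse_to_six(pct_1926)]
matches = [a == b for a, b in zip(collapsed, pct_1961)]

print(collapsed)     # rounded 1926 percentages in six categories
print(all(matches))  # True only if every category agrees with 1961
```

A match in one or two categories would be unremarkable. Agreement in all six, between a 1926 report and a survey supposedly accumulated over decades, is the kind of coincidence Dorfman was drawing attention to.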

I am convinced by Dorfman’s analysis that much of the data in Burt’s 1961 paper was fabricated. Dorfman summed up his own conclusions that the data:

“were in complete agreement with a genetic theory of IQ and social class” and that “beyond reasonable doubt they were fabricated from a theoretical normal curve, from a genetic regression equation and from figures published more than 30 years before Burt completed his surveys”.

The final question to be addressed relates to one of Burt’s final papers (1969), in which he argued that there had been a decline in educational attainment between 1914 and 1965. The most controversial table in this paper, published in the Irish Journal of Education, is shown below.

[Table from Burt (1969) on educational attainment between 1914 and 1965]

 

From Burt, C (1969) Intelligence and Heredity: Some common misconceptions. Irish Journal of Education 3, 75-94.

 

The data in this table were said to have been computed by a Miss M.G. O’Connor, the third missing lady in the Burt affair, who has never been traced. She was said to be an Irish ex-student of Burt’s, but in 1979, within ten years of the paper’s publication, Oliver Gillie wrote that he could find no trace of her in Britain or Ireland. Burt discusses the data in this table and then, on that basis, makes the following statement:

“The main conclusion I myself would draw from the just quoted is that, as has so often been surmised, a definite limit to what children can achieve is inexorably set by the limitations of their innate capacities; and no improvements in the quality of education can affect the genetic composition of a large and stable population.”

As is the norm with most of the data reported in Burt’s papers, there is very little information given on how Miss O’Connor generated these numbers. In this case the information about methodology is almost non-existent, sparse even by Burt’s standards. Were the same tests used throughout the 50 years? If so, how could such tests conceivably be reliable indicators for children in both 1914 and 1965? If not, how were scores from different tests converted into comparable numbers across a 50-year period? What sampling method was used? Were the children tested between 1914 and 1965 in similar social circumstances? How large were the samples, how were the data collected and by whom?

The implication is that these data were collected by Burt and his assistants, but there is strong evidence that he was not actively involved in data collection after his retirement in 1950 and that he had no visible contact with his assistants (if they existed) after this date. Professor Mackintosh pointed out in 1995 just how implausible some of the numbers in this table were and how they contradicted most other findings, e.g. it has been conservatively estimated that intelligence test scores improved by 10 points between 1914 and 1965. For the numbers to be accepted as real evidence, any serious reader would need a detailed and convincing account of the data collection, sampling and analysis. Mackintosh clearly believed that the data had been manipulated to fit Burt’s purpose and theories, and he considered that there is no innocent explanation for these aberrant results.

The most likely explanation, in the light of his past behaviour, is that he made up the results to support his arguments about the heritability of intelligence and to oppose the changes being made to British education policy, and that Miss O’Connor was a figment of his imagination.

My final conclusion about Burt is that he was a brilliant and eloquent charlatan!

Endnote

Shortly after completing this case study I saw an item on the BBC News web-site (27/02/2015) entitled:

“Virginia eugenics victims compensated for sterilisation”

The Virginia state legislators had just agreed to pay compensation to surviving victims of the state’s forced sterilisation programme for those deemed undesirable or mentally unsound. According to the BBC, over 8,000 Virginians were forcibly sterilised whilst the law was in force (c. 1927-1979). In the USA as a whole, 65,000 Americans in 33 states were subjected to compulsory sterilisation. The US eugenics programmes were said to be a model for the Nazi eugenics programme aimed at creating a master race. Sweden, Canada and Japan also practised forced sterilisation in the 20th century. Much of the justification and theoretical underpinning for these eugenics programmes would have come from the flawed and/or fraudulent research discussed in this case study, and this BBC report highlights how some individuals were still living with the consequences of these programmes in 2015.

Main sources used

Bean, RB (1906) Some racial peculiarities of the Negro brain. American Journal of Anatomy 5, 353-432.

Burt, CL (1961) Intelligence and social mobility. British Journal of Statistical Psychology. 14, 3-24.

Burt, CL (1966) The genetic determination of differences in intelligence: a study of monozygotic twins reared together and apart. British Journal of Psychology 57, 137-53.

Burt, CL (1969) Intelligence and heredity: some common misconceptions. Irish Journal of Education 3, 75-94.

Conway, J (1959) Class differences in general intelligence: II. A reply to Dr. Halsey. British Journal of Statistical Psychology 12, 5-14.

Dorfman, DD (1978) The Cyril Burt question: new findings. Science 201, 1177-86.

Gillie, O (1976) Did Sir Cyril Burt fake his research on heritability of intelligence? Part 1. The Sunday Times 27th October 1976.

Gillie, O (1979) Burt’s missing ladies. Science 204, 1035-9.

Gould, SJ (1978) Morton’s ranking of races by cranial capacity. Science 200, 503-9.

Gould, SJ (1996) The Mismeasure of Man (revised and expanded edition) London: Penguin Books.

Hearnshaw, LS (1979) Cyril Burt, Psychologist. London: Hodder and Stoughton.

Jensen, AR (1972) Sir Cyril Burt. Psychometrika 37, 115-7.

Kamin, LJ (1977) The Science and Politics of IQ. Harmondsworth, UK: Penguin Books.

Mackintosh, NJ (1995) Twins and other kinship studies. In Cyril Burt: Fraud or Framed? Oxford: Oxford University Press. Pp 45-69.

Mackintosh, NJ (1995) Declining educational standards. In Cyril Burt: Fraud or Framed? Oxford: Oxford University Press. Pp 95-110.

Mackintosh, NJ (1995) Does it matter? The scientific and political impact of Burt’s work. In Cyril Burt: Fraud or Framed? Oxford: Oxford University Press. Pp 130-151.

Mall, FP (1909) On several anatomical characters of human brains, said to vary according to race and sex, with special reference to the weight of the frontal lobe. American Journal of Anatomy 9, 1-32.
