Community Dental Health

Cover Date: December 2008
Print ISSN: 0265 539X
Vol: 25
Issue: 4

Editorial - The statistical legacy of William Sealy Gosset (“Student”)

The paper “The Probable Error of a Mean,” which led to today’s t-distribution, was the lead article in the March 1908 issue of Biometrika. Its author was William Sealy Gosset (1876-1937), who, for proprietary reasons, wrote under the pen name “Student.” With an Oxford degree in mathematics and chemistry, Gosset began work at the Guinness brewery in Dublin in 1899. Those who still misspell his surname today (as of this writing, the list includes the website of his own employer) might find it easier to remember the correct spelling if they realized that the Gossets came from Huguenot stock, that the name on the family crest refers to “trois Goussés de fèves feuillées et tigées, et rangées, en pairle de meme,” and that it was transmuted to Gosset when his forbears settled in the Channel Island of Jersey.

The first of his more than 20 statistical papers, published in 1907 while he was on sabbatical leave at Karl Pearson’s unit at University College, dealt with statistical variations in the counting of (yeast) cells. In it, he independently derived the Poisson distribution. Moreover, he used statistical techniques for correlations and goodness of fit, developed by Pearson in the previous ten years, to test the fit of this theoretical distribution to several sets of experimental data.

In 1935, he moved back to his native England to set up a Guinness brewery in Park Royal in London. Gosset died at the age of 61 in 1937. His friend E. S. Beaven, a barley breeder, wrote an obituary in the London Times.
In it, he revealed Gosset’s pen name, and described him as one of a “new generation of mathematicians who were founders of theories now generally accepted for the interpretation of industrial and other statistics.” The 1939 Biometrika tribute by his Guinness colleague McMullen warmly described “‘Student’ as a Man.” “‘Student’ as Statistician,” by Egon Pearson, also described his warm personal relationship with Gosset, and reviewed Gosset’s groundbreaking statistical work, his links with Karl Pearson and Ronald Fisher, and how Gosset remained a friend of both of these temperamental men.

Fisher, who never again published in Biometrika after his early dealings with its editor Karl Pearson, wrote his appreciation of Gosset in the Annals of Eugenics (now the Annals of Human Genetics). He asked Gosset’s wife Marjory for a suitable photograph of Gosset. She replied that he didn’t like to be photographed in his later years, but she could supply “one fairly good photo of him taken about 1908, but I suppose that wouldn’t do.” Fisher used it as the frontispiece for the issue of the journal; this photo is the source for the image used in this 2008 tribute.

And, thanks to Stephen Ziliak’s book, we now know that Gosset’s nom de plume may well have derived from a manufacturer’s imprint on the cover of “Student’s” first small-sample notebook, “The Student’s Science Notebook, Eason and Son, Ltd., Dublin and Belfast,” 1905-1907, which “Student” used while on sabbatical at Karl Pearson’s Biometric Laboratory. His notebooks contain the results of his statistical simulations and his mathematical derivations. He used the simulations to check the form of his theoretical curves, since he was not entirely confident that they were mathematically correct.

In Fisher’s obituary, W. S. Gosset is described as “one of the most original minds in contemporary science.” Fisher goes on, “Without being a professional mathematician, he first published, in 1908, a fundamentally new approach to the classical problem of the theory of errors, the consequences of which are only still gradually coming to be appreciated in the many fields of work to which it is applicable. The story of this advance is as instructive as it is interesting.” And indeed it is. The “different distribution curves for the behaviour of means based on different sample sizes” were the impetus for Fisher to develop the idea of degrees of freedom, and analysis of variance. The concept of the former is still difficult for students today; this author likes to describe the number of degrees of freedom as the number of independent assessments of error.

In the 1939 tribute to Gosset, Fisher upbraided Karl Pearson (Gosset’s supervisor while he wrote the seminal 1908 paper) for his insistence on using a divisor of n, rather than n-1, to estimate the variance. Gosset had studied Airy’s Theory of Errors, and must have read there that the use of n-1 leads to an unbiased estimator. Indeed, in a letter to a Dublin colleague in May 1907, he wrote, “when you only have quite small numbers, I think the formula with the divisor of n-1 we used to use is better.” But Karl Pearson scoffed that it doesn’t matter, “because only naughty brewers take n so small that the difference is not of the order of the probable error!”

Gosset’s limited capacity and vision in mathematical statistics did not allow him to see how his table could be used for a much broader array of statistical analyses than the simple 1-sample or paired-sample situations he illustrated in his 1908 paper. It was Fisher who saw, and in 1925 fully described, the many extensions of Gosset’s work, not just to the familiar 2-independent-samples context, but also to correlation and regression coefficients.
Fisher also convinced Gosset to leave behind the ratio z = (ȳ - µ)/s, whose sampling distribution he had originally derived, for the ratio we learn today, t = (ȳ - µ)/(s/√n), in which (unlike the divisor of n that Pearson insisted on) the standard deviation s is now estimated using the degrees of freedom, n-1, as the divisor.
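
As a rough numerical illustration of the two ratios and the two divisors discussed above, the following Python sketch may help. It is not taken from the editorial. It assumes, as the contrast with Pearson's divisor suggests, that the s in Gosset's original z was computed with the divisor n, in which case t = z·√(n-1). The data values, the tiny population, and the function names are all hypothetical.

```python
import math
import statistics
from itertools import product


def z_ratio(sample, mu):
    """z = (ybar - mu) / s, with s computed using the divisor n (an assumption)."""
    s_n = statistics.pstdev(sample)  # divisor n
    return (statistics.fmean(sample) - mu) / s_n


def t_ratio(sample, mu):
    """t = (ybar - mu) / (s / sqrt(n)), with s computed using the divisor n - 1."""
    s = statistics.stdev(sample)  # divisor n - 1
    return (statistics.fmean(sample) - mu) / (s / math.sqrt(len(sample)))


# Hypothetical paired differences, tested against mu = 0.
differences = [1.2, -0.4, 0.9, 2.1, 0.3]
n = len(differences)
z = z_ratio(differences, 0.0)
t = t_ratio(differences, 0.0)
print(f"z = {z:.4f}  t = {t:.4f}  z * sqrt(n - 1) = {z * math.sqrt(n - 1):.4f}")

# Bias check: average each variance estimator over every possible sample of
# size 2 drawn with replacement from a tiny hypothetical population.
population = [1, 2, 3]
sample_size = 2
sigma2 = statistics.pvariance(population)  # true population variance
samples = list(product(population, repeat=sample_size))
mean_var_n_minus_1 = statistics.fmean([statistics.variance(smp) for smp in samples])
mean_var_n = statistics.fmean([statistics.pvariance(smp) for smp in samples])
print(f"true variance          = {sigma2:.4f}")
print(f"mean with divisor n-1  = {mean_var_n_minus_1:.4f}")
print(f"mean with divisor n    = {mean_var_n:.4f}")
```

Under these assumptions the two ratios differ only by the factor √(n-1), so they carry the same information about the sample. In the enumeration, the divisor n-1 averages to the true variance, while the divisor n falls short by the factor (n-1)/n, which matters most when n is small.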

Correspondence to: James A. Hanley, Department of Epidemiology, Biostatistics and Occupational Health, McGill University, 1020 Pine Avenue West, Montreal, Quebec, H3A 1A2, Canada. E-mail: James.Hanley@McGill.CA. Web: www.epi.mcgill.ca/hanley

Page Start: 194
Page End: 195

Articles from this issue

Title | Pg. Start | Pg. End
--- | --- | ---
Oral health problems and needs in nursing home residents in Northern Italy | 0 | 0
Editorial - The statistical legacy of William Sealy Gosset (“Student”) | 194 | 195
Effect of tooth loss and denture status on oral health-related quality of life of older individuals from Sri Lanka | 196 | 200
The distribution of general dental practitioners with NHS contract numbers in relation to the distance of their practices from the seven dental undergraduate teaching hospitals in England outside London | 201 | 204
Australian Research Centre for Population Oral Health, The University of Adelaide, Australia | 205 | 210
Retention and effectiveness of fissure sealants in Kuwaiti school children | 211 | 215
Loss of sealant retention and subsequent caries development. | 216 | 220
Oral health and treatment needs among 15-year-olds in Tehran, Iran | 221 | 225
The frequency of periodontal infrabony defects on panoramic radiographs of an adult population seeking dental care | 226 | 230
Caries prevalence and need for dental care in 13–18-year-olds in the Municipality of Milan, Italy. | 237 | 242
Number of teeth and serum lipid peroxide in 85-year-olds | 243 | 247
Access to dental services for people with a physical disability: a survey of general dental practitioners in Leicestershire, UK | 248 | 252
Short Communication - Risk indicators of dental caries in 5-year-old Brazilian children | 253 | 256