Canadian normative data for the SF-36 health survey

Wilma M. Hopman,* Tanveer Towheed,† Tassos Anastassiades,† Alan Tenenhouse,‡ Suzette Poliquin,‡ Claudie Berger,§ Lawrence Joseph,¶ Jacques P. Brown,** Timothy M. Murray,†† Jonathan D. Adachi,‡‡ David A. Hanley,§§ Emmanuel Papadimitropoulos,¶¶ and the Canadian Multicentre Osteoporosis Study Research Group (CaMos)***

The list of CaMos investigators appears at the end of this article.

CMAJ 2000;163(3):265-71

Contents

Abstract
Introduction
Methods
Results
Interpretation
References

Abstract

Background: The Medical Outcomes Study 36-item Short Form (SF-36) is a widely used measure of health-related quality of life. Normative data are the key to determining whether a group or an individual scores above or below the average for their country, age or sex. Published norms for the SF-36 exist for other countries but have not been previously published for Canada.

Methods: The Canadian Multicentre Osteoporosis Study is a prospective cohort study involving 9423 randomly selected Canadian men and women aged 25 years or more living in the community. The sample was drawn within a 50-km radius of 9 Canadian cities, and the information collected included the SF-36 as a measure of health-related quality of life. This provided a unique opportunity to develop age- and sex-adjusted normative data for the Canadian population.

Results: Canadian men scored substantially higher than women on all 8 domains and the 2 summary component scales of the SF-36. Canadians scored higher than their US counterparts on all SF-36 domains and both summary component scales and scored higher than their UK counterparts on 4 domains, although many of the differences are not large.

Interpretation: The differences in the SF-36 scores between age groups, sexes and countries confirm that these Canadian norms are necessary for comparative purposes. The data will be useful for assessing the health status of the general population and of patient populations, and the effect of interventions on health-related quality of life.

Introduction

Over the past 20 years, the patient's point of view has been increasingly recognized as an important component in the assessment of health care outcomes. This has resulted in the development of several instruments to measure health-related quality of life. One of the most widely used and psychometrically sound instruments is the Medical Outcomes Study 36-item Short Form (SF-36). This relatively brief and simple questionnaire contains 36 items covering 8 health concepts chosen on the basis of reliability, validity and frequency of measurement in health surveys.¹,² Two summary scores have also been developed for the SF-36.³

The reliability and validity of the SF-36 have been well documented by the developers of the instrument.⁴⁻⁷ A comparison of a series of generic health status measures indicated that the SF-36 is not only psychometrically sound but is also more responsive to clinical improvement than the other instruments tested.⁸,⁹ Moreover, health functioning changed in the hypothesized direction with increased age, socioeconomic status and disease status in a population-based longitudinal study of the SF-36, which suggests that the instrument is sensitive to changes in the health of the general population.¹⁰

Normative data are the key to determining whether a group or an individual scores below or above the average for their country, age or sex. Published norms now exist for the United States,¹ the Queensland region of Australia,¹¹ the United Kingdom,¹²,¹³ certain regions of the United Kingdom,¹⁴ Australian women¹⁵ and US residents with a variety of medical conditions.¹,⁴ Comparable norms do not yet exist for Canadians, which forces researchers and policy-makers to compare data from Canadian studies with norms from other countries. The initiation of the Canadian Multicentre Osteoporosis Study (CaMos) in 1995 provided a unique opportunity to incorporate the SF-36 into a population-based survey and develop age- and sex-adjusted norms for Canadians.

Methods

CaMos is a prospective cohort study of 9423 randomly selected women and men aged 25 years or more living in the community. The sample was drawn within a 50-km radius of 9 Canadian cities (Vancouver, Calgary, Saskatoon, Hamilton, Toronto, Kingston, Quebec, Halifax and St. John's). The study was designed to provide estimates of the prevalence and incidence of osteoporosis and osteoporotic fractures among Canadian men and women and of regional variation in the rates of these conditions. Baseline data were collected by means of an interviewer-administered questionnaire and included sociodemographic information, medical, fracture, reproductive and family history, medication use, diet, alcohol and tobacco use, and physical activity. The health status instrument was self-administered at the end of the interview. Ethical approval for CaMos was obtained through the review boards of each participating centre as well as at the coordinating centre in Montreal.

Health status was assessed with the SF-36, which contains 36 items that, when scored, yield 8 domains. Physical functioning (10 items) assesses limitations in physical activities, such as walking and climbing stairs. The role physical (4 items) and role emotional (3 items) domains measure problems with work or other daily activities as a result of physical health or emotional problems. Bodily pain (2 items) assesses limitations due to pain, and vitality (4 items) measures energy and tiredness. The social functioning domain (2 items) examines the effect of physical and emotional health on normal social activities, and mental health (5 items) assesses happiness, nervousness and depression. The general health perceptions domain (5 items) evaluates personal health and the expectation of changes in health.¹ All domains are scored on a scale from 0 to 100, with 100 representing the best possible health state. One additional, unscored item compares the respondent's assessment of her or his current health with that 1 year earlier. Summary scores for a physical component (physical functioning, role physical, bodily pain and general health perceptions) and a mental component (vitality, social functioning, mental health and role emotional) can also be derived.³
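The structure described above can be written down as a small data sketch. It records the item count for each domain, groups the domains into the 2 summary components as the text does, and applies a linear rescaling of a raw scale score to the 0 to 100 range. The names DOMAIN_ITEMS and to_0_100 are illustrative only, and the raw-score range in the usage line is hypothetical; the item recoding and scoring rules of the Medical Outcomes Trust method are not reproduced here.

```python
# Item counts per SF-36 domain, as listed in the text.
DOMAIN_ITEMS = {
    "physical functioning": 10,
    "role physical": 4,
    "bodily pain": 2,
    "general health perceptions": 5,
    "vitality": 4,
    "social functioning": 2,
    "role emotional": 3,
    "mental health": 5,
}

# Domains grouped under each summary component, as grouped in the text.
PHYSICAL_COMPONENT = [
    "physical functioning", "role physical",
    "bodily pain", "general health perceptions",
]
MENTAL_COMPONENT = [
    "vitality", "social functioning", "mental health", "role emotional",
]

UNSCORED_ITEMS = 1  # the item comparing current health with 1 year earlier


def to_0_100(raw, lowest, highest):
    """Linearly rescale a raw scale score so lowest -> 0 and highest -> 100.

    Assumption: this is a generic linear transformation used for
    illustration; it does not perform any item-level recoding.
    """
    if highest <= lowest:
        raise ValueError("highest must be greater than lowest")
    if not lowest <= raw <= highest:
        raise ValueError("raw score outside the possible range")
    return 100.0 * (raw - lowest) / (highest - lowest)


total_items = sum(DOMAIN_ITEMS.values()) + UNSCORED_ITEMS
print(total_items)             # 35 scored items plus 1 unscored item
print(to_0_100(20, 10, 30))    # hypothetical raw score at mid-range
```

The 35 scored items plus the single unscored item account for all 36 items of the questionnaire.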

The US English-language version of the SF-36 was used because the Canadian English-language version had not been finalized at the time the study began. However, the only difference between the 2 is the use of the word "kilometre" rather than "mile" in one item; the developers of the Canadian English-language version agree that the concept (being able to walk some distance) is the same.¹⁶ For Quebec, the Canadian French-language version was obtained from the International Quality of Life Assessment Project Group.¹⁷ The data were scored by means of the Medical Outcomes Trust scoring method.¹,³

The sample was identified through the use of all postal codes within 50 km of the study centres. This list was provided to InfoDirect (Bell Canada), which in turn provided a random sample of listed residential telephone numbers in these areas. This method was selected because it was the only one available at all centres. Sample size calculations were completed for each of 12 age and sex strata at each of the 9 centres. Because the underlying purpose of CaMos is to study osteoporosis, fracture and bone density, sample size calculations were based on these features. The prevalence of osteoporosis and fracture increases with increasing age and is believed to be greater among women than among men, so the largest strata are for older women. Because the sample was therefore not proportional to the population, the data were age- and sex-standardized to the Canadian population by simple direct standardization, with the means weighted according to the underlying population characteristics in Statistics Canada data.¹⁸,¹⁹
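Direct standardization of this kind amounts to a weighted mean of stratum means, with each stratum weighted by its share of the reference population. The following is a minimal sketch under that reading; the function name direct_standardize, the 4 strata and all numbers are hypothetical and are not CaMos or Statistics Canada values.

```python
def direct_standardize(stratum_means, population_counts):
    """Weighted mean of stratum means, weighted by reference-population size."""
    if set(stratum_means) != set(population_counts):
        raise ValueError("strata must match between sample and population")
    total = sum(population_counts.values())
    if total <= 0:
        raise ValueError("population total must be positive")
    weighted_sum = sum(
        stratum_means[stratum] * population_counts[stratum]
        for stratum in stratum_means
    )
    return weighted_sum / total


# Hypothetical mean domain scores by (sex, age group) in the sample.
sample_means = {
    ("F", "25-44"): 80.0,
    ("F", "65+"): 60.0,
    ("M", "25-44"): 85.0,
    ("M", "65+"): 65.0,
}

# Hypothetical reference-population counts for the same strata.
population = {
    ("F", "25-44"): 400,
    ("F", "65+"): 100,
    ("M", "25-44"): 400,
    ("M", "65+"): 100,
}

standardized = direct_standardize(sample_means, population)
print(standardized)
```

In a sample that oversamples older women, the unweighted mean would be pulled toward the scores of that stratum; the population weights remove that distortion.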

An introductory letter and information brochure were sent to all sampled households. Trained CaMos interviewers telephoned each selected household about 2 weeks after the introductory material was mailed. Telephone screening identified all eligible members of a household, and a random number table was used if more than one person was eligible. Eligibility was determined on the basis of predefined age, sex, region and calendar period (quarterly) to ensure that each centre obtained the necessary number of participants in each stratum and to eliminate seasonal bias. Up to 12 contact attempts were made. If the first few attempts were unsuccessful, the interviewer telephoned the household again after 2 weeks, at various times of the day, to allow for absences such as vacation.

Not all of the people who were invited agreed to participate fully in all aspects of the study. In most of these cases we collected basic information such as age, sex, smoking status and number of household members. To evaluate selection bias, we compared these data for subjects for whom we had SF-36 values with the data for those who did not participate fully. We created regression models that predicted SF-36 values from these potential predictors and applied the models to predict, through multiple imputation,²⁰ what the SF-36 values would have been for subjects who did not participate fully. We were thus able to assess whether the means we observed were likely to differ from those we would have observed had we been able to collect data from everyone. We found no differences that could have substantially changed the results reported here.
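In the multiple imputation framework, the estimates from each completed data set are combined with the standard pooling rules. The sketch below shows only that pooling step. The text does not report the regression models or the number of imputations used, so the function name pool_rubin and the 3 estimates in the usage lines are hypothetical.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool per-imputation estimates and their variances.

    Returns the pooled estimate and its total variance:
    total = within + (1 + 1/m) * between, where m is the number of
    imputations, within is the mean of the per-imputation variances and
    between is the sample variance of the per-imputation estimates.
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need at least 2 imputations with matching variances")
    pooled_estimate = mean(estimates)
    within = mean(variances)
    between = variance(estimates)  # sample variance, denominator m - 1
    total_variance = within + (1 + 1 / m) * between
    return pooled_estimate, total_variance


# Hypothetical mean scores and squared standard errors from 3 imputations.
pooled_mean, pooled_var = pool_rubin([70.0, 71.0, 72.0], [1.0, 1.0, 1.0])
print(pooled_mean, pooled_var)
```

The between-imputation term is what carries the extra uncertainty due to the missing SF-36 values; it vanishes only if every imputed data set gives the same estimate.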

Results

Data were collected between February 1996 and September 1997. Of the 80 163 households sampled, 59.0% were ineligible, primarily because the age, sex or calendar period stratum was already filled. In addition, 7.8% were invalid or wrong numbers, and 5.2% were unreachable after 12 attempts. Of the remaining households, 28.4% declined to participate, 29.6% completed a short questionnaire only, and 9423 (42.0%) went on to participate fully in the study and complete the SF-36.
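The recruitment figures above can be checked with simple arithmetic. The number of remaining households is not reported in the text; in the sketch below it is derived from the rounded percentages, so it is approximate. The variable names are illustrative.

```python
households_sampled = 80163

# Percentages of all sampled households, as reported in the text.
pct_ineligible = 59.0
pct_invalid_number = 7.8
pct_unreachable = 5.2

# Share and approximate count of households remaining after exclusions.
remaining_share = 100.0 - (pct_ineligible + pct_invalid_number + pct_unreachable)
remaining_households = households_sampled * remaining_share / 100.0

# Outcomes among the remaining households, as reported in the text.
outcome_shares = {
    "declined": 28.4,
    "short questionnaire only": 29.6,
    "full participation": 42.0,
}

full_participants = 9423
full_share = 100.0 * full_participants / remaining_households

print(round(remaining_households))  # derived, approximate count
print(round(full_share, 1))         # compare with the reported 42.0%
```

The derived share of full participants agrees with the reported 42.0% to 1 decimal place, and the 3 outcome percentages sum to 100%.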

The mean age of the sample was 62.1 (standard deviation [SD] 13.4) years. Men made up 30.6% of the sample (mean age 59.9 [SD 14.5] years, range 25–97 years), and women 69.4% (mean age 63.1 [SD 12.8] years, range 25–101 years). The age distribution was similar across the centres, as was the sex distribution (proportion of women ranged from 67.2% in Toronto to 70.6% in Quebec).

The age- and sex-standardized scores for the 8 domains and the 2 summary scales (physical component and mental component) of the SF-36 varied by age (Table 1). Several domains exhibited some degree of ceiling effect, defined as the proportion of subjects receiving the maximum possible score (76.1% in the role emotional domain and 72.7% in the role physical domain). There did not appear to be a strong floor effect, defined as the proportion of subjects receiving the minimum possible score (8.6% in the role emotional domain and 9.8% in the role physical domain).
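Ceiling and floor effects, as defined above, are simple proportions and can be computed directly from a list of domain scores. A minimal sketch follows; the function name ceiling_floor_effects and the 10 scores are hypothetical and are not study data.

```python
def ceiling_floor_effects(scores, minimum=0.0, maximum=100.0):
    """Return (ceiling %, floor %): the share of scores at the maximum
    and at the minimum possible value."""
    if not scores:
        raise ValueError("scores must not be empty")
    n = len(scores)
    at_ceiling = sum(1 for score in scores if score == maximum)
    at_floor = sum(1 for score in scores if score == minimum)
    return 100.0 * at_ceiling / n, 100.0 * at_floor / n


# Hypothetical role emotional scores for 10 respondents.
role_emotional = [100, 100, 100, 66.7, 0, 100, 33.3, 100, 100, 100]

ceiling_pct, floor_pct = ceiling_floor_effects(role_emotional)
print(ceiling_pct, floor_pct)
```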

Fig. 1 shows the Canadian, US and UK normative data for the 8 domains and the 2 summary scales (summary scores are not available for the UK data). Australian normative data are not included as they are provided by age and sex stratification only.¹¹ The Canadian norms are higher than the US norms in every domain and are higher than the UK norms in 4 domains. However, the magnitude of the differences is small, even though the confidence intervals do not overlap for several domains. For example, when comparing the Canadian and US norms, only the vitality domain (difference of 4.9) and general health perceptions domain (difference of 5.1) are close to the difference of 5 points considered to be clinically and socially meaningful.¹
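The distinction drawn above, between confidence intervals that do not overlap and a difference large enough to matter, can be made explicit. The sketch below applies the 5-point threshold cited in the text to a pair of norms. The function names, means and intervals are hypothetical and are not taken from Fig. 1.

```python
MEANINGFUL_DIFFERENCE = 5.0  # points; the threshold cited in the text


def intervals_overlap(a, b):
    """True if the closed intervals a = (low, high) and b = (low, high)
    share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


def compare_norms(mean_a, ci_a, mean_b, ci_b):
    """Summarize a between-group comparison on one SF-36 domain."""
    difference = mean_a - mean_b
    return {
        "difference": difference,
        "ci_overlap": intervals_overlap(ci_a, ci_b),
        "meaningful": abs(difference) >= MEANINGFUL_DIFFERENCE,
    }


# Hypothetical means with narrow confidence intervals from large samples.
result = compare_norms(77.0, (76.5, 77.5), 74.0, (73.4, 74.6))
print(result)
```

With large samples the intervals are narrow, so in this hypothetical case they do not overlap even though the 3-point difference falls short of the 5-point threshold.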

The age- and sex-standardized scores for Canadian men and women varied by age and by sex (Table 2 and Table 3). As in the entire sample, several domains exhibited a ceiling effect (80.3% for men and 72.1% for women in the role emotional domain), but there did not appear to be a strong floor effect (11.9% for women and 7.5% for men in the role physical domain).

The mean scores for Canadian women and men are shown in Fig. 2. Men had higher scores than women for all domains and the 2 summary scales. Although the confidence intervals did not overlap for any of the domains or summary scales, the magnitude of the difference needs to be taken into account. Only 3 of the domains (role physical, role emotional and vitality) had between-sex differences greater than 5 points, and 2 (physical functioning and bodily pain) had differences of just under 5 points.

Interpretation

The Canadian scores for the 8 domains and 2 summary scales of the SF-36 are similar to those from the United States and the United Kingdom, but there is a pattern of higher scores in the Canadian sample for all domains when compared with the US data and for 4 domains when compared with the UK data. This finding is consistent with those of other researchers¹ and underscores the importance of Canadian norms for comparative purposes. The variability of the scores by age underscores the need to use the appropriate age-specific normative data whenever possible.

The differences between countries could be due to methodologic differences rather than true differences. For example, the US normative data are not based on a random sample;¹ they are based on the responses of 2474 participants in the National Survey of Functional Status, who were selected to receive a mailed version on the basis of previous participation in a General Social Survey. The UK norms were based on the responses from 8889 people to a postal survey mailed to randomly selected households.¹³ Such differences in methodology introduce variation into the normative data used for international comparisons. Thus, a clear description of methods is a vital part of the interpretation of normative data.

There are also sex differences within the Canadian sample, with men scoring higher than women on all domains and summary scales. These results are consistent with the data from the United States, where men scored higher than women on all domains,¹ and with those from the United Kingdom, where men scored higher than women on all but 1 domain.¹³ Although the differences are not large, there is evidence that some may be clinically and socially relevant, as a 5-point difference between groups or a 5-point change over time is considered clinically relevant.¹

For normative data to be valid, they must be based on a well-defined and representative sample of the population of interest.¹ The Canadian data are based on a sample of 9423 participants from 9 centres across Canada, which included an area within a 50-km radius of the cities in order to include the rural population. The complex sampling framework further increases the likelihood that the sample is representative.

The CaMos subjects were invited to participate, and there is evidence that there may be systematic differences between those who are and those who are not willing to participate in a study.²¹ However, we found no evidence that selection bias could have changed our reported mean values substantially, as determined through multiple imputation methods.²⁰ Moreover, because both the US and the UK data are also based on voluntary participation, this limitation applies to all the studies reported so far. We therefore conclude that the normative data that we present are valid and are based on a representative sample of residents of Canada.

We thank all the participants in the Canadian Multicentre Osteoporosis Study (CaMos). We also acknowledge the early contributions of Dr. Thomas MacKenzie, who died in October 1997.

CaMos was funded by the Senior's Independence Research Program through the National Health Research and Development Program (project no. 6605-4003-OS), the Medical Research Council of Canada–Pharmaceutical Manufacturers Association of Canada (MRC–PMAC) Health Program, Merck Frosst Canada Inc., Eli Lilly Canada Inc., Procter & Gamble Pharmaceuticals Canada, Inc. and the Dairy Farmers of Canada.

Competing interests: None declared.

Investigators in the Canadian Multicentre Osteoporosis Study Research Group: McGill University, Montreal General Hospital, Montreal (coordinating centre): Alan Tenenhouse (principal investigator), Suzette Poliquin (national coordinator), Lawrence Joseph (study statistician), Lucie Blondeau (statistician), Claudie Berger (statistician), Suzanne Lefebvre (administrative assistant); University of British Columbia, Vancouver: Jerilynn C. Prior (centre director), Brian Lentle (Radiology Study consultant), Yvette Vigna (centre coordinator); University of Alberta, Edmonton: Stuart Jackson (medical physicist), Loralee Robertson (research assistant); University of Calgary: David A. Hanley (centre director), Jane Allan (centre coordinator); University of Saskatchewan, Saskatoon: Wojciech P. Olszynski (centre director), and Pat Krutzen and Jola Kedra (centre coordinators); McMaster University, Hamilton, Ont.: Jonathan D. Adachi (centre director), Laura Pickard (centre coordinator); University of Toronto: Nancy Kreiger (study epidemiologist), Timothy M. Murray (centre director), Barbara Gardner-Bray (centre coordinator); Queen's University, Kingston, Ont.: Tassos Anastassiades (centre director), and Pamela Hartman and Barbara Matthews (centre coordinators); Laval University, Sainte-Foy, Que.: Jacques P. Brown (centre director), and Nathalie Migneault-Roy and Evelyne Lejeune (centre coordinators); Dalhousie University, Halifax: Roger S. Rittmaster (centre director), Susan Kirkland (epidemiologist), Barbara Stanfield (centre coordinator); and Memorial University, St. John's: Carol Joyce (centre director), Minnie Parsons (centre coordinator)

From *the MacKenzie Health Services Research Group and †the Division of Rheumatology, Queen's University, Kingston, Ont.; ‡the Canadian Multicentre Osteoporosis Study (CaMos) National Coordinating Centre and §the CaMos Analysis Centre, McGill University, Montreal, Que.; ¶the Department of Epidemiology and Biostatistics, McGill University, Montreal, Que.; the Departments of Medicine at **Laval University, Sainte-Foy, Que., ††the University of Toronto, Toronto, Ont., ‡‡McMaster University, Hamilton, Ont., and §§the University of Calgary, Calgary, Alta.; and ¶¶Eli Lilly Canada Inc., Toronto, Ont.

This article has been peer reviewed.

Reprint requests to: Wilma M. Hopman, Director, MacKenzie Health Services Research Group, Department of Community Health and Epidemiology, 3rd floor, Abramsky Hall, Queen's University, Kingston ON K7L 3N6; hopmanw@post.queensu.ca

References

1. Ware JE Jr. SF-36 Health Survey manual and interpretation guide. Boston: The Health Institute, New England Medical Centre; 1993.
2. Ware JE Jr. The SF-36 Health Survey. In: Spilker B, editor. Quality of life and pharmaco-economics in clinical trials. 2nd ed. Philadelphia: Lippincott-Raven Publishers; 1996. p. 337-45.
3. Ware JE Jr, Kosinski M, Keller SD. SF-36 physical and mental health summary scales: a user's manual. Boston: The Health Institute, New England Medical Centre; 1994.
4. Ware JE Jr, Sherbourne CD. The MOS 36-item Short-Form Health Survey (SF-36): I. Conceptual framework and item selection. Med Care 1992;30:473-83.
5. McHorney CA, Ware JE Jr, Raczek AE. The MOS 36-Item Short Form Health Survey (SF-36): II. Psychometric and clinical tests of validity in measuring physical and mental health constructs. Med Care 1993;31:247-63.
6. McHorney CA, Ware JE Jr, Lu JFR, Sherbourne CD. The MOS 36-Item Short Form Health Survey (SF-36): III. Tests of data quality, scaling assumptions, and reliability across diverse patient groups. Med Care 1994;32:40-66.
7. Ware JE Jr, Kosinski M, Bayliss MS, McHorney CA, Rogers WH, Raczek A. Comparison of methods for the scoring and statistical analysis of SF-36 health profile and summary measures: summary of results from the Medical Outcomes Study. Med Care 1995;33(4 Suppl):AS264-79.
8. Beaton DE, Hogg-Johnson S, Bombardier C. Evaluating changes in health status: reliability and responsiveness of five generic health status measures in workers with musculoskeletal disorders. J Clin Epidemiol 1997;50:79-93.
9. Essink-Bot ML, Krabbe PFM, Bonsel GJ, Aaronson NK. An empirical comparison of four generic health status measures. Med Care 1997;35:522-37.
10. Hemingway H, Stafford M, Stansfield S, Shipley M, Marmot M. Is the SF-36 a valid measure of change in population health? Results from the Whitehall II study. BMJ 1997;315:1273-9.
11. Watson EK, Firman DW, Baade PD, Ring I. Telephone administration of the SF-36 Health Survey: validation studies and population norms for adults in Queensland. Aust N Z J Public Health 1996;20:359-63.
12. Jenkinson C, Coulter C, Wright L. Short Form 36 (SF-36) Health Survey questionnaire: normative data for adults of working age. BMJ 1993;306:1437-40.
13. Jenkinson C, Stewart-Brown S, Petersen S, Paice C. Assessment of the SF-36 version 2 in the United Kingdom. J Epidemiol Community Health 1999;53:46-50.
14. Lyons RA, Fielder H, Littlepage NC. Measuring health status with the SF-36: the need for regional norms. J Public Health Med 1995;17:46-50.
15. Mishra G, Schofield MJ. Norms for the physical and mental health component summary scores of the SF-36 for young, middle-aged and older Australian women. Qual Life Res 1998;7:215-20.
16. Wood-Dauphinee SW, Gauthier L, Gandek B, Magnan L, Pierre U. Readying a US measure of health status, the SF-36, for use in Canada. Clin Invest Med 1997;20:224-38.
17. Ware JE Jr, Keller SD, Gandek B, Brazier JE, Sullivan M, and the International Quality of Life Assessment Project Group. Evaluating translations of health status questionnaires. Methods from the IQOLA project. Int J Technol Assess Health Care 1995;11:525-51.
18. Statistics Canada provincial census data, 1991. Age, sex and marital status: the nation. Ottawa: Statistics Canada; 1993. Cat no 93-310.
19. Statistics Canada provincial census data, 1991. Profile of census metropolitan areas and census agglomeration, part A. Ottawa: Statistics Canada; 1993. Cat no 93-337.
20. Rubin D. Multiple imputation for non-response in surveys. New York: John Wiley & Sons; 1987.
21. Meltzoff J. Critical thinking about research: psychology and related fields. Washington: American Psychological Association; 1998.