Disease-specific quality of life: the Gallstone Impact Checklist

Margaret L. Russell,*# MD, PhD
Roy M. Preshaw,# MD
Rollin F. Brant,* PhD
Barry D. Bultz,~ PhD
Stacey A. Page,* MSc

Clin Invest Med 1996;19(6):453-60.


From *the Department of Community Health Sciences, #the Department of Surgery, and ~the Departments of Psychiatry and Oncology, University of Calgary, Calgary, Alta.

(Original manuscript submitted Jan. 3, 1995; received in revised form June 18, 1996; accepted June 21, 1996)

Paper reprints may be obtained from: Dr. Roy M. Preshaw, Department of Surgery, Faculty of Medicine, University of Calgary, 3330 Hospital Dr. NW, Calgary AB T2N 4N1; tel 403 220-4937; fax 403 283-4740


Abstract

Objective: To develop a disease-specific quality-of-life scale for symptomatic cholelithiasis for use in clinical trials, and to evaluate its reliability, construct validity and responsiveness.

Design: Questionnaire.

Participants: Health care professionals, patients with symptomatic cholelithiasis and their significant others.

Interventions: A 114-item questionnaire was developed from open-ended questions completed by the participants. Questions dealt with physical symptoms, activities of daily living, job performance, leisure activities, emotional factors, marital and sexual relations, support networks and financial situation. The questionnaire was administered by an interviewer to 50 subjects booked for elective cholecystectomy: frequency-importance products were calculated for each of the 114 items. A final shortened scale (the Gallstone Impact Checklist [GIC]) contained 41 items and was completed by patients with symptomatic cholelithiasis on two occasions, 4 to 6 weeks apart.

Results: The checklist requires 10 to 15 minutes to complete. Reliability of the questionnaire and its four subscales was assessed by Cronbach's alpha (overall questionnaire 0.88, pain 0.60, dyspepsia 0.73, emotional impact 0.78 and food and eating 0.84). Construct validity was established by comparison of questionnaire subscales with global ratings of physical and emotional health. Among subjects who reported a difference in their symptoms attributed to gallstones, there was a significant change in total GIC score and in each of the four subscales. Among patients who had undergone cholecystectomy, the absolute value of the effect size was 1.63.

Conclusions: The GIC has content validity and appears to be a reliable, responsive measure of within-person change for subjects with symptomatic cholelithiasis.


Résumé

Objectif : Mise au point d'une échelle de qualité de vie pour la cholélithiase symptomatique en vue d'une utilisation dans des études cliniques; évaluation de la fiabilité, la validité de construit et l'aptitude à la réponse.

Devis : Questionnaire.

Sujets : Professionnels de la santé, sujets avec cholélithiase symptomatique ainsi que leurs proches.

Interventions : Un questionnaire contenant 114 items a été mis au point d'après des questions à réponse ouverte complétées par les sujets. Les questions portaient sur les symptômes, les activités de la vie quotidienne, la performance au travail, les activités de détente, les facteurs émotionnels, les relations maritales et sexuelles, le réseau de soutien et la situation financière. Le questionnaire a été complété par entretien avec 50 sujets chez lesquels une cholécystectomie élective était prévue; les facteurs fréquence-importance ont été notés pour chacun des 114 items. Une liste de contrôle finale contenant 41 items a été mise au point et fut complétée par les sujets avec cholélithiase symptomatique en deux occasions avec un écart de 4 à 6 semaines.

Résultats : La liste de contrôle finale était complétée en 10 à 15 minutes. La fiabilité du questionnaire et de ses quatre sous-échelles a été déterminée par la mesure alpha de Cronbach (questionnaire entier 0.88, douleur 0.60, dyspepsie 0.73, impact émotionnel 0.78, alimentation 0.84). La validité de construit a été établie par comparaison des sous-échelles du questionnaire avec l'évaluation globale de la santé physique et émotionnelle. Parmi les sujets qui notaient une différence dans les symptômes attribués à la cholélithiase, un changement significatif était noté dans la cotation d'ensemble de même que la cotation de chacune des quatre sous-échelles. Parmi les sujets qui avaient subi une cholécystectomie, la valeur absolue de la taille de l'effet était de 1.63.

Conclusion : La liste de contrôle finale (nommée Gallstone Impact Checklist) possède une validité de contenu et une mesure fiable et apte à répondre aux changements intra-individuels chez les patients avec cholélithiase symptomatique.


Introduction

Elective surgery is carried out on the basis of the patient's willingness to bear the risks of an operation in order to obtain symptomatic relief and restoration of function. Such interventions are defensible only insofar as they improve patient well-being or quality of life. "Generic" quality-of-life instruments such as the Sickness Impact Profile[1] may have low content validity for a particular disease because they include items of minimal relevance to patients with the disease and omit items of particular importance to them.[2] Therefore, disease-specific tools should be used in addition to "generic" tools in the evaluation of outcomes of therapeutic procedures.[3] One obstacle to such assessments is the lack of disease-specific tools to measure outcomes.

Cholecystectomy is one of the most common surgical procedures performed in Canada,[4] and it has undergone and continues to undergo rapid technological change. Furthermore, although most patients with symptoms appear to benefit in the short term from cholecystectomy, there remain some questions as to the long-term benefits of the procedure, at least from the patient's point of view.[5,6] Outcome indicators that go beyond return to work and activities of daily living are needed for the assessment of such therapies. However, the published instruments for the assessment of cholecystectomy lack sensitivity to changes in patient status.[7]

In this study, we developed a quality-of-life questionnaire to measure clinically important changes within patients with symptomatic cholelithiasis, and we evaluated its reliability, construct validity and responsiveness.

Methods

The study protocol was reviewed and approved by the Conjoint Committee on Medical Ethics of the University of Calgary and Foothills Hospital. The questionnaire was developed with the use of previously established principles[8-10] and methods.[11] Development involved two initial phases: item generation and item selection. In a third phase, the final checklist was tested on a series of patients with symptomatic cholelithiasis to evaluate construct validity, questionnaire reliability and responsiveness to change in the patients' clinical status.

Item generation

The aim of this phase was to identify items concerning impairment of quality of life applicable to patients with symptomatic cholelithiasis. We posed open-ended questions about disease impact in eight categories: physical symptoms, activities of daily living, job or school performance, leisure activities, emotional factors, marital and sexual relations, support networks and relations, and financial situation. When questioning patients, we also addressed medical history and demographic characteristics. Questionnaires were sent to health care professionals, patients with cholelithiasis and patients' significant others. Participants completed and returned the questionnaires anonymously.

The health care professionals consisted of 10 randomly selected surgical nurses and a random sample of 32 physicians practising in the City of Calgary, stratified by specialty (family physicians, general surgeons and gastroenterologists).

Patients were recruited from surgical and gastroenterological offices and outpatient ultrasonographic facilities. The inclusion criteria were gallstones demonstrated on ultrasonographic examination or on cholecystogram and no previous surgical intervention in subjects being assessed for elective therapy. Patients, in turn, gave questionnaires to their significant others.

Data from the questionnaires were transcribed verbatim and the content was analysed with the assistance of ethnographic software (The Ethnograph, version 3.0, Qualis Research Associates, Corvallis, Ore.). Comorbid health conditions listed in patients' medical histories were coded in a hierarchical manner so that gastrointestinal complaints were given prominence if there was more than one comorbid condition. A list of 114 items likely to be important to patients with symptomatic cholelithiasis was then constructed.

Item selection

The purpose of this phase was to identify the items most important to patients with symptomatic gallstones. The 114 items generated previously were compiled into a questionnaire and pretested for clarity before being administered to another sample of patients under consideration for elective cholecystectomy. Eligible patients were those 18 years of age and older who spoke English, who were referred to a general surgeon affiliated with either the Foothills Medical Centre or the Calgary General Hospital, and who consented to be interviewed. All of the general surgeons affiliated with the two hospitals agreed to give eligible patients an invitation to participate in the study at their first office visit. In addition, all eligible patients attending the outpatient pre-admission clinics of the two hospitals were given an invitation to participate.

In the course of a structured interview with a trained interviewer, subjects were asked to describe any physical, social or emotional aspects of their lives affected by their gallstone disease. These responses were content-analysed to ensure that no new or relevant data emerged from the interviews that had not been covered in the 114 items generated previously. Subsequently, subjects stated whether each of the 114 items applied to them (yes = 1, no = 0). For items that were relevant to them, the patients rated how problematic the item was on a five-point Likert scale ("extremely" = 5 to "not at all" = 0).

For each item, we calculated the number of subjects who indicated that the item applied to them, and the median score for the subjects who indicated that the item was applicable. A weighted score for each item was determined by multiplying the median score by the number of subjects to whom the item applied. For each sex, the 40 items with the highest frequency-importance score were listed and compared, producing a list of 49 items. This roster was reduced to 41 items by retaining the 31 items common to both men and women as well as the sex-specific items with high frequency-importance ratings.
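The frequency-importance calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the item names, the ratings and the cut-off `k` are hypothetical, and the function names are ours rather than the authors'. The sketch multiplies the number of subjects to whom an item applies by the median rating among those subjects, then compares the top-ranked items for each sex. Note that the counts reported in the text are internally consistent: two lists of 40 items sharing 31 items give 40 + 40 - 31 = 49 distinct items.

```python
from statistics import median

def frequency_importance(ratings):
    """Frequency-importance product for one item.

    ratings holds one entry per subject: None if the item did not
    apply to that subject, otherwise the subject's problem rating.
    """
    applicable = [r for r in ratings if r is not None]
    if not applicable:
        return 0.0
    # number of subjects endorsing the item x median rating among them
    return len(applicable) * median(applicable)

def top_items(item_ratings, k):
    """Return the k items with the highest frequency-importance product."""
    scores = {item: frequency_importance(r) for item, r in item_ratings.items()}
    return set(sorted(scores, key=scores.get, reverse=True)[:k])

# Hypothetical ratings (not study data): three subjects of each sex, three items.
men = {
    "pain after meals": [5, 4, None],
    "bloating": [None, 2, None],
    "worry": [3, 3, 3],
}
women = {
    "pain after meals": [4, None, 5],
    "bloating": [4, 4, 4],
    "worry": [None, 1, None],
}

top_men = top_items(men, k=2)
top_women = top_items(women, k=2)
candidates = top_men | top_women   # union of the two lists (49 items in the study)
common = top_men & top_women       # items shared by both sexes (31 in the study)
```

In this toy example the union contains all three items and only "pain after meals" is common to both lists.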

Evaluation

The final 41-item checklist was designed to be self-administered. Subjects rated whether each item applied to them (yes/no), and, for each relevant item, the extent to which the item was problematic (on a visual analogue scale). Patients required 10 to 15 minutes to complete the questionnaire. Perceived health was assessed by asking, "Generally speaking, compared to other persons your age, would you rate your health as excellent, very good, good, fair or poor?" Global perceptions of quality of physical and emotional well-being were measured on a 10-point ladder scale, with 0 representing the worst one could possibly feel and 10 representing the best one could possibly feel.[12] The presence of comorbid conditions was elicited from an open-ended question as well as from yes/no responses to a checklist of gastrointestinal conditions. Further questions addressed physician visits, emergency-department visits and hospital admissions.

A third series of subjects was recruited from the offices of general surgeons and the pre-admission clinics of the Foothills Medical Centre and the Calgary General Hospital. Eligible patients had gallstones confirmed by ultrasonographic examination or cholecystogram, spoke English, were on a waiting list for elective cholecystectomy, and were not scheduled for surgery for at least 3 days after being invited to participate in the study.

Questionnaires were completed in the presence of a research assistant at baseline (before surgery) and 4 to 6 weeks later. At the second session, patients were also asked whether their gallstone problem had remained the same, improved or worsened since they had completed the baseline questionnaire.

Total scores were calculated for each subject by summing the importance scores for all items applicable to the patient. Items that were not endorsed were assigned an importance score of 0.
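As a minimal sketch of this scoring rule, the following Python function sums the importance ratings of endorsed items and counts unendorsed items as 0. The response format (a pair of an endorsement flag and a rating) and the example values are assumptions made for illustration; the source does not specify how responses were stored or the numeric range of the visual analogue scale.

```python
def total_score(responses):
    """Sum importance ratings over endorsed items.

    responses maps item name -> (applies, importance). Items that the
    subject did not endorse contribute 0, whatever rating is recorded.
    """
    return sum(importance if applies else 0.0
               for applies, importance in responses.values())

# Hypothetical responses for one subject (not study data).
subject = {
    "pain after meals": (True, 7.5),
    "bloating": (False, 3.0),   # not endorsed, so it scores 0
    "worry": (True, 2.0),
}
score = total_score(subject)
```

A subscale score could be obtained the same way by restricting `responses` to the items of that subscale.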

The reliability of the questionnaire was established by computation of Cronbach's alpha[13] for the total scale and for the four subscales, obtained from the baseline questionnaire. Temporal stability (test-retest reliability) was evaluated with the use of Spearman's correlation coefficient (rs) among subjects who reported that their gallstone condition was unchanged the second time they completed the questionnaire.
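For readers unfamiliar with Cronbach's alpha, the sketch below computes it from a subjects-by-items table using the standard formula alpha = k/(k - 1) x (1 - sum of item variances / variance of total scores), where k is the number of items. The data are hypothetical, and the use of sample variances is our assumption; the source does not describe the software or variance convention used.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for a table with one row per subject, one column per item."""
    k = len(rows[0])                      # number of items
    items = list(zip(*rows))              # transpose: one tuple per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - item_var_sum / total_var)

# Hypothetical scores (not study data).
identical_items = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]   # items agree perfectly
mixed_items = [[1, 2], [2, 1], [3, 3]]                # items agree only partly

alpha_identical = cronbach_alpha(identical_items)
alpha_mixed = cronbach_alpha(mixed_items)
```

When every item ranks the subjects identically, alpha reaches 1; partial agreement between items lowers it.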

The construct validity of the questionnaire was evaluated with the use of the Mann-Whitney U test[14] to compare change scores among subjects who reported that their clinical condition had changed the second time they completed the questionnaire with those who reported no change. The construct validity of the checklist was further established using Spearman's correlation coefficient to assess the relations between self-ratings of global physical and emotional well-being and the relevant subscales.
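The comparison of change scores between two groups can be illustrated with a small Python sketch of the Mann-Whitney U statistic. It counts, over all pairs formed by one subject from each group, how often the first group's change score exceeds the second's, with ties counting one half. The change scores are hypothetical, and the sketch stops at the U statistic; a p value would require tables or a statistics package, which is outside this illustration.

```python
def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) by direct pair counting; ties count one half."""
    u_a = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    # The two statistics always sum to the number of pairs.
    u_b = len(group_a) * len(group_b) - u_a
    return u_a, u_b

# Hypothetical change scores (follow-up minus baseline), not study data.
changed = [-150, -120, -90, 40]    # subjects reporting a change
unchanged = [-10, 0, 5]            # subjects reporting no change

u_changed, u_unchanged = mann_whitney_u(changed, unchanged)
u = min(u_changed, u_unchanged)    # the smaller value is conventionally reported
```

A U far from half the number of pairs (here 6 of 12) indicates that one group's change scores tend to lie below the other's.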

Instrument responsiveness was assessed by the calculation of effect size: the ratio of the difference between the mean score at baseline and the mean score after surgery to the standard deviation of the baseline scores.[15]
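A worked example may make the effect size definition concrete. In the Python sketch below, the sign convention (follow-up mean minus baseline mean, so that a falling score gives a negative value) and the scores themselves are assumptions for illustration; the source reports absolute values for the surgical group and does not state its convention explicitly.

```python
from statistics import mean, stdev

def effect_size(baseline, follow_up):
    """Mean change divided by the standard deviation of baseline scores."""
    return (mean(follow_up) - mean(baseline)) / stdev(baseline)

# Hypothetical total scores (not study data).
baseline = [100, 150, 200]     # mean 150, standard deviation 50
follow_up = [10, 20, 30]       # mean 20

es = effect_size(baseline, follow_up)   # (20 - 150) / 50 = -2.6
```

Because the denominator is the spread of scores at baseline, the result expresses the change in units of baseline standard deviations.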

Results

Item generation

In the item generation phase, 105 health care professionals, patients and significant others responded to the invitation to participate. Response rates were variable: 5/10 nurses (50%), 10/20 family physicians (50%), 5/6 general surgeons (83%) and 3/6 gastroenterologists (50%). A total of 277 invitations for patients to participate were distributed to physicians' offices and ultrasonographic facilities, of which 57 were returned by eligible patients, representing a minimum response rate of 21%. The significant others of 25 patients returned questionnaires, a minimum response rate of 44% (assuming that all patients had a significant other). The patients' demographic and selected health-status indicators are given in Table 1.

As described in Methods, 114 items were generated from the data provided in this phase.

Item selection

A total of 246 invitations to participate were distributed by physicians' offices and pre-operative clinics. Of these, 77 (31%) were returned by patients, of whom 50 were eligible. Most of the excluded patients had already had surgery. The characteristics of the subjects are summarized in Table 1. The vast majority (90%) rated their perceived health as good, very good or excellent. Content analysis of the answers to open-ended questions from the item selection respondents revealed no new or relevant information when compared with the 114 items produced during the initial item-generation phase. The frequency and importance scores of the final list of 41 items are given in Appendix 1. Conceptually, the content of the 41 items appears to address four areas: pain, dyspepsia, emotional impact and food and eating.

The analysis was repeated after excluding patients who reported a comorbid gastrointestinal condition; the final list of 41 items was unchanged.

Evaluation

Description of subjects

Sixty-seven subjects completed the baseline questionnaire. Forty of these patients were recruited from the pre-operative clinics held at the two participating hospitals and 27 were recruited through the offices of the participating physicians. Relevant aspects of the subjects are presented in Table 1.

When asked to rate their health generally, most people indicated that their health was good to excellent (62/67, 90%). The median score for global self-rated physical well-being was 7 and for global self-rated emotional well-being was 8.

Fifty-seven subjects (7 men and 50 women) participated in the follow-up. People who completed questionnaires on both occasions were more likely to have postsecondary education than those who did not (p < 0.05); however, there were no differences in age, sex, baseline self-rated physical or emotional well-being, gastrointestinal comorbid conditions, baseline checklist score or emergency-department visits for gallstone symptoms between those who returned and those who did not. At follow-up, 39 persons had undergone elective laparoscopic cholecystectomy; none of these subjects required an open procedure.

Reliability

Cronbach's alpha for the 41-item checklist was 0.88. The questionnaire contains four subscales based on the areas that emerged during item selection: pain, dyspepsia, emotional impact and food and eating. The reliability of each subscale was demonstrated by Cronbach's alpha, which was 0.60 for the pain subscale, 0.73 for dyspepsia, 0.78 for emotional impact and 0.84 for food and eating.

Temporal stability (test-retest reliability) was evaluated using Spearman's correlation coefficient among subjects who, at the second completion of the questionnaire, reported that their gallstone condition was unchanged. Although the correlation coefficient was 0.93, as can be seen in Table 2, the median total questionnaire scores of these nine patients varied during the interval. Three of the four subscales -- dyspepsia (rs = 0.85), emotional impact (rs = 0.69) and food and eating (rs = 0.77) -- also showed moderate to high correlations. Despite this, the emotional impact subscore, like the total questionnaire score, also varied over time (Table 2). The correlation coefficient for the pain subscore was low (rs = 0.40).
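A high rank correlation alongside shifting medians is not contradictory, because Spearman's coefficient depends only on the ordering of subjects, not on the level of their scores. The Python sketch below illustrates this with hypothetical data in which every subject's score falls by the same amount: the ranks are preserved, so rs equals 1, yet the median drops. The helper names and the data are ours, not the authors'.

```python
from math import sqrt
from statistics import mean, median

def average_ranks(values):
    """Rank values from 1 upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rs: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / sqrt(sxx * syy)

# Hypothetical scores (not study data): every subject drops by 50 points.
first = [100, 150, 90, 200, 120]
second = [score - 50 for score in first]

rs = spearman(first, second)
median_shift = median(second) - median(first)
```

Here the subjects keep exactly the same order on both occasions while the median falls by 50 points, which is the pattern that a rank-based stability coefficient cannot detect.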

Validity

The changes in total and subscale scores between the first and second questionnaire administrations were compared between subjects who did and did not report a difference in their gallstone condition, with the use of the Mann-Whitney U test (Table 3). Significant differences were found for the total score (p = 0.01) and each of the four subscales: pain (p = 0.002), dyspepsia (p = 0.007), emotional impact (p = 0.03) and food and eating (p = 0.003).

Spearman rank correlation coefficients for self-rated global physical well-being were -0.26 for the pain subscale (p = 0.006), -0.52 for dyspepsia (p < 0.001) and -0.12 for food and eating (p = 0.21). For self-rated global emotional well-being and emotional impact, the correlation was -0.44. The coefficient for total questionnaire score and global self-rated physical well-being was -0.42 (p < 0.001).

Among the 42 subjects who reported a change for the better at follow-up, the median total questionnaire score was 5.3 at post-test compared with 188.7 at baseline. For those whose condition worsened (n = 7), the median score was 214.1 at follow-up versus 176.8 at study entry. Nine patients reported no change in their condition during the interval: the median post-test score was 48.9, in contrast to 104.9 at baseline. Of the 42 who had improved at post-test, 39 had undergone surgery; none of the subjects who declared that they were worse (n = 7) or those who were unchanged (n = 9) had undergone cholecystectomy. Among persons who stated that their gallstone problem had changed for the better or the worse at the post-test, the effect size was -2.07, compared with +0.52 for those reporting no change in their condition.

Responsiveness

Among patients who had undergone surgery, the absolute value of the effect size was 1.63, in contrast to 0.04 for those who had not had surgery. Among patients who had had a cholecystectomy but also reported gastrointestinal comorbid conditions, the effect size was 1.22.

Discussion

We identified items considered important in the day-to-day lives of adults with symptomatic cholelithiasis, with or without comorbid conditions, and used these to develop a disease-specific quality-of-life questionnaire, the Gallstone Impact Checklist (GIC).

The content validity of the GIC was ensured by the methods we used in instrument development: soliciting opinions of clinical experts and of patients' significant others, and most important, including items demonstrated to be important to patients themselves. We began our investigation with the assumption that the disease could broadly affect eight areas of life: physical symptoms, activities of daily living, job or school performance, leisure activities, emotional factors, marital and sexual relations, support networks and relations, and financial situation. In the item-generation phase, responses indicated that, for some patients, the disease did touch on all eight areas. However, the items of greatest relevance to subjects pertained to only three domains: physical symptoms, emotional impact and activities of daily living, specifically dietary limitations. Although cholelithiasis is a chronic disease, the manifestations are episodic and usually brief. Thus, the likelihood of exacerbations being lengthy enough to affect job or school performance, leisure activities, marital and sexual relations, or social or financial situation may be low for most people.

Selection bias may have affected our study, because many subjects with gallstone disease were invited to participate but refused. For example, these subjects may have had more severe symptoms and been less likely to cooperate because of the time commitment required. We had, however, no ethical approval to attempt follow-up of subjects who refused to participate.

The scale was designed to be responsive to within-subject change. We attempted to evaluate the scale's responsiveness in the short term by having 67 subjects with symptomatic cholelithiasis complete the baseline questionnaire. Fifty-seven subjects were available for a second evaluation within 4 to 6 weeks. On the basis of the follow-up, we found this instrument to be valid and responsive.

Although the small number of patients with unchanged status did not permit an accurate assessment of the temporal stability of the instrument, the correlation coefficients for three of the subscales were consistent with stability in rankings. However, the changes in medians indicated the possibility of some temporal drift.

We have shown evidence of construct validity for the GIC. Because gallstone attacks are episodic, but other dyspepsia may be more enduring, one would anticipate a stronger relation between global physical well-being and the dyspepsia subscale than the pain subscale. One would also anticipate an association between self-rated global emotional well-being and the emotional impact subscale. Indeed, the only meaningful correlations observed were between self-rated global physical well-being and dyspepsia, and between self-rated emotional well-being and emotional impact. Evidence of the instrument's validity as a measure of change is further established by comparisons of patient reports of change in their gallstone problem and the effect size. The group of patients who stated that their condition had changed had a greater effect size than did the group who did not. Furthermore, the direction of the difference in scores was consistent with the direction of the reported change. There is further evidence of validity in that symptoms of dyspepsia may be due to gastrointestinal comorbid conditions. Thus, the effect size of treatment should be less among those with gastrointestinal comorbid conditions than among those without them -- and this was the pattern that was observed. Future studies could address gastrointestinal comorbid conditions and dyspepsia among people with gallstones by studying a group of patients with gastrointestinal complaints unrelated to gallstones and removing symptoms common to this group and to those with gallstones, or by comparing the GIC with generic quality-of-life scales or with a scale specific to other gastrointestinal conditions, such as inflammatory bowel disease.

An effect size is a standardized measure of change in a group, or a difference in changes between two groups. Cohen[16] defined an effect size of 0.20 as small, one of 0.50 as moderate, and one of 0.80 or greater as large. Thus, an effect size of 0.80 represents a change of at least four fifths of a standard deviation of a baseline measure. According to Cohen's criteria, the GIC is an extremely responsive measure. This may best be appreciated when considering sample size for future studies: an instrument with an effect size of 1 has been estimated to require only 11 paired observations in a study of related groups for Type I error of 0.05 and Type II error of 0.10.[10]
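The figure of 11 paired observations can be reproduced with a common normal-approximation formula, n = ((z for alpha + z for beta) / effect size) squared, rounded up. The Python sketch below is only one plausible reconstruction: it assumes a two-sided Type I error, and it assumes the effect size is expressed in units of the standard deviation of within-pair differences. The cited reference may have used a different derivation, and the approximation is crude at such small sample sizes.

```python
from math import ceil
from statistics import NormalDist

def paired_sample_size(effect_size, alpha=0.05, beta=0.10):
    """Approximate number of pairs for a paired comparison.

    Assumes a two-sided test at level alpha, power 1 - beta, and a
    normal approximation to the test statistic.
    """
    z = NormalDist()                       # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)     # about 1.96 for alpha = 0.05
    z_beta = z.inv_cdf(1 - beta)           # about 1.28 for beta = 0.10
    return ceil(((z_alpha + z_beta) / effect_size) ** 2)

n_for_unit_effect = paired_sample_size(1.0)
n_for_observed_effect = paired_sample_size(1.63)
```

Under these assumptions an effect size of 1 gives (1.96 + 1.28)^2, or about 10.5, which rounds up to 11 pairs, in line with the estimate quoted above.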

As anticipated, the effect size decreased somewhat when only people with comorbid conditions were included in the analysis, since the symptoms experienced as a result of these conditions would be expected to persist after cholecystectomy. However, despite the slight decrease in magnitude of effect size, the GIC appears to continue to be responsive in this situation.

Only 15% to 20% of subjects assessed at each step were male. Thus, the GIC may be more relevant to women than to men, and further evaluation of the questionnaire among men is needed. We have described only the short-term results after cholecystectomy. Since the long-term outcomes of the procedure are of much more importance, future studies should extend follow-up over several years and should assess perceived change in status as well as use of health care services.

This preliminary evaluation of the GIC suggests that it is a reliable, valid and responsive instrument for the detection of within-person change in the clinical status of patients with symptomatic cholelithiasis. We believe that the GIC is an appropriate outcome measure for clinical trials comparing therapies for this condition.

Acknowledgements

Surgeons who recruited patients for the study included Drs. Robert E. Pow, James Nixon, Francis Sutherland, Janice Pasieka, Walley Temple, Robert Lui, William Buie, Frank Duff, Oscar Retzer, Hugh Gallie, John B. Kortbeek, Robert H. Mulloy, Bruce Rothwell, John Heine, Reg Harse and Roy M. Preshaw. We thank Mary Lang, Chris Reese, the staffs of the ultrasonographic and pre-operative assessment clinics of Foothills Hospital and Calgary General Hospital, and the radiologists and their staff at Elliot, Fong, Wallace and Associates for additional assistance in patient recruitment. We are grateful to Louise Parsons and to two anonymous reviewers for constructive comments on the manuscript.

This study was supported by grants-in-aid from Foothills Hospital and Calgary General Hospital.

References

  1. Bergner M, Bobbitt RA, Carter WB, Gilson BS. The Sickness Impact Profile: development and final revision of a health status measure. Med Care 1981;19:787-805.
  2. Patrick DL, Deyo RA. Generic and disease-specific health status measures. Med Care 1989;27(3 suppl):S217-32.
  3. Jonsson B. Assessment of quality of life in chronic disease. Acta Paediatr Scand Suppl 1987;337:164-9.
  4. Statistics Canada. Surgical procedures and treatments 1992-93. Health Rep 1995;82-217:26.
  5. Bates T, Mercer JC, Harrison M. Symptomatic gallstone disease before and after cholecystectomy. Gut 1984;25:A579-80.
  6. Ros E, Zambon D. Post cholecystectomy symptoms: a prospective study of gallstone patients before and two years after surgery. Gut 1987;28:1500-4.
  7. Cleary PD, Greenfield S, McNeil BJ. Assessing quality of life after surgery. Controlled Clin Trials 1991;12:189S-203S.
  8. Kirshner B, Guyatt GH. A methodologic framework for assessing health indices. J Chronic Dis 1985;38:27-36.
  9. Bombardier C, Tugwell P. A methodological framework to develop and select indices for clinical trials: statistical and judgmental approaches. J Rheumatol 1982;9(5):753-7.
  10. Guyatt GH, Walter S, Norman G. Measuring change over time: assessing the usefulness of evaluative instruments. J Chronic Dis 1987;40(2):171-8.
  11. Guyatt GH, Bombardier C, Tugwell PX. Measuring disease-specific quality of life in clinical trials. Can Med Assoc J 1986;134:889-95.
  12. McDowell I, Newell C. Measuring health: a guide to rating scales and questionnaires. New York: Oxford University Press, 1987:215-7.
  13. DeVellis RF. Scale development: theory and applications. Newbury Park (CA): Sage Publications, 1991:25-32.
  14. Siegel S. Nonparametric statistics for the behavioral sciences. New York: McGraw-Hill, 1956:116-27.
  15. Kazis LE, Anderson JJ, Meenan RF. Effect sizes for interpreting changes in health status. Med Care 1989;27:S178-89.
  16. Cohen J. Statistical power analysis for the behavioral sciences. New York: Academic Press, 1977:8.

| CIM: December 1996 / MCE: décembre 1996 |