«Edited by ANNE MASON Research Fellow, Centre for Health Economics University of York and ADRIAN TOWSE Director, Office of Health Economics Radcliffe ...»
It seems as though mainstream health economics continues to endorse the view that ‘utilities’ constitute the only legitimate form of adjustment weight in QALYs. This view appears to arise through vestigial attachment to the need for a methodological foundation grounded in economics or at any rate in a contiguous discipline. This in turn gives rise to the questionable position in which any ‘utility’-based method is deemed acceptable, despite the manifest failure of the classical theory that provides its source DNA and from which these derivatives are formed. This is a weak and fundamentally indefensible position from which to operate, reminiscent of the last throes of the Flat Earth Society in the early days of space flight. Of course we can always find a justification for why the theory does not quite work as a model of real-world behaviours – we may simply regret man’s inability to conform to expected utility theory; or we may construct experiments that test alternative explanations for the behaviours that violate the classical theory. When challenged over the friable nature of the theory to which health economics is apparently wedded, the response seems to be that at least there is a theoretical underpinning, unlike the position in other disciplines that deal with similar issues and where theory is held to be absent.
THE WAY AHEAD
As practitioners in the field of health economics we can choose between two alternatives. We might take the view that social preferences needed for computing QALYs must be expressed in terms of utilities that are derived from a choice-based methodology linked to relevant theory. In this situation, it is likely that the method by which utilities are generated would simply follow as a logical progression from theory into practice. This fortunate state of affairs would be further complemented by a high degree of consensus in academic circles about the theoretical basis of such measurement and the practical ways of achieving it. Furthermore, novel techniques could be empirically tested against that existing standard as a mechanism for determining their suitability as substitutes. Alternatively, we might consider that social preferences may be expressed as utilities, but that this is not an absolute requirement. The value associated with a health state may be determined by any one of a larger set of methods, the only constraint being that it must produce a single index value on a scale that assigns values of 0 and 1 to ‘dead’ and ‘full health’ respectively.
Both alternatives leave us well short of a sustainable position. Since different procedures for preference measurement tend to generate different values for a given health state, it will require an extraordinary piece of good fortune to come up with a plausible explanation or a unifying theory that allows for transformation between competing value sets. It could be that a retreat into an exclusive utility-based approach has some merit, since this would reduce the range of candidate methods. However, it would still leave us some way short of an accepted (or acceptable) common method.
In the absence of a recognised standard, multiple measurement methods must be tolerated as having some claim to legitimacy. The occasional happy accidental convergence of results offers some comfort that perhaps the picture is less complicated than others would have us believe. Widely differing results give further support for the view that different methods necessarily yield divergent results. The usual response to such a multiplicity of choice is to take refuge in sensitivity analysis rather than attack the problem head on. Does it make any difference to the conclusions if we apply one set of values/utilities or another? If quality adjustment is such a problematic task, then, despite the theoretical niceties, is it imperative that it is always undertaken as part of any cost-effectiveness analysis? Recent attention given to this question suggests that in many studies, quality adjustment had relatively little effect on the final cost-effectiveness ratio. Its impact was important in moving ratios across a $50 000/QALY cost-effectiveness threshold in only some 20% of the investigated cases (Chapman et al., 2004). Where quality adjustment was indicated, then low-level investment in collecting preference data – for example using ad hoc adjustments – may be sufficient. Accepting the luxury of this approach leads to the inescapable conclusion that the choice of preference-elicitation method is an irrelevancy, and that ultimately any number will do. One way of addressing this decline into darkness would be to revisit the requirements of the reference case. Were NICE technical guidance to stipulate that all cost–utility analysis should be based upon a single generic instrument scored using a standard set of weights (perhaps regardless of their pedigree), then many of the problems associated with variability in quality-of-life data would be overcome. At least then the variability in reporting health outcomes could be contained.
True, where one door opens another closes and it would have to be recognised that some clinical studies would lack data based on that standard.
But that is precisely the situation that holds today.
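The sensitivity-analysis question posed above (does it make any difference to the conclusions if we apply one set of values or another?) can be illustrated with a minimal sketch. Everything in it is hypothetical and chosen only for illustration: the two treatment arms, their costs and survival, the health-state labels and both value sets are invented, and each arm is assumed to occupy a single health state throughout. Only the $50 000/QALY threshold is taken from the text.

```python
from dataclasses import dataclass

THRESHOLD = 50_000.0  # $ per QALY, the threshold quoted in the text


@dataclass(frozen=True)
class Arm:
    cost: float         # total cost per patient in $ (hypothetical)
    life_years: float   # unadjusted survival in years (hypothetical)
    health_state: str   # single state occupied throughout (simplifying assumption)


def qalys(arm, weights):
    """Life years multiplied by a weight on the 0 (dead) to 1 (full health) scale."""
    return arm.life_years * weights[arm.health_state]


def icer(new, old, weights=None):
    """Incremental cost per life year (weights=None) or per QALY."""
    d_cost = new.cost - old.cost
    if weights is None:
        d_effect = new.life_years - old.life_years
    else:
        d_effect = qalys(new, weights) - qalys(old, weights)
    if d_effect <= 0:
        raise ValueError("no health gain, so the ratio is not meaningful")
    return d_cost / d_effect


def sensitivity(new, old, value_sets, threshold=THRESHOLD):
    """Return {label: (ratio, ratio is at or below threshold)} for each value set."""
    ratios = {"unadjusted life years": icer(new, old)}
    for name, weights in value_sets.items():
        ratios[name] = icer(new, old, weights)
    return {name: (r, r <= threshold) for name, r in ratios.items()}


# Hypothetical example: the new treatment adds a year of life in a worse state.
old_arm = Arm(cost=10_000.0, life_years=4.0, health_state="mild")
new_arm = Arm(cost=40_000.0, life_years=5.0, health_state="moderate")

# Two invented value sets that disagree about how bad 'mild' is.
value_sets = {
    "value set A": {"mild": 0.9, "moderate": 0.8},
    "value set B": {"mild": 0.8, "moderate": 0.8},
}

for label, (ratio, acceptable) in sensitivity(new_arm, old_arm, value_sets).items():
    verdict = "within" if acceptable else "above"
    print(f"{label}: ${ratio:,.0f} per unit of effect ({verdict} threshold)")
```

With these invented figures the unadjusted ratio and value set B fall within the threshold, while value set A does not. That is the kind of case in which the choice of weights changes the conclusion, rather than the majority of cases reported by Chapman et al. in which it does not.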
So for now we are faced with a real world that remains free of a consensus over the means by which social preferences of the population should be established. One consequence of this laissez-faire approach is that it permits the use of utility weights that only remotely connect with the specifications demanded for NICE appraisals. At this point, what seems to be the narrow issue about how to measure social preferences assumes a broader and more fundamental importance. The worldly pragmatists argue that decisions about the cost-effectiveness of new treatments have to be made, that we cannot wait for perfect measures or analytical tools, that uncertainty is endemic, that qualms about quality are not restricted to quality-of-life data, that NICE’s determinations are not based solely on the cost-effectiveness evidence. All these arguments carry some weight of course, but they need to be seen from the perspective of society as a whole, not just from the vantage point of health economics or the scientific research community. Key to the long-term sustainability of NICE-type moderation of new health technologies is the extent to which the public remains convinced of the probity of the process and its outcomes. Decisions that appear to rely heavily on technically opaque methods offer natural targets for those disadvantaged by those decisions.
It is too easy to dismiss such reactions as being the expected consequences from the usual suspects. Those close to the quality-of-life technology and its application in cost–utility analysis have a responsibility to act in ways that are compatible with the discharge of their roles as both scientists and citizens. To ignore or conceal issues that bear on the process of analysis is to risk long-term consequences that could disadvantage us all.
Nearly half a century has passed since Bush, Torrance, Rosser, Williams and others first took up the challenge of measuring health outcomes. In the intervening period, the research landscape has profoundly changed with a complexity today that might have been difficult to envisage in those early days. The academic discipline of health economics was spawned during this time, and with it the emergence of cost–utility analysis in health. Despite some 25 years of sustained enquiry this central question of how to value health in QALY calculations remains both topical and largely unresolved. Perhaps now would be a good time to free ourselves from the self-imposed straitjacket of utility.
REFERENCES
Bergner M, Bobbitt RA, Kressel S et al. (1976) The Sickness Impact Profile: conceptual formulation and methodology for the development of a health status measure. International Journal of Health Services 6: 393–415.
Brazier J, Deverill M, Green C et al. (1999) A review of the use of health status measures in economic evaluation. Health Technology Assessment 3: 1–164.
Chapman RH, Berger M, Weinstein MC et al. (2004) When does quality-adjusting life years matter in cost-effectiveness analysis? Health Economics 13: 429–36.
Churchman CW, Ackoff RL and Arnoff E (1957) Introduction to Operations Research. New York: Wiley.
Culyer AJ, Lavers RJ and Williams A (1972) Health indicators. In: Shonfield A and Shaw S (eds) Social Indicators and Social Policy. London: Heinemann, pp. 94–118.
Drummond MF, O’Brien B, Stoddart GL and Torrance GW (2005) Methods for the Economic Evaluation of Health Care Programmes (3e). Oxford: Oxford University Press.
Fanshel S and Bush JW (1970) A health-status index and its application to health-services outcomes. Operations Research 18: 1021–66.
Gold MR, Patrick DL, Torrance GW et al. (1996) Identifying and valuing outcomes. In: Gold MR, Russell LB and Weinstein MC (eds) Cost-effectiveness in Health and Medicine. New York: Oxford University Press, pp. 82–134.
Grogono AW and Woodgate DJ (1971) Index for measuring health. Lancet 2:
Gudex C, Kind P, van Dalen H et al. (1993) Comparing Scaling Methods for Health State Valuations: Rosser revisited. CHE Discussion Paper No.107. York: University of York.
Hunt SM, McEwen J and McKenna SP (1985) Measuring health status: a new tool for clinicians and epidemiologists. Journal of the Royal College of General Practitioners 35: 185–8.
Karnofsky DA, Abelmann WH, Craver LF and Burchenal JH (1948) The use of nitrogen mustards in the palliative treatment of carcinoma. Cancer 1: 634–56.
Kind P (2005) The West Lothian Question – should Scottish ‘voters’ be included when valuing EQ-5D health states in England? Proceedings of the EuroQoL Group Scientific 22nd Annual Plenary Meeting, Oslo, 8–10 September 2005.
Kind P and Rosser RM (1980) Death and dying: scaling of death for health status indices. In: Barber B, Gremy F, Überla K (eds) Lecture Notes on Medical Informatics 5: 28–36. Berlin: Springer-Verlag.
Kind P, Rosser RM and Williams A (1982) Valuation of quality of life: some psychometric evidence. In: Jones-Lee MW (ed.) The Value of Life and Safety. Collection of Papers presented at the Geneva Conference on The Value of Life and Safety held at the University of Geneva, 30 March to 1 April, 1981. Amsterdam: North-Holland Publishing Company, pp. 159–70.
National Institute for Clinical Excellence (2001) Guidance for Manufacturers and Sponsors: Technology Appraisals Process Series No. 5. London: National Institute for Clinical Excellence.
National Institute for Clinical Excellence (2004) Guide to the Methods of Technology Appraisal. London: National Institute for Clinical Excellence.
Patrick DL, Bush JW and Chen MM (1973) Methods for measuring levels of well-being for a health status index. Health Services Research 8: 228–45.
Pliskin JS, Shepard DS and Weinstein MC (1980) Utility functions for life years and health status. Operations Research 28: 206–24.
Read JL, Quinn RJ, Berwick DM, Fineberg HV and Weinstein MC (1984) Preferences for health outcomes. Comparison of assessment methods. Medical Decision Making 4: 315–29.
Rosser RM and Kind P (1978) A scale of valuations of states of illness: is there a social consensus? International Journal of Epidemiology 7: 347–58.
Rosser RM and Watts VC (1972) The measurement of hospital output. International Journal of Epidemiology 1: 361–8.
Sanders BS (1964) Measuring community health levels. American Journal of Public Health 54: 1063–70.
Sellin T and Wolfgang ME (1964) The Measurement of Delinquency. New York: Wiley.
Sintonen H (1981) An approach to measuring and valuing health states. Social Science and Medicine 15: 55–65.
Sullivan DF (1966) Conceptual Problems in Developing an Index of Health. Washington: Office of Health Statistics Analysis, Department of Health, Education and Welfare.
Thurstone LL (1927) The method of paired comparisons for social values. Journal of Abnormal Psychology 21: 384–400.
Torrance GW (1986) Measurement of health state utilities for economic appraisal: a review. Journal of Health Economics 5: 1–30.
Torrance GW, Boyle MH and Horwood SP (1982) Application of multiattribute utility theory to measure social preferences for health states. Operations Research 30:
Ware JE and Sherbourne CD (1992) The MOS 36-item short form health status survey (SF-36) I: Conceptual framework and item selection. Medical Care 30:
Weinstein MC and Stason WB (1977) Foundations of cost-effectiveness analysis for health and medical practices. New England Journal of Medicine 296: 716–21.
Williams A (1974) Measuring the effectiveness of health care systems. British Journal of Preventive and Social Medicine 28: 196–202.
CHAPTER 11 Discussion of Paul Kind’s paper: ‘Putting the “Q” in QALYs’... Ben van Hout