Authors: Dalkin, S.M.,* Bate, A.,* Fletcher, A., Anderson, R., Baker, R.M., Donaldson, C., Fenocchi, L., Hibberd, V., Kumar, M.B., Redgate, S., Shenton, F., Westhorp, G., Wong, G., Wright, J.
* These authors share equal attribution of authorship
Cite as: Dalkin, S.M., Bate, A., Fletcher, A., Anderson, R., Baker, R.M., Donaldson, C., Fenocchi, L., Hibberd, V., Kumar, M.B., Redgate, S., Shenton, F., Westhorp, G., Wong, G., Wright, J. ‘Guidance for Realist Economic Evaluation Methods: Version 3’. Northumbria University Knowledge Bank: [insert web address; Date Accessed]
Public and Patient Involvement and Engagement: Victoria Bartle, Vivienne Hibberd, Susan Mountain, Margaret Ogden, Viola Rook.
International Interdisciplinary Advisory Group: Professor Hareth Al-Janabi, Professor Andy Briggs, Professor Heather Brown, Professor Jo Coast, Professor Neil Craig, Professor Simon Feeny, Dr Brynne Gilmore, Professor Joanne Greenhalgh, Dr Robert Heggie, Dr Rebecca Hunter, Professor Bruno Marchal, Dr Tomasina Oh, Professor Catherine Pitt, Dr Catherine Purcell, Dr Jack Rendall, Professor Tracey Sach, Dr Katie Shearn, Emeritus Professor Nick Tilley.
Acknowledgements: The Realist Economic Evaluation Methods (REEM) team would like to thank the study’s International Interdisciplinary Advisory Group (IIAG) for their invaluable contributions to the REEM workshops, Delphi panel (including consensus meeting), and comments on the draft guidance. We would also like to acknowledge the contributions of our Public and Patient Involvement and Engagement (PPIE) group. Finally, we would also like to extend our thanks to the following people: Professor Richard Byng, Dr Julian King, Dr Aikaterini Papadopoulou, and Professor Jo Rycroft Malone.
Selected research outputs from the National Institute for Health and Care Research (NIHR) funded study: ‘Developing Realist Economic Evaluation Methods (REEM) and Guidance to Evaluate the Effectiveness, Costs, and Benefits of Complex Interventions (NIHR135102)’ will be disseminated to the public and end-users through our online platform – the ‘Knowledge Bank’ – on Northumbria University’s website, at:
A benefit of using the Knowledge Bank is that the public and end-users can download the materials free of charge for non-commercial use.
This glossary has been developed for use alongside the GUIDANCE FOR REALIST ECONOMIC EVALUATION and is bound by the same terms and conditions of use. These are:
- You grant permission to be followed up by Northumbria University to provide information to support evidencing the impact of the REEM research, its outputs, and supporting materials, including this guidance.
- You will not use the REEM research outputs and supporting materials including this guidance for commercial purposes.
- You will refer to the REEM research outputs and supporting materials including this guidance by its title, as stated on the Knowledge Bank platform, when using and/or sharing.
- You will reference and credit Northumbria University and the creator(s) of the REEM research and supporting materials including this guidance when using and/or sharing.
The definitions in this document have been drawn or partially adapted from a range of sources, including the University of York (York Health Economics Consortium, 2016) glossary of health economic terms, and the RAMESES II quality and reporting standards for realist evaluations (Wong et al., 2016). The terms in this glossary have been selected to reflect the terms used in the REE guidance and should be read in conjunction with the guidance.
| Benefits | See Outcomes. |
| Causality | The process of how things happen or are caused. It is the relationship between cause and effect. Three main types of causation are of relevance to REE: Successionist causation looks at the variations between specific variables, uses the logic X → Y, and thrives in experimental and survey-based research. This is the dominant form of causation used in economics. Configurational causation locates causation as arising from a combination of conditions or circumstances which are part of, or affect, programmes or their outcomes, with different combinations leading to different outcomes. Researchers seek to find clusters of cases with the same sets of conditions and a common outcome. Generative causation uses the logic that underlying invisible (or hidden) mechanisms give rise to measurable patterns of outcomes. Causal explanation involves producing and ‘testing’ theories about these mechanisms and the conditions in which they work. This is the basis for realist evaluation. |
| Comparator | Comparators are used to understand the difference that a new way of doing things can make; for example, comparing a new intervention to ‘standard care’ to see if the new intervention provides more or less benefit. In REE, a comparator offers a clear and plausible alternative option or course of action to the intervention under evaluation, against which the intervention can be compared. Comparators can be drawn across interventions, for example: a different intervention, standard care, the next best alternative, or no intervention; or within interventions, for example: different sites, groups, settings, or ways of delivering interventions. See also: Counterfactual. |
| Complex systems | A complex system is a system composed of many components that interact with one another. They are systems that are intrinsically difficult to model due to the dependencies, competitions, relationships or other types of interactions between their parts, or between a given system and its environment. Complex systems are distinct from complicated systems, which have many components but are predictable. Many health and social care interventions are complex and can be explored through complex systems modelling (see also: Modelling), the application of mathematical, statistical and computational techniques to generate insight into how such systems function. |
| Context | Context is the surrounding information, circumstances, or background that helps you understand the meaning of a word, idea, or event. In REE, this is expanded to mean features of the physical and social world that determine whether and which mechanisms operate (Greenhalgh et al., 2017f; Greenhalgh & Manzano, 2022). Context must not be confused with locality; it can include cultural norms, economic conditions, existing public policy and so on. Anything may function as context and it is only possible to work out what is functioning as context when it is linked to a mechanism and outcome, i.e. when the entire context-mechanism-outcome configuration is considered as a whole. See also: CMO configuration. |
| Context-Mechanism-Outcome Configuration (CMOC) | A context-mechanism-outcome configuration (CMOC) is a framework from realist evaluation that explains how an intervention ‘works’ by identifying the specific context (background circumstances), the mechanism (causal forces) and the resulting outcome (consequences). This is the way that most causal explanations are structured in realist evaluations. In its simplest form, a CMOC sets out what is functioning as context to ‘activate’ or ‘trigger’ a mechanism, which in turn causes an outcome. The final ‘C’ stands for configuration, which refers to the necessary linkage of the parts to make a whole explanatory statement. Realist programme theories contain CMOCs that explain causation of outcomes from the intervention. See also: Programme theory; Causality. |
| Cost | In economic evaluation, costs reflect the value of the resource inputs used in the delivery of and participation in the intervention, and the subsequent resources saved (or used) due to the intervention. See also: Resources. Costs can include: costs to the health and social care provider; costs to other sectors, or households; costs that remain the same irrespective of how much of the intervention is delivered or how many participate in the intervention (e.g. buildings, salaries); and costs that change depending on how much of the intervention is delivered or how many participate in the intervention (e.g. materials and consumables). |
| Counterfactual | A scenario describing what would have happened in the absence of a specific event or policy change; a ‘what if’ analysis, used to estimate the causal impact of an intervention by comparing its outcome with what would have likely occurred under different circumstances. In economic evaluation, a counterfactual helps to isolate the effect of the intervention by comparing its outcomes to what would have occurred anyway. Statistical counterfactuals are used to estimate the effect of a treatment or intervention by comparing observed outcomes with hypothetical outcomes under a different condition (usually in controlled settings). Mental counterfactuals are theoretical ‘what if’ thoughts about what would have happened. |
| Credible causal explanations | A clear and credible explanation of how an intervention works. This might often be the simplest explanation or the most well-evidenced. |
| Decision Problem | The decision problem outlines the specific question that the evaluation seeks to answer. This is often framed in terms of defining the background against which the decision needs to be made, the population or group of people who will be affected, the proposed intervention and any alternatives, and the desired outcomes. The decision problem guides the scope and boundaries for the evaluation and the collection of evidence needed for the evaluation. |
| Discounting | Economic evaluations are conducted at a specific, fixed point in time. However, the costs and outcomes associated with the interventions under evaluation can occur at different times and at multiple points in the present or future. In economic theory, costs and outcomes that are predicted to occur in the future are usually valued less than present costs. Evaluations tend to view future costs incurred and future benefits gained from an intervention from the perspective of the present day (or when a decision is to be made, which is usually the same). Future costs therefore tend to be discounted to reflect this (i.e. using a discount rate). The same applies to benefits (or outcomes), so the costs and outcomes can be compared. |
| Economic Evaluation | Economic evaluation compares alternatives (for example, an intervention and usual care), identifying, measuring and valuing differences in the costs and benefits between these, to determine which offers the best value for money. Economic evaluation is the systematic comparison of alternative courses of action (different interventions or programmes, the different ways in which interventions or programmes are delivered, or different settings in which interventions or programmes are delivered) in terms of their costs (inputs) and consequences (outcomes), with a view to making a choice about the efficient allocation or distribution of scarce resources. There are essentially three stages to all economic evaluations: 1. identifying, 2. measuring, and 3. valuing, the costs and outcomes of the interventions or programmes under evaluation. Many economic evaluations are classified by how outcomes are measured and valued. These are: cost-effectiveness analysis (CEA), which tends to use a single measure of health outcome; cost-utility analysis (CUA), whereby multiple health outcomes are combined into generic health utility measures, such as quality adjusted life years (QALYs); cost-benefit analysis (CBA), which involves the valuation of multiple outcomes including health and non-health outcomes in money terms; and cost-consequences analysis (CCA), where an array of outcomes is presented with no attempt to aggregate them. Some economic evaluations – especially those using CBA and CCA – can also be multi-sectoral and include assessment of broader outcomes (e.g. on other family members or broader economic consequences). |
| Effectiveness | Considers whether the inputs lead to the desired outputs, i.e. does the intervention under evaluation produce the expected outcomes, or level of outcomes, in practice? Not to be confused with efficacy, which we consider to be how well an intervention has worked in a trial, as opposed to in practice. Efficacy asks, ‘can it work?’; effectiveness asks, ‘does it work well enough under normal conditions?’ |
| Formal (or substantive) theory | Theories that operate in different domains or disciplines. For example, incentives theory in economics, attachment theory in human development or constructivist learning theory in education. These theories are often ‘overarching’ and general, rather than specific to a particular intervention (Greenhalgh et al., 2017d). |
| Interim analysis | Analysis that is conducted before the complete dataset has been collected. This allows the researcher(s) to determine the validity of their data collection processes/tools/methods and to refine their approach as required. |
| Intervention | In health and social care, this can be a treatment, procedure, or other action taken to prevent or treat illness, disease, or improve health in other ways. It is a deliberate act or effort that aims to maintain or improve the mental or physical health of a person or population. Interventions may be direct (e.g. a surgical procedure) or indirect (e.g. a healthy eating policy). In social sciences, realist evaluators tend to refer to ‘programmes’, which can either be, or incorporate, one or many interventions. REE uses the term ‘intervention’ in a broad sense, to encompass interventions, programmes, policies, service organisation and delivery, and ways of working, etc. Most current applications of REE focus on the evaluation of complex interventions (as defined by the recent Medical Research Council framework for developing and evaluating complex interventions (Skivington et al., 2021)), but this does not preclude the use of REE in evaluating ‘non-complex’ interventions or in sectors other than health. |
| Ladder of abstraction | The ‘level’ at which a theory exists. For example, a programme theory, specific to a particular intervention, is low on the ladder of abstraction; it is more concrete and less abstract, but often still ‘testable’. Conversely, a formal / substantive theory that is widely used across many domains is high on the ladder; it is more abstract and less concrete (Smith & Liehr, 2008). |
| Mechanism | Hidden processes of ‘how’ and ‘why’ an intervention ‘works’ or doesn’t work; they drive outcomes but are also impacted by different contexts. REE uses the realist definition of mechanisms, that things we experience or can observe are caused by ‘deeper’, usually non-observable, processes (Westhorp, 2014; Greenhalgh et al., 2017e). Programmes (interventions) work through ‘mechanisms’, i.e. through changing the decisions made by programme beneficiaries. “Programmes will, in one way or another, offer a range of opportunities. Whether they are cashed in depends on the potentialities and volition of the subject. We are thus claiming… that choice making [mechanisms] is the agent which engineers change within social initiatives” (Pawson & Tilley, 1997, p. 37). |
| Methodological triangulation | Using multiple research methods to study the same phenomenon, often combining qualitative and quantitative techniques. See also: Multi-method. |
| Middle-range theory | Describes the level of abstraction of a theory. A middle-range theory is expressed in a manner that permits empirical testing: middle-range theorising “…involves abstraction, of course, but [theories] are close enough to observed data to be incorporated in propositions that permit empirical testing” (Merton, 1967). This term is often used in realist research and is at times confused as being programme theory or formal theory – or an additional theory required as well as both of these. However, the term “middle-range” is an adjective describing the level of abstraction of a theory. It is not another category or type of theory (Greenhalgh et al., 2017d). Programme theories in realist evaluation are usually middle-range theories, in that they can be tested (supported, refuted, refined) empirically but are also abstract enough to help explain how an intervention works. See also: Theory. |
| Modelling (economic or statistical) | Mathematical equations (most often) that aim to describe a theory of choices, decisions, and behaviour (Breeze et al., 2023). In simple cases, changing a model’s input will result in a changed output, and this can be used to test different scenarios. Models can also be used to triangulate different forms of data where these are having some form of combined effect. Examples of cohort or individual-based models more suited to complex interventions include system dynamics models, agent-based models, social network models, partial least squares models, complexity modelling, and structural equation modelling. |
| Multi-method | “Multimethod research may be broadly defined as the practice of employing two or more different methods or styles of research within the same evaluation rather than the use of a single method (Brewer & Hunter, 2006). Unlike mixed method research, it is not restricted to combining qualitative and quantitative methods but rather is open to the full variety of possible methodological combinations” (Hunter & Brewer, 2015, p. 187). The choice of methods includes attention to the nature of what is being evaluated and its setting, and the nature of the evaluation and its questions and purposes. |
| Opportunity cost | What is foregone as a consequence of adopting a new intervention. It is the value associated with the alternative use of resources, i.e. what other benefits could the use of those same resources have achieved? This is important in a fixed budget system, where increased costs will displace other services already provided. In terms of choosing to fund intervention A over intervention B, the opportunity cost of choosing A is the benefit forgone from B; this is assessed by comparing the difference in benefits (incremental benefits) and the difference in costs (incremental costs) of A relative to B. |
| Outcomes (also referred to as ‘Benefits’ in economic evaluation) | The consequences of the intervention. They can be positive or negative, intended or unintended, short or longer term. Outcomes can be at different levels within a system or relate to different stakeholders involved with or impacted by the intervention, including changes in economic and financial conditions, in social conditions (e.g. reduced violence or increased cooperation) or in environmental and political conditions (e.g. participation and equal opportunities). Outcome measures include the use of natural units (e.g. measures of clinical outcomes or observations), validated outcome tools (e.g. measures of health or care) or utilities (e.g. measures of health-related quality of life or wellbeing). Outcomes can also include any additional costs incurred or saved/averted that are associated with the outcomes of an intervention. |
| Paradigm | A philosophical / theoretical framework for a particular scientific school or discipline that enables the formulation of theories and generalisations, and the research performed in support of them. |
| Perspective | The point of view from which an evaluation is conducted. In economic evaluation, typical perspectives are those of the patient, hospital/clinic, healthcare system or society. The perspective you adopt for the evaluation will determine the range of costs and outcomes that will need to be considered. In economic evaluations, the societal perspective tends to reflect a full range of social opportunity costs associated with different interventions. In REE, as broad a perspective as possible should be adopted. |
| Programme Theory | A hypothesis about how an intervention is believed to work. In realist evaluation, this is often articulated as CMOCs (see also: Context-Mechanism-Outcome Configuration). It [programme theory] is the description, in words or diagrams, of what is supposed to be done in a policy or intervention (theory of action) and how and why that is expected to work (theory of change) (Greenhalgh et al., 2017d). Different terms can be used, depending on how ‘developed’ the programme theory is: Initial Programme Theory (IPT) – an initial, candidate or rough version of a programme theory. It is initial because it is more speculative and less well supported by data. Realist Programme Theory – often based on an initial programme theory, but with explicit focus on explaining how and why different outcomes are generated in different contexts (i.e. containing CMOCs). Refined Realist Programme Theory – a realist programme theory that has been ‘tested’ (supported, refuted or refined) using primary or secondary data or literature and reflects a more developed understanding of how an intervention works, for whom, in which circumstances and to what extent. Often referred to in realist evaluation and REE simply as ‘refined programme theory’. |
| Public and Patient Involvement and Engagement (PPIE) | Research ‘with’, ‘by’ or ‘in partnership with’ members of the public [and/or patients] rather than ‘to’, ‘about’ or ‘for’ them. The term ‘public’ includes: patients and potential patients; people who use health and social care services; and/or carers (UK Public Involvement Standards Development Partnership Group, 2019). |
| Quality of Life (QoL) | A broad, multidimensional concept of an individual’s subjective evaluation of aspects of his/her life. This may include physical, social, spiritual and emotional wellbeing, as well as possibly touching on other areas such as environment, employment, education and leisure time. Within this wide-ranging definition, health-related quality of life (HRQOL) refers to the impacts of a condition and/or treatment on a patient’s functioning and wellbeing. |
| Realist Economic Evaluation (REE) | A way of assessing and comparing interventions by looking at whether an intervention works (or not), how, for whom, in which circumstances, and at what cost. REE is a form of evaluation that can be considered to augment a realist theory of change (realist evaluation) with a theory of value creation (economic evaluation). It encompasses aspects of impact, process and economic evaluation (as defined in the UK Magenta Book central government guidance on evaluation (HM Treasury, 2020)). The approach investigates whether and how an intervention works (or not), to what extent, for whom, in which circumstances, and with what related costs and benefits. |
| Realist evaluation | An approach for understanding how, for whom and in which circumstances an intervention works (or not), as opposed to simply investigating if it works (i.e. providing a yes or no verdict on an intervention). Realist evaluation is, “An approach grounded in realism, a school of philosophy which asserts that both the material and the social worlds are ‘real’ and can have real effects; and that it is possible to work towards a closer understanding of what causes change” (Westhorp et al., 2011, p. 1). Instead of asking ‘does this work?’ realist evaluation asks, ‘what works for whom, in what contexts, in what respects and how?’ Realist approaches treat programmes (or interventions) as ‘theories incarnate’, i.e. programmes / interventions are expressions of theories about what will cause desired changes, and realist evaluation aims to reveal clear and explicit hypotheses about this. The evaluation then tests those hypotheses. Data can therefore include specific information about the programme’s context, mechanisms and outcomes. This allows the realist evaluator to explore what it is about that intervention that is creating change. |
| Realist Comparative Causal Pathways (RCCPs) | A way to compare how we think the intervention and comparator might work (their contexts, mechanisms, and short- and long-term outcomes) alongside their resource inputs. RCCPs are the product of bringing together the IPT(s) for the intervention and comparator alongside resource inputs and short- and long-term outcomes. They include CMOCs, providing a realist causal explanation, alongside economic considerations. They act as a bridge between the realist and economic disciplines, bounding the design, data collection, analysis, and synthesis. |
| Realist interview | A theory-driven type of interview that focuses on unearthing data to develop and test context-mechanism-outcome configurations and their relationships within a programme theory (Manzano, 2016; Greenhalgh et al., 2017b). In practice, this may involve discussing initial programme theories with a participant to understand details about their experience which might help to refine that theory. |
| Realist Philosophy of Science | Different schools of philosophy make different assumptions about the nature of what exists in the world (ontology) and the nature of knowledge and how it is created (epistemology), as well as what constitutes ‘value’ and how causation works. A Realist Philosophy of Science view is that the universe described by science (including observable and unobservable aspects) exists independently of perceptions, and that verified scientific theories are at least approximately true descriptions of what is real (Chakravartty, 2017; Greenhalgh et al., 2017a). What distinguishes realism is its understanding of how causation works. Realist evaluations try to identify the mechanisms that cause programme outcomes, and the associated contexts, not just an association between ‘the programme’ and ‘the outcome’. |
| Resources | A source of supplies, support or aid that enables someone or something to function effectively. REE uses the economic definition, which classifies resources under human, land, capital and consumable items, all of which have alternative uses and, therefore, opportunity costs. Resources are measured and valued (costed) to estimate the resource inputs and resource consequences associated with an intervention. See also: Costs. |
| Resource mapping | A process of identifying what resources are aligned to both the intervention and comparator, where these occur in a programme theory and who incurs the resources (e.g. individuals or organisations, participants or carers). These include the inputs and resources required to implement or provide interventions (e.g. staffing, equipment, facilities, materials, and development and provision of information or learning materials) and any resources resulting from changes in short or long-term outcomes from the intervention (e.g. changes in health services resource use, demand for health services, and potentially cost savings). |
| Retroduction | Retroduction entails the idea of going back from, below, or behind observed patterns or regularities to discover what produces them (Blaikie, 2003; Greenhalgh et al., 2017c). Applying retroductive theorising to realist evaluation therefore involves starting with a programme’s effects and working backwards to think about the conditions, and the mechanisms, necessary for those effects to manifest (Jagosh, 2020; Mukumbang et al., 2021). Retroduction can be seen to incorporate abduction, deduction and induction (Blaikie, 2003). |
| Rubrics | A structured set of criteria for assessing a particular task, or element of the programme or intervention being evaluated. This helps evaluators categorise something (for example a performance) on a quantitative scale (for example, fair, good, excellent). |
| Sensitivity Analysis | Used to assess the level of confidence that may be associated with the findings of an economic evaluation. The process involves examining the changes in results of the analysis when key variables are varied over a specified range, varying key assumptions made in the evaluation (individually or together) and analysing the impact on the findings of the evaluation. Sensitivity analysis can be ‘one-way’ (varying parameters one by one), ‘multi-way’ (varying more than one parameter at the same time), ‘threshold’ (assessing the tipping point at which the value of an input parameter would alter the output of the evaluation) and ‘probabilistic’ (the distribution of outputs based on distributions of inputs). |
| Stakeholder / stakeholder involvement and engagement (SIE) | Any individual or group with a direct interest or influence in the research process or its outcomes. This includes those who could be affected by, influenced by, or benefit from the research, as well as those involved in the research itself (not to be conflated with PPIE; see PPIE). SIE entails involving stakeholders, enabling them to have an active role in the research process. |
| Theory | “A theory makes a statement about how the world works; a scientific theory is one that is testable; and a realist theory makes reference to the underlying generative mechanisms…” (Hawkins, 2016, p. 276). Westhorp (2012) and Greenhalgh et al. (2017d) identify four different types of theory that have relevance in the evaluation of complex systems: philosophical theory (ontology, epistemology, causation etc.), substantive theory (overarching theories within disciplines), programme theory (what is expected to happen in the intervention) and evaluation theory (theories about methodology and methods). See also: Middle-range theory. |
| Theory of Change | A comprehensive description and illustration of how and why a desired change is expected to happen in a particular context (Center for Theory of Change, n.d.). It is focused in particular on mapping out or “filling in” what has been described as the “missing middle” between what an intervention or change initiative does (i.e., its activities) and how these lead to desired goals being achieved. It does this by first identifying the desired long-term goals and then working back from these to identify all the conditions (outcomes) and activities that must be in place (and how these relate to one another causally) for the goals to occur. |
| Time horizon | The duration over which costs and outcomes are calculated. It is important to consider the time over which the intervention is being delivered, when the outcomes of the intervention are expected to be achieved, and the time over which the outcomes of the intervention or programme can be captured. The same time horizon should be used for both costs and outcomes and costs and benefits should be discounted accordingly (Drummond et al., 2015). |
| Transferability | The extent to which findings from one study can be applied to different contexts, settings, or populations. It is, “An evaluative criterion suitable for constructivist qualitative research to replace the (post-) positivist criterion of external validity, or generalisability, that describes the extent to which causal relationships identified in one study hold true in other populations, settings, or times.” (Stalmeijer et al., 2024, pp. 1-2, italics in original). Realist evaluation focuses specifically on causal processes (configurations of CMO), and therefore in realist evaluation the analysis enables learning that may be transferable; however, this is specifically related to the transferability of the potential actions of the identified mechanism(s) under similar context conditions, i.e. mechanisms can be common across different settings (Wong, 2018). Transferability is not to be confused with generalisability. Firestone (1993) describes three types of generalisations that are widely cited in the literature (e.g. Polit & Beck, 2010; Prabhu, 2020; Maxwell, 2021): sample to population generalisability; analytic or theoretical generalisation; and case to case transfer. Mukumbang and Wong (2025) argue that analytic or theoretical generalisation is possible in mechanism-based theory-driven research methods or approaches like realist methodologies (or REE), as the premise is that the same causal forces theorised in the research setting are also found in other settings. This is used alongside contextualisation (considering the relevant contextual features in relation to the mechanisms of action) and retroduction. |
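Several of the quantitative terms defined above – discounting, opportunity cost (incremental costs and benefits), and sensitivity analysis – can be illustrated with a small worked sketch. The sketch below is purely illustrative and is not part of the REE guidance: the function names, discount rates, and all cost and QALY figures are invented for demonstration only.

```python
# Illustrative sketch (not from the REE guidance): discounting future costs
# and outcomes to present value, computing an incremental cost-effectiveness
# ratio (ICER), and running a one-way sensitivity analysis on the discount
# rate. All figures and names are hypothetical.

def present_value(amounts, rate):
    """Discount a stream of yearly amounts (year 0 = present) to present value."""
    return sum(a / (1 + rate) ** t for t, a in enumerate(amounts))

def icer(cost_new, effect_new, cost_cmp, effect_cmp):
    """Incremental cost per unit of incremental effect (e.g. cost per QALY)."""
    return (cost_new - cost_cmp) / (effect_new - effect_cmp)

# Hypothetical 3-year cost and QALY streams for an intervention and comparator.
rate = 0.035  # an assumed 3.5% annual discount rate, for illustration only
new_costs, cmp_costs = [10_000, 2_000, 2_000], [8_000, 1_500, 1_500]
new_qalys, cmp_qalys = [0.70, 0.72, 0.74], [0.68, 0.69, 0.70]

base_icer = icer(present_value(new_costs, rate), present_value(new_qalys, rate),
                 present_value(cmp_costs, rate), present_value(cmp_qalys, rate))
print(f"base-case ICER at {rate:.1%} discount rate: {base_icer:,.0f} per QALY")

# One-way sensitivity analysis: vary the discount rate, holding all else fixed.
for r in (0.0, 0.015, 0.035, 0.05):
    v = icer(present_value(new_costs, r), present_value(new_qalys, r),
             present_value(cmp_costs, r), present_value(cmp_qalys, r))
    print(f"discount rate {r:.1%}: ICER = {v:,.0f} per QALY")
```

The same time horizon and discount rate are applied to both costs and outcomes, consistent with the Time horizon and Discounting entries; varying the rate shows how sensitive the evaluation’s conclusion is to that assumption.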
