Rachel M. Gisselquist and Miguel Niño-Zarazúa
Over the past decade, randomized controlled trials (RCTs) have become a staple of research in development economics. Proponents of RCTs have advocated for their use as the best means of identifying ‘what works’ in development, while sceptics voice strong concerns about their growing hegemony in the field. Last year, two influential books, Karlan and Appel’s More Than Good Intentions, and Banerjee and Duflo’s Poor Economics, summarized what RCTs can tell us about how to reduce global poverty. Sceptics such as Angus Deaton and Martin Ravallion point out that RCTs, even if well designed, are not the ‘gold standard’ for policy evaluation, as they often rely on small samples (and small pilot interventions) that cannot tell us much about whether a policy would work if scaled up to the national level, or transferred to different socioeconomic and political conditions.
Equally important is the concern that RCTs are usually conducted within a short time window, and are therefore ill-equipped to deal with development processes that unfold over decades or even generations.
Building on this debate, UNU-WIDER initiated the project ‘Experimental and Non-Experimental Methods to Study Government Performance’ that explores the contributions and limits of RCTs in studying another major topic in development: governance. Despite a large literature on governance and on experimental methods, very little work has directly considered both subjects together in this way.
Children collect water from a water-pump well in the Abyei suburb of Molomol, where individual voluntary returnees from North Sudan are settling with the assistance of the United Nations. Photo © UN Photo/Fred Noy
Governance is a contested concept, especially among development practitioners. This project adopts a definition of governance that builds on theories of government and the state, which point to two major roles for public institutions in providing public goods and representing public interests. How and how well governments govern is a matter central to the study of the politics of development effectiveness, and the field of political science offers a variety of explanations. Major structural explanations, for instance, highlight levels of development, class structures, and ethnic divisions. Institutionalists point to how rules and norms shape the ‘rules of the game’, often in unexpected and long-running ways, and explore the impact of a range of institutions, including electoral rules, executive structures, decentralization, and federalism. Other work focuses on how political culture affects the functioning of democratic governance, as well as on the sometimes decisive influence of political leadership.
Theories tend to deal with the two roles of government separately, offering explanations either for better representation and accountability (often framed in terms of the emergence of liberal democracy versus other forms of government), or for more effective public goods provision. Many studies focus on disaggregated governance outcomes, such as effective policing, property rights, or universal health care. Indeed, far from offering a single model of change in governance, the literature gives us diverse, multiple, and sometimes contradictory explanations. One simple example is [de]regulation: is more or less better?
Findings from RCTs highlight a range of strategies, projects, and other interventions that governments could adopt to improve specific aspects of governance. Some interventions that have been explored in multiple contexts include public information campaigns, financial incentives to improve the performance of public sector employees, community-based monitoring systems, and public deliberation at the local level. But a degree of uncertainty remains with regard to the underlying mechanisms (and theories) that explain the distribution of policy outcomes for a given treatment group (and its placebo) vis-à-vis the distribution for the entire population.
Limits of field experiments
Indeed, one common criticism of experimental studies is that they address neither ‘big’ questions nor ‘big’ theories of governance (or development). Comparing the questions explored in RCTs with those identified in major theories of governance suggests that there is something to this.
On the other hand, proponents of RCTs make a compelling argument that their micro approach offers more convincing explanations than grand theories, by looking at small policy reforms that at the margin can lead to desirable improvements in policy. Compelling as this may sound, this micro focus exposes one of the key weaknesses of RCTs: the low external validity of their findings. Precisely because experimental researchers tend to eschew high-level theorizing, they have little to say about what, within particular contexts, might be unique or have influenced the results, and why their findings should be expected to be generalizable. This is compounded by the fact that experiments are rarely replicated across multiple contexts.
A third limit to RCTs in the study of governance is in the type of causal factors that they can reasonably study. This constraint follows partly from the need to study large numbers of units in order to obtain precise statistical estimates (with small standard errors), which encourages researchers to focus on low-level factors rather than those held by higher-level units, such as national institutions. It also follows from the simple inability of researchers to manipulate some key variables, such as the level of development. In other cases, manipulating a variable would raise serious ethical concerns, which should be appropriately weighed and assessed when studying a particular social phenomenon.
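The sample-size constraint described above can be illustrated with a minimal sketch. It assumes a simple two-arm trial with equal-sized arms and a common outcome standard deviation, in which case the standard error of the estimated treatment effect (a difference in means) is sigma * sqrt(2 / n). The function name and all numbers are hypothetical and chosen only for illustration.

```python
import math


def diff_in_means_se(sigma: float, n_per_arm: int) -> float:
    """Standard error of a difference in means.

    Assumes two arms of equal size n_per_arm and a common outcome
    standard deviation sigma (an illustrative simplification).
    """
    return sigma * math.sqrt(2.0 / n_per_arm)


# Hypothetical arm sizes: a handful of national-level units versus
# many household-level units, with sigma normalized to 1.
for n in (10, 100, 1000):
    print(f"n per arm = {n:5d}  standard error = {diff_in_means_se(1.0, n):.3f}")
```

Under these assumptions, precision improves only with the square root of the number of units, which is one way to see why experiments gravitate towards units that exist in large numbers, such as individuals or households, rather than countries.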
RCTs are similarly limited in terms of the unit of analysis at which they can evaluate impacts, which is generally the individual or household rather than the community or the nation. Many theories of government focus on non-linear processes that evolve over decades, while RCTs rarely look at impacts beyond the linear trajectory between two points in time, usually a few years apart. Take, for example, the hypothetical case of a J-shaped curve derived from the long-term relationship between economic liberalization and political stability: in the short term, economic liberalization leads to a sudden rupture between economic actors that causes an increase in political instability. An RCT observing only this period may conclude that economic liberalization is bad for political stability. However, once markets and institutions are developed further, political stability may actually begin to improve.
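The J-curve argument can be made concrete with a small sketch. The quadratic below is a purely hypothetical functional form, not an estimate from any study: it is chosen only so that the effect of liberalization on stability first dips and later rises above its starting point. The function name, coefficients, and evaluation horizons are all assumptions.

```python
def stability_effect(years: float) -> float:
    """Hypothetical effect of liberalization on political stability.

    Illustrative J-shaped curve: negative at first, lowest at 5 years,
    back to zero at 10 years, and positive thereafter.
    """
    return 0.05 * years**2 - 0.5 * years


# An evaluation with a short window versus one with a long window.
short_window = stability_effect(3)    # typical RCT horizon (assumed)
long_window = stability_effect(20)    # long-run horizon (assumed)

print(f"effect measured after 3 years:  {short_window:.2f}")
print(f"effect measured after 20 years: {long_window:.2f}")
```

In this sketch, an evaluation that stops after three years records a negative effect, while the same process observed after twenty years shows a positive one, so the sign of the conclusion depends on the length of the observation window.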
A final issue is cost. Even if RCTs could be adapted to address some key theories of governance, it is not necessarily clear that they would be more cost-effective in testing theories than non-experimental methods.
In short, our ongoing research suggests that RCTs, as they have been designed so far, have some, but limited, utility for understanding the underlying factors that explain variation in government performance. They have made key contributions to knowledge by showing the effect of some targeted interventions with relatively rapid results, but major hypotheses about how government performance could improve will not be addressed using RCTs. A central question for us is whether, and to what extent, the principles upon which RCTs are based could be reconciled with non-experimental (econometric) methods, to find an analytical middle ground. If social experimentation were grounded in structural models of economic and political behaviour, it could potentially provide insights into whether, and why, an intervention that works in one context could work in other socioeconomic and political contexts. More on this project will be posted on our webpage: http://www.wider.unu.edu/research/current-programme/en_GB/Experimental-Methods-Study-Goverment-Performance/