Advice for impact evaluations with government: Drop the baseline


This was a case where we did randomization without a baseline. I highly recommend this when you’re working with a government because the biggest risk is implementation failure. You’ll spend a lot of time doing the baseline – spend time, spend money – and have the intervention not be implemented. So when you’re working with the government, it’s better to get power by doubling the sample of your endline and just randomize with administrative data so you’ll get the same amount of power but you reduce risk up front.

That is Karthik Muralidharan speaking at the RISE conference today. Of course, he didn’t have to say that doubling your sample also increases the likelihood that randomization yields balanced intervention and comparison groups, thus making a baseline less necessary.
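
To make the trade-off concrete, here is a back-of-the-envelope sketch (mine, not Karthik’s calculation). It assumes a simple two-arm trial randomized at the individual level with equal arm sizes, and it uses the standard large-sample approximation that controlling for a baseline outcome whose correlation with the endline outcome is rho shrinks the variance of the estimate by a factor of (1 - rho^2). The function name `mde` and all the numbers are illustrative; clustering, attrition, and survey costs are ignored.

```python
import math
from statistics import NormalDist


def mde(n_per_arm, sigma=1.0, rho=0.0, alpha=0.05, power=0.8):
    """Approximate minimum detectable effect for a two-arm trial.

    rho is the assumed correlation between the baseline and endline outcome.
    rho=0.0 corresponds to having no baseline at all.
    """
    normal = NormalDist()
    # Critical values for a two-sided test at level alpha with the target power.
    z = normal.inv_cdf(1 - alpha / 2) + normal.inv_cdf(power)
    # Standard error of the treatment effect, shrunk by the baseline control.
    se = sigma * math.sqrt(2 * (1 - rho ** 2) / n_per_arm)
    return z * se


n = 500  # hypothetical endline sample per arm when a baseline is also collected

# Option A: collect a baseline, keep n per arm at endline.
for rho in (0.3, 0.5, 0.8):
    print(f"baseline, rho={rho}: MDE = {mde(n, rho=rho):.3f} SD")

# Option B: skip the baseline, double the endline sample.
print(f"no baseline, 2n:   MDE = {mde(2 * n):.3f} SD")

# Break-even: 2 * (1 - rho^2) / n equals 2 / (2n) when rho^2 = 1/2.
break_even_rho = math.sqrt(0.5)
print(f"break-even rho = {break_even_rho:.2f}")
```

Under these assumptions, doubling the endline sample matches a baseline whose correlation with the endline outcome is about 0.71. A baseline buys more power only when the outcome is more persistent than that, and it still carries the risk of being wasted if the program is never implemented.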

Update: This prompted a very active discussion on Twitter, which you can read in full here. Below are a few points.

Ultimately, there are a number of factors to consider — the potential sample size, the probability of implementation failure, the importance of baseline covariates for your analysis. But still, where there is serious concern that the program may not be implemented as expected — and especially if there are decent administrative data — it’s worth consideration. I’ll give the penultimate words to Karthik’s co-author, Abhijeet Singh.

First, responding to Pauline and Andrew’s point.

Second, to Cyrus and Seema’s points.

And the last word to Karthik himself.


There are many more comments, but I won’t embed them all here. You can read the full conversation here on Twitter.


Presenting at an economics conference? Get a room close to the coffee break.

You submit your paper to a conference. You’re all fired up to share your work and to get feedback. Then no one shows up to your session! Is it because everyone hates your work? Or because it’s 8am? Or both? Günther, Grosse, and Klasen (published version; working paper) identify some correlates of session attendance.

We analyze the drivers of audience size and the number of questions asked in parallel sessions at the annual conference of the German Economics Association. We find that the location of the presentation is at least as important for the number of academics attending a talk as the combined effect of the person presenting and the paper presented. Being a presenter in a late morning session on the second day of a conference, close to the place where coffee is served, significantly increases the size of the audience. When it comes to asking questions, location becomes less important, but smaller rooms lead to more questions being asked. Younger researchers and very senior researchers attract more questions and comments. There are also interesting gender effects. Women attend research sessions more diligently than men, but seem to ask fewer questions than men. Men are less likely to attend presentations on health, education, welfare and development economics than women. Our findings suggest that strategic scheduling of sessions could ensure better participation at conferences. Moreover, different behaviors of men and women at conferences might also contribute to the lack of women in senior scientist positions. [Emphasis added by me]

So, do whatever you can to angle for that second-day, late-morning slot.

How does lowering the cost of schooling in early years affect later attainment?

“School Costs, Short-Run Participation, and Long-Run Outcomes: Evidence from Kenya”: My paper with Mũthoni Ngatia is out as a World Bank Policy Research Working Paper. Here’s what we learned.

Even though primary education is “free” in many countries, families face many incidental expenses: uniforms, transport, and materials, among others.


In Kenya, we worked with an NGO that provided free school uniforms to children to reduce the cost of schooling.


I know that you’re going to say: Do we need another study of “giving stuff” for education and how it affects attendance? Aren’t we supposed to be focused on learning and pedagogy?


First, while attending school is no guarantee of learning, it’s a really important part of the process.


Second, we follow these students over 8 years. Few international education studies trace the time path of impact.


A school uniform can increase school participation by multiple means. Families don’t have to pay for the uniforms. AND students don’t feel stigmatized by being the only kid without a uniform.


What do we find? In the short run, providing a school uniform does increase school participation.


The impacts are particularly large for the poorest kids: their absenteeism drops by 15 percentage points, which eliminates 55 percent of their absenteeism.
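
Those two numbers together imply a starting point. As a rough check (my arithmetic from the two figures above, not a number quoted from the paper), a 15 percentage point drop that removes 55 percent of absenteeism implies the poorest kids were absent a bit more than a quarter of the time without the uniform:

```python
drop_pp = 15.0           # reported drop in absenteeism, in percentage points
share_eliminated = 0.55  # reported share of absenteeism eliminated

# Implied absenteeism rate without the uniform, and what remains with it.
implied_baseline = drop_pp / share_eliminated
implied_after = implied_baseline - drop_pp

print(f"implied absenteeism without uniform: {implied_baseline:.1f}%")
print(f"implied absenteeism with uniform:    {implied_after:.1f}%")
```

That works out to roughly 27 percent falling to roughly 12 percent.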


But 8 years later, the children who participated in the program had no better educational outcomes than those who did not.


Some educational interventions have long-lasting impacts: Smaller early-grade classes in the USA have translated into better college performance.


But we can’t assume it. In this case, initial gains in school participation do not translate into more school completion.


And a few last words from the paper: “Take care when interpreting short-term results, taking into account these results and others which demonstrate that long-term impacts may vary – sometimes dramatically – from initial effects.”


“Gathering long-term data is costly, but without it, the trajectory of impacts resulting from the wide range of interventions currently being implemented remains a mystery.”


That’s it! Big short-term impacts for poor kids but disappointing long-term impacts. Check out the paper!


How do researchers estimate regressions with patient satisfaction as the outcome? A brief review of practice

Recently, Anna Welander Tärneberg and I were doing research with patient satisfaction as the outcome, and we checked how other researchers had estimated these equations in the past. Here is what we found, as documented in the appendix of our recently published paper in the journal Health Economics.


People use a lot of different methods, and many authors use multiple methods. But there is a rich history of using Ordinary Least Squares regressions to estimate impacts on patient satisfaction. In our paper, we used OLS but verified all the results with Probit and Logit regressions. To add to this list, Dunsch et al. (including me) published a new paper last week on patient satisfaction in Nigeria, also using OLS as the main estimation method.

Are patients really satisfied?

Yesterday, my new paper — Bias in patient satisfaction surveys: a threat to measuring healthcare quality — was published at BMJ Global Health. It’s co-authored with Felipe Dunsch, Mario Macis, and Qiao Wang.

Here’s a quick rundown of what we learned.

Many health systems use patient satisfaction surveys to gauge one key element of health services: the patient experience. As part of evaluating an intervention to improve health care management in Nigeria, we piloted patient satisfaction questionnaires.

A common way to measure patient satisfaction is to give patients a series of statements and to ask if they agree or disagree: “The clinic was clean.” “Staff explained your condition well.” That kind of thing. We noticed that people seemed to be agreeing with everything.

Was that because the quality of care was good, or because saying yes is just easier?

So we randomly assigned some patients to get the standard statements, and others to get negatively framed statements. “The clinic was dirty.” “Staff were rude.” “Staff explained your condition poorly.”

Reframing the questions negatively led to significantly lower reports of patient satisfaction on almost every item.

Sometimes the drops were big: as large as 12 and 19 percentage points on some items.

Our work was in Nigeria, but many of the patient satisfaction questionnaires we reviewed, from many countries, use this positive framing.

But that likely gives a falsely optimistic view of the patient experience.

Patient satisfaction questionnaires that either mix positive and negative framing or avoid agree/disagree formats altogether can do better.

Of course, the quality of care is much more than patient satisfaction. But good health care systems can and should offer a positive patient experience.

Check out the paper!

How Angus Deaton thinks you can make the world a better place

When Princeton students come to talk with me, bringing their deep moral commitment to helping make the world a better, richer place, it is these ideas that I like to discuss, steering them away from plans to tithe from their future incomes, and from using their often formidable talents of persuasion to increase the amounts of foreign aid. I tell them to work on and within their own governments, persuading them to stop policies that hurt poor people, and to support international policies that make globalization work for poor people, not against them.

This is (almost) the end of Deaton’s book The Great Escape: Health, Wealth, and the Origins of Inequality. This counsel reminds me of the Commitment to Development Index, which shows that there are many policies that rich countries can enact to help the poor beyond their borders besides providing foreign aid, such as easier migration rules and lower tariffs.