A few months ago I read this paper on how to do randomization; it has just come on-line, and I recommend it highly. Meanwhile, I summarized it; here are the greatest hits.
In Pursuit of Balance: Randomization in Practice in Development Field Experiments
By Miriam Bruhn (WB) and David McKenzie (WB, IZA, BREAD)
Randomized experiments are increasingly used in development economics. … This paper carries out an extensive review of the randomization methods used in existing randomized experiments, presents new evidence from a survey of leading development economists, and carries out simulation results in order to provide guidance for researchers considering which method to use for randomization.
The shortest summary of results:
in samples of 300 or greater, the different randomization methods perform similarly in terms of achieving balance in outcomes variables at follow-up. In smaller samples, however, the choice of randomization method is important, with matching and stratification performing best at achieving balance. Moreover, the ex-post analysis should explicitly account for how the randomization was conducted by including the appropriate controls. [Don’t worry: they tell us how!]
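To make "stratification" concrete, here is a minimal sketch of a stratified random assignment in Python. It is my own illustration, not code from the paper: the function name `stratified_assignment`, the made-up `is_urban` covariate, and the choice to send the odd unit in a stratum to control are all assumptions.

```python
import random
from collections import defaultdict

def stratified_assignment(units, stratum_of, seed=0):
    """Assign half of each stratum to treatment (1), the rest to control (0)."""
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for u in units:
        by_stratum[stratum_of(u)].append(u)
    assignment = {}
    for members in by_stratum.values():
        rng.shuffle(members)  # random order within the stratum
        n_treat = len(members) // 2
        # Assumption: with an odd stratum size, the extra unit goes to control.
        for i, u in enumerate(members):
            assignment[u] = 1 if i < n_treat else 0
    return assignment

# Hypothetical sample: 12 units, stratified on a made-up binary covariate.
units = list(range(12))
is_urban = {u: u % 2 for u in units}
assignment = stratified_assignment(units, lambda u: is_urban[u])
```

By construction each stratum ends up split evenly, so the stratifying variable is balanced exactly rather than only on average.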
How are most researchers randomizing?
most researchers have at some point used simple randomization (probably with some stratification) – 80 percent of the full sample and 94 percent of researchers who have carried out five or more experiments have done this. However, we also see much more use of other methods than is apparent from the existing literature. 56 percent had used pairwise matching … 32 percent of all researchers…have subjectively decided whether to re-randomize based on an initial test of balance. The multiple draws process described above [do a bunch of randomizations, then pick the best balance] has also been used by 24 percent of researchers, and is more common amongst the researchers with 38 percent of the 5 or more experiment group using this method.
“Which methods do better in terms of achieving balance and avoiding extremes?”
on average all methods of randomizing lead to balance. however… stratification, matching, and especially the minmax t-stat method have much less extreme differences in baseline outcomes, while the big stick method only results in narrow improvements in balance over a single random draw. [In other words, on average they’re all about the same, but you’re less likely to occasionally get a highly unbalanced draw with stratification, matching, and minimizing the t-stat.]
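For readers who have not seen it, the minmax t-stat idea can be sketched roughly as follows: draw many candidate assignments, compute the t-statistic for the treatment-control difference on each baseline covariate, and keep the draw whose largest absolute t-statistic is smallest. This is my own illustration under stated assumptions, not the authors' code: the covariates are simulated, the function names are made up, I use a Welch t-statistic, and I assume an even split between treatment and control.

```python
import math
import random

def t_stat(x_treat, x_ctrl):
    """Welch t-statistic for a difference in means between two groups."""
    def mean_var(x):
        m = sum(x) / len(x)
        v = sum((xi - m) ** 2 for xi in x) / (len(x) - 1)
        return m, v
    m1, v1 = mean_var(x_treat)
    m0, v0 = mean_var(x_ctrl)
    se = math.sqrt(v1 / len(x_treat) + v0 / len(x_ctrl))
    return (m1 - m0) / se if se > 0 else 0.0

def max_abs_t(assign, covariates):
    """Largest absolute t-statistic across all baseline covariates."""
    worst = 0.0
    for x in covariates:
        treat = [xi for xi, a in zip(x, assign) if a == 1]
        ctrl = [xi for xi, a in zip(x, assign) if a == 0]
        worst = max(worst, abs(t_stat(treat, ctrl)))
    return worst

def minmax_t_assignment(covariates, n_draws=1000, seed=0):
    """Keep the draw with the smallest maximum |t| over n_draws random draws."""
    rng = random.Random(seed)
    n = len(covariates[0])
    base = [1] * (n // 2) + [0] * (n - n // 2)
    best, best_score = None, float("inf")
    for _ in range(n_draws):
        draw = base[:]
        rng.shuffle(draw)
        score = max_abs_t(draw, covariates)
        if score < best_score:
            best, best_score = draw, score
    return best, best_score

# Hypothetical data: 30 units, 3 simulated baseline covariates.
data_rng = random.Random(42)
covariates = [[data_rng.gauss(0, 1) for _ in range(30)] for _ in range(3)]
best, best_score = minmax_t_assignment(covariates, n_draws=1000)
single, single_score = minmax_t_assignment(covariates, n_draws=1)
```

Comparing `best_score` with `single_score` shows how much tighter the chosen draw is than a single random draw on these covariates. Note that it says nothing about how to do inference afterwards.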
“What does balance on observables imply about balance on unobservables?”
Aickin (2001) notes that methods which balance on observables can do no worse than pure randomization with regard to balancing unobserved variables.
Should you control for stratification or pair-wise matching in the analysis?
Thus, on average, it is overly conservative to not include the controls for stratum or pair in analysis. [i.e., your std errors are too big] … BUT in a non-trivial proportion of draws, it will be the case that not including stratum dummies will be anti-conservative, potentially leading the researcher to find a significant effect that is no longer significant when stratum dummies are controlled for. Hence researchers can not argue that if they ignore the randomization method, and find significant effects treating their study as if they purely randomized, that these same treatment effects will necessarily remain significant if one were to account for the method of randomization.
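As a rough illustration of what including the stratum controls does, the sketch below estimates a treatment effect with and without stratum dummies. It is my own toy example, not the paper's simulation: the data are made up, the function names are hypothetical, and I assume plain homoskedastic OLS standard errors. Instead of building dummy variables, it demeans the outcome and the treatment indicator within each stratum, which gives the same coefficient as a regression with stratum dummies (the Frisch-Waugh result).

```python
import math
from collections import defaultdict

def demean_within(values, strata):
    """Subtract the stratum mean from each value."""
    sums, counts = defaultdict(float), defaultdict(int)
    for v, s in zip(values, strata):
        sums[s] += v
        counts[s] += 1
    return [v - sums[s] / counts[s] for v, s in zip(values, strata)]

def treatment_effect(y, treat, strata):
    """OLS coefficient and standard error on treat, with stratum dummies.

    Assumes homoskedastic errors. Passing one common stratum label for
    every unit reduces this to a regression on treat and a constant.
    """
    y_dm = demean_within(y, strata)
    t_dm = demean_within(treat, strata)
    sxx = sum(t * t for t in t_dm)
    beta = sum(t * yv for t, yv in zip(t_dm, y_dm)) / sxx
    rss = sum((yv - beta * t) ** 2 for t, yv in zip(t_dm, y_dm))
    dof = len(y) - len(set(strata)) - 1  # n - stratum means - treatment slope
    se = math.sqrt(rss / dof / sxx)
    return beta, se

# Made-up data: two strata with very different outcome levels.
strata = ["a"] * 4 + ["b"] * 4
treat = [1, 1, 0, 0, 1, 1, 0, 0]
y = [12, 13, 10, 11, 22, 23, 20, 21]

beta_fe, se_fe = treatment_effect(y, treat, strata)
beta_naive, se_naive = treatment_effect(y, treat, ["all"] * len(y))
```

In this toy data the point estimate is the same either way, but the standard error without stratum dummies is much larger because the between-stratum variation in the outcome is left in the residual. That is the "overly conservative" case; as the quote warns, it need not go that way in every draw.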
In the analysis, how do you control for having randomized a bunch of times and then chosen the randomization with best balance?
the correct statistical methods for covariate-dependent randomization schemes such as minimization are still a conundrum in the statistics literature