46 Pages. Posted: 9 Oct 2014. Last revised: 28 Aug 2016.
Date Written: October 7, 2014
Sixteen US states have begun to hold teacher preparation programs (TPPs) accountable for teacher quality, as estimated by teacher value-added to student test scores. Yet it is not easy to identify TPPs whose teachers are substantially better or worse than average. The true differences between TPPs are small; the estimated differences are not very reliable; and when many TPPs are compared, multiple comparisons increase the danger of misclassifying ordinary TPPs as good or bad. Using large and diverse data from Texas, we evaluate statistical methods for estimating teacher quality differences between TPPs. The most convincing estimates come from a value-added model where confidence intervals are widened for multiple comparisons and for the correlation between teachers from the same TPP. Using these widened confidence intervals, it is rarely possible to identify with confidence which TPPs, if any, are better or worse than average. The potential benefits of TPP accountability may be too small to balance the risk that a proliferation of noisy TPP estimates will encourage arbitrary and ineffective policy actions.
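The multiple-comparisons point in the abstract can be illustrated with a minimal sketch. It assumes a Bonferroni adjustment and a normal approximation, which may not match the adjustment used in the paper, and it ignores the correlation between teachers from the same TPP. The function names and all numbers (an estimate of 0.03, a standard error of 0.012, 100 programs) are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist


def critical_value(alpha: float, n_comparisons: int = 1) -> float:
    """Two-sided normal critical value, Bonferroni-adjusted (an assumed method)."""
    adjusted_alpha = alpha / n_comparisons
    return NormalDist().inv_cdf(1 - adjusted_alpha / 2)


def confidence_interval(estimate, std_error, alpha=0.05, n_comparisons=1):
    """Return (lower, upper) bounds, widened when n_comparisons > 1."""
    z = critical_value(alpha, n_comparisons)
    return (estimate - z * std_error, estimate + z * std_error)


# Hypothetical TPP effect in student test-score standard deviations.
estimate, std_error = 0.03, 0.012

# Treated as a single comparison, the interval excludes zero.
unadjusted = confidence_interval(estimate, std_error)

# Treated as one of 100 simultaneous comparisons, the interval widens.
adjusted = confidence_interval(estimate, std_error, n_comparisons=100)

print("unadjusted:", unadjusted)
print("adjusted:  ", adjusted)
```

With these made-up inputs, the unadjusted interval lies entirely above zero, while the widened interval includes zero. In other words, a program that looks better than average on its own can become indistinguishable from average once the number of programs being compared is taken into account.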
Suggested Citation
von Hippel, Paul T. and Bellows, Laura and Osborne, Cynthia and Arnold Lincove, Jane and Mills, Nicholas, Teacher Quality Differences Between Teacher Preparation Programs: How Big? How Reliable? Which Programs Are Different? (October 7, 2014). Economics of Education Review, Vol. 53, pp. 31-45, 2016. Available at SSRN: https://ssrn.com/abstract=2506935 or http://dx.doi.org/10.2139/ssrn.2506935