Note: The top half of the table reports means based on enrollees in the merged administrative-MCBS sample that we use for estimation. The bottom half of the table reports means based on a random 20% sample of all individuals who enrolled in Medicare Part D for the entire year. The two data sets differ in that 6.5% of the observations in our merged sample are individuals who enrolled during the middle of the year. We drop these 640 individuals before calculating sample means in order to ensure comparability between the two data sets.

Note: The table reports coefficients on an indicator for whether the individual answered the MCBS knowledge question incorrectly. Robust standard errors are clustered by enrollee. *, **, and *** indicate that the p-value is less than 0.1, 0.05, and 0.01 respectively.

The first row reports results from regressing an indicator for whether the individual was enrolled in a dominated plan on an indicator for whether the decision maker gave the wrong answer to the MCBS knowledge question. Control variables include the explanatory variables in Table 4 and indicators for year and CMS region. Hence, the coefficient measures the partial effect of knowledge on decision making outcomes. The first column shows that there is no relationship between knowledge and the probability of choosing a dominated plan in the full sample. Column 2 conditions on active choices made without help. The point estimate is insignificant (p = 0.172) but implies a 2.1% increase in the probability of choosing a dominated plan. In contrast, the point estimate is approximately zero for active choices made with help in column 3.

The second row reports results after replacing the dependent variable with the amount of money (in dollars) that the individual could have saved by purchasing a cheaper plan offered by the same brand. In column 1 the point estimate is $4 but insignificant. In column 2 we see that answering the knowledge question incorrectly was associated with a statistically significant $24 increase in potential savings for people making active choices without help. This effect is equivalent to 27% of the potential savings within this group. In contrast, when we focus on active choices made with help in column 3, the point estimate is a statistically insignificant $2.

The last two columns condition on dominated plan choices. For this group we see that those who answered the knowledge question incorrectly and did not have help could have saved approximately $61 more than those who answered the question correctly, whereas the effect for those getting help is negative and imprecisely estimated.

To assess the estimates from the logit model for non-suspect choices, we compare its implied risk premiums in a manner comparable with prior literature. Specifically, deriving the risk premium from the logit model as a first-order approximation to a CARA model yields the following expression for the risk aversion coefficient:

The estimates in Table 5 for the reference individual in the non-suspect group yield a value of 0.000773. Table A5 translates this into a risk premium for various 50-50 bets. These results are broadly consistent with the range of prior results, e.g. as reported in Table 5 of Cohen and Einav (2007). Cohen and Einav find the mean consumer would be indifferent between a 50-50 bet of winning $100 and losing $76.5, whereas the median consumer is virtually risk neutral. In contrast, our results imply the mean non-suspect consumer is indifferent between a 50-50 bet of winning $100 and losing $96.3, although Cohen and Einav argue that preferences likely differ between their automobile insurance context and other contexts like drug insurance. In the health insurance context, Handel (2013) finds that the median individual is indifferent between a bet of winning $100 and losing $94.6. In the model preferred by Handel and Kolstad (2015), which controls for friction and inertia, the mean consumer is indifferent between a bet of winning $1,000 and losing $913. In comparison, our results imply indifference between winning $1,000 and losing $739.
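As an illustration of how a CARA coefficient maps into the 50-50 bets discussed above, the following is a minimal sketch under stated assumptions. It assumes utility of the form u(x) = -exp(-gamma x), for which indifference between declining the bet and accepting a 50-50 chance of winning G or losing L implies L = ln(2 - exp(-gamma G)) / gamma. The function names and the symbol gamma are hypothetical, and the scaling of the coefficient used to construct Table A5 is not shown in this excerpt, so the output need not reproduce the entries in that table.

```python
import math


def indifference_loss(gain, gamma):
    """Loss L at which a CARA agent is indifferent to a 50-50 bet of +gain / -L.

    Assumption (not from the paper): u(x) = -exp(-gamma * x), so that
    0.5 * u(gain) + 0.5 * u(-L) = u(0) gives
    L = ln(2 - exp(-gamma * gain)) / gamma.
    """
    if gamma <= 0:
        raise ValueError("gamma must be positive for a risk-averse agent")
    return math.log(2.0 - math.exp(-gamma * gain)) / gamma


def expected_utility_gap(gain, loss, gamma):
    """Expected utility of the bet minus the utility of declining it."""
    def u(x):
        return -math.exp(-gamma * x)
    return 0.5 * u(gain) + 0.5 * u(-loss) - u(0.0)


if __name__ == "__main__":
    gamma = 1e-3  # hypothetical illustrative value, not an estimate from Table 5
    for gain in (100.0, 1000.0):
        loss = indifference_loss(gain, gamma)
        print(f"win {gain:.0f}: indifferent at a loss of {loss:.1f}")
```

Under this assumed form, the indifference loss is always below the gain for a risk-averse agent and approaches the gain as gamma goes to zero, which is the sense in which a loss close to $100 on a $100 bet indicates near risk neutrality.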


Note: The table reports parameter estimates from logit models estimated from data on all choices; from non-suspect choices only; and from suspect choices only. All models include indicators for insurers. Robust standard errors are clustered by enrollee. *, **, and *** indicate that the p-value is less than 0.1, 0.05, and 0.01 respectively.

Table A7 reports results from a logit model validation exercise. The purpose is to determine whether the models estimated separately by suspect and non-suspect choices outperform the pooled model, and whether the suspect model better predicts suspect choices than the non-suspect model does and vice versa. For this exercise the estimation sample is 2008 while the prediction sample is 2009. We chose these two years because they incorporate the largest year-to-year change in the choice set in our data, a central aspect of out-of-sample validation methods (Keane and Wolpin 2007). In particular, the number of plans available fell by 10%, although three new brands entered the market, precluding our use of brand indicators in the models. The results show that both in-sample and out-of-sample predictions are closer to the data along a number of policy-relevant outcomes when we base the predictions on separate models for the given type of choice. Blue shading indicates the moments where our preferred model, which distinguishes between suspect and non-suspect choices, outperforms the pooled model. Red shading indicates moments where the pooled model performs better.
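The shading rule described above can be sketched as a comparison of absolute prediction errors, moment by moment. The sketch below is illustrative only: the moment names and all numbers are hypothetical placeholders rather than values from Table A7, and the use of absolute error as the distance measure is an assumption.

```python
def closer_model(observed, pooled_pred, split_pred):
    """Return which prediction is closer to the observed moment.

    'split' refers to predictions from the separate suspect and non-suspect
    models; 'pooled' refers to the single pooled model. Closeness is measured
    by absolute error, which is an assumption of this sketch.
    """
    err_pooled = abs(pooled_pred - observed)
    err_split = abs(split_pred - observed)
    if err_split < err_pooled:
        return "split"
    if err_pooled < err_split:
        return "pooled"
    return "tie"


# Map the winning model to the shading convention used in the table.
SHADING = {"split": "blue", "pooled": "red", "tie": "none"}

# Hypothetical placeholder moments: (observed, pooled prediction, split prediction).
# These are NOT results from the paper.
moments = {
    "hypothetical_moment_a": (0.10, 0.15, 0.11),
    "hypothetical_moment_b": (300.0, 310.0, 325.0),
}

shading = {name: SHADING[closer_model(*vals)] for name, vals in moments.items()}

if __name__ == "__main__":
    for name, color in shading.items():
        print(name, color)
```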

Panel A repeats our baseline results. Panel B uses a more inclusive definition based on the union of dominated plan choices, the knowledge question, and being able to reduce expenditures by more than 33%. Panel C includes enrollment decisions from the inaugural year of Medicare Part D.


