Inertia tends to be lower for people who get help choosing a plan and who searched for information about CMS programs, whereas it tends to be higher for people who are older, nonwhite, and who have higher incomes. The income effect could again be due to heterogeneity in the opportunity cost of time. The directions of these effects are mostly consistent across the suspect and non-suspect groups, but the monetary implications are larger for the suspect group. The average non-suspect enrollee would have to be paid $809 to hold their utility constant if they were randomly reassigned to a different plan offered by the same insurer, or $1,290 if they were reassigned to a plan offered by a different insurer. Comparable figures for the suspect group are $2,398 and $3,660. The fact that we see greater inertia for between-insurer switches than for within-insurer switches is consistent with the inertia parameters reflecting latent preferences and hassle costs. Between-insurer switches are likely to require more time and effort than within-insurer switches because different plans offered by the same insurer tend to have the same formularies, pharmacy networks, customer service, and so on. In contrast, insurers typically differ along these dimensions, so that switching insurance companies may require new prior authorization requests, transferring prescriptions to new pharmacies, and becoming familiar with new formulary and customer service systems. That said, it is also possible that psychological biases are greater for between-brand switches.

B. Validation Tests

A potential concern with our approach to distinguishing between suspect and non-suspect choices is that it could be overfitting the data and consequently yielding less accurate predictions for how consumers will respond to prospective policies. We assess the model’s predictive power by using validation tests similar to Keane and Wolpin (2007) and Galiani, Murphy, and Pantano (2015). The idea is to compare the out-of-sample predictions from our refined model (i.e., distinguishing between suspect and non-suspect choices) with those from the naïve pooled model. Our validation test is powered by the largest year-to-year change in the PDP choice set that occurred during our study period. Between 2008 and 2009 the number of plans fell by 10%. We use data from 2008 to estimate the naïve and refined models and then use each set of estimates to predict how consumers would adapt to their new choice sets in 2009.30 Table A5 shows that the refined model more accurately predicts the share of consumers choosing dominated plans; the share of consumers choosing the least expensive plans offered by their insurers; mean expenditures; the average amount that consumers choosing dominated plans could save by switching; and the share of consumers choosing to switch plans. The pooled model does a better job of predicting one outcome, the share of consumers choosing plans with gap coverage, by one percentage point. Overall, this exercise suggests that distinguishing between suspect and non-suspect choices improves the model’s predictive power.

As an indirect test of our maintained assumption that people in the suspect and non-suspect groups share the same underlying utility parameters, conditional on demographics and prescription drug use, we leverage the panel structure of our data to repeat the estimation for four mutually exclusive sets of enrollment decisions: (1) choices made by enrollees who always make suspect choices (n=3,749); (2) suspect choices made by enrollees who sometimes make non-suspect choices (n=617); (3) non-suspect choices made by enrollees who sometimes make suspect choices (n=759); and (4) choices made by enrollees who always make non-suspect choices (n=4,706). The results, shown in Tables A6-A7, reveal that the estimated marginal rates of substitution between cost, variance, and quality are similar between groups 1 and 2, and between groups 3 and 4, despite some reduction in statistical significance. In other words, when people who switch between the suspect and non-suspect groups make non-suspect choices they behave in similar ways to the people who always make non-suspect choices. This supports the assumptions underlying our approach of using non-suspect preference parameters to predict welfare effects for people in the suspect group.


30 We exclude indicators for insurance brand because some new insurers joined the market in 2009 so we are unable to estimate indicators for them in 2008.


Firms may adapt to a choice architecture policy by adjusting premiums to offset changes in consumer sorting. We allow for this by assuming that firms will anticipate how consumers will adjust their behavior and then reset plan premiums to maintain the net revenue per enrollee that they earned prior to the policy. This is equivalent to assuming that CMS would accompany any policy by maintaining the same plan approval and oversight processes that yielded the plan-specific net revenue per enrollee observed under the status quo.

For the baseline equilibrium, the expected net revenue per enrollee in plan k is

$$\pi_k^0 = \frac{p_k^0}{0.255} - c_k - \frac{\sum_i P_{ik}^0 \,(x_i - oop_{ik})}{\sum_i P_{ik}^0}. \tag{12}$$

Premiums, $p_k^0$, are divided by 0.255 to reflect the fact that beneficiaries pay on average 25.5% of actual premiums, with the remainder subsidized by taxpayers. The second term, $c_k$, represents the cost of plan management and operations per enrollee (e.g., customer service), which we assume is invariant to policy-induced changes in enrollment. The last term is the insurer’s expected cost of drugs for the enrollee, weighted by the baseline choice probabilities $P_{ik}^0$; $x_i$ is the gross cost of the drugs used by consumer i, so that $x_i - oop_{ik}$, the gross cost less the consumer’s out-of-pocket cost in plan k, represents expenditures paid by the insurer.

Equation (13) shows the fixed point problem that we solve to obtain the new vector of premiums, $p^1$:

$$\pi_k^1(p^1) - \pi_k^0 = 0, \tag{13}$$

where $\pi_k^1(p^1)$ is (12) evaluated at the post-policy premiums and choice probabilities, $P_{ik}^1$.

Because $c_k$ is constant, it cancels when (12) is differenced in (13). We observe $p_k^0$, $x_i$, and $oop_{ik}$ for all person-plan combinations, and we use our parameter estimates for suspect and non-suspect choices to calculate $P_{ik}^0$. All that remains is to solve for $p^1$. This requires modeling how choice probabilities change with premium adjustments. All else constant, increasing plan k’s premium will reduce the probability that people select it. Therefore, we iterate between solving for a vector of premiums to satisfy (13), conditional on $P_{ik}^1$, and updating $P_{ik}^1$ to reflect changes in the vector of premiums, until convergence.
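The iteration described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it assumes a simple multinomial logit form for the choice probabilities, and the function names (`choice_probs`, `drug_cost_per_enrollee`, `solve_premiums`), the premium coefficient `alpha`, and all numbers are hypothetical rather than taken from the paper. Because the management cost is constant, holding net revenue per enrollee fixed reduces to setting the new premium equal to the old premium plus 0.255 times the change in the expected insurer-paid drug cost, and the loop alternates between that step and updating the choice probabilities.

```python
import math

ENROLLEE_SHARE = 0.255  # share of the full premium paid by enrollees (from the text)

def choice_probs(v, premiums, alpha):
    """Logit probabilities (assumed form); v[i][k] is enrollee i's non-premium utility of plan k."""
    probs = []
    for row in v:
        u = [vk - alpha * pk for vk, pk in zip(row, premiums)]
        m = max(u)  # subtract the max for numerical stability
        e = [math.exp(x - m) for x in u]
        total = sum(e)
        probs.append([x / total for x in e])
    return probs

def drug_cost_per_enrollee(probs, insurer_paid):
    """Expected insurer-paid drug cost per enrollee, plan by plan."""
    costs = []
    for k in range(len(probs[0])):
        num = sum(p[k] * s[k] for p, s in zip(probs, insurer_paid))
        den = sum(p[k] for p in probs)
        costs.append(num / den)
    return costs

def solve_premiums(p0, v0, v1, insurer_paid, alpha, tol=1e-8, max_iter=1000):
    """Iterate premiums until net revenue per enrollee matches its baseline value."""
    cost0 = drug_cost_per_enrollee(choice_probs(v0, p0, alpha), insurer_paid)
    p1 = list(p0)
    for _ in range(max_iter):
        # Update choice probabilities at the current premiums, then expected costs.
        cost1 = drug_cost_per_enrollee(choice_probs(v1, p1, alpha), insurer_paid)
        # Premiums that restore baseline net revenue, conditional on those probabilities.
        p_new = [a + ENROLLEE_SHARE * (c1 - c0) for a, c1, c0 in zip(p0, cost1, cost0)]
        if max(abs(a - b) for a, b in zip(p_new, p1)) < tol:
            return p_new
        p1 = p_new
    raise RuntimeError("premium iteration did not converge")

# Hypothetical inputs: 4 enrollees, 3 plans, annual dollar amounts.
p0 = [300.0, 420.0, 540.0]
v0 = [[0.5, 0.2, 0.0], [0.0, 0.4, 0.3], [0.1, 0.0, 0.6], [0.3, 0.3, 0.1]]
v1 = [[0.5, 0.6, 0.0], [0.0, 0.4, 0.3], [0.1, 0.0, 0.6], [0.3, 0.3, 0.1]]  # policy shifts one enrollee
insurer_paid = [[800.0, 900.0, 1100.0], [300.0, 400.0, 600.0],
                [500.0, 600.0, 900.0], [250.0, 300.0, 450.0]]
alpha = 0.001

p1 = solve_premiums(p0, v0, v1, insurer_paid, alpha)
```

Plain fixed-point iteration is used here for transparency; it is not guaranteed to converge for arbitrary inputs, which is why the sketch raises an error after `max_iter` passes instead of returning an unconverged vector.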


After solving for new vectors of plan premiums and choice probabilities we use the results to calculate changes in insurer revenue and government expenditures. Equation (14) defines the change in insurer revenue per enrollee.

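$$\Delta R = \frac{1}{N}\left[\sum_i \sum_{k \in K^1} P_{ik}^1 \,\pi_k^1 - \sum_i \sum_{k \in K^0} P_{ik}^0 \,\pi_k^0\right], \tag{14}$$

where $P_{ik}$ is the probability that enrollee i chooses plan k, $\pi_k$ is net revenue per enrollee in plan k, $K$ is the set of plans on the menu, $N$ is the number of enrollees, and superscripts 0 and 1 denote the baseline and the policy.

While the revenue per enrollee for a given plan is held fixed by (13), the overall market revenue per enrollee may change due to changes in the way enrollees sort themselves across plans.31 Hence, choice architecture may mitigate or exacerbate adverse selection, consistent with Handel (2013). Equation (15) defines the expected change in government spending per enrollee.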



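$$\Delta G = \frac{1}{N}\left[\sum_i \sum_{k \in K^1} P_{ik}^1 \left[\frac{p_k^1}{0.255} - p_k^1\right] - \sum_i \sum_{k \in K^0} P_{ik}^0 \left[\frac{p_k^0}{0.255} - p_k^0\right]\right], \tag{15}$$

with $P_{ik}$ the choice probabilities, $K$ the plan menus, $p_k$ the enrollee-paid premiums, $N$ the number of enrollees, and superscripts 0 and 1 denoting the baseline and the policy.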

The term in brackets is the component of the total plan premium paid by taxpayers.
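As a worked illustration of the taxpayer-paid component, suppose (hypothetically) that an enrollee pays a premium of \$30 per month, and assume that the bracketed term is the full premium less the enrollee-paid premium. Using the 25.5% average enrollee share stated above:

$$\frac{30}{0.255} \approx 117.65, \qquad 117.65 - 30 = 87.65.$$

Under these assumptions the full premium would be about \$117.65 per month, of which about \$87.65 (74.5%) would be paid by taxpayers.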

iii. Bounding Consumer Welfare, Insurer Revenue, and Government Spending

Section V explained our approach to bounding the welfare effect of inertia and the policy’s effect on consumer behavior. We focus on the extremes defined by the union of these bounds. At one extreme is the case where the policy is “most effective” as a nudge in the sense that it causes the suspect group to start behaving like the non-suspect group and the inertia parameters estimated for the non-suspect group reflect psychological bias and hence have no direct effect on welfare (i.e., using equations 9’ and 11’). At the other extreme is the case where the policy is “least effective” as a nudge in that it does not change the suspect group’s behavior and the inertia parameters for the non-suspect group reflect the hassle cost of switching plans and/or preferences for latent plan attributes and hence are welfare relevant (i.e., using equations 9 and 11). If we held

31 This also means that average revenue per enrollee may change for any insurer offering multiple plans.


In early 2014, CMS proposed a series of changes to Medicare Part D that included a provision to limit each parent organization to offering only one basic and one enhanced plan in each region (Department of Health and Human Services 2014).32,33 This would have forced some current enrollees to switch plans. While the proposal was controversial and has yet to be implemented, it provides a prospective opportunity to investigate the effects of a realistic menu restriction.

Our first policy experiment uses the set of enrollees and available plans in 2010—the last year of our sample—to simulate the welfare effects of the proposed menu restriction.

Our data for that year describe 2,823 individuals, both new enrollees and those with experience. CMS must approve each PDP that an insurer offers, but the proposed regulation was unclear about how, exactly, CMS would determine which plans to retain. Therefore, we start by assuming that CMS would require each sponsor to continue to offer its most popular plans, i.e., the single basic plan and the single enhanced plan with the highest enrollments. Then we consider alternative rules as robustness checks below. The menu restriction reduces the number of plans on the average enrollee’s menu from 47 to 31.
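The retention rule assumed above (each sponsor keeps its highest-enrollment basic plan and its highest-enrollment enhanced plan) can be sketched as follows. The `Plan` record, the `restrict_menu` function, and the enrollment counts are hypothetical and only illustrate the rule; ties are broken in favor of the plan listed first, which is an assumption the text does not address.

```python
from typing import List, NamedTuple

class Plan(NamedTuple):
    plan_id: str
    sponsor: str
    kind: str        # "basic" or "enhanced"
    enrollment: int

def restrict_menu(plans: List[Plan]) -> List[Plan]:
    """Keep, for each sponsor, the most popular basic and the most popular enhanced plan."""
    best = {}
    for plan in plans:
        key = (plan.sponsor, plan.kind)
        # Strict inequality keeps the first-listed plan when enrollments tie.
        if key not in best or plan.enrollment > best[key].enrollment:
            best[key] = plan
    # Preserve the original menu order among the retained plans.
    return [p for p in plans if best[(p.sponsor, p.kind)] is p]

# Hypothetical menu: sponsor A offers two basic and two enhanced plans, sponsor B one basic plan.
menu = [
    Plan("A1", "A", "basic", 500),
    Plan("A2", "A", "basic", 300),
    Plan("A3", "A", "enhanced", 200),
    Plan("A4", "A", "enhanced", 250),
    Plan("B1", "B", "basic", 100),
]
restricted = restrict_menu(menu)
forced_switchers = sum(p.enrollment for p in menu if p not in restricted)
```

In this toy menu the enrollees of plans A2 and A3 would be forced to switch, which is the channel through which the restriction can impose hassle costs or move people out of dominated plans.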

The menu restriction affects consumer welfare in at least four ways. First, people may be made worse off when their utility maximizing plans are eliminated. Second, individuals who switch plans may incur hassle costs of switching. Third, individuals in the suspect group may be made better off if the policy forces them to switch out of a dominated plan or if the policy succeeds in reducing their inertia and nudging them to place greater emphasis on cost and risk reduction in ways that induce them to switch to plans that are cheaper, higher quality, and provide better insurance against health shocks. The magnitude of each of these gains (or losses) depends on which plans are eliminated and the relative benefits of switching. Finally, when enrollees switch plans their sorting behavior feeds into equilibrium premiums. As Handel (2013) points out, the direction of this effect is ambiguous. Increased sorting may increase or decrease adverse selection depending in part on whether the sorting is driven by suspect or non-suspect choosers.

32 “Parent organizations” or “sponsors” are entities that contract with CMS to sell PDPs. They may include multiple brand names. Basic plans may differ in design but must be deemed actuarially equivalent to the standard benefits package for some representative enrollee(s). Enhanced plans offer supplemental benefits.

33 The proposal included the rationale to “…ensure that beneficiaries can choose from a less confusing number of plans that represent the best value each sponsor can offer” (Federal Register 2014).

To summarize results we start by focusing on the case in which CMS requires each insurer to retain their basic and enhanced plans with the highest numbers of enrollees. Figure 1 summarizes the distributional effects on the beneficiary population. It shows CDFs of the expected consumer surplus under the “most effective” and “least effective” scenarios for the efficacy of the policy in nudging consumers [henceforth ME and LE]. The bar charts in the bottom half of the figure summarize the demographic characteristics of the people who have expected welfare gains (i.e. winners) and losses (i.e. losers). In both scenarios fewer than 25% of consumers are made better off by the policy, yet the policy appears relatively progressive. The winners are less likely to have a college degree; they are more likely to have been diagnosed with cognitive illnesses; they tend to have higher gross drug expenditures (consumer expenditures + insurer expenditures); and they are far more likely to belong to the suspect group. That said, the median consumer in every subgroup we consider is made worse off under both scenarios.

Figure 2 summarizes the mechanisms that drive the welfare effects. It reports the shares of winners and losers who are forced to switch because the policy eliminates their current plans, followed by their average reductions in their premiums, their average reductions in OOP expenditures, their average reductions in variance, and the average increases that they experience in the CMS quality index as well as the index of latent quality defined by the insurer dummy variables. The last three effects are converted to dollar equivalents by dividing the changes in each variable by the marginal utility of income for the non-suspect group.

[Figure 1: CDFs of the expected change in welfare. Legend: Most effective nudge (inertia is not latent preferences; policy changes behavior); Least effective nudge (inertia is latent preferences; policy does not change behavior).]

Note: The figure shows CDFs of the expected change in welfare from limiting each insurer to selling one basic plan and one enhanced plan, assuming that CMS requires insurers to keep the plans with the highest current enrollment. The bar charts report demographics for the average enrollee with welfare gains (“winners”) and losses (“losers”) under alternative assumptions about the efficacy of the nudge.

[Figure 2: Mechanisms driving the welfare effects for winners and losers.]

Note: The first column reports the share of winners and losers who are forced to switch because their chosen plans are eliminated. The next two columns report average reductions in premiums and out-of-pocket expenditures. The last three columns use the marginal utility of income for the non-suspect group to report the reduction in variance and the increases in plan quality in monetary equivalents.

In the ME scenario just under 25% of consumers are made better off. Nearly half of them are forced to switch plans, but under the ME assumptions switching does not reduce utility. Many of the people who are forced to switch, particularly those in the suspect


