The Missing Main Effect of Welfare State Regimes: A Replication of ‘Social Policy Responsiveness in Developed Democracies’ by Brooks and Manza

Breznau

This article reports the results of a replication of Brooks and Manza’s “Social Policy Responsiveness in Developed Democracies” published in 2006 in the American Sociological Review. The article finds that Brooks and Manza utilized an interaction term but excluded the main effect of one of the interacted variables. This model specification has specific implications: statistically, that the omitted main effect variable has no correlation with the residual error term from their regression; theoretically speaking, this means that all unobserved historical, cultural, and other characteristics that distinguish liberal democratic welfare regimes from others can be accounted for with a handful of quantitative measures. Using replicated data, this article finds that the Brooks and Manza models fail these assumptions. A sensitivity analysis using more than 800 regressions with different configurations of variables confirms this. In 99.5 percent of the cases, addition of the main effect removes Brooks and Manza’s empirical findings completely. A theoretical discussion illuminates why these findings are not surprising. This article provides a reminder that models and theories are coterminous, each implied by the other.

Brooks and Manza (2006a) published a finding that policy responsiveness explains social welfare spending across advanced democracies in the American Sociological Review, and their article is now one of the journal's 100 most cited. In their article they suggest that public preferences shape divergent trajectories of welfare states over the long term. Whereas previous scholars considered national values, power resources, and the path dependency of political institutions, Brooks and Manza's approach suggests that the will of the masses is a powerful political force shaping social policy output. This is an attractive argument, especially for sociologists who are skeptical of democratic processes and expect policymaking to cater to elite interests, often against what the public wants (Habermas 1989). The idea that democracy empowers the masses and that this empowerment helps to define and reproduce societal outcomes such as welfare spending has gained considerable attention, to say the least. Before singing the triumphs of mass public preferences in democratic processes, I take a second look at Brooks and Manza's empirical models and attempt to replicate their research. Spoiler alert: I will show that Brooks and Manza made a specification error.
This article unfolds somewhat in reverse. First, I provide a discussion of the implications of Brooks and Manza's statistical models. They use an interaction term to allow differences in the impact of policy preferences between liberal and non-liberal democratic welfare regimes, but they do so without one of the two main effects. The theoretical implications of this modeling strategy are that any unobserved differences associated with liberal and non-liberal regimes are entirely unrelated to social welfare spending levels. Turning this statement around, the implication is that different types of welfare state regimes do not explain different levels of welfare state spending. Brooks and Manza do not discuss this assumption; therefore it might be a case of model specification error. Next, I replicate their research and then test for specification error via inclusion of the missing main effect. Finally, I discuss the theoretical assumptions of the two potential models and why welfare state theories explain the critical role of the democratic welfare state regime main effect.

Policy Responsiveness and Specification Error
Social scientists regularly debate policy responsiveness, asking whether social policies change in response to public preferences (Burstein and Freudenburg 1978; Page 1994; Papadakis 1992). In this debate, more or less spending is regularly used to operationalize policy. Further, social welfare policy¹ is a focus area in this debate because transfers from the government to protect against social risks and to provide for those in need have a large impact on the socioeconomic wellbeing of many individuals in advanced democratic societies (Papadakis 1992). Some argue that social policies are not democratically responsive, observing empirically that spending and opinion do not exhibit significant patterns (Brooks 1985, 1987, 1990; Larsen 2006; Page and Shapiro 1983, Table 7). However, Burstein's meta-reviews (1998, 2003) of the major opinion-policy research projects of the 1970s, 80s, and 90s suggest that empirical evidence favors policy responsiveness. For example, levels of public support for a policy determine later changes to that policy (e.g., Monroe 1998), and opinion and welfare policy exhibit significant covariation in the presence of other predictor variables (e.g., Wright, Erikson, and McIver 1987). More recent research also suggests that opinion may cause changes in welfare policy within a larger feedback cycle where each responds to the other over time, as shown in the United States, Canada, the United Kingdom, and the Netherlands (Raven et al. 2011; Soroka and Wlezien 2005a, 2005b, 2010; Wlezien 1995, 2004).
The research of Brooks and Manza (2006a, 2006b, 2007) suggests that policy responsiveness also explains variation across countries, giving the debate a new twist. Using pooled regression modeling they showed that cross-national variation in social welfare spending as a percentage of GDP is significantly predicted by variation in public policy preference levels during the period 1985-2001 across 15 advanced democracies. Based on the idea that public preferences shape policy within societies, they proposed that these preferences are part of the structure of welfare state systems and help keep them on their respective longer-term paths. Their work paved the way for a new research agenda and a theory of policy responsiveness that uses public policy preferences to explain spending levels between societies, or 'Why welfare states persist,' to use one of their titles. Their research is the point where this paper engages the policy responsiveness debate.

Here is how Brooks and Manza support their policy responsiveness claims statistically. They first compile a dataset with their dependent variable of welfare spending as a percentage of GDP at PPP by year from the Organization for Economic Cooperation and Development (OECD)'s Social Expenditures Database. They select their countries and years based on the public policy preferences data available in the International Social Survey Program (ISSP). The result is 15 countries, most of them observed at a handful of different time points from 1985-2001. The public preferences measure constituting their independent variable, which they label mass social policy preferences, comes from two ISSP questions on the role of the government in reducing income inequality and providing labor market opportunities; answers range from "definitely should not be" and "probably should not be" to "probably should be" and "definitely should be" the government's responsibility. These questions are part of a latent scale of attitudes toward social welfare.² Based on a review of relevant literature they select year, per capita GDP, unemployment rate, aged population, women's labor force participation (LFP), political institutions, left party control, and religious party control as their control variables. Their public policy preferences variable and relevant control variables are lagged one year behind the dependent variable.
Next they regress social welfare spending on their independent preferences variable plus all relevant control variables, and they use robust cluster standard errors to account for the non-independence of repeated observations by country.³ They use two models, one without the religious and left party variables and one with them, to account for possible indirect effects running through partisan politics. In both models they include an interaction effect of their public preferences measure with a welfare regimes dummy variable coded as 1 for liberal (i.e., English-speaking plus Japan) and 0 for all others (i.e., social democracies of Scandinavia and conservative/Christian democracies in the rest of Western Europe).⁴ This leads to two different effects in the model: a coefficient for policy preferences (representing the effect in non-liberal regimes), and a coefficient for policy preferences interacted with liberal (representing the difference of the effect in liberal regimes from non-liberal regimes). Adding the two effects together produces the effect in liberal regimes. Their results suggest that a one-point increase in public preferences explains a 3.70 (without party variables) and a 2.65 (with party variables) percentage point increase in social spending as a percent of GDP in European, non-liberal countries (Brooks and Manza 2006a, Table 4). Their results also suggest that a one-point increase in public preferences explains 1.35 (without) and 0.88 (with party variables) percentage point increases in English-speaking countries plus Japan. These effects are understood with all else equal given the control variables in the model.
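As a worked check on the figures just quoted, the interaction coefficients implied by the reported effects can be backed out by subtraction. In the article's notation, $b_1$ is the policy preferences coefficient (the non-liberal effect) and $b_3$ is the interaction coefficient, so the liberal-regime effect is $b_1 + b_3$. The implied values of $b_3$ below are derived here by arithmetic from the effects quoted above; they are not taken from Brooks and Manza's table.

```latex
\begin{align*}
\text{Without party variables:}\quad & b_1 = 3.70,\quad b_1 + b_3 = 1.35
  \;\Rightarrow\; b_3 = 1.35 - 3.70 = -2.35 \\
\text{With party variables:}\quad & b_1 = 2.65,\quad b_1 + b_3 = 0.88
  \;\Rightarrow\; b_3 = 0.88 - 2.65 = -1.77
\end{align*}
```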
Although some extensively debated their theoretical framework and their lack of attention to changes within countries over time (Kenworthy 2009; Myles 2006), I am unaware of any research that questions their statistical results directly. I do so for three reasons: (1) they report the sizes of the coefficients for policy preferences in liberal regimes but not significance levels; (2) they do not include the main effect of regime in their interaction model; and (3) their small sample size of 43 makes regression results unstable. Although one critic refers to these technical issues as a "surface" debate which does not address the underlying theory (Myles 2006), I argue that these statistical issues are highly relevant for the future of policy responsiveness and social policy research because models are formal manifestations of theoretical assumptions, where each implies the other. Thus, their empirical strategy speaks directly to their theoretical assumptions, and if they made assumptions without discussing them, there is room for potential errors and improvements.
Two theoretical implications have come to the fore since their work gained so much attention. First, their results suggest that theories of policy responsiveness are relevant for between-country variations in social spending. Second, their results suggest that there are different levels of policy responsiveness in Europe versus the English-speaking democracies. The former argument addresses the way that scholars think about opinion and policy, namely that they have path-dependent or path-reinforcing effects on one another between societies. This expands typical political theories, which focus on short-term fluctuations within polities. The latter argument opens up the question of why certain publics get policies that are more aligned with their preferences than others; it is also a kind of slap in the face for liberal democracies, as they appear to have a far weaker linkage between preferences and policy spending. However, the true contribution of these theoretical claims rests on the design of the empirical models and on the statistical results.
The theoretical implications alone are a good reason to continue to debate what Brooks and Manza claim; however, the specific focus of this article is on empirical reasons to reconsider their results. First, it is unsettling that their article does not report a significance level for the effect of policy responsiveness in liberal countries. What if the effect is not significant? This would lead not only to the conclusion that the public gets less of what it wants in English-speaking democracies and Japan, but that the public gets nothing of what it wants in these countries, at least not when measured using these data. Therefore, the first methodological step of this article is to replicate their work step by step. This means obtaining data from the same public sources, mostly the ISSP and OECD, and running the same robust clustered regressions with the same country-time point cases.
Second, their claim that public preferences can be decomposed into different effects that explain differences in social spending among the different clusters of countries (liberal versus non-liberal) (Brooks and Manza 2006a, 488) makes clear that they expect independent effects of public preferences on social spending in the different regime clusters. But Brooks and Manza do not model two independent effects. Instead they estimate two dependent linear effects. They are dependent on the scaling of preferences: when public preferences = 0, the two predicted regression lines are equal to the mean of social spending for the entire sample; i.e., they are forced to intersect. The value of 0 for policy preferences is defined as the average for the entire sample because Brooks and Manza mean-center this variable.
What does it mean if both regression lines intersect when policy preferences are at the sample average? Substantively, it means there should be no unobserved characteristics that distinguish these two regimes (i.e., no unobserved heterogeneity). Thus, any factors that might cause spending differences between liberal and non-liberal welfare states must be controlled for in their models; otherwise there is specification error.
The assumption of random errors (i.e., that residuals are unrelated to the predictors and homoscedastic) is one of the most basic of linear regression (Pedhazur 1997). This further suggests that at a global (i.e., sample) average of public preferences in advanced democracies, social spending should be identical across regimes, all else equal. Although omitting the main effect of regime might have theoretical justification, Brooks and Manza provide none; they simply do not mention that they left out the main effect. I find this troubling given that including the main effect is standard statistical practice. The most basic form of an interactive regression equation has three coefficients and a constant (see Tate 1984):

$$Y = a + b_1 X_1 + b_2 X_2 + b_3 X_1 X_2 + E_1 \tag{1}$$

If Equation (1) were applied to the Brooks and Manza example, welfare state spending would be outcome $Y$, policy preferences $X_1$, and liberal regime $X_2$. But Equation (1) is not the Brooks and Manza model. What they model is actually a restricted version, as shown in Equation (2) below.

$$Y = a + b_1 X_1 + b_3 X_1 X_2 + E_2 \tag{2}$$

Equation (2) assumes that $b_2 = 0$: no differences between liberal and non-liberal regimes for $Y$ when $X_1 = 0$ (average policy preferences); i.e., the equation assumes that the residual errors $E_2$ are independent of and not correlated with $X_2$ (the regime dummy), or any other independent variable in the regression equation (e.g., Pedhazur 1997; and any basic text on regression), as shown in Equations (3) and (4).
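The contrast between the full interaction model and the restricted one can be illustrated with a small simulation. The sketch below uses entirely synthetic data, not the Brooks and Manza data: the sample size, regime shares, and the true values (no preference effect, a regime gap of -8) are assumptions chosen only for illustration, and the `ols` helper is a hypothetical plain-Python solver rather than any package's routine.

```python
import random


def ols(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    for c in range(k):
        # Partial pivoting for numerical stability.
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]


random.seed(1)
data = []
for _ in range(400):
    x2 = 1.0 if random.random() < 0.4 else 0.0    # hypothetical liberal dummy
    x1 = random.gauss(-1.0 if x2 else 0.7, 1.0)   # preferences differ by regime
    y_i = 24.0 - 8.0 * x2 + random.gauss(0.0, 1.5)  # true b1 = b3 = 0, b2 = -8
    data.append((x1, x2, y_i))

# Mean-center preferences so that X1 = 0 is the sample average.
x1_mean = sum(d[0] for d in data) / len(data)
data = [(x1 - x1_mean, x2, y_i) for x1, x2, y_i in data]
y = [d[2] for d in data]

# Full model: constant, X1, X2, X1*X2.
a_f, b1_f, b2_f, b3_f = ols([[1.0, x1, x2, x1 * x2] for x1, x2, _ in data], y)
# Restricted model: constant, X1, X1*X2 (b2 forced to zero).
a_r, b1_r, b3_r = ols([[1.0, x1, x1 * x2] for x1, x2, _ in data], y)

print("full:       b1=%.2f  b2=%.2f  b3=%.2f" % (b1_f, b2_f, b3_f))
print("restricted: b1=%.2f  b3=%.2f" % (b1_r, b3_r))
```

In this synthetic setup the full model should recover slopes near zero and a regime gap near -8, whereas the restricted model, which forces both lines through a common intercept, reports positive slopes in both regimes even though none exist in the simulated data.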
Finally, Brooks and Manza's full model has 43 cases, nine independent variables, and one interaction term. This brings up a concern about degrees of freedom. Many studies demonstrate that when there are fewer than ten cases per independent variable, results become untrustworthy (Babyak 2004; Peduzzi et al. 1996; Shalev 2007); with 43 cases and ten terms, their full model has roughly four. Although it is not the task of this article to argue for or against the problem of identification of significant effects in small-N studies, I am concerned with the robustness of their results, and therefore I engage in a sensitivity analysis that reproduces their models with thousands of variations in independent variable configurations tested with and without the main effect.
The final part of this article revisits the theoretical underpinnings of the Brooks and Manza models and discusses why we should expect that the error terms in their regression analyses are correlated with liberal and non-liberal welfare state regimes. To do this I review the state of the art of welfare state research and discuss why various theoretical perspectives on regimes, or what some might call families of nations, suggest that there are myriad unobserved characteristics that pattern by regime categories of welfare states. This sets the stage for a discussion of theoretical specification as coterminous with model specification, a methodological reality of which sociologists should be aware.

Replication of Brooks and Manza
Measurement
I use the same data sources and methods as reported in Brooks and Manza (2006a, 483, Table 2). The variables of primary interest are welfare state spending, policy preferences, and liberal democracy regime ($Y$, $X_1$, and $X_2$, respectively, in Equations (1) and (2)). My dependent variable, welfare spending, is measured accordingly as a percentage of GDP from the OECD's SOCX data (see the technical appendix). This measure is taken at each country-year time point one year after the respective country-year observations of the independent variable in the ISSP.
I measure policy preferences toward welfare by also using the two items available in the "Role of Government" and "Religion" modules in the ISSP. I use the ISSP's population weights so that these measures most accurately represent public preferences in each country in the aggregate.⁵ Brooks and Manza combine these two items into "factor scores" (p. 482). This is analytically vague. In general, factor scores are defined as the predicted values that result from a factor analysis (Bollen 1989; Tabachnick and Fidell 2001). Brooks and Manza do not specify the type of factor analysis; however, their resulting preferences variable does not have a standardized distribution (s.d. = 1.88), so it was not predicted from a standard linear normal regression model using effects from a factor analysis, which I estimate from the data as: s.d. = 1 (principal components analysis), s.d. = 0.71 (principal factor analysis), or s.d. = 0.82 (iterated principal factor analysis), with no results for maximum likelihood with fewer than three observed variables. Furthermore, these three standard deviations would change respectively to 0.28, 0.40, and 0.32 after aggregating to country-time point means. It is most likely that Brooks and Manza created an additive index of the two items (s.d. = 1.70 in my calculations), but this standard deviation still drops to 0.68 after aggregating. Nonetheless, all four methods produce scales that correlate at exactly 1.00 because there are only two items, meaning that each has identical factor loadings producing linear predictions perfectly covarying with the additive scale. So this point is moot, and I proceed using the resulting two-item additive scale.
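The claim that two-item scales covary perfectly can be illustrated with a minimal sketch. It rests on several assumptions that are not from the original analysis: the two items are simulated rather than drawn from the ISSP, both are standardized before summing, and the principal component weights are taken from the known closed form for a 2x2 correlation matrix with a positive correlation, whose leading eigenvector is (1, 1)/sqrt(2).

```python
import math
import random


def pearson(u, v):
    """Pearson correlation of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)


def standardize(u):
    """Rescale to mean 0 and sample standard deviation 1."""
    n = len(u)
    m = sum(u) / n
    s = math.sqrt(sum((a - m) ** 2 for a in u) / (n - 1))
    return [(a - m) / s for a in u]


random.seed(0)
# Two hypothetical survey items driven by one latent attitude.
latent = [random.gauss(0.0, 1.0) for _ in range(500)]
item1 = [t + random.gauss(0.0, 0.8) for t in latent]
item2 = [t + random.gauss(0.0, 0.8) for t in latent]

z1, z2 = standardize(item1), standardize(item2)
r12 = pearson(z1, z2)

# First principal component of [[1, r], [r, 1]] with r > 0 weights both items equally.
w = 1.0 / math.sqrt(2.0)
pc_score = [w * a + w * b for a, b in zip(z1, z2)]
additive = [a + b for a, b in zip(z1, z2)]

r_scales = pearson(pc_score, additive)
print(round(r12, 3), round(r_scales, 6))
```

Because the component score is just the standardized sum multiplied by a constant, the two scales differ only in their units. If the items were summed without standardizing and had unequal variances, the correlation would be close to, but not necessarily exactly, one.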
I expected that some things would not be identical, especially given that sources such as the OECD and ISSP sometimes make adjustments to their publicly available data, and sometimes do not report these (Breznau 2015). It is important to note that the standard deviation of 1.88 seems out of place here and it might be useful to look at the original work of Brooks and Manza; unfortunately they currently do not publicly share their data or any extra methods information.⁶ However, I am able to recover their original policy preferences measure in another way: for a research note giving descriptive and graphical re-visitation of their empirical work, Kenworthy (2009) was given the original policy preferences variable by Brooks and Manza, and he shares this publicly on his professional website.⁷ With their measure of policy preferences I am able to more accurately reproduce their original findings, which are under scrutiny here as having suffered from specification error. I argue that my production of a policy preferences variable is no more or less "correct" than theirs, and therefore I include their variable and mine in two versions of my models. Most importantly, my measure correlates at 0.983 with their measure, so the two should be essentially the same in any model. To aid in comparability I linearly transform my policy preferences variable to a mean of 0 and a standard deviation of 1.88, and I transform the welfare spending variable to match their mean and standard deviation, which does nothing to the predicted values.
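The linear transformation described above can be sketched as follows. The data are simulated placeholders (43 hypothetical country-time points), the helper names are illustrative, and the use of the sample standard deviation (dividing by n - 1) is an assumption, since the source does not say which convention was used. The sketch shows that rescaling to a mean of 0 and a standard deviation of 1.88 leaves the correlation with spending unchanged.

```python
import math
import random


def mean(u):
    return sum(u) / len(u)


def sd(u):
    """Sample standard deviation (n - 1 denominator)."""
    m = mean(u)
    return math.sqrt(sum((a - m) ** 2 for a in u) / (len(u) - 1))


def rescale(u, target_mean, target_sd):
    """Linear transformation to a chosen mean and standard deviation."""
    m, s = mean(u), sd(u)
    return [target_mean + target_sd * (a - m) / s for a in u]


def pearson(u, v):
    mu, mv = mean(u), mean(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)


random.seed(2)
# Hypothetical aggregate preference scores and spending levels.
prefs = [random.gauss(3.0, 0.7) for _ in range(43)]
spending = [20.0 + 2.0 * p + random.gauss(0.0, 3.0) for p in prefs]

prefs_rescaled = rescale(prefs, 0.0, 1.88)

r_before = pearson(prefs, spending)
r_after = pearson(prefs_rescaled, spending)
print(round(mean(prefs_rescaled), 6), round(sd(prefs_rescaled), 6))
print(round(r_before, 6), round(r_after, 6))
```

A positive linear rescaling changes the units of the coefficient on preferences but not the fit of the model, which is why the two versions of the variable can be compared directly.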

Replication Models
I first reproduce the policy responsiveness models of Brooks and Manza (2006a, 485-486). To do this, I analyze the data using OLS regression with robust clustered standard errors, where country is the clustering variable and country-time point the unit of analysis. The results in Table 1 are metric coefficients with a single significance star to denote a p < 0.05 level (Brooks and Manza's original significance criterion). The first column under model 1 contains the original results from Brooks and Manza (2006a, Table 4). The second column contains results from my reproduction of their dataset and my own measurement of the policy preferences variable. The third column contains results from my reproduction of their dataset but includes their policy preferences measurement. These three variations are repeated in the next three columns under model 2, which is identical to model 1 except for inclusion of the two political party variables.
The results show slight divergence across the models for the main policy preferences effect representing the effect in non-liberal regimes (model 1: 3.70, 4.14, and 4.63; model 2: 2.65, 2.87, and 3.43), and for the interaction coefficients added to the main effect coefficients, thus yielding the effect in liberal regimes (after addition, 1.35, 1.41, and 0.91 in model 1, and 0.88, 0.17, and 1.48 in model 2). However, results generally point toward similar trends, with significant main policy preferences coefficients across all models (policy preferences) and a significant negative difference of the interaction coefficients across all models (policy preferences*liberal). The lower panel of Table 1 reveals that across all versions the predicted point estimates for welfare spending with all variables at their means, and then policy preferences taken at one standard deviation below and one above the mean, look strikingly similar. Therefore, I conclude that I more or less replicated Brooks and Manza's original findings.

Now I test whether the strict assumption that the error terms are uncorrelated with the liberal regime variable, required for estimation of Equation (2) (assumptions mathematically summarized in Equations (3) and (4)), holds. Table 2 provides Pearson's correlations for the four error terms that correspond to the four regression models shown in Table 1.
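The logic of this residual check can be sketched on synthetic data. Everything below is an assumption made for illustration: the simulated sample, the true regime gap of -8, and the hypothetical `ols` and `pearson` helpers. The sketch fits the model without the regime main effect, then correlates its residuals with the omitted regime dummy; for comparison it repeats the check for the model that includes the dummy, where least squares forces the correlation to zero by construction.

```python
import math
import random


def ols(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]


def pearson(u, v):
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)


def residuals(X, y):
    """Observed minus fitted values from a least-squares fit."""
    b = ols(X, y)
    return [yi - sum(bj * xj for bj, xj in zip(b, row)) for row, yi in zip(X, y)]


random.seed(7)
liberal, prefs, spend = [], [], []
for _ in range(300):
    d = 1.0 if random.random() < 0.4 else 0.0      # hypothetical regime dummy
    liberal.append(d)
    prefs.append(random.gauss(-1.0 if d else 0.7, 1.0))
    spend.append(24.0 - 8.0 * d + random.gauss(0.0, 1.5))

center = sum(prefs) / len(prefs)
prefs = [p - center for p in prefs]                # mean-centered preferences

X_restricted = [[1.0, p, p * d] for p, d in zip(prefs, liberal)]
X_full = [[1.0, p, d, p * d] for p, d in zip(prefs, liberal)]

r_restricted = pearson(residuals(X_restricted, spend), liberal)
r_full = pearson(residuals(X_full, spend), liberal)
print(round(r_restricted, 3), round(r_full, 6))
```

A clearly non-zero correlation in the restricted model signals that the omitted dummy carries information the included terms have not absorbed, which is the pattern the test in the text looks for.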
The correlations with the residual errors in all four models are of non-trivial size, ranging from -0.23 to -0.35, suggesting that these models suffer specification problems. However, the correlations are only significant for models 1A and 2A; in models 1B and 2B the correlation has a p-value below 0.2; roughly put, a correlation this large would arise by chance less than 20 percent of the time if the true correlation were zero. Thus this test is suggestive, but so far inconclusive, of specification error. Adding the main effect for liberal regime will offer more conclusive evidence.

[Table 3 note: Models are identical to those in Table 1 except for the addition of the dummy variable for liberal (= 1) compared to non-liberal regimes (= 0); i.e., the main effect for the interaction, as denoted by the subscripted "m" added to each model name from Table 1.]

Adding the Main Effect to Test for Specification Error
Next I add the main effect for liberal regime into the models; i.e., $b_2$ from Equation (1), which was omitted from the Brooks and Manza interaction model demonstrated by Equation (2) and reproduced in Table 1. Table 3 reports results of regressions otherwise identical to those from Table 1 but with the addition of the dummy variable for liberal welfare state regime.
Initial results demonstrate that coefficients for both policy preferences and its interaction with liberal regime are insignificant using both my own calculation of policy preferences and Brooks and Manza's calculation. The effects seem to jump into the main effect, which is significant well beyond the five percent cutoff (p-values < 0.001 in all four models), suggesting that liberal welfare states spend somewhere between 6.93 and 9.37 percentage points less of their GDPs on social welfare goods and services than non-liberal welfare states, all else equal. These findings suggest that the models suffer from serious specification errors.
Using the predicted margins from the models representing Equation (1) (Table 3) and Equation (2) (Table 1), I plot the respective regression lines from models 2B and 2B$_m$, where the main effect is added, to offer a visualization of the specification errors. The graphs in Figure 1 present regression lines plotted from predicted values. Other than the lines, the graphs are identical in their scatterplots of the observed values of public preferences (x-axis) and social welfare spending (y-axis) by country-time point. The left panel, labeled "Without the Main Effect of Regime," represents the policy responsiveness theoretical model of Brooks and Manza, where the regression lines are forced to intersect at a policy preferences value of zero (i.e., when all else is equal). The right panel, "With the Main Effect," represents the standard interaction model, where each effect is independent by regime. The blue triangles represent liberal country-time points, and the blue dotted line represents the effect of policy preferences on spending in the liberal regime. The red squares represent non-liberal (social democratic/Christian/corporatist) country-time points, and the red solid line the effect of preferences in this regime. The lines in the panel on the right, with the main effect, show that essentially all of the cross-national variation between liberal and non-liberal preferences and social spending is accounted for in a difference of mean levels in spending, not a difference in slopes. The regression lines have slopes that are indistinguishable from zero. In other words, regimes account for an omitted variable or set of omitted variables whose presence implies theoretical and statistical specification error.
But I would like to be absolutely sure before concluding this. Given the small sample size in the models shown in Table 3, I conduct sensitivity analyses. Although these do not rule out all problems, they hopefully curtail some of the known pitfalls of small-N research (Bollen, Entwisle, and Alderson 1993; Breznau 2015; Ebbinghaus 2005). Critically, I attempt to rule out the possibility that the results hinge on the particular configuration of variables in the original Brooks and Manza models. In their book, Brooks and Manza (2007) recreated their analyses with the addition of Spain and found different results for predicting welfare spending and an insignificant interaction term for policy preferences*liberal, in contradiction to the significant interaction in their ASR paper (2006a and Table 1). This further suggests that these models are highly sensitive to small changes. I use theory and some random selection to create alternative variable configurations to check the sensitivity of these results. First I include the public preferences variable and each other variable separately, leading to regressions with none or only one independent control variable, and I do this for both replicated datasets. Then I add demographic controls, mixing and matching aged population, unemployment, and women's labor force participation (LFP), and then test each admixture with each of the other control variables and again for each dataset. Then I take political variables and utilize different mixtures of political institutions, left, and religious party control.⁸ Then I try the different admixtures of political variables with each of the other control variables and do this for both datasets. Next I re-run all the models on my dataset using the public preferences variable from Kenworthy's dataset and re-run all of the models on Kenworthy's dataset using my own calculation of the public preferences variable.

The results in Figure 2 show great variation in coefficient sizes, ranging from -1 all the way up to around 8. The average is 2.20, but it is widely dispersed, with a standard deviation of 3.52. Only 58.2 percent of the coefficients are significant. Therefore, without the main effect, simply mixing around the variables suggests that the conclusion of Brooks and Manza is already questionable. However, here I give them the benefit of the doubt, as their theory suggests that all of their control variables must be in the model. The ultimate test comes with the addition of the main effect.
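One way to organize a sensitivity analysis of this kind is to generate the model specifications programmatically. The sketch below is not the procedure used in this article, which combines theory-guided groupings with some random selection; it swaps in an exhaustive enumeration of every subset of the eight control variables, each paired with and without the regime main effect. The variable names are hypothetical labels, and the regressions themselves are left out.

```python
from itertools import combinations

# Hypothetical labels for the eight control variables named in the text.
CONTROLS = [
    "year", "gdp_per_capita", "unemployment", "aged_population",
    "womens_lfp", "political_institutions", "left_party", "religious_party",
]


def specifications(controls):
    """Yield (terms, with_main_effect) for every subset of the controls."""
    base = ["preferences", "preferences:liberal"]
    for size in range(len(controls) + 1):
        for subset in combinations(controls, size):
            for with_main in (False, True):
                main = ["liberal"] if with_main else []
                yield base + main + list(subset), with_main


specs = list(specifications(CONTROLS))
print(len(specs))
for terms, with_main in specs[:4]:
    print(with_main, " + ".join(terms))
```

Exhaustive enumeration yields 2^8 = 256 control subsets, or 512 specifications once each is paired with and without the main effect. This count differs from the article's because the article's configurations are selected rather than exhaustive.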
I re-run all of the models used to produce the results in Figure 2, now with the addition of the main effect, consistent with standard interaction modeling (see Equation (1)) and consistent with a theory of a unique effect of policy preferences in liberal democratic welfare states versus non-liberal (social democratic and conservative/Christian) ones. The results are in Figure 3.
Out of 610 resulting coefficients for the effect of policy preferences, the average was -0.02, which is essentially zero, and only occasionally were any of the coefficients significant (17 percent of the time). On the other hand, the liberal regime main effect was significant in 404 out of 406 models (99.5 percent), and always negative. Thus, the results in Table 3 are robust. This suggests that no matter what theoretical model is used to derive a given set of control variables, in 99.5 percent of the cases, policy preferences is not a significant predictor of social welfare spending when using a standard interaction regression with three effects, as in Equation (1): policy preferences (main effect $b_1$); policy preferences*liberal (interaction effect $b_3$); and liberal (main effect $b_2$). Thus Brooks and Manza's implicit assumption that $b_2 = 0$ is unquestionably false given these data and methods.

Welfare State Theory Explains the Error
Why should anyone expect that there are unobserved differences in social welfare policy and the amount of spending required to sustain it between different welfare state regimes? A review of the literature reveals that welfare states are meaningfully grouped into distinct systems that have discrete historical, political, legal, economic, and cultural similarities (within regimes) and differences (between regimes). In the welfare state regime literature, these distinctions revolve around how individuals secure material welfare or insure welfare against social risks, and how individual welfare stratifies by socioeconomic status. A political economy perspective often defines regimes based on the relationship of the state and the economy. The varieties of capitalism literature uses a theory expounded by Hall and Soskice (2001) that advanced democracies may be framed as either liberal- or coordinated-market economies. Political science, economics, and other policy-oriented fields often rely on the worlds of welfare framework that categorizes states into groups based on [...]. The families of nations perspective provides a more socially conscious explanation of these worlds of welfare (Castles and Obinger 2008; Castles 1993). It suggests that shared historical experiences and the diffusion of welfare state systems led to similarities in welfare policy, public preferences, labor market characteristics, sociocultural values, and even divorce rates within families of welfare states.
A regime is often identified by scholars as having discrete political, economic, and social features such that the countries within the regime experience diffusion and what institutional theorists might call isomorphism. This is especially true, for example, when there is a common language and alliances across nations, as with the English-speaking countries, or when there are close geographic proximities allowing for easy diffusion (or militaristic impositions) of ideas and practices, as with European countries (Castles 1998; Djelic 2008; Strang and Meyer 1993). This is what Pierson (2000) or Hall and Taylor (1996) might call institutional path dependency within regimes. Scholars focusing more on values and social norms (Mau 2004; Mehrtens III 2004; Nordenmark 2004; van Oorschot 2007) might call these instead divergent national values or cultural regimes (Coughlin 1980; Vrooman 2012) with certain ideological roots (Wilensky 1975). What all these macro-comparative theoretical perspectives have in common is that they find meaningful shared characteristics among certain countries that are strong enough to characterize them as stable regimes, notwithstanding debates over which countries go into which regime and how many meaningful regimes exist (Arts and Gelissen 2002). The most uniquely and repeatedly identified regimes fall into two categories juxtaposing the European welfare states (coordinated-market economies or social democratic/Christian/conservative) and English-speaking liberal welfare states (and sometimes Japan). This is the distinction that Brooks and Manza employ. Although the histories and cultures of these families of nations are discussed at length by many of the aforementioned social scientists, I offer a brief summary to substantiate this big theoretical dichotomy of advanced democracies.
Brooks and Manza refer to a liberal welfare state regime. This is what Castles would refer to as the English-speaking family of nations. Its members derive from the British Empire, which eventually transitioned into the countries comprising Great Britain along with a few small outlying territories, and spawned many now-independent, English-speaking nation states with various Anglo, Saxon, and Celtic roots (I am purposefully over-simplifying these complex societies here). Historical Britain had many similarities to the early welfare states that formed in continental Europe in the wake of Bismarckian Germany (Briggs 1961); however, it diverged from its continental peers with an earlier end to serfdom, its naval-driven empire and trade-building, and faster industrialization (Macfarlane 1978). Britain, along with the colonial nations in its wake, was home to individualistic and liberalistic values in the early stages of state formation (Orloff and Skocpol 1984). British imperialism spread the English language, created a common historical period, and culturally connected its colonies as they eventually became independent states with a legacy of British power and resources shaping them prior to, during, and after independence. Many of the social customs and power structures in these nations are a product of liberal thought, and even after independence various liberal ideas moved freely amongst the educational and political institutions of these English-speaking nations (Ashford 1987).
The European family of nations evolved out of repeated struggles in a tightly packed geographic space. Large historical, religious, and cultural cleavages characterized the societies of Europe, and these led to vast conquests of land that ripped apart and rebuilt empires until the end of the World Wars and the establishment of today's national borders. The revolutions of democracy in Europe met with entrenched territorial vassals, guilds, and status systems, and this required a multitude of interests to be incorporated into popular government, leading to consensus-based democracies with power-sharing across parties and a stronger representation of the working class (Castles 1993; Collier and Messick 1975; Hicks and Swank 1992; Manow 2009). Also, there is a long history of efforts to instill a European economic and social space in order to prevent destructive war and increase economic development for all. The resulting European family has strong distributive and redistributive social states that contrast sharply with the English-speaking models of liberal-pluralism, greater inequality, and unequal distributions of wealth and power (Iversen and Soskice 2009; Lijphart and Crepaz 1991). For example, the English-speaking world, despite being the "...home of labour activism at the turn of the century... lacked an industrial working class that was to be the basis of such movements in Europe later in the century" (Castles and Mitchell 1993, 125). Also in Europe, church and state are mixed together, leading to early developments of strong welfare policies to assist those in need (Huber, Ragin, and Stephens 1993), whereas the English-speaking nations have maintained the greatest levels of church and state separation in the world (Fox 2004).
It is presumptuous to simply group nations into these two families without acknowledging that each nation follows its own path throughout history. Some of the intra-family similarities of nations were arrived at by divergent or even conflicting paths (Dore, Lazonick, and O'Sullivan 1999; Hall and Soskice 2001), and Europe in particular contains many (sub)families (Castles and Obinger 2008). Various clustering schemes disagree to some extent over families of advanced democracies, for example dividing European countries into social democratic/Scandinavian, conservative/Christian, and southern/Mediterranean (Arts and Gelissen 2002), or over the placement of single countries, such as whether the Netherlands is social democratic or conservative (Seeleib-Kaiser, Van Dyk, and Roggenkamp 2008), or whether Australia is more of a radical outlier compared to its English-speaking counterparts (Castles 1997). Again, despite the different schemes there is an overarching distinction between the nations of Europe and those of the English-speaking British diaspora (Arts and Gelissen 2001, 2002). This distinction is clear when looking at outcomes of government spending on social welfare (Castles and Mitchell 1993; Wilensky 1975); economic inequality or decommodification (Esping-Andersen 1990); women's labor force participation and adaptation to new social risks (Bonoli 2007); majoritarianism (Lijphart 1999; Obinger and Wagschal 2001); taxation (Castles and Obinger 2007); federalism (Obinger, Castles, and Leibfried 2005); social justice norms (Mau 2003); and public preferences (Coughlin 1979; Jaeger 2006; Mehrtens III 2004). Based on this large body of empirical work, the families of nations approach concludes that it is no coincidence that "...we can identify similarities of policy outcomes in nations that happen to share historical and cultural attributes" (Castles 1993).
The big question is whether all of the differences identified by scholars to delineate these two families of welfare states can be measured and used as variables in a regression model. In the case at hand, it seems that Brooks and Manza ask too much of their model by implicitly assuming that their variables account for all of the differences in spending (a proxy for the underlying generosity of the policies) observed between the liberal and non-liberal welfare states. Accordingly, I have shown that the dummy variable for liberal versus non-liberal regime correlates with the error term E2 in Equation (2). In other words, I confirm that Brooks and Manza were asking too much of their model. In fact, any scholar who tries to capture all of these distinctions quantitatively may be fighting a losing battle. Welfare state scholars such as Castles focus on historical and cultural differences, and many of these differences cannot be quantified. Even if scholars could quantify all the things that lead to meaningful families of nations, they would encounter a degrees-of-freedom constraint that would not let them enter all of the variables into a model with somewhere between eight and 30 countries. It is therefore no stretch to conclude that Brooks and Manza made a model specification error.
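The logic of this specification error can be written out compactly. The following is a minimal sketch in generic notation; the symbols are illustrative assumptions and are not copied from the article's own numbered equations. Here Y is welfare spending, P is the public preferences measure, L is the liberal regime dummy, x is a vector of controls, and the last line additionally assumes that L is uncorrelated with the remaining error u. The block requires the amsmath package.

```latex
\begin{align}
% Restricted model: interaction term without the regime main effect
Y_{it} &= \beta_0 + \beta_1 P_{it} + \beta_3 (P_{it} \times L_i)
          + \mathbf{x}_{it}'\boldsymbol{\gamma} + \varepsilon_{it} \\
% Unrestricted model: the regime main effect is allowed to differ from zero
Y_{it} &= \beta_0 + \beta_1 P_{it} + \beta_2 L_i + \beta_3 (P_{it} \times L_i)
          + \mathbf{x}_{it}'\boldsymbol{\gamma} + u_{it} \\
% The restricted model pushes the omitted term into its error
\varepsilon_{it} &= \beta_2 L_i + u_{it}
\quad\Longrightarrow\quad
\operatorname{Cov}(L_i, \varepsilon_{it}) = \beta_2 \operatorname{Var}(L_i) \neq 0
\quad \text{whenever } \beta_2 \neq 0
\end{align}
```

Read this way, the restricted model is the unrestricted model with the constraint that the regime main effect equals zero. Because the product of P and L is itself correlated with L whenever P has a non-zero mean, any regime difference that is forced out of the model can load onto the preference and interaction coefficients instead.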

Discussion of Findings
First, I reproduced the findings of Brooks and Manza using two slightly different datasets. Then I found that the effect of policy responsiveness disappears with the addition of the main effect of liberal democratic welfare state regime. These results hold in two different replications in this group of 15 countries, and hold in sensitivity analyses producing 610 coefficients in models with the regime dummy variable compared to those without. Models allowing for a main effect difference between regimes have public preferences effects that average zero with only a 17 percent rate of significance (down from 58 percent in the models without main effects); meanwhile the main effect for regime is significant 99.5 percent of the time.
The primary result of this work is that closer replication of Brooks and Manza reveals a model specification error. They did not allow for unobserved characteristics of welfare state regimes to enter into the model, even though they argue that ". . . the effect of social policy preferences on welfare state spending effort is smaller within liberal democracies in comparison to social and Christian democracies" (2006a, 487). In replicating their work without the main effect, there is an ostensible difference in the effect of social policy preferences in each regime, but this finding evaporates with the main effect because welfare state spending varies substantially between these two regimes (somewhere around eight points as a percentage of GDP), and this variance occurs for reasons not captured in the Brooks and Manza model.
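A small simulation can illustrate the mechanism. It is a sketch under invented assumptions and not a reanalysis of the replication data: the sample size, the preference scale, the baseline spending level, and all variable and function names are hypothetical, and only the regime gap of roughly eight points echoes the figure reported above. Spending is generated so that preferences have no effect in either regime, and the same data are then fit with and without the regime main effect.

```python
import numpy as np


def fit_ols(columns, y):
    """Return OLS coefficients for a design matrix built from the given columns."""
    X = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def simulate(n=400, regime_gap=-8.0, seed=0):
    """Simulate spending that differs by regime but not by preferences."""
    rng = np.random.default_rng(seed)
    liberal = (np.arange(n) % 2).astype(float)  # hypothetical regime dummy, half the cases
    prefs = rng.uniform(1.0, 3.0, size=n)       # hypothetical uncentered preference scale
    # Assumed data-generating process: preferences have NO effect in either regime.
    spending = 25.0 + regime_gap * liberal + rng.normal(0.0, 1.0, size=n)

    ones = np.ones(n)
    interaction = prefs * liberal
    # Specification without the main effect: intercept, preferences, interaction.
    b_without = fit_ols([ones, prefs, interaction], spending)
    # Specification with the main effect: adds the regime dummy itself.
    b_with = fit_ols([ones, prefs, liberal, interaction], spending)
    return {
        "without_main": {"prefs": b_without[1], "prefs_x_liberal": b_without[2]},
        "with_main": {
            "prefs": b_with[1],
            "liberal": b_with[2],
            "prefs_x_liberal": b_with[3],
        },
    }


if __name__ == "__main__":
    for spec, coefs in simulate().items():
        print(spec, {name: round(float(value), 2) for name, value in coefs.items()})
```

Under these assumptions, the model without the main effect returns a clearly positive preference coefficient and a clearly negative interaction even though neither exists in the simulated data, because the shared intercept forces the regime gap onto the slopes. Once the dummy is added, both coefficients shrink toward zero and the dummy absorbs the gap. The size of the distortion depends on the preference variable having a non-zero mean, so the sketch shows the direction of the problem and is not an estimate of its magnitude in the actual data.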
Although what the public prefers does not seem to account for meaningful variation in social welfare spending, this replication says nothing about policy preferences shaping social spending within countries. In fact, most research on policy responsiveness suggests preferences shape policy outputs such as spending, although it is important to point out that social policy also shapes public preferences in many of these same studies (Pierson 1994; Soroka and Wlezien 2010). This within-country finding supports arguments by Larsen (2008) and Jaeger (2006), who suggest there may be institutional or regime logics to welfare preferences but also suggest these logics marry preferences and policy into meaningful covariation based on omitted variables that are likely to be more historical or cultural in the development of democratic welfare state societies.
Furthermore, we should be careful not to add or remove things from our models without first considering the theoretical implications. Theory and model specification must align. As Bollen (1989, 71) points out, "a model is a formal representation of a theory." Using an interaction effect thus carries specific theoretical assumptions that may be expressed as logical arguments, and the fact that interactions are known to trip up some researchers is another reason to exercise such caution (Macdonald 2011). In the present example, theory clearly points toward families of nations (or democratic welfare state regimes) having unobserved heterogeneity, and this means having a variable to account for these differences is crucial in a statistical model. A logically derived theory to explain the Brooks and Manza interaction models does not exist in the welfare state literature, and the statistical tests in this paper demonstrate why. If there were such a theory it would read something like: all of the unique features of welfare state regimes can be accounted for using a handful of quantitative measures. But there is no support for such an argument here, or in the writing of Brooks and Manza.
An important point I hope to make is scientific. When results are reproducible, theory is strengthened, and when results are not reproducible, theory is also strengthened because new research seeks to explain why. However, this win-win scenario only comes through careful replication and re-analysis (Breznau 2015; Stodden 2015). In comparison to other disciplines, sociology currently lags in replication research practices (Freese 2007). Admission that our models have specification errors, and willingness to share our data and syntax with others, will only prove that we are committed scientists instead of mere academic rent seekers (Frey 2003; Sørensen 1996, 1358). Hopefully in the future all researchers and journals will share all data and source code with the public as mapped out in the Transparency and Openness Promotion Guidelines of 2014.9 To conform to these necessary conditions for scientific progress, a technical appendix to this article is available with supplementary tables and all Stata syntax, and I humbly hope that others will find and learn from my mistakes.
Notes
1 Refers to public (government provided) insurance against (un)employment, old age, health, poverty, housing, and various other social risks.
2 They are from a larger six-item battery fielded in a smaller number of ISSP surveys asking about pensions, health care, and unemployment (see Brooks and Manza 2006a).
3 They justify their selection of a robust cluster OLS regression instead of a fixed-effects multilevel regression because a modified Hausman test reveals that residuals are not correlated with the dependent variable (Brooks and Manza 2006a, 487).

Figure 1: Predicted Regression Lines for the Effect of Policy Preferences on Social Welfare Spending, without and with the Main Effect of Regime

Figure 2: Policy Responsiveness Models Predicting Welfare Spending: Sensitivity Test with Different Variable Configurations. N=610 Public Preference Coefficients

Figure 3: Policy Responsiveness Models with the Addition of the Main Effect for the Liberal Welfare States Dummy Variable. N=610 Public Preferences Coefficients

Table 1: Policy Responsiveness Models Predicting Social Welfare Spending; N = 43 country-time points
Original Brooks and Manza results copied from their "Policy Responsiveness" research published in ASR. Brooks and Manza original data not publicly available, thus margins calculated by hand.
b This model substitutes Brooks and Manza's original policy preferences variable for the author's measured variable.

Table 3: Replication of Brooks and Manza Findings on Policy Responsiveness with the Addition of the Main Effect, N = 43 a
Columns: Variables; Model 1A; m; Model 1B; m; Model 2A; m; Model 2B; m
p < .05.
a Regression models identical to Brooks and Manza as seen in Table