## Mixture Models for Ordinal Data

Richard Breen and Ruud Luijkx, Mixture Models for Ordinal Data, Sociological Methods & Research 2010 39: 3-24.

Cumulative probability models are widely used for the analysis of ordinal data. In this article the authors propose cumulative probability mixture models that allow the assumptions of the cumulative probability model to hold within subsamples of the data. The subsamples are defined in terms of latent class membership. In the case of the ordered logit mixture model, on which the authors focus here, the assumption of a logistic distribution for an underlying latent dependent variable holds within each latent class, but because the sample then comprises a weighted sum of these distributions, the assumption of an underlying logistic distribution may not hold for the sample as a whole. The authors show that the latent classes can be allowed to vary in terms of both their location and scale and illustrate the approach using three examples.

Key Words: ordered probability models, mixture models, latent class, odds ratios

Richard Breen, Yale University, New Haven, CT, USA, richard.breen@yale.edu

Breen, R., & Luijkx, R. (2010). Mixture Models for Ordinal Data. Sociological Methods & Research, 39 (1), 3-24. DOI: 10.1177/0049124110366240

### One Response to Mixture Models for Ordinal Data

1. Comment for SMR Blog on

I am thankful for the invitation of the Editor to provide an opening comment on “Mixture Models for Ordinal Data” by Richard Breen and Ruud Luijkx. I learned a great deal reading the paper, and it is an excellent contribution to the literature. Three thoughts occurred to me as I read it.

Thought 1: The inherent value of flexible models

As Breen and Luijkx explain clearly, traditional ordered outcome models (whether as ordered logits, probits, or in other more exotic flavors) typically have a built-in common distribution assumption. Underneath the observed ordered and discretized outcome, a continuous latent variable is posited that has the same shape for all units in the population. Although SMR readers surely know about this built-in assumption, they probably do not know whether it drives anyone’s published results enough to have generated misleading substantive conclusions in the published literature. One cannot really know without adopting a more flexible model.

Along these lines, Breen and Luijkx propose a mixture model, where they posit the existence of underlying subpopulations, across which the scale and location parameters of subpopulation-specific latent continuous variables are allowed to vary. Whether this model is better than its more restrictive traditional counterpart is a deep question, since there are tradeoffs of all sorts when models are allowed to grow in complexity. However, there is no doubt at all that the model is of great value, since it allows for a consideration of the consequences of the fixed parameters assumption in the traditional counterpart.
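To make the structure of such a model concrete, here is a minimal sketch of how category probabilities could be computed under a latent-class ordered logit. The parameterization is an assumption for illustration only (shared thresholds and slopes, class-specific location and scale, the first class used as the reference with location 0 and scale 1), and the function name and all numbers are hypothetical; it is not taken from the paper.

```python
import numpy as np

def logistic_cdf(z):
    """Standard logistic cumulative distribution function."""
    return 1.0 / (1.0 + np.exp(-z))

def mixture_ordered_logit_probs(x, beta, thresholds,
                                class_weights, class_locations, class_scales):
    """Category probabilities for one unit under a latent-class ordered logit.

    Assumed (illustrative) parameterization: within class c the latent
    variable is logistic with location class_locations[c] + x @ beta and
    scale class_scales[c]; the thresholds are shared across classes.
    """
    eta = np.asarray(x, dtype=float) @ np.asarray(beta, dtype=float)
    thresholds = np.asarray(thresholds, dtype=float)
    probs = np.zeros(len(thresholds) + 1)
    for weight, location, scale in zip(class_weights, class_locations, class_scales):
        # Cumulative probabilities P(Y <= j | class), padded with 0 and 1.
        cumulative = logistic_cdf((thresholds - location - eta) / scale)
        cumulative = np.concatenate(([0.0], cumulative, [1.0]))
        # Weighted sum over classes: each class is logistic, the mixture need not be.
        probs += weight * np.diff(cumulative)
    return probs

# Hypothetical values: two covariates, four ordered categories, two latent classes.
probs = mixture_ordered_logit_probs(
    x=[1.0, 0.5],
    beta=[0.8, -0.3],
    thresholds=[-1.0, 0.0, 1.5],
    class_weights=[0.6, 0.4],
    class_locations=[0.0, 1.2],   # first class is the reference
    class_scales=[1.0, 2.0],
)
print(probs, probs.sum())
```

With a single class whose location is 0 and scale is 1, the function reduces to the traditional ordered logit, which is the restrictive counterpart discussed above.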

An exemplary feature of this article, beyond its core methodological innovation, is that it demonstrates the value of the mixture model by showing, for three separate substantive domains, the alternative results that emerge when a traditional ordered logistic regression model is estimated in the mixture form proposed by Breen and Luijkx. The upshot of these demonstrations is that the mixture models fit the data better, using very few additional parameters. For example, in their educational attainment demonstration, Breen and Luijkx improve the log-likelihood by 25 points with the addition of only two latent class parameters to the model. At face value, this is an impressive result. This is the general pattern across all three demonstrations. But what does it mean? This leads me to my second thought.
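For a rough sense of scale, the arithmetic behind a fit improvement of this size can be sketched as follows. This assumes the "25 points" are log-likelihood units and that exactly two parameters are added; the sample size is not stated here, so the BIC comparison is left as a function of a hypothetical `n`. The chi-square reference is only a loose guide, because likelihood ratio tests on the number of mixture components do not satisfy the usual regularity conditions.

```python
import math

delta_loglik = 25.0   # assumed: improvement in log-likelihood units
extra_params = 2      # additional latent class parameters

# Likelihood ratio statistic: twice the gain in log-likelihood.
lr_stat = 2.0 * delta_loglik

# With 2 degrees of freedom the chi-square survival function is exp(-x / 2).
# Naive reference only: mixture components violate the standard LR conditions.
naive_p = math.exp(-lr_stat / 2.0)

# Change in AIC (negative values favor the larger model).
delta_aic = 2.0 * extra_params - 2.0 * delta_loglik

def delta_bic(n):
    """Change in BIC for a hypothetical sample size n (negative favors the mixture)."""
    return extra_params * math.log(n) - 2.0 * delta_loglik

print(lr_stat, naive_p, delta_aic, delta_bic(1000))
```

Under these assumptions the penalty from two extra parameters is small relative to the gain in fit, so information criteria would also tend to prefer the mixture model unless the sample were implausibly large.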

Thought 2: What do we make of the better model fit?

There are a variety of ways to think about how to interpret the mixture model’s improved fit to the data. As Breen and Luijkx note, one interpretation is that the mixture model reveals that the traditional model is misspecified in a fixable way. So, for the educational attainment example, it could be that the ability covariate should be given a more flexible spline-like coding. The mixture model can then be run again, and perhaps the improvement in model fit will no longer be substantial. In that case, the analyst would then need to explain why the more flexible coding of ability fits the data better, which for the educational attainment example could be something either about the actual process of educational attainment and its relationship to test scores or, instead, a measurement feature of the ability variable itself.
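As an illustration of what a more flexible, spline-like coding of a covariate might look like, the sketch below builds a piecewise-linear basis. The knot locations, the scores, and the function name are all made up for the example; nothing here reflects how the ability variable is actually coded in the paper.

```python
import numpy as np

def linear_spline_basis(score, knots):
    """Piecewise-linear spline basis: the raw score plus one hinge term per knot.

    Each hinge term max(score - knot, 0) lets the slope change at that knot.
    """
    score = np.asarray(score, dtype=float)
    columns = [score] + [np.maximum(score - knot, 0.0) for knot in knots]
    return np.column_stack(columns)

# Hypothetical standardized ability scores and hypothetical knots.
ability = np.array([-1.5, -0.2, 0.4, 1.8])
basis = linear_spline_basis(ability, knots=[-0.5, 0.5])
print(basis)
```

Entering the columns of `basis` in place of the single ability term gives the traditional model extra slope parameters for ability. One could then refit both the traditional and the mixture versions and check whether the gap in fit shrinks.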

The second way to think about the better model fit is that there are genuine latent classes in the population. This is the interpretation that Breen and Luijkx tend to favor, at least in the way they have written this paper. So, for the educational attainment example, they write as if the latent classes are deeper ability groups that have differential propensities to obtain different amounts of education. Identifying these latent class effects and giving them their own predictive parameter is a qualitatively different modeling decision than only giving the observed ability variable its own parameter. Breen and Luijkx’s realist interpretation of the results for the educational attainment example is very reasonable, but it does give rise to the question: Why are these latent classes not picking up a deeper dimension of resource-based inequality, one which is not fully captured by the six dummy variables for social class? Breen and Luijkx are fully aware of the indeterminacy of any realist interpretation of the latent classes, but there is a real danger that a practitioner could adopt an overly strong (or perhaps arbitrary?) interpretation of the latent class parameters. This brings me to my last point.

Thought 3: What kind of heterogeneity has been dealt with?

If there is one thing to criticize about the paper, it is the following: Breen and Luijkx do not give enough guidance on what these sorts of models accomplish in modeling underlying heterogeneity. Many SMR readers have probably reviewed papers for various substantive journals that use models such as these (often citing Heckman and Singer from the 1980s) and in which the authors have argued that the latent classes help to address problems of causal inference that are generated by selection on unobserved variables. It is, of course, true that specifying these latent classes will modify the parameter estimates for everything else in the model because underlying heterogeneity is given a place within the model. But, because the latent classes are assumed to be independent of other variables in the model, one should not attach too much meaning to movements in these coefficients. Breen and Luijkx note this obliquely in a few places, such as in the conclusion where they note that the next step for mixture models like theirs is to specify latent classes that are not independent of other variables in the model. But, given that there is confusion on this point in the substantive literature, I would have preferred to see a few more cautionary statements on the present model’s inability to solve omitted variable bias problems of the sort that some researchers (not Breen and Luijkx) think that such models can solve.

Overall, I congratulate Breen and Luijkx on an excellent contribution to the literature, which I very much enjoyed reading. If my comments are in error in any way, then I hope someone will respond with a correction on this blog. It is possible (or likely?) that my morning coffee was not strong enough.