Philippa Clarke and Blair Wheaton, Addressing Data Sparseness in Contextual Population Research: Using Cluster Analysis to Create Synthetic Neighborhoods, Sociological Methods & Research 2007 35: 311-351.

February 7, 2010

The use of multilevel modeling with data from population-based surveys is often limited by the small number of cases per Level 2 unit, prompting a recent trend in the neighborhood literature to apply cluster techniques to address the problem of data sparseness. In this study, the authors use Monte Carlo simulations to investigate the effects of marginal group sizes on multilevel model performance, bias, and efficiency. They then employ cluster analysis techniques to minimize data sparseness and examine the consequences in the simulations. They find that estimates of the fixed effects are robust at the extremes of data sparseness, while cluster analysis is an effective strategy to increase group size and prevent the overestimation of variance components. However, researchers should be cautious about the degree to which they use such clustering techniques due to the introduction of artificial within-group heterogeneity.

Key Words: multilevel models • data sparseness • cluster analysis • Monte Carlo simulations • survey research

Gustavo Angeles, David K. Guilkey, and Thomas A. Mroz, The Impact of Community-Level Variables on Individual-Level Outcomes: Theoretical Results and Applications, Sociological Methods & Research 2005 34: 76-121.

February 7, 2010

The authors study alternative estimators of the impacts of higher level variables in multilevel models. This is important since many of the important variables in social science research are higher level factors having impacts on many lower level outcomes such as school achievement and contraceptive use. While the large sample properties of alternative estimators for these models are well known, there is little evidence about the relative performance of these estimators in the sample sizes typical in social science research. The authors attempt to fill this gap by presenting evidence about point estimation and standard error estimation for both two- and three-level models. A major conclusion of the article is that readily available commercial software can be used to obtain both reliable point estimates and coefficient standard errors in models with two or more levels as long as appropriate corrections are made for possible error correlations at the highest level.

Key Words: multilevel models • hierarchical models • multilevel error structure • Monte Carlo simulations