Cees H. Elzinga: Complexity of Categorical Time Series, Sociological Methods & Research 2010 38: 463-481.
Categorical time series covering comparable time spans often differ in a number of aspects: the number of distinct states, the number of transitions, and the distribution of durations over states. Each of these aspects contributes to an aggregate property of such series that is called complexity. Among sociologists and demographers, complexity is believed to differ systematically between groups as a result of social structure or social change. Such groups differ in, for example, age, gender, or status. The author proposes quantifications of complexity based upon the number of distinct subsequences in combination with, in the case of associated durations, the variance of these durations. A simple algorithm to compute these coefficients is provided, and some of the statistical properties of the coefficients are investigated in an application to family formation histories of young American females.
Keywords: categorical time series, complexity, sequence comparison, sequence analysis
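The abstract rests on counting the distinct subsequences of a categorical series. As a rough illustration only, the following sketch counts them with a standard dynamic-programming recurrence; it is not taken from the paper and may differ from the author's algorithm. The function name `count_distinct_subsequences` and the example state labels are assumptions made for this illustration, and the duration-variance component mentioned in the abstract is not modelled here.

```python
def count_distinct_subsequences(states):
    """Count distinct subsequences of `states`, including the empty one.

    Illustrative sketch only (hypothetical helper, not the paper's code).
    Uses the standard recurrence: appending a state doubles the count,
    minus the subsequences already produced by that state's previous
    occurrence.
    """
    total = 1            # only the empty subsequence so far
    count_before = {}    # state -> count just before its last occurrence
    for state in states:
        previous_total = total
        total = 2 * total - count_before.get(state, 0)
        count_before[state] = previous_total
    return total


if __name__ == "__main__":
    # Hypothetical family-formation style histories:
    # S = single, C = cohabiting, M = married.
    stable = ["S", "S", "S", "S"]
    varied = ["S", "C", "S", "M"]
    print(count_distinct_subsequences(stable))
    print(count_distinct_subsequences(varied))
```

Under this count, a series that keeps returning to or switching between states yields more distinct subsequences than one that stays in a single state, which matches the intuition that more varied histories are more complex.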