DM Stat-1 Articles

Latent Class Analysis and Modeling:
 A Pharmaceutical Case Study
Bruce Ratner, PhD

With so many challenges facing the pharmaceutical industry (pricing pressures, few blockbuster drug prospects, impending patent expirations, and the soaring cost of developing new drugs), the strength of pharmaceutical companies is being threatened. While the focus seems to be shifting to personalized medicine, that is, the right drug for the right patient, personalized drugs will most likely mandate significant changes to pharmaceutical companies' business practices. The purpose of this article is to illustrate the rapidly developing statistical methodology of Latent Class Analysis (LCA) by presenting a pharmaceutical case study, for which I built a Consumer Segmentation Model and an Acceptance of New Medication-Concept Predictive Model. The specific objectives of the study are:
  1. Identify patient segments in the market.
  2. For each segment, identify patients' differential needs, symptoms, psychographics, and treatments.
  3. Profile target patient populations on all variables of interest.
  4. Profile accepters/rejecters of the concept for a new medication for each segment.
Below, I adumbrate Latent Class Analysis and Modeling to pique the data analyst's curiosity about this new and promising technique.

Latent Class Clustering
LCA enables the development of a latent (unobserved) categorical variable from an analysis of the relationships among several indicator variables (observed questionnaire items). LCA, which is often referred to as a “categorical data analogue to factor analysis,” can be used as a clustering method. The Latent Class Clustering (LCC) solution is a factor with “factor” loadings. The LCC-factor is a discrete nominal-level variable that defines the desired mutually exclusive and exhaustive classes (clusters), that is, the segmentation.

The LCC-factor loadings are conditional probabilities, which measure the degree of association between each of the indicator variables and each of the latent classes. Analogous to factor loadings (which represent the correlation between each of the indicator variables and each of the factors), the conditional probabilities indicate the probability that an individual in a latent class (segment) will score a particular way on an indicator variable (questionnaire item). Consequently, the conditional probabilities from LCC allow the analyst to interpret the nature of the classes of the latent variable (to render a description of the segmentation). In addition, after defining the latent classes (segmentation), LCC makes possible the assignment of individuals to the appropriate latent class (segment).
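
To make the roles of the conditional probabilities concrete, here is a minimal sketch of how an individual could be assigned to a latent class once a model has been fitted. It is not the model from the case study: the two classes, the three yes/no questionnaire items, and every number are hypothetical, and the code assumes the usual LCA condition that items are independent within a class. It combines the class sizes with the conditional probabilities by Bayes' rule and assigns the individual to the most probable class.

```python
from math import prod

# Hypothetical fitted model: two latent classes, three yes/no items.
# All values below are made up for illustration.
class_sizes = [0.6, 0.4]  # P(class)

# cond_probs[c][j] = P(item j answered "yes" | class c)
cond_probs = [
    [0.9, 0.8, 0.2],  # class 0
    [0.2, 0.3, 0.7],  # class 1
]


def posterior(responses):
    """Return P(class | responses) for a list of 0/1 item responses.

    Assumes items are independent within a class (local independence).
    """
    joint = []
    for size, probs in zip(class_sizes, cond_probs):
        likelihood = prod(p if r == 1 else 1 - p
                          for p, r in zip(probs, responses))
        joint.append(size * likelihood)
    total = sum(joint)
    return [j / total for j in joint]


def assign(responses):
    """Assign the individual to the class with the highest posterior."""
    post = posterior(responses)
    return max(range(len(post)), key=post.__getitem__)


if __name__ == "__main__":
    answers = [1, 1, 0]  # yes, yes, no
    print(posterior(answers), assign(answers))
```

Reading down a column of `cond_probs` is what lets the analyst describe a segment; reading the output of `posterior` is what places an individual in one.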

Latent Class Regression
Traditional regression of a continuous dependent variable assumes homogeneity across an entire population: a single set of coefficients is fit to everyone, which does not allow for the existence of different segments. Latent Class Regression (LCR) estimates a regression model under the assumption that the coefficients differ across classes (segments), which can yield improved predictions. LCR also allows for a categorical (class) dependent variable, which can be nominal, ordinal, or counts. LCR can assign individuals to each class, without the traditional regression assumptions of normally distributed prediction error or of homogeneity along the entire regression line.
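
As a rough illustration of coefficients that differ across classes, the sketch below scores a binary outcome (accept or reject a concept) with a separate logistic regression in each of two latent classes. Everything in it is an assumption made for illustration: the class sizes, the coefficients, the single predictor `x`, and the choice of a logistic link are hypothetical and are not taken from the case study. The parameters are treated as already estimated; fitting them (typically by maximum likelihood) is omitted.

```python
from math import exp

# Hypothetical, already-estimated parameters for two latent classes.
class_sizes = [0.5, 0.5]             # P(class)
coefs = [(-2.0, 1.5), (1.0, -0.5)]   # (intercept, slope) per class


def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))


def p_accept(x, c):
    """P(accept | x, class c) from that class's own logistic regression."""
    intercept, slope = coefs[c]
    return sigmoid(intercept + slope * x)


def class_posterior(x, y):
    """P(class | x, y) for one individual with outcome y (1 = accept)."""
    joint = []
    for c, size in enumerate(class_sizes):
        p = p_accept(x, c)
        joint.append(size * (p if y == 1 else 1 - p))
    total = sum(joint)
    return [j / total for j in joint]


def mixture_prediction(x):
    """P(accept | x), averaging the class-specific predictions."""
    return sum(size * p_accept(x, c) for c, size in enumerate(class_sizes))


if __name__ == "__main__":
    print(class_posterior(0.0, 1))   # which class an accepter at x=0 resembles
    print(mixture_prediction(0.0))   # overall acceptance probability at x=0
```

The slope has opposite signs in the two hypothetical classes, which is the kind of pattern a single pooled regression would average away.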

For more information about this article, call Bruce Ratner at 516.791.3544,
1 800 DM STAT-1, or e-mail at