A post in which I open myself up to slings and arrows

To Rasch or not to Rasch, that is the question.

TO RASCH: Rasch measurement theory articulates ideal requirements and provides objective criteria by which to assess items and people. Because of this, Rasch analysis (RA) allows for hypothesis-driven statistical tests, while other IRT models provide only descriptive information.

NOT TO RASCH: With its strong requirements regarding how items and respondents must perform, the Rasch philosophy is a highly prescriptive view of measurement, well beyond the general idea of measurement as “assigning numbers to things using rules”. As noted in a Rasch Measurement Transactions article (Granger, 2008), “Building measures using RA requires that the data fit the model, not that the model fit the data.” Because of its stringent criteria for what constitutes “measurement,” Rasch theory encourages researchers to discard items and observations whose responses do not meet the model’s expectations. Note that if this were the practice in other sciences, we would still be living in a world where the sun is “known” to revolve around the Earth, the Earth is “known” to be 8000 years old, and infections are most effectively “cured” by the letting of blood.

While Rasch modeling analyses provide tests of whether a given item or person conforms to Rasch model expectations, the Rasch philosophy is predicated on a philosophical tenet, namely that the Rasch model is true (i.e., all item discrimination parameters are equal). The broader measurement community uses hypothesis-driven statistical tests to select the model that best represents the world as responders report it to be. Such statistical tests will support the use of a Rasch model when it is found to model the data most accurately, rather than because it is consistent with one’s beliefs about measurement.
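To make the comparison concrete, here is a minimal Python sketch (not taken from any Rasch or IRT package) of the two ideas in play: the Rasch model as a 2PL model in which every item shares the same discrimination, and BIC as one common criterion for choosing between the two. The function names, the parameter counts, and especially the log-likelihoods are illustrative assumptions; in practice the log-likelihoods would come from fitting both models to the same data in your IRT software.

```python
import math


def irt_prob(theta, b, a=1.0):
    """Probability of endorsing a dichotomous item under the 2PL model.

    theta: person location, b: item difficulty, a: item discrimination.
    Holding a at the same value for every item gives the Rasch model.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion; lower values are preferred."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


# Hypothetical setup: 18 dichotomous items, 3000 respondents.
n_items, n_obs = 18, 3000

# Assumed parameter counts: Rasch = 18 difficulties + 1 common scale
# parameter; 2PL = 18 difficulties + 18 discriminations.
k_rasch = n_items + 1
k_2pl = 2 * n_items

# Made-up log-likelihoods, for illustration only.
loglik_rasch = -30500.0
loglik_2pl = -30400.0

bic_rasch = bic(loglik_rasch, k_rasch, n_obs)
bic_2pl = bic(loglik_2pl, k_2pl, n_obs)

print(f"BIC Rasch: {bic_rasch:.1f}")
print(f"BIC 2PL:   {bic_2pl:.1f}")
print("Preferred:", "Rasch" if bic_rasch <= bic_2pl else "2PL")
```

With these invented numbers the 2PL model wins, but the point is the procedure: had the improvement in log-likelihood been too small to offset the penalty for 17 extra parameters, the same comparison would have favored the Rasch model.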

 

TO RASCH: Rasch requires much smaller samples than other IRT models.

NOT TO RASCH: It is true that a model with fewer parameters, when compared to a model with more parameters, will require smaller samples for accurate estimation. That being said, Rasch advocates often take the idea that small samples can be effectively analyzed with Rasch models beyond reasonable boundaries. For instance, a commonly cited table suggests that 30 is the minimum sample size needed for dichotomous items and 50 is the minimum required for polytomous (Likert-type) items with 5 response categories. While you can certainly submit a small sample to a Rasch modeling program and obtain results, there is no empirical evidence to support the contention that accurate item parameter estimates can be obtained from samples as small as those commonly advocated as “sufficient” by the Rasch community.
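For a rough sense of scale, the sketch below computes a best-case standard error for a single Rasch item difficulty. It is a back-of-the-envelope illustration, not an analysis from this post or from the table mentioned above, and it rests on two generous assumptions: person locations are treated as known rather than estimated, and every person sits exactly at the item's difficulty, where the item is most informative. Real analyses will do worse. The function names are made up for the example.

```python
import math


def rasch_prob(theta, b):
    """Probability of endorsing a dichotomous item under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def difficulty_se(thetas, b):
    """Asymptotic SE of item difficulty b, treating thetas as known.

    The Fisher information for b is the sum of p * (1 - p) over persons.
    """
    info = 0.0
    for theta in thetas:
        p = rasch_prob(theta, b)
        info += p * (1.0 - p)
    return 1.0 / math.sqrt(info)


# Best case: every person located exactly at the item difficulty (p = 0.5).
for n in (30, 50, 250, 1000):
    se = difficulty_se([0.0] * n, b=0.0)
    print(f"N = {n:4d}: best-case SE = {se:.3f} logits")
```

Even under these ideal conditions, 30 respondents leave a standard error of roughly 0.37 logits on a single difficulty estimate, which gives some feel for why the small-sample recommendations deserve scrutiny.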

 

TO RASCH: Rasch modeling provides intuitive and informative figures, such as item-person maps, that make the analysis more accessible to non-experts.

NOT TO RASCH: Item-person maps, which plot the estimated difficulty parameters against the estimated person parameters, are not unique to Rasch models. Such plots may be constructed from the item and person parameter estimates of any IRT model. For example, the item-person map below is from a flexMIRT® graded response model analysis of 18 items, each with 5 categories, estimated using a sample of 3000.

[Figure: item-person map]
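Nothing about such a map depends on the model that produced the estimates. As a minimal sketch, the Python below builds a plain-text item-person map from any set of person and item location estimates on a common logit scale. The function name, the binning choices, and the example values are all hypothetical; this is not how flexMIRT® or any Rasch program draws its plots, and for a graded response model the item locations would typically be category thresholds rather than a single difficulty.

```python
def item_person_map(person_thetas, item_locations, low=-3.0, high=3.0, step=0.5):
    """Return text rows, one per logit bin, with the highest bin first.

    person_thetas: iterable of person location estimates.
    item_locations: dict mapping an item label to its location estimate.
    """
    n_bins = int(round((high - low) / step))

    def bin_index(value):
        # Values outside [low, high) are clamped into the end bins.
        idx = int((value - low) // step)
        return max(0, min(n_bins - 1, idx))

    person_counts = [0] * n_bins
    for theta in person_thetas:
        person_counts[bin_index(theta)] += 1

    labels = [[] for _ in range(n_bins)]
    for label, location in item_locations.items():
        labels[bin_index(location)].append(label)

    rows = []
    for i in reversed(range(n_bins)):
        lower_edge = low + i * step
        bar = "#" * person_counts[i]        # one '#' per person in the bin
        items = " ".join(labels[i])         # items located in the same bin
        rows.append(f"{lower_edge:5.1f} | {bar:<10} | {items}")
    return rows


# Made-up estimates, for illustration only.
example_thetas = [-1.2, -0.4, 0.1, 0.2, 0.3, 0.9, 1.6]
example_items = {"item1": -1.1, "item2": 0.3, "item3": 1.4}

for row in item_person_map(example_thetas, example_items):
    print(row)
```

Persons appear on the left and items on the right of the same scale, which is all an item-person map is, whichever IRT model supplied the numbers.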

 

TO RASCH: The Rasch family of models makes no assumptions about the ability distribution, which most other IRT models do.

NOT TO RASCH: Contrary to the word on the street, non-Rasch IRT models do not make any assumptions regarding the ability distribution. The estimation method historically used to obtain parameter estimates does make assumptions regarding the ability distribution for statistical convenience, but these assumptions are no longer required, given the advances in statistical theory and computing resources over the last several decades. For instance, empirical histograms and Ramsay Curve Item Response Theory, neither of which makes assumptions regarding the shape of the ability distribution, are now readily available in commercial IRT software programs, such as flexMIRT®.

 

TO RASCH: The use of Rasch modeling prior to the typical quantitative phase of scale development can help establish content validity of a developing scale/PRO.

NOT TO RASCH: From Mike Linacre, “Rasch is not concerned with content validity […]. Rasch cannot know what is and what is not included in the content area.” Further discussion of this topic inevitably reverts to the small sample size issue addressed earlier.

As advised by Polonius, “This above all: to thine own self be true.” If you adhere to the Rasch philosophy of measurement then, by all means, use Rasch models. But as for me and my house, we shall serve the BIC.