VPG’s Psychometrics blog provides our take on assessment and quantitative methods.

Celebrating 50 Years of Structural Equation Modeling with LISREL

Scientific Software Inc., a wholly owned subsidiary of Vector Psychometric Group, LLC, released LISREL 11 on 06/24/2021. The substantially upgraded new version marks the golden jubilee of a seminal development in the history of Structural Equation Modeling (SEM). A little over a half century ago, Professor Karl Jöreskog published a monograph in the Educational Testing Service (ETS) Research Bulletin series, entitled A General Method for Estimating a Linear Structural Equation System. The technical development was supported by the availability of … Read More

A rose by any other name is still a 0.8

In the “wild,” IRT scores are typically on the standard normal metric, meaning the familiar bell-shaped distribution with a mean of 0 and a standard deviation of 1. Due to SCIENCE!, we know that almost all of the scores in such a distribution will fall between -3 (well below average) and +3 (well above average), with a lot of the scores piling up closer to 0. For statistics-minded folks, this distribution is pretty familiar and easy to interpret (You … Read More
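For the curious, the "almost all of the scores" claim is easy to check. The snippet below is only a rough sketch, not anything from the full post: it uses the standard normal cumulative distribution function to work out what share of scores lands within 1, 2, and 3 standard deviations of the mean. The helper names `standard_normal_cdf` and `share_within` are made up for illustration.

```python
import math


def standard_normal_cdf(z: float) -> float:
    """Probability that a standard normal score is at or below z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def share_within(limit: float) -> float:
    """Share of a standard normal distribution between -limit and +limit."""
    return standard_normal_cdf(limit) - standard_normal_cdf(-limit)


if __name__ == "__main__":
    for limit in (1, 2, 3):
        print(f"between -{limit} and +{limit}: {share_within(limit):.4f}")
```

Running it shows roughly 68% of scores within ±1, about 95% within ±2, and about 99.7% within ±3, which is where the "almost all" comes from.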

Computerized Adaptive Testing

Computerized adaptive testing (CAT) refers to the procedures underlying a test/assessment/whatevs that is generated “on the fly” by a computer, using a person’s previous answers to determine what question the person will see next. My amazing, trademark-infringing artwork notwithstanding, CAT is not magic – it is, in fact, SCIENCE! A reader-friendly conceptual overview of CAT was recently published here: (paywall warning!). I’ll wait while you go read that… Finished? Great! OK – so now we know that … Read More

Let’s talk about DIF, baby!

Let’s talk about you and me (and our respective defining group memberships), let’s talk about all the good things (obtaining accurate scores) and the bad things (model misspecification) that may be. Let’s talk about DIF. Differential item functioning (DIF) is the psychometric jargon term for when items perform differently depending on characteristics of the person answering the item. Analyses for detecting DIF should be part of any initial calibration study of a COA, PRO, or test and are readily available … Read More

That other can o’ worms

Last time, we had a riveting discussion about reliability and mentioned that validity is also something to consider when talking about tests/assessments/scales and the scores that come from them. So let’s do that. Previously, we used Score = truth + error to talk about measurement and the idea of reliability. That is, the more garbage (error) there is in your scores, the less reliable they will be. If your scores are full of noise, then your boat is already sunk. … Read More
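To make Score = truth + error a bit more concrete, here is a small simulation sketch. It rests on assumptions that are ours, not the post's: true scores and errors are both drawn from normal distributions, errors are independent of the truth, and reliability is taken as the share of score variance that comes from the truth (the classical test theory definition). The function name `simulate_reliability` and its parameters are hypothetical.

```python
import random
import statistics


def simulate_reliability(error_sd: float, n_people: int = 20000, seed: int = 0) -> float:
    """Estimate reliability as var(truth) / var(score) for simulated people.

    Assumes truth ~ Normal(0, 1) and error ~ Normal(0, error_sd),
    with error independent of truth (illustrative assumptions only).
    """
    rng = random.Random(seed)
    truth = [rng.gauss(0.0, 1.0) for _ in range(n_people)]
    # Score = truth + error, with fresh random error for each person.
    scores = [t + rng.gauss(0.0, error_sd) for t in truth]
    return statistics.pvariance(truth) / statistics.pvariance(scores)


if __name__ == "__main__":
    for error_sd in (0.25, 0.5, 1.0, 2.0):
        expected = 1.0 / (1.0 + error_sd ** 2)  # theoretical value under these assumptions
        estimate = simulate_reliability(error_sd)
        print(f"error SD {error_sd}: simulated {estimate:.2f}, theoretical {expected:.2f}")
```

Under these assumptions the theoretical reliability is 1 / (1 + error variance), so the more garbage you pile on, the closer reliability slides toward 0.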

Ol’ Reliable

Sorry to disappoint, but this is not a post about SpongeBob’s favorite jellyfishing net – it’s about something way more awesome, the psychometric concept of RELIABILITY! Reliability refers to the idea that you’re measuring whatever it is you’re measuring in a repeatable and precise fashion. For a scale, this would mean that if you step on, get a reading of 150.2 lbs, step off, and step back on, the scale should say close to 150.2 again. If you get on … Read More

From beneath you, it devours

I’m showing my ’90s girl-power roots in that title, referencing a theme from Season 7 of Buffy the Vampire Slayer. In this season, the big bad called “the First Evil” starts bubbling up from the Hellmouth (a doorway into our world for all sorts of demons and vampires) that the town was built on, intent on destroying the world. While I certainly am not claiming that statistical models are evil, although many graduate students would argue with me on … Read More

A post in which I open myself up to slings and arrows

To Rasch or not to Rasch, that is the question. TO RASCH: Rasch measurement theory articulates ideal requirements and provides objective criteria by which to assess items and people. Because of this, Rasch analysis (RA) allows for hypothesis-driven statistical tests while other IRT models provide only descriptive information. NOT TO RASCH: With its strong requirements regarding how items and respondents must perform, the Rasch philosophy is a highly prescriptive version of measurement, well beyond the general idea of measurement as … Read More

Normality? We don’t need no stinkin’ normality!

The normal distribution is a very important distribution in statistical theory. It’s sometimes called the “bell curve” because of its shape. Looking at the plot of a normal distribution below, the high point around 0 means that’s where the majority of observations/people are, with fewer and fewer folks as you get more extreme in either the positive or negative direction. When the mean value of a normal distribution is 0 and the standard deviation is 1 (as in … Read More

When you assume…

Like a lot of fathers, my dad is fond of ugly ties, bad jokes, and meant-to-be-funny sayings. When stuck behind a slow car, “Drive it or milk it” is an odds-on favorite to come out of his mouth. If you are talking to him and say, “I’m not sure but I guess…” about a topic, his response will likely be, “You know what happens when you assume, don’t you?” referring to the adage, “When you assume, you make an ASS … Read More