Neuro-QoL™ Reference Populations

Neuro-QoL measures use scores that are more than just numbers: Neuro-QoL scores have meaning. For example, a score of 50 on a Neuro-QoL measure is the mean score for that measure. But average for whom? This is where “score meaning” comes in. Each Neuro-QoL measure has its own reference population, that is, its own specific group of people (e.g., the U.S. general population, kids with a neurological condition) who have been sampled and then thoroughly assessed with the Neuro-QoL measure. The average score of 50 for that Neuro-QoL measure is their average score. Thus, when someone is newly assessed with a Neuro-QoL measure and her score is 50, we have more than a score value: we can say that her score equals the average score for the measure’s reference population, the group to which we refer for score meaning.

A T-score is a standardized score, like z-scores and IQ scores. All standardized scores have a “middle” score; it is zero for z-scores, 100 for IQ scores, and 50 for T-scores. This middle score is the mean of a large sample that is representative of a relevant population, the reference population. The large sample used to represent the reference population is called the Centering Sample.

The distinction between samples can get confusing because the calibration sample (the sample used to estimate item response theory parameters) and the centering sample (the sample used to define the middle of the score range) are sometimes the same sample and sometimes different ones. For example, a measure may be calibrated in a clinical sample but then centered in the general population. The mean of T=50 for that measure reflects the average in the general population, not the clinical sample.

When developing a measure with standard scores, an important consideration is what the middle score means. The scores of such a measure are purposefully “centered” at the mean of a specific sample or subsample. Neuro-QoL uses T-scores, so the middle score is always 50. Centering scores in this way allows quick interpretation of where an individual is on a symptom or outcome compared to others in the reference population. A score of 50 on Neuro-QoL Anxiety, for example, is comparable to the U.S. “average.” T-scores have a standard deviation of 10, so a score of 60 would indicate anxiety that is one standard deviation higher than the U.S. average.
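The arithmetic behind reading a T-score can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the percentile helper additionally assumes that scores in the reference population are roughly normally distributed, which the text above does not claim.

```python
from statistics import NormalDist

T_MEAN = 50.0  # middle score of the T-score metric
T_SD = 10.0    # standard deviation of the T-score metric


def t_to_sd_units(t_score: float) -> float:
    """Distance from the reference-population mean, in standard deviations."""
    return (t_score - T_MEAN) / T_SD


def approx_percentile(t_score: float) -> float:
    """Rough percentile in the reference population.

    ASSUMPTION: scores are approximately normal in the reference
    population. Treat the result as a reading aid only.
    """
    return 100.0 * NormalDist(mu=T_MEAN, sigma=T_SD).cdf(t_score)


# A T-score of 60 sits one standard deviation above the reference mean.
print(t_to_sd_units(60))
print(round(approx_percentile(60), 1))
```

What "above the mean" implies depends on the measure: for a symptom such as anxiety, a higher score indicates more of the symptom.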

Centering Sample and Calibration Sample

It is helpful to remember that the middle score of a standard score range has to be defined. For measures that use a T-score metric, 50 is the mean and 10 is the standard deviation, but they do not start out that way. The scores are first estimated using an item response theory (IRT) model, and the IRT-calibrated scores are then transformed to a T-score metric using a linear transformation. But first you have to decide which score on the IRT metric is going to be the middle score, that is, a score of 50. This is done by collecting scores from a large sample that represents the reference population and then calculating the mean for that sample. That mean becomes the middle score (e.g., 50 for T-scores). A linear transformation spaces all other scores along the continuum so they have the correct values relative to the middle score (the mean of the centering sample).
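As a minimal sketch of the centering step, the Python below rescales IRT-metric scores so that the centering sample has a mean of 50 and a standard deviation of 10. The function name and the sample values are hypothetical, and using the centering sample's own standard deviation to set the spread is an assumption made for illustration; the actual Neuro-QoL scoring procedure may differ in its details.

```python
from statistics import mean, pstdev


def make_t_score_transform(centering_scores):
    """Build a linear IRT-metric -> T-score transformation.

    centering_scores: IRT-metric scores from the centering sample
    (hypothetical values here). ASSUMPTION: the spread is set from
    this same sample's standard deviation.
    """
    center = mean(centering_scores)    # this score will map to T = 50
    spread = pstdev(centering_scores)  # this distance will map to 10 T-score points
    if spread == 0:
        raise ValueError("centering sample has no variability")

    def to_t(irt_score):
        return 50.0 + 10.0 * (irt_score - center) / spread

    return to_t


# Hypothetical centering sample on the IRT metric (mean is 1.0).
centering_scores = [-1.0, 0.0, 1.0, 2.0, 3.0]
to_t = make_t_score_transform(centering_scores)

# The centering-sample mean lands exactly on the middle score.
print(to_t(1.0))

# Re-expressed as T-scores, the centering sample has mean 50 and SD 10.
t_scores = [to_t(s) for s in centering_scores]
print(round(mean(t_scores), 6), round(pstdev(t_scores), 6))
```

Because the transformation is linear, it changes only the labels on the scale: the ordering of respondents and the relative distances between their scores stay the same.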

IMPORTANT: The Centering Sample and the Calibration Sample may not be the same sample.

The purpose of a calibration sample is to estimate item parameters (item characteristics such as difficulty and discrimination) using an item response theory model. Here’s where it can get confusing. Sometimes a single sample was used as both the calibration sample AND the centering sample. Other times one sample was used as the calibration sample and another was used as the centering sample.

The Reference Population tables show the calibration and the centering samples for Neuro-QoL. Most users will be particularly interested in the last column (Centering Sample). If you want to know what a score in the middle is (e.g., 50 for those scored on a T-score metric), go to the Centering Sample column. For example, if you go to the row for Upper Extremity Function – Fine Motor, Activities of Daily Living, you will see that the item parameters were estimated (calibrated) using a hybrid of individuals getting care in neurology clinics and individuals from the general population. BUT, the centering sample was the general population. A score of 50 on this measure is comparable to the general population average level of upper extremity function.