exams - Why do some tests have a (nonzero) minimum score?

Saturday, 2 March 2019

Some tests have a nonzero minimum in their possible score range. Cisco's 300-1000 point range and the SAT's 200-800 points-per-section range come to mind.

What purpose does this serve? I assume there is some statistical logic behind it. Maybe it would make more sense to me if I understood how they go about calculating the score from a given number of (in)correct questions.

Answer

I might be able to help answer this from a background in psychometrics. Where I work we produce many tests that are all standardised and then equated so that they can be put onto the same scale. The scales of two different tests, however, are not comparable unless an equating study has been completed to determine the shift factor that transfers scores from, say, the scale of Test 1 to the scale of Test 2.

To construct a scale, we first analyse the test data, that is, the student response data and the item (question) data. We do the analysis using the Rasch model, which takes into account only two variables: the students' abilities and the items' difficulties. This allows us to construct a dataset that contains the logit values of the students' abilities and of the items' difficulties.

Definition of Logit:

A logit is a unit of measurement used to report relative differences between candidate ability estimates and item difficulties. Logits are an equal-interval level of measurement, which means that the distance between adjacent points on the scale is the same everywhere (the distance from 1 to 2 equals the distance from 99 to 100).

Once the logit tables have been created they can be used to create a scale by applying a simple linear transformation, such as:

scale score = 10 * logit difficulty + 250

In some of the work I do, scale scores actually fall below 0; in most of it, however, the scale is constructed so that the minimum is around 200 or so. The construction of the scale is, for the most part, entirely arbitrary.
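As a rough illustration of why the reported minimum ends up nonzero, here is a minimal Python sketch of the linear transformation above. The function name `logit_to_scale` and the sample logit values are made up for the example; only the slope of 10 and the intercept of 250 come from the formula given earlier. It assumes that ability estimates and item difficulties sit on the same logit scale, so the same transformation applies to both.

```python
def logit_to_scale(logit, slope=10.0, intercept=250.0):
    """Map a logit value onto a reporting scale with a linear transformation."""
    return slope * logit + intercept

# Hypothetical ability estimates in logits (not real test data).
abilities = [-5.0, -2.0, 0.0, 2.0, 5.0]

scores = [logit_to_scale(a) for a in abilities]
print(scores)  # [200.0, 230.0, 250.0, 270.0, 300.0]
```

With these illustrative numbers, a very weak candidate at -5 logits still reports as 200, simply because of where the intercept was placed. Choosing a different slope or intercept would move the minimum without changing the ordering of candidates.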

If you wish to see how the logits of students and items are calculated please read:

https://en.wikipedia.org/wiki/Rasch_model#The_mathematical_form_of_the_Rasch_model_for_dichotomous_data

Also, as an extra note: there are other models for test analysis. The 2PL adds a parameter to the Rasch model (1PL), namely the item's discrimination. The 3PL adds a guessing parameter to the 2PL, which creates a minimum probability of getting the item correct that depends on the guess value. There is also a 4PL, which adds a slip parameter that creates a ceiling probability, below 1, of getting an item correct.
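The relationship between these models can be sketched with a single function in their usual textbook logistic form. This is only an illustration, not the code of any particular analysis package; the names `p_correct`, `theta` (ability), `b` (difficulty), `a` (discrimination), `c` (guessing floor) and `d` (slip ceiling) are conventions chosen for the example. With the defaults `a=1`, `c=0`, `d=1` it reduces to the Rasch model.

```python
import math

def p_correct(theta, b, a=1.0, c=0.0, d=1.0):
    """Probability of a correct response under a 4PL-style logistic model.

    theta: student ability in logits
    b:     item difficulty in logits
    a:     item discrimination (2PL and above)
    c:     guessing parameter, the lower bound on the probability (3PL and above)
    d:     upper bound on the probability, below 1 when slips are modelled (4PL)
    """
    return c + (d - c) / (1.0 + math.exp(-a * (theta - b)))

# Rasch (1PL): ability equal to difficulty gives a 50% chance.
print(p_correct(theta=0.0, b=0.0))

# 3PL: a very weak student still succeeds about 25% of the time by guessing.
print(p_correct(theta=-50.0, b=0.0, c=0.25))

# 4PL: a very strong student tops out at about 95% because of slips.
print(p_correct(theta=50.0, b=0.0, d=0.95))
```

The guessing and slip values of 0.25 and 0.95 are arbitrary; in practice these parameters are estimated from response data per item.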

I hope this helps and provides some extra information that may be of use.
