Numerous studies on assessment and evaluation in teaching Turkish as a foreign language have identified problems with the standardization of exams and with their low validity and reliability.
Seçil Alaca, Funda Keskin
doaj +2 more sources
The difference between estimated and perceived item difficulty: An empirical study
Test development is a complicated process that demands examining various factors, one of which is writing items of varying difficulty. It is important to use items spanning a range of difficulty to ensure that the test results accurately indicate ...
Okan Bulut, Ayfer Sayın
doaj +2 more sources
Applying a two-parameter item response model to explore the psychometric properties: The case of the Ministry of Science, Research and Technology (MSRT) high-stakes English Language Proficiency test [PDF]
Perhaps the degree of test difficulty is one of the most significant characteristics of a test. However, no empirical research on the difficulty of the MSRT test has been carried out.
Shahram Ghahraki +2 more
doaj +1 more source
The Effect of Item Pools of Different Strengths on the Test Results of Computerized-Adaptive Testing
Item response theory provides various important advantages for exams carried out or to be carried out digitally. For computerized adaptive tests to be able to make valid and reliable predictions supported by IRT, good quality item pools should be used ...
Fatih Kezer
doaj +1 more source
The development and evaluation of valid assessments of scientific reasoning are an integral part of research in science education. In the present study, we used the linear logistic test model (LLTM) to analyze how item features related to text complexity
Moritz Krell, Samia Khan, Jan van Driel
doaj +1 more source
Examining Item Difficulty in NLP: To What Extent Do Examinees Affect Item Difficulty? [PDF]
Recent research in Natural Language Processing (NLP) has focused on estimating the difficulty of text content, culminating in a shared task conducted in 2025. However, since many researchers in NLP are not experts in educational psychology, the item difficulty in these shared task datasets is commonly defined by the proportion of examinees who answer ...
Ehara, Yo
openaire +2 more sources
This work presents a comparative analysis of various machine learning (ML) methods for predicting item difficulty in English reading comprehension tests using text features extracted from item wordings.
Lubomír Štěpánek +2 more
doaj +1 more source
This study aims to analyze an assessment instrument, mainly the characteristics of its test items, using the Quest program. It is a descriptive quantitative study conducted at one school in Yogyakarta.
Ikhsanudin Ikhsanudin +3 more
doaj +1 more source
Parameters and Models of Item Response Theory (IRT): A Review of Literature
Introduction: Item response theory (IRT) has received much attention in the validation of assessment instruments because it allows the estimation of students’ ability from any set of items.
Gyamfi Abraham, Acquaye Rosemary
doaj +1 more source
Psychometric Properties of 3-, 4-, and 5-Option Item Tests: Do Test Takers’ Personality Traits Make a Difference? [PDF]
Prior research has yielded mixed results regarding what contributes to psychometrically sound multiple-choice (MC) items. The purpose of the present study was, therefore, twofold: (a) to compare 3-, 4-, and 5-option MC tests in terms of ...
Fatemeh Khaleghi, Rajab Esfandiari
doaj +1 more source

