S. C. Chuah, F. Drasgow, and R. Luecht, How big is big enough? Sample size requirements for CAST item parameter estimation, Applied Measurement in Education, vol.19, issue.3, pp.241-255, 2006.

F. Gao and L. Chen, Bayesian or non-Bayesian: A comparison study of item parameter estimation in the three-parameter logistic model, Applied Measurement in Education, vol.18, issue.4, pp.351-380, 2005.

S. J. Haberman, S. Sinharay, and K. H. Chon, Assessing item fit for unidimensional item response theory models using residuals from estimated item response functions, Psychometrika, vol.78, issue.3, pp.417-440, 2013.

S. Kim, T. Moses, and H. H. Yoo, A comparison of IRT proficiency estimation methods under adaptive multistage testing, Journal of Educational Measurement, vol.52, issue.1, pp.70-79, 2015.

J. H. Neel, A New Goodness-of-Fit Test for Item Response Theory, Journal of Modern Applied Statistical Methods, vol.3, issue.2, pp.581-593, 2004.

A. Sahin and D. Anıl, The Effects of Test Length and Sample Size on Item Parameters in Item Response Theory, Educational Sciences: Theory & Practice, vol.17, 2016.

D. Svetina, A. V. Crawford, R. Levy, S. B. Green, L. Scott et al., Designing small-scale tests: A simulation study of parameter recovery with the 1-PL, Psychological Test and Assessment Modeling, vol.55, issue.4, pp.508-527, 2013.

G. Yavuz and R. K. Hambleton, Comparative Analyses of MIRT Models and Software (BMIRT and flexMIRT), vol.77, pp.263-274, 2017.