Joint modeling of product data (item responses) and process data using psychometric and machine learning methods; item response theory modeling; computer-based testing; psychometric research and analysis for large-scale, high-stakes tests.

I am a Professor at the University of Maryland (UMD), College Park, specializing in educational measurement and psychometrics in large-scale assessment, and Director of the Maryland Assessment Research Center (MARC). I received my Ph.D. in Measurement, Statistics, and Evaluation from Florida State University. Prior to joining the QMMS faculty at UMD, I worked as a psychometrician at Harcourt Assessment on several state assessment programs.

The overarching goal of my methodological research is to improve the practice of educational and psychological assessment and to develop solutions to emerging psychometric challenges, many of which arise from the use of more complex, innovative assessment formats. I believe this work will ultimately promote the use of assessment data for cognitive diagnosis to facilitate learning and instruction. I have established a coherent program of research that integrates item response theory (IRT) models, cognitive diagnostic models (CDMs), and machine learning methods, with the aim of exerting a positive and lasting effect on measurement practice in large-scale assessment and cognitive diagnosis.

My research agenda can be summarized into five general categories: methodological research on local dependence due to the use of complex innovative testlets, modeling for classification and cognitive diagnosis, Bayesian model parameter estimation, research on computer-based testing, and practical psychometric issues in large-scale tests. This work has been recognized by two national awards; by academic output including edited books, book chapters, refereed journal papers, and national and international invited and refereed presentations; and by research grants and contracts. The multilevel testlet model for mixed-format tests that I proposed won the 2014 Bradley Hanson Award for Contributions to Educational Measurement from the National Council on Measurement in Education (NCME), and the co-edited book Process Data in Educational and Psychological Measurement won the 2023 NCME Annual Award.
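
To make the first of these categories concrete, here is a minimal sketch of the basic Rasch testlet model, the standard starting point in this literature (not the award-winning model itself, which extends it to multilevel, mixed-format data):

\[
P(X_{ij} = 1 \mid \theta_j, \gamma_{j d(i)}) = \frac{\exp\left(\theta_j - b_i + \gamma_{j d(i)}\right)}{1 + \exp\left(\theta_j - b_i + \gamma_{j d(i)}\right)},
\]

where \(\theta_j\) is the ability of person j, \(b_i\) is the difficulty of item i, and \(\gamma_{j d(i)}\) is a person-specific random effect for the testlet \(d(i)\) containing item i. The testlet effect absorbs the local dependence among items that share a common stimulus; constraining all \(\gamma_{j d(i)} = 0\) recovers the standard Rasch model.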

I have participated in a variety of professional service activities. I serve on the Technical Advisory Committee (TAC) for the Maryland state testing programs, and I have served on the Research and Psychometric Committee of the PARCC consortium testing program, representing Maryland. My research expertise has also been recognized through election as Chair-Elect of the AERA Rasch Special Interest Group, as Co-Chair and Chair of the AERA Division D2 program, and as Co-Chair and Chair of the AERA Division D Significant Contribution to Educational Measurement and Research Methodology Award Committee. As a co-chair, I hosted the 2023 International Meeting of the Psychometric Society (IMPS) on the University of Maryland campus.

I have served on the editorial boards of multiple measurement journals and of the Springer book series Methodology of Educational Measurement and Assessment. I served as a guest editor for two special topics, "Process Data in Educational and Psychological Measurement" and "Cognitive Diagnostic Assessment for Learning," in Frontiers in Psychology; two issues on "Machine Learning and Deep Learning in Assessment" for Psychological Testing and Assessment Modeling; and an issue on "Data Augmentation in Computational Psychometrics" for the Journal of Educational Measurement. I co-organized several MARC annual conferences and co-edited books on cutting-edge topics in assessment, including technology-enhanced innovative assessment, applications of artificial intelligence in assessment, and enhancing effective instruction and learning using assessment data.

Fellowships and Awards

2003    The Spaan Fellowship, funded research in second or foreign language testing, University of Michigan, Ann Arbor, MI.

2002    The Lenke Psychometric Fellowship, Harcourt Educational Measurement, San Antonio, TX.

1998-2001    University Fellowship, Florida State University, Tallahassee, FL.

2023    The Annual Award for Exceptional Achievement in Educational Measurement, National Council on Measurement in Education.

2014    The Bradley Hanson Award for Contributions to Educational Measurement, National Council on Measurement in Education.

2010    American Educational Research Association Research Grant, sponsored by the National Science Foundation.

2005    The Revere Award for Customer Focus, Harcourt Assessment, Inc., San Antonio, TX.

Books Edited (Selected)

  1. Jiao, H., & Lissitz, R. W. (Eds.). (2024). Machine learning, natural language processing and psychometrics. Charlotte, NC: Information Age Publishing.
  2. Jiao, H., He, Q., & Veldkamp, B. P. (Eds.). (2021). Process data in educational and psychological measurement. Frontiers in Psychology.
  3. Jiao, H., & Lissitz, R. W. (Eds.). (2021). Enhancing effective instruction and learning using assessment data. Charlotte, NC: Information Age Publishing.
  4. Jiao, H., & Lissitz, R. W. (Eds.). (2020). Applications of artificial intelligence to assessment. Charlotte, NC: Information Age Publishing.
  5. Jiao, H., Lissitz, R. W., & Van Wie*, A. (Eds.). (2018). Data analytics and psychometrics: Informing assessment practices. Charlotte, NC: Information Age Publishing.
  6. Jiao, H., & Lissitz, R. W. (Eds.). (2017). Technology enhanced innovative assessment: Development, modeling, and scoring from an interdisciplinary perspective. Charlotte, NC: Information Age Publishing.
  7. Jiao, H., & Lissitz, R. W. (Eds.). (2017). Test fairness in the new generation of large-scale assessment. Charlotte, NC: Information Age Publishing.
  8. Jiao, H., & Lissitz, R. W. (Eds.). (2015). The next generation of testing: Common core standards, Smarter-Balanced, PARCC, and the nationwide testing movement. Charlotte, NC: Information Age Publishing.
  9. Lissitz, R. W., & Jiao, H. (Eds.). (2014). Value added modeling and growth modeling with particular application to teacher and school effectiveness. Charlotte, NC: Information Age Publishing.
  10. Lissitz, R. W., & Jiao, H. (Eds.). (2012). Computers and their impact on state assessment: Recent history and predictions for the future. Charlotte, NC: Information Age Publishing.

Chapters in Books (Selected)

  1. Liao, M., & Jiao, H. (2024). Module assembly and routing of cognitive diagnostic multistage adaptive tests. In D. Yan (Ed.), CD-MST. Charlotte, NC: Information Age Publishing.
  2. Jiao, H., Xu, S., & Liao, M. (2024). Exploration of the stacking learning algorithm for automated scoring of short-answer questions in reading. In M. D. Shermis & J. Wilson (Eds.), The Routledge international handbook of automated essay evaluation (pp. 40-54). Routledge.
  3. Jiao, H., Liao*, D., & Zhan*, P. (2019). Utilizing process data for cognitive diagnosis. In M. von Davier & Y. Lee (Eds.), Handbook of diagnostic classification models. Springer.
  4. He, Q., Liao*, D., & Jiao, H. (2019). Clustering behavioral patterns using process data in PIAAC problem-solving items. In B. P. Veldkamp & C. Sluijter (Eds.), Theoretical and practical advances in computer-based educational measurement (Methodology of Educational Measurement and Assessment book series). Springer.
  5. Jiao, H., & Li*, C. (2018). Progress in International Reading Literacy Study (PIRLS) data. In The SAGE encyclopedia of educational research, measurement, and evaluation. Thousand Oaks, CA: Sage.
  6. Jiao, H., & Liao*, D. (2018). Testlet response theory. In The SAGE encyclopedia of educational research, measurement, and evaluation. Thousand Oaks, CA: Sage.
  7. Jiao, H., Lissitz, R. W., & Zhan*, P. (2017). Calibrating innovative items embedded in multiple contexts. In H. Jiao & R. W. Lissitz (Eds.), Technology enhanced innovative assessment: Development, modeling, and scoring from an interdisciplinary perspective. Charlotte, NC: Information Age Publishing.
  8. Jiao, H., Kamata, A., & Xie, C. (2015). A multilevel cross-classified testlet model for complex item and person clustering in item response modeling. In J. Harring, L. Stapleton, & S. Beretvas (Eds.), Advances in multilevel modeling for educational research: Addressing practical issues found in real-world applications (pp. 139-161). Charlotte, NC: Information Age Publishing.
  9. Luo, Y., Jiao, H., & Lissitz, R. W. (2015). An empirical study of the impact of the choice of persistence model in value-added modeling upon teacher effect estimates. In L. A. van der Ark, D. Bolt, W.-C. Wang, J. A. Douglas, & S.-M. Chow (Eds.), Quantitative psychology research (pp. 133-143). Cham, Switzerland: Springer.
  10. Jiao, H., & Lissitz, R. W. (2014). Direct modeling of student growth with multilevel and mixture extensions. In R. W. Lissitz & H. Jiao (Eds.), Value added modeling and growth modeling with particular application to teacher and school effectiveness (pp. 293-306). Charlotte, NC: Information Age Publishing.
  11. Jiao, H., & Chen*, Y.-F. (2014). Differential item and testlet functioning. In A. Kunnan (Ed.), The companion to language assessment (pp. 1282-1300). John Wiley & Sons.
  12. Chen*, Y.-F., & Jiao, H. (2014). Does model misspecification lead to spurious latent classes? An evaluation of model comparison indices. In R. E. Millsap et al. (Eds.), New developments in quantitative psychology (Springer Proceedings in Mathematics & Statistics, Vol. 66). New York, NY: Springer. DOI: 10.1007/978-1-4614-9348-8_22
  13. Jiao, H., & Lissitz, R. W. (2012). Computer-based testing in K-12 state assessments: An introduction. In R. W. Lissitz & H. Jiao (Eds.), Computers and their impact on state assessment: Recent history and predictions for the future (pp. 1-21). Charlotte, NC: Information Age Publishing.
  14. Templin, J., & Jiao, H. (2011). Applying model-based approaches to identify performance categories. In G. Cizek (Ed.), Setting performance standards: Foundations, methods, and innovations (pp. 379-397). New York, NY: Routledge.
  15. Jiao, H., Wang, S., & Kamata, A. (2007). Modeling local item dependence with the hierarchical generalized linear model. In E. V. Smith & R. M. Smith (Eds.), Rasch measurement: Advanced and specialized applications. JAM Press.
  16. Jiao, H. (2004). Evaluating the dimensionality of the Michigan English Language Assessment Battery. In Spaan fellow working papers in second or foreign language assessment: Volume 2 (pp. 27-52). Ann Arbor, MI: University of Michigan.

Articles in Refereed Journals (Selected)

  1. Ren*, J., & Jiao, H. (2024). Bayesian joint modeling of item responses and response time in a statistical learning task. Journal of Cognitive Science, 25(2), 237-274.
  2. Bulut, O., Beiting-Parrish, M., Casabianca, J. M., Slater, S. C., Jiao, H., Song, D., ... & Morilova, P. (2024). The rise of artificial intelligence in educational measurement: Opportunities and ethical challenges. arXiv preprint arXiv:2406.18900.
  3. Liao, M., Jiao, H., & He, Q. (2024). Explanatory cognitive diagnostic models incorporating item features. Journal of Intelligence. 
  4. Jiao, H., He, Q., & Yao, L. (2023). Machine learning and deep learning in assessment. Psychological Testing and Assessment Modeling, 65(1), 179-190.
  5. Fu, Y., Zhan, P., & Jiao, H. (2023). Joint modeling of action sequences and action times in problem-solving tasks. Behavior Research Methods. DOI: 10.3758/s13428-023-02178-2
  6. Zhu, H., Jiao, H., Gao, W., & Meng, X. (2023). Bayesian change point analysis approach to detecting aberrant test-taking behavior using response times. Journal of Educational and Behavioral Statistics, 48, 490-520.
  7. Zhou*, T., & Jiao, H. (2022b). Data augmentation in machine learning for cheating detection: An illustration with the blending learning algorithm. Psychological Testing and Assessment Modeling, 64(4), 425-444.
  8. Qiao, X., Jiao, H., & He, Q. (2022). A multiple group joint modeling of item responses, response times, and action counts with the Conway-Maxwell-Poisson distribution. Journal of Educational Measurement.
  9. Zhou*, T., & Jiao, H. (2022a). Exploration of the stacking ensemble machine learning algorithm for cheating detection in large-scale assessment. Educational and Psychological Measurement.
  10. Liao*, M., & Jiao, H. (2022). Modeling multiple problem-solving strategies and strategy shift in cognitive diagnosis for growth. British Journal of Mathematical and Statistical Psychology.
  11. Qiao*, X., & Jiao, H. (2022). Explanatory cognitive diagnostic modeling incorporating response times. Journal of Educational Measurement. DOI: 10.1111/jedm.12306
  12. Jiao, H., & Liao, M. (2021). Testlet response theory. Educational Measurement: Issues and Practice.
  13. Liao, M., Patton, J., Yan, R., & Jiao, H. (2020). Mining process data to detect aberrant test takers. Measurement: Interdisciplinary Research and Perspectives.
  14. Jiao, H., & Lissitz, R. W. (2020). What hath the coronavirus brought to assessment? Unprecedented challenges in educational assessment in 2020 and years to come. Educational Measurement: Issues and Practice.
  15. Liao*, D., He, Q., & Jiao, H. (2019). Mapping background variables with sequential patterns in problem-solving environments: An investigation on U.S. adults' employment status in PIAAC. Frontiers in Psychology.
  16. Zhan*, P., Jiao, H., Liao*, D., & Li, F. (2019). A longitudinal higher-order diagnostic classification model. Journal of Educational and Behavioral Statistics. Advance online publication.
  17. Zhan*, P., Ma, W., Jiao, H., & Ding, S. (2019). A sequential higher-order latent structural model for hierarchical attributes in cognitive diagnostic assessments. Applied Psychological Measurement. Advance online publication.
  18. Zhan*, P., Jiao, H., Man, K., & Wang, L. (2019). Using JAGS for Bayesian cognitive diagnosis modeling: A tutorial. Journal of Educational and Behavioral Statistics. Advance online publication.
  19. Zhan*, P., Wang, W.-C., Jiao, H., & Bian, Y. (2018). The probabilistic-inputs, noisy conjunctive models for cognitive diagnosis. Frontiers in Psychology.
  20. Zhan*, P., Jiao, H., Liao*, M., & Bian, Y. (2018). Bayesian DINA modeling incorporating within-item characteristics dependency. Applied Psychological Measurement, 43, 143-158.
  21. Qiao*, X., & Jiao, H. (2018). Comparing data mining techniques in analyzing process data: A case study on PISA 2012 problem-solving items. Frontiers in Psychology.
  22. Zhan*, P., Jiao, H., & Liao*, D. (2017). Cognitive diagnosis modeling incorporating item response times. British Journal of Mathematical and Statistical Psychology. DOI: 10.1111/bmsp.12114
  23. Luo, Y., & Jiao, H. (2017). Using the Stan program for Bayesian item response theory. Educational and Psychological Measurement. DOI: 10.1177/0013164417693666
  24. Li*, T., Xie*, C., & Jiao, H. (2016). Assessing fit of alternative unidimensional polytomous item response models using posterior predictive model checking. Psychological Methods.
  25. Li*, T., Jiao, H., & Macready, G. (2015). Different approaches to covariate inclusion in the mixture Rasch model. Educational and Psychological Measurement. DOI: 10.1177/0013164415610380
  26. Jiao, H., & Zhang*, Y. (2015). Polytomous multilevel testlet models for testlet-based assessments with complex sampling designs. British Journal of Mathematical and Statistical Psychology, 68(1), 65-83. DOI: 10.1111/bmsp.12035
  27. Wolfe, E., Song, T. W., & Jiao, H. (2015). Features of difficult-to-score essays. Assessing Writing, 27, 1-10.
  28. Wolfe, E. W., Jiao, H., & Song, T. (2015). A family of rater accuracy models. Journal of Applied Measurement, 16.
  29. Chen*, Y.-F., & Jiao, H. (2014). Exploring the utility of background and cognitive variables in explaining latent differential item functioning: An example of the PISA 2009 reading assessment. Educational Assessment, 19, 77-96.
  30. Jiao, H., Wang, S., & He, W. (2013). Estimation methods for one-parameter testlet models. Journal of Educational Measurement, 50, 186-203.
  31. Li*, Y., Jiao, H., & Lissitz, R. W. (2012). Applying multidimensional IRT models in validating test dimensionality: An example of K-12 large-scale science assessment. Journal of Applied Testing Technology, 13(2).
  32. Jiao, H., Macready, G., Liu*, J., & Cho*, Y. (2012). A mixture Rasch model based computerized adaptive test for latent class identification. Applied Psychological Measurement, 36, 469-493.
  33. Jiao, H., Kamata, A., Wang, S., & Jin, Y. (2012). A multilevel testlet model for dual local dependence. Journal of Educational Measurement, 49, 82-100.
  34. Jiao, H., Liu*, J., Haynie, K., Woo, A., & Gorham, J. (2012). Comparison between dichotomous and polytomous scoring of innovative items in a large-scale computerized adaptive test. Educational and Psychological Measurement, 72, 493 - 509.
  35. Jiao, H., Lissitz, R. W., Macready, G., Wang, S., & Liang*, S. (2011). Exploring levels of performance using the mixture Rasch model for standard setting. Psychological Testing and Assessment Modeling, 53, 499-522.
  36. Jiao, H., & Wang, S. (2010). A multifaceted approach to investigating the equivalence between computer-based and paper-and-pencil assessments: An example of Reading Diagnostics. International Journal of Learning Technology, 5, 264-288.
  37. Wang, S., & Jiao, H. (2009). Construct equivalence across grades in a vertical scale for a K-12 large-scale reading assessment. Educational and Psychological Measurement, 69, 760-777.
  38. Wang, S., Jiao, H., Young, M. J., Brooks, T., & Olson, J. (2008). Comparability of computer-based and paper-and-pencil testing in K-12 reading assessments: A meta-analysis of testing mode effects. Educational and Psychological Measurement, 68(1), 5-24.
  39. Wang, S., Jiao, H., Young, M. J., Brooks, T., & Olson, J. (2007). A meta-analysis of testing mode effects in Grade K-12 Mathematics Tests. Educational and Psychological Measurement, 67(2), 219-238.
  40. Jiao, H., Wang, S., & Kamata, A. (2005). Modeling local item dependence with the hierarchical generalized linear model. Journal of Applied Measurement, 6(3), 311-321.


External Funding

Funding agency: Maryland State Department of Education
Title: Psychometric research and analysis for Maryland state assessment programs

Funding agency: The Council of Chief State School Officers
Title: Applying sampling weights in Kindergarten Readiness Assessment

Funding agency: The Partnership for Assessment of Readiness for College and Careers, Inc.
Title: Investigating New York City students' performance on and experience with the 2015 PARCC pilot tests

Funding agency: Management Systems International
Title: Alignment study for University Readiness Test in Egypt

Funding agency: National Council on Measurement in Education
Title: A multilevel testlet model for mixed-format tests

Funding agency: American Educational Research Association/National Science Foundation #DRL-0941014
Title: Latent differential item functioning analysis for testlet-based assessments

Funding agency: National Council of State Boards of Nursing, Joint Research Committee
Role: Co-PI (PI: Kathleen C. Haynie)
Title: A partial credit modeling study of NCLEX innovative items

Funding agency: University of Michigan, Ann Arbor
Title: Evaluating the dimensionality of the Michigan English Language Assessment Battery


COURSES TAUGHT 

Graduate Courses

Instrumentation
Applied Measurement: Issues and Practices
Classification and Cognitive Diagnosis
Computerized Adaptive Testing
Psychometrics in Large-Scale Assessment
Quantitative Methods I
Modern Measurement Theory: Item Response Theory