Course abstract
In the era of artificial intelligence (AI), educational testing faces significant challenges while also welcoming new technological advancements, particularly in test development and scoring. Two key innovations are automated item generation (AIG) and automated scoring (AS).
Developing high-volume, high-stakes tests is an intricate process, and only recently has generative AI made it feasible to develop complex test items at scale. In this course, we will introduce a novel framework, "the item factory," for managing large-scale test development, including the automation of item generation, quality review, quality assurance, and crowdsourcing techniques. We will present an overview of the latest natural language processing (NLP) techniques and large language models for AIG, alongside evidence-centered design and psychometric principles and practices for test development. We will discuss the application of engineering principles in designing efficient item production processes (Dede et al., 2018; Luecht, 2008; von Davier, 2017).
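To make the idea of automated item generation concrete: one classic AIG approach uses item models, templates with variable slots that are instantiated programmatically to yield many parallel items. Below is a minimal sketch of that idea; the template, slot values, and function names are hypothetical illustrations, not the course's actual framework.

```python
import itertools

# A hypothetical item model: a stem template with variable slots.
# Each combination of slot values instantiates one parallel item.
TEMPLATE = "A train travels {distance} km in {hours} hours. What is its average speed in km/h?"

def generate_items(distances, hours_options):
    """Instantiate the item model for every slot combination,
    computing the key (correct answer) alongside each stem."""
    items = []
    for distance, hours in itertools.product(distances, hours_options):
        stem = TEMPLATE.format(distance=distance, hours=hours)
        key = distance / hours
        items.append({"stem": stem, "key": key})
    return items

if __name__ == "__main__":
    for item in generate_items([120, 180, 240], [2, 3]):
        print(item["stem"], "->", item["key"])
```

Generative AI extends this template-based approach by producing the stems, distractors, and variations themselves, which is what makes large-scale generation of complex items feasible.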
Automated scoring has become increasingly prevalent in formative and summative assessments and an integral part of the assessment landscape, owing to its advantages in reporting time, cost, objectivity, consistency, transparency, and feedback. This short course aims to demystify automated scoring and give practitioners a comprehensive understanding of how it works. We will offer an overview of the design, development, evaluation, and quality control of automated scoring systems, along with practical advice and considerations on applying and integrating these systems into formative and summative assessments (Yan, Rupp, & Foltz, 2020).
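As one example of the evaluation and quality-control work covered here: a standard check on an automated scoring system is its agreement with human raters, commonly reported as quadratic weighted kappa. A minimal sketch using scikit-learn follows; the score vectors are invented for illustration only.

```python
from sklearn.metrics import cohen_kappa_score

# Hypothetical scores on a 0-4 rubric for ten responses.
human = [0, 1, 2, 2, 3, 4, 3, 2, 1, 4]
machine = [0, 1, 2, 3, 3, 4, 2, 2, 1, 3]

# Quadratic weighting penalizes large disagreements more heavily,
# a common convention for ordinal rubric scores in automated scoring.
qwk = cohen_kappa_score(human, machine, weights="quadratic")
print(f"Quadratic weighted kappa: {qwk:.3f}")
```

In practice, such machine-human agreement statistics are compared against human-human agreement on the same responses to judge whether the system scores reliably enough for operational use.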