EMERGING DIALOGUES IN ASSESSMENT

Collaborative AI for Assessment: Building Momentum Across Two Campuses

October 30, 2025

  • Yan Z. Cooksey, Ph.D., Director of Assessment, Southern Methodist University
  • Amy J. Heston, Ph.D., Professor of Inorganic Chemistry, Walsh University
  • Tatiana C. Tolson, Undergraduate Student, Walsh University

Abstract

This collaborative project between Southern Methodist University (SMU) and Walsh University examines how artificial intelligence (AI) can be integrated into higher education assessment to streamline work, improve quality, and support innovation. Across both campuses, AI accelerated routine tasks, improved the clarity and usefulness of assessment artifacts, and prompted new approaches to learning design. While SMU concentrated on program-level assessment, Walsh emphasized course-level enhancement. Together, the cases identify practical methods, faculty development needs, and opportunities for continued inquiry into responsible AI use. The findings also underscore the value of cross-campus collaboration for advancing student learning and institutional effectiveness. 



Purposeful partnerships in assessment help sustain momentum toward improvement. This article reports a collaborative initiative between two assessment professionals (one an assessment administrator, the other a faculty member) and an undergraduate student researcher whose Honors project focused on assessment innovation. Together, the team examined practical applications of Artificial Intelligence (AI) tools for assessment at program and course levels. 

The initiative pursued three goals: (1) share effective strategies, (2) promote student-led inquiry, and (3) provide insights into ethical, practical uses of AI in assessment. Drawing on shared experience, the authors identify themes and propose next steps that show how AI can strengthen assessment from planning through evidence use across institutional contexts.

Program-Level Assessment Innovation with AI: SMU

Generative AI, particularly tools like ChatGPT, is prompting a re-examination of assessment toward more authentic, process- and reflection-oriented designs (Lee & Soylu, 2023). At Southern Methodist University (SMU), AI-supported strategies improved the quality, efficiency, and strategic alignment of program-level assessment. 

Program assessment reports at SMU serve as primary records for documenting student learning and continuous improvement. Faculty used AI tools (e.g., ChatGPT) to draft clear, coherent, and structured narratives that facilitated more meaningful faculty engagement and fostered productive discussions around data use and program improvement. Mission statements were synthesized with AI support to incorporate stakeholder input and align with institutional goals, and drafts served as starting points for collaborative faculty revision, thereby preserving accuracy and ownership. 

Developing specific, measurable, achievable, relevant, and time-bound (SMART) outcomes is often challenging. AI translated general outcome ideas into SMART-aligned statements tailored to program objectives. The result was improved clarity, measurability, and cross-departmental consistency, which supported accreditation needs. AI also supported instrument design. Faculty used AI to propose both direct and indirect measures aligned with program learning outcomes and to generate rubric structures with criteria, performance descriptors, and achievement levels. These drafts supported transparent, equitable assessments aligned with institutional priorities. In parallel, faculty iteratively refined assignments with AI to better target critical thinking, integrative knowledge, and creative expression, yielding authentic evidence of learning.   
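The specific prompts used at SMU are not reproduced in this article. As a purely illustrative sketch, the fragment below shows one way a general outcome idea could be packaged into a prompt that requests a SMART-aligned revision and a rubric skeleton; the draft outcome and wording are hypothetical, and faculty review of any output remains essential.

  # Illustrative sketch only: a hypothetical prompt for turning a draft outcome
  # into a SMART-aligned statement with a rubric skeleton. The draft outcome is
  # invented; faculty validate and localize whatever the tool returns.

  DRAFT_OUTCOME = "Students will understand research methods."  # hypothetical

  SMART_PROMPT = f"""You are assisting a program assessment committee.
  Rewrite the draft program learning outcome below so it is specific, measurable,
  achievable, relevant, and time-bound (SMART). Then propose a rubric with
  criteria, performance descriptors, and three achievement levels.

  Draft outcome: {DRAFT_OUTCOME}"""

  print(SMART_PROMPT)  # the draft serves only as a starting point for collaborative revision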

Since adopting these strategies, SMU has observed greater faculty confidence in assessment tool design, higher quality and coherence in assessment reports, and more effective use of results for planning. AI complemented rather than replaced faculty expertise, and human judgment remained essential for tailoring AI-generated content to disciplinary and contextual needs. In practice, AI functioned as a drafting and sense-making partner while faculty provided validation, localization, and final decisions. 

Bridging program design and course practice is central to the two-campus model. The program-level work at SMU established a framework for how AI can improve the architecture of assessment, including outcomes, measures, rubrics, and reporting. Walsh University extended this framework into the learning environment, testing AI where students engage with content, practice skills, and demonstrate outcomes. This handoff, from institutional alignment to classroom execution, creates a throughline that clarifies how AI supports learning and evidence-building across the curriculum.  

Course-Level Learning Innovation with AI: Walsh

Patterns in student achievement in freshman chemistry prompted Walsh University to test whether AI could enhance learning in Principles of Chemistry I Laboratory (CHEM 101L). The work centered on two areas, the curation of media for foundational skills and the creation of measurable module-level outcomes (MLOs) with aligned assessments, with AI outputs systematically critiqued for accuracy and usefulness throughout. Overall, the goals were to 

  1. explore the possibilities for various AI tools, 
  2. provide effective instructional content for CHEM 101L, 
  3. create effective prompts through best practices in prompt engineering, 
  4. identify instructional materials, 
  5. evaluate AI output for relevance, accuracy, and effectiveness, 
  6. align new content to Quality Matters (QM) Specific Review Standards (SRS) (Quality Matters, 2023), 
  7. identify and record AI hallucinations, and 
  8. ask AI to create MLOs and analyze each one. 

After practicing prompt engineering, the undergraduate student researcher used Gemini to locate short, engaging videos to strengthen foundational lab skills. The final prompt stated: “You are an expert in chemistry lab courses in higher education. Please find YouTube videos and active links to the video content to help my learners gain foundational chemistry lab-based skills. Videos should be 6 min or less. Content should be engaging for students and may include a lab tour of a medical center, large scale university lab, hospital, pharmaceutical lab, environmental lab, or prep lab.” Hallucinations and tool limitations were documented. Valid results by tool were as follows: Perplexity, 3 of 5; ChatGPT, 0 of 5 (all links inactive); Copilot, 2 of 3; and Gemini, 7 of 7 with active links. Gemini could also refine its recommendations to meet the requested six-minute limit, whereas the other tools failed to do so after the initial prompt. The exercise enhanced AI literacy by emphasizing verification, curation, and accessibility checks rather than accepting outputs at face value. 
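Part of this verification step can be automated. As a minimal sketch, the fragment below checks whether AI-suggested YouTube links resolve to an available public video using YouTube's oEmbed endpoint; the URLs are placeholders rather than the links the team reviewed, and an automated check supplements, but does not replace, human review of content quality, length, and accessibility.

  # Minimal sketch: flag AI-suggested YouTube links that do not resolve to an
  # available public video. The URLs below are placeholders, not project data.
  import requests

  candidate_links = [
      "https://www.youtube.com/watch?v=XXXXXXXXXXX",  # placeholder video ID
      "https://www.youtube.com/watch?v=YYYYYYYYYYY",  # placeholder video ID
  ]

  def youtube_video_available(url: str, timeout: float = 10.0) -> bool:
      """Return True if YouTube's oEmbed endpoint reports the video as available."""
      try:
          resp = requests.get(
              "https://www.youtube.com/oembed",
              params={"url": url, "format": "json"},
              timeout=timeout,
          )
          return resp.status_code == 200
      except requests.RequestException:
          return False

  for link in candidate_links:
      status = "active" if youtube_video_available(link) else "inactive or hallucinated"
      print(f"{link}: {status}")

Checking the requested six-minute limit automatically would require video metadata, for example from the YouTube Data API, so duration and engagement still call for human judgment.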

A second task asked AI to propose six MLOs for each lab and create four aligned multiple-choice questions, using this prompt: “Decide six module level outcomes for the lab provided below. Based on these outcomes, create four multiple choice questions that align with the module level objectives. Define choices A, B, C, D and indicate the correct answer by bolding it.” To finalize the prompt, the undergraduate researcher inserted the lab instructions after the last sentence to ground outcome generation and item writing. The finalized MLOs for the sodium analysis of a pretzel lab were to (1) demonstrate knowledge of quality control principles, (2) illustrate competency in gravimetric analysis techniques, (3) perform precipitation reactions and solid collection procedures, (4) calculate stoichiometric relationships, and (5) interpret and analyze experimental results. The finalized MLOs for the volumetric analysis lab included (1) demonstrate knowledge of titration principles, (2) utilize titration techniques, (3) perform calculations related to acid-base titrations, (4) explain the concept of standardization, (5) compare theoretical and experimental values, and (6) interpret titration results. Item prompts and distractors were subsequently reviewed for accuracy, cognitive level, and alignment. 
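Because the same base prompt is reused with each lab's handout appended, the assembly step is easy to reproduce. The sketch below illustrates it; the file name is hypothetical, and how the finished prompt was submitted (chat interface or API) is not specified in the project description.

  # Minimal sketch of assembling the Walsh MLO/item-writing prompt by appending
  # the lab instructions after the base prompt. The file name is hypothetical.
  from pathlib import Path

  BASE_PROMPT = (
      "Decide six module level outcomes for the lab provided below. "
      "Based on these outcomes, create four multiple choice questions that "
      "align with the module level objectives. Define choices A, B, C, D "
      "and indicate the correct answer by bolding it.\n\n"
  )

  def build_prompt(lab_instructions_path: str) -> str:
      """Ground outcome generation and item writing in the full lab handout."""
      lab_text = Path(lab_instructions_path).read_text(encoding="utf-8")
      return BASE_PROMPT + lab_text

  if __name__ == "__main__":
      prompt = build_prompt("chem101L_volumetric_analysis_lab.txt")  # hypothetical file
      print(prompt)  # submitted to the AI tool; outputs reviewed before any classroom use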

Overall, Gemini provided the best videos for enhancing chemistry concepts in CHEM 101L in alignment with QM SRS. ChatGPT returned no valid content even after refinement, and Copilot and Perplexity each returned some inactive links. In particular, with respect to Quality Matters General Standard 8, Gemini’s video content was accessible through YouTube’s settings, including clear audio, closed captions with appropriate punctuation, options for playback speed, and the ability to advance, rewind, and resize the video (Quality Matters, 2023). Gemini also held an advantage in creating learning outcomes that align with best practices for measurability: it produced the most measurable outcomes, clearly stated for learners. 

Reflections and Conclusion

Across SMU and Walsh, several themes converge. AI can reduce administrative burden, improve the design and documentation of assessment, engage students in authentic inquiry, and catalyze innovation at both program and course levels. The collaboration shows how institutional architecture and classroom practice inform one another. Program-level work sets expectations for quality and alignment, and course-level work tests those expectations in contexts where students learn and demonstrate outcomes. 

Continued progress depends on three priorities. First, faculty development in AI literacy and prompt engineering can help instructors and program leaders frame effective inputs and evaluate outputs. Second, adoption of best practices for integrating AI into curriculum and assessment design can improve transparency and usefulness of evidence. Third, ongoing cross-institutional research and dialogue about ethical use can help institutions balance innovation with integrity, including the documentation of hallucinations, the verification of sources, and the preservation of disciplinary expertise.

AI offers transformative potential when guided by human expertise and ethical reflection. This initiative strengthened assessment practices while reinforcing both institutions’ missions to advance excellence, collaboration, and student success.


References

Lee, J., & Soylu, M. Y. (2023, March). ChatGPT and assessment in higher education (White paper). Center for 21st Century Universities, Georgia Institute of Technology. https://c21u.gatech.edu/sites/default/files/publication/2023/03/C21U%20ChatGPT%20White%20Paper_Final.pdf

Quality Matters. (2023). QM higher education rubric workbook: Standards for course design (7th ed.). Maryland Online, Inc.