EMERGING DIALOGUES IN ASSESSMENT

Using ChatGPT to Develop Survey Questions: A Survey Researcher’s Experience Using Artificial Intelligence (AI) for Item Development


July 17, 2023

Jennifer Ann Morrow, Ph.D., University of Tennessee-Knoxville

ChatGPT (Chat Generative Pre-trained Transformer; https://openai.com/) is a generative artificial intelligence (AI) chatbot-style tool that produces human-like text (Stokel-Walker & Van Noorden, 2023). It has become increasingly popular on college campuses, with approximately 43% of college students reporting that they have used ChatGPT or a similar AI tool (Welding, 2023). There has been a lot of negative press about students using ChatGPT to write their papers and complete assignments, and it has caused quite a stir on college campuses (Nietzel, 2023). Is ChatGPT the new cool tool for academic cheating, or is it simply a tool to aid in writing and research?

I was first introduced to ChatGPT by one of my graduate students, who told me I needed to check it out. They said it was like a super-enhanced Ask Jeeves (https://en.wikipedia.org/wiki/Ask.com), which intrigued me (and made me laugh at the old reference), so I signed up for a free account (there is also a paid version). As someone who trains emerging assessment and applied research professionals in survey research, I spend a lot of time in class talking about different methods to create survey items (e.g., brainstorming, literature review, expert panel). I wanted to see how well ChatGPT generates survey questions for a specific topic and population, so I tested it out.

Reflections on Using ChatGPT for Developing Survey Questions

After creating a free profile, I entered my first prompt, “create survey questions measuring sense of belonging in college students,” since this is an area of research that I’m knowledgeable about. ChatGPT generated ten questions: one closed-ended and nine open-ended. The closed-ended question was, “On a scale of 1-10 how much do you feel a sense of belonging on your college campus?” It didn’t specify the direction of the anchors (i.e., whether 1 meant a low or a high sense of belonging), and it didn’t label the values of any of the response choices. Most of the open-ended questions had decent stems (e.g., “How often do you meet with other students outside of class?” “Do you feel comfortable approaching your professors for help outside of class?”) but were not appropriate in their current form as open-ended questions, since the wording restricted the type of response (e.g., one-word answer, yes/no response) that participants could provide. However, these question stems would be a good starting point for creating a closed-ended set of items with a Likert-type response scale. I then used more specific prompts (e.g., create survey questions measuring sense of belonging in college students using a 5-point Likert response scale), and the items generated by ChatGPT needed less modification before I would consider using them in a survey. I kept repeating this process with more specific prompts until I got sets of items that I considered to be of good quality for this topic. Overall, it took me about 30 minutes and four prompts to get a decent set of survey questions about college student sense of belonging that I could use in a survey.

I also prompted ChatGPT to generate a set of demographic questions that I could include with my sense of belonging questions for my survey. My first prompt was “create a list of demographic questions I can use on a college student survey.” Ugh, definitely not specific enough! It generated six open-ended questions (e.g., “What is your race?”, “What is your age?”) that were very generic. I had to use multiple specific prompts (e.g., create a list of multiple choice questions containing a “prefer not to answer” and “other” option if applicable to measure college students’ demographics) before ChatGPT generated a list that was acceptable to me. 

Overall, I found ChatGPT to be a good starting point (not the final say!) in developing survey items for a topic that you are at least familiar with. With my background knowledge in measuring sense of belonging, it was easy for me to weed out the bad items and generate more specific prompts. If you are knowledgeable in a specific topic, ChatGPT would be a great tool for starting to generate items for your survey, which you can then compare to other measures and literature on that topic. If you want to generate items in a field that you are not very familiar with, I would still use it as a starting point, but I would use other methods to generate and modify items before including them in a survey.

Suggestions for Assessment Professionals, Faculty, and Researchers

While ChatGPT gets a lot of bad press with regard to students using it to write their papers and provide test answers, it can be a useful tool for assessment professionals and researchers as they create items for their surveys and assessments. However, ChatGPT is not perfect and should not be used as the only method of creating items. If you are looking to venture into using AI as one of your item development methods, here are some suggestions and things to consider:

Use very specific prompts. ChatGPT produces the best results when you use very specific prompts. If you want to create items that are relevant for your assessment, be sure to include information on the content area, population that you want to use the items with, as well as the type of question (i.e., closed or open), and response categories (e.g., Likert, multiple choice) you would like. 

Don’t rely on one prompt to generate your items. It may take a few prompts to get the results that you need. I noticed that even slightly changing the wording/specificity of your prompt will generate different responses. I keep tweaking my prompts until I don’t get any new or relevant results. 

Don’t rely on citations provided by ChatGPT. I have not had good luck getting accurate information from ChatGPT when it comes to citing sources. Many of the results that ChatGPT provides aren’t even real articles! I recommend using Google Scholar to check on citations.

Pretest your assessment and survey items. As with any method(s) that you choose to utilize to create items, always pretest them with a sample from the population of interest before utilizing them in your assessment or research project. No method of item development is without its flaws, so having others review and provide feedback on your items is always a good idea.
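
The first suggestion above (use very specific prompts) can be illustrated with a short sketch. This is only one possible way to assemble a prompt from the pieces mentioned there: content area, population, type of question, and response categories. The function name build_item_prompt and its parameters are hypothetical and are not part of ChatGPT or any library; the resulting text would simply be typed or pasted into the chat window.

```python
def build_item_prompt(topic, population, question_type="closed-ended",
                      response_scale=None, n_items=10):
    """Assemble a specific item-generation prompt (hypothetical helper)."""
    # Content area, population, and question type always go into the prompt.
    prompt = (f"Create {n_items} {question_type} survey questions "
              f"measuring {topic} in {population}")
    if response_scale:
        # Name the response categories and ask for labeled anchors, since
        # an unlabeled 1-10 scale leaves the direction of the scale unclear.
        prompt += f" using a {response_scale} response scale."
        prompt += (" Label every response option and state which end "
                   "of the scale is high.")
    else:
        prompt += "."
    return prompt


# A vague first attempt, similar to the first prompt described earlier.
vague = "create survey questions measuring sense of belonging in college students"

# A more specific version built from the individual pieces.
specific = build_item_prompt(
    topic="sense of belonging",
    population="college students",
    response_scale="5-point Likert",
)
print(specific)
```

Keeping the pieces separate makes it easy to change one of them at a time (for example, the response scale or the number of items) and compare what comes back, which fits the second suggestion about not relying on a single prompt.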

So, in conclusion, I would look at ChatGPT and other AI applications as just another tool that assessment and research professionals can use in their work. While it certainly can be used for nefarious purposes, such as writing your assignment for you, it also has a lot of potential for assessment professionals and researchers. Item development, providing feedback on assessments, and suggesting readings in a specific research area are just some of the ways that we can use ChatGPT to our benefit. We know that many of our students are using it, so let’s embrace it and figure out ways AI can enhance what we do.

 


References

Nietzel, M.T. (2023, March). More than half of college students believe using ChatGPT to complete assignments is cheating. Retrieved from https://www.forbes.com/sites/michaeltnietzel/2023/03/20/more-than-half-of-college-students-believe-using-chatgpt-to-complete-assignments-is-cheating/?sh=3386491f18f9

Stokel-Walker, C., & Van Noorden, R. (2023). What ChatGPT and generative AI mean for science. Nature, 614(7947), 214-216.

Welding, L. (2023, March). Half of college students say using AI on schoolwork is cheating or plagiarism. Retrieved from https://www.bestcolleges.com/research/college-students-ai-tools-survey/