
Generative AI tools, such as ChatGPT or Copilot, are trained on large datasets drawn from the internet. The models are trained on these datasets and fine-tuned to become more accurate in their responses. When you ask GenAI a question, it predicts the most likely answer based on its training data and on how it was trained and tuned. It does this incredibly well, but it can inevitably make mistakes arising from the quality of the original dataset (e.g. baked-in bias, misinformation, limited perspectives), from how your question was framed, and from how it is fine-tuned or programmed to work (i.e. it is designed to try to answer your question and please you in its response, even if that response is factually incorrect).
Bite-sized task
In this activity, you will carry out some simple tasks with Copilot that uncover the limitations of the data that GenAI is trained on. These tasks are adapted from talks given during the FAH GenAI working week (see Further links). They are useful starting points for thinking about the limitations of AI outputs and considering how far we need to work with AI to get appropriate and useful content.
Step 1 – learn
The way that GenAI and its training work can mean that the information presented to you the first time you ask is generic, stereotyped or incorrect.
Scan this article to learn more: ‘Why AI chatbots lie to us’, Science, July 2025.
Step 2 – do
- Open Copilot chat.
- Type in ‘Create an image of a watch showing 3.25pm.’
- Is the image correct? Does it show the time you asked for? If not, what time does it show? Does it attempt to include the idea of ‘3.25’ in some other way?
- If it showed you the correct time – congratulations! If it didn’t, consider why this might be. What might the activity reveal about the dataset it is drawing from?
- Now create another image. Ask Copilot to ‘create an image of a successful person’.
- What kind of image is returned to you? Are there any assumptions made about who or what a ‘successful person’ might look like/dress like/work as?
- You may wish to try these activities in an openly available AI tool such as ChatGPT or Claude. What kinds of different, or similar, responses do you get? As GenAI has developed over the last few years, different tools have come to respond in different ways, reflecting changes to their algorithms in response to user feedback.
- Now ask Copilot to give you a text about something you know to be false, e.g. ‘Write a 200-word biography of John Smith, Vice-Chancellor of the University of Southampton’. What comes back to you?
Step 3 – reflect
The watch activity tends to reflect the training dataset for watch images: marketing material that shows watches at 10.10 (considered the most attractive view). Responses to this instruction are likely to change over time as AI evolves.
These examples are simple and straightforward to critique, but it will not always be so obvious when AI is producing incorrect or limited information.
How might you use these activities with students to demonstrate the dataset limitations in GenAI? Would you need to adapt the activities in simple ways to suit your discipline or programme?
How might you use the activity to highlight a need for AI literacy skills in ‘talking to AI’?
Join the conversation
Post your thoughts on the weekly Teams post to join the conversation.
Further links
FAH GenAI working week – session recordings and a toolkit for talking about AI with your students that includes example tasks
A blogpost that recounts a writer’s experience of being repeatedly lied to by AI: ‘Diabolus Ex Machina’, by Amanda Guinzburg
Contributor biography
Kate Borthwick is Professor of Digital Education in Languages, Cultures and Linguistics, in the Faculty of Arts and Humanities. She is the Lead for AI in Education at the University and chair of the University Digital Education Advisory Group. She is Director of the University open online course programme and is an award-winning lecturer and learning designer.
© 2025. This work is openly licensed via CC BY-NC-SA
