
It is likely you’ve read one of the many media reports citing research showing that AI can pass university-level assignments. It is true that GenAI can produce convincing responses to many of the tasks we currently set for assessment, for example summarising a range of information and ideas, synthesising key points, comparing multiple sources, or reviewing research landscapes and literature. It is also very good at stating and framing ideas clearly and at structuring written work, among many other things. However, it can make mistakes, reproduce bias, and produce misleading, inaccurate, or bland, generic information.
In terms of assessment, this means that GenAI could generate an excellent or a poor response, depending on the design of the task. GenAI can also be used to complete different elements of a task. How well we design and explain a task can influence how far a student might use GenAI to complete it, whether authorised or not.
How can I make my assessments more secure?
At the moment, there is no ‘silver bullet’ that makes assessments completely GenAI-proof. However, the Quality Assurance Agency (QAA) takes the broad view that a holistic rethink of assessment is needed. It emphasises creating space in the curriculum to enable close engagement with assessments and avoid overload; making greater use of programme-level assessment; and developing authentic assessment that asks students to apply knowledge in real-world situations.
In practical terms, it is useful to:
- Clearly state if/how AI can be used in teaching and assessment for your module, and remind students what the assessment sets out to do, i.e. to test their own knowledge, skills and understanding
- Implement AI declarations if appropriate (possibly for each task, or for the whole module/programme). Declarations can specify limited or other permitted uses of AI.
- Talk to students about the use of AI in their module/discipline and remind them of AI’s strengths and weaknesses in your context. Make clear whether inappropriate, unauthorised use of AI is likely to result in a lower mark.
- Articulate the human skills, knowledge and experience that are developed through your teaching/module/discipline
- Get students to reference their use of AI, if appropriate (see the library’s guidance: Searching and Generative AI (GenAI) – LibGuides@Southampton, University of Southampton Library)
Over the longer term, reflect on intended learning outcomes and assessment: does your assessment enable students to demonstrate the learning that you intend in your module?
Some suggested changes to teaching and assessment might include:
- Developing reflective elements in assessment/tasks
- Creating personalised tasks that draw on individual experience
- Incorporating requirements for analysis of specific course-level activities or events
- A focus on the process of learning rather than the final product
- The inclusion of group work and peer assessment
- The creation of physical artefacts
- Introducing a variety of assessment types e.g. an oral examination or presentation
Find more detailed information on the GenAI SharePoint site and talk to the CHEP Assessment Consultancy.
Bite-sized task
Step 1 – learn
In this task, you are going to experiment with getting Copilot to generate a literature review. Watch these screencasts about using Copilot to create a literature review.
Creating a literature review
Using Copilot to apply the literature
Step 2 – do
- Start with a literature review of a key, fairly confined area in which you are an expert. Ask Microsoft Copilot, or another GenAI tool such as ChatGPT, to generate a literature review for this area.
- Then start a new chat (this is important – you don’t want it to remember the literature review). Try asking Copilot to create an application of the literature related to your field.
- Finally – new chat again – ask it to do the literature review first and then attempt the application in context within the same chat.
Step 3 – reflect
At this point, you haven’t done anything to fit the GenAI output to an assignment word count or structure. But, as a starting point, consider the level of knowledge and understanding the output appears to show:
- What mark band would you put the literature review in? Does this look like 1st class work, 2:1, 2:2 or lower?
- Has the GenAI used the major, relevant sources you might expect? Has it missed any of the most important sources? Is it comprehensive, and is it a fair selection of them? How does this compare with what you get from students?
- Are there any odd sources you don’t recognise? Are these just obscure or are they falsified?
- Is there much difference between the two attempts at application in context?
- How do these compare with the sort of work you might expect from students? What mark band would you put these in?
- If this kind of task is offered as an assessed task, how might it need to be reviewed or changed to ensure learning outcomes are met?
Join the conversation
Post your thoughts on the weekly Teams post to join the conversation.
Further links
Academic Responsibility and Conduct toolkit
What can I do to make my assessment more robust? – GenAI working group
Advancing Assessment strategic major project
FEPS Generative AI in Education (2024) Designing Robust Assessments
UCL Arena Centre for Research-based Education (2023) Designing assessment for academic integrity
Contributor biography
Kate Borthwick is Professor of Digital Education in Languages, Cultures and Linguistics, in the Faculty of Arts and Humanities. She is the Lead for AI in education at the University and chair of the University Digital Education Advisory Group. She is Director of the University open online course programme and is an award-winning lecturer and learning designer.
Screencasts and questions prepared by Matt Perks. Matt is a Senior Teaching Fellow in Secondary Science education. He is based in the Southampton Education School and is an ARC (Academic Responsibility and Conduct) officer.
