All 35 entries tagged Assessment
May 20, 2024
Graphical abstract as a form of assessment
by Andre Pires da Silva
A graphical abstract is a pictorial summary of the main findings of a research paper. It is typically used by journals to highlight the paper's key points in a concise visual format.
The format of graphical abstracts varies by journal. Some require a single panel where everything is summarised, while others may have multiple panels showing the introduction, methods, results, and conclusions. Graphical abstracts follow specific conventions:
- They have a clear start and end, read from top to bottom or left to right.
- They provide context for the results, such as the type of tissue represented.
- The figures are different from those in the main paper, emphasising new findings.
- They do not include data but show the findings conceptually.
- They exclude excessive details from previous literature and anything speculative.
- They have simple labels and minimal text, with no distracting clutter.
To test the capabilities of generative AI in creating graphical abstracts, an example from a complex paper on nematode sexual forms was used. The original graphical abstract clearly depicted the main points of the paper. However, when generative AI attempted to produce a graphical abstract based on the same paper, the result was confusing, cluttered, and failed to capture the main points accurately.
Analysing this failure through the lens of Bloom's Taxonomy, a hierarchical framework for cognitive skills, can provide insights. AI excels at lower-level skills like remembering and understanding but struggles with higher-level skills like analysing, evaluating, and creating.
While AI can remember and list information it has been trained on, many scientific fields lack sufficient training data, leading to potential inaccuracies. AI can produce abstracts by analysing information, but may miss the most important aspects that require nuance. Creativity, the highest cognitive skill, remains a significant challenge for AI.
In assessing students' understanding in a developmental biology course, various methods were employed, including multiple-choice questions, short answers, and graphical abstracts. The multiple-choice questions required interpreting datasets not directly solvable by AI, as the context was provided during lectures. The short-answer questions involved analysing complex anatomical figures from papers not readily available for AI training.
For the graphical abstract assignment, students were given simple instructions on the format and a word limit for the legend summarising key conclusions. They could use various digital tools or hand-drawings. The assigned paper discussed two theories of embryonic patterning: positional information and reaction-diffusion.
When the paper was submitted to generative AI to produce a graphical abstract, the result was cluttered and nonsensical, failing to represent the main ideas accurately. Even with simplified instructions, the AI-generated graphical abstract remained inadequate.
In contrast, student-produced graphical abstracts effectively communicated the key concepts. Some clearly depicted the relationship between the two theories, whether one was upstream or downstream of the other, or if they interacted in parallel. Others used effective visual representations, although some lacked sufficient guiding text or clarity in conveying the relationship between the theories.
The experience of grading the graphical abstract assignments was efficient, taking only a few minutes per submission. Creating new exams based on this format is straightforward, as instructors can select different research papers for each iteration.
From the students' perspective, the graphical abstract assignment is valuable as it requires them to communicate complex ideas clearly and critically select the most important aspects of a paper.
While companies offer graphical abstract creation services, they are currently time-consuming and expensive, limiting their widespread adoption.
Looking ahead, implementing other assessment formats like short video productions, as done in science communication classes, could further challenge AI capabilities in this domain.
Overall, the graphical abstract assignment provides a valuable assessment tool that requires higher-order cognitive skills, promotes scientific communication, and remains a challenge for current AI systems to generate effectively.
April 29, 2024
Insights from a faculty session to Integrate AI in teaching practice
This blog is in two parts and was written by Dr. Neha Gupta and Dr. Susanne Beck, Assistant Professors, ISMA Group, Warwick Business School.
Part 1: Planning to deliver a faculty session to integrate AI into teaching practice (Date: 19th April 2024)
This blog shares ideas under consideration in preparation for leading a faculty session about integrating AI into teaching and learning practices in a Higher Education setting. The session will be one of the parallel breakout sessions during the annual event at Warwick Business School, the Teaching and Learning Symposium 2023-24, where faculty from all groups (i.e. various disciplines) engage in peer dialogues, discussions, and activities around how learning and teaching needs in the higher education landscape are evolving. The broad aims of the session are to inspire discussions and ideas about how to use Generative AI (GenAI) and emerging technologies to foster relevant skills that enhance students' employability.
The leading faculty (co-authors of this blog) plan to use a pool of resources from the WIHEA AI Learning Circle, JISC and a Harvard AI pedagogy project to stimulate discussion on the use of AI practices across higher education. A demo of hands-on examples, AI tools and prompts used by colleagues from WBS and beyond, such as Ethan Mollick, will help attendees see how they can engage with AI in practice, for example in setting up assessment tasks with the use of ChatGPT. A notable aspect of the session will be the demonstration of AI tools. For example, Cody AI, a web-based tool that uses an LLM to generate bots capable of addressing student queries efficiently, will be demonstrated using the existing knowledge base from the student handbook to answer dissertation-related queries. This demonstration will exemplify how AI can streamline administrative tasks, such as responding to common student inquiries, thereby optimising staff resources and minimising response times.
As the job market evolves, students must be equipped with both domain-specific knowledge and technological proficiency. Integrating AI into teaching not only prepares students for future careers but also empowers them to engage with and leverage technology responsibly. The AI technology is out there and students are going to be using AI tools in their future work places. During their job interviews they will be asked about these tools and about their opinion on these tools. As educators, it is our responsibility to provide students with opportunities to experiment with these tools during their learning journey and allow them to form their own experiences and opinions. Perhaps, educators should recognise that they should have an open mind to experiment with emerging AI tools that offer immense potential in enhancing teaching and assessment practices. Yet, its implementation must be guided by ethical considerations and a commitment to fostering critical thinking skills among students.
Part 2: Insights from faculty session to Integrate AI in teaching practice (Date: 25th April 2024)
This blog shares insights from the delivery of the faculty session at the WBS Teaching and Learning Symposium (the plans for which are described in Part 1 above). The potential of AI tools to revolutionise student support, demonstrated through the use of Cody AI for answering queries, sparked discussion amongst colleagues about its implementation across various educational contexts. For example, colleagues were concerned about:
- Copyrighted information being shared in the public domain, unless such tools are contractually brought into the university ecosystem, where such challenges can be managed through a more formal implementation.
- Hallucinations, or information made up by AI tools, given the underlying LLM layers. In the demonstrated case of the Cody AI bot, however, the answers the bot gives are primarily based on the knowledge base provided by the user.
Nevertheless, colleagues agreed that committing to innovation and integrating such AI tools into teaching practices holds the promise of both handling student queries more efficiently and enhancing students' learning experiences in higher education.
The breakout session further delved into using AI for assessments. Hands-on examples of prompts and outcomes were shared within the session, exploring benefits for both educators and students. For educators, GenAI tools can be used to develop, more efficiently, creative assignments that require students to critically engage with AI-generated content. For instance, instead of preparing a recap exercise at the beginning of a class, asking students what they remember, teachers can ask students to critically review a text about a given topic, identify (purposefully included) false claims, and share their thoughts with their neighbours. Besides building subject knowledge, this exercise sensitises students to the fact that even text that sounds good may be factually wrong. Both the text and the instructions can be generated by ChatGPT in an instant, making the exercise easily replicable and customisable for educators (see another example, asking students to write a critical essay, here).
For students, such assignments can help them develop skills such as critical thinking. But through the use of GenAI they may also be empowered to leverage individualised learning opportunities and stimulate their curiosity. For example, in his recent book, Ethan Mollick showcases a potential methodology to encourage students to experiment with tasks they have no experience in. For his entrepreneurship class, he asks students to take the development of a business idea a step further and come up with a website or even develop an app for their business – especially when they have no experience with coding. This opens a new space for students to experiment and become creative, another skill enhancing their future employability.
For students to thrive through the use of GenAI in the classroom, however, the discussions in the session emphasised two important boundary conditions: First, students need to be given the space to experiment with using AI, as well as other emerging technologies. Providing them with space includes aspects such as rewarding ambitious ideas rather than penalising if they fail in persuasion. Second, a responsible usage of AI needs to find its place in the students’ curriculum. Teachers cannot expect students to be fully knowledgeable about the most recent capabilities and risks related to such a dynamic technology. Schools and educators need to provide them with the necessary training.
At the end of the breakout session, the attending faculty were invited to join a discussion, imagining themselves either a) in the student’s role, asking: What could be students’ concerns when receiving an assignment that asks them to use AI? What steps could an educator take to address these concerns?; or b) in the teacher’s role, asking: What could be your concerns when designing an assignment that asks students to use AI? What would teachers need in order to address their concerns? The discussion generated the key takeaways below, which underscored the importance of ethical AI integration, ongoing teacher professional development in AI literacy, and the need for a balance between technological advancement and human-centric pedagogy:
- Invest time to train ourselves first then further share AI related knowledge with our students.
- Avoid falling into the AI trap – i.e. students still need step by step guidance in terms of what is expected from them in their assessment task with minimum ambiguity in the instructions.
- Incorporate AI as a step towards innovation by evolving our teaching practices by going beyond the AI tool and being valuable as a knowledge expert (both in setting up assessments and teaching content) (see also Mollick & Mollick, 2024).
- Teaching and learning tasks should be aligned to learning outcomes and not incorporate AI just for the sake of it or for perceived pressure. AI and emerging technologies should be considered powerful means to achieve learning outcomes more effectively.
Feel free to reach out to Dr. Neha Gupta neha.gupta@wbs.ac.uk for more details about the session.
March 25, 2024
Collaborating with AI – Writing an assignment
Rob Liu-Preece is the Academic Technologist for PAIS, Sociology and Philosophy at the University of Warwick. He has also been an IELTS marker for Writing and Speaking for 20 years and previously taught Academic Skills to international students both in the UK and overseas for 20 years.
This is the second of two posts written by Rob about AI and the ANTF Project:
While many have expressed fears that the advent of AI may threaten future employment, others have emphasised that those able to work with AI may well be in the most secure forms of work. This assignment uses a relatively undemanding writing task on a topic of the students' own choice, which will hopefully motivate them to carry it out and help them sustain interest through the stages of writing, reading an AI answer, re-drafting and then commenting on it. I hope students will benefit from being unconstrained by academic demands as they compose a short piece of writing on a topic familiar to them. That, in turn, might increase the learning take-aways they generate from completing the process. I have included an example to help students undertake the novel experience of writing with AI operating as a kind of writing assistant.
Aims of the Assignment
- To provide a motivating topic for students to write about, review and redraft.
- To practise collaboration on writing with AI.
- To enhance students' understanding and awareness of significant features of their writing.
- To improve their appreciation and ability to compare human-created and AI-generated writing.
- To practise collaborating with AI to produce an improved finished piece of writing.
Learning Objectives:
- To increase understanding of the difference between a human and machine written response.
- To improve the ability to collaborate with AI to enhance a written response.
- To recognise the strengths and weaknesses of AI generated content.
Instructions:
You need to complete 4 activities for this assignment.
- Choose a cultural artefact important to you. Write 500 – 600 words on why it’s important and what insights it carries.
- Then write a prompt for ChatGPT and generate an answer.
- Re-write your original incorporating new content gleaned from ChatGPT. Highlight the additional content in italics.
- Write 2-4 paragraphs comparing human-created and AI-generated writing.
October 16, 2023
Past reflections, future outlook
Past reflections, future outlook: Update from the Warwick Learning Circle on diverse and inclusive assessment practices
by Kerry Dobbins, Isabel Fischer, Sam Grierson and Leda Mirbahai
Warwick’s International Higher Education Academy (WIHEA) has many features - one of them is that members only remain active members for three years before becoming alumni. The imminent handover of the co-leadership of the open learning circle on diverse and inclusive assessment practices made us, outgoing and new co-leaders, reflect on past initiatives and future directions.
Let’s start with an exciting future - what is new?
As we continue to move forwards with this work, we will become the ‘Inclusive Assessment Learning Circle’. This name change recognises our intention to embed the work on diverse assessments into a broader outlook which sees assessment in HE as a vehicle to promote equity and social justice. The wellbeing of students and staff will feature strongly in our ongoing discussions.
In line with our focus, the learning circle will be inclusive of:
- Assessment strategies and methods that are diverse, authentic, applied and decolonised - this broad understanding and acknowledgement of relevant issues in teaching, learning and the designing of assessments reflects the current landscape in HE and indicates the interest of our members. We hope to support the university reimagining the assessment design narrative, taking a view from programme level through to single assessment and placing assessment at the heart of curriculum design.
- Our members - we want members to be and feel actively engaged and involved in the learning circle.
- Students’ experiences – the work of the learning circle will be firmly focused on working with students so that collectively we can work to understand and support students’ needs and their personal and professional ambitions through assessment.
As for ‘past reflections’: What are the existing features of the learning circle worth keeping? There are in particular four features that worked really well in the past:
- Since its inception membership has grown steadily for three reasons. First, the membership of this particular LC is open to WIHEA as well as non-WIHEA members from across Warwick and other institutions, including internationally. The open membership has enriched the discussions and enabled the establishment of networks nationally and internationally. Secondly, creating sub-groups not only raised interest but also offered leadership opportunities to more members. The most popular sub-group was AI in education, which in turn was split into six further strands / interest groups: Artificial Intelligence in Education (warwick.ac.uk). Lastly, the topic of the learning circle (assessment in its broadest sense) has significant implications not only for educators but also for our learners.
- Many of our invited keynote speakers, at the start of the bi-monthly meetings and at our mini-conference on assessments, captured the essence of their talks in blogs. Most of the blogs were published within WJETT (see the end of this blog for some examples); others appeared in other outlets, such as SEDA and SCiLAB. Even when not captured in blogs, the keynote speeches resulted in interesting discussions and networking opportunities.
- Extensive student cooperation, with students supported by two WIHEA grants. One of the student participants even drafted an academic article based on her learning about assessments during her membership.
- Hosting workshops enabled dissemination of our findings and new resources, including the outputs of our funded projects. It also provided a platform to encourage co-learning and sharing amongst participants and facilitators (including student facilitators). Overall we hosted three workshops with the details of the workshops available from our webpage.
Finally, this is blog 15 in our diverse assessment series. Some of the more recent blogs can be found here (with further links to previous blogs shown within some of the blogs below):
- Using AI for Formative Feedback: https://blogs.warwick.ac.uk/wjett/entry/using_ai_for/
- Assessments: Capturing Lived Experience and Shaping the Future
- Blog 7: Democratising the feedback process (Linda Enow): https://blogs.warwick.ac.uk/wjett/entry/democratising_the_feedback/
July 24, 2023
Using AI to evaluate film criticism – Lewis Beer
Introduction
Can ChatGPT evaluate the quality of insights in a student essay? In January 2023, Daisy Christodoulou published an article (Can ChatGPT mark writing?, No More Marking Blog) exploring this question. She tested ChatGPT using English essays, and found that while its grading and feedback were more or less aligned with her own, the AI was sometimes unable to spot fundamental problems in essays. Christodoulou offers some fascinating insights, but one thing she does not explore in any detail is the relationship between ChatGPT’s qualitative evaluation of an essay and the content of the essay itself.
In this post, I will share the results of my own brief experiment, in which I pasted two film reviews into ChatGPT and asked it to evaluate (and compare) the quality of insights in these reviews. My aim here was to use these texts as substitute ‘essays’, and consider how the AI-generated responses might help or hinder a marker in giving meaningful feedback.
The dialogues quoted from below were conducted on 23 March 2023, using the free ChatGPT 3.5. (I later repeated the dialogues with both this and ChatGPT 4.0, but found the responses from 23 March gave more substantial and interesting feedback. I will not discuss those later dialogues in this post.)
Review 1: Peter Bradshaw
To begin with, I asked ChatGPT to ‘Evaluate the quality of insight in the following review of the film Nomadland’, then pasted the text of Peter Bradshaw’s review of Nomadland from The Guardian. I wanted to see how it would respond to this vague request, in which I provided no criteria on which to base an evaluation.
The response did a good job of summarising the review using selected words from the text. For instance, at the end of its evaluation the chatbot said:
[Bradshaw] does offer a mild critique that the film could have been more explicit in its condemnation of the economic forces causing the characters' hardships. However, they still recognize the film's greatness and Chloé Zhao's exceptional filmmaking.
This is clearly pieced together from the following passage in Bradshaw’s review, in which I have highlighted the details used by ChatGPT:
[The film is] brought off with exceptional intelligence and style. Arguably it is not angry enough about the economic forces that are causing all this but it still looks superbly forthright. There is real greatness in Chloé Zhao’s film-making.
The paraphrasing techniques used here – transposing words like ‘exceptional’ and ‘greatness’ into slightly different contexts, and spelling out that ‘all this’ means ‘the characters’ hardships’ – are effective in ‘feedback’ terms, because they seem to indicate an understanding of the writer’s (i.e. student’s) argument. However, Bradshaw says the film is ‘not angry enough’, not that it is ‘not explicit enough’, and this may be a significant error.
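For readers who want to repeat this kind of comparison on their own texts, the word-matching done by eye above can be roughly approximated in a few lines of Python. This is only a minimal sketch: the function names and the small stop-word list are assumptions made for illustration, and a plain word overlap is a much cruder measure than the close reading described in this post.

```python
import re

# A tiny, hand-picked stop-word list (an assumption for this sketch,
# not a standard linguistic resource).
STOP_WORDS = {
    "the", "and", "that", "this", "but", "not", "are", "all", "with",
    "there", "still", "its", "they", "does", "have", "been", "could",
    "more", "about", "off",
}

# The two passages quoted above, with straight apostrophes.
BRADSHAW = (
    "brought off with exceptional intelligence and style. Arguably it is "
    "not angry enough about the economic forces that are causing all this "
    "but it still looks superbly forthright. There is real greatness in "
    "Chloé Zhao's film-making."
)
CHATGPT = (
    "does offer a mild critique that the film could have been more explicit "
    "in its condemnation of the economic forces causing the characters' "
    "hardships. However, they still recognize the film's greatness and "
    "Chloé Zhao's exceptional filmmaking."
)


def content_words(text: str) -> set[str]:
    """Return lower-cased words of three or more letters, minus stop words."""
    words = re.findall(r"[^\W\d_]+", text.lower())
    return {w for w in words if len(w) >= 3 and w not in STOP_WORDS}


def shared_words(source: str, response: str) -> list[str]:
    """Return the content words that appear in both texts, sorted."""
    return sorted(content_words(source) & content_words(response))


if __name__ == "__main__":
    print(shared_words(BRADSHAW, CHATGPT))
```

A check like this only shows which words were carried over. It also shows that 'angry' appears only in Bradshaw's text and 'explicit' only in ChatGPT's, but judging whether that substitution matters still needs a human reader.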
In its final verdict on the review, ChatGPT said:
Overall, this review appears to be well-written and insightful, providing a thorough analysis of the film’s themes, characters, and style while also highlighting its strengths and potential weaknesses.
Without my having provided any criteria, the AI assumes that an insightful film review will explore themes, characters, and style, and that it will offer a balanced assessment of the film’s strengths and weaknesses.
More of ChatGPT’s ‘marking criteria’ emerged when I asked it to ‘Identify some ways in which the review could be improved’. It offered some valid points in response to this prompt. For example:
The review could have delved deeper into the film's technical aspects, such as its cinematography, editing, and sound design. While the review mentions the film's style and how it differs from traditional narrative-driven films, it could have gone into more detail about how the film achieves this style and what specific techniques the filmmakers used.
This is arguably an area where Bradshaw’s review falls short. Bradshaw does discuss how Nomadland differs from traditional narrative-driven films, and he does connect this with Zhao’s stylistic choices; in this part of his review, he might have discussed film-making technique in more detail. So the AI’s choice to structure feedback like this (‘While you did talk about x, you could have delved deeper into y...’) could be helpful for a student looking to improve their essay by building on their strengths.
On that note, here is another of ChatGPT’s constructive criticisms:
The review could have included a more critical evaluation of the film. While the review praises the film’s strengths, it could have offered a more balanced assessment that acknowledges both the film’s strengths and its weaknesses.
This picks up on the detail quoted above, from the AI’s initial evaluation, noting that Bradshaw’s critique of the film is ‘mild’. The AI has perhaps noticed that Bradshaw’s more negative language is limited to the very end of his review, and is couched in the word ‘arguably’. Again, if we imagine this as feedback being provided to a student, ChatGPT’s evaluations do a good job of mixing praise and criticism: ‘You balance your argument by acknowledging the film’s weaknesses, but you only do this briefly at the end – you could have included a more critical evaluation.’
Other responses, however, show ChatGPT’s limitations, and would constitute problematic essay feedback. For instance:
The review could have offered more specific examples of how the film explores its themes and characters. While the review mentions some of the film's themes, such as the impact of economic hardship on older Americans, it could have gone into more detail about how the film portrays these themes and how they are relevant to contemporary society.
This is not really a fair critique: Bradshaw does highlight specific examples of how the film explores ‘the impact of economic hardship on older Americans’, and he does allude to contemporary issues such as factory closures, the dominance of Amazon, and the importance of the tourist trade in this part of America:
...looking for seasonal work in bars, restaurants and – in this film – in a gigantic Amazon warehouse in Nevada, which takes the place of the agricultural work searched for by itinerant workers in stories such as The Grapes of Wrath.
Fern, a widow and former substitute teacher in Empire, Nevada – a town wiped off the map by a factory closure – who is forced into piling some possessions into a tatty van and heading off...
At times, the film looks like a tour of a deserted planet, especially when she heads out to the Badlands national park in South Dakota, where there is also tourist-trade work to be had.
ChatGPT also says:
The review could have provided more context for the film's production and reception. For example, the review could have mentioned the awards and critical acclaim that the film has received, or how it fits into Chloé Zhao's broader filmography.
Some of this is fair – the review was published after Nomadland’s Oscar success, so Bradshaw could have mentioned this – but it misses the contextual details Bradshaw includes about the film’s production:
Zhao was even allowed to film inside one of Amazon’s eerie service-industry cathedrals.
The movie is inspired by Jessica Bruder’s 2017 nonfiction book, Nomadland: Surviving America in the Twenty-First Century, and by the radical nomadist and anti-capitalist leader Bob Wells, who appears as himself.
The people she meets on the road are, mostly, real nomads who have vivid presences on screen.
As with the previous criticism, ChatGPT has not acknowledged key details of the review in its initial assessment, so its critique is not balanced: it is like a marker who blames a student for ‘not doing x’ when the student in fact spent several paragraphs on ‘x’. (Human markers sometimes do this, of course.)
Review 2: Beatrice Loayza
I then asked ChatGPT, ‘Is the following review of the film Nomadland more incisive than the previous one?’, and pasted the text of Beatrice Loayza’s review of Nomadland, from Sight & Sound. Again, I deliberately did not provide any assessment criteria. ChatGPT’s answer was ‘yes’, for several reasons – some valid, some less so. First of all, it said, Loayza ‘provides a detailed analysis of the film's themes and cinematography, as well as the performance of Frances McDormand’. This is fair, and picks up on one of the criticisms of Bradshaw cited above (namely his lack of attention to technical aspects). Loayza comments on specific camera techniques, naming the cinematographer and describing the light effects he achieves. She also does more than Bradshaw to explain why McDormand’s performance is so effective.
ChatGPT picks up on another of its own criticisms of Bradshaw by praising Loayza’s critical perspective on the film:
However, the review also criticizes the film's lack of force and clarity in its insights into labor in the 21st century and the exploitation of older Americans. The author points out that the film's depiction of workers exploited by Amazon feels too easygoing and questions the film's liberal naivete in addressing the conditions of the nomadic lifestyle. Overall, the review provides a more nuanced and thoughtful analysis of the film.
This draws upon the following passage in Loayza’s review; again, I have highlighted phrases that ChatGPT seems to have picked up on:
[The film’s] insights into labour in the 21st century, and the exploitation of an older generation of Americans, lack force and clarity. At the very beginning of the film, Fern is employed by Amazon’s CamperForce programme, which provides base wages and free parking space to seasonal workers in their 60s and 70s. In 2020, Amazon doubled its profits during a global pandemic, which makes Zhao’s easygoing depiction of workers exploited by the company feel rather toothless. That the film aims to capture the ways in which a kind of working-class American outsider struggles without fully addressing the conditions of that struggle casts over it the shadow of a questionable liberal naivete.
To re-cap:
- In its initial assessment of Bradshaw’s review, ChatGPT noted that his critique of the film was ‘mild’
- In suggesting improvements, it built on this comment by recommending a more balanced approach
- And in drawing a comparison with Loayza’s review, it notes her more substantial version of Bradshaw’s criticism.
At each stage, the AI appears to be drawing upon specific evidence from the texts, rather than just ‘hallucinating’ these evaluative comments.
Elsewhere in its comparison between Bradshaw and Loayza, however, ChatGPT did hallucinate some differences in order to justify its verdict. I will not cite these here, as this post is already very long, but the inaccuracies were of a similar kind to those in the summary of Bradshaw discussed in the previous section.
Final thoughts
If these film reviews were formative essays that I had to mark, I could use ChatGPT’s feedback to offer legitimate praise and criticism, suggest improvements, and judge the relative merits of the two essays in relation to each other. However, I would also notice that ChatGPT misses important details in these texts and draws some unfounded contrasts between them.
In the course of this experiment, I tried several variations on the above prompts. Here are some things to note if you want to try a similar experiment yourself:
- I fed the reviews into ChatGPT several times, and in a different order. When I asked it to make a comparative evaluation, it tended to prefer the second review (even if this was Bradshaw’s). When I asked if it could reverse its comparative evaluation (i.e. ‘Can you argue that the other review is more insightful than the first?’), its responses varied: sometimes it doubled down on its first opinion, sometimes it conceded that an alternative opinion could be justified. Again, the reasons given for these opinions ranged from ‘valid’ to ‘hallucinatory’.
- This post demonstrates what ChatGPT is capable of in the hands of a technically ignorant, time-poor amateur like me, but by using the right prompts and follow-up prompts, it would no doubt be possible to collate more credible ‘essay feedback’, and then ask the AI to present and construct this in an appropriate way. Have a look at the other articles and resources linked to on the AI in Education Learning Circle webpage, try an experiment of your own, and share the results in the comments below. In particular, you might think about the learning outcomes and marking criteria specific to your discipline, or your module, and consider how you might train ChatGPT to use these in evaluating a piece of text.
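If you do repeat the comparison, it may help to record each run so that the order effect mentioned in the first point above can be counted rather than remembered. The following is a minimal sketch under stated assumptions: the `Trial` record and both function names are hypothetical, and the example records are placeholders invented to show the shape of the data, not results from the experiment described in this post.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Trial:
    first: str      # review pasted into the chat first
    second: str     # review pasted into the chat second
    preferred: str  # review the chatbot judged more insightful


def presentation_orders(reviews):
    """Every ordered pair of distinct reviews, so each appears in both positions."""
    return list(permutations(reviews, 2))


def second_position_rate(trials):
    """Fraction of trials in which the second-presented review was preferred."""
    if not trials:
        raise ValueError("no trials recorded")
    hits = sum(1 for t in trials if t.preferred == t.second)
    return hits / len(trials)


# Placeholder records, invented purely for illustration.
# They are NOT results from the experiment described in this post.
example_trials = [
    Trial("Bradshaw", "Loayza", "Loayza"),
    Trial("Loayza", "Bradshaw", "Bradshaw"),
    Trial("Bradshaw", "Loayza", "Loayza"),
    Trial("Loayza", "Bradshaw", "Loayza"),
]

if __name__ == "__main__":
    print(presentation_orders(["Bradshaw", "Loayza"]))
    print(second_position_rate(example_trials))
```

Running each order the same number of times, and comparing the second-position rate with one half, gives a rough indication of whether the chatbot's preference follows the review or merely its position.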
July 17, 2023
Who Uses AI, How, and When?
By Matthew Voice, Department of Applied Linguistics, University of Warwick
As I mentioned in my previous WJETT blog post, I have participated in a strand of the ‘AI in Education’ learning circle during the last few months. Over the course of our strand’s meetings regarding the role of AI in formative feedback, our focus has primarily been on the nature of the emerging generative AI technology. Our research and conversation have a great deal to contribute with regard to what large language models mean for the future of formative feedback. However, given that these models are tools, it is worth reflecting on the user, too.
Our learning circle represents educators and students from a wide range of degree programmes, and motivations for engaging with higher education will vary from student to student across all courses. Throughout our discussions, conversation has largely focused on the role of AI in the improvement of formative feedback for students who are motivated by a desire for academic excellence. Given that students in this position will likely be motivated to engage with extracurricular activities within the university (e.g. joining the WIHEA as a student member, partaking in surveys of student experience or voluntary training workshops), their voices will perhaps be heard most clearly during our present conversations.
But the experiences of these students are not representative of our cohort as a whole, or indeed of all points across an individual’s university journey. Many students may view academic writing – either across their degrees, or for certain assignments or modules – as an obstacle to be overcome. In these cases, the objective of academic writing shifts away from excellence, and towards the production of work which will simply allow the student to attain a passing grade. Engagement with GenAI for these students might not mean refinement and improvement; it may simply be a resource for the fastest or most convenient means of content generation.
Integrating GenAI into teaching and formative feedback requires a recognition of this spectrum of experience and motivation. In my previous WJETT blog I recommended that future discussion and planning should consider reasonableness and realism when supporting students to use and think critically about GenAI. By this, I mean:
1) Reasonableness: Is the task being asked of the GenAI (e.g. to produce formative feedback on a draft assignment) one that it is capable of achieving?
2) Realism: Does our (educators, policy planners, researchers) understanding of engagement with GenAI reflect actual independent student use cases?
Assessing reasonableness through understanding what GenAI is capable of achieving will require ongoing review, and the development of support and training for staff in order to keep pace with developments. This, I think, has largely been the focus of the work done by the ‘AI in Education’ learning circle and the final report this group has produced. Going forward, we also need to consider how well we understand our students’ independent engagement with this sort of assistive technology. What tools and resources are they familiar with? How do they understand them? What do they do with them, and at what point(s) in their academic writing?
Grounding future policy and pedagogic resource development in relation to a realistic model of students’ use and understanding of GenAI will be a task as complex as anticipating the impact of future technological development in large language models. By acknowledging this, and undertaking this work, we best position ourselves to ensure that outputs and resources which arise from working groups such as ours will be meaningful to staff and students working across the university.
July 10, 2023
AI Feedback Systems: A Student Perspective – Mara Bortnowschi
The buzz is endless – AI is taking Higher Education by storm. Since the launch of ChatGPT, everyone seems to have an opinion, and rightfully so. It’s so new and we have yet to fully understand its potential and the impact it will have. Within academia, the general sentiment mostly surrounds concern for responsible use, as many students have heard their professors and lecturers warning them against its use. However, its rapid growth and widespread adoption demonstrate that it’s not going anywhere soon, so instead of avoiding it, we should give it the time of day and try to understand the risks and challenges, but also the opportunities, it presents. Furthermore, I think the student voice in these discussions has been underrepresented, when really students can be the key to harnessing this technological advancement as an asset for enhancing learning and education.
The WIHEA group have already explored a number of subjects regarding AI in education from student perspectives that can be found on the group’s Artificial Intelligence in Education webpage. These have included emerging questions AI presents, the risks and ethics of academic integrity, evolving assessment styles to mitigate and integrate AI into assessment, and how teaching may evolve. I will now explore some of the opportunities that are being presented with the widening availability and access to AI tools for students to enhance their learning and to generate formative feedback. While summative work has been, and continues to be, required to be marked by human markers according to the UK Quality Code in Assessment (UKSCQA, 2018), formative feedback has more flexibility, and we are now presented with an opportunity to test and utilise the capabilities of these AI technologies in providing timely, constructive, and developmental feedback.
Existing feedback systems
This notion will be particularly explored with regard to formative elements of summative assessments. Feedback should allow a student to understand the strengths and weaknesses of their work and, if engaged with effectively, can be used to improve academic performance, and thus learning. Especially throughout the pandemic, we have seen the role of feedback change massively: as more of education has shifted online, reliance on formative assessments has increased as assessments for learning. This is in contrast to summative assessments, which are better described as assessments of learning (Wyatt-Smith, Klenowski and Colbert, 2014). Formative assessments are also an opportunity for autonomous learning by developing one’s own skills and relying on self-motivation. It would also be fair to say that formative feedback can be self-assessment of sorts, as even though the formative feedback is generated externally, it is the engagement with it, and the learning you apply from it, that will ultimately make a difference in each student’s performance.
AI generated feedback
So what could incorporation of AI in these feedback systems change? Well, the use of algorithms in generation of feedback is not an entirely new concept. Algorithm-based tools, such as Grammarly and Sketch Engine, have been around for a while; they can generate feedback on academic writing and are for the most part freely available, or students are granted access to them by their institutions. But with more complicated algorithms that use machine learning, we can provide specific and personalised feedback. To make this even more applicable, by integrating what could be different elements of summative marking criteria or rubrics, they could provide some of the most relevant feedback at a moment’s notice.
This application is indeed being explored right here at the University of Warwick. Isabel Fischer, a WBS professor, is trying to pilot a deep learning formative feedback tool, developed with WBS marking criteria at its core, that has the potential to provide more comprehensive feedback. When a student simply submits a PDF or Word document, the algorithm instantly produces a document of in-depth feedback on the four aspects of WBS marking criteria. This could be just the start of developing similar department-specific feedback tools taking into account department-specific assignments, marking criteria, and writing styles for drafts of academic writing. While there are definitely some considerations to look out for, this is fascinating and shows great promise as a tool to increase student autonomy in adapting how they might approach assignments to still have the opportunity to personally benefit from formative feedback.
Considerations of using generative AI
The considerations I mentioned earlier are worth discussing as students are turning to generative AI technologies like ChatGPT more and more. While these technologies are being developed to simulate human intelligence, there are some things they are simply not capable of. For example, they lack expression or expressive language. If using them to generate feedback on your writing, you should be aware that they will not always be able to grasp the nuances or expressive language in that writing. In other words, any feedback you receive from AI should be approached critically. You decide what you implement from feedback you receive, and you are responsible for identifying and understanding what truly can improve your work. This is all part of the responsible use of AI, but really also goes for human-generated feedback. Your assignment at the end of the day will still be marked by a human marker with in-depth subject-specific knowledge and skills that they are asking you to learn and demonstrate in your assignment. I think the irresponsible and neglectful way to use resources like ChatGPT is the quick one that many people have adopted: they do not doubt any response it has generated and implement it into a piece of work, only to find that its references are wrong or do not exist at all. Firstly, this should not be the way we utilise it, as this is blatant plagiarism, but secondly, a critical approach should be used to (for example) verify references, and to recognise that AI answers can lack certain elements of context. Regardless, the point still stands: responsible applications of AI technologies should not be about using them to do your work, but using them to enhance or improve your outputs.
Engagement with AI technologies and feedback
A new level of engagement with AI has been seen since the release of ChatGPT and DALL-E. Perhaps this is rooted in the great advancement that this represented or, more sinisterly, the opportunity to exploit the technology to minimise workload. Regardless, everyone’s interest has been piqued, and the level of engagement has been massive, exceeding what anyone might have expected, particularly from students. At the University of Warwick alone, students have made a total of 850,000 site visits to ChatGPT in the first two months only on the university’s Wi-Fi (SWNS, 2023). I think it’s important to try to understand why this might be in order to be able to channel this traffic for good rather than just fear this alleged ‘cheating epidemic’, as the media has dubbed it.
This contrasts with the older algorithm-based technologies, such as the previously mentioned Grammarly and Sketch Engine, which experienced much more moderate levels of engagement and use. Reasons vary from lack of awareness, to feedback limited in breadth to language, to lack of confidence in the feedback they provide. AI has surpassed some of these limiting factors in that it is capable of generating a wider breadth of feedback that can include language, style, structure and more. Confidence in the feedback it produces is reinforced by the continuous concern from educators. If professors are afraid AI technologies can be used to write entire assessments, then they must be capable of doing so.
Further benefits
As a result, we have seen students be a lot more open to trying to use ChatGPT, and I think we should utilise this eagerness in a way that encourages students to enhance their academic achievements. By introducing resources such as Isabel Fischer’s feedback tool or teaching students how to formulate prompts for ChatGPT to generate similar constructive feedback, we can guide a smooth integration of ChatGPT into Higher Education practices. And there are so many more benefits we have the potential to see. For one, this relieves staff of a massive workload. If such tools are available to take care of the writing style and structure, staff’s role in formative feedback can remain more focused on content. The speed (or even instantaneity) with which AI can produce feedback also makes feedback more accessible. Furthermore, students can engage with it as many times as they like, inputting multiple drafts, as they are less limited by staff work capacity. Also, different students work on different timescales and with different approaches when faced with an assignment. This further widens accessibility to students who might start assignments later than what might normally be a formative deadline. Communicating these advantages is key in order to achieve these outcomes and to harness this technology towards enhancing educational experience for both staff and students.
Conclusion and personal experience
In my experience thus far with using ChatGPT, I have had mixed feelings. On the one hand, I am very apprehensive of the fact that its use is quite contentious at the moment, with some professors explicitly prohibiting its use or consultation. On the other hand, it is a resource that is available and it feels foolish not to use what is at your disposal. Throughout the research that went into this article and discussion with faculty members about its potential to provide feedback, I have been introduced to a very clearly constructive way to engage with ChatGPT that seems to make both staff and students happy. While we are still in the early stages of understanding the potential and risks of generative AI technology, at the end of the day this is a tool that will have implications for Higher Education, and we are faced with the choice of either embracing it, in various ways such as generating formative feedback, or letting it escape our control at the cost of academic integrity, because it is clear that prohibiting its use will not prevent people from exploiting it.
For further queries: marabortnowschi@yahoo.ca or mara.bortnowschi@warwick.ac.uk (may expire soon)
Reference List
SWNS (2023). University of Warwick fears cheating epidemic as data reveals huge number of students visiting AI website during exams. [online] Kenilworth Nub News. Available at: https://kenilworth.nub.news/news/local-news/university-of-warwick-fears-cheating-epidemic-as-data-reveals-huge-number-of-students-visiting-ai-website-during-exams-176836 [Accessed 19 Jun. 2023].
UKSCQA (2018). UK Quality Code for Higher Education Advice and Guidance Assessment. [online] Available at: https://www.qaa.ac.uk/docs/qaa/quality-code/advice-and-guidance-assessment.pdf?sfvrsn=ca29c181_4 [Accessed 16 Jun. 2023].
Wyatt-Smith, C., Klenowski, V. and Colbert, P. (2014). Assessment Understood as Enabling. The Enabling Power of Assessment, pp.1–20. doi:https://doi.org/10.1007/978-94-007-5902-2_1.
June 26, 2023
Using AI for Formative Feedback: Current Challenges, Reflections, and Future Investigation
By Matthew Voice, Applied Linguistics at the University of Warwick
One strand of the WIHEA’s working group for AI in education has focused on the role of AI in formative feedback. As part of this strand, I have been experimenting with feeding my own writing to a range of generative AI tools (ChatGPT, Google Bard, and Microsoft Bing), to learn more about the sorts of feedback they provide.
The accompanying presentation documents my observations during this process. Some issues, such as the propensity of AI to ‘hallucinate’ sources, are well-documented concerns with current models. As discourse on student use of AI begins to make its way into the classroom, these challenges might provide a basis for critical discussion around the accuracy and quality of the feedback produced by language models, and the need for students to review any outputs produced by LLMs.
Other common issues present different challenges for students using LLMs to elicit formative feedback. For instance, the prompt protocol in the presentation revealed a tendency for AI to provide contradictory advice when its suggestions are queried, leading to a confusing stance on whether or not an issue raised actually constitutes a point for improvement within the source text. When tasked with rewriting prompt material for improvement, LLMs consistently misconstrued (and therefore left absent) some of the nuances of my original review, in a fashion which changed key elements of the original argumentation without acknowledgement. The potential challenges for student users which arise from these tendencies are discussed in more detail in the presentation’s notes.
In addition to giving some indication of the potential role of LLMs in formative feedback, this task has also prompted me to reflect on the way I approach and understand generative AI as an educator. Going forward, I want to suggest two points of reflection for future tasks used to generate and model LLM output in pedagogical contexts. Firstly: is the task a reasonable one? Using LLMs ethically requires using my own writing as a basis for prompt material, but my choice to use published work means that the text in question had already been re-drafted and edited to a publishable standard. What improvements were the LLMs supposed to find, at this point? In future, I would be interested to try eliciting LLM feedback on work in progress as a point of comparison.
Secondly, is the task realistic, i.e. does it accurately reflect the way students use and engage with AI independently? The review in my presentation, for example, presupposes that the process of prompting an LLM for improvements to pre-written text is comparable to student use of these programmes. But how accurate is this assumption? In the Department of Applied Linguistics, our in-progress Univoice project sees student researchers interviewing their peers about their academic process. Data from this project might provide clearer insight into the ways students employ AI in their learning and writing, providing a stronger basis for future critical investigation of the strengths and limitations in AI’s capacity as a tool for feedback.
This is blog 14 in our diverse assessment series. The two most recent previous blogs can be found here:
- Assessments: Capturing Lived Experience and Shaping the Future
- Building knowledge on the pedagogy of using generative AI in the classroom and in assessments
June 22, 2023
Rethinking authentic assessment: work, well-being, and society by Jan McArthur
In this 2022 paper, Jan McArthur builds on “existing work on authentic assessment to develop a more holistic and richer concept that will be more beneficial to individual students and to the larger society of which they are part.” McArthur presents three key principles to help us rethink and broaden the concept of authentic assessment: 1) From real world/world of work to society; 2) From task performance to why we value the task; 3) From the status‑quo of real‑world/world of work to transforming society. If you are short on time, you might want to jump straight to page 8 where discussion of these three principles begins:
https://link.springer.com/article/10.1007/s10734-022-00822-y
June 19, 2023
Assessments: Capturing Lived Experience and Shaping the Future
Reflection on Project Outputs by Molly Fowler
This WIHEA-funded co-creation project aimed to capture and explore student and staff perspectives on diverse assessment. Neither group was able to define a diverse assessment strategy clearly, but interestingly their feelings about assessment and their ideas of how it can be improved were very similar. Students expressed a desire for greater choice, flexibility and equitable access to assessments. Equitable access encompasses a wide range of complex personal needs including language requirements, disability, neurodiversity, caring responsibilities, and the need to work alongside studies. Staff reiterated many of the same concepts but framed their ideas around pedagogical models. There was a strong emphasis on learning from assessments on both sides and a widespread longing for a culture shift to design assessments that model a fair and fulfilling education. Student co-creation was seen as a necessary tool to expedite the shift towards embedding assessments as part of the learning journey.
I am a final year student on the Health and Medical Sciences BSc programme. My role as a student co-creator in this research project was to collect and analyse data from students and staff pertaining to their beliefs around assessment. In the analysis stage of the project, I mainly focused on collating and summarising the student data. I am new to conducting primary research and I have thoroughly appreciated this experience. I enjoyed the challenge of leading interviews and focus groups and deciding when to explore a statement further or manoeuvre back to the set questions. Gaining first-hand insight into the research process has augmented my ability to understand and extract key information from research papers, which will be a life-long skill – and was particularly useful when I was conducting a systematic review for my dissertation. It has been very satisfying to observe my own personal development in this way.
This project has made me aware of my privilege in assessments as a neurotypical English speaker. I have been exposed to a range of different perspectives on assessment and I hope to be better equipped to identify problems and support those around me. For example, I was surprised to learn that international students feel more disadvantaged by multiple choice exams than essays, as MCQs often require a nuanced understanding of language and grammar. Similarly, I have always taken a pragmatic approach to assessments and centred my learning around them. I had not previously considered assessments as part of the learning journey or as a learning exercise. As I move into the next phase of my own education, I will try to extend my learning beyond assessments to gain knowledge that I can use in my profession. Undertaking this project has been an enriching experience as a student and as an individual. It has shaped my approach to my assessments, and I have become more aware of the complex needs of others who are completing the same assessment. Students and staff are calling for the same changes to assessment methodology, which can only be implemented if the University takes a holistic approach to restructuring assessments with students contributing to the process.
I look forward to bringing my knowledge from this assignment into my next research project. This is the 13th blog in our diverse assessment series. Previous blogs can be found here:
Blog 1: Launch of the learning circle (Isabel Fischer & Leda Mirbahai): https://blogs.warwick.ac.uk/wjett/entry/interested_in_diverse/
Blog 2: Creative projects and the ‘state of play’ in diverse assessments (Lewis Beer): https://blogs.warwick.ac.uk/wjett/entry/creative_projects_and/
Blog 3: Student experience of assessments (Molly Fowler): https://blogs.warwick.ac.uk/wjett/entry/a_student_perspective/
Blog 4: Assessment Strategy – one year after starting the learning circle (Isabel Fischer & Leda Mirbahai): https://blogs.warwick.ac.uk/wjett/entry/one_year_on/
Blog 5: Learnings and suggestions based on implementing diverse assessments in the foundation year at Warwick (Lucy Ryland): https://blogs.warwick.ac.uk/wjett/entry/learnings_suggestions_based/
Blog 6: How inclusive is your assessment strategy? (Leda Mirbahai): https://blogs.warwick.ac.uk/wjett/entry/blog_6_how/
Blog 7: Democratising the feedback process (Linda Enow): https://blogs.warwick.ac.uk/wjett/entry/democratising_the_feedback/
Blog 8: AI for Good: Evaluating and Shaping Opportunities of AI in Education (Isabel Fischer, Leda Mirbahai & David Buxton): https://blogs.warwick.ac.uk/wjett/entry/ai_for_good/
Blog 9: On ‘Opportunities of AI in Higher Education’ by DALL.E and ChatGPT (Isabel Fischer): https://blogs.warwick.ac.uk/wjett/entry/on_opportunities_of/
Blog 10: Pedagogic paradigm 4.0: bringing students, educators and AI together (Isabel Fischer): https://www.timeshighereducation.com/campus/pedagogic-paradigm-40-bringing-students-educators-and-ai-together
Blog 11: Ethically deploying AI in education: An update from the University of Warwick’s open community of practice (Isabel Fischer, Leda Mirbahai, Lewis Beer, David Buxton, Sam Grierson, Lee Griffin, and Neha Gupta): https://www.open.ac.uk/scholarship-and-innovation/scilab/ethically-deploying-ai-education
Blog 12: Building knowledge on the pedagogy of using generative AI in the classroom and in assessments (Isabel Fischer and Matt Lucas): https://blogs.warwick.ac.uk/wjett/entry/building_knowledge_on/
Join the Diverse Assessment Learning Circle: If you would like to join the learning circle please contact the co-leads: Leda Mirbahai, Warwick Medical School (WMS) (Leda.Mirbahai@warwick.ac.uk) and Isabel Fischer, Warwick Business School (WBS) (Isabel.Fischer@wbs.ac.uk). This LC is open to non-WIHEA members.