Modern Australian
The Times

I got generative AI to attempt an undergraduate law exam. It struggled with complex questions

  • Written by Armin Alimardani, Lecturer, School of Law, University of Wollongong

It’s been nearly two years since generative artificial intelligence was made widely available to the public. Some models showed great promise by passing academic and professional exams.

For instance, GPT-4 scored higher than 90% of the United States bar exam test takers. These successes led to concerns AI systems might also breeze through university-level assessments. However, my recent study paints a different picture, showing generative AI isn’t quite the academic powerhouse some might think it is.

My study

To explore generative AI’s academic abilities, I looked at how it performed on an undergraduate criminal law final exam at the University of Wollongong – one of the core subjects students need to pass in their degrees. There were 225 students doing the exam.

The exam ran for three hours and had two sections. The first asked students to evaluate a case study about criminal offences and the likelihood of a successful prosecution. The second included a short essay and a set of short-answer questions.

The test questions evaluated a mix of skills, including legal knowledge, critical thinking and the ability to construct persuasive arguments.

Students were not allowed to use AI for their responses and sat the assessment in a supervised environment.

I used different AI models to create ten distinct answers to the exam questions.

Five papers were generated by just pasting the exam question into the AI tool without any prompts. For the other five, I gave detailed prompts and relevant legal content to see if that would improve the outcome.

I handwrote the AI-generated answers in official exam booklets and used fake student names and numbers. These AI-generated answers were mixed with actual student exam answers and anonymously given to five tutors for grading.

Importantly, when marking, the tutors did not know AI had generated ten of the exam answers.

How did the AI papers perform?

When the tutors were interviewed after marking, none of them suspected any answers were AI-generated.

This shows the potential for AI to mimic student responses and educators’ inability to spot such papers.

But on the whole, the AI papers were not impressive.

While the AI did well in the essay-style question, it struggled with complex questions that required in-depth legal analysis.

This means even though AI can mimic human writing style, it lacks the nuanced understanding needed for complex legal reasoning.

The students’ exam average was 66%.

On average, the AI papers generated without prompting beat only 4.3% of students. Two barely passed (the pass mark is 50%) and three failed.

The papers generated with prompts beat, on average, 39.9% of students. Three of these papers weren’t impressive and received 50%, 51.7% and 60%, but two did quite well. One scored 73.3% and the other scored 78%.

What does this mean?

These findings have important implications for both education and professional standards.

Despite the hype, generative AI isn’t close to replacing humans in intellectually demanding tasks such as this law exam.

My study suggests AI should be viewed more as a tool that, when used properly, can enhance human capabilities.

So schools and universities should concentrate on developing students’ skills to collaborate with AI and analyse its outputs critically, rather than relying on the tools’ ability to simply spit out answers.

Further, to make collaboration between AI and students possible, we may have to rethink some of the traditional notions we have about education and assessment.

For example, we might consider that when a student prompts, verifies and edits an AI-generated work, this is their original contribution and should still be viewed as a valuable part of learning.

Read more https://theconversation.com/i-got-generative-ai-to-attempt-an-undergraduate-law-exam-it-struggled-with-complex-questions-240021
