Can AI Help Teachers Grade Geography Homework? Here's What We Found.
By Divya Venkatraman, Xin Ying Wong, Jay Shah, and Vidyaraman Sankaranarayanan – DeepDive Labs, Singapore and LearnMojo, Singapore
This blog is a condensed version of research by DeepDive Labs to understand how AI can be put to meaningful use in education. Download the full research paper here.
Imagine you're a secondary school teacher with over 100 geography worksheets to grade, each requiring not just a score, but thoughtful feedback. It's exhausting. Now, imagine you had an assistant who could help you score answers accurately, consistently, and even offer helpful suggestions to students, all within seconds.
That assistant is artificial intelligence.
At DeepDive Labs, we recently explored how advanced AI models, known as large language models (LLMs), could support teachers in grading secondary-level geography homework. Here’s what we found.
The Problem: Teachers Are Stretched Thin
Educators aim to give students enough practice to achieve mastery. However, each additional worksheet adds significantly to their marking time. Geography in particular involves diagrams, real-world applications, and subjective responses, all of which complicate grading. Students want tailored feedback, while parents want clear guidance on how their children can improve. Everyone wants more, and teachers are expected to deliver all of it.
The Experiment: Can AI Help Grade Real Student Work?
We collaborated with LearnMojo, a platform that provides coaching for students in subjects such as geography and social studies, and used real student responses to 12 structured geography questions covering topics like the water cycle, pollution, and deforestation.
We evaluated two advanced AI models, OpenAI's GPT-4o and Meta's LLaMA-3, by giving each a student’s answer and a teacher-designed marking scheme. No visual aids were provided, so the models were assessed on text alone.
We also tried two different grading styles:
Lenient (generous scoring)
Strict (score only if the answer matches the marking scheme exactly)
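To make the setup concrete, here is a minimal sketch of how a text-only grading prompt could be assembled from a question, a marking scheme, and a student answer, with a switch between the two grading styles. The prompts used in the study are not reproduced in this post, so the wording of the instructions and the names GradingTask, STYLE_INSTRUCTIONS, and build_grading_prompt are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical wording for the two grading styles; the prompts used in the
# study are not reproduced in this post.
STYLE_INSTRUCTIONS = {
    "lenient": (
        "Be generous: award marks when the answer shows the right idea, "
        "even if the wording differs from the marking scheme."
    ),
    "strict": (
        "Be strict: award a mark only when the answer matches "
        "the marking scheme exactly."
    ),
}


@dataclass
class GradingTask:
    question: str
    marking_scheme: str
    student_answer: str
    max_marks: int


def build_grading_prompt(task: GradingTask, style: str) -> str:
    """Assemble a text-only grading prompt for one student answer."""
    if style not in STYLE_INSTRUCTIONS:
        raise ValueError(f"unknown grading style: {style!r}")
    return "\n".join([
        "You are grading a secondary-level geography answer.",
        STYLE_INSTRUCTIONS[style],
        f"Question: {task.question}",
        f"Marking scheme: {task.marking_scheme}",
        f"Student answer: {task.student_answer}",
        f"Give a score from 0 to {task.max_marks} "
        "and a one-sentence justification.",
    ])


# Made-up example task, for illustration only.
example = GradingTask(
    question="Name one source of water pollution.",
    marking_scheme="1 mark for any valid source, e.g. factory waste.",
    student_answer="Factories dumping waste into rivers.",
    max_marks=1,
)
print(build_grading_prompt(example, "strict"))
```

The resulting string would then be sent to whichever model is being tested. Keeping the style instruction separate from the rest of the prompt means the same answer can be graded both ways and the scores compared.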
The Struggles!
AI struggled when it wasn’t given the diagram or textbook context
It was sometimes too nice (more generous than the teacher)
It had trouble with vague or open-ended answers
The Results: Surprisingly Good
GPT-4o aligned with teacher evaluations 60% of the time
Most of the remaining scores were within 1 point of the teacher's
It provided justifications for its scoring, which teachers agreed with!
Scores remained steady even when the AI was run multiple times or when answers were rephrased
The most successful outcomes were observed with fact-oriented questions (such as identifying a forest layer or explaining sources of pollution)
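For readers who want to run the same kind of comparison on their own data, the two headline measures above (exact agreement with the teacher, and agreement within 1 point) can be computed as in the sketch below. The function name agreement_rates and the sample scores are made up for illustration and are not the study's data.

```python
def agreement_rates(teacher_scores, ai_scores):
    """Return (exact, within_one) agreement as fractions between 0 and 1."""
    if len(teacher_scores) != len(ai_scores):
        raise ValueError("score lists must have the same length")
    if not teacher_scores:
        raise ValueError("score lists must not be empty")

    # Absolute gap between teacher and AI score for each answer.
    diffs = [abs(t - a) for t, a in zip(teacher_scores, ai_scores)]
    n = len(diffs)
    exact = sum(d == 0 for d in diffs) / n
    within_one = sum(d <= 1 for d in diffs) / n
    return exact, within_one


# Made-up scores for five answers, for illustration only.
teacher = [2, 1, 3, 0, 2]
ai = [2, 2, 3, 1, 0]
exact, within_one = agreement_rates(teacher, ai)
print(f"exact: {exact:.0%}, within 1 point: {within_one:.0%}")
# prints "exact: 40%, within 1 point: 80%"
```

Note that the within-1-point rate includes the exact matches, so it is always at least as high as the exact rate.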
Why This Matters
Artificial intelligence will not (and should not) take the place of teachers. However, it can take over repetitive tasks, allowing educators to dedicate more time to what matters most: teaching and mentoring.
What’s Next?
Want to see the full research paper? Download here.
As AI improves (especially models that can “see” diagrams and understand images), it will be able to take on more complex tasks. Imagine an assistant that can read a scanned student worksheet, squiggly handwriting and all, and still grade accurately!
For now, if you're a school, tutoring centre, or edtech platform looking to scale your feedback system, AI can help you get started.