AI Model Assesses Middle School Science Essays

A team of computer scientists at Penn State University recently studied the effectiveness of natural language processing (NLP), a form of AI, for evaluating and giving feedback on students' science essays.

Their results were presented at the International Society of the Learning Sciences Conference (ISLS) and at the International Conference on Artificial Intelligence in Education (AIED).

According to principal investigator Rebecca Passonneau, professor of computer science and engineering, natural language processing is a branch of computer science in which spoken or written language is converted into computable data.

The research team gave an existing natural language processing tool called PyrEval the ability to evaluate ideas in student writing based on predetermined computable rubrics. They referred to the new software as PyrEval-CR.

‘PyrEval-CR can provide middle school students immediate feedback on their science essays, which offloads much of the burden of assessment from the teacher, so that more writing assignments can be integrated into middle school science curricula,’ Passonneau said. ‘Simultaneously, the software generates a summary report on topics or ideas present in the essays from one or more classrooms, so teachers can quickly determine if students have genuinely understood a science lesson.’

In 2004, Passonneau worked with collaborators to develop the Pyramid method, in which source documents are annotated manually to rank written ideas by their relevance. In 2012, Passonneau and her team of graduate students began automating Pyramid, leading to the fully automated PyrEval and, subsequently, PyrEval-CR.

The reliability and functionality of PyrEval-CR were tested on hundreds of middle school science essays from Wisconsin public schools. Sadhana Puntambekar, professor of educational psychology at the University of Wisconsin-Madison, developed the science curriculum and recruited the science teachers. She also provided the historical student essay data needed to develop PyrEval-CR.

‘In PyrEval-CR, we created the same kind of model that PyrEval would create from a few passages by expert writers but extended it to align with whatever rubric makes sense for a particular essay prompt,’ said Passonneau. ‘We did a lot of experiments to fine-tune the software, then confirmed that the software’s assessment correlated very highly with an assessment from a manual rubric developed and applied by Puntambekar’s lab.’

In the AIED paper, the researchers detailed how they adapted the PyrEval software to design PyrEval-CR. Passonneau stated that most software is created as a set of modules or building blocks, each with a different function.

In the new PyrEval-CR, the computable rubric is created semi-automatically before the students even get an essay prompt.

‘PyrEval-CR makes things easier for teachers in actual classrooms who use rubrics, but who usually don’t have the resources to create their own rubric and test whether it can be used by different people and achieve the same assessment of student work,’ said Passonneau.

To evaluate the essays, the software first breaks students' sentences down into single clauses and converts each clause into a fixed-length vector. To capture the meaning of the clauses in this conversion, it uses a weighted text matrix factorization algorithm, which the researchers found detects similarity of meaning better than other methods.
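Once clauses are represented as fixed-length vectors, comparing their meanings reduces to comparing vectors. As a minimal sketch of that idea (using cosine similarity as a generic stand-in, not the weighted text matrix factorization the researchers used, and with made-up three-dimensional vectors in place of real clause embeddings):

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine similarity between two fixed-length vectors: 1.0 means
    identical direction, values near 0 mean unrelated."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical fixed-length vectors for a student clause and a rubric idea.
student_clause = np.array([0.8, 0.1, 0.3])
rubric_idea = np.array([0.9, 0.2, 0.2])

score = cosine_similarity(student_clause, rubric_idea)
# A score near 1.0 would suggest the clause expresses the rubric idea.
```

In practice the vectors come from a learned model rather than being written by hand; the point is only that semantic comparison becomes numeric.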

A weighted maximal independent set algorithm was also used to ensure that PyrEval-CR selects the best analysis of a given sentence.

‘There are many ways to break down a sentence, and each sentence may be a complex or a simple statement,’ said Passonneau. ‘Humans know if two sentences are similar by reading them. To simulate this human skill, we convert each rubric idea to vectors, and construct a graph where each node represents matches of a student vector to rubric vectors, so that the software can find the optimal interpretation of the student essay.’

In the future, the researchers hope to use the assessment software in classrooms in order to make the evaluation of science essays easier for teachers.

‘Through this research, we hope to scaffold student learning in science classes, to give them just enough support and feedback and then back off so they can learn and achieve on their own,’ said Passonneau. ‘The goal is to allow STEM teachers to easily implement writing assignments in their curricula.’

By Marvellous Iwendi.

Source: Penn State