arXiv cs.AI AI Research Apr 23

Exploring Data Augmentation and Resampling Strategies for Transformer-Based Models to Address Class Imbalance in AI Scoring of Scientific Explanations in NGSS Classroom

Significance: 2/5

This study investigates data augmentation and resampling strategies for improving the automated scoring of students' scientific explanations with transformer-based models. The researchers tested GPT-4-generated synthetic data and lexical-based extraction methods as ways to address class imbalance in the data used to train SciBERT scoring models.

Why it matters: Synthetic data generation via LLMs is becoming an important lever for refining specialized scoring models in high-stakes domains.
Read the original at arXiv cs.AI

Tags

#transformer #data augmentation #nlp #education #scibert
