Singaporean Researchers Develop AI Model to Improve Accuracy in Low-Resource Applications

A*STAR's latest breakthrough in AI addresses the limitations of large pre-trained language models by introducing a novel approach to stance detection, crucial for analyzing opinions in low-resource settings. Published in Big Data Mining and Analytics, the new model leverages a collaborative knowledge infusion method to improve accuracy and efficiency in machine learning tasks with limited data. Achieving superior performance over traditional models, this innovation promises to make AI more effective in specialized fields such as research and medicine, where data and computing resources are scarce.

A team of computer scientists from the Agency for Science, Technology and Research (A*STAR) in Singapore has developed a new AI model designed to enhance the accuracy and relevance of machine learning (ML) systems in low-resource scenarios, such as research and medical applications. The breakthrough addresses a key limitation of large pre-trained language models (PLMs), which require extensive datasets and computing power that are often unavailable in specialized fields.

The new model, which was detailed in the journal Big Data Mining and Analytics on August 28, focuses on improving stance detection—a critical task in analyzing opinions in contexts like social media or product reviews. Traditional large PLMs, such as those used in popular AI systems like ChatGPT, struggle in scenarios where training data is scarce or computing resources are limited. This limitation has made it difficult to apply AI effectively in more niche applications.

"Stance detection is inherently a low-resource task due to the diversity of targets and the limited availability of annotated data," said Yan Ming, senior scientist at A*STAR’s Center for Frontier AI Research (CFAR) and lead author of the study. "Enhancing AI-based methods for low-resource stance detection is essential to ensure these tools are effective and reliable in real-world applications."

The researchers introduced a collaborative knowledge infusion method that allows ML models to be trained more efficiently with smaller datasets. This method improves the model's ability to determine whether a text is in favor of or against a specific target, such as a product or political figure, using textual data from sources like tweets or reviews.

To validate their approach, the team conducted experiments on three publicly available stance-detection datasets: VAST, P-Stance, and COVID-19-Stance. The new model outperformed existing AI systems, achieving F1 scores (a metric that balances precision and recall) between 79.6% and 86.91%, surpassing the scores of traditional models like BERT and TAN.
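
For readers unfamiliar with the F1 score, the short Python sketch below shows how such a figure is typically computed for a stance classifier. It is an illustration only, not the researchers' evaluation code: the article does not say how the scores were averaged, so macro-averaging over stance labels is an assumption here, and the function names and the toy labels are hypothetical.

```python
def f1_per_label(y_true, y_pred, label):
    """F1 for one stance label: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)  # correct hits
    fp = sum(1 for t, p in pairs if t != label and p == label)  # false alarms
    fn = sum(1 for t, p in pairs if t == label and p != label)  # misses
    if tp == 0:
        return 0.0  # no correct predictions for this label
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-label F1 (assumed averaging scheme)."""
    labels = sorted(set(y_true))
    return sum(f1_per_label(y_true, y_pred, lab) for lab in labels) / len(labels)


# Toy data, invented for illustration; not taken from VAST, P-Stance,
# or COVID-19-Stance.
gold = ["favor", "against", "favor", "against", "favor", "against"]
pred = ["favor", "against", "against", "against", "favor", "favor"]

score = macro_f1(gold, pred)
print(f"macro F1 = {score:.2%}")
```

Because F1 combines precision and recall, a model cannot score well simply by predicting the most common stance for every text, which makes the metric a common choice for stance-detection benchmarks.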

"Our proposed method integrates verified knowledge from multiple sources, ensuring that the model remains relevant and effective even as pre-trained large language models become outdated," Ming explained. "This approach also introduces a collaborative adaptor that significantly reduces the need for extensive annotated data, improving training efficiency."

This advancement not only enhances the practical use of AI in research and medical settings but also offers a template for future optimizations in low-resource AI applications.

"Our primary focus is on efficient learning within low-resource real-world applications," said Joey Tianyi Zhou, principal scientist at CFAR and co-author of the paper. "Unlike major AI firms that concentrate on developing general artificial intelligence (AGI) models, our objective is to create more efficient AI methods that benefit both the public and the research community."

The research marks a significant step toward making AI more accessible and effective in specialized domains, where data and computing power are often at a premium.