International Journal For Multidisciplinary Research (E-ISSN: 2582-2160 · Impact Factor: 9.24)
A Widely Indexed Open Access Peer Reviewed Multidisciplinary Bi-monthly Scholarly International Journal
Reliability-Weighted Multi-Agent Annotation Workflow for Quality-Controlled LLM Labeling
| Author(s) | Mr. Sarvagya Jain, Mr. Sandeep Piplotia, Ashish Shrivastava, Brajendra Prajapati |
|---|---|
| Country | India |
| Abstract | Large Language Models (LLMs) are widely adopted for automatic data labeling because of their speed and scalability. However, inconsistent outputs, hallucinations, and differing reasoning ability across models can make their labels unreliable. To address these issues, this paper introduces a Multi-Agent Reliability-Weighted Annotation Workflow, a system designed to improve the trustworthiness of LLM-generated labels. The system employs three to five heterogeneous LLM agents with different architectures and response-generation strategies, each of which labels the data independently. Each agent is then assigned a reliability score based on its historical performance, the confidence of its answers, and its agreement with the other agents. Final labels are produced by combining the agents' outputs weighted by these reliability scores, rather than by taking the most common answer. Items with low confidence or high disagreement are automatically flagged for human review or re-labeling. Experiments on datasets such as AG News, CoNLL-2003, and SST-2 show that this method improves accuracy by up to 3.4% and label agreement by 0.06 compared with simpler baselines that do not account for reliability. |
| Keywords | Large Language Models, Data Annotation, Multi-Agent Systems, Reliability Weighting, Quality Control, Weak Supervision, Automated Labeling |
| Field | Computer > Artificial Intelligence / Simulation / Virtual Reality |
| Published In | Volume 8, Issue 3, May-June 2026 |
| Published On | 2026-05-07 |
| DOI | https://doi.org/10.36948/ijfmr.2026.v08i03.77244 |
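The abstract describes reliability-weighted aggregation: each agent's vote is scaled by its reliability and confidence, and low-support items are routed to human review. The paper's exact scoring formula is not given here, so the sketch below is a minimal, hypothetical illustration of that idea; the function name, the multiplicative weighting, and the `disagreement_threshold` parameter are all assumptions, not the authors' implementation.

```python
from collections import defaultdict

def aggregate_labels(agent_votes, reliability, confidence,
                     disagreement_threshold=0.6):
    """Combine per-agent labels into one final label (illustrative sketch).

    agent_votes: dict mapping agent id -> predicted label
    reliability: dict mapping agent id -> reliability weight in [0, 1]
    confidence:  dict mapping agent id -> self-reported confidence in [0, 1]
    Returns (label, flagged); flagged=True means route to human review.
    """
    scores = defaultdict(float)
    for agent, label in agent_votes.items():
        # Assumed weighting: reliability times confidence (one plausible choice).
        scores[label] += reliability[agent] * confidence[agent]
    total = sum(scores.values())
    best_label = max(scores, key=scores.get)
    # Fraction of total weighted mass behind the winning label.
    support = scores[best_label] / total if total else 0.0
    # Low support signals disagreement among agents -> escalate.
    flagged = support < disagreement_threshold
    return best_label, flagged
```

For example, if two high-reliability agents agree and one low-reliability agent dissents, the majority label wins with high support and is not flagged; if all agents disagree, support for every label is low and the item is escalated.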
All research papers published on this website are licensed under Creative Commons Attribution-ShareAlike 4.0 International License, and all rights belong to their respective authors/researchers.