“Evaluating AI Language Models for Reference Services: A Comparative Study of ChatGPT, Gemini, and Copilot”


The descriptive statistics indicate that Google Gemini outperformed the other GenAI chatbots, by scoring high on “accuracy,” “relevancy,” “friendliness” and “instruction” resulting in a higher mean score followed by public ChatGPT, commercial ChatGPT-4.0, and Microsoft Copilot.

https://doi.org/10.1080/10875301.2025.2478861

Author: Charles W. Bailey, Jr.