“Evaluating AI Language Models for Reference Services: A Comparative Study of ChatGPT, Gemini, and Copilot”

The descriptive statistics indicate that Google Gemini outperformed the other GenAI chatbots, by scoring high on “accuracy,” “relevancy,” “friendliness” and “instruction” resulting in a higher mean score followed by public ChatGPT, commercial ChatGPT-4.0, and Microsoft Copilot.

https://doi.org/10.1080/10875301.2025.2478861

Author: Charles W. Bailey, Jr.
