“Leveraging Task-Specific Large Language Models to Enhance Research Data Management Services”


Applying prompt engineering and RAG [Retrieval-Augmented Generation] to research data management and sharing activities offers numerous opportunities for enhancing institutional research data support services. Here, we present just a few illustrative examples that highlight how these technologies could significantly improve service efficiencies, reduce researcher burden, and support adherence to evolving policies. These examples aim to inspire further exploration and future work rather than serve as extensive case studies.

  • Task-Specific, Agent-Based Chatbots for Data Management and Sharing Plans (DMSPs): Agent-based chatbots can assist researchers in drafting DMSPs by prompting for specific information based on funder requirements. This would offer researchers an interactive, guided experience that streamlines the process of developing a DMSP. The chatbot can be pre-loaded with knowledge of DMSP policies, institutional resources, and common pitfalls observed during plan reviews. Moreover, by incorporating review criteria, these chatbots could also provide real-time feedback on draft plans, allowing researchers to refine their submissions before institutional review.
  • Automated Text Extraction for Structured Compliance Reporting: Using these approaches, institutions can also automate the extraction of key details from narrative-based DMSPs and transform them into structured, formatted fields. This could be particularly useful for converting narrative-based DMSPs into actionable steps for researchers, service providers, and compliance officers, enabling efficient monitoring and follow-up on data management and sharing commitments.
  • Customized Knowledge Retrieval for Policy Guidance and Updates: Institutions can further leverage these approaches to develop tools that offer researchers up-to-date guidance on data management and sharing policies from major funders and publishers as well as institutional requirements. For instance, a researcher could query these tools to receive the latest mandates, institutional requirements, or best practices related to data management and sharing. This capability would reduce the burden on researchers of tracking down the most recent policy updates.
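
As a rough illustration of the knowledge-retrieval idea in the last example, the retrieval step of RAG can be sketched with the Python standard library alone. This is a minimal sketch under stated assumptions, not the approach described in the article: the snippet texts are hypothetical placeholders rather than real funder, publisher, or institutional policies, the names `POLICY_SNIPPETS`, `retrieve`, and `build_prompt` are invented for illustration, and simple word-count cosine similarity stands in for the embedding-based search a production tool would more likely use.

```python
import math
import re
from collections import Counter

# Hypothetical placeholder snippets (assumption): a real tool would index
# actual funder, publisher, and institutional policy documents.
POLICY_SNIPPETS = {
    "funder-sharing": "Funder policy placeholder: data must be shared in a repository by the end of the award.",
    "institution-storage": "Institutional placeholder: active research data is stored on approved university storage.",
    "publisher-availability": "Publisher placeholder: articles include a data availability statement.",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, snippets, k=1):
    """Return the keys of the k snippets most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(
        snippets.items(),
        key=lambda item: cosine(q, Counter(tokenize(item[1]))),
        reverse=True,
    )
    return [key for key, _ in ranked[:k]]


def build_prompt(query, snippets, k=1):
    """Assemble a prompt that grounds the model in the retrieved snippets."""
    keys = retrieve(query, snippets, k)
    context = "\n".join(f"[{key}] {snippets[key]}" for key in keys)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )
```

Calling `build_prompt("Which repository should shared data go in?", POLICY_SNIPPETS)` would return a prompt whose context section holds the best-matching placeholder snippet; that prompt would then be passed to whichever language model the institution uses. Keeping retrieval separate from generation means the policy collection can be refreshed without retraining the model.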

https://tinyurl.com/bdee5u29



Author: Charles W. Bailey, Jr.
