Archive for the 'Text and Data Mining' Category

Christine L. Borgman: "Whose Text, Whose Mining, and to Whose Benefit?"

Posted in Copyright, Open Access, Publishing, Scholarly Journals, Text and Data Mining on February 3rd, 2020

https://escholarship.org/uc/item/3682b9j6

"Library Receives $1M Mellon Grant to Experiment with Digital Collections as Big Data"

Posted in Data Curation, Open Data, and Research Data Management, Digital Curation & Digital Preservation, Grants, Text and Data Mining on October 8th, 2019

https://www.loc.gov/item/prn-19-098/?loclr=ealn

Dahlgren Memorial Library: "Text Mining for Clinical Support"

Posted in Libraries, Scholarly Journals, Text and Data Mining on October 2nd, 2019

https://doi.org/10.5195/jmla.2019.758

Carl Malamud: "The Plan to Mine the World’s Research Papers"

Posted in Publishing, Scholarly Journals, Text and Data Mining on July 18th, 2019

https://www.nature.com/articles/d41586-019-02142-1

"The Fate of Text and Data Mining in the European Copyright Overhaul"

Posted in Copyright, Digital Copyright Wars, Text and Data Mining on April 30th, 2018

https://www.eff.org/deeplinks/2018/04/text-and-data-mining-european-copyright-overhaul

"Releasing 1.8 Million Open Access Publications from Publisher Systems for Text and Data Mining"

Posted in Open Access, Open Science, Scholarly Journals, Text and Data Mining on March 23rd, 2018

Petr Knoth, Nancy Pontika and Lucas Anastasiou have published "Releasing 1.8 Million Open Access Publications from Publisher Systems for Text and Data Mining" in LSE Impact of Social Sciences.

Here's an excerpt:

Text and data mining offers an opportunity to improve the way we access and analyse the outputs of academic research. But the technical infrastructure of the current scholarly communication system is not yet ready to support TDM to its full potential, even for open access outputs. To address this problem, Petr Knoth, Nancy Pontika and Lucas Anastasiou have developed the CORE Publisher Connector, a toolkit service designed to assist text miners in accessing content through a single machine interface. The Connector aims to solve the heterogeneity among publisher APIs and assist text miners with data collection, provide a centralised point of access to all openly available scientific publications, and provide a high-performance, constantly updated access interface.

Research Data Curation Bibliography, Version 8 | Digital Curation and Digital Preservation Works | Open Access Works | Digital Scholarship | Digital Scholarship Sitemap

HathiTrust Research Center User Requirements Study White Paper

Posted in Digital Humanities, Reports and White Papers, Text and Data Mining on March 15th, 2018

Eleanor Dickson et al. have self-archived "HathiTrust Research Center User Requirements Study White Paper."

Here's an excerpt:

This paper presents findings from an investigation into trends and practices in humanities and social sciences research that incorporates text data mining. As affiliates of the HathiTrust Research Center (HTRC), the purpose of our study was to illuminate researcher needs and expectations for text data, tools, and training for text mining in order to better understand our current and potential user community. Results of our study have and will continue to inform development of HTRC tools and services for computational text analysis.

"Text Data Mining from the Author’s Perspective: Whose Text, Whose Mining, and to Whose Benefit?"

Posted in Digital Curation & Digital Preservation, Mass Digitization, Text and Data Mining on March 14th, 2018

Christine L. Borgman has self-archived "Text Data Mining from the Author's Perspective: Whose Text, Whose Mining, and to Whose Benefit?"

Here's an excerpt:

Given the many technical, social, and policy shifts in access to scholarly content since the early days of text data mining, it is time to expand the conversation about text data mining from concerns of the researcher wishing to mine data to include concerns of researcher-authors about how their data are mined, by whom, for what purposes, and to whose benefits.

An Analytical Review of Text and Data Mining Practices and Approaches in Europe

Posted in Data Curation, Open Data, and Research Data Management, Legislation and Government Regulation, Reports and White Papers, Text and Data Mining on May 5th, 2016

OpenForum Europe has released An Analytical Review of Text and Data Mining Practices and Approaches in Europe: Policy Recommendations in View of the Upcoming Copyright Legislative Proposal.

Here's an excerpt:

Europe needs a regime which enables any researcher, citizen, company or other entity to engage in TDM activities, using material to which they have lawful access, wherever they feel there is a good idea. The exact commercial rewards can be managed at subsequent stages, depending on the implementation of the mining outcome. The protection could be considered at the point at which some clearly commercially beneficial project, product, service, business or company has emerged.

"NIH Manuscript Collection Optimized for Text-Mining and More"

Posted in Open Access, Publishing, Scholarly Journals, Text and Data Mining on December 7th, 2015

NIH has released "NIH Manuscript Collection Optimized for Text-Mining and More."

Here's an excerpt:

You can download the entire PMC collection of NIH-supported author manuscripts as a package in either XML or plain text formats. The collection will encompass all NIH manuscripts posted to PMC since July 2008. While the public can access the articles' full text and accompanying figures, tables, and multimedia on the PMC Web site, the newly available article packages include full text only, in a form that facilitates text-mining.

"The Social, Political and Legal Aspects of Text and Data Mining (TDM)"

Posted in Copyright, Emerging Technologies, Publishing, Text and Data Mining on November 17th, 2014

Michelle Brook, Peter Murray-Rust, and Charles Oppenheim have published "The Social, Political and Legal Aspects of Text and Data Mining (TDM)" in D-Lib Magazine.

Here's an excerpt:

The ideas of textual or data mining (TDM) and subsequent analysis go back hundreds if not thousands of years. Originally carried out manually, textual and data analysis has long been a tool which has enabled new insights to be drawn from text corpora. However, for the potential benefits of TDM to be unlocked, a number of non-technological barriers need to be overcome. These include legal uncertainty resulting from complicated copyright, database rights and licensing, the fact that some publishers are not currently embracing the opportunities TDM offers the academic community, and a lack of awareness of TDM among many academics, alongside a skills gap.

Digital Scholarship | "A Quarter-Century as an Open Access Publisher"

"Response to Elsevier’s Text and Data Mining Policy: A LIBER Discussion Paper"

Posted in Data Curation, Open Data, and Research Data Management, Open Access, Publishing, Scholarly Journals, Text and Data Mining on March 31st, 2014

LIBER has released "Response to Elsevier's Text and Data Mining Policy: A LIBER Discussion Paper."

Here's an excerpt from the announcement:

LIBER believes that the right to read is the right to mine and that licensing will never bridge the gap in the current copyright framework as it is unscalable and resource intensive. Furthermore, as this discussion paper highlights, licensing has the potential to limit the innovative potential of digital research methods by:

  1. restricting the tools that researchers can use
  2. limiting the way in which research results can be made available
  3. impacting on the transparency and reproducibility of research results.

Digital Scholarship | Digital Scholarship Publications Overview | Sitemap

"Text & Data Mining—A Librarian Overview"

Posted in Emerging Technologies, License Agreements/Contracts, Text and Data Mining on August 23rd, 2013

IFLA has released "Text & Data Mining—A Librarian Overview" by Ann Okerson.

Here's an excerpt:

Text and data mining offers exciting research opportunities over a broad range of fields. . . .

This paper reviews some of the possibilities for such work and outlines the challenges and the way ahead for librarians. One challenge lies in the terms by which data sets are licensed and made available to academic and other users; librarians need to be proactive in ensuring that these terms are favorable for the kind of use researchers will need and that the resources themselves are available in a format that allows innovative mining-based research. Another challenge is the need to support users who wish to engage in text and data mining with limited experience, especially when they approach data sets made available through library resources. Librarians should develop the expertise to support their users by making data resources available to them on favorable terms and supporting their mining efforts.

The Value and Benefits of Text Mining

Posted in Data Curation, Open Data, and Research Data Management, Reports and White Papers, Text and Data Mining on March 14th, 2012

JISC has released The Value and Benefits of Text Mining.

Here's an excerpt:

Vast amounts of new information and data are generated everyday through economic, academic and social activities. This sea of data, predicted to increase at a rate of 40% p.a., has significant potential economic and societal value. Techniques such as text and data mining and analytics are required to exploit this potential. . . .

To date there has been no systematic analysis of the value and benefits of text mining to UK further and higher education (UKFHE), nor of the additional value and benefits that might result from the exceptions to copyright proposed by Hargreaves. JISC thus commissioned this analysis of 'The Value and Benefits of Text Mining to UK Further and Higher Education'.

We have explored the costs, benefits, barriers and risks associated with text mining within UKFHE research using the approach to welfare economics laid out in the UK Treasury best practice guidelines for evaluation [2]. We gathered our evidence from consultations with key stakeholders and a set of case studies.

Institutional Repository and ETD Bibliography 2011 | Digital Scholarship

"Teaching with Google Books: Research, Copyright, and Data Mining"

Posted in Copyright, E-Books, Google and Other Search Engines, Mass Digitization, Text and Data Mining on March 12th, 2012

Nathan Rinne has self-archived "Teaching with Google Books: Research, Copyright, and Data Mining" in E-LIS.

Here's an excerpt:

Google's Google Books site is a rich resource that is probably underutilized by most educators. It has all kinds of potential for a) getting students into the research process in a way that they will enjoy (for example, they can see how a famous quote has been used/quoted, find out which books cite the journal article they are interested in, or check to see if a specific book covers a topic that they want to explore, etc.); b) teaching them about the deeper civic purpose and the evolving state of copyright law; and, c) exploring, with the help of Google Book's Ngram viewer, the promise and ethics surrounding the issue of data-mining and "non-consumptive" research, or research that is accomplished by "mining" books for data, as opposed to reading them.

Google Books Bibliography | Digital Scholarship

Stanford University Preparing Proposal for Text Mining Center Providing Access to 30 Million Digitized Books Plus Highwire Journals

Posted in E-Books, Google and Other Search Engines, Mass Digitization, Text and Data Mining on October 28th, 2009

In "Possible Text Mining Opportunity at Stanford," Matthew Jockers describes a research proposal being developed at Stanford University for a text mining center that would provide access to 30 million digitized books plus Highwire Journals.

Here's an excerpt:

As I'm sure many of you already know, Stanford has been closely involved with Google's book scanning project, and we (Stanford) are currently preparing a proposal for the creation of a text mining / analysis Center on campus. The core assets of the proposed Center would include all of the Google data (approx. 30 million books) plus all of our Highwire data and all of our licensed content. We see a wide range of research opportunities for this collection, and we are envisioning a Center that would offer various levels of interaction with scholars. In particular we envision a "tiered" service model that would, on one hand, allow technically challenged researchers to work with Center staff in formulating research questions and, on the other, an opportunity for more technically advanced scholars to write their own algorithms and run them on the corpus. We are imagining the Center as both a resource and as a physical place, a place that will offer support to both internal and external scholars and graduate students.

Scholarship in the Age of Abundance: Enhancing Historical Research with Text-Mining and Analysis Tools Project

Posted in Digital Humanities, Scholarly Communication, Text and Data Mining on February 5th, 2008

The Center for History and New Media's Scholarship in the Age of Abundance: Enhancing Historical Research with Text-Mining and Analysis Tools project has been awarded a two-year grant from the National Endowment for the Humanities.

Here's an excerpt from "Enhancing Historical Research with Text-Mining and Analysis Tools":

We will first conduct a survey of historians to examine closely their use of digital resources and prospect for particularly helpful uses of digital technology. We will then explore three main areas where text mining might help in the research process: locating documents of interest in the sea of texts online; extracting and synthesizing information from these texts; and analyzing large-scale patterns across these texts. A focus group of historians will be used to assess the efficacy of different methods of text mining and analysis in real-world research situations in order to offer recommendations, and even some tools, for the most promising approaches.


DigitalKoans

Digital Scholarship

Copyright © 2005-2020 by Charles W. Bailey, Jr.

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International license.