Open Annotation Core Data Model

The Open Annotation Collaboration has released the draft "Open Annotation Core Data Model."

Here's an excerpt:

The Open Annotation Core Data Model specifies an interoperable framework for creating associations between related resources, annotations, using a methodology which conforms to the Architecture of the World Wide Web. Open Annotations can easily be shared between platforms, with sufficient richness of expression to satisfy complex requirements while remaining simple enough to also allow for the most common use cases, such as attaching a piece of text to a single web resource.

An Annotation is considered to be a set of connected resources, including a body and target, and conveys that the body is somehow about the target. The full model supports additional functionality, enabling semantic tagging, embedding content, selecting segments of resources, choosing the appropriate representation of a resource and providing styling hints for consuming clients.
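As a rough illustration of the body/target relationship described in the excerpt, here is a minimal Python sketch. The class names, field names, and example URIs are hypothetical and are not taken from the specification, which defines its own vocabulary; this only shows the idea that an annotation connects a body to a target and conveys that the body is "about" the target.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """A web resource identified by a URI (hypothetical helper, not from the spec)."""
    uri: str


@dataclass(frozen=True)
class Annotation:
    """Illustrative only: links a body resource to the target it is about."""
    body: Resource
    target: Resource

    def describe(self) -> str:
        # Express the "body is somehow about the target" relationship as text.
        return f"{self.body.uri} is about {self.target.uri}"


# The common case from the excerpt: a piece of text attached to one web resource.
note = Annotation(
    body=Resource("http://example.org/comment1"),
    target=Resource("http://example.org/page1"),
)
print(note.describe())
```

The fuller model mentioned in the excerpt (semantic tags, segment selection, styling hints) would need additional structure that this sketch does not attempt to cover.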

See also the draft “Open Annotation Extension Specification.”

| Research Data Curation Bibliography | Digital Scholarship |

TACC Launches University of Texas Data Repository with Six Petabytes of Data Storage

The Texas Advanced Computing Center has launched the University of Texas Data Repository.

Here's an excerpt from the press release:

The much-anticipated University of Texas Data Repository (UTDR) named “Corral” is available to researchers at all 15 University of Texas System institutions, the Texas Advanced Computing Center (TACC) at The University of Texas at Austin announced today.

The data repository is part of the overall University of Texas Research Cyberinfrastructure (UTRC) project, a $23 million initiative announced in December 2010 to enable world-class research and foster stronger collaborations among researchers in Texas and around the world. The UTRC project ensures that researchers across Texas can effectively use advanced computing capabilities, including high-performance computing for simulation and analysis, high-capacity storage for large digital data collections, and high-bandwidth networking connecting institutions and resources.

As one of the largest online storage systems available to academic researchers in the United States, Corral provides six petabytes of data, which is equal to 50 times the size of the entire collection of DVDs at Netflix. University of Texas System researchers whose data needs outstrip their local capacity are invited to apply for allocations on Corral using the Allocations Request System available through the TACC User Portal.
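Taking the press release's comparison at face value, the implied size of the DVD collection it refers to can be worked out with simple arithmetic. The short sketch below assumes decimal (SI) units, where one petabyte is 1,000 terabytes; the press release does not say which convention it uses.

```python
# Assumption: decimal (SI) units, 1 PB = 1000 TB.
TERABYTES_PER_PETABYTE = 1000

corral_capacity_tb = 6 * TERABYTES_PER_PETABYTE  # six petabytes, per the press release
ratio = 50                                       # "50 times the size", per the press release

# Size of the comparison collection implied by the two figures above.
implied_collection_tb = corral_capacity_tb / ratio
print(f"Implied collection size: {implied_collection_tb:.0f} TB")
```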

Report on Peer Review of Research Data in Scholarly Communication

The Alliance for Permanent Access to the Records of Science Network has released the Report on Peer Review of Research Data in Scholarly Communication.

Here's an excerpt:

This report documents ideas, attitudes, developments and discussion concerning quality assurance of research data. The focus is on action taken by scientists, e-infrastructure providers and scientific journals. Their measures are documented and categorized. Future fields of research are to be described based on this work.

| Digital Curation and Preservation Bibliography 2010: "If you're looking for a reading list that will keep you busy from now until the end of time, this is your one-stop shop for all things digital preservation." — "Digital Preservation Reading List," Preservation Services at Dartmouth College weblog, February 21, 2012. | Digital Scholarship |

Monash University’s Research Data Management Strategy and Strategic Plan 2012-2015

Monash University has released its Research Data Management Strategy and Strategic Plan 2012-2015.

Here's an excerpt:

The Research Data Management Strategy and Strategic Plan 2012-2015 outlines an extended program of activities to holistically address technology, professional development and cultural change. The strategy takes as its starting point the following statement of intent.

Monash University recognises that research data that is better managed, more discoverable and available for re-use will contribute to increased research impact, enhanced research practice (including collaboration) and improved education outcomes. The University aims to maintain its national leadership role around research data management and to fulfil compliance requirements and community expectations. All members of the Monash University community share responsibility to improve research data management in a coordinated and integrated way. This strategy supports the research, education and professional services strategies developed as part of the Monash Futures program.

| Digital Curation and Preservation Bibliography 2010 | Digital Scholarship |

DuraSpace Gives Automatic DuraCloud Access to Internet2 Members

DuraSpace has given automatic DuraCloud access to Internet2 members.

Here's an excerpt from the press release:

DuraSpace and Internet2 announced today at the Spring 2012 Internet2 Member Meeting that Internet2 members now have automatic access to DuraCloud [http://duracloud.org], a trusted service for archiving and managing content in the cloud featuring one-click creation of many copies, in multiple locations with several providers.

DuraCloud is the first Internet2 NET+ community-developed service aimed at meeting the preservation needs of Internet2 members. As the only managed software service that lets organizations archive content across more than one cloud provider, DuraCloud ensures that irreplaceable documents, imagery and videos are always accessible.

Here's a list of higher education Internet2 members.

Read more about it at "Internet2, 16 Major Technology Companies Announce Cloud Service Partnerships to Benefit the Nation's Universities."

| Institutional Repository and ETD Bibliography 2011 | Digital Scholarship |

Harvard Library Releases over 12 Million Bibliographic Records under CC0 1.0 Public Domain Dedication

The Harvard Library has released over 12 million bibliographic records under the CC0 1.0 Public Domain Dedication license.

Here's an excerpt from the press release:

The Harvard Library announced it is making more than 12 million catalog records from Harvard’s 73 libraries publicly available.

The records contain bibliographic information about books, videos, audio recordings, images, manuscripts, maps, and more. The Harvard Library is making these records available in accordance with its Open Metadata Policy and under a Creative Commons 0 (CC0) public domain license. In addition, the Harvard Library announced its open distribution of metadata from its Digital Access to Scholarship at Harvard (DASH) scholarly article repository under a similar CC0 license.

"The Harvard Library is committed to collaboration and open access. We hope this contribution is one of many steps toward sharing the vital cultural knowledge held by libraries with all," said Mary Lee Kennedy, Senior Associate Provost for the Harvard Library.

| Open Access Bibliography: Liberating Scholarly Literature with E-Prints and Open Access Journals: Those wishing to learn more about the open access movement would be well served by turning to Bailey's Open Access Bibliography. . . .This title is a major contribution to the study of the open access movement in general, as well as its emergence in the early twenty-first century. — Mary Aycock, Library Resources and Technical Services 52, no. 3 (2008): 212-213. | Digital Scholarship |

Analysis of Results from the Research Data Preservation Survey

The London School of Economics and Political Science Digital Communication Enhancement project has released Analysis of Results from the Research Data Preservation Survey.

Here's an excerpt from "Results of Researcher Survey":

The survey showed the general lack of awareness amongst LSE [London School of Economics and Political Science] researchers of digital data preservation: this isn't a criticism, if we had found a good awareness we would probably have to stop the project! We also found that there are cultural challenges to address as well as the need for more technical training if researchers are to send their research data and materials into the future with confidence.

ESIP "Interagency Data Stewardship/Principles" and "Interagency Data Stewardship/Citations/Provider Guidelines" Approved

The Federation of Earth Science Information Partners has approved its "Interagency Data Stewardship/Principles" and "Interagency Data Stewardship/Citations/Provider Guidelines."

Here's an excerpt from "Data Management and the ESIP Federation" by Ruth Duerr:

Why do I think that this was significant? Simply because it represents the first time that a large and diverse set of US Mission agencies, data centers, research groups, commercial companies, tool developers, and even individuals have come together and agreed that data stewardship is important. They saw it to be important enough to codify into standard practices for data and recognized that data citation is something that needs to become part of the culture of science and that it is past time to make that happen.

Being Open About Data: Analysis of the UK Open Data Policies and Applicability of Open Data

The Finnish Institute has released Being Open About Data: Analysis of the UK Open Data Policies and Applicability of Open Data.

Here's an excerpt:

This paper presents an analysis of the recent UK open-data policies and draws an argument on how governments can sustainably promote the development and use of open data. Moreover, research contributes to the ongoing discussion on the normative values of openness by presenting a conceptual analysis of open data as an integral part of the freedom-of-information continuum.

| Scholarly Electronic Publishing Bibliography | Digital Scholarship |

Research Data Stewardship at UNC: Recommendations for Scholarly Practice and Leadership

The University of North Carolina at Chapel Hill School of Information and Library Science has released Research Data Stewardship at UNC: Recommendations for Scholarly Practice and Leadership.

Here's an excerpt:

This working report emanates from efforts to identify policy options for digital research data stewardship at UNC. In January 2011, the UNC Provost charged a task force on the stewardship of digital research data to make recommendations about storage and maintenance of digital data produced in the course of UNC-based research (see Appendix 1 for the task force charge). During the 2011 calendar year, the task force conducted an environmental scan of research data stewardship policies and trends, discussed issues, collected data on campus using interviews and a survey, and developed a set of principles and associated courses of action for the campus to consider (see Appendix 2 for a list of task force meetings). We believe that the principles are in concert with the UNC mission and its academic plan and can serve as the basis for policies and implementations. We recognize, however, that scholarly data and processes are highly diverse and that the technologies and economics of stewardship are changing rapidly. We thus view the implementation alternatives and recommendations here as first steps in what should be an ongoing process that serves the research data stewardship needs of scholars, the campus, and humanity. We offer this document as a working report that we hope will serve as an adaptable framework for research data stewardship across disciplines at UNC and beyond.

"Fact Sheet: Big Data Across the Federal Government"

The White House Office of Science and Technology Policy has released "Fact Sheet: Big Data Across the Federal Government."

Here's an excerpt:

Below are highlights of ongoing Federal government programs that address the challenges of, and tap the opportunities afforded by, the big data revolution to advance agency missions and further scientific discovery and innovation.

The Value and Benefits of Text Mining

JISC has released The Value and Benefits of Text Mining.

Here's an excerpt:

Vast amounts of new information and data are generated everyday through economic, academic and social activities. This sea of data, predicted to increase at a rate of 40% p.a., has significant potential economic and societal value. Techniques such as text and data mining and analytics are required to exploit this potential. . . .

To date there has been no systematic analysis of the value and benefits of text mining to UK further and higher education (UKFHE), nor of the additional value and benefits that might result from the exceptions to copyright proposed by Hargreaves. JISC thus commissioned this analysis of 'The Value and Benefits of Text Mining to UK Further and Higher Education'.

We have explored the costs, benefits, barriers and risks associated with text mining within UKFHE research using the approach to welfare economics laid out in the UK Treasury best practice guidelines for evaluation [2]. We gathered our evidence from consultations with key stakeholders and a set of case studies.

"The Informatics Transform: Re-engineering Libraries for the Data Decade"

Liz Lyon has published "The Informatics Transform: Re-engineering Libraries for the Data Decade" in the latest issue of the International Journal of Digital Curation.

Here's an excerpt:

In this paper, Liz Lyon explores how libraries can re-shape to better reflect the requirements and challenges of today's data-centric research landscape. The Informatics Transform presents five assertions as potential pathways to change, which will help libraries to re-position, re-profile, and re-structure to better address research data management challenges. The paper deconstructs the institutional research lifecycle and describes a portfolio of ten data support services which libraries can deliver to support the research lifecycle phases. Institutional roles and responsibilities for research data management are also unpacked, building on the framework from the earlier Dealing with Data Report. Finally, the paper examines critical capacity and capability challenges and proposes some innovative steps to addressing the significant skills gaps.

"Peer-Reviewed Open Research Data: Results of a Pilot"

Marjan Grootveld and Jeff van Egmond have self-archived "Peer-Reviewed Open Research Data: Results of a Pilot" in E-LIS.

Here's an excerpt:

Peer review of publications is at the core of science and primarily seen as instrument for ensuring research quality. However, it is less common to value independently the quality of the underlying data as well. In the light of the "data deluge" it makes sense to extend peer review to the data itself and this way evaluate the degree to which the data are fit for re-use. This paper describes a pilot study at EASY—the electronic archive for (open) research data at our institution. In EASY, researchers can archive their data and add metadata themselves. Devoted to open access and data sharing, at the archive we are interested in further enriching these metadata with peer reviews.

As pilot we established a workflow where researchers who have downloaded data sets from the archive were asked to review the downloaded data set. This paper describes the details of the pilot including the findings, both quantitative and qualitative. Finally we discuss issues that need to be solved when such a pilot should be turned into structural peer review functionality of the archiving system.

| Digital Scholarship |

The Open Data Handbook

The Open Knowledge Foundation has released The Open Data Handbook.

Here's an excerpt from the announcement:

From a basic introduction of the "what and why" of open data, the Handbook goes on to discuss the practicalities of making data open – the "how". It gives advice on everything from choosing a file format and applying a license, to motivating the community and telling the world. Clear explanations, illustrative examples and technical recommendations make the Handbook suitable for people with all levels of experience, from the absolute beginner to the seasoned open data professional.

The Handbook is divided into short chapters which cover individual aspects of open data. It can be read in a single sitting, or dipped into as a reference work.

| Digital Curation and Preservation Bibliography | Digital Scholarship |

Review of Data Management Lifecycle Models

Alex Ball has self-archived Review of Data Management Lifecycle Models in the University of Bath institutional repository.

Here's an excerpt:

The importance of lifecycle models is that they provide a structure for considering the many operations that will need to be performed on a data record throughout its life. Many curatorial actions can be made considerably easier if they have been prepared for in advance – even at or before the point of record creation. For example, a repository can be more certain of the preservation actions it can perform if the rights and licensing status of the data has already been clarified, and researchers are more likely to be able to detail the methodologies and workflows they used if they record them at the time.

Data-Intensive Research: Community Capability Model Framework (Consultation Draft)

The Community Capability Model for Data-Intensive Research project has released a consultation draft of the Community Capability Model Framework.

Here's an excerpt:

The Community Capability Model Framework is a tool developed by UKOLN, University of Bath, and Microsoft Research to assist institutions, research funders and researchers in growing the capability of their communities to perform data-intensive research by

  • profiling the current readiness or capability of the community,
  • indicating priority areas for change and investment, and
  • developing roadmaps for achieving a target state of readiness.

The Framework is comprised of eight capability factors representing human, technical and environmental issues. Within each factor are a series of community characteristics that are relevant for determining the capability or readiness of that community to perform data-intensive research.

| E-science and Academic Libraries Bibliography | Digital Scholarship |

Collaborative Yet Independent: Information Practices in the Physical Sciences

The Research Information Network, the Institute of Physics, Institute of Physics Publishing, and the Royal Astronomical Society have released Collaborative Yet Independent: Information Practices in the Physical Sciences.

Here's an excerpt:

In many ways, the physical sciences are at the forefront of using digital tools and methods to work with information and data. However, the fields and disciplines that make up the physical sciences are by no means uniform, and physical scientists find, use, and disseminate information in a variety of ways. This report examines information practices in the physical sciences across seven cases, and demonstrates the richly varied ways in which physical scientists work, collaborate, and share information and data.

| Digital Bibliographies | Digital Scholarship |

Open Access: Online Survey on Scientific Information in the Digital Age

The European Commission has released the Online Survey on Scientific Information in the Digital Age.

Here's an excerpt:

Respondents were asked if there is no access problem to scientific publications in Europe: 84% disagreed or disagreed strongly with the statement. The high prices of journals/subscriptions (89%) and limited library budgets (85%) were signalled as the most important barriers to accessing scientific publications. More than 1,000 respondents (90%) supported the idea that publications resulting from publicly funded research should, as a matter of principle, be in open access (OA) mode. An even higher number of respondents (91%) agreed or agreed strongly that OA increased access to and dissemination of scientific publications. Self-archiving ("green OA") or a combination of self-archiving and OA publishing ("gold OA") were identified as the preferred ways that public research policy should facilitate in order to increase the number and share of scientific publications available in OA. Respondents were asked, in the case of self-archiving ("green OA"), what the desirable embargo period is (period of time during which publication is not yet open access): a six-month period was favoured by 56% of respondents (although 25% disagree with this option).

| Transforming Scholarly Publishing through Open Access: A Bibliography | Digital Scholarship Publications Overview |

ARL, Johns Hopkins University Libraries, and SPARC Reply to White House RFI on Public Access to Digital Data

The Association of Research Libraries, the Johns Hopkins University Libraries, and SPARC have replied to the White House's Request for Information: Public Access to Digital Data Resulting from Federally Funded Scientific Research.

Here's an excerpt:

Question 1

What specific Federal policies would encourage public access to and the preservation of broadly valuable digital data resulting from federally funded scientific research, to grow the U.S. economy and improve the productivity of the American scientific enterprise?

Comment 1

The most effective Federal policies in this regard would mandate digital data deposit into publicly accessible repositories. In the absence of such policies, there are already cases of digital data which have been lost or remain inaccessible or accessible only with high barriers. While laudable efforts such as the NSF and NIH data management plans move the community in the direction of supporting U.S. economic growth and productivity, the reality is that many researchers continue to strictly interpret the requirement as sharing data based on specific requests or personal provisions. The Federal policy framework should move public access to digital data away from the current idiosyncratic environment to a systematic approach that lowers barriers to data access, discovery, sharing and re-use.

Instead of relying upon individual investigators to interpret and support public access through a point to point network (e.g., researcher provides digital data upon request), Federal policies should ensure that public access can occur through well managed, sustained, preservation archives that enable a legally and policy compliant peer to peer model for sharing. A useful metric for full-fledged public access to digital data is whether someone (or some machine) other than the original data producer can discover, access, interpret and use the digital data without contacting the original data producer.

See also Columbia University Libraries/Information Services' reply and the Creative Commons' reply.

| Transforming Scholarly Publishing through Open Access: A Bibliography | Digital Scholarship |

Three New Documents about Creative Commons Licenses for Data

The Creative Commons has released three new documents about the use of its licenses for data: "Data," "Data and CC Licenses," and "CC0 Use for Data."

Here's an excerpt from the announcement by Sarah Hinchliff Pearson:

We have done a lot of thinking about data in the past year. As a result, we have recently published a set of detailed FAQs designed to help explain how CC licenses work with data and databases.

These FAQs are intended to:

  1. alert CC licensors that some uses of their data and databases may not trigger the license conditions,
  2. reiterate to licensees that CC licenses do not restrict them from doing anything they are otherwise permitted to do under the law, and
  3. clear up confusion about how the version 3.0 CC licenses treat sui generis database rights.

| Digital Scholarship's Weblogs and Tweets | Digital Scholarship |