"The Open Knowledge Foundation: Open Data Means Better Science"

Jennifer C. Molloy has published "The Open Knowledge Foundation: Open Data Means Better Science" in PLoS Biology.

Here's an excerpt:

Data provides the evidence for the published body of scientific knowledge, which is the foundation for all scientific progress. The more data is made openly available in a useful manner, the greater the level of transparency and reproducibility and hence the more efficient the scientific process becomes, to the benefit of society. This viewpoint is becoming mainstream among many funders, publishers, scientists, and other stakeholders in research, but barriers to achieving widespread publication of open data remain. The Open Data in Science working group at the Open Knowledge Foundation is a community that works to develop tools, applications, datasets, and guidelines to promote the open sharing of scientific data. This article focuses on the Open Knowledge Definition and the Panton Principles for Open Data in Science. We also discuss some of the tools the group has developed to facilitate the generation and use of open data and the potential uses that we hope will encourage further movement towards an open scientific knowledge commons.

| Digital Scholarship's Digital Bibliographies | Digital Scholarship |

"Willingness to Share Research Data Is Related to the Strength of the Evidence and the Quality of Reporting of Statistical Results"

Jelte M. Wicherts, Marjan Bakker, and Dylan Molenaar have published "Willingness to Share Research Data Is Related to the Strength of the Evidence and the Quality of Reporting of Statistical Results" in PLoS ONE.

Here's an excerpt:

We related the reluctance to share research data for reanalysis to 1148 statistically significant results reported in 49 papers published in two major psychology journals. We found the reluctance to share data to be associated with weaker evidence (against the null hypothesis of no effect) and a higher prevalence of apparent errors in the reporting of statistical results. The unwillingness to share data was particularly clear when reporting errors had a bearing on statistical significance.

Costs and Benefits of Data Provision: Report to the Australian National Data Service

The Australian National Data Service has released Costs and Benefits of Data Provision: Report to the Australian National Data Service by John Houghton.

Here's an excerpt:

This report presents case studies exploring the costs and benefits that PSI [Public Sector Information] producing agencies and their users experience in making information freely available, and preliminary estimates of the wider economic impacts of open access to PSI. In doing so, it outlines a possible method for cost-benefit analysis at the agency level and explores the data requirements for such an analysis, recognising that few agencies will have all of the data required. . . .

What this study demonstrates is that the direct and measurable benefits of making PSI available freely and without restrictions on use typically outweigh the costs. When one adds the longer-term benefits that we cannot fully measure, and may not even foresee, the case for open access appears to be strong.

A Surfboard for Riding the Wave—Towards a Four Country Action Programme on Research Data

The Knowledge Exchange has released A Surfboard for Riding the Wave—Towards a Four Country Action Programme on Research Data.

Here's an excerpt from the announcement:

The report not only offers an overview of the present activities and challenges in the field of research data in Denmark, Germany, the Netherlands and the United Kingdom but also outlines an action programme for the four countries in realising a collaborative data infrastructure. This report is a response to the Riding the Wave report which was published by the High Level Expert Group on Scientific Data. . . .

In the report four key drivers are addressed: incentives for researchers, training in relation to researchers in their role as data producers and users of information infrastructure, organisational and technical infrastructure and, finally, the funding of the infrastructure. The report offers recommendations for actions in each of these fields for the partners and others, not only in the four partner countries, but also beyond these borders.

Based on the overview of the present situation in the four Knowledge Exchange partner countries, the report formulates three long-term strategic goals:

  1. Data sharing will be part of the academic culture
  2. Data logistics will be an integral component of academic professional life
  3. Data infrastructure will be sound, both operationally and financially.

Report on Integration of Data and Publications

The Alliance for Permanent Access has released Report on Integration of Data and Publications.

Here's an excerpt:

This report sets out to identify examples of integration between datasets and publications. Findings from existing studies carried out by PARSE.Insight, RIN, SURF and various recent publications are synthesized and examined in relation to three distinct disciplinary groups in order to identify opportunities in the integration of data.

"Linking to Data—Effect on Citation Rates in Astronomy"

Edwin A. Henneken and Alberto Accomazzi have self-archived "Linking to Data—Effect on Citation Rates in Astronomy" in arXiv.org.

Here's an excerpt:

Is there a difference in citation rates between articles that were published with links to data and articles that were not? Besides being interesting from a purely academic point of view, this question is also highly relevant for the process of furthering science. Data sharing not only helps the process of verification of claims, but also the discovery of new findings in archival data. However, linking to data still is a far cry away from being a "practice", especially where it comes to authors providing these links during the writing and submission process. You need to have both a willingness and a publication mechanism in order to create such a practice. Showing that articles with links to data get higher citation rates might increase the willingness of scientists to take the extra steps of linking data sources to their publications. In this presentation we will show this is indeed the case: articles with links to data result in higher citation rates than articles without such links.

"Openness as Infrastructure"

John Wilbanks has published "Openness as Infrastructure" in the Journal of Cheminformatics.

Here's an excerpt:

The advent of open access to peer reviewed scholarly literature in the biomedical sciences creates the opening to examine scholarship in general, and chemistry in particular, to see where and how novel forms of network technology can accelerate the scientific method. This paper examines broad trends in information access and openness with an eye towards their applications in chemistry.

Data Management Planning: Open Source DMPTool Launched by University of California Curation Center and Others

The University of California Curation Center has announced the launch of DMPTool.

Here's an excerpt from the press release:

The University of California and several other major research institutions have partnered to develop the DMPTool, a flexible online application to help researchers generate data management plans—simple but effective documents for ensuring good data stewardship. These plans increasingly are being required by funders such as the National Science Foundation (NSF), the National Institutes of Health (NIH) and the Gordon and Betty Moore Foundation (GBMF). The DMPTool supports data management plans and funder requirements across the disciplines, including the humanities and physical, medical and social sciences. . . .

The DMPTool is open source, freely available and easily configurable to reflect an institution's local policies and information. Users of the DMPTool can view sample plans, preview funder requirements and view the latest changes to their plans. It permits the user to create an editable document for submission to a funding agency and can accommodate different versions as funding requirements change. Not only can researchers use the tool to generate plans compliant to funder requirements, but institutions also can use the tool to present information and policies relevant to data management and to foster collaboration among faculty, the institutional libraries, contracts and grants offices, and academic computing. . . .

Project partners include the University of California Curation Center (UC3) at the California Digital Library, the UCLA Library, the UC San Diego Libraries, the Smithsonian Institution, the University of Virginia Library, the University of Illinois at Urbana-Champaign, DataONE, and the United Kingdom's Digital Curation Centre. Working collaboratively, these institutions have consolidated their expertise and reduced their costs.

"Federal Funding Agencies: Data Management and Sharing Policies"

The California Digital Library has released "Federal Funding Agencies: Data Management and Sharing Policies."

Here's an excerpt:

The Office of Management and Budget (OMB) Circular A-110 provides the federal administrative requirements for grants and agreements with institutions of higher education, hospitals and other non-profit organizations. In 1999 Circular A-110 was revised to provide public access under some circumstances to research data through the Freedom of Information Act (FOIA).

Funding agencies have implemented the OMB requirement in various ways. The table below summarizes the data management and sharing requirements of primary US federal funding agencies.

Cite Datasets and Link to Publications

The Digital Curation Centre has released Cite Datasets and Link to Publications.

Here's an excerpt:

This guide will help you create links between your academic publications and the underlying datasets, so that anyone viewing the publication will be able to locate the dataset and vice versa. It provides a working knowledge of the issues and challenges involved, and of how current approaches seek to address them. This guide should interest researchers and principal investigators working on data-led research, as well as the data repositories with which they work.

E-science and Academic Libraries Bibliography

Digital Scholarship has released the E-science and Academic Libraries Bibliography. It includes English-language articles, books, editorials, and technical reports that are useful in understanding the broad role of academic libraries in e-science efforts. The scope of this brief selective bibliography is narrow, and it does not cover data curation and research data management issues in libraries in general. Most sources have been published from 2007 through October 18, 2011; however, a limited number of key sources published prior to 2007 are also included. The bibliography includes links to freely available versions of included works, such as e-prints and open access articles.

Changing the Conduct of Science in the Information Age

The National Science Foundation has released Changing the Conduct of Science in the Information Age.

Here's an excerpt:

The U.S. National Science Foundation (NSF) held a workshop titled "Changing the Conduct of Science in the Information Age" on November 12, 2010, to promote international cooperation in such policy areas as the promotion of data access, the development of technical solutions for open data platforms, and attribution for research contributions. This report describes the discussions, findings, and suggestions generated by the distinguished group of international workshop participants. . . .

There was a strong consensus that this vision could be achieved with the help of a concerted, collaborative effort by international funding agencies to:

  1. Establish a system of persistent identifiers for researchers and their outputs;
  2. Develop national and international pilot projects that compare different technical solutions for establishing and maintaining open data platforms, fostering the replication of scientific research, and ensuring attribution for the intellectual contributions of researchers; and
  3. Foster formal and informal training to develop scientists' skills in knowledge and data access, as well as data analysis.

Data Centres: Their Use, Value and Impact

The Research Information Network has released Data Centres: Their Use, Value and Impact.

Here's an excerpt:

In recent years, the value of data as a primary research output has begun to be increasingly recognised. New technology has made it possible to create, store and reuse datasets, either for new analysis or for combination with other data in order to answer different questions. In the UK, academic researchers, funders and institutions have responded to these possibilities by supporting a number of data centres: organisations with responsibility for supplying research data to the academic community, and in some cases for collecting, storing and curating such data as well. . . .

This study sought to understand usage of UK data centres among researchers, and to examine the impact of such use upon their work. We undertook a series of initial interviews with research funders to understand the role and importance of data and data centres within various academic fields, followed by a survey of the users of five data centres. Finally, through the interviews and surveys, a set of case studies was identified where the data centre had benefited a researcher's work, and in some cases that work had gone on to have an impact in wider society.

"Extracting, Transforming and Archiving Scientific Data"

Daniel Lemire and Andre Vellino have self-archived "Extracting, Transforming and Archiving Scientific Data" in arXiv.org.

Here's an excerpt:

It is becoming common to archive research datasets that are not only large but also numerous. In addition, their corresponding metadata and the software required to analyse or display them need to be archived. Yet the manual curation of research data can be difficult and expensive, particularly in very large digital repositories, hence the importance of models and tools for automating digital curation tasks. The automation of these tasks faces three major challenges: (1) research data and data sources are highly heterogeneous, (2) future research needs are difficult to anticipate, (3) data is hard to index. To address these problems, we propose the Extract, Transform and Archive (ETA) model for managing and mechanizing the curation of research data. Specifically, we propose a scalable strategy for addressing the research-data problem, ranging from the extraction of legacy data to its long-term storage. We review some existing solutions and propose novel avenues of research.

Data Privacy Legislation: An Analysis of the Current Legislative Landscape and the Implications for Higher Education

EDUCAUSE has released Data Privacy Legislation: An Analysis of the Current Legislative Landscape and the Implications for Higher Education.

Here's an excerpt:

With the ubiquity of mobile devices and the increases in data breaches, Congress has responded with bipartisan support for comprehensive privacy legislation. As of August 2011, 18 bills have been introduced in the 112th Congress concerning data privacy. . . .

These privacy bills generally fall into three distinct areas: comprehensive online privacy protection, geolocation and mobile devices, and data security and breach notification. If enacted, many of the bills have implications for data collection, storage, and use that could affect higher education and campus IT operations and academic research.

European Commission Launches Public Consultation on Digital Scientific Information Access and Preservation

The European Commission has launched a public consultation on digital scientific information access and preservation.

Here's an excerpt from the press release:

A public consultation on access to, and preservation of, digital scientific information has been launched by the European Commission on the initiative of European Commission Vice President for the Digital Agenda Neelie Kroes and Commissioner for Research and Innovation, Máire Geoghegan-Quinn. European researchers, engineers and entrepreneurs must have easy and fast access to scientific information, to compete on an equal footing with their counterparts across the world. Modern digital infrastructures can play a key role in facilitating access. However, a number of challenges remain, such as high and rising subscription prices to scientific publications, an ever-growing volume of scientific data, and the need to select, curate and preserve research outputs. Open access, defined as free access to scholarly content over the Internet, can help address this. Scientists, research funding organisations, universities, and other interested parties are invited to send their contributions on how to improve access to scientific information. The consultation will run until 9 September 2011. . . .

Interested parties are invited to express their views on the following key science policy questions:

  • how scientific articles could become more accessible to researchers and society at large
  • how research data can be made widely available and how it could be re-used
  • how permanent access to digital content can be ensured and what barriers are preventing the preservation of scientific output

"Who Shares? Who Doesn’t? Factors Associated with Openly Archiving Raw Research Data"

Heather A. Piwowar has published "Who Shares? Who Doesn't? Factors Associated with Openly Archiving Raw Research Data" in PLoS ONE.

Here's an excerpt:

First-order factor analysis on 124 diverse bibliometric attributes of the data creation articles revealed 15 factors describing authorship, funding, institution, publication, and domain environments. In multivariate regression, authors were most likely to share data if they had prior experience sharing or reusing data, if their study was published in an open access journal or a journal with a relatively strong data sharing policy, or if the study was funded by a large number of NIH grants. Authors of studies on cancer and human subjects were least likely to make their datasets available.

JISC Managing Research Data Programme Issues Call for Grant Proposals

The JISC Managing Research Data Programme has issued a call for grant proposals.

Here's an excerpt from the notice:

A total of approximately £4.6M will be available, divided across three strands. The deadline for submissions will be 28 July 2011. . . .

The strands are as follows:

Strand A: Institutional Research Data Management Infrastructure: divided between A(1) Start-up projects to help institutions that are at an early stage of developing a research data management infrastructure; and A(2) Embedding projects to help institutions enhance and extend an existing pilot research data management infrastructure. . . .

Strand B: Research Data Management Planning: projects to design and implement research data management plans for specific projects/departments; including supporting systems and tools. . . .

Strand C: Projects to develop and implement institutional data management planning tools/workflows.

Managing and Sharing Data: Best Practice for Researchers

The UK Data Archive has released a new edition of Managing and Sharing Data: Best Practice for Researchers.

Here's an excerpt from the announcement:

To support researchers in producing high quality research data for long-term use, the UK Data Archive has revised and expanded its popular and highly cited Managing and Sharing Data: best practice for researchers, first published in 2009.

The new third edition is 36 pages covering:

  • why and how to share research data
  • data management planning and costing
  • documenting data
  • formatting data
  • storing data
  • ethics and consent issues
  • data copyright
  • data management strategies for large investments

Open Data: UK Engineering and Physical Sciences Research Council Adopts EPSRC Policy Framework on Research Data

The UK Engineering and Physical Sciences Research Council, which is "the main UK government agency for funding research and training in engineering and the physical sciences," has adopted the EPSRC Policy Framework on Research Data.

Here's an excerpt from the document:

This policy framework sets out EPSRC's expectations concerning the management and provision of access to EPSRC-funded research data. EPSRC recognises that a range of institutional policies and practices can satisfy these expectations, and encourages research organisations to develop specific approaches which, while aligned with EPSRC's expectations, are appropriate to their own structures and cultures.

The expectations arise from seven core principles which align with the core RCUK principles on data sharing. Two of the principles are of particular importance: firstly, that publicly funded research data should generally be made as widely and freely available as possible in a timely and responsible manner; and, secondly, that the research process should not be damaged by the inappropriate release of such data.

"Tragedy of the Data Commons"

Jane Yakowitz has self-archived "Tragedy of the Data Commons" in SSRN.

Here's an excerpt:

Accurate data is vital to enlightened research and policymaking, particularly publicly available data that are redacted to protect the identity of individuals. Legal academics, however, are campaigning against data anonymization as a means to protect privacy, contending that the wealth of information available on the Internet enables malfeasors to reverse-engineer the data and identify individuals within them. Privacy scholars advocate for new legal restrictions on the collection and dissemination of research data. This Article challenges the dominant wisdom, arguing that properly de-identified data is not only safe, but of extraordinary social utility. It makes three core claims. First, legal scholars have misinterpreted the relevant literature from computer science and statistics, and thus have significantly overstated the futility of anonymizing data. Second, the available evidence demonstrates that the risks from anonymized data are theoretical – they rarely, if ever, materialize. Finally, anonymized data is crucial to beneficial social research, and constitutes a public resource – a commons – under threat of depletion. The Article concludes with a radical proposal: since current privacy policies overtax valuable research without reducing any realistic risks, law should provide a safe harbor for the dissemination of research data.

"Joining in the Enterprise of Response in the Wake of the NSF Data Management Planning Requirement"

Patricia Hswe and Ann Holt have published "Joining in the Enterprise of Response in the Wake of the NSF Data Management Planning Requirement" in the latest issue of Research Library Issues.

Here's an excerpt:

This article affords an overview of the new, leading roles libraries can adopt in the provision of data services, thus blending appraisal with advocacy. How are libraries currently giving assistance in data management planning? What recommendations can libraries make that draw from, and build on, these efforts? The article also reports on new communities of practice forming around the challenges of digital data issues, bringing together much needed knowledge and expertise not only from libraries but also from various other sectors of a university, including IT divisions, grant administration offices, and research institutes.

Digital Research Data: What Researchers Want

The SURFfoundation has released What Researchers Want.

Here's an excerpt from the announcement:

This publication reviews recent literature describing what researchers want with regard to data storage and access. It was commissioned by SURFfoundation. Fifteen recent sources were studied, covering the Netherlands, the UK, the USA, Australia, and Europe. . . .

The following factors play a role in making storage successful:

  • Tools and services must be in tune with researchers’ workflows, which are often discipline-specific (and sometimes even project-specific).
  • Researchers resist top-down and/or mandatory schemes.
  • Researchers favour a “cafeteria” model in which they can pick and choose from a set of services.
  • Tools and services must be easy to use.
  • Researchers must be in control of what happens to their data, who has access to it, and under what conditions. Consequently, they want to be sure that whoever is dealing with their data (data centre, library, etc.) will respect their interests.
  • Researchers expect tools and services to support their day-to-day work within the research project; long-term/public requirements must be subordinate to that interest.
  • The benefits of the support must be clearly visible – not in three years’ time, but now.
  • Support must be local, hands-on, and available when needed.
