Archive for the 'Big Data, Data Curation, Open Data, and Research Data Management' Category

"Joining in the Enterprise of Response in the Wake of the NSF Data Management Planning Requirement"

Posted in Big Data, Data Curation, Open Data, and Research Data Management, Digital Curation/Digital Preservation on March 24th, 2011

Patricia Hswe and Ann Holt have published "Joining in the Enterprise of Response in the Wake of the NSF Data Management Planning Requirement" in the latest issue of Research Library Issues.

Here's an excerpt:

This article affords an overview of the new, leading roles libraries can adopt in the provision of data services, thus blending appraisal with advocacy. How are libraries currently giving assistance in data management planning? What recommendations can libraries make that draw from, and build on, these efforts? The article also reports on new communities of practice forming around the challenges of digital data issues, bringing together much needed knowledge and expertise not only from libraries but also from various other sectors of a university, including IT divisions, grant administration offices, and research institutes.

| Digital Scholarship | Digital Scholarship Publications Overview | Digital Curation and Preservation Bibliography 2010 |

Digital Research Data: What Researchers Want

Posted in Big Data, Data Curation, Open Data, and Research Data Management, Reports and White Papers on March 8th, 2011

The SURFfoundation has released What Researchers Want.

Here's an excerpt from the announcement:

This publication reviews recent literature describing what researchers want with regard to data storage and access. It was commissioned by SURFfoundation. Fifteen recent sources were studied, covering the Netherlands, the UK, the USA, Australia, and Europe. . . .

The following factors play a role in making storage successful:

  • Tools and services must be in tune with researchers’ workflows, which are often discipline-specific (and sometimes even project-specific).
  • Researchers resist top-down and/or mandatory schemes.
  • Researchers favour a “cafeteria” model in which they can pick and choose from a set of services.
  • Tools and services must be easy to use.
  • Researchers must be in control of what happens to their data, who has access to it, and under what conditions. Consequently, they want to be sure that whoever is dealing with their data (data centre, library, etc.) will respect their interests.
  • Researchers expect tools and services to support their day-to-day work within the research project; long-term/public requirements must be subordinate to that interest.
  • The benefits of the support must be clearly visible – not in three years’ time, but now.
  • Support must be local, hands-on, and available when needed.

How to License Research Data

Posted in Big Data, Data Curation, Open Data, and Research Data Management, Copyright, Creative Commons/Open Licenses on February 14th, 2011

The Digital Curation Centre has released How to License Research Data.

Here's an excerpt:

This guide will help you decide how to apply a licence to your research data, and which licence would be most suitable. It should provide you with an awareness of why licensing data is important, the impact licences have on future research, and the potential pitfalls to avoid. It concentrates on the UK context, though some aspects apply internationally; it does not, however, provide legal advice. The guide should interest both the principal investigators and researchers responsible for the data, and those who provide access to them through a data centre, repository or archive.

Managing Digital Collections: A Collaborative Initiative on the South African Framework

Posted in Big Data, Data Curation, Open Data, and Research Data Management, Copyright, Digital Curation/Digital Preservation, Digital Libraries, Metadata on February 10th, 2011

The National Research Foundation has released Managing Digital Collections: A Collaborative Initiative on the South African Framework.

Here's an excerpt:

The objective of this Framework is to provide high-level principles for planning and managing the full digital collection life cycle. It aims to

  • provide an overview of some of the major components and activities involved in creating good digital collections
  • provide a sense of the landscape of digital collections management
  • identify existing resources that support the development of sound local practices
  • encourage community participation in the ongoing development of best practices for digital collection building
  • contribute to the benefits of sound data management practices, as well as the goals of data sharing and long term access
  • introduce data management and curation issues
  • assist cultural heritage organisations to create and manage complex digital collections
  • assist funding organisations who wish to encourage and support the development of good digital collections
  • advocate the use of internationally-created appropriate open community standards to ensure quality and to increase global interoperability for better exchange and re-use of data and digital content.

MIT Libraries Awarded $650,000 Grant from the Library of Congress for Exhibit 3.0 Project

Posted in Big Data, Data Curation, Open Data, and Research Data Management, Grants on January 25th, 2011

The MIT Libraries have been awarded a $650,000 grant from the Library of Congress for the Exhibit 3.0 Project.

Here's an excerpt from the press release:

The MIT Libraries has been awarded a $650,000 grant from the Library of Congress for work in collaboration with the MIT Computer Science and Artificial Intelligence Lab (CSAIL) and Zepheira, Inc. on "Exhibit 3.0," a new project to redesign and expand upon Exhibit, the popular open source software tool for searching, browsing and visualizing data on the Web. The goal is to provide libraries, cultural institutions and other organizations grappling with large amounts of digital content, with an enhanced tool that is scalable and useful for data management, visualization and navigation. According to the Library of Congress, "It is the Library's intent that this work also will further contribute to the collaborative knowledge sharing among the broader communities concerned about the critical infrastructure that will ensure sustainability and accessibility of digital content over time."

"This innovative work has already made a considerable impact on digital content communities whose data is diverse and complex. The visualizations bring new understanding to users and curators alike," said Martha Anderson, Director of the National Digital Information Infrastructure and Preservation Program at the Library of Congress.

"We're extremely fortunate to have the support of the Library of Congress on this important research," said Ann Wolpert, director of the MIT Libraries. "Our hope is that Exhibit 3.0 will be a useful tool in tackling the daunting challenge all libraries face in ensuring the future sustainability and accessibility of our digital content."

Exhibit was originally developed as part of the MIT Simile Project (simile.mit.edu), an ambitious collaboration of the MIT Libraries, the MIT CSAIL, and the World Wide Web Consortium (W3C) to explore applications of the Semantic Web to problems of information management across both large-scale digital libraries and small-scale personal collections. Exhibit runs inside a Web browser and supports many types of information using common Web standards for data publishing. Since its release, Exhibit has been used by thousands of websites worldwide across a range of diverse industries including cultural heritage, libraries, publishers, medical research, life science and government. Most recently Exhibit has been used by DATA.GOV (http://data.gov/), an Open Government Initiative by President Obama's administration to increase public access to high value data generated by the Executive Branch of the Federal Government. The application has been used to help demonstrate new ways of visualizing government data. . . .

The Exhibit 3.0 project will redesign and re-implement Exhibit to scale from small collections to very large data collections of the magnitude created by the Library of Congress and its National Digital Information Infrastructure and Preservation Program (NDIIPP). The redesigned Exhibit will be as simple to use as the current tool but more scalable, more modular, and easier to integrate into a variety of information management systems and websites—offering an improved user experience.

In addition to the Library of Congress, the MIT Libraries and other organizations that manage large quantities of data will collaborate on the project for their own collections. A major focus of the project will be to build a lively community around Exhibit, of both users of the software and software developers, to help continuously improve the open source tool. Another aspect of the new project will incorporate research by students at MIT's CSAIL (Computer Science and Artificial Intelligence Lab) on personal information management. The research will focus on improving the user experience working with data in Exhibit, and incorporating new data visualization techniques that allow users to explore data in novel ways. "Impressive data-interactive sites abound on the web, but right now you need a team of developers to create them. Exhibit demonstrated that authoring data-interactive sites can be as easy as authoring a static web page. With Exhibit 3.0 we can move from a prototype to a robust platform that anyone can use to author (not program) rich interactive information visualizations that effectively communicate with their users," said David Karger, computer science professor with CSAIL.

The project will begin in January for a period of one year, and a new website and other communication channels will be publicized soon. For more information see http://similewidgets.org/exhibit3.

DataCite Metadata Scheme for the Publication and Citation of Research Data, Version 2.0 Released

Posted in Big Data, Data Curation, Open Data, and Research Data Management, Metadata, Standards on January 24th, 2011

DataCite has released the DataCite Metadata Scheme for the Publication and Citation of Research Data, Version 2.0.

Here's an excerpt:

The DataCite Metadata Scheme is a list of core metadata properties chosen for the accurate and consistent identification of data for citation and retrieval purposes, along with recommended use instructions. At a minimum, the mandatory metadata scheme properties must be provided at the time of identifier registration. Data centres and other submitters may also choose to use the optional properties to identify their data more clearly. This metadata scheme can fulfill several key functions in support of the larger goals of DataCite. Primarily these are:

  • recommending a standard citation format for datasets, based on a small number of properties required for identifier registration;
  • providing the basis for interoperability with other data management schemas;
  • promoting dataset discovery with optional properties allowing for flexible description of the resource, including its relationship to other resources;
  • and, laying the groundwork for future services (e.g., discovery) through the use of controlled terms from both a DataCite vocabulary and external vocabularies as applicable. The DataCite vocabularies will be administered by the DataCite Metadata Supervisor who will establish and publicize procedures for submitting changes.
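To make the first point in the excerpt concrete, here is a minimal sketch (not DataCite's own code) of building a citation string from a small set of mandatory properties. It assumes the mandatory properties are Identifier, Creator, Title, Publisher, and PublicationYear, and that the recommended citation pattern is "Creator (PublicationYear): Title. Publisher. Identifier". The excerpt does not list either, so check both against the scheme document. The class name, field names, and sample values are hypothetical.

```python
from dataclasses import dataclass

# Assumption: these five names stand in for the scheme's mandatory properties.
# The excerpt above does not enumerate them; verify against the scheme itself.
MANDATORY = ("identifier", "creator", "title", "publisher", "publication_year")


@dataclass
class DatasetRecord:
    """Hypothetical container for the mandatory properties of one dataset."""
    identifier: str          # the DOI, without any "doi:" prefix
    creator: str
    title: str
    publisher: str
    publication_year: int

    def missing(self):
        """Return the names of mandatory properties that are empty."""
        return [name for name in MANDATORY if not getattr(self, name)]

    def citation(self):
        """Format the record as: Creator (PublicationYear): Title. Publisher. Identifier"""
        gaps = self.missing()
        if gaps:
            raise ValueError("missing mandatory properties: " + ", ".join(gaps))
        return (
            f"{self.creator} ({self.publication_year}): "
            f"{self.title}. {self.publisher}. doi:{self.identifier}"
        )


# Made-up sample values, for illustration only.
record = DatasetRecord(
    identifier="10.1234/example",
    creator="Doe, J.",
    title="Example Survey Data",
    publisher="Example Data Centre",
    publication_year=2011,
)
print(record.citation())
```

Refusing to format a citation when a mandatory property is empty mirrors the excerpt's requirement that the mandatory properties be supplied at identifier registration; the optional properties are left out of the sketch.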

"Data Preservation in High Energy Physics"

Posted in Big Data, Data Curation, Open Data, and Research Data Management, Digital Curation/Digital Preservation on January 20th, 2011

David M. South has self-archived "Data Preservation in High Energy Physics" in arXiv.org.

Here's an excerpt:

Data from high-energy physics (HEP) experiments are collected with significant financial and human effort and are in many cases unique. At the same time, HEP has no coherent strategy for data preservation and re-use, and many important and complex data sets are simply lost. In a period of a few years, several important and unique experimental programs will come to an end, including those at HERA, the b-factories and at the Tevatron. An inter-experimental study group on HEP data preservation and long-term analysis (DPHEP) was formed and a series of workshops were held to investigate this issue in a systematic way. The physics case for data preservation and the preservation models established by the group are presented, as well as a description of the transverse global projects and strategies already in place.

Unchartered Waters—The State of Open Data in Europe

Posted in Big Data, Data Curation, Open Data, and Research Data Management, Open Access on January 18th, 2011

CSC has released Unchartered Waters—The State of Open Data in Europe.

Here's an excerpt:

This study analyses the current state of the open data policy ecosystem and open government data offerings in nine European Member States. Since none of the countries studied currently offers a national open data portal, this study compares the statistics offices’ online data offerings. The analysis shows that they fulfill a number of open data principles but that there is still a lot of room for improvement. This study underlines that the development of data catalogues and portals should not be seen as means to an end.

America COMPETES Act Establishes Interagency Public Access Committee

Posted in Big Data, Data Curation, Open Data, and Research Data Management, Legislation and Government Regulation, Open Access, Publishing on January 17th, 2011

The signing of the America COMPETES Reauthorization Act of 2010 by President Obama establishes a new Interagency Public Access Committee. The International Association of Scientific, Technical & Medical Publishers (STM) has issued a press release that "applauds the efforts of US legislators in crafting the charter of the Interagency Public Access Committee."

Here's an excerpt from the Act:

SEC. 103. INTERAGENCY PUBLIC ACCESS COMMITTEE.

(a) ESTABLISHMENT.—The Director shall establish a working group under the National Science and Technology Council with the responsibility to coordinate Federal science agency research and policies related to the dissemination and long-term stewardship of the results of unclassified research, including digital data and peer-reviewed scholarly publications, supported wholly, or in part, by funding from the Federal science agencies.

(b) RESPONSIBILITIES.—The working group shall—

(1) identify the specific objectives and public interests that need to be addressed by any policies coordinated under (a);

(2) take into account inherent variability among Federal science agencies and scientific disciplines in the nature of research, types of data, and dissemination models;

(3) coordinate the development or designation of standards for research data, the structure of full text and metadata, navigation tools, and other applications to maximize interoperability across Federal science agencies, across science and engineering disciplines, and between research data and scholarly publications, taking into account existing consensus standards, including international standards;

(4) coordinate Federal science agency programs and activities that support research and education on tools and systems required to ensure preservation and stewardship of all forms of digital research data, including scholarly publications;

(5) work with international science and technology counterparts to maximize interoperability between United States based unclassified research databases and international databases and repositories;

(6) solicit input and recommendations from, and collaborate with, non-Federal stakeholders, including the public, universities, nonprofit and for-profit publishers, libraries, federally funded and non federally funded research scientists, and other organizations and institutions with a stake in long term preservation and access to the results of federally funded research;

(7) establish priorities for coordinating the development of any Federal science agency policies related to public access to the results of federally funded research to maximize the benefits of such policies with respect to their potential economic or other impact on the science and engineering enterprise and the stakeholders thereof;

(8) take into consideration the distinction between scholarly publications and digital data;

(9) take into consideration the role that scientific publishers play in the peer review process in ensuring the integrity of the record of scientific research, including the investments and added value that they make; and

(10) examine Federal agency practices and procedures for providing research reports to the agencies charged with locating and preserving unclassified research.

(c) PATENT OR COPYRIGHT LAW.—Nothing in this section shall be construed to undermine any right under the provisions of title 17 or 35, United States Code.

(d) APPLICATION WITH EXISTING LAW.—Nothing defined in section (b) shall be construed to affect existing law with respect to Federal science agencies’ policies related to public access.

(e) REPORT TO CONGRESS.—Not later than 1 year after the date of enactment of this Act, the Director shall transmit a report to Congress describing—

(1) the specific objectives and public interest identified under (b)(1);

(2) any priorities established under subsection (b)(7);

(3) the impact the policies described under (a) have had on the science and engineering enterprise and the stakeholders, including the financial impact on research budgets;

(4) the status of any Federal science agency policies related to public access to the results of federally funded research; and

(5) how any policies developed or being developed by Federal science agencies, as described in subsection (a), incorporate input from the non-Federal stakeholders described in subsection (b)(6).

(f) FEDERAL SCIENCE AGENCY DEFINED.—For the purposes of this section, the term ‘‘Federal science agency’’ means any Federal agency with an annual extramural research expenditure of over $100,000,000.

Guide for Research Libraries: The NSF Data Sharing Policy

Posted in Big Data, Data Curation, Open Data, and Research Data Management, Digital Curation/Digital Preservation on November 28th, 2010

ARL has released the Guide for Research Libraries: The NSF Data Sharing Policy.

Here's an excerpt:

The Association of Research Libraries has developed this guide primarily for librarians, to help them make sense of the new NSF requirement. It provides the context for, and an explanation of, the policy change and its ramifications for the grant-writing process. It investigates the role of libraries in data management planning, offering guidance in helping researchers meet the NSF requirement. In addition, the guide provides a resources page, where examples of responses from ARL libraries may be found, as well as guides for data management planning created by various NSF directorates and approaches to the topic created by international data archive and curation centers.

Riding the Wave—How Europe Can Gain from the Rising Tide of Scientific Data

Posted in Big Data, Data Curation, Open Data, and Research Data Management, Reports and White Papers on October 7th, 2010

The High-Level Group on Scientific Data has released Riding the Wave—How Europe Can Gain from the Rising Tide of Scientific Data.

Here's an excerpt:

A fundamental characteristic of our age is the rising tide of data — global, diverse, valuable and complex. In the realm of science, this is both an opportunity and a challenge. This report, prepared for the European Commission's Directorate-General for Information Society and Media, identifies the benefits and costs of accelerating the development of a fully functional e-infrastructure for scientific data — a system already emerging piecemeal and spontaneously across the globe, but now in need of a far-seeing, global framework. The outcome will be a vital scientific asset: flexible, reliable, efficient, cross-disciplinary and cross-border.

The benefits are broad. With a proper scientific e-infrastructure, researchers in different domains can collaborate on the same data set, finding new insights. They can share a data set easily across the globe, but also protect its integrity and ownership. They can use, re-use and combine data, increasing productivity. They can more easily solve today's Grand Challenges, such as climate change and energy supply. Indeed, they can engage in whole new forms of scientific inquiry, made possible by the unimaginable power of the e-infrastructure to find correlations, draw inferences and trade ideas and information at a scale we are only beginning to see. For society as a whole, this is beneficial. It empowers amateurs to contribute more easily to the scientific process, politicians to govern more effectively with solid evidence, and the European and global economy to expand.

NSF Data Sharing Policy Released

Posted in Big Data, Data Curation, Open Data, and Research Data Management, Grants, Open Science on October 6th, 2010

The National Science Foundation has released its revised NSF Data Sharing Policy. As of January 18, 2011, NSF proposals must include a "Data Management Plan" of no more than two pages, in accordance with the Grant Proposal Guide, chapter II.C.2.j (see the second excerpt below).

Here's an excerpt from the Award and Administration Guide, chapter VI.D.4:

b. Investigators are expected to share with other researchers, at no more than incremental cost and within a reasonable time, the primary data, samples, physical collections and other supporting materials created or gathered in the course of work under NSF grants. Grantees are expected to encourage and facilitate such sharing. Privileged or confidential information should be released only in a form that protects the privacy of individuals and subjects involved. General adjustments and, where essential, exceptions to this sharing expectation may be specified by the funding NSF Program or Division/Office for a particular field or discipline to safeguard the rights of individuals and subjects, the validity of results, or the integrity of collections or to accommodate the legitimate interest of investigators. A grantee or investigator also may request a particular adjustment or exception from the cognizant NSF Program Officer.

c. Investigators and grantees are encouraged to share software and inventions created under the grant or otherwise make them or their products widely available and usable.

d. NSF normally allows grantees to retain principal legal rights to intellectual property developed under NSF grants to provide incentives for development and dissemination of inventions, software and publications that can enhance their usefulness, accessibility and upkeep. Such incentives do not, however, reduce the responsibility that investigators and organizations have as members of the scientific and engineering community, to make results, data and collections available to other researchers.

Here's an excerpt from the Grant Proposal Guide, chapter II.C.2.j:

Plans for data management and sharing of the products of research. Proposals must include a supplementary document of no more than two pages labeled “Data Management Plan”. This supplement should describe how the proposal will conform to NSF policy on the dissemination and sharing of research results (see AAG Chapter VI.D.4), and may include:

  1. the types of data, samples, physical collections, software, curriculum materials, and other materials to be produced in the course of the project;
  2. the standards to be used for data and metadata format and content (where existing standards are absent or deemed inadequate, this should be documented along with any proposed solutions or remedies);
  3. policies for access and sharing including provisions for appropriate protection of privacy, confidentiality, security, intellectual property, or other rights or requirements;
  4. policies and provisions for re-use, re-distribution, and the production of derivatives; and
  5. plans for archiving data, samples, and other research products, and for preservation of access to them.

A May 2010 NSF press release ("Scientists Seeking NSF Funding Will Soon Be Required to Submit Data Management Plans") discussed the background for the policy:

"Science is becoming data-intensive and collaborative," noted Ed Seidel, acting assistant director for NSF's Mathematical and Physical Sciences directorate. "Researchers from numerous disciplines need to work together to attack complex problems; openly sharing data will pave the way for researchers to communicate and collaborate more effectively."

"This is the first step in what will be a more comprehensive approach to data policy," added Cora Marrett, NSF acting deputy director. "It will address the need for data from publicly-funded research to be made public."

DigitalKoans

Digital Scholarship

Copyright © 2005-2012 by Charles W. Bailey, Jr.

This work is licensed under a Creative Commons Attribution-Noncommercial 3.0 United States License.