CNI Spring 2008 Task Force Meeting Presentations

Presentations and project briefings from the CNI Spring 2008 Task Force Meeting are available. Podcast interviews with a few attendees are also available.

Here's a selection of project briefings:

Digital Research Data Curation: Overview of Issues, Current Activities, and Opportunities for the Cornell University Library

Cornell University Library's Data Working Group has deposited its Digital Research Data Curation: Overview of Issues, Current Activities, and Opportunities for the Cornell University Library report in the eCommons@Cornell repository.

Here's the abstract:

Advances in computational capacity and tools, coupled with the accelerating collection and accumulation of data in many disciplines, are giving rise to new modes of conducting research. Infrastructure to promote and support the curation of digital research data is not yet fully-developed in all research disciplines, scales, and contexts. Organizations of all kinds are examining and staking out their potential roles in the areas of cyberinfrastructure development, data-driven scholarship, and data curation. The purpose of the Cornell University Library's (CUL) Data Working Group (DaWG) is to exchange information about CUL activities related to data curation, to review and exchange information about developments and activities in data curation in general, and to consider and recommend strategic opportunities for CUL to engage in the area of data curation. This white paper aims to fulfill this last element of the DaWG's charge.

Solr Search Engine Plug-In for Fedora Released

The DRAMA team has released a Solr plug-in for Fedora.

Here's a description of Solr from its home page:

Solr is an open source enterprise search server based on the Lucene Java search library, with XML/HTTP and JSON APIs, hit highlighting, faceted search, caching, replication, and a web administration interface. It runs in a Java servlet container such as Tomcat.

Associated Press vs. Drudge Retort: "Both Parties Consider the Matter Closed"

After a firestorm of criticism, the Associated Press has issued a press release saying that its dispute with the Drudge Retort over that blog's use of short quotes from AP stories is over: "Both parties consider the matter closed."

Read more about the controversy at "AP Battles Blogs"; "AP, Bloggers Clash over Wire Content Use"; "AP Exaggerates the 'Conversation' It's Having with Bloggers; Caught Copying Text from Bloggers as Well"; "The A.P. Has Violated My Copyright, and I Demand Justice"; "The Associated Press Plays Role of Metallica in Napster-esque War with Bloggers"; and "Biting the Hand that Feeds (Traffic to) Them."

Omeka Version 0.9.2 Released

Version 0.9.2 of Omeka has been released. This is a bug fix release.

Here's an excerpt from the About page that describes Omeka:

Omeka is a web platform for publishing collections and exhibitions online. Designed for cultural institutions, enthusiasts, and educators, Omeka is easy to install and modify and facilitates community-building around collections and exhibits. It is designed with non-IT specialists in mind, allowing users to focus on content rather than programming.

Journal of the American Society for Information Science and Technology Goes Green

In a forthcoming "Early View" editorial in the Journal of the American Society for Information Science and Technology ("JASIST Open Access"), Donald H. Kraft announces that JASIST will permit self-archiving "on the Contributor's personal Web site or in the Contributor's institution's/employer's institutional repository or archive" (institutional intranets are also permitted). This excludes disciplinary archives, such as dLIST and E-LIS, which are global in nature.

Such self-archiving can occur for both preprints and postprints. The author cannot "update the submission version [version submitted for consideration that has not undergone peer review] or replace it with the published Contribution." However, the author can "update the preprint [accepted version that has undergone peer review] with any corrections."

JASIST is the research journal of the American Society for Information Science and Technology, which "counts among its membership some 4,000 information specialists from such fields as computer science, linguistics, management, librarianship, engineering, law, medicine, chemistry, and education; individuals who share a common interest in improving the ways society stores, retrieves, analyzes, manages, archives and disseminates information, coming together for mutual benefit."

A Survey of Digital Humanities Centers in the United States Released

A report prepared for the Council on Library and Information Resources titled A Survey of Digital Humanities Centers in the United States has been released.

Here's an excerpt from the "Executive Summary":

The immediate goals of the survey were to identify the extent of these [digital humanities] centers, and explore their financing, organizational structure, products, services and sustainability. The longer-term goal is to provide participants of SCI 6 [Scholarly Communication Institute 6] with a greater understanding of existing centers to inform their discussions about regional and national centers. The year-long study took place in two phases: an initial planning phase to develop selection criteria, identify candidates, and plan methodology, and an implementation phase to conduct the survey and analysis of the centers. . . .

The findings of this survey suggest that new models are needed for large-scale cyberinfrastructure projects, for cross-disciplinary research that cuts a wide swathe across the humanities, and for integrating the huge amounts of digital production already available. Current DHCs will continue to have an important role to play, but that role needs to be clarified in the context of the broader models that emerge.

CIC Shared Digital Repository Project Update

A recently updated description of the Committee on Institutional Cooperation's Shared Digital Repository Project is available at Indiana University's Project: Shared Digital Repository page.

Here's an excerpt:

Description: The Shared Digital Repository (SDR) leverages the tradition of leadership in collaboration among the institutions of the Committee on Institutional Cooperation (CIC). The SDR operates under the leadership of the Repository Administrators (Indiana University and the University of Michigan), which also provide a large part of the funding. Additional governance and financial support are provided by the charter participating libraries of the CIC, and by other libraries and library consortia wishing to archive digital content.

Outcome: The SDR offers persistent and high-availability storage for digitized book and journal content, beginning with the Google content from the CIC members and later extending to other digitized content. The SDR will leverage technology investments and developments at the University of Michigan to build (through IU/UM collaboration) more generalized versions of Michigan's services and gain efficiencies from Michigan's investments. . . .

Milestones and status:

As of April 11, 2008, the SDR contains:

  • 1,122,007 volumes
  • 791,460 titles
  • approximately 393 million pages
  • 213,379 individual volumes in the public domain (19% of the total)


  • Early 2008: Bloomington backup storage installed
  • January-March 2008: Page turner mechanism with branding; ability to publish virtual collections (UM-specific version); assessment of global searching functionality; access mechanisms for persons with visual disabilities
  • September-December 2008: Mechanism for direct ingest of non-Google content; compliance with the required elements in the "Trustworthy Repositories Audit & Certification (TRAC): Criteria and Checklist"

Association of College and Research Libraries Sends a Letter of Support to SCOAP3

The Association of College and Research Libraries has sent a letter of support to SCOAP3.

Here's an excerpt:

On behalf of the Association of College and Research Libraries (ACRL), a division of the American Library Association (ALA) representing over 13,000 academic and research librarians and interested individuals, I am writing to express interest and support for SCOAP3, the Sponsoring Consortium for Open Access Publishing in Particle Physics’ effort to facilitate open access publishing in High Energy Physics (HEP). . . .

ACRL believes that SCOAP3 is a valuable addition to the heterogeneous mix of strategies being undertaken by scholars, publishers, libraries, and others to ensure the future of high-quality journals. SCOAP3 is unique in its explicit goals to unite researchers and libraries and to partner with publishers so that aggregated financial contributions will support HEP publishing, make the results available at no cost to any reader any where, and serve as a potential model to other disciplines.

Therefore ACRL encourages its members to consider joining the SCOAP3 effort when appropriate, e.g. through an institutional or consortial "expression of interest" (as outlined at, providing education and outreach about SCOAP3 to their faculty, library staff and administrators, and finding other ways to analyze and support SCOAP3 where possible.

Registry of U.S. Government Publication Digitization Projects Enhanced

The Registry of U.S. Government Publication Digitization Projects has been significantly enhanced.

Here's an excerpt from the announcement:

The enhanced Registry provides the ability to:

  • Browse digitization projects by category or alphabetically by title.
  • Search the entire Registry or filter searches by category or fields.
  • Quickly access new and recently updated listings.
  • Utilize RSS feeds to keep informed of new and updated projects.
  • View listings by contributor.
  • Contact fellow digitization participants.
  • Recommend listings to others.
  • Report broken links.
  • And much more!

Aberystwyth University Launches CADAIR Institutional Repository

Aberystwyth University has launched CADAIR, its DSpace-based institutional repository.

Here's an excerpt from the press release:

The new service has been developed by the Subject Support and E-Library team in Information Services, led by Dr Talat Chaudhri and Stuart Lewis.

A successful two year pilot project, during which the team worked closely with the Departments of Computer Science and Information Studies, and the Institute of Mathematics and Physics, was concluded in early 2008. Currently the site features approximately 500 academic papers and dissertations by taught masters and PhD students.

Survey of Canadian and International Data Management Initiatives Released

The Canadian Association of Research Libraries (CARL) has released Survey of Canadian and International Data Management Initiatives.

Here's an excerpt from the "Introduction":

Research libraries have a role to play in this emerging data-intensive environment. A 2007 CARL survey found that most CARL members are interested in managing research data, but few have a formal data archiving policy. CARL has formed a Research Data Management Working Group to assist members in collecting, organizing, preserving and providing access to the research data and to formulate a cooperative approach for CARL.

The purpose of this report is to provide an overview of the types of data management activities being undertaken in Canada and internationally. This review documents the various options available for libraries, and will pave the way for a more detailed investigation by the Working Group of the potential roles for libraries.

2007 Impact Factors for PLoS Journals Released

The Public Library of Science has reported the 2007 impact factors for its journals as calculated by Thomson Reuters:

  • PLoS Biology: 13.5
  • PLoS Medicine: 12.6
  • PLoS Computational Biology: 6.2
  • PLoS Genetics: 8.7
  • PLoS Pathogens: 9.3

Here's an excerpt from the press release:

As we and others have frequently pointed out, impact factors should be interpreted with caution and only as one of a number of measures which provide insight into a journal’s, or rather its articles’, impact. Nevertheless, the 2007 figures for PLoS Biology and PLoS Medicine are consistent with the many other indicators (e.g. submission volume, web statistics, reader and community feedback) that these journals are firmly established as top-flight open-access general interest journals in the life and health sciences respectively.

The increases in the impact factors for the discipline-based, community-run PLoS journals also tally with indicators that these journals are going from strength to strength. For example, submissions to PLoS Computational Biology, PLoS Genetics and PLoS Pathogens have almost doubled over the past year—each journal now routinely receives 80-120 submissions per month of which around 20-25 are published. . . .

Although Thomson is yet to index our two youngest journals, other indexing databases are. The subscription-only Scopus citation index (owned by Elsevier and, incidentally, including many more journals than Thomson’s offering) is already covering PLoS ONE (though so far, only as far back as June 2007). But authors don’t need to rely on subscription-only indexes such as those owned by Thomson and Elsevier, and can instead use the freely-available Google Scholar. Using Google Scholar, for example, one can find that the article by Neal Fahlgren and coauthors, about the cataloguing of an important class of RNA in plants and one of the most highly cited PLoS ONE articles so far has been cited 42 times—strong evidence that good research, even if published in a new journal, will rapidly find its place in the scientific record when it’s made freely available to all.

Second Beta Version of Fedora 3.0 Released

The Fedora Commons has released the second beta version of Fedora 3.0.

Here's an excerpt from the announcement:

Fedora 3.0 features the Content Model Architecture (CMA), an integrated structure for persisting and delivering the essential characteristics of digital objects in Fedora. . . . The Fedora CMA plays a central role in the Fedora architecture, in many ways forming the over-arching conceptual framework for future development of Fedora Repositories.

Like a well-thumbed book on a shelf, digital content is stored with the expectation that intellectual works will be the same each time they are accessed, whether the content was put away yesterday, or many years ago. Fedora is a simple, flexible and evolvable approach to delivering and sharing the "essential characteristics" of enduring digital content. Librarians, archivists, records managers, media producers, authors and publishers use patterns of expression formats such as books, journals, articles, collections to convey the essential characteristics of content. The capabilities of digital tools combined with essential characteristics of digital works result in well-understood patterns of expression for different types of content models.

The software engineering community also utilizes patterns of expression for the development of complex computer systems. The same concepts that satisfy agile IT infrastructures can help provide solutions for creating, accessing and preserving content. The Fedora CMA builds on the Fedora architecture, downloaded more than 18,000 times in the last 12 months, to simplify use while unlocking potential.

Dan Davis explains the CMA in the context of Fedora 3.0, "It's a hybrid. The Fedora CMA handles content models that are used by publishers and others, and is also a computer model that describes an information representation and processing architecture." By combining these viewpoints, Fedora CMA has the potential to provide a way to build an interoperable repository for integrated information access within organizations and to provide durable access to our intellectual works.

The Web Imagined in 1934 Using Index Cards, Telegraphs, and Other Analog Tools

In 1934, Belgian Paul Otlet wrote a book in which he envisioned a worldwide "mechanical, collective brain" that would store and make accessible the world's knowledge. By that time, he had created with co-visionary Henri La Fontaine a "database" of over 12 million index cards and was responding to over 1,500 queries a year. Unfortunately, the project's sponsor, the Belgian government, withdrew support; the Nazis invaded and displaced the project to make way for a Third Reich art exhibit; and Otlet died in relative obscurity in 1944.

Read more about it at "Paul Otlet," The Universe of Information: the Work of Paul Otlet for Documentation and International Organisation, "Visions of Xanadu: Paul Otlet (1868-1944) and Hypertext," and "The Web Time Forgot."

Usenet Newsgroups Will Be Blocked By Major ISPs

Spurred on by New York Attorney General Andrew Cuomo’s efforts to fight child pornography, Sprint, Time Warner Cable, and Verizon will block significant numbers of Usenet newsgroups.

Regarding the Verizon ban, Declan McCullagh points out that only 8 of 1,000 Usenet hierarchies are being kept, and "That means not carrying perfectly innocuous—and, in fact, very useful—newsgroups like symantec.customerservice.general, us.military, microsoft.public.excel, and fr.soc.economie."

Read more about it at "alt.blocked: Verizon Blocks Access to Whole USENET Hierarchy"; "ISPs: We're Limiting Our Own Usenet Groups, Not Blocking Others"; "N.Y. Attorney General Forces ISPs to curb Usenet Access"; and "Verizon Offers Details of Usenet Deletion: alt.* groups, Others Gone."

Short Quotes Not Fair Use? Associated Press Sends Take-Down Letter to Drudge Retort

The Associated Press has sent the Drudge Retort a DMCA take-down letter demanding that six posts and one comment with short quotes from AP articles be removed from the site.

Negative reaction from bloggers and others against what was viewed as an assault on fair use was swift, resulting in a TechCrunch ban on AP story use, a broader AP ban by bloggers, and a wave of criticism.

As a result, AP decided to halt further action against other Weblogs until new guidelines could be established, but it has not withdrawn its letter to the Drudge Retort.

Read more about it at: "Associated Press Digs Its Own Grave Deeper; Wants to Create Its Own Fair Use Rules," "The Associated Press to Set Guidelines for Using Its Articles in Blogs," "AP Rethinking Policy After Drudge Retort DMCA Takedowns," "AP Takes Action against Community News Website over Copyright Violation," "AP Wants Change in Blog Excerpting, Just Not Sure What," "DMCA Takedown Tiff Not a Battle the AP Should Be Fighting," "Netroots' Bloggers Boycott of Associated Press Is Working," and "Welcome to the Web Refactory, AP."

Reactions to the "Canadian DMCA" (Bill C-61)

There have been strong reactions to the "Canadian DMCA" (Bill C-61) by both advocates and opponents. Copyright for Canadians has put up a "Tell MPs What's Wrong with the Prentice Bill" page that helps opponents contact their Members of Parliament.

Here's a selection of articles and posts: "Appropriation Art Condemns Bill C-61," "Bill C-61: First Reactions," "Canadian Creator and Music Industry Groups Applaud Introduction of Copyright Bill," "The Canadian DMCA: A Betrayal," "Canadian Library Association Disappointed with New Copyright Legislation," "CIPPIC Disappointed with New Copyright Bill," "CMCC: Copyright Reform Bill Doesn’t Help Canadian Artists," "Conservatives Deliver Rehearsed Responses on Bill C-61," "Copyright Law Could Result in Police State: Critics," "Copyright Reform a Good First Step," "Industry Group Applauds Bill," "Software Industry Praises Federal Government Plans to Modernize Canadian Copyright Act," and "TPM and Bill C-61."

Oil, ALA, and Digital Communities

I follow the energy markets closely, and recently there have been predictions of $250 a barrel oil in 2009 and $400 a barrel oil in 2018.

What does this have to do with ALA? Nothing, if ALA functions effectively as a virtual organization that isn't dependent on physical travel. Everything, if it does not.

Already we see airlines consolidating, cutting routes, and raising ticket and auxiliary prices. That's with oil at about $136 a barrel. Imagine if it were $250 a barrel or $400 a barrel. Impossible? Unlikely? Maybe, but in early 2007 it was $60 a barrel, and predictions of $100 a barrel met with incredulity.

We can hope that oil prices stabilize or decline, but it may be prudent to plan for what to do if they do not.

Would ALA function well if its committee members were increasingly unable to attend meetings? Would the current awareness and personal networking functions that its physical conferences support still work if general members were increasingly unable to attend them?

Ask yourself this: If you never attended ALA conferences, how would the organization look to you? Would you feel that you could meaningfully participate in it? Would you feel that it had added value as an important source of current information, personal networking, and professional development?

Perhaps. In recent years, ALA has made progress in creating a more useful digital presence with efforts like virtual committee members, blogs, wikis, and other tools. This is commendable progress; however, much remains to be done. Do virtual committee members interact with physical committee members in real-time meetings? Is there meaningful non-conference committee digital interaction? Are conference presentations and committee meeting sessions available to ALA members in MP3 and digital video formats? Are blogs open to all potential member authors through self-initiated registration procedures? Are wikis dynamic information exchange mechanisms or primarily dull descriptive tools for disseminating information about ALA and its divisions? Is social network software provided to connect members with each other, committee members, and ALA officers? Is there a true sense of a vibrant digital community?

Although it's not perfect, the EDUCAUSE CONNECT community points in the direction of what could be.

Of course, energy markets are volatile, prices could drop, and all could be well for a while, but there is little to suggest at the current time that the long-term prospects for cheap energy are good. Thinking the unthinkable about reinventing ALA as a digital community might not be a bad idea as a contingency plan, and it might not be a bad idea in any case.