MIT’s SIMILE Project

MIT’s Semantic Interoperability of Metadata and Information in unLike Environments (SIMILE) project is producing a variety of open source software packages that will be of interest to librarians and others. One example is Piggy Bank, "a Firefox extension that turns your browser into a mashup platform, by allowing you to extract data from different web sites and mix them together."

Here is an overview of the SIMILE project from the About SIMILE page:

SIMILE is a joint project conducted by the MIT Libraries and MIT Computer Science and Artificial Intelligence Laboratory. SIMILE seeks to enhance inter-operability among digital assets, schemata/vocabularies/ontologies, metadata, and services. A key challenge is that the collections which must inter-operate are often distributed across individual, community, and institutional stores. We seek to be able to provide end-user services by drawing upon the assets, schemata/vocabularies/ontologies, and metadata held in such stores.

SIMILE will leverage and extend DSpace, enhancing its support for arbitrary schemata and metadata, primarily though the application of RDF and semantic web techniques. The project also aims to implement a digital asset dissemination architecture based upon web standards. The dissemination architecture will provide a mechanism to add useful "views" to a particular digital artifact (i.e. asset, schema, or metadata instance), and bind those views to consuming services.

You can get a more detailed overview of the project from the SIMILE grant proposal and from other project documents.

There is a SIMILE blog and a Wiki. There are also three mailing lists.

Recent Object Reuse and Exchange (ORE) Documents

In a previous posting, I discussed the Open Archives Initiative’s Object Reuse and Exchange (ORE) project. ORE is worth watching closely.

Two new documents were released this January:

  • "Report of the January 2007 ORE-TC Meeting," which is: "A detailed report of the results of the meeting of OAI-ORE Technical Committee describing features and requirements of the ORE model and its context in the Web Architecture."
  • "Open Repositories 2007," which is: "A presentation describing OAI-ORE and progress based on the January 2007 ORE Technical Committee Meeting."

Petition to European Commission to Support Open Access Tops 10,000 Signatures

A petition to the European Commission asking it to support the European Union’s "Study on the Economic and Technical Evolution of the Scientific Publication Markets of Europe" has been signed by more than 10,000 people.

From the press release:

Nobel laureates Harold Varmus and Rich Roberts are among the more than ten thousand concerned researchers, senior academics, lecturers, librarians, and citizens from across Europe and around the world who are signing an internet petition calling on the European Commission to adopt polices to guarantee free public access to research results and maximise the worldwide visibility of European research.

Organisations too are lending their support, with the most senior representatives from over 500 education, research and cultural organisations in the world adding their weight to the petition, including CERN, the UK’s Medical Research Council, the Wellcome Trust, the Italian Rector’s Conference, the Royal Netherlands Academy for Arts & Sciences (KNAW) and the Swiss Academy for the Humanities and Social Sciences (SAGW), alongside the petition’s sponsors, SPARC Europe, JISC, the SURF Foundation, the German Research Foundation (DFG) and the Danish Electronic Research Library (DEFF).

From the petition:

Research funding agencies have a central role in determining researchers’ publishing practices. Following the lead of the NIH and other institutions, they should promote and support the archiving of publications in open repositories, after a (possibly domain-specific) time period to be discussed with publishers. This archiving could become a condition for funding.

The following actions could be taken at the European level: (i) Establish a European policy mandating published articles arising from EC-funded research to be available after a given time period in open access archives, and (ii) Explore with Member States and with European research and academic associations whether and how such policies and open repositories could be implemented.

More signatures are needed, especially from EU organizations and individuals.

DOE and British Library to Develop Portal

The U.S. Department of Energy (DOE) and the British Library have signed an agreement to develop a portal to international science resources.

From the press release:

Called ‘,’ the planned resource would be available for use by scientists in all nations and by anyone interested in science. The approach will capitalise on existing technology to search vast collections of science information distributed across the globe, enabling much-needed access to smaller, less well-known sources of highly valuable science. Following the model of, the U.S. interagency science portal that relies on content published by each participating agency, ‘’ will rely on scientific resources published by each participating nation. Other countries have been invited to participate in this international effort. . . .

Objectives of the ‘’ initiative are to:

  • Search dispersed, electronic collections in various science disciplines;
  • Provide direct, seamless and free searching of open-source collections and portals;
  • Build upon existing and already successful national models for searching;
  • Complement existing information collections and systems; and
  • Raise the visibility and usage of individual sources of quality science information.

New Yorker Google Book Search Article

The New Yorker has published an article about Google Book Search by Jeffrey Toobin in its February 5, 2007 issue ("Google’s Moon Shot: The Quest for the Universal Library").

Here’s a quote from the article:

Google asserts that its use of the copyrighted books is "transformative," that its database turns a book into essentially a new product. "A key part of the line between what’s fair use and what’s not is transformation," Drummond said. "Yes, we’re making a copy when we digitize. But surely the ability to find something because a term appears in a book is not the same thing as reading the book. That’s why Google Books is a different product from the book itself." In other words, Google says that being able to search books on its site—which it describes as the equivalent of a giant library card catalogue—is not the same as making the books themselves available. But the publishers cite another factor in fair-use analysis: the amount of the copyrighted work that is used in the creation of the new one. Google is copying entire books, which doesn’t sound "fair" to the plaintiff publishers and authors.

Draft White Paper on Acquisitions and Electronic Resource Management Systems Interoperability

The Digital Library Federation’s Electronic Resource Management Initiative Phase II Steering Committee has released a draft white paper on the interoperability of ILS acquisition modules and electronic resource management systems.

Here is the introduction:

Electronic resource management systems are becoming an important tool in many libraries. Commercial ERMS development has been driven in part by the lack of accommodation within integrated library systems for elements specific to electronic resources. Financial aspects of acquiring e-resources, in particular, necessitate recording an array of data not suited to ILS acquisitions modules. Unlike other data recorded in an ERMS such as licensing and administrative terms, a moderate percentage of acquisitions data is redundant, being populated in ILS during the acquisitions process, while also being accommodated within ERMS in accordance with the data structure detailed in Electronic Resource Management: Report of the DLF Electronic Resource Management Initiative (Digital Library Federation, 2004). ERMS implementers are eager to automate the process by which acquisitions data move from their ILS into their ERMS. This interest has grown substantially over the past few months as the prospect of connecting financial data to usage statistics has been facilitated through the Standardized Usage Statistics Harvesting Initiative (SUSHI), a NISO draft standard.

This white paper describes workflows at four libraries; reports on conversations held with product managers and other relevant staff of the leading ERMS; summarizes common themes; and suggests next steps. The paper is a draft for comment; it is hoped that those with interest in this area will provide insight to further this investigation.

OAIster Hits 10,000,000 Records

Excerpt from the press release:

We live in an information-driven world—one in which access to good information defines success. OAIster’s growth to 10 million records takes us one step closer to that goal.

Developed at the University of Michigan’s Library, OAIster is a collection of digital scholarly resources. OAIster is also a service that continually gathers these digital resources to remain complete and fresh. As global digital repositories grow, so do OAIster’s holdings.

Popular search engines don’t have the holdings OAIster does. They crawl web pages and index the words on those pages. It’s an outstanding technique for fast, broad information from public websites. But scholarly information, the kind researchers use to enrich their work, is generally hidden from these search engines.

OAIster retrieves these otherwise elusive resources by tapping directly into the collections of a variety of institutions using harvesting technology based on the Open Archives Initiative (OAI) Protocol for Metadata Harvesting. These can be images, academic papers, movies and audio files, technical reports, books, as well as preprints (unpublished works that have not yet been peer reviewed). By aggregating these resources, OAIster makes it possible to search across all of them and return the results of a thorough investigation of complete, up-to-date resources. . . .

OAIster is good news for the digital archives that contribute material to open-access repositories. "[OAIster has demonstrated that]. . . OAI interoperability can scale. This is good news for the technology, since the proliferation is bound to continue and even accelerate," says Peter Suber, author of the SPARC Open Access Newsletter. As open-access repositories proliferate, they will be supported by a single, well-managed, comprehensive, and useful tool.

Scholars will find that searching in OAIster can provide better results than searching in web search engines. Roy Tennant, User Services Architect at the California Digital Library, offers an example: "In OAIster I searched ‘roma’ and ‘world war,’ then sorted by weighted relevance. The first hit nailed my topic—the persecution of the Roma in World War II. Trying ‘roma world war’ in Google fails miserably because Google apparently searches ‘Rome’ as well as ‘Roma.’ The ranking then makes anything about the Roma people drop significantly, and there is nothing in the first few screens of results that includes the word in the title, unlike the OAIster hit."

OAIster currently harvests 730 repositories from 49 countries on 6 continents. In three years, it has more than quadrupled in size and increased from 6.2 million to 10 million in the past year. OAIster is a project of the University of Michigan Digital Library Production Service.
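
The harvesting that the press release describes rests on the OAI Protocol for Metadata Harvesting. As a rough illustration only, here is a minimal Python sketch of one harvesting step: building a ListRecords request and reading record identifiers and the resumption token from a response. The repository address and the sample response are made up for the example, and this is not OAIster's actual code.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespace used by OAI-PMH 2.0 responses.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def build_list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL for an OAI-PMH repository."""
    if resumption_token:
        # When resuming a partial list, the token is sent instead of metadataPrefix.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return (list of record identifiers, resumption token or None)."""
    root = ET.fromstring(xml_text)
    identifiers = [
        el.text
        for el in root.findall(".//oai:record/oai:header/oai:identifier", OAI_NS)
    ]
    token_el = root.find(".//oai:resumptionToken", OAI_NS)
    token = token_el.text if token_el is not None and token_el.text else None
    return identifiers, token


# Hypothetical, heavily trimmed response used only for illustration.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:1</identifier></header></record>
    <record><header><identifier>oai:example.org:2</identifier></header></record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    # Hypothetical repository address (an assumption, not a real endpoint).
    print(build_list_records_url("http://repository.example.org/oai"))
    print(parse_list_records(SAMPLE_RESPONSE))
```

A real harvester would fetch each URL over HTTP and keep requesting with the returned resumption token until none comes back; that loop is left out here to keep the sketch self-contained.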

Orphan Works Challenge Fails

The U.S. Court of Appeals for the Ninth Circuit has denied an appeal of Kahle v. Gonzales, leaving the legal status of orphan works unchanged. The plaintiffs’ attorneys were Jennifer Stisa Granick, Lawrence Lessig, and Christopher Sprigman.

Eric Auchard’s article "U.S. Court Upholds Copyright Law on ‘Orphan Works’" gives an overview of the Ninth Circuit’s decision.

The opinion is also available. Here is an excerpt:

Plaintiffs appeal from the district court’s dismissal of their complaint. They allege that the change from an "opt-in" to an "opt-out" copyright system altered a traditional contour of copyright and therefore requires First Amendment review under Eldred v. Ashcroft, 537 U.S. 186, 221 (2003). They also allege that the current copyright term violates the Copyright Clause’s "limited Times" prescription. . . .

Arguments similar to Plaintiffs’ were presented to the Supreme Court in Eldred, which affirmed the constitutionality of the Copyright Term Extension Act against those attacks. The Supreme Court has already effectively addressed and denied Plaintiffs’ arguments. . . .

In March 2004, Plaintiffs Brewster Kahle, Internet Archive, Richard Prelinger, and Prelinger Associates, Inc. filed an amended complaint seeking declaratory judgment and injunctive relief. Brewster Kahle and Internet Archive have built an "Internet library" that offers free access to digitized audio, books, films, websites, and software. Richard Prelinger and Prelinger Associates make digital versions of "ephemeral" films available for free on the internet. Each Plaintiff provides, or intends to provide, access to works that allegedly have little or no commercial value but remain under copyright protection. The difficulty and expense of obtaining permission to place those works on the Internet is overwhelming; ownership of these "orphan" works is often difficult, and sometimes impossible, to ascertain. . . .

Plaintiffs also argue that they should be allowed to present evidence that the present copyright term violates the Copyright Clause’s "limited Times" prescription as the Framers would have understood it. That claim was not directly at issue in Eldred, though Justice Breyer discussed it extensively in his dissent. See Eldred, 537 U.S. at 243. Plaintiffs assert all existing copyrights are effectively perpetual. . . .

Both of Plaintiffs’ main claims attempt to tangentially relitigate Eldred. However, they provide no compelling reason why we should depart from a recent Supreme Court decision.

Creative Commons India to Launch on 1/26/07

Creative Commons India will be launched on Friday.

From "Creative Commons Readies for India Launch":

Creative Commons-India’s project head Shishir K Jha, assistant professor at the IIT’s Shailesh J. Mehta School of Management, said the project would focus on three specific areas in India.

These are—centres of higher education like the seven IITs, regional technology institutes and management and other institutions. . . .

Creative Commons-India also plans to focus on non-profit and non-governmental organisations and corporates keen on adopting easier-to-share licences for the dissemination of their documents.

Scholarly Electronic Publishing Weblog Update (1/22/07)

The latest update of the Scholarly Electronic Publishing Weblog (SEPW) is now available. It provides information about new scholarly literature and resources related to scholarly electronic publishing, such as books, journal articles, magazine articles, newsletters, technical reports, and white papers. Especially interesting are: "Beyond Google: What Next for Publishing?"; "Copyright, Publishing, and Scholarship: The ‘Zwolle Group’ Initiative for the Advancement of Higher Education"; "Electronic Books and the Humanities: A Survey at the University of Denver"; "E-Prints and Journal Articles in Astronomy: A Productive Co-Existence"; "Evaluating Research Impact through Open Access to Scholarly Communication"; "If the Academic Library Ceased to Exist, Would We Have to Invent It?"; and Managing Digitization Activities.

For weekly updates about news articles, Weblog postings, and other resources related to digital culture (e.g., copyright, digital privacy, digital rights management, and Net neutrality), digital libraries, and scholarly electronic publishing, see the latest DigitalKoans Flashback posting.

2006 PACS Review Use Statistics

The Public-Access Computer Systems Review (PACS Review) was a freely available e-journal, which I founded in 1989. It allowed authors to retain their copyrights, and it had a liberal copyright policy for noncommercial use. Its last issue was published in 1998.

In 2006, there were 763,228 successful requests for PACS Review files, 2,091 average successful requests per day, 751,264 successful requests for pages, and 2,058 average successful requests for pages per day. (A request is for any type of file; a page request is for a content file, such as an HTML, PDF, or Word file.) These requests came from 41,865 distinct host computers.
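
For anyone who wants to check the daily averages, here is a small Python sketch of the arithmetic. It assumes the averages were taken over the 365 days of 2006 and rounded to whole requests; the function name is just for illustration.

```python
DAYS_IN_2006 = 365  # 2006 was not a leap year


def average_per_day(total_requests, days=DAYS_IN_2006):
    """Average requests per day, rounded to the nearest whole request."""
    return round(total_requests / days)


file_requests = 763_228  # successful requests for any type of file
page_requests = 751_264  # successful requests for content files (pages)

if __name__ == "__main__":
    print(average_per_day(file_requests))  # 2091
    print(average_per_day(page_requests))  # 2058
```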

The requests came from 134 Internet domains. Leaving aside requests from unresolved numerical addresses, the top 15 domains were: .com (Commercial), .net (Networks), .edu (USA Higher Education), .cz (Czech Republic), .jp (Japan), .ca (Canada), .uk (United Kingdom), .au (Australia), .de (Germany), .nl (Netherlands), .org (Non Profit Making Organizations), .in (India), .my (Malaysia), .it (Italy), and .mx (Mexico). At the bottom were domains such as .ms (Montserrat), .fm (Micronesia), .nu (Niue), .ad (Andorra), and .az (Azerbaijan).

Rounded to the nearest thousand, there had previously been 3.5 million successful requests for PACS Review files.

This is the last time that use statistics will be reported for the PACS Review.

Fedora 2.2 Released

The Fedora Project has released version 2.2 of Fedora.

From the announcement:

This is a significant release of Fedora that includes a complete repackaging of the Fedora source and binary distribution so that Fedora can now be installed as a standalone web application (.war) in any web container. This is a first step in positioning Fedora to fit within a standard "enterprise system" environment. A new installer application makes it easy to setup and run Fedora. Fedora now uses Servlet Filters for authentication. To support digital object integrity, the Fedora repository can now be configured to calculate and store checksums for datastream content. This can be done globally, or on selected datastreams. The Fedora API also provides the ability to check content integrity based on checksums. The RDF-based Resource Index has been tuned for better performance. Also, a new high-performing triplestore, backed by Postgres, has been developed that can be plugged into the Resource Index. Fedora contains many other enhancements and bug fixes.

Access to Over 13 Million Digital Documents

This service is an initiative of the Institute for Media and Communications Management at the University of St. Gallen. It indexes both metadata and full-text from global digital repositories. It uses OAI-PMH to identify relevant documents. The full-text documents are in PDF, PowerPoint, RTF, Microsoft Word, and PostScript formats. After being retrieved from their original repository, the documents are cached locally. It has indexed about 13 million documents from over 800 repositories.

Here are some additional features from the About page:

Identification of authors across institutions and archives: identifies authors and assigns them their scientific publications across various archives. Additionally the social relations between the authors will be extracted and displayed. . . .

Semantic combination of scientific information: structures and combines the scientific data to knowledge areas with Ontology’s. Lexical and statistical methods are used to identify, extract and analyze keywords. Based on this processes classifies the scientific data and uses it e.g. for navigational and weighting purposes.

Personalization services: offers the researchers the possibilities to inform themselves about new publications via our RSS Feed service. They can customize the RSS Feed to a special discipline or even to personalized list of keywords. Furthermore will provide an upload service. Every researcher can upload his publication directly to and assign already existing publications at to his own researcher profile.

New UC Report: The Promise of Value-based Journal Prices and Negotiation

The University of California libraries have released The Promise of Value-based Journal Prices and Negotiation: A UC Report and View Forward.

Here is the report’s abstract:

In pursuit of their scholarly communication agenda, the University of California ten-campus libraries have posited and tested the case that a journal’s institutional price can and should be related to its value to the academic enterprise. We developed and tested a set of metrics that comprise "value-based pricing" of scholarly journals. The metrics are the measurable impact of the journal, the transparent measures of production costs, the institutionally-based contributions to the journal, such as editorial labor, and the transaction efficiencies from consortial purchases. Initial modeling and use of the approaches are promising, leading the libraries to employ and further develop the approaches and share their work to date with the larger community.

This excerpt from the press release provides further information:

The report describes a value-based approach that borrows from analysis done by Professors Ted Bergstrom (UC Santa Barbara) and R. Preston McAfee (Caltech) on journal cost-effectiveness. The UC approach also includes suggestions for annual price increases that are tied to production costs; credits for institutionally-based contributions to the journal, such as editorial labor; and credits for business transaction efficiencies from consortial purchases.

Through the report the libraries ask how an explicit method can be established, validated, and communicated for aligning the purchase or license costs of scholarly journals with the value they contribute to the academy and the costs to create and deliver them. In addition to describing the work done to date, the report provides examples of potential cost savings and declares UC’s intention to pursue value-based prices in their negotiations with journal publishers. In addition, the report invites the academic community to work collectively to refine and improve these and other value-based approaches.

The Long Run

Enthusiasm about new technologies is essential to innovation. There needs to be some fire in the belly of change agents or nothing ever changes. Moreover, the new is always more interesting than the old, which creaks with familiarity. Consequently, when an exciting new idea seizes the imagination of innovators and, later, early adopters (using Rogers’ diffusion of innovations jargon), it is only to be expected that the initial rush of enthusiasm can sometimes dim the cold eye of critical analysis.

Let’s pick on Library 2.0 to illustrate the point, and, in particular, librarian-contributed content instead of user-contributed content. It’s an idea that I find quite appealing, but let’s set that aside for the moment.

Overcoming the technical challenges involved, academic library X sets up on-demand blogs and wikis for staff as both outreach and internal communication tools. There is an initial frenzy of activity, and a number of blogs and wikis are established. Subject specialists start blogging. Perhaps the pace is reasonable for most to begin with, although some fall by the wayside quickly, but over time, with a few exceptions, the postings become more erratic and the time between postings increases. It is unclear whether target faculty read the blogs in any great numbers. Internal blogs follow a similar pattern. Some wikis, both internal and external, are quickly populated, but then become frozen by inactivity; others remain blank; others flourish because they serve a vital need.

Is this a story of success, failure, or the grey zone in between?

The point is this. Successful publishing in new media such as blogs and wikis requires that these tools serve a real purpose and that their contributors make a consistent, steady, and never-ending effort. It also requires that the intended audience understand and regularly use the tools and that, until these new communication channels are well-established, the library vigorously promote them because there is a real danger that, if you build it, they will not come.

Some staff will blog their hearts out regardless of external reinforcement, but many will need to have their work acknowledged in some meaningful way, such as at evaluation, promotion, and tenure decision points. Easily understandable feedback about tool use, such as good blog-specific or wiki-specific log analysis, is important as well to give writers the sense that they are being read and to help them tailor their message to their audience.

On the user side, it does little good to say "Here’s my RSS feed" to a faculty member who doesn’t know what RSS is and couldn’t care less. Of course, some will be hip to RSS, but that may not be the majority. If the library wants RSS feeds to become part of a faculty member’s daily workflow, it is going to have to give that faculty member a good reason for it to be so, such as significant, identified RSS feed content in the faculty member’s field. Then, it is going to have to help the faculty member with the RSS transition by pointing out good RSS readers, providing tactful instruction, and offering ongoing assistance.

In spite of the feel-good glow of early success, it may be prudent not to declare victory too soon after making the leap into a major new technology. It’s a real accomplishment, but dealing with technical puzzles is often not the hardest part. The world of computers and code is a relatively ordered and predictable one; the world of humans is far more complex and unpredictable.

The real test of a new technology is in the long run: Is the innovation needed, viable, and sustainable? Major new technologies often require significant ongoing organizational commitments and a willingness to measure success and failure with objectivity and to take corrective action as required. For participative technologies such as Library 2.0 and institutional repositories, long-run success requires motivating users as well as staff to make behavioral changes that persist long after the excitement of the new wears off.

Managing Digitization Activities, SPEC Kit 294

The Association of Research Libraries has published Managing Digitization Activities, SPEC Kit 294. The table of contents and executive summary are freely available.

Here are some highlights from the announcement:

This survey was distributed to the 123 ARL member libraries in February 2006. Sixty-eight libraries (55%) responded to the survey, of which all but two (97%) reported having engaged in digitization activities. Only one respondent reported having begun digitization activities prior to 1992; five other pioneers followed in 1992. From 1994 through 1998 there was a steady increase in the number of libraries beginning digital initiatives; 30 joined the pioneers at the rate of three to six a year. There was a spike of activity at the turn of the millennium that reached a high in 2000, when nine libraries began digital projects. Subsequently, new start-ups have slowed, with only an additional one to five libraries beginning digitization activities each year.

The primary factor that influenced the start up of digitization activities was the availability of grant funding (39 responses or 59%). Other factors that influenced the commencement of these activities were the addition of new staff with related skills (50%), staff receiving training (44%), the decision to use digitization as a preservation option (42%), and the availability of gift monies (29%). . . . .

Only four libraries reported that their digitization activities are solely ongoing functions; the great majority (60 or 91%) reported that their digitization efforts are a combination of ongoing library functions and discrete, finite projects.
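
The percentages in the announcement use two different bases, which is easy to miss. Here is a small Python sketch of the arithmetic. That the later figures are taken over the 66 libraries with digitization activities is my inference from the numbers, not something the announcement states.

```python
def percent(part, whole):
    """Percentage of whole represented by part, rounded to a whole number."""
    return round(100 * part / whole)


members = 123                 # ARL member libraries surveyed
respondents = 68              # libraries that responded
digitizing = respondents - 2  # "all but two" reported digitization activities

if __name__ == "__main__":
    print(percent(respondents, members))     # 55: response rate
    print(percent(digitizing, respondents))  # 97: respondents that digitize
    # Assumption: the remaining figures use the 66 digitizing libraries as the base.
    print(percent(39, digitizing))           # 59: cited grant funding
    print(percent(60, digitizing))           # 91: ongoing functions plus projects
```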

Notre Dame Institutional Digital Repository Phase I Final Report

The University of Notre Dame Libraries have issued a report about their year-long institutional repository pilot project. There is an abbreviated HTML version and a complete PDF version.

From the Executive Summary:

Here is the briefest of summaries regarding what we did, what we learned, and where we think future directions should go:

  1. What we did—In a nutshell we established relationships with a number of content groups across campus: the Kellogg Institute, the Institute for Latino Studies, Art History, Electrical Engineering, Computer Science, Life Science, the Nanovic Institute, the Kaneb Center, the School of Architecture, FTT (Film, Television, and Theater), the Gigot Center for Entrepreneurial Studies, the Institute for Scholarship in the Liberal Arts, the Graduate School, the University Intellectual Property Committee, the Provost’s Office, and General Counsel. Next, we collected content from many of these groups, "cataloged" it, and saved it into three different computer systems: DigiTool, ETD-db, and DSpace. Finally, we aggregated this content into a centralized cache to provide enhanced browsing, searching, and syndication services against the content.
  2. What we learned—We essentially learned four things: 1) metadata matters, 2) preservation now, not later, 3) the IDR requires dedicated people with specific skills, 4) copyright raises the largest number of questions regarding the fulfillment of the goals of the IDR.
  3. Where we are leaning in regards to recommendations—The recommendations take the form of a "Chinese menu" of options, and the options are be grouped into "meals." We recommend the IDR continue and include: 1) continuing to do the Electronic Theses & Dissertations, 2) writing and implementing metadata and preservation policies and procedures, 3) taking the Excellent Undergraduate Research to the next level, and 4) continuing to implement DigiTool. There are quite a number of other options, but they may be deemed too expensive to implement.

Blackwell Synergy Based on Literatum Goes Live

Blackwell Publishing has released a new version of Blackwell Synergy, which utilizes Atypon’s Literatum software.

From the press release:

Blackwell Synergy enables its users to search 1 million articles from over 850 leading scholarly journals across the sciences, social sciences, humanities and medicine. The redesign provides easier navigation, faster loading times and improved access to tools for researchers, as well as meeting the latest accessibility standards (ADA section 508 and W3C’s WAI-AA).

Recently, the University of Chicago Press picked Atypon as a technology partner to provide an e-publishing platform for its online journals.

OCLC Openly Informatics Link Evaluator for Firefox

OCLC Openly Informatics has announced a free link checking plug-in for Firefox called Link Evaluator.

Here is a brief description from the Link Evaluator page:

Link Evaluator is a Firefox extension designed to help users evaluate the availability of online resources linked to from a given Web page. When started, it automatically follows all links on the current page, and assesses the responses of each URL (link). . . .

After each link is checked, it is highlighted with a color based on the relative success of the result: green for fully successful, shades of yellow for partly successful, and red for unsuccessful.

It requires Mozilla Firefox version 1.5 (or later).
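
To make the color-coding idea concrete, here is a minimal Python sketch that maps a link's HTTP status code to a highlight color. The description above does not say how Link Evaluator actually scores a response, so the thresholds below are purely an assumption for illustration, and the function names are hypothetical.

```python
def classify_link(status_code):
    """Map an HTTP status code (or None for no response) to a highlight color.

    Assumed mapping, not the extension's actual rules:
    2xx -> green, 3xx -> yellow, anything else or no response -> red.
    """
    if status_code is None:
        return "red"
    if 200 <= status_code < 300:
        return "green"
    if 300 <= status_code < 400:
        return "yellow"
    return "red"


def highlight_links(results):
    """Given {url: status code or None}, return {url: color}."""
    return {url: classify_link(status) for url, status in results.items()}


if __name__ == "__main__":
    # Hypothetical check results for three links on a page.
    checked = {
        "http://example.org/ok": 200,
        "http://example.org/moved": 301,
        "http://example.org/missing": 404,
    }
    print(highlight_links(checked))
```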


The University of Michigan Press and the Scholarly Publishing Office of the University of Michigan Library, working together as the Michigan Digital Publishing Initiative, have established digitalculturebooks, which offers free access to digital versions of its published works (print works are fee-based). The imprint focuses on "the social, cultural, and political impact of new media."

The objectives of the imprint are to:

  • develop an open and participatory publishing model that adheres to the highest scholarly standards of review and documentation;
  • study the economics of Open Access publishing;
  • collect data about how reading habits and preferences vary across communities and genres;
  • build community around our content by fostering new modes of collaboration in which the traditional relationship between reader and writer breaks down in creative and productive ways.

Library Journal Academic Newswire notes in its article about digitalculturebooks:

While press officials use the term "open access," the venture is actually more "free access" than open at this stage. Open access typically does not require permission for reuse, only a proper attribution. UM director Phil Pochoda told the LJ Academic Newswire that, while no final decision has been made, the press’s "inclination is to ask authors to request the most restrictive Creative Commons license" for their projects. That license, he noted, requires attribution and would not permit commercial use, such as using it in a subsequent for-sale product, without permission. The Digital Culture Books web site currently reads that "permission must be received for any subsequent distribution."

The imprint’s first publication is The Best of Technology Writing 2006.

(Prior postings about digital presses.)

Has Authorama "Set Free" 100 Public Domain Books from Google Book Search?

In a posting on Google Blogoscoped, Philipp Lenssen has announced that he has put up 100 public domain books from Google Book Search on Authorama.

Regarding his action, Lenssen says:

In other words, Google imposes restrictions on these books which the public domain does not impose*. I’m no lawyer, and maybe Google can print whatever guidelines they want onto those books. . . and being no lawyer, most people won’t know if the guidelines are a polite request, or legally enforceable terms**. But as a proof of concept—the concept of the public domain—I’ve now ‘set free’ 100 books I downloaded from Google Book Search by republishing them on my public domain books site, Authorama. I’m not doing this out of disrespect for the Google Books program (which I think is cool, and I’ll credit Google on Authorama) but out of respect for the public domain (which I think is even cooler).

Since Lenssen has retained Google’s usage guidelines in the e-books, it’s unclear how they have been "set free," in spite of the following statement on Authorama’s Books from Google Book Search page:

The following books were downloaded from Google Book Search and are made available here as public domain. You can download, republish, mix and mash these books, for private or public, commercial or non-commercial use.

Leaving aside the above statement, Lenssen’s action appears to violate the following Google usage guideline, where Google asks that users:

Make non-commercial use of the files We designed Google Book Search for use by individuals, and we request that you use these files for personal, non-commercial purposes.

However, in the above guideline, Google uses the word "request," which suggests voluntary, rather than mandatory, compliance. Google also requests attribution and watermark retention:

Maintain attribution The Google ‘watermark’ you see on each file is essential for informing people about this project and helping them find additional materials through Google Book Search. Please do not remove it.

Note the use of the word "please."

It’s not clear how to determine if Google’s watermark remains in the Authorama files, but, given the retention of the usage guidelines, it likely does.

So, do Google’s public domain books really need to be "set free"? In its usage guidelines, Google appears to make compliance requests, not compliance requirements. Are such requests binding or not? If they are meant to be binding, the language could be clearer. For example, here’s a possible rewording:

Make non-commercial use of the files Google Book Search is for individual use only, and its files can only be used for personal, non-commercial purposes. All other use is prohibited.

Landmark Digital Humanities Book Is Now Freely Available

A Companion to Digital Humanities is now freely available in digital form.

This important 2004 book was edited by Susan Schreibman, Ray Siemens, and John Unsworth. It includes chapters by such notable experts as Howard Besser, Greg Crane, Susan Hockey, Willard McCarty, Allen H. Renear, Abby Smith, C. M. Sperberg-McQueen, John Unsworth, and Perry Willett (to name just a few).