Digital Preservation via Emulation at Koninklijke Bibliotheek

In a two-year (2005-2007) joint project with the Nationaal Archief of the Netherlands, the Koninklijke Bibliotheek is developing an emulation system that will allow digital objects in outmoded formats to be used in their original form. Regarding the emulation approach, the Koninklijke Bibliotheek says:

Emulation is difficult, the main reason why it is not applied on a large scale. Developing an emulator is complex and time-consuming, especially because the emulated environment must appear authentic and must function accurately as well. When future users are interested in the contents of a file, migration remains the better option. When it is the authentic look and feel and functionality of a file they are after, emulation is worth the effort. This can be the case for PDF documents or websites. For multimedia applications, emulation is in fact the only suitable permanent access strategy.

J. R. van der Hoeven and H. N. van Wijngaarden’s paper "Modular Emulation as a Long-Term Preservation Strategy for Digital Objects" provides an overview of the emulation approach.

In a related development, a message to padiforum-l on 11/17/06 by Remco Verdegem of the Nationaal Archief of the Netherlands reported on a recent Emulation Expert Meeting, which issued a statement noting the following advantages of emulation for digital preservation purposes:

  • It preserves and permits access to each digital artifact in its original form and format; it may be the only viable approach to preserving digital artifacts that have significant executable and/or interactive behavior.
  • It can preserve digital artifacts of any form or format by saving the original software environments that were used to render those artifacts. A single emulator can preserve artifacts in a vast range of arbitrary formats without the need to understand those formats, and it can preserve huge corpuses without ever requiring conversion or any other processing of individual artifacts.
  • It enables the future generation of surrogate versions of digital artifacts directly from their original forms, thereby avoiding the cumulative corruption that would result from generating each such future surrogate from the previous one.
  • If all emulators are written to run on a stable, thoroughly-specified "emulation virtual machine" (EVM) platform and that virtual machine can be implemented on any future computer, then all emulators can be run indefinitely.
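The EVM idea in the last bullet can be illustrated with a small sketch. Everything below is hypothetical (the class names and the toy instruction set are invented for illustration, not taken from any actual EVM specification): an emulator is written once against a small, stable virtual-machine interface, and only that thin layer needs to be reimplemented on each future computer.

```python
# Hypothetical sketch of the "emulation virtual machine" (EVM) idea:
# emulators target a small, stable interface; only that interface is
# reimplemented on each future host platform.

class EVM:
    """The stable, thoroughly specified virtual machine interface."""
    def read_byte(self, addr):
        raise NotImplementedError
    def write_byte(self, addr, value):
        raise NotImplementedError

class InMemoryEVM(EVM):
    """One possible EVM implementation on today's hardware."""
    def __init__(self, size=65536):
        self.mem = bytearray(size)
    def read_byte(self, addr):
        return self.mem[addr]
    def write_byte(self, addr, value):
        self.mem[addr] = value & 0xFF

class TinyEmulator:
    """An emulator for some obsolete platform, written once against EVM.
    It runs unchanged on any future EVM implementation."""
    def __init__(self, evm):
        self.evm = evm
    def run(self, program):
        # Toy instruction set: ("STORE", addr, value) and ("ADD", addr, value)
        for op, addr, value in program:
            if op == "STORE":
                self.evm.write_byte(addr, value)
            elif op == "ADD":
                self.evm.write_byte(addr, self.evm.read_byte(addr) + value)

emu = TinyEmulator(InMemoryEVM())
emu.run([("STORE", 0, 40), ("ADD", 0, 2)])
print(emu.evm.read_byte(0))  # 42
```

The point of the design is that `TinyEmulator` never touches real hardware directly; porting the whole preserved corpus to a future machine means rewriting only the `EVM` implementation, not any of the emulators.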

OAI’s Object Reuse and Exchange Initiative

The Open Archives Initiative has announced its Object Reuse and Exchange (ORE) initiative:

Object Reuse and Exchange (ORE) will develop specifications that allow distributed repositories to exchange information about their constituent digital objects. These specifications will include approaches for representing digital objects and repository services that facilitate access and ingest of these representations. The specifications will enable a new generation of cross-repository services that leverage the intrinsic value of digital objects beyond the borders of hosting repositories. . . . its real importance lies in the potential for these distributed repositories and their contained objects to act as the foundation of a new digitally-based scholarly communication framework. Such a framework would permit fluid reuse, refactoring, and aggregation of scholarly digital objects and their constituent parts—including text, images, data, and software. This framework would include new forms of citation, allow the creation of virtual collections of objects regardless of their location, and facilitate new workflows that add value to scholarly objects by distributed registration, certification, peer review, and preservation services. Although scholarly communication is the motivating application, we imagine that the specifications developed by ORE may extend to other domains.

OAI-ORE is being funded by the Andrew W. Mellon Foundation for a two-year period.

Presentations from the Augmenting Interoperability across Scholarly Repositories meeting are a good source of further information about the thinking behind the initiative as is the "Pathways: Augmenting Interoperability across Scholarly Repositories" preprint.

Forget RL, Try an Avatar Instead

Real life (RL) is so 20th century. Virtual worlds are where it’s at. At least, that’s what readers of BusinessWeek‘s recent "My Virtual Life" article by Robert D. Hof may quickly come to believe.

You may think that virtual worlds are just kids’ stuff. Tell that to Anshe Chung, who has made over $250,000 buying and renting virtual real estate in Linden Lab’s Second Life. Or Chris Mead, whose Second Life couples avatars earn him a cool $90,000 per year. Or the roughly 170,000 Second Life users who spent about $5 million in real dollars on virtual goods in January 2006.

How about this? IGE Ltd. estimates that, across all virtual worlds, users spent over $1 billion in real dollars on virtual goods last year.

While most users may be buying virtual clothes, land, and entertainment and other services, conventional companies are exploring how to use virtual worlds for training, meetings, and other purposes, plus trying to snag regular users’ interest with offerings such as Wells Fargo’s Stagecoach Island.

For the library slant on Second Life, try the Second Life Library 2.0 blog and don’t miss the Alliance Second Life Library 2.0 introduction on 5/31/06 from 2:00 PM-3:30 PM. And don’t forget to browse the Second Life Library 2.0 image pool at Flickr.

Oh, brave new world that has such avatars in it!

Source: Hof, Robert D. "My Virtual Life." BusinessWeek, 1 May 2006, 72-82.

Microsoft’s Windows Live Academic Search

Microsoft will be releasing Windows Live Academic Search shortly (I was recently told Wednesday; the blog buzz is saying tomorrow).

As is typical with such software projects, the team is doing some last minute tweaking before release. So, I won’t try to describe the system in any detail at this point, except to say that it integrates access to published articles with e-prints and other open access materials, it provides a reference export capability, there’s a cool optional two-pane view (short bibliographic information on the left; full bibliographic information and abstract on the right), and it supports search "macros" (user-written search programs).

What I will say is this: Microsoft made a real effort to get significant, honest input from the librarian and publisher communities during the development process. I know, because, now that the nondisclosure agreement has been lifted, I can say that I was one of the librarians who provided such input on an unpaid basis. I was very impressed by how carefully the development team listened to what we had to say, how sharp and energetic they were, how they really got the Web 2.0 concept, and how deeply committed they were to creating the best product possible. Having read Microserfs, I had a very different mental picture of Microsoft than the reality I encountered.

Needless to say, there were lively exchanges of views between librarians and publishers when open access issues were touched upon. My impression is that the team listened to both sides and tried to find the happy middle ground.

When it’s released, Windows Live Academic Search won’t be the perfect answer to your open access search engine dreams (what system is?), and Microsoft knows that there are missing pieces. But I think it will give Google Scholar a run for its money. I, for one, heartily welcome it, and I think it’s a good base to build upon, especially if Microsoft continues to solicit and seriously consider candid feedback from the library and publisher communities (and it appears that it will).

Customers Welcome RFID-Enabled Cards. . . with Hammers and Microwave Ovens

The Wall Street Journal reports that customers lack enthusiasm for RFID credit cards due to privacy and fraud concerns. In fact, they are devising novel ways to disable RFID chips, including using hammers and microwave ovens to smash or fry them. FoeBud, a German digital rights group, sells a variety of devices to detect or disable the chips. Sensing a hot market, some companies have joined the bandwagon with new products (e.g., RFIDwasher) that do the job more safely than a microwave oven, which can be a fire hazard when used for RFID frying. For those who don’t want to tamper with their RFID cards, they can buy shielded wallets and passport cases from DIFRWEAR that block signals when closed.

As libraries begin to embrace RFID technology, these concerns from the credit card sector are worth watching and may give libraries pause.

Source: Warren, Susan. "Why Some People Put These Credit Cards in the Microwave." The Wall Street Journal, 10 April 2006, A1, A16.

Bar Code 2.0

Now you can store a 20-second video that can be viewed on a cell phone in a colored bar code the size of a postage stamp. Or, if the cell phone is connected to the Internet, use the bar code to launch a URL. To use the bar code, the user simply snaps a picture of it with the phone’s camera. Content Idea of Asia invented this new bar code technology, which can store 600 KB, and plans to offer it later this year.
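The article doesn’t explain how the new code reaches 600 KB, but the reason color helps can be sketched with back-of-envelope arithmetic. The grid size and color counts below are illustrative assumptions, not the actual (undisclosed) encoding:

```python
import math

def bits_per_cell(num_colors):
    # A cell that can take on n distinguishable colors carries log2(n) bits.
    return math.log2(num_colors)

def capacity_bytes(cells, num_colors):
    # Raw capacity of a grid of cells, ignoring error-correction overhead.
    return cells * bits_per_cell(num_colors) / 8

# The same 100 x 100 grid, black-and-white vs. 16 colors:
grid = 100 * 100  # 10,000 cells
print(capacity_bytes(grid, 2))   # 1250.0 bytes
print(capacity_bytes(grid, 16))  # 5000.0 bytes -- 4x the density
```

The general principle is just that each extra distinguishable color multiplies the symbol alphabet, so data density grows with the logarithm of the number of colors a camera can reliably tell apart.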

Source: Hall, Kenji. "The Bar Code Learns Some Snazzy New Tricks." BusinessWeek, 3 April 2006, 113.

Gary Flake’s "Internet Singularity"

Dr. Gary William Flake, Microsoft technical fellow, gave a compelling and lively presentation at SearchChamps V4 entitled "How I Learned to Stop Worrying and Love the Imminent Internet Singularity."

Flake’s "Internet Singularity" is "the idea that a deeper and tighter coupling between the online and offline worlds will accelerate science, business, society, and self-actualization."

His PowerPoint presentation is text heavy enough that you should be able to follow his argument fairly well. (Ironically, he had apparently received some friendly criticism from colleagues about the very wordiness of the PowerPoint that allows it to stand alone.)

I’m not going to try to recap his presentation here. Rather, I urge you to read it, and I’ll discuss a missing factor from his model that may, to some extent, act as a brake on the type of synergistic technical progress that he envisions.

That factor is the equally accelerating growth of what Lawrence Lessig calls the "permission culture," which is "a culture in which creators get to create only with the permission of the powerful, or of creators from the past."

Lessig discusses this topic with exceptional clarity in his book Free Culture: How Big Media Uses Technology and the Law to Lock Down Culture and Control Creativity (HTML, PDF, or printed book; Lessig’s book is under an Attribution-NonCommercial 1.0 License).

Lessig is a Stanford law professor, but Free Culture is not a dry legal treatise about copyright law. Rather, it is a carefully argued, highly readable, and impassioned plea that society needs to reexamine the radical shift that has occurred in legal thinking about the mission and nature of copyright since the late 19th century, especially since there are other societal factors that heighten the effect of this shift.

Lessig describes the current copyright situation as follows:

For the first time in our tradition, the ordinary ways in which individuals create and share culture fall within the reach of the regulation of the law, which has expanded to draw within its control a vast amount of culture and creativity that it never reached before. The technology that preserved the balance of our history—between uses of our culture that were free and uses of our culture that were only upon permission—has been undone. The consequence is that we are less and less a free culture, more and more a permission culture.

How did we get here? Lessig traces the following major changes:

In 1790, the law looked like this:

                  PUBLISH    TRANSFORM
  Commercial      ©          Free
  Noncommercial   Free       Free

The act of publishing a map, chart, and book was regulated by copyright law. Nothing else was. Transformations were free. And as copyright attached only with registration, and only those who intended to benefit commercially would register, copying through publishing of noncommercial work was also free.

By the end of the nineteenth century, the law had changed to this:

                  PUBLISH    TRANSFORM
  Commercial      ©          ©
  Noncommercial   Free       Free

Derivative works were now regulated by copyright law—if published, which again, given the economics of publishing at the time, means if offered commercially. But noncommercial publishing and transformation were still essentially free.

In 1909 the law changed to regulate copies, not publishing, and after this change, the scope of the law was tied to technology. As the technology of copying became more prevalent, the reach of the law expanded. Thus by 1975, as photocopying machines became more common, we could say the law began to look like this:

                  PUBLISH    TRANSFORM
  Commercial      ©          ©
  Noncommercial   ©/Free     Free

The law was interpreted to reach noncommercial copying through, say, copy machines, but still much of copying outside of the commercial market remained free. But the consequence of the emergence of digital technologies, especially in the context of a digital network, means that the law now looks like this:

                  PUBLISH    TRANSFORM
  Commercial      ©          ©
  Noncommercial   ©          ©
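As a compact restatement, the progression traced by the four tables above can be encoded as data and the law’s growing reach computed from it. The matrix values come directly from Lessig’s tables; the era labels and the "(c)" shorthand are mine:

```python
# Lessig's 2x2 matrix over time: which (market, use) combinations
# copyright regulates ("(c)") versus leaves free in each era.
eras = {
    "1790": {
        "commercial publish": "(c)", "commercial transform": "free",
        "noncommercial publish": "free", "noncommercial transform": "free",
    },
    "late 1800s": {
        "commercial publish": "(c)", "commercial transform": "(c)",
        "noncommercial publish": "free", "noncommercial transform": "free",
    },
    "1975": {
        "commercial publish": "(c)", "commercial transform": "(c)",
        "noncommercial publish": "(c)/free", "noncommercial transform": "free",
    },
    "now": {
        "commercial publish": "(c)", "commercial transform": "(c)",
        "noncommercial publish": "(c)", "noncommercial transform": "(c)",
    },
}

def regulated_share(era):
    """Fraction of the four quadrants that copyright touches in an era."""
    return sum("(c)" in use for use in eras[era].values()) / 4

for era in ["1790", "late 1800s", "1975", "now"]:
    print(era, regulated_share(era))
# 1790 0.25, late 1800s 0.5, 1975 0.75, now 1.0
```

The computed shares make Lessig’s trajectory explicit: from one quadrant of cultural activity regulated in 1790 to all four today.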

Lessig points out one of the ironies of copyright law’s development during the last few decades: the entertainment industries that have been the driving force behind moving the law from the permissive to permission side of the spectrum benefited from looser regulation in their infancies:

If "piracy" means using value from someone else’s creative property without permission from that creator—as it is increasingly described today—then every industry affected by copyright today is the product and beneficiary of a certain kind of piracy. Film, records, radio, cable TV. . . . The list is long and could well be expanded. Every generation welcomes the pirates from the last. Every generation—until now.

Returning to Flake’s model, what will the effect of a permission culture be on innovation? Lessig says:

This wildly punitive system of regulation will systematically stifle creativity and innovation. It will protect some industries and some creators, but it will harm industry and creativity generally. Free market and free culture depend upon vibrant competition. Yet the effect of the law today is to stifle just this kind of competition. The effect is to produce an overregulated culture, just as the effect of too much control in the market is to produce an overregulated market.

New knowledge typically builds on old knowledge, new content on old content. "Democratization of content" works if the content is completely new, if it builds on content that is in the public domain or under a Creative Commons (or similar) license, or if fair use can be invoked without it being stopped by DRM or lawsuits. If not, copyright permissions granted or withheld may determine if a digital "Rip, Mix, Burn" (or as some say "Rip, Mix, Learn") meme lives or dies and the full transformational potential of digital media are realized or not.

If you are concerned about the growing restrictions that copyright law imposes on society, I highly recommend that you read Free Culture.

Library 2.0

Walt Crawford has published a mega-issue of Cites & Insights: Crawford at Large on Library 2.0 that presents short essays on the topic by a large number of authors, plus his own view. At Walt’s request, I dashed off the following:

Blogs, tagging, Wikis, oh my! Whether "Library 2.0" truly transforms libraries’ Web presence or not, one thing is certain: the participative aspect of 2.0 represents a fundamental, significant change. Why? Because we will ask patrons to become content creators, not just content consumers. And they will be interacting with each other, not just with the library. This will require what some have called "radical trust," meaning who knows what they will do or say, but the rich rewards of collective effort outweigh the risks. Or so the theory goes. Recent Wikipedia troubles suggest that all is not peaches and cream in Web 2.0 land. But, no one can deny (ok, some can) that participative systems can have enormous utility far beyond what one would have thought. Bugaboos, such as intellectual property violations, libel, and fiction presented as fact, of course, remain, leading to liability and veracity concerns that result in nagging musings over control issues. And it all is mixed in a tasty stew of enormous promise and some potential danger. This is a trend worth keeping a close eye on.

The Sony BMG Rootkit Fiasco Redux

There’s a new development in the Sony BMG Rootkit story (for background see my prior posting and update comment): Sony BMG has reached a settlement (awaiting court approval) regarding the class action lawsuit about its use of DRM (Digital Rights Management) software after virtually "round-the-clock settlement negotiations" (on December 1st numerous individual lawsuits were given class action status). The short story is that XCP-protected CDs will be replaced with DRM-free CDs and customers will be given download/cash incentives to exchange the discs; there will be no recall for MediaMax-protected CDs, but buyers will get song MP3s and an album download. You can get details at "Sony Settles ‘Rootkit’ Class Action Lawsuit."

Since my December 4th update comment, there have been a few articles/blog postings of note about this controversy. "Summary of Claims against Sony-BMG" provides an analysis by Fred von Lohmann of EFF of "the various legal theories that have been brought against Sony-BMG over the CD copy-protection debacle." In "Sony CDs and the Computer Fraud and Abuse Act," Ed Felten considers whether Sony BMG, First4Internet, and SunnComm/MediaMax "violated the Computer Fraud and Abuse Act (CFAA), which is the primary Federal law banning computer intrusions and malware" (he notes that he is not a lawyer), and, in "Inside the MediaMax Prospectus," he highlights some interesting aspects of this document. "New Spyware Claim against Sony BMG" describes a new claim added to the Texas lawsuit by Attorney General Greg Abbott: "MediaMax software . . . violated state laws because it was downloaded even if users rejected a license agreement." Finally, "Just Let Us Play the Movie" examines the fallout for the film industry and DRM use in general.

In other recent IP news, two items of interest: "France May Sanction Unfettered P2P Downloads" (mon dieu!) and "Pro-Hollywood Bill Aims to Restrict Digital Tuners."

Machinima

Here’s an interesting trend: using video games to create animated digital films. It’s called "Machinima." In one technique, the 3-D animation tools built into games to allow users to extend the games (e.g., create new characters) are used to generate new 3-D films. Of course, it can be more complicated than this: the Machinima FAQ outlines other strategies in layperson’s terms.

BusinessWeek has a short, interesting article on Machinima ("France: Thousands of Young Spielbergs") that describes one social commentary Machinima film (The French Democracy), noting that it got over one million hits in November. It also quotes Paul Marino, executive director of the Academy of Machinima Arts & Sciences as saying: "This is to the films what blogs are to the written media."

If you want to check out more Machinima films, try the 2005 Machinima Film Festival or Machinima.com (try "download" if "watch" doesn’t work).

Machinima is yet another example of how users want to create derivative works from digital media and how powerful a capability that can be—if intellectual property rights owners don’t prohibit it. Since the first Machinima movie was created in 1996, it appears that the video game industry has not moved to squash this movement, and, needless to say, it has thrived. However, this state of affairs may simply reflect Machinima’s low profile: A recent Wired News article, which notes that Machinima has been employed in commercials and music videos, indicates that Doug Lombardi, Director of Marketing at Valve (a video game software company), feels that: "As the films become commercially viable, machinima filmmakers are going to butt up against copyright law."

The Sony BMG Rootkit Fiasco

When Mark Russinovich posted "Sony, Rootkits and Digital Rights Management Gone Too Far," he helped trigger a firestorm of subsequent criticism about Sony BMG Music Entertainment’s use of First4Internet’s digital rights protection software on some of its music CDs. It was bad enough that one of the planet’s largest entertainment companies was perceived as hacking users’ computers with "rootkits" in the name of copy protection, but then the EFF posted an analysis of the license agreement associated with the CDs (see "Now the Legalese Rootkit: Sony-BMG’s EULA"). Things got worse when real hackers started exploiting the DRM software (see "First Trojan Using Sony DRM Spotted"). Then the question posed by the EFF’s "Are You Infected by Sony-BMG’s Rootkit?" posting became a bit more urgent. And the lawsuits started (see "Sony Sued For Rootkit Copy Protection"). Sony BMG suspended production (see "Sony Halts Production of ‘Rootkit’ CDs"), but said it would continue using DRM software from SunnComm (see "Sony Shipping Spyware from SunnComm, Too"). Among others, Microsoft said it will try to eradicate the hard-to-kill DRM software (see "Microsoft Will Wipe Sony’s ‘Rootkit’").

What would drive Sony BMG to such a course of action? Blame that slippery new genie, digital media, which seems to want information to not only be free, but infinitely mutable into new works as well. Once it’s granted a few wishes, it’s hard to get it back in the bottle, and the one wish it won’t grant is that the bottle had never been opened in the first place.

Faced with rampant file sharing that is based on CDs, music companies now want to nip the rip in the bud: put DRM software on customers’ PCs that will control how they use a CD’s digital tracks. Of course, it would be better from their perspective if such controls were built into the operating system, but, if not, a little deep digital surgery can add the lacking functionality.

The potential result for consumers is multiple DRM modifications to their PCs that may conflict with each other, open security holes, deny legitimate use, and have other negative side effects.

In the hullabaloo over the technical aspects of the Sony BMG DRM fiasco, it’s important not to lose sight of this: your CD is now licensed. First sale rights are gone, fair use is gone, and the license reigns supreme.

Pity the poor music librarian, who was already struggling to figure out how to deal with digital audio reserves. Between DRM-protected tracks from services such as iTunes and DRM-protected CDs that modify patrons’ PCs, music librarians truly "live in interesting times."

While the Sony BMG fiasco has certain serio-comic aspects to it, rest assured that music (and other entertainment companies) will eventually iron out the most obvious kinks in the context of operating systems that are designed for intrinsic DRM support and, after some bumps in the road, a new era of DRM-protected digital multimedia will dawn.

That is, it will dawn unless musicians, other digital media creators, and consumers do something about it first.

Navigating the Library Blogosphere

Needless to say, there has been rapid growth in blogging by librarians over the last few years, and the library blogosphere has become more varied and complex. Here are some directories of library web logs to help you navigate it:

Want more information about library web logs? Try Susan Herzog’s BlogBib.

Something Wiki This Way Comes

Wikis are catching on in the library world. What’s a Wiki? "The simplest online database that could possibly work." (Quote from: "Making the Case for a Wiki.")

Here are a few examples of how Wikis are being used:

If you want to dig in and learn more about Wikis, try Gerry McKiernan’s WikiBibliography.