Alternative File Formats for Storing Master Images of Digitisation Projects

Koninklijke Bibliotheek has published Alternative File Formats for Storing Master Images of Digitisation Projects.

Here's an excerpt from the "Management Summary":

The main conclusions of this study are as follows:

Reason 1: Substitution

JPEG 2000 lossless and PNG are the best alternatives for the uncompressed TIFF file format from the perspective of long-term sustainability. When the storage savings (PNG 40%, JPEG 2000 lossless 53%) and the functionality are factored in, the scale tips in favour of JPEG 2000 lossless.

Reason 2: Redigitisation Is Not Desirable

JPEG 2000 and JPEG are the best alternatives for the uncompressed TIFF file format. If no image information may be lost, then JPEG 2000 lossless and PNG are the two recommended options.

Reason 3: Master File is the Access File

JPEG 2000 lossy and JPEG with greater compression are the most suitable formats.
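To put the storage figures quoted under Reason 1 in concrete terms, here is a minimal Python sketch. The savings percentages come from the excerpt; the 100 TB collection size and the function names are hypothetical, chosen only for illustration.

```python
# Storage savings relative to uncompressed TIFF, as quoted in the excerpt.
SAVINGS = {"PNG": 0.40, "JPEG 2000 lossless": 0.53}


def compressed_size(uncompressed_tb, saving):
    """Return the storage needed after applying a fractional saving."""
    return uncompressed_tb * (1.0 - saving)


def estimate(uncompressed_tb):
    """Estimate storage per format for a collection of the given size (in TB)."""
    return {
        fmt: round(compressed_size(uncompressed_tb, saving), 2)
        for fmt, saving in SAVINGS.items()
    }


if __name__ == "__main__":
    # Hypothetical collection: 100 TB of uncompressed TIFF masters.
    for fmt, size in estimate(100.0).items():
        print(f"{fmt}: {size} TB")
```

Actual savings depend on image content, so these numbers should be read as rough planning figures rather than guarantees.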

NDIIPP's Carl Fleischhauer on Video Formatting and Preservation

The National Digital Information Infrastructure and Preservation Program at the Library of Congress has released Carl Fleischhauer's presentation on "Video Formatting and Preservation" at the Digital Library Federation 2007 Fall Forum.

Here's an excerpt about the presentation from the March 2008 issue of the Library of Congress Digital Preservation Newsletter:

Fleischhauer discussed content wrappers, bitstream encodings, metadata and format profiles for born-digital content. He also spoke about emerging reformatting practices at the Library’s new facility for audiovisual collections and a handful of notable NDIIPP projects.

NEH Awards $474,474 in Digital Humanities Start-Up Grants

The National Endowment for the Humanities has awarded $474,474 to Digital Humanities Start-Up Grants recipients.

Here's an excerpt from the press release:

Note: The We the People program encourages and strengthens the teaching, study, and understanding of American history and culture. Grants bearing this designation have been recognized for advancing the goals of this program.

ALASKA

Fairbanks

University of Alaska, Fairbanks $50,000
Digital Humanities Start-Up Grants
Project Director: Siri Tuttle
We the People Project Title: Minto Songs
Project Description: The collection, digitization, organization, and archival storage, as well as dissemination among the Minto Athabascan community, of recorded performances of Alaskan Athabascan songs.

ARIZONA

Tucson

University of Arizona $25,000
Digital Humanities Start-Up Grants
Project Director: Douglas Gann
Project Title: Virtual Vault
Project Description: Electronic access to the world's largest collection of whole pottery vessels from the American Southwest through digital renderings of Arizona State University's Pottery Vault and relevant prehistoric archaeological sites as well as interviews with anthropologists, conservators, and Native American potters.

ILLINOIS

Lake Forest

Lake Forest College $25,000
Digital Humanities Start-Up Grants
Project Director: Davis Schneiderman
We the People Project Title: Virtual Burnham Initiative
Project Description: The development of the Virtual Burnham Initiative (VBI), a multimedia project that would examine the history and legacy of Daniel H. Burnham's and Edward H. Bennett's Plan of Chicago (1909).

MARYLAND

College Park

University of Maryland, College Park $11,708
Digital Humanities Start-Up Grants
Project Director: Matthew Kirschenbaum
Project Title: Approaches to Managing and Collecting Born-Digital Literary Materials for Scholarly Use
Project Description: A series of planning meetings and site visits aimed at developing archival tools and best practices for preserving born-digital documents produced by contemporary authors.

MASSACHUSETTS

Boston

University of Massachusetts, Boston $24,748
Digital Humanities Start-Up Grants
Project Director: Joanne Riley
We the People Project Title: Online Social Networking for the Humanities: the Massachusetts Studies Network Prototype
Project Description: The development and evaluation of a social networking platform for the members of the statewide Massachusetts Studies Project.

Norton

Wheaton College $41,950
Digital Humanities Start-Up Grants
Project Director: Mark LeBlanc
Project Title: Pattern Recognition through Computational Stylistics: Old English and Beyond
Project Description: Development of a prototypical suite of computational tools and statistical analyses to explore the corpus of Old English literature using the genomic approach of tracing information-rich patterns of letters as well as that of literary analysis and interpretation.

MISSISSIPPI

Mississippi State

Mississippi State University $50,000
Digital Humanities Start-Up Grants
Project Director: Paul Jacobs
Project Title: Distributed Archives Transaction System
Project Description: Development of open source web tools for accessing online digitized collections in the humanities via a system that communicates with multiple database types while protecting the integrity of the original data sets.

NEW YORK

Brooklyn

Unaffiliated Independent Scholar $23,750
Digital Humanities Start-Up Grants
Project Director: Daniel Visel
Project Title: Sophie Search Gateway
Project Description: The development of an interoperable portal within the Web authoring program, "Sophie," for locating and incorporating multi-media sources from the Internet Archive.

Hempstead

Hofstra University $23,591
Digital Humanities Start-Up Grants
Project Director: John Bryant
We the People Project Title: Melville, Revision, and Collaborative Editing: Toward a Critical Archive
Project Description: The development of the TextLab scholarly editing tool to allow for analysis of texts that exist in multiple versions or editions, beginning with the Melville Electronic Library.

New York City

New York University $49,657
Digital Humanities Start-Up Grants
Project Director: Brian Hoffman
Project Title: MediaCommons: Social Networking Tools for Digital Scholarly Communication
Project Description: Development of a set of networking software tools to support a "peer-to-peer" review structure for MediaCommons, a scholarly publishing network in the digital humanities.

RHODE ISLAND

Providence

Brown University $49,992
Digital Humanities Start-Up Grants
Project Director: Julia Flanders
Project Title: Encoding Names for Contextual Exploration in Digital Thematic Research Collections
Project Description: The advancement of humanities text encoding and research by refining and expanding the automated representation of personal names and their contexts.

TEXAS

Austin

University of Texas, Austin $49,251
Digital Humanities Start-Up Grants
Project Director: Samuel Baker
Project Title: The eCommentary Machine Project
Project Description: Development of a web-based collaborative commentary and annotation tool.

VIRGINIA

Charlottesville

University of Virginia $49,827
Digital Humanities Start-Up Grants
Project Director: Scot French
We the People Project Title: Jefferson's Travels: A Digital Journey Using the HistoryBrowser
Project Description: Development of an interactive web-based tool to integrate primary documents, dynamic maps, and related information in the study of history, with the prototype to be focused on Thomas Jefferson's trip to England in 1786.

JPEG 2000—A Practical Digital Preservation Standard?

The Digital Preservation Coalition has published JPEG 2000—A Practical Digital Preservation Standard?.

Here's an excerpt from the "Executive Summary":

With JPEG 2000, an application can access and decode only as much of the compressed image as needed to perform the task at hand. This means a viewer, for example, can open a gigapixel image almost instantly by retrieving and decompressing a low resolution, display-sized image from the JPEG 2000 codestream.

JPEG 2000 also improves a user’s ability to interact with an image. The zoom, pan, and rotate operations that users increasingly expect in networked image systems are performed dynamically by accessing and decompressing just those parts of the JPEG 2000 codestream containing the compressed image data for the region of interest. The JPEG 2000 data can be either converted to JPEG and delivered for viewing with a standard image browser or delivered to a native JPEG 2000 viewer using the JPIP client-server protocol, developed to support the JPEG 2000 feature set.

Using a single JPEG 2000 master to satisfy user requests for dynamic viewing reduces storage costs and management overhead by eliminating the need to maintain multiple derivatives in a repository.

Beyond image access and distribution, JPEG 2000 is being used increasingly as a repository and archival image format. What is remarkable is that many repositories are storing “visually lossless” JPEG 2000 files: the compression is lossy and irreversible but the artefacts are not noticeable and do not interfere with the performance of applications. Compared to uncompressed TIFF, visually lossless JPEG 2000 compression can reduce the amount of storage by an order of magnitude or more.
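The "decode only as much as needed" idea in the excerpt can be sketched in a few lines of Python. JPEG 2000 stores an image as a series of resolution levels, each roughly half the width and height of the next. The sketch below picks how many levels a viewer could skip while still filling a display. The function names, the five-level default, and the example dimensions are assumptions for illustration; this is not the API of any particular JPEG 2000 library.

```python
def reduced_size(width, height, levels):
    """Image size after discarding `levels` resolution levels (ceiling division)."""
    scale = 2 ** levels
    return (-(-width // scale), -(-height // scale))


def levels_to_discard(full_w, full_h, view_w, view_h, max_levels=5):
    """Largest number of levels that can be skipped while the decoded
    image still covers the requested display size.

    max_levels is the number of decomposition levels assumed to be
    present in the codestream (a hypothetical default here).
    """
    levels = 0
    while levels < max_levels:
        w, h = reduced_size(full_w, full_h, levels + 1)
        if w < view_w or h < view_h:
            break
        levels += 1
    return levels


if __name__ == "__main__":
    # Hypothetical gigapixel master shown on a 1920 x 1080 display.
    skip = levels_to_discard(40000, 25000, 1920, 1080)
    print(skip, reduced_size(40000, 25000, skip))
```

In this example the viewer would decode an image of about 2500 x 1563 pixels instead of the full 40000 x 25000, which is why opening a very large master can feel nearly instant.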

Digital Video on JoVE (Journal of Visualized Experiments)

In a digital video from the Google Tech Talks series, Moshe Pritsker, Editor-in-Chief of JoVE (Journal of Visualized Experiments), discusses that video-based journal.

Here's an excerpt from the abstract:

Contrasting the rapid advancement of scientific research itself, scientific communication still heavily relies on traditional print journals. Print journals, however, lack the necessary characteristics to enable an effective transfer of knowledge, which is significantly impeding scientific progress. Addressing this problem, the Journal of Visualized Experiments (JoVE, www.jove.com) implemented a novel, video-based approach to scientific publishing, based on visualization of experimental studies. Created with the participation of scientists from leading research institutions (e.g. Harvard, MIT, and Princeton), JoVE provides solutions to the "bottleneck" of the contemporary biological research: transparency and reproducibility of biological experiments. JoVE has so far released 9 monthly issues that include over 150 video-protocols on experimental approaches in developmental biology, neuroscience, microbiology and other fields.

Goodbye Digital Music DRM, Goodbye RIAA?, and Hello Music Watermarking

SONY BMG has moved beyond experimenting with non-DRM-protected music tracks and indicated that its entire catalog will be available as MP3s from Amazon by the end of the month. SONY BMG is the last of the "big four" music labels to offer MP3s via Amazon (the others are the EMI Group, the Universal Music Group, and the Warner Music Group). Napster has also announced that it will offer MP3s for sale this spring (its subscription service will still use DRM). It would appear that the DRM era for digital music is coming to a close.

Meanwhile, rumors continue to circulate that the RIAA is endangered due to a potential withdrawal of funding from the EMI Group.

The decline of digital music DRM does not mean that the labels have given up the fight to stem the tide of illegal downloads. MP3s from Sony and Universal include "anonymous" watermarks that allow them to be traced as they move through the Internet to provide infringement data for music labels and to potentially allow filtering by ISPs.

Nor does the decline of digital music DRM mean that Hollywood will quickly follow, avoiding the mistakes of the music industry.

Read more about it at "DRM Is Dead, but Watermarks Rise from Its Ashes," "Napster to Sell DRM-Free Downloads," "Sony Joins Other Labels on Amazon MP3 Store," and "Under Pressure from EMI, RIAA Could Disappear."

Image Management Software Descriptions from TASI Survey

TASI (Technical Advisory Service for Images) has published descriptions of information management software resulting from a vendor survey (e.g., see the Greenstone description). TASI notes: "The information has been provided by the system developer/vendor in answer to TASI's survey, but has not been independently verified."

TASI recommends that readers consult Systems for Managing Image Collections and Choosing a System for Managing your Image Collection as background for evaluating the survey responses.

Recut, Reframe, Recycle: Quoting Copyrighted Material in User-Generated Video

American University's Center for Social Media has released Recut, Reframe, Recycle: Quoting Copyrighted Material in User-Generated Video, which examines fair use issues in user-created digital videos. See the announcement for links to videos used in the report.

Here's an excerpt from the "Next Steps" section:

The effervescence of this moment at the dawn of participatory media should not be mistaken for triviality. The practices of today’s online creators are harbingers of a far more interactive media era. Today’s makers—feckless, impudent, brash, and extravagant as they often are—in fact are the pioneers of an emerging media economy and society. Recognition of the importance of fair use, within the copyright law toolkit for cultural creation, is both prudent and forward-looking for those concerned with maintaining an open society.

Sophie Project Gets $1 Million from MacArthur Foundation

Thanks to a million dollar grant from the MacArthur Foundation, version 1.0 of Sophie, software that allows non-programmers to easily create multimedia documents, will be released in February 2008. Sophie runs on Mac, Windows and Linux operating systems. An alpha version and several demo books created with Sophie are available.

Here's an excerpt from the project's home page:

Originally conceived as a standalone multimedia authoring tool, Sophie is now integrated into the Web 2.0 network in some very powerful ways:

  • Sophie documents can be uploaded to a server and then streamed over the net
  • It's possible to embed remote audio, video and graphic text files in the pages of Sophie documents meaning that the actual document that needs to be distributed might be only a few hundred kilobytes even if the book itself is comprised of hundreds of megabytes or even a few gigabytes.
  • Sophie now has the ability to browse OKI (open knowledge initiative) repositories from within Sophie itself and then to embed objects from those repositories.
  • We now have live dynamic text fields (similar to the Institute's CommentPress experiments on the web) such that a comment written in the margin is displayed immediately in every other copy of that book—anywhere in the world.

New York Public Library Makes 600,000 Digital Images Available to Kaltura Users

The New York Public Library has made its collection of 600,000 digital images available for use by Kaltura users. Kaltura is a free, online collaborative video production site.

Here's an excerpt from the press release:

The New York Public Library and Kaltura, Inc., a pioneer in Collaborative Media, announced today that the organizations have joined forces to further enhance online rich-media collaboration. The New York Public Library's treasure trove of 600,000 digital images can now be incorporated easily into Kaltura's group video projects. The library's digital collection includes a wide range of rare and unique images drawn from its research collections. These range from Civil War photographs and illuminated Medieval manuscripts to historic views of New York City, Yiddish theatre placards and 19th Century restaurant menus. Users can search, preview and add these library images directly from the Kaltura web site (To try it, go to http://www.kaltura.com, click 'start a kaltura').

"Kaltura is a good fit for The New York Public Library as we work to take advantage of the latest technologies and approaches to make our collection freely and widely accessible," said Joshua M. Greenberg, Director of Digital Strategy and Scholarship at The New York Public Library. "We are excited to enable the use of our extensive Digital Gallery of historical images in Kaltura's cutting-edge and innovative application. Working with Kaltura was a natural step in enabling the creative use of these rich materials in the broader online world."

Kaltura enables groups of users to collaborate in the creation of videos and slideshows, similar to the way in which Wiki platforms allow users to collaborate with text. When creating a Kaltura video, users can upload their own videos, photos, audio and animation, can import their previously uploaded material from MySpace, Photobucket or YouTube, or they can access and import rich-media from various public-domain and CreativeCommons sources such as Flickr, CCMixter, Jamendo, and now The New York Public Library. Kaltura aims to team with additional databases and digital resource partners in order to both provide users with the widest array of rich-media, and to provide its resource partners with access to Kaltura's Global Network of users, content, and services that allows unprecedented collaboration around rich-media creation, remixing and distribution.

"We strive to provide users with the most comprehensive, enjoyable and user-friendly experience possible when creating their collaborative Kalturas in a fun, safe, and legal environment; The New York Public Library database is a huge addition to resources that we offer, both in terms of its size and the great value that it brings," said Ron Yekutiel, Chairman and CEO of Kaltura.

"Kaltura was built around the principles of openness and sharing with the mission to enhance collaboration and to lower the barriers of participation—it is through partners with a similar vision, like The New York Public Library, that we can achieve our goal of delivering the world's first open platform for peer production of rich media, with the broadest access to rich-media materials, resources and databases," Yekutiel added. "We are truly honored by this collaboration."

100 Year Archive Requirements Survey

The Storage Networking Industry Association has released the 100 Year Archive Requirements Survey. Access requires registration.

Here's an excerpt from the "Survey Highlights":

  • 80% of respondents declared they have information they must keep over 50 years and 68% of respondents said they must keep it over 100 years. . . .
  • Long-term generally means greater than 10 to 15 years—the period beyond which multiple migrations take place and information is at risk. . .
  • Database information (structured data) was considered to be most at risk of loss. . .
  • Over 40% of respondents are keeping e-Mail records over 10 years. . . .
  • Physical migration is a big problem. Only 30% declared they were doing it correctly at 3-5 year intervals. . . .
  • 60% of respondents say they are ‘highly dissatisfied’ that they will be able to read their retained information in 50 years. . .
  • Help is needed—current practices are too manual, too prone to error, too costly and lack adequate coordination across the organization. . . .

Welcome to the DRM Zone: Case in Point, the Google Video Store

If you have ever purchased or rented a video from the Google Video Store, it will cease to function on August 15, 2007. That's because the Google Video Store is being shut down and along with it Google's associated DRM system.

Customers will get credits in Google Checkout for what they spent on Google Video Store products, but not cash refunds, meaning that they must buy merchandise available via that service to recoup their losses. Of course, this does not compensate purchasers for the inconvenience of having to replace their videos (assuming that they can).

This fiasco underlines a key problem with DRM: it doesn't just restrict access, it restricts access using proprietary technologies, and, with few exceptions, those technologies cannot be legally circumvented under U.S. law.

Source: Fisher, Ken. "Google Selleth Then Taketh Away, Proving the Need for DRM Circumvention." Ars Technica, 12 August 2007.

Free, Legal Digital Audio Downloads (Courtesy of the Creative Commons)

In Darknet: Hollywood’s War Against the Digital Generation, J. D. Lasica tells the story of Tarnation, a documentary film that was nominated for a Camera d’Or award (pg. 84). The film was made for $218.31 using a video camera and iMovie. One catch: Lasica says that getting permission to use brief commercial music and video segments in the movie cost around $400,000. Creating derivative works that use the entertainment industry’s copyrighted works is clearly not cheap, assuming that you can obtain permission to use them at all.

Imagine instead a world where you could download, play, and use digital media works for free without paying license fees. It may sound impossible, but that world is starting to be built using Creative Commons licenses.

The most liberal license of the six main Creative Commons licenses is Attribution: "This license lets others distribute, remix, tweak, and build upon your work, even commercially, as long as they credit you for the original creation."

The most restrictive license is Attribution Non-Commercial No Derivatives: "This license is often called the ‘free advertising’ license because it allows others to download your works and share them with others as long as they mention you and link back to you, but they can’t change them in any way or use them commercially."

Here’s a brief guide to selected resources that will help you get started finding digital audio works licensed under Creative Commons licenses.

  • Creative Commons Audio Page: An excellent place to start. It has a search engine, featured audio Web sites, brief information about the Creative Commons Licenses, a list of sites where you can contribute audio works, and featured artists, tools, and works. See also: the Creative Commons Find page, where you can search for CC-licensed works using Google and Yahoo!.
  • ccMixter: "This is a community music site featuring remixes licensed under Creative Commons, where you can listen to, sample, mash-up, or interact with music in whatever way you want." Site tabs provide access to picks, remixes, samples, a cappellas, people, and extras.
  • Common Content: "Common Content is a catalog of works licensed in the Creative Commons, available to anyone for copying or creative re-use. The catalog includes over 3,848 records, many of which are collections which include hundreds or thousands of other works." Audio categories include ambient, music, samples, and speech.
  • The Freesound Project: "The Freesound Project is a collaborative database of Creative Commons licensed sounds. Freesound focuses only on sound, not songs." Sound clips are described, tagged (there’s a tag cloud for popular tags), geotagged, and rated (example: tibetan chant 4 colargol 2.aif). Site includes a "Remix! tree," sample packs, and user forum.
  • Indieish: Your Free Music Daily: Blog with CC-licensed music reviews.
  • jamendo: "On jamendo, the artists distribute their music under Creative Commons licenses. . . .jamendo users can discover and share albums, but also review them or start a discussion on the forums. Albums are democratically rated based on the visitors’ reviews. If they fancy an artist they can support him by making a donation." Site distributes albums using BitTorrent and the M3U playlist file format.
  • PodSafeAudio: "This site aims to provide a location where musicians can upload music under the Creative Commons License for use in Podcasts, Mashups, Shoutcasts, Webcasts and every other kind of ‘casting’ that exists on the ‘net." A complex site with many features, including track reviews, categorization of music by genre and rating, categorization of artists by genre and region, collaboration project listing, user forums, and a blog.

Machinima

Here’s an interesting trend: using video games to create animated digital films. It’s called "Machinima." In one technique, the 3-D animation tools built into games to allow users to extend the games (e.g., create new characters) are used to generate new 3-D films. Of course, it can be more complicated than this: the Machinima FAQ outlines other strategies in layperson’s terms.

BusinessWeek has a short, interesting article on Machinima ("France: Thousands of Young Spielbergs") that describes one social commentary Machinima film (The French Democracy), noting that it got over one million hits in November. It also quotes Paul Marino, executive director of the Academy of Machinima Arts & Sciences, as saying: "This is to the films what blogs are to the written media."

If you want to check out more Machinima films, try the 2005 Machinima Film Festival or Machinima.com (try "download" if "watch" doesn’t work).

Machinima is yet another example of how users want to create derivative works from digital media and how powerful a capability that can be—if intellectual property rights owners don’t prohibit it. Since the first Machinima movie was created in 1996, it appears that the video game industry has not moved to squash this movement, and, needless to say, it has thrived. However, this state of affairs may simply reflect Machinima’s low profile: A recent Wired News article, which notes that Machinima has been employed in commercials and music videos, indicates that Doug Lombardi, Director of Marketing at Valve (a video game software company), feels that: "As the films become commercially viable, machinima filmmakers are going to butt up against copyright law."

The Supremes Landmark Ruling on MGM vs. Grokster

The Supreme Court has ruled against Grokster. See "Supreme Court Rules against File Swapping" and "Court: File-Sharing Services May Be Sued" for details. For background information, see "File-Swap Fallout in Supreme Court Ruling" and the EFF’s MGM v. Grokster page. For in-depth discussion of the underlying issues, see Darknet: Hollywood’s War Against the Digital Generation and Sonic Boom listed at "Digital Works Want to Be Free."

The key quote in the ruling is:

For the same reasons that Sony took the staple-article doctrine of patent law as a model for its copyright safeharbor rule, the inducement rule, too, is a sensible one for copyright. We adopt it here, holding that one who distributes a device with the object of promoting its use to infringe copyright, as shown by clear expression or other affirmative steps taken to foster infringement, is liable for the resulting acts of infringement by third parties. We are, of course, mindful of the need to keep from trenching on regular commerce or discouraging the development of technologies with lawful and unlawful potential. Accordingly, just as Sony did not find intentional inducement despite the knowledge of the VCR manufacturer that its device could be used to infringe, 464 U. S., at 439, n. 19, mere knowledge of infringing potential or of actual infringing uses would not be enough here to subject a distributor to liability. Nor would ordinary acts incident to product distribution, such as offering customers technical support or product updates, support liability in themselves. The inducement rule, instead, premises liability on purposeful, culpable expression and conduct, and thus does nothing to compromise legitimate commerce or discourage innovation having a lawful promise.

The EFF provides other key quotes.

Here’s an interesting take on the ruling: "File-Sharing Decision Hardly Apocalyptic".

ARL issued a statement for the Library Copyright Alliance that said:

The Library Copyright Alliance (LCA), a group composed of the American Association of Law Libraries, American Library Association, Association of Research Libraries, Medical Library Association, and Special Libraries Association, welcomes this balanced decision that supports the interests of libraries while addressing issues of widespread copyright infringement. By focusing on conduct that induces infringement, rather than on the distribution of technology, the decision ensures the continued availability of new and evolving digital technologies to libraries and their patrons.

The Center for Democracy and Technology’s press release said:

The court has worked to craft a careful balance that allows copyright owners to pursue bad actors, but still protect the rights of technology makers. We hope this decision will preserve the climate of innovation that fostered the development of everything from the iPod to the Internet itself.

The EFF was less sanguine in their press release:

This decision relies on a new theory of copyright liability that measures whether manufacturers created their wares with the “intent” of inducing consumers to infringe. It means that inventors and entrepreneurs will not only bear the costs of bringing new products to market, but also the costs of lawsuits if consumers start using their products for illegal purposes.

And, of course, many bloggers weighed in as seen in Eric Goldman’s roundup, the lively discussion on SCOTUSblog, and the tsunami of comments on Slashdot.

According to "Congress Applauds File-Sharing Ruling," Congress is unlikely to take any immediate action as a result of the ruling.

Robert Summer, former head of the Recording Industry Association of America and former president of Sony Music International, said of the music industry reaction to the verdict: "The response across the board was one of elation."

Streaming Video E-Reserves at Emory University Libraries

Emory’s Woodruff Library has a streaming video e-reserves service. Here are a few quotes:

Material to be digitized must be owned either by the library or by the person requesting the digitization. We will not digitize any third-party copies, recordings, or transfers, including personal recordings of television broadcasts or rentals. If you would like to digitize material that is not owned either by you or by the library, please contact us and we will attempt to purchase it for the library’s collection. . . .

We will digitize video and compress it into a streaming video format that is accessible via a link posted in ReservesDirect for the duration of the semester. Our current streaming formats of choice are Real and QuickTime. Real and QuickTime video players may be downloaded freely from the web. . . . We will optimize the stream for a reasonably wide cross-section of those who are likely to view it. . . .

As with other materials that are digitized and placed on ReservesDirect, we will place a copyright notice at the beginning of all video we digitize. All digitized materials will be retained and archived solely by us. . . .

We will digitize up to 20% total of a commercially produced video or film. . . .

Since all video submitted is for use in an instructional context, we anticipate that all materials submitted will follow guidelines for what is appropriate for display in a classroom setting. Therefore we will not judge or censor materials submitted to us for digitization. However, if a challenge concerning the appropriateness of materials is submitted to us, we reserve the right to restrict access to digitized materials at any time while we review the challenge and make a decision on whether to continue access to the material.