Archive for the 'Search Engines' Category

Google Gives Wikipedia a Lump of Knol for Xmas

Posted in Search Engines, Web 2.0 on December 16th, 2007

According to "Encouraging People to Contribute Knowledge," Google has launched Knol, a Wikipedia competitor, in test mode.

Here's an excerpt from the posting:

Earlier this week, we [Google] started inviting a selected group of people to try a new, free tool that we are calling "knol", which stands for a unit of knowledge. Our goal is to encourage people who know a particular subject to write an authoritative article about it. . . . .

A knol on a particular topic is meant to be the first thing someone who searches for this topic for the first time will want to read. The goal is for knols to cover all topics, from scientific concepts, to medical information, from geographical and historical, to entertainment, from product information, to how-to-fix-it instructions. Google will not serve as an editor in any way, and will not bless any content. . . . . For many topics, there will likely be competing knols on the same subject. . . .

Knols will include strong community tools. People will be able to submit comments, questions, edits, additional content, and so on. Anyone will be able to rate a knol or write a review of it. Knols will also include references and links to additional information. At the discretion of the author, a knol may include ads.

Read more about it at "Google to Wikipedia: "Knol" Thine Enemy," "Google's Knol: No Wikipedia Killer," "Google's 'Knols' Aren't a Threat to Wikipedia," "Google's Know-It-All Project," and "Google's Units of Knowledge May Raise Conflict of Interest."

Columbia University Libraries and Bavarian State Library Become Google Book Search Library Partners

Posted in Digital Repositories, Digitization, E-Books, Mass Digitization, Open Access, Scholarly Books, Search Engines on December 14th, 2007

Both the Columbia University Libraries and Bavarian State Library have joined the Google Book Search Library Project.

Here are the announcements:

University of Michigan Libraries Make over 100,000 Records for Digitized Books Available for Harvesting

Posted in ARL Libraries, Digital Repositories, Digitization, E-Books, Institutional Repositories, Libraries, Mass Digitization, Metadata, Open Access, Public Domain, Search Engines on December 12th, 2007

The University of Michigan Libraries have made over 100,000 metadata records from their MBooks collection available for OAI-PMH harvesting. The records are for digitized books in the public domain.

Here's an excerpt from the announcement:

The University of Michigan Library is pleased to announce that records from our MBooks collection are available for OAI harvesting. The MBooks collection consists of materials digitized by Google in partnership with the University of Michigan.

http://quod.lib.umich.edu/cgi/o/oai/oai?verb=Identify

Only records for MBooks available in the public domain are exposed. We have split these into sets containing public domain items according to U.S. copyright law, and public domain items worldwide. There are currently over 100,000 records available for harvesting. We anticipate having 1 million records available when the entire U-M collection has been digitized by Google.
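For readers unfamiliar with OAI-PMH harvesting, here is a minimal Python sketch of how a harvester might build requests against the base URL given in the announcement and read a ListRecords response. The verbs (Identify, ListRecords), the oai_dc metadata prefix, and the resumptionToken element come from the OAI-PMH 2.0 protocol. The helper names and the sample response are hypothetical, and the actual set names for the U.S. and worldwide public domain sets are not given in the announcement, so they would need to be discovered with a ListSets request.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Base URL taken from the announcement; everything after "?" is the request.
BASE_URL = "http://quod.lib.umich.edu/cgi/o/oai/oai"

# XML namespace used by OAI-PMH 2.0 responses.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def build_request(verb, **params):
    """Return the URL for an OAI-PMH request (hypothetical helper)."""
    query = {"verb": verb}
    query.update(params)
    return BASE_URL + "?" + urlencode(query)


def parse_list_records(xml_text):
    """Extract record identifiers and the resumption token, if any."""
    root = ET.fromstring(xml_text)
    identifiers = [el.text for el in root.iter(OAI_NS + "identifier")]
    token_el = root.find(f"{OAI_NS}ListRecords/{OAI_NS}resumptionToken")
    token = token_el.text if token_el is not None and token_el.text else None
    return identifiers, token


# Made-up response used only to illustrate parsing; not real MBooks data.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example:1</identifier></header></record>
    <record><header><identifier>oai:example:2</identifier></header></record>
    <resumptionToken>token-123</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(build_request("Identify"))
    print(build_request("ListRecords", metadataPrefix="oai_dc"))
    print(parse_list_records(SAMPLE))
```

A real harvester would fetch each URL (for example with urllib.request.urlopen), then repeat the ListRecords request with the returned resumptionToken until the server stops supplying one.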

Paul Courant on Michigan’s Mass Digitization Project with Google

Posted in ARL Libraries, Digitization, E-Books, Metadata, Open Access, Research Libraries, Scholarly Books, Scholarly Communication, Search Engines on November 5th, 2007

In "On Being in Bed with Google," Paul N. Courant, University Librarian and Dean of Libraries at the University of Michigan, vigorously rebuts arguments against research libraries participating in the Google Books Library Project.

Here's an excerpt:

Since 2005, Siva Vaidhyanathan has been making and refining the argument that libraries should be digitizing their collections independently, without corporate financing or participation, and that those who don’t are failing to uphold their responsibility to the public. "Libraries should not be relinquishing their core duties to private corporations for the sake of expediency."

"Expediency" is a bit of a dirty word. Vaidhyanathan’s phrase suggests that good people don’t do things simply because they are "expedient." But I view large-scale digitization as expeditious. We have a generation of students who will not find valuable scholarly works unless they can find them electronically. At the rate that OCA is digitizing things (and I say the more the merrier and the faster the better) that generation will be dandling great-grandchildren on its knees before these great collections can be found electronically. At Michigan, the entire collection of bound print will be searchable, by anyone in the world, about when children born today start kindergarten.

Update on the British Library/Microsoft Digitization Project

Posted in Copyright, Digitization, E-Books, Mass Digitization, Open Access, Search Engines on November 3rd, 2007

In his Information Today article "Progress Report: The British Library and Microsoft Digitization Partnership," Jim Ashling provides an update on the progress that the British Library and Microsoft have made in their project to digitize about 100,000 books for access in Live Search Books.

Here's an excerpt from the article:

Unlike previous BL digitization projects where material had been selected on an item-by-item basis, the sheer size of this project made such selectivity impossible. Instead, the focus is on English-language material, collected by the BL during the 19th century. . . .

Scanning produces high-resolution images (300 dpi) that are then transferred to a suite of 12 computers for OCR (optical character recognition) conversion. The scanners, which run 24/7, are specially tuned to deal with the spelling variations and old-fashioned typefaces used in the 1800s. The process creates multiple versions including PDFs and OCR text for display in the online services, as well as an open XML file for long-term storage and potential conversion to any new formats that may become future standards. In all, the data will amount to 30 to 40 terabytes. . . .

Obviously, then, an issue exists here for a collection of 19th-century literature when some authors may have lived beyond the late 1930s [British/EU law gives authors a copyright term of life plus 70 years]. An estimated 40 percent of the titles are also orphan works. Those two issues mean that item-by-item copyright checking would be an unmanageable task. Estimates for the total time required to check on the copyright issues involved vary from a couple of decades to a couple of hundred years. The BL’s approach is to use two databases of authors to identify those who were still living in 1936 and to remove their work from the collection before scanning. That, coupled with a wide publicity to encourage any rights holders to step forward, may solve the problem.
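The arithmetic behind the 1936 cutoff can be sketched in a few lines of Python. This is only an illustration of the life-plus-70 rule mentioned in the excerpt, not the BL's actual procedure: it assumes protection runs through the end of the calendar year 70 years after the author's death, it uses 2007 (the year of this post) as the reference year, and the author names and death years are made up.

```python
def out_of_copyright(death_year, as_of_year, term_years=70):
    """True if a life-plus-term copyright has lapsed by as_of_year.

    Assumes protection lasts through the end of the calendar year
    death_year + term_years, so the work is free the following year.
    """
    return as_of_year > death_year + term_years


# Hypothetical authors and death years, for illustration only.
authors = {"Author A": 1895, "Author B": 1936, "Author C": 1941}

cleared = sorted(
    name for name, died in authors.items() if out_of_copyright(died, 2007)
)

if __name__ == "__main__":
    print(cleared)
```

Under these assumptions, an author who died in 1936 is protected through the end of 2006, which is why anyone who lived past that year falls outside the safe range for a project running in 2007.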

Yale Will Work with Microsoft to Digitize 100,000 Books

Posted in ARL Libraries, Digitization, E-Books, Mass Digitization, Open Access, Scholarly Books, Search Engines on October 31st, 2007

The Yale University Library and Microsoft will work together to digitize 100,000 English-language out-of-copyright books, which will be made available via Microsoft’s Live Search Books.

Here’s an excerpt from the press release:

The Library and Microsoft have selected Kirtas Technologies to carry out the process based on their proven excellence and state-of-the-art equipment. The Library has successfully worked with Kirtas previously, and the company will establish a digitization center in the New Haven area. . . .

The project will maintain rigorous standards established by the Yale Library and Microsoft for the quality and usability of the digital content, and for the safe and careful handling of the physical books. Yale and Microsoft will work together to identify which of the approximately 13 million volumes held by Yale’s 22 libraries will be digitized. Books selected for digitization will remain available for use by students and researchers in their physical form. Digital copies of the books will also be preserved by the Yale Library for use in future academic initiatives and in collaborative scholarly ventures.

German Publishers Just Say No to Google Book Search: Libreka Launched at Frankfurt Book Fair

Posted in E-Books, Mass Digitization, Publishing, Search Engines on October 14th, 2007

German publishers who want to retain control of their content have a new alternative to Google Book Search: Libreka, a full-text search engine that initially has about 8,000 books from publishers who opted in for inclusion. Searchers retrieve book titles and cover images, but no content.

Source: "German Publishers Offer Alternative to Google Books." Deutsche Welle, 11 October 2007.

Comment on Preservation in the Age of Large-Scale Digitization

Posted in Digital Preservation, Digitization, Search Engines on September 21st, 2007

CLIR seeks comments on Preservation in the Age of Large-Scale Digitization by Oya Rieger. The deadline is 10/5/07.
