OCLC has announced the availability of Web Harvester, which allows CONTENTdm sites to import Web content into their systems.
Here's an excerpt from the press release:
OCLC's Web Harvester evolved from collaboration with several state libraries, state archives and universities over a period of seven years. Participants emphasized the increasing importance of collecting and managing Web-based content as information resources move online yet remain within libraries' and archives' collection scopes.
The Web Harvester is integrated into library workflows, allowing library staff to capture content as part of the cataloging process. The captured content is then sent to the organization's digital collections where it can be managed with other CONTENTdm digital content. . . .
The Web Harvester is accessed via the Connexion client, OCLC's powerful cataloging service, and captures content ranging from single, Web-based documents to entire Web sites. Once retrieved, users can review the captured Web content and add it to a collection managed by OCLC's CONTENTdm software, a complete solution for storing, managing and delivering a library's digital collections to the Web. Once in CONTENTdm, the Web content can be accessed and managed in conjunction with other digital collections. Harvested items are discoverable from WorldCat.org, WorldCat Local and the CONTENTdm Web interface.
For additional security, master files of the captured content also can be ingested to the OCLC Digital Archive, the service for long-term storage of originals and master files from libraries' digital collections.