JISC has released MetaTools: Final Report .
Here's an excerpt from the announcement:
Automatic metadata generation has sometimes been posited as a solution to the 'metadata bottleneck' that repositories and portals are facing as they struggle to provide resource discovery metadata for a rapidly growing number of new digital resources. Unfortunately there is no registry or trusted body of documentation that rates the quality of metadata generation tools or identifies the most effective tool(s) for any given task.
The aim of the first stage of the project was to remedy this situation by developing a framework for evaluating tools used for the purpose of generating Dublin Core metadata. . . .
A test program was then implemented using metrics from the framework. It evaluated the quality of metadata generated from 1) Web pages (html) and 2) scholarly works (pdf) by four of the more widely-known metadata generation tools—Data Fountains, DC-dot, SamgI, and the Yahoo! Term Extractor. . . .
It was found that the output from Data Fountains was generally superior to that of the other tools that the project tested. But the output from all of the tools was considered to be disappointing and markedly inferior to the quality of metadata that Tonkin and Muller report that PaperBase has extracted from scholarly works. Over all, the prospects for generating high-quality metadata for scholarly works appear to be brighter because of their more predictable layout. . . .
In the third stage of the project SOAP and RESTful Web Service interfaces were developed for three metadata generation tools—Data Fountains, SamgI and Kea. This had a dual purpose. Firstly, the creation of an optimal metadata record usually requires the merging of output from several tools each of which, until now, had to be invoked separately because of the ad hoc nature of their interfaces. As Web services, they will be available for use in a network such as the Web with well-defined interfaces that are implementation-independent. These services will be exposed for use by clients without them having to be concerned with how the service will execute their requests. Repositories should be able to plug them into their own cataloguing environments and experiment with automatic metadata generation under more 'real-life' circumstances than hitherto. Secondly, and more importantly (in view of the relatively poor quality of current tools) they enabled the project to experiment with the use of a high-level ontology for describing metadata generation tools.