Google Print Controversy Heats Up

Lots of ink (real and virtual) on Google Print and the AAUP’s recent resistance (all from Open Access News):

"Forget Google Print Copyright Infringement; Search Engines Already Infringe," SearchEngineWatch

"From Gutenberg to Google: Five Views on the Search-Engine Company’s Project to Digitize Library Books," The Chronicle of Higher Education (requires subscription)

"Google Books under Fire," The Register

"Google Library Project Hit by Copyright Challenge from University Presses," Information Today Newsbreaks

"Google Print Goes Live," InternetNews

"A Google Project Pains Publishers," Business Week

"Google This: ‘Copyright Law,’" Business Week

"Google’s Scan Plan Hits More Bumps," Forbes

"Publishers Lay into Google Print," ZDNet UK

"The University Press Assn.’s Objections," Business Week

"University-Press Group Raises Questions About Google’s Library-Scanning Project," The Chronicle of Higher Education

Strategic Planning Efforts at ARL Libraries, Part 1

How do some of the largest libraries in North America see their near-term future?

The Association of Research Libraries (ARL) currently has 123 member libraries in the US and Canada. Below is a partial list of strategic planning Web sites at ARL libraries. This list was compiled by a quick look at ARL libraries’ home pages, supplemented by limited site-specific Google searching. Web sites were included if the library’s strategic plan included the years 2004 and/or 2005.

Will You Only Harvest Some?

The Digital Library for Information Science and Technology has announced DL-Harvest, an OAI-PMH service provider that harvests and makes searchable metadata about information science materials from the following archives and repositories:

  • ALIA e-prints
  • arXiv
  • Caltech Library System Papers and Publications
  • DLIST
  • Documentation Research and Training Centre
  • DSpace at UNC SILS
  • E-LIS
  • Metadata of LIS Journals
  • OCLC Research Publications
  • OpenMED@NIC
  • WWW Conferences Archive

DL-Harvest is a much needed, innovative discipline-based search service. Big kudos to all involved.

DLIST also just announced the formation of an advisory board.

The following musings, inspired by the DL-Harvest announcement, are not intended to detract from the fine work that DLIST is doing or from the very welcome addition of DL-Harvest to their service offerings.

Discipline-focused metadata can be relatively easily harvested from OAI-PMH-compliant systems that are organized along disciplinary lines (e.g., the entire archive/repository is discipline-based or an organized subset is discipline-based). No doubt these are very rich, primary veins of discipline-specific information, but how about the smaller veins and nuggets that are hard to identify and harvest because they are in systems or subsets that focus on another discipline?

Here’s an example. An economist, who is not part of a research center or other group that might have its own archive, writes extensively about the economics of the scholarly publishing business. This individual’s papers end up in the economics department section of his or her institutional repository and in EconWPA. They are highly relevant to librarians and information scientists, but will their metadata records be harvested for use in services like DL-Harvest using OAI-PMH since they are in the wrong conceptual bins (e.g., set in the case of the IR)?
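To make the "wrong conceptual bin" problem concrete, here is a minimal sketch in Python. It assumes a hypothetical repository base URL and hypothetical set names, and it represents harvested records as simple dictionaries with "title" and "subjects" keys (an illustrative shape, not something any particular harvester produces). The verb, metadataPrefix, and set request parameters are the standard OAI-PMH ones; everything else is invented for illustration.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL.

    If set_spec is given, the request is limited to that set
    (selective harvesting); otherwise the whole repository is requested.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def looks_relevant(record, keywords):
    """Crude relevance test: does any keyword appear in the title or subjects?

    `record` is a hypothetical dict with a "title" string and a
    "subjects" list of strings.
    """
    text = " ".join([record.get("title", "")] + record.get("subjects", [])).lower()
    return any(keyword.lower() in text for keyword in keywords)


# Hypothetical repository and set names, for illustration only.
BASE = "http://repository.example.edu/oai"

# A discipline-based harvester would typically ask only for the LIS set...
lis_only = list_records_url(BASE, set_spec="library-science")

# ...so the economist's papers, filed under the economics set, are never seen
# unless that set is also harvested and then filtered record by record.
economics = list_records_url(BASE, set_spec="economics")

records = [
    {"title": "The Economics of Scholarly Publishing", "subjects": ["Economics"]},
    {"title": "Regional Labor Markets", "subjects": ["Economics"]},
]
keep = [r for r in records if looks_relevant(r, ["scholarly publishing"])]
```

The filter is deliberately naive. It only shows where the extra work lands: harvesting outside the "right" set is easy, but deciding which of those records belong in a discipline-based service is not.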

Coleman et al. point to one solution in their intriguing "Integration of Non-OAI Resources for Federated Searching in DLIST, an Eprints Repository" paper. But (lots of hand waving here), if using automatic metadata extraction were an easy and simple way to supplement conventional OAI-PMH harvesting, the bottom-line question is: how good is good enough? In other words, what’s an acceptable level of accuracy for the automatic metadata extraction? (I won’t even bring up the dreaded "controlled vocabulary" notion.)

No doubt this problem falls under the 80/20 Rule, and the 20 is most likely in the low hanging fruit OAI-PMH-wise, but wouldn’t it be nice to have more fruit?

Streaming Video E-Reserves at Emory University Libraries

Emory’s Woodruff Library has a streaming video e-reserves service. Here are a few quotes:

Material to be digitized must be owned either by the library or by the person requesting the digitization. We will not digitize any third-party copies, recordings, or transfers, including personal recordings of television broadcasts or rentals. If you would like to digitize material that is not owned either by you or by the library, please contact us and we will attempt to purchase it for the library’s collection. . . .

We will digitize video and compress it into a streaming video format that is accessible via a link posted in ReservesDirect for the duration of the semester. Our current streaming formats of choice are Real and QuickTime. Real and quicktime video players may be downloaded freely from the web. . . . We will optimize the stream for a reasonably wide cross-section of those who are likely to view it. . . .

As with other materials that are digitized and placed on ReservesDirect, we will place a copyright notice at the beginning of all video we digitize. All digitized materials will be retained and archived solely by us. . . .

We will digitize up to 20% total of a commercially produced video or film. . . .

Since all video submitted is for use in an instructional context, we anticipate that all materials submitted will follow guidelines for what is appropriate for display in a classroom setting. Therefore we will not judge or censor materials submitted to us for digitization. However, if a challenge concerning the appropriateness of materials is submitted to us, we reserve the right to restrict access to digitized materials at any time while we review the challenge and make a decision on whether to continue access to the material.

No Respectable University Would Offer an Online Doctorate?

During the JESSE debate on online Ph.D.’s, Bill Summers said:

Relatedly, Universities which take themselves seriously do not permit external PHD programs. At any of the three institutions with which I have been privileged to be associated, Rutgers, South Carolina and Florida State, the Dean presenting such a proposal to the Faculty Senate would be hooted off campus and the program forever thereafter labeled as Mickey Mouse.

These are some universities that offer online doctorates (there are others that offer distance-education doctorates that aren’t "online" per se).

University of Arizona, College of Nursing

Boston University, College of Fine Arts

Boston University, Sargent College of Rehabilitation Sciences

Texas Tech University, Department of English

University of Hawaii, School of Nursing and Dental Hygiene

University of Maryland University College

University of Wisconsin, Milwaukee, College of Nursing

Anyone hooting?

Online Ph.D. Programs Redux

My "Online Ph.D. Programs: Unique Clientele?" posting, which I also sent as a message to the JESSE list, triggered a long discussion thread on that list. It makes for very interesting reading. (Choose "Next in topic" in the View box to move from message to message.) For related threads, see the May archive.

Let me briefly recap some of my main points in light of this discussion. Academic librarians with faculty or faculty-like status who are at the associate and full levels do not need to be taught how to be scholars: they are scholars. In this respect they represent a unique doctoral clientele. What they need, if they do not have them, are Ph.D.’s. They do not want to quit their jobs or commute long distances to get them from the few information schools that remain. If they wanted Ph.D.’s in other subject areas, they would not be troubling information school faculty. Certainly, a DLIS option would be a welcome alternative to nothing. However, they are not in any way intimidated by the prospect of a research degree. They are researchers. They are interested in a research degree, but many have no interest in joining the ranks of information school faculty. Having a research degree will help them in their current career path in a variety of ways.

Illustrating the point that academic librarians are researchers, an examination of high-impact library-oriented journals would likely show: (1) academic librarians edit such journals, (2) information school faculty edit such journals, (3) academic librarians publish in journals edited by information school faculty, (4) information school faculty publish in journals edited by academic librarians, (5) academic librarians often cite papers written by information school faculty, and (6) information school faculty often cite papers written by academic librarians. In short, the peer-reviewed library literature is a co-mingling of the scholarly work of academic librarians, information school faculty, and others. If all identifying information were stripped away from a peer-reviewed library journal article, it would be impossible to determine if it was written by an academic librarian or an information school faculty member.

In spite of some frustrations, most academic librarians have a high regard for information school faculty and believe that what they do is very important. However, they find it difficult to understand why, in 2005, with the wide array of digital technologies at information schools’ disposal and in light of their unique circumstances, their needs cannot be adequately met with these technologies, supplemented by brief on-campus stays. This dialog has revealed a number of information school faculties’ concerns. It appears to me that a key one is that such a degree would not be viewed as legitimate by faculty in other disciplines at the local institution. This is understandable, because these faculty do not have a potential doctoral student body with similar characteristics. But, depending on local circumstances, they may, at the same time, be officially recognizing local librarians as faculty members or as having a faculty-like status. They sit beside them at the Faculty Senate, and they may have elected an academic librarian to lead them. This could be pointed out to them as a case was made for establishing a special program that was designed to reflect the unique status of academic librarians.

The extent of interest in an online Ph.D. program among academic librarians may not be apparent to information school faculty. However, market research is likely to reveal that a significant subset of academic librarians are interested in pursuing such an option, and information schools that overcome the barriers that prevent such programs will find that their pool of potential doctoral students is significantly expanded with experienced, highly desirable candidates that they would never otherwise attract.

Postscript:

Based on a JESSE message from Ian M. Johnson, it appears that the Information Management department at The Robert Gordon University in the UK is about to offer an online Ph.D. (It currently has six online Master’s programs.)

Scholarly Electronic Publishing Weblog Update (5/23/05)

The biweekly update of the Scholarly Electronic Publishing Weblog (SEPW) is now available. It provides brief information on 20+ new journal issues and other resources. Especially interesting is a new issue of the INDICARE Monitor, which has an article about Digital Rights Management (DRM) and open access by Richard Poynder. Also, Walt Crawford weighs in on the DigitalKoans Bailey-Harnad debates in the latest Cites & Insights: Crawford at Large, and there is a theme issue of Issues in Science and Technology Librarianship on open access.

Online Ph.D. Programs: Unique Clientele?

Information schools have one group of potential Ph.D. students that appear to have unique characteristics: academic librarians with faculty or faculty-like status.

To advance in rank in these up-or-out systems, academic librarians:

  1. Publish in peer-reviewed journals, edit such journals, serve on the editorial boards of such journals, write books, and edit books. They also write, edit, and serve on editorial boards of a variety of other publications.
  2. Write proposals for, manage, and analyze the results of funded research projects.
  3. Make presentations at professional conferences and elsewhere.
  4. Teach for-credit and non-credit courses.
  5. Serve as adjunct faculty in information schools.
  6. Serve on committees and as officers of professional associations.
  7. Often obtain multiple master’s degrees.

This is not to say that other librarians do not also perform the above activities; however, academic librarians with faculty or faculty-like status are typically required to do 1, 3, and 6, with the main difference among such requirements being the need to perform the higher-level activities in 1. And they are "rewarded" for performing all of them.

So, what other disciplines with Ph.D. programs have potential students with similar requirements? If the answer is "none" and if the above activities are not viewed as a kind of faux scholarship, then it would appear that experienced members of this client group (say those with associate status or above) have characteristics that suggest that their need for enculturation, lengthy preliminary study, and other academic requirements that are obviously needed for freshly minted undergraduates or inexperienced MLS graduates is limited or nonexistent. Consequently, they may be quite successful in online Ph.D. programs where these other students would fail, especially if online study is supplemented with brief on-campus stays.

The Nearly Nonexistent Online Ph.D.

Well, I had that bit about UNT offering an online Ph.D. program wrong. UNT’s grant PI Brian O’Connor says:

I must say that our program is NOT web-based, though it has a strong web component. Our IMLS cohort members are considered members of the same doctoral program and subject to the same requirements as our "residential" students. . . . The IMLS program is design[ed] so that more than 51% of course time is conducted with the same level of face-to-face engagement between students and faculty as would be the case for residential students. . . . I would comment that enculturation is terribly important, though one might be able to imagine someone making major contributions while not being "enculturated.". . . Perhaps more intriguing is the assertion that enculturation cannot be adequately accomplished within a virtual environment. Is this a necessary case? Is it not at all true now, but possible with different technology?. . . . Is there not some virtual way to accomplish critical thinking, sharing, debating, using different perspectives? So far, our experience shows that such give and take is quite possible, especially if the students have had an opportunity to meet each other face-to-face at some point early on. Please do not take the above to mean that I prefer the possibility of a virtual academy—I do not. I am simply suggesting that we not toss out the possibilities, at least, not yet.

Looks like we’re down to one online Ph.D. program (and waiting for a disclaimer on that one). Since it’s only been about 12 years since the Web took off with the release of the alpha version of Mosaic, I guess we need to be patient.

The Ever-Elusive Online Ph.D.

Recently, there has been some discussion of online Ph.D. programs in information studies on the JESSE list. I probably don’t need to tell you that few such programs exist.

The University of North Texas has an online Ph.D. program. However, this IMLS-funded program limits who can apply to school media and public librarians. Nova University has had one for quite some time.

While I’m sure that information school faculty have many good reasons why they believe that such degrees cannot be offered online, I’m afraid that to some academic librarians, who are not about to abandon their day jobs and who have no program within striking distance, this seems like a decidedly 19th-century viewpoint, especially if offered by a school that has morphed into an avant-garde “I” school.

It also seems to be based on the peculiar notion that all Ph.D.s must want to teach. Academic librarians, who are "neither fish nor fowl," may want a Ph.D. for other career reasons.

But leaving that aside, is it really the case that, in 2005, the rich diversity of online tools at our disposal cannot substitute for pressing the flesh, especially if augmented by brief on-campus stays? If that’s really true, why aren’t online MLS degrees second-rate? Isn’t physical proximity as important to future library professionals as to the future teachers of library (and other) professionals?

There is a certain delicious irony in the fact that "I" schools, like my old alma mater Syracuse University, strive mightily and successfully to teach and develop advanced technologies, but cannot bring themselves to use them to deliver online Ph.D. degrees in subject areas like digital libraries. Yet, they offer online digital library CAS degrees (SU and UIUC) without any apparent qualms.

But, it’s unfortunate that, by doing so, they deprive potential students of doctoral degrees and themselves of an expanded client base.

Joint Institutional Repository Evaluation Project

The Johns Hopkins University Digital Knowledge Center, in conjunction with MIT and the University of Virginia, is working on a Mellon Foundation-funded "A Technology Analysis of Repositories and Services" project to: "conduct an architecture and technology evaluation of repository software and services such as e-learning, e-publishing, and digital preservation. The result will be a set of best practices and recommendations that will inform the development of repositories, services, and appropriate interfaces."

The grant proposal and a presentation given at the CNI Spring 2005 Task Force Meeting provide further details about the project.

FCLA Digital Archive

Since 2003, the Florida Center for Library Automation (FCLA) has been creating an IMLS-grant-funded Digital Archive (DA) to serve Florida’s public universities. The DA project’s goals are to: "1) establish a working and trusted digital archive, 2) identify costs involved in all aspects of archiving, and 3) disseminate tools, procedures and results for the widest national impact."

The DA will "accept submission packages from participating partners, ingest digital documents along with the appropriate metadata, and safely store on-site and off-site copies of the files."

The DA is a "dark" archive:

Our original idea, and the one thing that did not change over time, was the idea of building a dark archive. By “dark” I mean an archive with no real-time, online access to the content by anyone except repository staff. Dark archives are out of favor right now but we had some good reasons for it. We serve ten state universities and each of them has its own presentation systems and some have their own institutional repository systems. Some of the libraries use FCLA delivery applications but some use their own applications. Central Florida uses CONTENTdm, South Florida uses SiteSearch and Florida State uses DigiTool. At FCLA we don’t have the software to replicate these access functions and we don’t have any desire to; it would cost a great deal to acquire the software licenses, and it would take a huge amount of staff to support all these applications. So the idea of our offering presentation services on top of our stored repository content wasn’t feasible.

Real-life digital preservation efforts are always worth keeping an eye on, and this one is quite ambitious. You can track their progress through their grant page and their publications and presentations page.

The project’s most recent presentation by Priscilla Caplan ("Preservation Rumination: Digital Preservation and the Unfamiliar Future") is available from OCLC in both PowerPoint and MP3 formats.

Is the Access Spectrum a Red Herring or Are Green and Gold Too Black and White?

Stevan Harnad has commented extensively on my "The Spectrum of E-Journal Access Policies: Open to Restricted Access" DigitalKoans posting. Thanks for doing so, Stevan. Here are my thoughts on your comments.

First, let me concede that, if you look at this question from Stevan’s particular open-access-centric point of view, the spectrum of publisher access policies is, of course, a complete and utter waste of time. I don’t recall suggesting that this was a new open access model per se, even though it includes open access as a component and makes some further distinctions between open access and free access journals. Rather, it is what it says it is: a model that presents a range of publisher access policies from the least restrictive to the most restrictive. The color codes merely enhance the model slightly; they are not central to it (and, of course, as Stevan says, he created this color coding Frankenstein to begin with). The model says nothing about e-prints.

That said, Stevan’s view that open access equals free access (period) is not, as he well knows, universal, and his green and gold models are based on this premise.

Here is how Peter Suber defines OA in "Open Access Overview: Focusing on Open Access to Peer-Reviewed Research Articles and Their Preprints" (boldface is mine):

  • OA should be immediate, rather than delayed, and OA should apply to the full-text, not just to abstracts or summaries.
  • OA removes price barriers (subscriptions, licensing fees, pay-per-view fees) and permission barriers (most copyright and licensing restrictions).
  • There is some flexibility about which permission barriers to remove. For example, some OA providers permit commercial re-use and some do not. Some permit derivative works and some do not. But all of the major public definitions of OA agree that merely removing price barriers, or limiting permissible uses to "fair use" ("fair dealing" in the UK), is not enough.
  • Here’s how the Budapest Open Access Initiative put it: "There are many degrees and kinds of wider and easier access to this literature. By ‘open access’ to this literature, we mean its free availability on the public internet, permitting any users to read, download, copy, distribute, print, search, or link to the full texts of these articles, crawl them for indexing, pass them as data to software, or use them for any other lawful purpose, without financial, legal, or technical barriers other than those inseparable from gaining access to the internet itself. The only constraint on reproduction and distribution, and the only role for copyright in this domain, should be to give authors control over the integrity of their work and the right to be properly acknowledged and cited."
  • Here’s how the Bethesda and Berlin statements put it: "For a work to be OA, the copyright holder must consent in advance to let users ‘copy, use, distribute, transmit and display the work publicly and to make and distribute derivative works, in any digital medium for any responsible purpose, subject to proper attribution of authorship….’"
  • The Budapest (February 2002), Bethesda (June 2003), and Berlin (October 2003) definitions of "open access" are the most central and influential for the OA movement. Sometimes I refer to them collectively, or to their common ground, as the BBB definition.

So, by most OA definitions, a journal that "makes all of its articles immediately and permanently accessible to all would-be users webwide toll-free" is not OA unless it also uses a Creative Commons or similar license that permits use with minimal restrictions. It is FA (Free Access). As I have said in an earlier dialog, we can count on no journal to be "permanently accessible" unless some trusted archive other than the publisher makes it so, a point on which Stevan apparently disagrees, believing that publishers never go out of business.

I note that Stevan has deviated from his "chrononomic parsimony" principle by having both "Green" and "Pale-Green" in his model and then lumping them both together in his discussions as "GREEN." (In his Summary Statistics So Far site he also introduces the color Grey, for "neither yet.") If preprints and postprints are of equal value, why not just code them Green? If they are not of equal value (i.e., postprints that accurately incorporate the changes that occur during the peer-review process are the only real substitute for the published article), then, in reality, those 15.5% of "Pale-Green" journals are of limited value in terms of self-archiving, and the real GREEN journal number is 76.2%, not 92%.

I must admit to some confusion on his latest stand that all types of self-archiving are equal. In "Ten Years After," he seems to be expressing a different sentiment regarding author home pages:

That said, there was a naive element to the Subversive Proposal, too, since Harnad’s plan would have led to researchers posting their papers on thousands of isolated FTP sites. This would have meant that anyone wanting to access the papers would have needed prior knowledge of the papers’ existence and the whereabouts of every relevant archive. They would then have had to search each archive separately. Today, Harnad concedes that "anonymous FTP sites and arbitrary Web sites are more like common graves, insofar as searching is concerned."

Perhaps I misunderstand what is meant by "arbitrary Web sites."

As the prior DigitalKoans dialog beginning with "How Green Is My Publisher?" shows, we clearly disagree on many points related to the importance of author copyright agreements (e.g., they have to permit deposit in disciplinary archives), the importance of deposit in OAI-PMH-compliant archives, and the mission and scope of institutional repositories.

A series of DigitalKoans postings that start with "The View from the IR Trenches, Part 1" provides numerous quotes from the literature that bolster my case.

Second, while I admire Stevan’s unflagging advocacy of open access (by which he really means free access), open access is not the only issue in the e-journal publishing world that is of concern to librarians to whom this missive was mainly addressed. This is because librarians, while hopefully working to build a better future, have to deal with the messy existing realities of the e-publishing environment to do their jobs and to make decisions about how to allocate scarce resources. Consequently, librarians have to scan the e-publishing environment, analyze it, categorize it, and make evaluative judgements about it. They have to make models of e-publishing reality to better understand it. They don’t have the luxury of only dreaming about what that reality should be.

Thus, while Stevan is indifferent to many of those 894,302 free full-text articles from 857 HighWire-hosted journals (a number which likely dwarfs all articles available from OA/free journals), librarians are not. Paying attention to them is important. While many are not immediately free, they are free nonetheless after some embargo period. And EA (Embargoed Access) journals are better than RA (Restricted Access) journals in practical terms for users who have no other current access. And even limited access to more restrictive PA (Partial Access) journals is likely to be welcomed by users who today would have no access otherwise. I know that both kinds of access are welcomed by me as a user.

This is not to say that we shouldn’t strive for journals to move up the spectrum from red to green, but it is to say that: (1) some free access is better than no free access for journals that will never move further up the spectrum, and (2) it may be that some journals have to move step-by-step, not in one leap, for the change to take place, and, if they start higher, it may be easier to encourage them to move further and faster. (But we have to know which ones have this potential based on their current status.)

Stevan’s model has colors, but, in reality, each color is black and white: Gold and nothing, GREEN and grey. All or nothing. And, as long as you accept his premises, it works, and it allows him to focus on his free-access goal with single-minded determination, undistracted by the knotted complexities of the e-scholarly publishing environment. Long may he run.

For those who have a different view of OA or who have broader concerns, it’s too "black and white."

I give him the last word on this matter.

The Spectrum of E-Journal Access Policies: Open to Restricted Access

As journal publishing continues to evolve, the access policies of publishers become more differentiated. The open access movement has been an important catalyst for change in this regard, prodding publishers to reexamine their access policies and, in some cases, to move towards new access models.

To fully understand where things stand with journal access policies, we need to clarify and name the policies in use. While the below list may not be comprehensive, it attempts to provide a first-cut model for key journal access policies, adopting the now popular use of colors as a second form of shorthand for identifying the policy types.

  1. Open Access journals (OA journals, color code: green): These journals provide free access to all articles and utilize a form of licensing that puts minimal restrictions on the use of articles, such as the Creative Commons Attribution License. Example: Biomedical Digital Libraries.
  2. Free Access journals (FA journals, color code: cyan): These journals provide free access to all articles and utilize a variety of copyright statements (e.g., the journal copyright statement may grant liberal educational copying provisions), but they do not use a Creative Commons Attribution License or similar license. Example: The Public-Access Computer Systems Review.
  3. Embargoed Access journals (EA journals, color code: yellow): These journals provide free access to all articles after a specified embargo period and typically utilize conventional copyright statements. Example: Learned Publishing.
  4. Partial Access journals (PA journals, color code: orange): These journals provide free access to selected articles and typically utilize conventional copyright statements. Example: College & Research Libraries.
  5. Restricted Access journals (RA journals, color code: red): These journals provide no free access to articles and typically utilize conventional copyright statements. Example: Library Administration and Management. (Available in electronic form from Library Literature & Information Science Full Text and other databases.)
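For readers who think better in code, the five categories above can be read as a small decision procedure. The Python sketch below is only one way to express it. The three inputs (how many articles are free, whether free access is delayed by an embargo, and whether a Creative Commons Attribution License or similar license is used) are my simplification of the definitions, and the function and parameter names are hypothetical.

```python
from enum import Enum


class AccessPolicy(Enum):
    """The five policy types, with their color codes as values."""
    OA = "green"    # Open Access
    FA = "cyan"     # Free Access
    EA = "yellow"   # Embargoed Access
    PA = "orange"   # Partial Access
    RA = "red"      # Restricted Access


def classify(free_articles, embargoed=False, open_license=False):
    """Map a journal's policy to a category.

    free_articles: "all", "some", or "none" (which articles are ever free).
    embargoed:     True if free access begins only after an embargo period.
    open_license:  True if a Creative Commons Attribution License or a
                   similar minimal-restriction license is used.
    """
    if free_articles == "all":
        if embargoed:
            return AccessPolicy.EA
        return AccessPolicy.OA if open_license else AccessPolicy.FA
    if free_articles == "some":
        return AccessPolicy.PA
    return AccessPolicy.RA


# A journal that is free to read immediately but keeps a conventional
# copyright statement lands in cyan, not green.
policy = classify("all", embargoed=False, open_license=False)
```

Real journals will not always fit three tidy inputs (an embargoed journal with an open license, for instance, is treated here simply as EA), which is part of why the list is offered as a first-cut model.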

Using this taxonomy, an examination of the contents of the Directory of Open Access Journals quickly reveals that, in reality, it is the Directory of Open and Free Access Journals, because many listed journals do not use a Creative Commons Attribution License or similar license.

Some may argue that the distinction between OA and FA journals is meaningless; however, to do so suggests that the below sections of the "Budapest Open Access Initiative" in italics are meaningless and, consequently, that the Open Access movement is really just the Free Access movement.

By "open access" to this literature, we mean its free availability on the public internet, permitting any users to read, download, copy, distribute, print, search, or link to the full texts of these articles, crawl them for indexing, pass them as data to software, or use them for any other lawful purpose, without financial, legal, or technical barriers other than those inseparable from gaining access to the internet itself. The only constraint on reproduction and distribution, and the only role for copyright in this domain, should be to give authors control over the integrity of their work and the right to be properly acknowledged and cited.

Not that there would be anything wrong with the Free Access movement, but some may feel that the broader scope of the Open Access movement is much more desirable.

In any case, the journal universe is not just green or red, and it’s a pity that we don’t know the breakdown of the spectrum (e.g., x number of green journals and y number of cyan journals), for that would give us a better handle on how the world has changed from the days when all journals were red journals.
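If such a census existed, the tally itself would be simple. Below is a minimal sketch in Python, not a real dataset. The input reuses only the three example journals named in the taxonomy above. The `AccessPolicy` enum and the `spectrum_breakdown` function are names invented here for illustration, and the sketch assumes that OA journals are coded green and FA journals cyan, as the color names in the preceding paragraph suggest.

```python
from collections import Counter
from enum import Enum


class AccessPolicy(Enum):
    """Access categories from the taxonomy, paired with their color codes."""
    OA = "green"    # Open Access (color assumed from the text above)
    FA = "cyan"     # Free Access (color assumed from the text above)
    EA = "yellow"   # Embargoed Access
    PA = "orange"   # Partial Access
    RA = "red"      # Restricted Access


def spectrum_breakdown(journals):
    """Count journals per color code, listing every color even when its count is zero."""
    counts = Counter(policy.value for policy in journals.values())
    return {policy.value: counts.get(policy.value, 0) for policy in AccessPolicy}


# Illustrative input: only the three example journals named in the taxonomy.
journals = {
    "Learned Publishing": AccessPolicy.EA,
    "College & Research Libraries": AccessPolicy.PA,
    "Library Administration and Management": AccessPolicy.RA,
}

breakdown = spectrum_breakdown(journals)
for color, count in breakdown.items():
    print(f"{color}: {count}")
```

The hard part, of course, is not the counting but assigning each journal to a category in the first place.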

Institutional Repository Overviews: A Brief Bibliography

You want a good introduction to institutional repositories. What should you read? Try one or more of the works below. For a quick overview, try Drake, Johnson, or Lynch. For more detail, try Crow or Ware. For an in-depth, library-oriented overview, Gibbons can’t be beat.

Crow, Raym. The Case for Institutional Repositories: A SPARC Position Paper. Washington, DC: The Scholarly Publishing and Academic Resources Coalition, 2002.

Drake, Miriam A. "Institutional Repositories: Hidden Treasures." Searcher 12, no. 5 (2004): 41-45.

Gibbons, Susan. "Establishing an Institutional Repository." Library Technology Reports 40, no. 4 (2004). (Available on Academic Search Premier.)

Johnson, Richard K. "Institutional Repositories: Partnering with Faculty to Enhance Scholarly Communication." D-Lib Magazine 8 (November 2002).

Lynch, Clifford A. "Institutional Repositories: Essential Infrastructure for Scholarship in the Digital Age." ARL: A Bimonthly Report on Research Library Issues and Actions from ARL, CNI, and SPARC, no. 226 (2003): 1-7.

Ware, Mark. Pathfinder Research on Web-based Repositories. London: Publisher and Library/Learning Solutions, 2004.

More Blind Than Double-Blind Review?

The Wall Street Journal has published an interesting article on the failure of medical journals to adequately screen articles (reprinted below in the Pittsburgh Post-Gazette):

http://www.post-gazette.com/pg/05130/501996.stm

To quote:

A study published in the Journal of the American Medical Association last year reviewed 122 medical-journal articles and found that 65 percent of findings on harmful effects weren’t completely reported. It also found gaps in half the findings on how well treatments worked. . . .

Journal editors rarely see the complete design and outcome of the studies summarized in articles submitted for publication. A typical article is perhaps six or seven pages long, even when the research behind it took years and involved thousands of patients. Peer reviewers — other scientists who work voluntarily to review articles before they are published — also see only the brief article. They might fail to notice suspicious omissions and changes in focus, or, if they do, lack the time or inclination to follow them up.

The View from the IR Trenches, Part 4

Today, we’ll look at an article that describes the results of a one-year study at the University of Rochester’s River Campus Libraries to "understand the current work practices of faculty in different disciplines in order to see how an IR might naturally support existing ways of work."

Foster, Nancy Fried, and Susan Gibbons. "Understanding Faculty to Improve Content Recruitment for Institutional Repositories." D-Lib Magazine 11, no. 1 (2005).
http://www.dlib.org/dlib/january05/foster/01foster.html

Selected quotes from the article are below; the headings are mine. Caveat emptor: selected quotes are just that. It’s always a good idea to read the full paper. I would hope that these brief quotes entice you to do so.

Faculty Needs

The people we interviewed want most to be able to. . .

  • Work with co-authors
  • Keep track of different versions of the same document
  • Work from different computers and locations, both Mac and PC
  • Make their own work available to others
  • Have easy access to other people’s work
  • Keep up in their fields
  • Organize their materials according to their own scheme
  • Control ownership, security, and access
  • Ensure that documents are persistently viewable or usable
  • Have someone else take responsibility for servers and digital tools
  • Be sure not to violate copyright issues
  • Keep everything related to computers easy and flawless
  • Reduce chaos or at least not add to it
  • Not be any busier

Using Standard IR Terminology Doesn’t Work

Accordingly, when we tried to recruit content using typical IR promotional language, faculty members and researchers did not respond enthusiastically. This is because they did not perceive the relevance of almost any of the IR features as stated in the terms used by librarians, archivists, computer programmers, and others who were setting up and running the IR for the institution. One reason faculty have not rushed to put their work into IRs, therefore, is that they do not recognize its benefits to them in their own terms.

Another reason that faculty have expressed little interest in IRs is related to the way the IR is named and organized. The term ‘institutional repository’ implies that the system is designed to support and achieve the needs and goals of the institution, not necessarily those of the individual. Moreover, it suggests that contributions of materials into the repository serve to highlight the achievements of the institution, rather than those of individual researchers and authors. . . .

Faculty Are Most Interested in Communicating with Colleagues Worldwide

When it comes to research, a faculty member’s strongest ties are usually with a small circle of colleagues from around the world who share an interest in the same field of research, such as plasma astrophysics or contemporary European critical thought. It is with these colleagues, many of them at other institutions, that researchers most want to communicate and share their work. But most organizations have mapped their IR communities to their academic departments rather than to the subtle, shifting communities of scholars engaged in interrelated research projects. . . . In the absence of a strong connection that would naturally bring these documents together into a collection that other scholars would look for, find, and use, there is no compelling reason for the authors to make the submission.

One-on-One Librarian-Faculty Sessions Are Best Way to Interest Faculty

Rather than approach faculty with a set, one-size-fits-all promotional spiel, these library liaisons operate under the guidance that a personalized, tailored approach works best. As we learned from the work-practice study, what faculty members care most about is their research. . . . Throughout the conversation, the library liaison is listening for opportunities to demonstrate how the benefits of the IR respond directly to the faculty member’s web-related research needs. . . .

IR Benefits Must Be Stated in Terms That Faculty Relate To

By contrast to the language previously used to describe the features and benefits of the IR, we are now describing the IR in language drawn from faculty interviews. Thus, we tell faculty that the IR will enable them to. . .

  • Make their own work easily accessible to others on the web through Google searches and searches within the IR itself
  • Preserve digital items far into the future, safe from loss or damage
  • Give out links to their work so that they do not have to spend time finding files and sending them out as email attachments
  • Maintain ownership of their own work and control who sees it
  • Not have to maintain a server
  • Not have to do anything complicated

Scholarly Communication Web Sites at ARL Libraries

The Association of Research Libraries (ARL) currently has 123 member libraries in the US and Canada. Below is a list of scholarly communication web sites at ARL libraries. This list was compiled by a quick examination of ARL libraries’ home pages, supplemented by some Google searching. It’s not comprehensive, and I would welcome additions.

More on OhioLINK’s Digital Resource Commons

David F. Kohl has self-archived a PowerPoint presentation about the DRC at E-LIS. It’s called "Cooperating Beyond the ‘Buying Club’: Digital Resource Commons (DRC): Making the Impossible Possible in Ohio."

To quote from the abstract:

Each institution can ‘brand’ itself in the system and may host a discrete and customized interface to all of its content. To the end user it will appear as an institutional resource as if it were hosted on your own servers. There will also be a collective OhioLINK level branding and ability for searches to retrieve across the institutional collections. . . . You will have complete control of your own content and how it is accessed. Multi-tiered security levels will allow your content to be shared only to the extent desired. . . .

Alternatively content can be restricted to an individual department, to an institution, or to the OhioLINK membership. Each institution can set its own policies governing the content in its repositories. Likewise custom workflows can be established to make the most of the personnel involved in each project and expedite the content creation and capture process. The service will include robust and flexible cataloging tools to aid in the creation of records that can be searched and browsed effectively by all types of users. Catalog records can be exported in international standard XML formats such as the Open Archives Initiative Protocol for Metadata Harvesting. Through OhioLINK’s unique collaboration with the Ohio Supercomputer Center your content is stored on enterprise class servers and storage networks. . . . A huge storage area network allows virtually unlimited storage space on our disks. . . . Programming or system administration skills and experience are not required. The system is flexible and adaptable and provides services superior to ‘DSpace’ and ‘ContentDM’ without the associated costs.
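For readers unfamiliar with the protocol the abstract mentions, an OAI-PMH harvest is just an ordinary HTTP request. Here is a minimal sketch in Python of how such a request might be built. The base URL is a placeholder, not the DRC’s actual endpoint (the abstract does not give one), and `build_list_records_url` is a name made up for this example. The `verb`, `metadataPrefix`, and `set` parameters, and the `oai_dc` prefix for unqualified Dublin Core, come from the OAI-PMH 2.0 specification.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: the real DRC base URL is not given in the abstract.
BASE_URL = "https://repository.example.edu/oai"


def build_list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL (unqualified Dublin Core by default)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        params["set"] = set_spec  # optional: restrict the harvest to one set
    return f"{base_url}?{urlencode(params)}"


url = build_list_records_url(BASE_URL)
print(url)
```

A service provider would fetch that URL and parse the XML that comes back; this sketch stops at building the request.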

OhioLINK’s Digital Resource Commons

Peter Murray, Assistant Director of Multimedia Systems at OhioLINK, recently posted a job announcement on LITA-L (I’d link, but given the way ALA safeguards access to its lists, it’s simply impossible) that brought to my attention a bold OhioLINK project called the Digital Resource Commons, which is part of an even bolder project called the Ohio Digital Commons for Education. The quote from the job ad below describes the Digital Resource Commons. An earlier part of the ad indicates that Fedora will be used as the DRC’s platform.

OhioLINK’s Digital Resource Commons (DRC) is an Ohio Board of Regents-funded project to create a federated repository service that ingests, preserves, presents, and mediates administration of the educational and research materials of participating institutions. With the capability to store and deliver a virtually unlimited variety of digital file types and formats (including text, data sets, image, audio, video, streaming video, multimedia presentations, animations, etc.) the DRC is positioned to capture digital content from student and faculty researchers as it is produced and return it to users of the DRC upon request. The DRC offers wide and flexible control to member institutions and the communities within institution to define how content is added, preserved, and displayed to repository users. With federated community administration features, lead contacts at member institutions can create communities and delegate up to a complete subset of their privileges within the system to the editors/moderators of those new communities. The ability to scope and brand content to a particular community and institution is offered while retaining the ability to search for content across the entire repository. As both an Open Archives Initiative Data Provider and Service Provider, the DRC is positioned to become the premier point for the discovery of knowledge by and about Ohio’s scholars. In conjunction with the other parts of the Ohio Board of Regents grant funding, the DRC is one piece of a larger effort to build the Ohio Digital Commons for Education—a powerful vision for the future of learning and research in the state of Ohio.

The quote below from the DRC Web site describes the Ohio Digital Commons for Education.

The Digital Resource Commons is one of three projects funded by an Ohio Board of Regents Technology Initiatives grant collectively called the Ohio Digital Commons for Education (ODCE). The three components—this resource repository, the state-wide licensing and development of course management systems (WebCT and Blackboard), and a common access control mechanism (Shibboleth)—combine to offer a powerful vision for learning and research for the state of Ohio.

Impressive. As Daniel Hudson Burnham said: "Make no little plans; they have no magic to stir men’s blood and probably themselves will not be realized."

The View from the IR Trenches, Part 3

Today, we’ll look at an article that provides a UK academic library’s view of its institutional repository responsibilities:

Nixon, William J. "The Evolution of an Institutional E-Prints Archive at the University of Glasgow." Ariadne, no. 32 (2002).
http://www.ariadne.ac.uk/issue32/eprint-archives/

Selected quotes from the article are below; the headings are mine. Caveat emptor: selected quotes are just that. It’s always a good idea to read the full paper. I would hope that these brief quotes entice you to do so.

Library IR Roles

(The below quotes are from a summary list of library roles in the article.)

IR Advocate

Encouraging members of the University to deposit material into the ePrints archives. At Glasgow we have started an Advocacy campaign to demonstrating that this service has a broader context beyond Glasgow . . . A recent event to raise awareness about the issues of Scholarly Communication provided us with an opportunity to launch our e-prints service and to raise its profile

Copyright Advisory Service

Providing advice to members of the University about copyright and journal embargo policies for material which they would like to deposit in our archive, and as appropriate liaising directly with the Journal in question. This will become a pivotal role in the acceptance of our e-prints service since copyright is the number one question which members of the University ask about

Digitization Service

Converting material to a suitable format such as HTML or PDF for import into the archive. It may also be necessary to ensure that HTML which is submitted is properly formatted and cross-browser compatible

Deposit Service

Depositing material directly on behalf of members of the University who do not, or cannot self-archive their material. In instances in which we have deposited papers on behalf of individuals, we have created a new account for them and used that to submit their content. . . .

Metadata Review and Creation Service

Reviewing the metadata of content which has been self-archived to maintain the quality of the record and to add any additional subject headings and keywords as appropriate.

Here Comes the Sun: Morphing Library Journals

Information Technology and Libraries (ITAL) has a new editor, John Webb, and he’s outlined an ambitious agenda for the journal in his initial editorial in the March 2005 issue (volume 24, no. 1).

That issue includes articles on e-book myths, the International Children’s Library, and the Music of Social Change (MOSC) project. It’s a very promising start that suggests that he may be able to reinvigorate ITAL.

For those of you who are unfamiliar with ITAL, it is a low-cost refereed journal published by the Library and Information Technology Association. There is free access to selected articles published in the journal from March 2001 to March 2004. There is no information on the Web site about any other issues (including the current one), except a note about potential retrospective digitization.

In case you haven’t noticed, OCLC Systems & Services now has a subtitle of "International Digital Library Perspectives." Since the journal now seems to be primarily about digital libraries, why the title wasn’t changed completely is a bit of a mystery. It is a refereed "for-fee" journal with no free access, which is published by Emerald. It’s edited by Bradford Lee Eden.

Both of these journals have high-quality free competitors in or significantly overlapping their niche (e.g., Ariadne, D-Lib Magazine, and RLG DigiNews). To a lesser degree, they also overlap with other significant free (e.g., First Monday, High Energy Physics Libraries Webzine, and Issues in Science & Technology Librarianship) and free-with-embargo-access journals (Learned Publishing), not to mention some major for-fee global competitors. This presents the editors with paper recruitment challenges, especially since US authors now happily cross the big pond when they seek homes for their papers.

Both of these morphing journals are worth keeping an eye on.

The View from the IR Trenches, Part 2

Today, we’ll look at an article about the challenges involved in populating an institutional repository:

Mackie, Morag. "Filling Institutional Repositories: Practical Strategies from the DAEDALUS Project." Ariadne, no. 39 (2004).
http://www.ariadne.ac.uk/issue39/mackie/

The DAEDALUS Project is at the University of Glasgow. This article is an especially interesting case study, and it details a number of useful, imaginative strategies for populating an IR.

Selected quotes from the article are below; the headings are mine. Caveat emptor: selected quotes are just that. It’s always a good idea to read the full paper. I would hope that these brief quotes entice you to do so.

Faculty Do Not Want to Deposit Works Themselves

Despite a generally encouraging response, this did not translate into real content being deposited in the repository. . . . We found that it was difficult to get staff to give or send us electronic copies of their papers, even when they had promised to do so. This was our first indication that while staff may be sympathetic many of them do not have the time or the inclination to contribute. They were happy to give us permission to do the work on their behalf, but could not commit to doing the work themselves. Clearly the advantages of institutional repositories were not yet sufficiently convincing to academics to persuade them to play an active part in the process.

Determining Which Articles Can Be Legally Deposited Is Difficult and Time Consuming

[T]he majority of academics we contacted were happy for us to establish which of their publications could be added to the repository.

While an extremely useful resource and one that is growing all the time, the [SHERPA] list does not cover all publishers. . . . it has been necessary to track down policies from publishers’ Web sites, or to contact publishers directly where these do not exist or where they do not address the issue of whether an author is permitted to make his or her paper available in a repository. No two publisher polices are exactly the same, and many do not explicitly state what rights authors have in relation to repositories. . . . Interpreting publisher copyright policies is also a difficult area, particularly as there is no real precedent and no case law.

Where copyright policies did not exist or where they were unclear, we contacted the publishers directly and asked for permission. . . . Although some publishers reply quickly, others may take some weeks and some do not reply at all. We found that publishers were more likely to give permission for specific papers to be added than to outline their general policy on the issue. Consequently permissions for most articles have to be established on a case-by-case basis.

It Is Challenging to Identify Possible Depositors Using Open Access Journals

It would be useful to be able to identify additional content in other open access journals, but so far we have not found an easy way of doing this. The Directory of Open Access Journals. . . is very useful, but it does not enable searching by institution or author affiliation.

For IRs to Be Filled, Deposit May Need to Be Mandated

Although we have succeeded in adding a reasonable amount of content to the repository we have also been offered significant amounts of content that cannot be added because of restrictive publisher copyright agreements. . . . This is a clear demonstration that major changes need to take place at a high level in order for repositories to be successful. Although some academics have taken the decision to try and avoid publishing in the journals of publishers with restrictive policies, this is still relatively rare. We can inform staff about the issues, but we cannot and should not dictate in which journals they publish. Change is only likely to happen if staff are required, either by the funding councils or by their institution, to make their publications available either by publishing in open access journals or in journals that permit deposit in a repository.

The View from the IR Trenches, Part 1

It may be helpful in understanding IRs to examine in more detail some of the articles mentioned in yesterday’s "Early Adopters of IRs: A Brief Bibliography" posting.

Today, we’ll look at:

Andrew, Theo. "Trends in Self-Posting of Research Material Online by Academic Staff." Ariadne, no. 37 (2003).
http://www.ariadne.ac.uk/issue37/andrew/

This paper presents findings from "a baseline survey of research material already held on departmental and personal Web pages in the ed.ac.uk domain" (this is the University of Edinburgh’s domain).

Selected quotes from the article are below; the headings are mine. Caveat emptor: selected quotes are just that. It’s always a good idea to read the full paper. I would hope that these brief quotes entice you to do so.

Self-Archiving Disciplinary Differences Matter

As expected, there is a clear difference between academic areas. The average percentage of self-archiving scholars in each College supports this view. Within the College of Science and Engineering (S&E) this figure is 14.81%, which drops to 3.18% within Humanities and Social Science (HSS) and 0.32% within Medicine and Veterinary Medicine (MVM).

However, the situation is more complex than a simple trend of self-archiving being better established in S&E. Looking at the averages between Schools shows that even within Colleges there is a wide distribution of values. In S&E this ranges from 32.67% in Informatics to 6.99% in Engineering and Electronics. . . and in HSS from 12.70% in Philosophy, Psychology and Language Sciences to 0% in Divinity and Law . . . .

Even within individual Schools there is a noticeable change in self-archiving attitudes. For example, self-archiving percentages within the School of GeoScience range from 29.41% in Meteorology down to 0% in Geography. . . .

Disciplinary Archives May Not Be Generally Trusted

Considering the wide-ranging self-archiving trends between academic Colleges and even within Schools, it seems there is a direct correlation between willingness to self-archive and the existence of subject-based repositories. . . . because the ArXiv has become so successful . . . academics trust it as their ‘natural’ repository for self-archived material. The same degree of trust may not yet obtain in the case of the subject repositories mentioned above, which leads to additional self-archiving in home institution repositories. . . . where there is a pre-existing culture of self-archiving eprints in subject repositories, scholars are more likely to post research material on their own Web pages, until such time as those subject repositories become trusted for their comprehensiveness and persistence.

Low Number of Preprints Found on Personal Web Pages

A surprising finding from the baseline survey is the relatively low volume of preprints found on personal Web pages. This could be related to the success of eprint repositories. . . . Preprints do not have anywhere near the same impact factor as those papers from accredited journal titles, so it is possible that researchers would favour only putting their most impressive work in their online CV.

Scholars Are Confused by Copyright Agreements

One aspect of the survey that is not shown in the results is the lack of consistency in dealing with copyright and IPR issues that scholars face when placing material online. Some academic units have responded by not self-archiving any material at all. . . . A small percentage of individual scholars have responded by using general disclaimers that may or may not be effective. Others, generally well-established professors, have posted material online that is arguably in breach of copyright agreements. . . . Most, however, take a middle line of only posting papers from sympathetic publishers who allow some form of self-archiving. It is apparent that if institutional repositories are going to work, then this general confusion over copyright and IPR issues needs to be addressed right at the source.