EDUCAUSE Quarterly Special Issue on Cloud Computing

EDUCAUSE Quarterly has published a special issue on cloud computing.

Here are some representative articles:

OCLC’s Web-Scale Library Management Services Available to Early Adopters on 7/1/10

Early adopters will be able to implement OCLC's Web-Scale Library Management Services starting on 7/1/10.

Here's an excerpt from the press release:

Beginning July 1, OCLC will work with libraries that are interested and prepared to implement Web-based services for acquisitions and circulation. This will be followed by successive updates for subscription and license management, and cooperative intelligence—analysis and recommendations based on statistics and workflow evaluation among participating libraries. The cloud computing environment and agile development methodology will facilitate incremental updates while minimizing impact to library operations.

Faced with scarce resources, disparate systems and local maintenance issues during a time when demand for library services has never been higher, OCLC members have made it clear that new, innovative responses are needed to meet these challenges. For the past eight months, OCLC has worked with an Advisory Council and six libraries and library groups as pilots for Web-scale management services. These groups have provided advice to OCLC on an overall direction, offered new ideas that were not in the original development plan, and validated strategic positioning for the service. . . .

OCLC Web-scale Management Services offer a next-generation choice for traditional, back-office operations. Moving these functions to the Web alongside cataloging and discovery activities allows libraries to lower the total cost of ownership for management services, automate critical operations, reduce support costs and free resources for high-priority services. It will also allow libraries and industry partners to develop unique and innovative workflow solutions that can then be shared across the profession.

"OCLC is extending our well established metadata management, resource sharing and discovery services to include the back-office management components of acquisitions and circulation which will allow libraries to extend their use of WorldCat for full library management functions and improved workflow,” said Andrew Pace, Executive Director, OCLC Networked Library Services. “This is a natural extension of OCLC’s mission to help libraries share costs and extend the power of cooperation."

Using Cloud for Research: A Technical Review

Xiaoyu Chen et al. have self-archived Using Cloud for Research: A Technical Review in the ECS EPrints Repository.

Here's an excerpt:

The purpose of the TeciRes project was to conduct a technical review of the current landscape within cloud computing to establish the extent to which existing solutions meet encountered and envisioned requirements for using emerging cloud technologies, in particular those which enable computing and storage cloud facilities for research in Higher Education (HE) institutions, and to make recommendations on further development, guidance, and standardisation.

Shaping the Higher Education Cloud

EDUCAUSE has released Shaping the Higher Education Cloud.

Here's an excerpt:

In February 2010, chief information officers, chief business officers, and industry leaders gathered in Tempe, Arizona, for a two-day EDUCAUSE/NACUBO Cloud Computing Workshop to explore what shape a higher education cloud might take and to identify opportunities and models for partnering together.

One important option is the development of collaborative service offerings among colleges and universities. Yet, substantial challenges raise at least some near-term concerns including risk, security, and governance issues; uncertainty about return on investment and service provider certification; and questions regarding which business and academic activities are best suited for the cloud.

This white paper captures key findings from those two days of exploring, including recommendations for cloud action.

"Using Cloud Services for Library IT Infrastructure"

Erik Mitchell has published "Using Cloud Services for Library IT Infrastructure" in the latest issue of the Code4Lib Journal.

Here's an excerpt:

Cloud computing comes in several different forms and this article documents how service, platform, and infrastructure forms of cloud computing have been used to serve library needs. Following an overview of these uses the article discusses the experience of one library in migrating IT infrastructure to a cloud environment and concludes with a model for assessing cloud computing.

Presentations from Repositories and the Cloud Meeting

Presentations from the recent Repositories and the Cloud meeting, which was sponsored by Eduserv and JISC, are now available.

Presentations included "Cloud-Based Projects at Belfast e-Science Centre," "Cloud Services for Repositories," "DuraCloud—Open Technologies and Services for Managing Durable Data in the Cloud," and "EPrints and the Cloud."

Read more about it at "Slides and Observations from “Repositories in the Cloud” London."

Cloud Computing and Repositories: Fedorazon: Final Report

JISC has released Fedorazon: Final Report.

Here's an excerpt:

The Fedorazon project is first and foremost the experiences of a small HE/FE team running and maintaining a Repository in the Cloud for one year. Being early adopters we provide both technical, fiscal and practical advice for both our successes and failures in this endeavour. We hope this report provides insight for other institutions wishing to utilise the Cloud for their Repository instance which we wholeheartedly recommend given they read this report first and prepare accordingly.

The Fedorazon project has discovered that a 'Repository in the Cloud' is easy to get up and running (both figuratively and literally); after that, all the complexity of hardware management, political costings and human resource allocation are still right where you left them. None the less we think there are significant cost savings in the Cloud that will only increase over time. We also believe that utilising the 'network effect' of the Cloud institutions can relieve the burden of having a local hardware expert to manage the repository instance. Finally, we believe that Cloud will lead to a significant change in the way we view repository architectures, especially in regards to how a 'preservation architecture' is achieved.

Towards Repository Preservation Services. Final Report from the JISC Preserv 2 Project

Steve Hitchcock, David Tarrant, and Les Carr have self-archived Towards Repository Preservation Services. Final Report from the JISC Preserv 2 Project in the ECS EPrints Repository.

Here's the abstract:

Preserv 2 investigated the preservation of data in digital institutional repositories, focussing in particular on managing storage, data and file formats. Preserv 2 developed the first repository storage controller, which will be a feature of EPrints version 3.2 software (due 2009). Plugin applications that use the controller have been written for Amazon S3 and Sun cloud services among others, as well as for local disk storage. In a breakthrough application Preserv 2 used OAI-ORE to show how data can be moved between two repository softwares with quite distinct data models, from an EPrints repository to a Fedora repository. The largest area of work in Preserv 2 was on file format management and an 'active' preservation approach. This involves identifying file formats, assessing the risks posed by those formats and taking action to obviate the risks where that could be justified. These processes were implemented with reference to a technical registry, PRONOM from The National Archives (TNA), and DROID (digital record object identification service), also produced by TNA. Preserv 2 showed we can invoke a current registry to classify the digital objects and present a hierarchy of risk scores for a repository. Classification was performed using the Preserv2 EPrints preservation toolkit. This 'wraps' DROID in an EPrints repository environment. This toolkit will be another feature available for EPrints v3.2 software. The result of file format identification can indicate a file is at risk of becoming inaccessible or corrupted. Preserv 2 developed a repository interface to present formats by risk category. Providing risk scores through the live PRONOM service was shown to be feasible. Spin-off work is ongoing to develop format risk scores by compiling data from multiple sources in a new linked data registry.
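The 'active' preservation steps described in the abstract (identify each file's format, look up a risk for that format, then present files by risk category) can be illustrated with a minimal sketch. It is not the Preserv2 toolkit: the extension-based `identify_format` function is a crude stand-in for DROID, and the `FORMAT_RISK` table is hypothetical, with illustrative values that do not come from PRONOM.

```python
from collections import defaultdict

# Hypothetical risk table: the categories below are illustrative only
# and are NOT taken from the PRONOM registry.
FORMAT_RISK = {
    "pdf": "low",
    "txt": "low",
    "doc": "medium",
    "wpd": "high",
}


def identify_format(filename: str) -> str:
    """Crude stand-in for DROID: guess the format from the file extension."""
    _, dot, ext = filename.rpartition(".")
    return ext.lower() if dot else "unknown"


def classify_by_risk(filenames):
    """Group files by risk category; unrecognised formats count as high risk."""
    report = defaultdict(list)
    for name in filenames:
        fmt = identify_format(name)
        report[FORMAT_RISK.get(fmt, "high")].append(name)
    return dict(report)


if __name__ == "__main__":
    print(classify_by_risk(["thesis.pdf", "notes.WPD", "README"]))
```

Treating unrecognised formats as high risk is a design assumption of this sketch; a real registry lookup would identify formats by internal signature rather than by file name.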

Duke, NC State, and UNC Data Sharing Cloud Computing Project Launched

Duke University, North Carolina State University, and the University of North Carolina at Chapel Hill have launched a two-year project to share digital data.

Here's an excerpt from the press release:

An initiative that will determine how Triangle area universities access, manage, and share ever-growing stores of digital data launched this fall with funding from the Triangle Universities Center for Advanced Studies, Inc. (TUCASI).

The two-year TUCASI data-Infrastructure Project (TIP) will deploy a federated data cyberinfrastructure—or data cloud—that will manage and store digital data for Duke University, NC State University, UNC Chapel Hill, and the Renaissance Computing Institute (RENCI) and allow the campuses to more seamlessly share data with each other, with national research projects, and private sector partners in Research Triangle Park and beyond.

RENCI and the Data Intensive Cyber Environments (DICE) Center at UNC Chapel Hill manage the $2.7 million TIP. The provosts, heads of libraries and chief information officers at the three campuses signed off on the project just before the start of the fall semester.

"The TIP focuses on federation, sharing and reuse of information across departments and campuses without having to worry about where the data is physically stored or what kind of computer hardware or software is used to access it," said Richard Marciano, TIP project director, and also professor at UNC's School of Information and Library Science (SILS), executive director of the DICE Center, and a chief scientist at RENCI. "Creating infrastructure to support future Triangle collaboratives will be very powerful."

The TIP includes three components—classroom capture, storage, and future data and policy, which will be implemented in three phases. In phase one, each campus and RENCI will upgrade their storage capabilities and a platform-independent system for capturing and sharing classroom lectures and activities will be developed. . . .

In phase two, the TIP team will develop policies and practices for short- and long-term data storage and access. Once developed, the policies and practices will guide the research team as it creates a flexible, sustainable digital archive, which will connect to national repositories and national data research efforts. Phase three will establish policies for adding new collections to the TIP data cloud and for securely sharing research data, a process that often requires various restrictions. "Implementation of a robust technical and policy infrastructure for data archiving and sharing will be key to maintaining the Triangle universities' position as leaders in data-intensive, collaborative research," said Kristin Antelman, lead researcher for the future data and policy working group and associate director for the Digital Library at NC State.

The tasks of the TIP research team will include designing a model for capturing, storing and accessing course content, determining best practices for search and retrieval, and developing mechanisms for sharing archived content among the TIP partners, across the Triangle area and with national research initiatives. Campus approved social media tools, such as YouTube and iTunesU, will be integrated into the system.

University of Michigan to Distribute Over 500,000 Digitized Books Using HP BookPrep POD Service

The University of Michigan Library will distribute over 500,000 rare and hard-to-find digitized books using the HP BookPrep POD service.

Here's an excerpt from the press release:

HP BookPrep — a cloud computing service that enables on-demand printing of books — brings new life to the traditional publishing model, making it possible to bring any book ever published back into print through an economical and sustainable service model.

As part of a growing movement to preserve and digitize historic content, major libraries are partnering with technology leaders to scan previously hard-to-find works using high-resolution photography. HP's process transforms these scans prior to printing by cleaning up some of the wear and tear that often is present in the originals.

HP BookPrep significantly drives down the cost of republishing books by eliminating the manual cleanup work that would otherwise be required. Based on imaging and printing technology from HP Labs, the company's central research arm, HP BookPrep automates the creation of high-quality, print-ready books from these raw book scans by sharpening text and images, improving alignment and coloration, and generating and adding covers.

People can now purchase high-quality print versions of public-domain, out-of-print books from the University of Michigan Library through HP BookPrep channels, including traditional and online retailers such as Amazon.com.

"People around the world still value reading books in print," said Andrew Bolwell, director, New Business Initiatives, HP. "HP BookPrep technology allows publishers to extend the life cycle of their books, removes the cost and waste burdens of maintaining inventory, and uses a full spectrum of technologies to deliver convenient access to consumers."

For publishers and content owners, HP BookPrep offers an opportunity to offer their full catalog of titles online, irrespective of demand. Because HP BookPrep is a web service that processes books as they are ordered, there is little upfront investment or risk as books are printed only after they are purchased, no matter the volume, eliminating the need for high carrying costs.

Consistently ranked as one of the top 10 academic research libraries in North America, the University of Michigan Library is a true repository for the human record. The print collection contains more than 7 million volumes, covering thousands of years of civilization. HP is collaborating with the university to eliminate barriers and increase access to content as part of an ongoing effort to make the concept of "out of print" a thing of the past.

"Our partnership with HP is a testament to the University of Michigan Library's commitment to increase public access to our library's collections and our continued innovative use of digitization," said Paul N. Courant, librarian and dean of libraries, University of Michigan. "We are excited that HP BookPrep can offer print distribution of the public domain works in our collection and help to provide broad access to works that have previously been hard to find outside the walls of our library."

The collaboration also builds upon HP's existing relationship with Applewood Books, a publisher of historical, Americana books. The company, which has been using HP BookPrep for the last year to republish hundreds of titles, also will distribute HP BookPrep's best-selling titles from the University of Michigan Library.

"Digital Preservation: Logical and Bit-Stream Preservation Using Plato, EPrints and the Cloud"

Adam Field, David Tarrant, Andreas Rauber, and Hannes Kulovits have self-archived their "Digital Preservation: Logical and Bit-Stream Preservation Using Plato, EPrints and the Cloud" presentation on the ECS EPrints Repository.

Here's an excerpt from the abstract:

This tutorial shows attendees the latest facilities in the EPrints open source repository platform for dealing with preservation tasks in a practical and achievable way, and new mechanisms for integrating the repository with the cloud and the user desktop, in order to be able to offer a trusted and managed storage solution to end users. . . .

The benefit of this tutorial is the grounding of digital curation advice and theory into achievable good practice that delivers helpful services to end users for their familiar personal desktop environments and new cloud services.

7 Things You Should Know About Cloud Computing

EDUCAUSE has released 7 Things You Should Know About Cloud Computing.

Here's the abstract:

Cloud computing is the delivery of scalable IT resources over the Internet, as opposed to hosting and operating those resources locally, such as on a college or university network. Those resources can include applications and services, as well as the infrastructure on which they operate. By deploying IT infrastructure and services over the network, an organization can purchase these resources on an as-needed basis and avoid the capital costs of software and hardware. With cloud computing, IT capacity can be adjusted quickly and easily to accommodate changes in demand. Cloud computing also allows IT providers to make IT costs transparent and thus match consumption of IT services to those who pay for such services. Operating in a cloud environment requires IT leaders and staff to develop different skills, such as managing contracts, overseeing integration between in-house and outsourced services, and mastering a different model of IT budgets.

DuraCloud to Test Cloud Technologies for Digital Preservation

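The Library of Congress National Digital Information Infrastructure and Preservation Program and DuraSpace will launch a one-year pilot of DuraCloud to test cloud technologies for digital preservation.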

Here's an excerpt from the press release:

How long is long enough for our collective national digital heritage to be available and accessible? The Library of Congress National Digital Information Infrastructure and Preservation Program (NDIIPP) and DuraSpace have announced that they will launch a one-year pilot program to test the use of cloud technologies to enable perpetual access to digital content. The pilot will focus on a new cloud-based service, DuraCloud, developed and hosted by the DuraSpace organization. Among the NDIIPP partners participating in the DuraCloud pilot program are the New York Public Library and the Biodiversity Heritage Library.

Cloud technologies use remote computers to provide local services through the Internet. Duracloud will let an institution provide data storage and access without having to maintain its own dedicated technical infrastructure.

For NDIIPP partners, it is not enough to preserve digital materials without also having strategies in place to make that content accessible. NDIIPP is concerned with many types of digital content, including geospatial, audiovisual, images and text. The NDIIPP partners will focus on deploying access-oriented services that make it easier to share important cultural, historical and scientific materials with the world. To ensure perpetual access, valuable digital materials must be stored in a durable manner. DuraCloud will provide both storage and access services, including content replication and monitoring services that span multiple cloud-storage providers.

Martha Anderson, director of NDIIPP Program Management said "Broad online public access to significant scientific and cultural collections depends on providing the communities who are responsible for curating these materials with affordable access to preservation services. The NDIIPP DuraCloud pilot project with the DuraSpace organization is an opportunity to demonstrate affordable preservation and access solutions for communities of users who need this kind of help."
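The "content replication and monitoring" idea mentioned in the excerpt can be sketched in a few lines. This is an illustration only, not the DuraCloud API: each storage provider is modeled as a plain Python dictionary, and the `replicate` and `audit` function names are hypothetical.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Return the SHA-256 digest of the content as a hex string."""
    return hashlib.sha256(data).hexdigest()


def replicate(key, data, stores):
    """Write the same content to every store and return its digest."""
    digest = checksum(data)
    for store in stores:
        store[key] = data
    return digest


def audit(key, expected_digest, stores):
    """Return the indexes of stores whose copy is missing or altered."""
    damaged = []
    for index, store in enumerate(stores):
        copy = store.get(key)
        if copy is None or checksum(copy) != expected_digest:
            damaged.append(index)
    return damaged


if __name__ == "__main__":
    providers = [{}, {}, {}]  # stand-ins for three cloud-storage providers
    digest = replicate("map-001.tif", b"example bytes", providers)
    providers[2]["map-001.tif"] = b"corrupted"  # simulate bit rot in one copy
    print(audit("map-001.tif", digest, providers))
```

Keeping the digest separate from the stored copies lets a periodic audit detect a damaged replica and repair it from an intact copy held by another provider.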

Podcast: “Library 2.0 Gang 06/09: Library System Suppliers View of OCLC Web-Scale”

In the "Library 2.0 Gang 06/09: Library System Suppliers View of OCLC Web-Scale" podcast, vendor representatives from Axiell, Ex Libris, and LibLime discuss OCLC's Web-Scale Management Services.

Here's an excerpt from the post:

The initial reactions to hearing the announcement included "why did they take so long" and guarded "uh-ho." There were several aspects of, and reactions to, the announcement in the conversation—from welcoming the initiative, the inevitable move of library functionality to the cloud, questions about the size of library that would use it, the cost model, and of course issues about data and API availability.

Free Cloud Services from Amazon: AWS in Education

Amazon is offering academic community members free cloud services in its AWS in Education program.

Here's an excerpt from the press release:

Amazon.com, Inc. announces AWS in Education, a set of programs that enable the academic community to easily leverage the benefits of Amazon Web Services for teaching and research. With AWS in Education, educators, academic researchers, and students worldwide can obtain free usage credits to tap into the on-demand infrastructure of Amazon Web Services to teach advanced courses, tackle research endeavors and explore new projects. . . AWS in Education also provides self-directed learning resources on cloud computing for students.

Read more about it at "AWS in Education FAQs."

NSF Awards about $5 Million to 14 Universities to Participate in the IBM/Google Cloud Computing University Initiative

The National Science Foundation has awarded about $5 million in grants to 14 universities to participate in the IBM/Google Cloud Computing University Initiative.

Here's an excerpt from the press release:

The initiative will provide the computing infrastructure for leading-edge research projects that could help us better understand our planet, our bodies, and pursue the limits of the World Wide Web.

In 2007, IBM and Google announced a joint university initiative to help computer science students gain the skills they need to build cloud applications. Now, NSF is using the same infrastructure and open source methods to award CLuE grants to universities around the United States. Through this program, universities will use software and services running on an IBM/Google cloud to explore innovative research ideas in data-intensive computing. These projects cover a range of activities that could lead not only to advances in computing research, but also to significant contributions in science and engineering more broadly.

NSF awarded Cluster Exploratory (CLuE) program grants to Carnegie-Mellon University, Florida International University, the Massachusetts Institute of Technology, Purdue University, University of California-Irvine, University of California-San Diego, University of California-Santa Barbara, University of Maryland, University of Massachusetts, University of Virginia, University of Washington, University of Wisconsin, University of Utah and Yale University.

Sun Microsystems Releases Open APIs for the Sun Open Cloud Platform

Sun Microsystems has released Open APIs for its Open Cloud Platform.

Here's an excerpt from the press release:

Today at its CommunityOne developer event, Sun Microsystems, Inc. . . . showcased the Sun Open Cloud Platform, the company's open cloud computing infrastructure, powered by industry-leading software technologies from Sun, including Java, MySQL, OpenSolaris and Open Storage. Signaling a massive opportunity to open the world's nascent cloud market, Sun also outlined that a core element of its strategy is to offer public clouds and previewed plans to launch the Sun Cloud, its first public cloud service targeted at developers, student and startups. . . .

As part of the company's commitment to building communities, Sun also announced the release of a core set of Open APIs, unveiled broad partner support for its cloud platform and demonstrated innovative features of the Sun Cloud. Sun is opening its cloud APIs for public review and comment, so that others building public and private clouds can easily design them for compatibility with the Sun Cloud. Sun's Cloud API specifications are published under the Creative Commons license, which essentially allows anyone to use them in any way. Developers will be able to deploy applications to the Sun Cloud immediately, by leveraging pre-packaged VMIs (virtual machine images) of Sun's open source software, eliminating the need to download, install and configure infrastructure software. To participate in the discussion and development of Sun's Cloud APIs, go to sun.com/cloud.

In related news, according to the Wall Street Journal, IBM is negotiating to acquire Sun Microsystems.

Cloud Computing: DuraSpace Report to Mellon Foundation

The Andrew W. Mellon Foundation has released a progress report from the DuraSpace project, a joint project of the DSpace Foundation and the Fedora Commons. (Thanks to RepositoryMan.)

Here's an excerpt from "DSpace Foundation and Fedora Commons Receive Grant from the Mellon Foundation for DuraSpace" that describes the project:

Over the next six months funding from the planning grant will allow the organizations to jointly specify and design "DuraSpace," a new web-based service that will allow institutions to easily distribute content to multiple storage providers, both "cloud-based" and institution-based. The idea behind DuraSpace is to provide a trusted, value-added service layer to augment the capabilities of generic storage providers by making stored digital content more durable, manageable, accessible and sharable.

“What Cloud Computing Really Means”

Eric Knorr and Galen Gruman provide a concise overview of "cloud computing" in "What Cloud Computing Really Means."

Here's an excerpt:

Cloud computing comes into focus only when you think about what IT always needs: a way to increase capacity or add capabilities on the fly without investing in new infrastructure, training new personnel, or licensing new software. Cloud computing encompasses any subscription-based or pay-per-use service that, in real time over the Internet, extends IT's existing capabilities.

Leslie Carr on Repositories and Cloud Computing

In "The Cloud, the Researcher and the Repository," Leslie Carr discusses repositories and cloud computing, especially the problem of large file deposit.

Here's an excerpt:

The solution that Tim [Brody] has come up with is to allow the researcher's desktop environment to directly use EPrints as a file system—you can 'mount' the repository as a network drive on your Windows/Mac/Linux desktop using services like WebDAV or FTP. As far as the user is concerned, they can just drag and drop a whole bunch of files from their documents folders, home directories or DVD-ROMs onto the repository disk, and EPrints will automatically deposit them into a new entry or entries. Of course, you can also do the reverse—copy documents from the repository back onto your desktop, open them directly in applications, or attach them to an email.
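Behind the drag-and-drop experience, a WebDAV upload is an HTTP PUT of the file to a path on the server. The sketch below only builds such a request and does not send it. The endpoint URL and the `build_deposit_request` helper are hypothetical and are not taken from EPrints documentation; a real deposit would also need authentication.

```python
import urllib.request
from urllib.parse import quote


def build_deposit_request(base_url: str, filename: str, content: bytes) -> urllib.request.Request:
    """Build (but do not send) an HTTP PUT that uploads one file over WebDAV."""
    # Percent-encode the file name so spaces and other characters are URL-safe.
    url = base_url.rstrip("/") + "/" + quote(filename)
    return urllib.request.Request(
        url,
        data=content,
        method="PUT",
        headers={"Content-Type": "application/octet-stream"},
    )


if __name__ == "__main__":
    # Hypothetical WebDAV mount point for a repository.
    request = build_deposit_request(
        "https://repository.example.org/dav/", "my paper.pdf", b"%PDF-1.4 ..."
    )
    print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would perform the upload; that step is left out here because the endpoint is illustrative.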

RAD Lab: Cloud Computing Made Easy

The RAD Lab (Reliable Adaptive Distributed Systems Laboratory) is working to "enable one person to invent and run the next revolutionary IT service, operationally expressing a new business idea as a multi-million-user service over the course of a long weekend."

Read more about it at "RAD Lab Technical Vision" and "Trying to Figure Out How to Put a Google in Every Data Center."