

Time & Bits Conference Background Paper
by Peter Lyman & Howard Besser

I. Problem Statement
II. The Preservation of Digital Signals: Strategic Factors.
III. The Uses of the Preservation Metaphor.
IV. The Impermanence of Digital Memory.
V. A Review of the Current State of Preservation Planning, Organization, Funding and Technology.

I. Problem Statement

Digital information saturates every aspect of everyday life: from international financial markets to the grocery checkout counter; from children's mental picture of dinosaurs (derived from computer-generated movie images) to Voyager's computer generated images of the solar system; from databases mapping location of toxic dumps to those tracking our social security and retirement funds. What if the digital information with which we manage our lives and institutions were to disappear? What kind of picture of life in the late twentieth century could the historians of the future uncover, if the only lasting record of our time were printed on paper?

In fact, our digital cultural heritage is disappearing. Bits are electronic signals, which tend to disappear unless they are recorded (unlike atoms, which tend to persist). And even recorded digital records tend to become inaccessible, made obsolete by the rapid pace of innovation in information technology, which counts new technical generations twice a decade. Consider the following:

  • Most computers have electronic clocks which will not allow them, or the data, images and information they contain, to operate past the year 1999. Were this to happen, the 21st Century, predicted to be the age of information, may begin with the loss of most of its digital memory, and with it the recent history of science and technology, the information records of government and corporations, popular culture and the digital arts.
  • Research at the Internet Archive suggests that half of the World Wide Web disappears every month, even as the Web itself doubles every year.
  • Into the Future: On the Preservation of Knowledge in the Electronic Age, a film on the preservation of our digital cultures, documents the disappearance of digital records on the location of toxic waste dumps in New York State, and of Voyager's digital records and other NASA data.
  • Electronic mail and word processing constitute the informal life of nearly every organization, but often disappear when a hard disk crashes.

How, then, can we create a historical record of our digital cultural heritage, like that created by libraries and archives for print, and by museums for artifacts? Or are digital signals destined to be a kind of oral culture, living only as long as they are remembered and repeated? What can be done to preserve our digital cultural heritage?

Key Document:

Into the Future: On the Preservation of Knowledge in the Electronic Age. A film by Terry Sanders. Copyright 1997 Commission on Preservation and Access. Available from The American Film Foundation, PO Box 2000, Santa Monica, CA 90406 (310)459-2116

II. The Preservation of Digital Signals: Strategic Factors.

In the development of every new medium for communication there is a moment of realization that the origins of a new culture are being lost, and must be saved. This was true of paper, of film and photography, of the early days of radio and television, and is now true of digital works.

Early television broadcasts were live and not recorded; when they were recorded on videotape, the tape was often re-used soon thereafter; when the videotape was archived, it was discovered that tape itself decays, and must continuously be re-recorded onto new physical media to refresh its content. The costs of preservation of the history of television are only now becoming acceptable, as secondary markets develop for its intellectual property.

But for each new medium, the preservation solution must be unique, reflecting the nature, use and economic value of the message; the technical characteristics of its medium; its social and organizational contexts, and the intellectual property regime that governs the markets shaping its creation, consumption and use.

1. The life cycle of information.

How do different genres of digital documents differ in design, use and value in ways that affect their survival? How do digital documents create patterns of use at different points in their life cycle, thereby creating an interest and incentive to preserve them?

2. Technical profile.

What is the inherent persistence of digital information as a technical artifact? What design and manufacturing tradeoffs affect its survival over time?

3. Economic and cultural factors.

What are the incentives to invest in the persistence of digital documents? What cultural factors shape these incentives? How might new visions and concepts change this behavior?

4. Organizational contexts.

How do the social and organizational contexts within which digital documents are used create priorities for selecting material to be preserved? What kinds of social organizations and professions will be responsible for the different genres of digital documents? Will institutions originally established to house print documents be robust enough to act as sites for digital archives?

5. Legal.

What intellectual property regimes will govern digital documents? How do different copyright, patent and contract regimes create incentives and disincentives for their preservation?

In sum, the long-term preservation of information in digital form requires not only technical solutions and new organizational strategies, but also the building of a new culture which enables the persistence of bits over time. This is a problem which requires that a diverse community of experts --- computer scientists, archivists, social scientists, artists, lawyers and politicians --- engage in a dialogue about the preservation of a new kind of cultural heritage, the digital document.

Key Document:

Documents in the Digital Culture: Shaping the Future. The Hawaii International Conference on System Sciences. January 1995.

III. The Uses of the Preservation Metaphor

A dialogue about time and bits, the preservation of our digital cultural heritage, might best begin by exploring the wealth of practical expertise developed in the preservation and conservation of the traditional media for our cultural heritage --- paper and artifacts. Although bits and atoms have very different technical profiles and social contexts, the expertise gained in the management of the acid paper crisis and in conservation of works of art is the best guide available with which to explore strategies for preserving digital documents. Moreover, the expertise of the computer science community will be of great benefit to the preservation and conservation communities in their continuing work with paper and artifacts. A preservation strategy, then, has this profile:

1. Technical.

The environmental factors within which acid-free paper will persist are now well defined. Microfilm is a standard mode of preservation of printed records, with a projected 500-year life within standard conditions. Given the persistence of paper documents, much of the cost of preservation is focused on storage techniques, e.g., catalogs to organize and retrieve documents and space with requisite HVAC. Acidic paper, however, is unstable; thus new technologies have been mobilized to save its content, ranging from microfilm, to digitization, to de-acidification.

2. Economic and Cultural.

Economic incentives exist as long as printed information is used. Tacitly the longevity of printed documents is based upon redundancy as well as persistence. The issue of economic value affects the context within which the preservation of information is managed. For example, as long as a printed document has economic value, it tends to be managed by market forces; when it does not, it tends to be managed by nonprofit organizations (e.g., libraries and archives). Emerging markets for new media have recently created economic incentives to reclaim intellectual property rights for information which has been managed by the public sector.

3. Organizational and social context.

Libraries and archives have the mission of selecting, organizing and preserving knowledge printed upon paper. For example, the Library of Congress has a funded mission as an archive of all copyrighted material from the United States, and the National Archives and Presidential Libraries for government documents. Most published paper documents are widely distributed, thus responsibility for preservation is diffuse, and it is expensive and inefficient for individual archives and libraries to develop local preservation techniques. To avoid redundant preservation efforts and assure best practices, coordination has been provided by traditional institutions taking new roles, and the creation of new kinds of organizations. For example, the National Endowment for the Humanities has provided central funding for microfilm preservation, and the Commission on Preservation and Access was founded to provide intellectual leadership by commissioning and publishing educational and training materials.

4. The life cycle of information use.

Paper creates value by virtue of its low cost, portability and redundancy, and persistence. In practice the quality and redundancy of paper reflects the estimated value of its printed content, whether economic or political. More precisely, it often reflects the predicted economic value of its content over time; thus there will be many copies of a newspaper printed on acidic paper with a very short life, and few copies of an important scholarly book printed on very good paper. This kind of triage means that the longevity of popular culture is most at risk, even though it may be heavily used; for example, research libraries buy paper newspapers to use, but archive only the microfilm version for preservation purposes. On the other hand, important government documents and scholarly publications from underdeveloped countries may be printed on acidic paper or not be archived in low humidity, low temperature environments, and may survive only in Western research libraries, if they survive at all. For these reasons, the preservation of printed documents is shaped by their potential value for scholarship, as defined by the collection policies of major libraries and archives, and as funded by the government, foundations, local communities and by higher education.

Finally, as printed documents lose their market value, the methodology for preserving them may shift to "access," that is, minimizing the potential costs involved in finding and using a document. The access rationale often leads to the digitizing of printed documents, both to preserve the original from the wear and tear of use, and to make it possible to use a document without travel. An important initiative to test this link between preservation and access is the Journal Storage Project (JSTOR), a non-profit corporation created by the Mellon Foundation to build a national digital archive of back issues of scholarly journals (see http://www.jstor.org). JSTOR's innovation is not so much the technology involved as the economic model: will individual institutions agree to fund a national digital archive dedicated to preservation and access of rarely used but potentially important journals, in place of local copies of the same journals which require recurring space and maintenance costs?

5. Is there a Legal Right to Preserve?

Copyright establishes a time-delimited right to distribute the original expressions of an idea, balanced by some public rights such as Fair Use. Copyright laws now also permit copying for the purpose of preservation of printed records, although in practical terms the right to preservation copying is clearest when potential historical interest outweighs economic value. Given the creation of secondary markets for intellectual property in new media, however, there is considerable political pressure to extend the length of copyright, and a growing practice of protecting intellectual property through contracts (e.g., licensing instead of selling, and particularly shrink-wrap licenses) which do not grant preservation rights, or which may reduce the right to use materials which have been preserved.

Paper made possible the first information society, the print revolution, but with the expansion of the use of paper records in the 19th Century, mass production techniques created acidic paper which oxidized and quickly turned brittle. The film Slow Fires by the Commission on Preservation and Access defined public consciousness of this crisis, much as "the year 2000" crisis is shaping public consciousness of the fragility of digital documents. The case of acid paper began with the recognition that an economic incentive to lower the price of paper led to a technical decision that had unanticipated long term consequences for the persistence of the written record. This, in turn, has produced a number of kinds of preservation strategies which might serve as instructive models for digital documents:

  • Redesigning technology. The brittle paper problem has caused paper manufacturers to redesign the technology for paper production;
  • Consciousness raising. Understanding the tradeoff between costs and durability of the paper medium enables publishers to make better decisions about the long term cost and value of information through its entire life cycle;
  • Managing for preservation. Librarians now monitor storage conditions affecting the rate of decay of paper (e.g., humidity and heat), regulate the use of brittle books, and are trained to intervene to repair and conserve them;
  • New technologies. Although they are still expensive, new techniques for de-acidification have been developed to preserve vulnerable documents deemed to be high priority for preservation;
  • Changing media. The information content of vulnerable paper documents can be saved by reformatting them onto microfilm or digital media.

In sum, the strength of the strategies that have been developed to preserve paper documents is that paper is relatively well integrated into organizational frameworks of various kinds, and managed by professional librarians and archivists who are aware of the problem and trained to manage it; and, given the persistence of the medium, the fundamental strategy is educational --- documenting the conditions and practices that reinforce the essential stability of the medium. The weakness of the preservation strategy is that the problem often becomes apparent only at the end of the information life cycle, when there is relatively little economic incentive or organized social interest to solve it. "Preservation," then, is not a systemic solution and for this reason it is unlikely to attract new technology. While all of the strategies for preservation of paper are useful models for enhancing the persistence of digital documents, given the pace of technical innovation it is still possible to redesign the medium itself to enable the persistence of bits over time, if the technology community becomes aware that its work constitutes a cultural heritage which is disappearing.

 

Key Documents:

Slow Fires: On the preservation of the human record. A film on the problem of preservation of the printed record. The Commission on Preservation and Access.

Technical and educational publications of The Commission on Preservation and Access are listed at their Publications Web Site ( http://clir.stanford.edu/cpa/ ).

IV. The Impermanence of Digital Memory.

And yet, however instructive the lessons of paper preservation may be, bits and atoms are very different information media, which will require very different preservation strategies. Digital documents are not artifacts, they are signals --- signals that must be continuously refreshed or they disappear. The default condition of paper is persistence, if not interrupted; the default condition of electronic signals is interruption, if not periodically renewed. Thus, the preservation method of storing a work under safe conditions is not at all effective in helping digital works to persist. The characteristic persistence of digital documents is suggested by the term "memory," a term which simultaneously manages to suggest both the persistence of memory and the possibility of forgetting. To summarize the profile of digital documents:

1. Technical.

As signals, digital documents depend for their persistence upon the nature of the artifacts that they may inhabit at any given point in time. In active memory (RAM) a document is dependent upon the presence of electricity. Stored on a ferromagnetic medium (disk or tape) or optical medium (CD or other optical disk), it may well outlast the operating system which is required to access it, but in any case neither medium is rated to last as long as fifty years even in ideal conditions. While new physical media (such as glass substrate optical disks) may be engineered to last longer, a more critical problem is that the file formats in which digital documents are encoded will not be readable by future systems unless proactive measures are taken to make them persist. Suggestions to address this include periodic migration from system to system as quickly as the process of innovation demands, or periodic creation of emulation software to make these documents accessible on future systems.

2. Economic.

Both the form and content of knowledge in digital forms are still being continuously reinvented through a process of technical innovation, thus digital documents present a continuous problem of obsolescence, not only in their form but in their content as well. This obsolescence is both technical and economic, in that there must be continuous and active technical investment to preserve digital documents.

As with paper, there is a marked distinction in the nature of persistence among different kinds of digital documents.

  • Digitized documents were once paper, but have been preserved in digital form in order to provide a new kind of access to works unlikely to be preserved in paper formats for economic reasons.
  • Digital documents, now sometimes called "born digital," are likely to be refreshed as long as they are of economic value to the organization that created them, but few organizations have created or support digital archives for information that is no longer used.

However, many of the new genres of digital documents have new kinds of economic value: because they are accessible, like the Web; or because they are the fastest medium for distribution of information, like electronic journals; or because they are a kind of performing art or entertainment, like games, MOOs and MUDs. Like other forms of popular culture, digital cultures may be thought of as transitory until it is too late to save them. In this sense, digital documents may often occupy a cultural niche as part of electronic culture, like radio and television, which is enjoyed but not treated as part of our cultural heritage until it is too late.

3. Organizational and Social Contexts.

Organizations dependent upon digital records, such as corporations and governments, have developed straightforward risk management strategies for "refreshing" digital documents, focused upon minimizing the vulnerability of the media upon which they are stored, and upon moving data from one system to another. (Corporate data processing centers, university computing centers, and other mainframe-based organizations understand issues of backup and refreshment of data, but don't necessarily understand the implications of real migration, particularly for more complex digital documents, or digital information within software environments that are still evolving rapidly and thus creating subtle kinds of obsolescence.)

Unlike print, which has had 500 years to create institutional contexts, digital documents are still in the early stages of innovation. Typically, in the first phase of innovation a new technology will imitate an older one; thus digitized documents are created from printed originals as a means to create new value by expanded access. But new social and organizational contexts made possible by digital documents are still emerging, only suggested by terms like "virtual community" and "distance education." In this sense, the organizational contexts which are responsible for digital documents remain to be defined and founded.

There are serious questions as to who will take responsibility for making digital information persist over time. In the world of printed publications, owners of intellectual property rights generally do not take on this role, but cede it to libraries. In the digital world, legal, technical, and economic factors may preclude libraries from doing this. Even if the legal and technical issues are overcome for libraries (or if content owners decide to manage persistence over time themselves), we may need to develop social and legal mechanisms for "rescuing" digital information when the organization in charge of making it persist no longer has the resources or will to do so. We also need to deal with the issue of how many copies of anything digital we need to save; published material persists in part because of redundant saving of the items.

4. Use and Value.

If paper creates value by virtue of its portability and persistence, digital documents may also create value by virtue of universal or global accessibility and malleability. Given that this malleability means that information may be customized, and will evolve as it is read and added to by readers, it is not clear whether there will ever be authoritative versions of digital documents that might be selected into a definitive collection. The form of a print document is fixed by the author and publisher, its facticity making it authoritative, but the reader gives form (and often adds content) to digital documents which may draw from links to information on many other servers, the content of which may well be evolving in real time. Thus, for example, software may give custom form to information that is derived from a continuously changing database; thus a given representation of a database may be time stamped, to signify the moment at which this representation was frozen from a dynamic source. Given this malleability, what should be preserved? Is the word "document" or "collection" even applicable to digital information?

5. The Legal Right to Preserve.

In the debates about the forms that intellectual property might take in dealing with digital documents, such as the Commerce Department's Report of the Working Group on Intellectual Property Rights (September 1995) and subsequent proposed legislation, there is tension between two principles. On the one hand, it is often granted that preservation is a legitimate public good which should have special status in the law; on the other, the concept of public good is being replaced by the commodification of digital knowledge in order to create the electronic marketplace. In the print world, the space between is occupied by the doctrine of Fair Use, which exempts educational and preservationist activities from copyright. In the digital world there is as yet no Fair Use; indeed, the White Paper rejects the concept as an unfair subsidy of education by publishers. Thus, for example, the Preserving Digital Information paper on digital preservation commissioned by the Research Libraries Group and the Commission on Preservation and Access suggests that "certified archives" may have to be more like time capsules than archives or libraries, that is, removed from public access until their contents have come into the public domain with the end of copyright, whenever that might be. The obvious question is, given this withdrawal from use, what incentive would exist to fund such an archive? Further, it is increasingly likely that digital documents will not be governed by an intellectual property regime at all, but will be protected either by technological means ("cryptolopes," digital boxes) which would make them even more difficult to preserve, or by licensing mechanisms under the Uniform Commercial Code section IIB (e.g., "shrinkwrap" licenses).
One dimension of the creation of methodologies for the preservation of digital documents, then, must be participation in the ongoing legislation, litigation, and treaty negotiations which are shaping the intellectual property regime within which digital documents will be managed and preserved.

Key Documents:

Brewster Kahle, "Preserving the Internet." Scientific American, March 1997.

Michael Lesk. Preserving Digital Objects: Recurrent Needs and Challenges.

Ejan Mackaay. "The Economics of Emergent Property Rights on the Internet." The Future of Copyright in a Digital Environment. (P. Bernt Hugenholtz ed., 1996).

Preserving Digital Information: The Report of the Task Force on Archiving of Digital Information. Commissioned by The Commission on Preservation and Access and The Research Libraries Group, Inc.

V. A Review of the Current State of Preservation Planning, Organization, Funding and Technology.

The short history of the personal computer makes it clear that information recorded in digital form will not continue to be accessible through rapidly changing software and hardware formats unless users take proactive steps to make it accessible. Around 1980, most digital files were stored on 8-inch floppy disks, yet today very few drives that can read them exist anywhere in the world. As recently as a dozen years ago (when there was already a huge installed base of personal computers) most personal computer data was stored on 5.25-inch floppies, yet it is becoming increasingly difficult to find disk drives that will read this type of diskette.

Some people have advocated "refreshing" -- copying files from a storage medium that is becoming archaic onto a newer, more commonplace medium. But this addresses only the physical strata aspect of the problem. A key problem it fails to address is that of file formats. Returning to our example from a dozen years ago, the file that we might have transferred from 5.25-inch to 3.5-inch diskettes would probably have been a WordStar or VisiCalc file. Such files would be completely unreadable by today's word processors and spreadsheets (which often can't even read files created by older versions of the same).

The Preserving Digital Information task force suggested that physical strata and file format problems might be handled through either emulation strategies or migration strategies which continually move data into new formats that work with contemporary applications. Such strategies may be more effective if coupled with attempts to periodically "exercise" digital documents; using them is one method to make sure that they will continue to be usable. But these strategies become more problematic as we move to information environments which are not fully contained within a limited space. These strategies can work well with models based upon single files or discrete groups of files, but become more difficult to follow when the boundaries of an information environment are unclear or very large. Today we see this most frequently when people wish to back up their website or move aspects of their website onto a non-networked machine (for example, for a public demonstration or to work on a laptop). What does one choose to move? Only a particular hierarchy of one's own files? Or some of the files that are linked to? And if so, how deeply does one follow the links? (Some current web backup applications allow the user to specify how many levels of links to back up beyond a particular hierarchy.) And, as many of the linked files are periodically revised, how can we best save a dynamically changing environment? As information environments become larger and more interconnected, can we develop strategies to save or migrate these (other than the Internet Archive's approach to save a snapshot of nearly everything in the public domain of the Web)?
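
The question of how far to follow links can be made concrete with a small sketch. The Python example below is illustrative only: the link graph, the page names and the function select_for_backup are hypothetical, and it does not describe any particular web backup application. It assumes the links between pages are already known, performs a breadth-first traversal from a starting page, and keeps every page reachable within a chosen number of link levels.

```python
from collections import deque

def select_for_backup(links, start, max_depth):
    """Return the set of pages reachable from `start` by following
    at most `max_depth` links (breadth-first traversal)."""
    selected = {start}
    queue = deque([(start, 0)])
    while queue:
        page, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not follow links beyond the chosen depth
        for target in links.get(page, []):
            if target not in selected:
                selected.add(target)
                queue.append((target, depth + 1))
    return selected

# Hypothetical site: keys are pages, values are the pages they link to.
example_links = {
    "index.html": ["papers.html", "http://other.example/a.html"],
    "papers.html": ["paper1.html"],
    "http://other.example/a.html": ["http://other.example/b.html"],
}

# Depth 1 keeps the start page and the pages it links to directly.
print(sorted(select_for_backup(example_links, "index.html", 1)))
```

Even in this toy case the depth setting decides whether material on another server is captured at all, and the sketch says nothing about linked pages that change after they are copied.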

In addition to the problem of saving a given work within its more general context, there is also the problem of saving the various behaviors of a work, and making sure that these persist over time. For instance, a "digital book" might incorporate behaviors that let the user jump from a table of contents to a given chapter, or from a section of text to the accompanying footnote. We need to find ways to make not only the text persist over time and into new display environments, but we should find ways to make the appropriate behaviors persist as well (or, at a very minimum, to encapsulate with the document notes indicating what those behaviors were). This whole set of problems becomes even more difficult (if not untenable) with attempts to preserve fully interactive programs/environments (such as shared virtual worlds).

Another key set of issues revolves around longevity problems that may be provoked by attempts to fix more immediate problems like those involving delivery, storage, and commerce. In order to solve immediate problems, much work in recent years has focused on altering file formats. Compression schemes have been developed to ease storage and delivery problems. Encryption schemes and container architectures have been developed to enhance digital commerce. Many of these file alterations use proprietary schemes and do not adhere to widely accepted standards. These alterations add an extra level of complexity to viewing a file, one that may not work well in future information environments that look radically different from those of today. While some of us may have faith that 50 years from now we'll have the tools to display an ordinary Microsoft Word file from today, the ability to display an encrypted Word file (or one wrapped in a Digibox or a Cryptolope) is likely to be quite problematic.

Metadata enabling a digital object to describe itself is not yet adequate to address some of these problems (such as those revolving around file formats and compression). Embedded metadata (such as uncompressed clearly-readable ASCII within a header) will allow future users and applications to know what type of digital object this is, what file format, what will be needed to emulate a viewer for this file, etc. More extensive metadata can carry information to allow for better display (such as color metrics or palette), to note links to other digital objects, etc.
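
As a rough illustration of the embedded-metadata idea, the Python sketch below prefixes a digital object with an uncompressed ASCII header that a future reader could inspect with the simplest of tools. The header layout ("key: value" lines ended by a blank line), the field names and the two functions are assumptions made for this example; they do not correspond to any existing standard.

```python
def wrap_with_header(payload: bytes, metadata: dict) -> bytes:
    """Prefix `payload` with a plain-ASCII header of 'key: value' lines,
    terminated by one blank line."""
    lines = [f"{key}: {value}" for key, value in metadata.items()]
    header = "\n".join(lines) + "\n\n"
    return header.encode("ascii") + payload

def read_header(blob: bytes):
    """Split a wrapped object back into (metadata, payload)."""
    # The first blank line marks the end of the header.
    header, _, payload = blob.partition(b"\n\n")
    metadata = {}
    for line in header.decode("ascii").splitlines():
        key, _, value = line.partition(": ")
        metadata[key] = value
    return metadata, payload

# Hypothetical field names describing the object to a future user.
blob = wrap_with_header(
    b"\x00\x01binary data",
    {"object-type": "text document", "file-format": "WordStar", "compression": "none"},
)
metadata, payload = read_header(blob)
print(metadata["file-format"])
```

The point of the design is that the header stays readable even if the payload's own format has been forgotten; what the header should contain is exactly the open question raised above.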

A number of different approaches are now emerging to deal with some or all of these problems, and most of these do not exclude the other approaches. Since important experiments and prototypes are emerging from sectors that often do not easily exchange information with each other --- hardware manufacturers, software designers and vendors, academic computer scientists, archivists and librarians, and publishers --- information is still difficult to obtain.

Save everything -- The Internet Archive follows the approach of trying to save most of what is on the public Web in time slices governed by how long it takes their web-crawlers to copy everything they are seeking (currently approximately two full slices per month). Internet Archive, a non-profit organization, uses a data archive developed by Alexa Internet, which generates income from access to "lost" files and data-mining. Long-term questions involve how well this will scale as the Internet continues its explosive growth; strategies for making sure these extensive backups will persist over time; evolving copyright regimes and liabilities; and whether decisions that have been made about what not to save (e.g., because of the potential liability of archiving material such as pornography under legislation like the Communications Decency Act, or because sites are protected or inaccessible) may have long-term historical implications.

Refreshing strategies -- The practice of moving bits to a new physical storage medium is well understood by those who maintain backup files of large quantities of data. Most large organizations with mainframe computers understand the technical and economic issues involved in periodic movement of data to new physical strata, and in storage of those strata over time. But this strategy does not take into account the problem that the file formats which are continuously refreshed may not be readable by future applications (the "WordStar problem").
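
A minimal sketch of a single refresh step, written in Python, may help to show what refreshing does and does not guarantee. The function names and file names are hypothetical, and two files in a temporary directory stand in for the old and new storage media. The copy is verified by comparing SHA-256 checksums, so the sketch confirms only that the bits survived the move; it says nothing about whether any future application can interpret them.

```python
import hashlib
import os
import shutil
import tempfile

def sha256_of(path):
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()

def refresh(source_path, target_path):
    """Copy a file to a new location and verify the copy bit for bit."""
    before = sha256_of(source_path)
    shutil.copyfile(source_path, target_path)
    after = sha256_of(target_path)
    if before != after:
        raise IOError("copy does not match the original")
    return after

# Two files in a temporary directory stand in for old and new media.
with tempfile.TemporaryDirectory() as workdir:
    old_medium = os.path.join(workdir, "old_medium.dat")
    new_medium = os.path.join(workdir, "new_medium.dat")
    with open(old_medium, "wb") as handle:
        handle.write(b"bits to be refreshed")
    checksum = refresh(old_medium, new_medium)
    with open(new_medium, "rb") as handle:
        refreshed_bits = handle.read()
```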

Migration strategies -- We know very little about the process of refreshing while at the same time moving to new, more accessible file formats. We know even less about what it takes to create emulation software. We need to begin to understand the technical issues and costs involved in following these kinds of strategies.

"Exercising" strategies -- The idea that continual community use of a digital object will make it persist over time is a very powerful one, but it is something that we don't yet understand very well. We can imagine that members of the public will continuously convert popular works into forms viewable in newer environments. But we know little about how this will affect any guarantee of the integrity of these works, or about how this strategy will affect less popular works. We need further work on the likely impacts of these strategies.

Life-cycle Management -- Innovators within the archives and records management community have begun stressing "life-cycle management" -- information professionals becoming involved with documents as early as possible in a document's life in order to best ensure that the document will persist over time. This paradigm (which differs from the late-stage interventionist model that has characterized much previous preservation work) has been applied particularly to collections of electronic records. (See the Society of American Archivists, Australian Archives, and Bearman/Trant documents below.)

Attempts to Articulate the Problems and Establish a Research Agenda to Solve them -- The library preservation community made its first formal attempt to grasp this large problem with the publication of "Preserving Digital Information: The Report of the Task Force on Archiving of Digital Information". The Association of Research Libraries, the Council on Library and Information Resources, and the Coalition for Networked Information have proposed a series of follow-up workshops to better articulate the problems of long-term preservation of digital information, and to produce an agenda for action and research.

Until a viable holistic approach emerges, what should we do to assure that our works will persist as long as possible (or as needed) into the future? In the long run, both print and digital preservation strategies are best sustained by the definition of technical standards; but standards evolve from best practices, which are currently evolving in the following areas.

* Save in the most common file formats. The more files that exist in a given format, the more likely that file converters or emulators will be written for that format (because of economies of scale). Conversely, files stored in less common formats will face more obstacles being viewed in future environments.

* Avoid compression where possible. Future viewing environments will not necessarily know how to decompress current compression formats. When compression is necessary, use the most common compression formats (see note on file formats above). Whenever possible avoid lossy compression, as we do not yet know the full implications of that loss on future viewing environments.

* For image capture, use color bars and scales. Color bars within an image can be effective in rebalancing color for future viewing environments. Scales can provide users with an idea of object size, and could potentially be used to develop online measurement tools.

* Keep a log of processes and changes to the digital object. Many seemingly innocuous things we do to a digital object may have significance for future viewing environments. (For example, the color palette of a scanner or attempts to match screen color to the color of an original object may severely affect the use of color management systems in future viewing environments.) Keeping a record of processes and changes done to an object (including ICC color profiles or the makes and models of scanners and other tools) should help future environments adjust for artifacts we add today.

* The Rosetta Stone strategy: Save as much metadata as possible. When we create or alter a digital object, we usually have much greater access to information about that object than at any other point in its life-cycle. Because we know so little about future viewing environments, we don't know which of the seemingly innocuous bits of metadata may later prove important to those environments. The more information we can save, the more likely we will be able to provide future generations with a "key" for unlocking the contents of whole classes of lost data.
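The advice above about logging processes and saving metadata could be put into practice with something as simple as an append-only log kept alongside each digital object. The sketch below is one possible shape for such a log, not an established standard; the field names, the function names `log_event` and `dump_log`, and the tool descriptions in the usage note are assumptions made for illustration.

```python
import json
from datetime import datetime, timezone


def log_event(log: list[dict], action: str, tool: str, **details: str) -> dict:
    """Append one record of something that was done to a digital object."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "action": action,    # e.g. "scan", "crop", "color-correct"
        "tool": tool,        # make and model, or software name and version
        "details": details,  # any settings that might matter later
    }
    log.append(entry)
    return entry


def dump_log(log: list[dict]) -> str:
    """Serialize the log as one plain-text JSON record per line."""
    return "\n".join(json.dumps(entry, sort_keys=True) for entry in log)
```

A capture session might then record `log_event(log, "scan", "hypothetical flatbed scanner", resolution_dpi="600", color_profile="sRGB")` followed by later edits, and store the output of `dump_log(log)` as uncompressed text next to the image file.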

Key Documents:

Brewster Kahle, "Preserving the Internet." Scientific American, March 1997.

David Bearman and Jennifer Trant, "Electronic Records Research Working Meeting, May 28-30, 1997: A Report from the Archives Community." D-Lib Magazine, July/August 1997.

Australian Archives, "Managing Electronic Records - A Shared Responsibility" (1997 revision of policy statement).

The National Library of Australia, National Preservation Office. Statement of Principles: Preservation of and Long-Term Access to Australian Digital Objects.

Preserving Digital Information: The Report of the Task Force on Archiving of Digital Information. Commissioned by The Commission on Preservation and Access and The Research Libraries Group, Inc.

Jeff Rothenberg, "Ensuring the Longevity of Digital Documents." Scientific American, January 1995 (24-29). Describes a layered emulation strategy for refreshing digital documents.

Society of American Archivists, "Statement on the Preservation of Digitized Reproductions" (6/97)

 

As in every other dimension of the information revolution, our thinking about the persistence of digital memories tends to be trapped inside the cultural habits of an industrial age. Thus we speak of the twenty-first century using metaphors of the nineteenth century: information "highways," "digital libraries," "electronic publishing." The problem of preservation or the persistence of digital documents, "time and bits," might best be solved by changing the rules, that is, by new visions and metaphors which redefine the way we think about information.

While intellectual property law for digital documents has tended to extend the rules governing print documents, some have argued that the nature of information itself has changed, and that new law must be based upon an information environmentalism which combines the idea of an ecology (the complex interconnections between living things) with welfare economics (markets routinely fail to calculate the costs of unintended consequences). (See James Boyle, "A Politics of Intellectual Property: Environmentalism for the Net?" Law in the Information Society Homepage: http://www.wcl.american.edu/pub/faculty/boyle/intprop.htm) For the problem of the persistence of digital documents, then, an information ecology model might include these kinds of tenets:

* Just as some species become extinct, some digital information may become lost. Yet it is important that a diversity of information persists over time.

* Information technologies should be designed from the very beginning for life-cycle management, using traditional interventionist preservation only in failed areas. Most can be designed using technical standards which enable them to survive for a substantial amount of time, or be recoverable in principle. This implies sustaining not only information, but also inter-relationships between information, and information about standards, together as a system.

* Life-cycle management includes an understanding of the sources of economic support for sustainability: the continuity of digital collections (e.g., the National Archives, Social Security, Federal Court, Patents, Libraries) and the continuity of cultural heritage in other digital formats more generally. Thus there will be many strategies, not one.

This is only one metaphor, but one which illustrates the point that a strategy for the persistence of bits may well begin with a new vision of the nature of information itself, not with the hardware of technique, organizations, markets and law.

Finally, acid paper created brittle books for many decades before it was defined in the public consciousness as a problem, primarily through the work of the Commission on Preservation and Access. How can the problem of the preservation of digital documents be defined in a manner that engages a diverse community of interests in a dialogue about how to solve it?
