Awesome Web Archiving Overview

An Awesome List for getting started with web archiving



Awesome Web Archiving

Web archiving is the process of collecting portions of the World Wide Web to ensure the information is preserved in an archive for future researchers, historians, and the public. Web archivists typically employ Web crawlers for automated capture due to the massive scale of the Web. Because Web standards and technologies keep changing, archiving tools must keep evolving with them to ensure reliable and meaningful capture and replay of archived web pages.

Contents

Training/Documentation

Resources for Web Publishers

These resources can help when working with individuals or organisations who publish on the web, and who want to make sure their site can be archived.

Tools & Software

This list of tools and software is intended to briefly describe some of the most important and widely-used tools related to web archiving. For more details, we recommend you refer to (and contribute to!) these excellent resources from other groups:

Acquisition

Replay

Search & Discovery

Utilities

WARC I/O Libraries

Analysis

Quality Assurance

Curation

Community Resources

Other Awesome Lists

Blogs and Scholarship

Mailing Lists

Slack

Twitter

Web Archiving Service Providers

We intend to list only services that allow web archives to be exported in standard formats (WARC or WACZ). Inclusion here is not an endorsement of these services, and readers should check and evaluate these options based on their needs.
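When evaluating a service, it can help to confirm that an export really is in one of these formats. The following is a minimal sketch using only the Python standard library, not a validator: it assumes that a WARC file begins with a `WARC/` version line (optionally inside a gzip stream) and that a WACZ file is a ZIP package with a `datapackage.json` entry at its root. The function name `detect_archive_format` is hypothetical and chosen for illustration.

```python
import gzip
import io
import zipfile
import zlib


def detect_archive_format(data: bytes) -> str:
    """Guess the format of an exported file from its bytes.

    Returns 'wacz', 'warc.gz', 'warc', or 'unknown'.
    Hypothetical helper: it inspects signatures only and does not
    validate the records inside the archive.
    """
    # Assumption: a WACZ export is a ZIP file with datapackage.json at its root.
    if data[:4] == b"PK\x03\x04":
        try:
            with zipfile.ZipFile(io.BytesIO(data)) as zf:
                if "datapackage.json" in zf.namelist():
                    return "wacz"
        except zipfile.BadZipFile:
            pass
        return "unknown"

    # Compressed WARC files are gzip streams (magic bytes 1f 8b).
    if data[:2] == b"\x1f\x8b":
        try:
            with gzip.GzipFile(fileobj=io.BytesIO(data)) as gz:
                head = gz.read(5)
        except (OSError, EOFError, zlib.error):
            return "unknown"
        return "warc.gz" if head == b"WARC/" else "unknown"

    # Uncompressed WARC records start with a version line such as WARC/1.1.
    if data[:5] == b"WARC/":
        return "warc"

    return "unknown"
```

In practice you would pass in the bytes of a downloaded export (for large files, the first few kilobytes are enough for the WARC checks, while the WACZ check needs the whole file because the ZIP directory sits at the end).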

Self-hostable, Open Source

Hosted, Closed Source