


Web Archive Datasets

The Library of Congress makes web archive datasets available for researchers to explore, visualize, and reuse. Datasets are released either as experimental data packages or as permanent collection datasets. Both are described below.


Data Packages with LC Labs

Data packages are datasets packaged together with robust documentation and how-to information. All Library of Congress data packages are available at https://data.labs.loc.gov/packages/ as LC Labs experiments.


United States Elections, Web Archives Data Package


Selected Dot Gov Media Types, Web Archives Data Package


Permanent Collection Datasets

Datasets that have been added to the Library of Congress permanent collections are listed in the Selected Datasets collection. Examples include the Dinosaur comics, Meme generator, and Giphy datasets, which contain CSV metadata for images from the Web Cultures Web Archive. The Dinosaur comics dataset also includes the original images.
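
For researchers planning to work with the CSV metadata, a first pass might look something like the minimal Python sketch below. It is illustrative only: the sample rows and the column names (url, mime_type, capture_date) are hypothetical stand-ins, not the actual schema of any Library of Congress dataset, so check the documentation that accompanies each dataset before adapting it.

```python
import csv
import io
from collections import Counter

# Hypothetical sample standing in for a downloaded metadata CSV.
# The column names are assumptions, not the datasets' actual schema.
SAMPLE_CSV = """url,mime_type,capture_date
https://example.com/a.gif,image/gif,2019-01-05
https://example.com/b.png,image/png,2019-02-11
https://example.com/c.gif,image/gif,2019-03-20
"""


def summarize_metadata(csv_file, column):
    """Return (column names, row count, value counts for one column)."""
    reader = csv.DictReader(csv_file)
    counts = Counter()
    rows = 0
    for row in reader:
        rows += 1
        counts[row[column]] += 1
    return reader.fieldnames, rows, counts


# Summarize the in-memory sample; a real file opened with
# open(path, newline="", encoding="utf-8") could be passed instead.
fields, total, by_type = summarize_metadata(io.StringIO(SAMPLE_CSV), "mime_type")
print(fields)
print(total)
print(by_type.most_common())
```

Reading row by row with csv.DictReader keeps memory use low for large metadata files, since only the running counts are held in memory.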



Connect with us!

The Library encourages interested parties to contact us to learn more, and to help us understand what other derivative or summary data would be of interest.

Contact Us

Comments, questions, and suggestions related to Web Archiving and this website can be sent to us online.

Location

Web Archiving Program
Library of Congress
101 Independence Avenue, SE
Washington DC 20540-1310