Web Caching

Web Traces and Logs

    Traces of web traffic and logs from web servers, proxies, and clients are very useful to web caching researchers. The following table lists known publicly available traces and logs, with some information about each (in no particular order). I am very interested in obtaining access to additional logs, as are many others in the research community. If you know of more available logs, please let me know.
| Name | Available From | When Collected | Type | Workload Characteristics | Notes |
|---|---|---|---|---|---|
| UC Berkeley Web Proxy Traces | Internet Traffic Archive | November 1996 | web proxy | home modem users | |
| Digital's Web Proxy Traces | Digital Equipment Corp. | Aug-Sep 1996 | web proxy | corporate LAN users | |
| Weekly Access Logs at NLANR's Proxy Caches | IRCache | Current Week | web proxy | upper level proxy | anonymized daily |
| Daily Access Logs from CA*netII's Proxy Caches | CA*netII | November 1999 | web proxy | upper level proxy | consistent IP hiding, so logs may be concatenated |
| Boeing Proxy Logs | W3C WCA Repository | Mar 1-5 1999 | web proxy | corporate firewall | anonymized and consistent across proxies, but not across days |
| Univ. of Pisa: seven-month traces from CS dept. | email: cao@cs.wisc.edu (286MB, too large for public access; any volunteers?) | Sep 1996 - Mar 1997 | ? | CS dept. users | URLs and client IDs anonymized |
| Virginia Tech's Proxy Traces | Virginia Tech | Feb-Oct 1995 | web proxy and client | four sets; 25-61 workstations, 12k-227k requests | |
| Boston University CS Dept Client Traces | Boston University CS Dept and Internet Traffic Archive | Nov 1994 - May 1995 | client | 2 sets; 5-32 workstations, 17k-118k requests | |
| EPA-HTTP: a day of HTTP logs from a busy EPA WWW server | Internet Traffic Archive and Univ. of Wisconsin | Aug 29, 1995 (24 hours) | web server | Internet users | |
| SDSC-HTTP: one day of HTTP logs from a busy San Diego Supercomputer Center web server | Internet Traffic Archive and Univ. of Wisconsin | Aug 22, 1995 | web server | Internet users | |
| Calgary-HTTP: a year of HTTP logs from a University CS department web server | Internet Traffic Archive and Univ. of Wisconsin | Oct 1994 - Oct 1995 | web server | Internet users | |
| ClarkNet-HTTP: two weeks of HTTP logs from a busy ISP | Internet Traffic Archive and Univ. of Wisconsin | Aug-Sep 1995 | web server | Internet users | |
| Saskatchewan-HTTP: seven months of HTTP logs from a University web server | Internet Traffic Archive and Univ. of Wisconsin | Jun-Dec 1995 | web server | Internet users | |
| NASA-HTTP: two months of HTTP logs from a busy NASA web server | Internet Traffic Archive and Univ. of Wisconsin | Jul-Aug 1995 | web server | Internet users | |
| Music Machines-HTTP: two years and two months of HTTP logs from another web server | University of Washington | Sep-Oct 1997 or Feb 1997-Apr 1999 | web server | Internet users | Anonymized; web caching disabled; no sizes reported in initial portions |
| Soccer World Cup 1998 | Internet Traffic Archive and W3C WCA Repository | Apr 30-Jul 26 1998 | web server(s) | Internet users | Anonymized, 1.3 billion requests |
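
As a starting point for working with the web server logs listed above, the following is a minimal Python sketch of a line parser. It assumes the log is in Common Log Format; the table does not state the format of each trace, and some traces use their own layouts, so check each archive's description before relying on it. The function name `parse_clf_line` and the sample line are hypothetical and are not taken from any of the listed traces. A byte count of `-` is mapped to `None`, which is one way to handle logs that do not report sizes.

```python
import re
from datetime import datetime

# Assumed layout (Common Log Format):
#   host ident authuser [timestamp] "request" status bytes
CLF_PATTERN = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>.*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)$'
)


def parse_clf_line(line):
    """Parse one Common Log Format line into a dict, or return None if it does not match."""
    match = CLF_PATTERN.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    # The request is usually "METHOD URL PROTOCOL"; older logs may omit the protocol.
    parts = fields["request"].split()
    fields["method"] = parts[0] if parts else None
    fields["url"] = parts[1] if len(parts) > 1 else None
    fields["status"] = int(fields["status"])
    # A size of "-" means no byte count was logged for this request.
    fields["size"] = None if fields["size"] == "-" else int(fields["size"])
    fields["time"] = datetime.strptime(fields["time"], "%d/%b/%Y:%H:%M:%S %z")
    return fields


# Made-up sample line for illustration only.
sample = 'client1.example.com - - [01/Jul/1995:00:00:01 -0400] "GET /index.html HTTP/1.0" 200 6245'
record = parse_clf_line(sample)
print(record["url"], record["status"], record["size"])
```

Lines that do not match the assumed pattern return `None` rather than raising, so a caller can count and inspect rejected lines instead of silently losing them.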

 Copyright ©1999-2008
 All rights reserved.

Last modified: Sat Feb 9 13:40:20 EST 2008
Comments, corrections, and suggestions gratefully accepted.