Steve Souders, a Google engineer, author of three books on web performance and an open source activist, has presented a new project, the HTTP Archive, which is intended as a complement to the well-known Internet Archive.
Unlike the Internet Archive, it stores not the content of pages but technical information about them: content types, page sizes, the technologies in use, the most popular scripts and JavaScript libraries, image formats, the number of pages with errors, and so on. In the future this will make it possible to trace the evolution of the web more fully.
For example, the statistics show that over the past six months the use of Flash on the web has fallen by 16%, while the average page size has grown by 88 KB.
Statistics are collected from approximately 17,000 of the largest sites (the list is drawn from the Alexa, Fortune 500 and Quantcast rankings) and updated every two weeks. The collected data is publicly available. The HTTP Archive source code is also open, so a similar service could easily be launched for Runet.
For example, here are some charts from the statistical summary for March 29, 2011.

Steve Souders has been collecting data since October 2010, so some patterns can already be seen in the charts in the trends section. For example, the average size of a web page increased by 15% (88 KB) over this period.
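
As a rough sanity check, the two figures above (15% and 88 KB) imply an approximate starting and ending average page size. The sketch below only rearranges those two quoted numbers; the function name is made up for illustration, and since the 15% is presumably rounded, the results are estimates rather than values reported by the HTTP Archive.

```python
# Figures quoted in the text; the percentage is presumably rounded.
GROWTH_KB = 88          # absolute growth of the average page size
GROWTH_FRACTION = 0.15  # the same growth expressed as a fraction


def implied_sizes(growth_kb, growth_fraction):
    """Return the (start, end) average page size in KB implied by the growth."""
    start = growth_kb / growth_fraction  # growth = start * fraction
    end = start + growth_kb
    return start, end


start_kb, end_kb = implied_sizes(GROWTH_KB, GROWTH_FRACTION)
print(f"Implied average page size: about {start_kb:.0f} KB -> {end_kb:.0f} KB")
```

In other words, the quoted numbers are consistent with an average page of a little under 600 KB in October 2010 growing to somewhat under 700 KB by the end of March 2011.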

