Why do research in Web Caching?
There are a number of reasons to do research in web caching. Many of
them were brought up in a discussion on the ircache mailing list.
For the original posts, see the thread on 'free bandwidth' in the
ircache archives for September 1998.
I generally divide the motivations for web caching research into short-term
and long-term ones. In the short term, there is the desire to improve web
caching systems because they affect the performance of the web today. These
motivations include:
- Reducing the cost of connecting to the Internet.
Traffic on the web consumes more bandwidth than any other Internet
service, and so any method that can reduce the bandwidth requirements is
particularly welcome in parts of the world in which telecommunication
services are much more expensive than in the U.S., and in which service
is often priced per bit. Even when telecommunication
costs are reasonable, a large percentage of traffic travels to or from
the U.S. and so must use expensive trans-oceanic network links.
- Reducing the latency of today's WWW.
One of the most common end-user desires is for more speed. Many people
believe web caching can help reduce the "World Wide Wait", and so research
that reduces user-perceived latency is quite welcome. Latency improvements
are most noticeable, again, in areas of the world in which data must travel
long distances: such traffic accumulates significant latency from
speed-of-light constraints, accumulates processing time in the many
systems it crosses over many network hops, and is more likely to encounter
congestion as more networks are traversed to cover such distances.
High latency as a result of speed-of-light constraints is particularly
taxing in satellite communications.
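To give a rough sense of scale for the speed-of-light point, here is a small back-of-the-envelope sketch. It assumes a geostationary satellite at an altitude of roughly 35,786 km positioned directly overhead, signals travelling at the vacuum speed of light, and no processing or queueing delay, so the result is a lower bound and not a measurement. The function names are illustrative only.

```python
# Lower-bound propagation delay over a geostationary satellite link.
# Assumptions (not from the original text): satellite directly overhead,
# vacuum speed of light, and zero processing or queueing delay.

SPEED_OF_LIGHT_KM_S = 299_792.458  # speed of light in vacuum, km/s
GEO_ALTITUDE_KM = 35_786           # approximate geostationary altitude, km


def propagation_delay_ms(distance_km, speed_km_s=SPEED_OF_LIGHT_KM_S):
    """Time in milliseconds for a signal to cover distance_km."""
    return distance_km / speed_km_s * 1000.0


def geo_round_trip_ms():
    """Request goes ground -> satellite -> ground, and the response returns
    the same way, so the signal covers the altitude four times."""
    return propagation_delay_ms(4 * GEO_ALTITUDE_KM)


if __name__ == "__main__":
    print(f"one ground-to-satellite leg: {propagation_delay_ms(GEO_ALTITUDE_KM):.1f} ms")
    print(f"request/response round trip: {geo_round_trip_ms():.1f} ms")
```

Under these assumptions the round trip alone takes a little under half a second, before any server or network processing is counted. A response served from a cache on the user's side of the satellite link avoids that delay entirely.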
In the long term, one might argue that research in web caching is
not necessary, as the cost of bandwidth continues to drop.
However, research in web caching will continue to reap benefits because:
- Bandwidth will always have some cost.
The cost of bandwidth will never reach zero, even though costs are
currently going down as competition increases, the market grows, and
economies of scale contribute. No matter what the cost for bandwidth, one
will always want to maximize the return on investment, and caching will
often help.
- Non-uniform bandwidth and latencies.
Because of physical limitations such as environment and location, as
well as financial limitations, there will always be variations in
bandwidth and latencies. Caching can help to smooth these effects.
- Bandwidth demands continue to increase.
New users are still getting connected to the Internet in large numbers.
Even as growth in the user base slows, demand for increased bandwidth
will continue as high-bandwidth media such as audio and video
increase in popularity.
If the price is low enough, demand will always outstrip supply.
Additionally, as the availability of bandwidth increases, user
expectations are also likely to increase.
- Hot spots in the web will continue.
While some user demand can be predicted (such as for the latest version of
a free web browser), and thus can be handled by intelligent load
balancing that distributes content among many systems and networks, other
popular web destinations come as a surprise, sometimes as a result of
current events, but also potentially just as a result of desirable content
and word of mouth. These 'hot spots' will continue to affect availability
and response time, and their effects can be alleviated through web caching.
- Communication vs. computation.
Communication is likely to always be more expensive (to some
extent) than computation. We can build CPUs that are much faster than
main memory, and so memory caches are utilized. Likewise, caches will
continue to be used as computer systems and network connectivity both
get faster.
There are a number of parallels to this situation; one is that of
main memory in computer systems. The cost of RAM has decreased
tremendously over the past decades. Yet relatively few people would claim
to have enough, and in fact, demand for additional memory to handle larger
applications continues unabated. Therefore, virtual memory (a form of
caching) continues to be used strategically as an alternative to
purchasing additional memory.
Of course, there are additional secondary benefits to web caching. These
include reduced load on the originating servers, and improved reliability
as objects may be available in a cache even when the original web server
is currently inaccessible.