Effectiveness of caching in a distributed digital library system

Hollmann, Jochen; Ardö, Anders; Stenström, Per

Hollmann, Jochen; Ardö, Anders and Stenström, Per (2007). In Journal of Systems Architecture 53(7), p. 403-416

Abstract: Today independent publishers are offering digital libraries with fulltext archives. In an attempt to provide a single user-interface to a large set of archives, the studied Article-Database-Service offers a consolidated interface to a geographically distributed set of archives. While this approach offers a tremendous functional advantage to a user, the fulltext download delays caused by the network and queuing in servers make the user-perceived interactive performance poor.

This paper studies how effective caching of articles can be achieved at the client level as well as at intermediate points, namely the gateways that implement the interfaces to the many fulltext archives. A central research question in this approach is: What is the nature of locality in the user access stream to such a digital library? Based on access logs that drive the simulations, it is shown that client-side caching can result in a 20% hit rate. Even at the gateway level temporal locality is observable, but published replacement algorithms are unable to exploit this temporal locality. Additionally, spatial locality can be exploited by loading into the cache all articles in an issue, volume, or journal when a single article is accessed. However, our experiments showed that this improvement introduces considerable overhead. Finally, it is shown that the reason for this cache behavior is the long time distance between re-accesses, which makes caching quite infeasible.
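The trace-driven evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' simulator: the abstract does not name the replacement algorithms studied, so LRU is used here only as a familiar stand-in, and the function names, the toy traces, and the cache capacities are all hypothetical. The first function replays a sequence of article IDs through a fixed-size cache and reports the hit rate; the second measures the time distance between re-accesses of the same article, the quantity the abstract identifies as the reason caching performs poorly.

```python
from collections import OrderedDict


def lru_hit_rate(trace, capacity):
    """Replay a trace of article IDs through an LRU cache and return the hit rate.

    LRU is an assumption made for illustration; the abstract does not say
    which replacement algorithms the paper evaluates.
    """
    cache = OrderedDict()  # keys ordered from least to most recently used
    hits = 0
    for article_id in trace:
        if article_id in cache:
            hits += 1
            cache.move_to_end(article_id)  # mark as most recently used
        else:
            cache[article_id] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used entry
    return hits / len(trace) if trace else 0.0


def reaccess_gaps(timed_trace):
    """Return the time gaps between consecutive accesses to the same article.

    `timed_trace` is a sequence of (timestamp, article_id) pairs sorted by time.
    """
    last_seen = {}
    gaps = []
    for timestamp, article_id in timed_trace:
        if article_id in last_seen:
            gaps.append(timestamp - last_seen[article_id])
        last_seen[article_id] = timestamp
    return gaps


# Hypothetical toy traces, not data from the paper.
toy_trace = ["a", "b", "a", "c", "b", "a"]
toy_timed_trace = [(0, "a"), (5, "b"), (100, "a"), (105, "b"), (400, "a")]

small_cache_rate = lru_hit_rate(toy_trace, capacity=2)
large_cache_rate = lru_hit_rate(toy_trace, capacity=3)
gaps = reaccess_gaps(toy_timed_trace)
```

In this toy trace the two-entry cache evicts each article shortly before it is requested again, so it scores fewer hits than the three-entry cache. The same effect, at much larger scale, is what long re-access distances would produce in a real access log.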

Please use this url to cite or link to this publication: https://lup.lub.lu.se/record/1033949

author

Hollmann, Jochen; Ardö, Anders and Stenström, Per

organization

Department of Electrical and Information Technology

publishing date

2007

type

Contribution to journal

publication status

published

subject

Electrical Engineering, Electronic Engineering, Information Engineering

keywords

STRATEGIES, document caching, REPLACEMENT ALGORITHM, PERFORMANCE, H.3 information storage and retrieval, H.3.7 digital libraries, performance

in

Journal of Systems Architecture

volume

53

issue

7

pages

403 - 416

publisher

Elsevier

external identifiers

scopus:34248564619

ISSN

1383-7621

DOI

10.1016/j.sysarc.2006.11.011

language

English

LU publication?

yes

id

1130bd6e-40a9-4a77-8557-5c704f56868a (old id 1033949)

alternative location

http://www.eit.lth.se/fileadmin/eit/home/hs.aar/Publ/JSA2007pub.pdf

date added to LUP

2016-04-01 15:40:21

date last changed

2025-10-30 13:22:28

@article{1130bd6e-40a9-4a77-8557-5c704f56868a,
  abstract     = {{Today independent publishers are offering digital libraries with fulltext archives. In an attempt to provide a single user-interface to a large set of archives, the studied Article-Database-Service offers a consolidated interface to a geographically distributed set of archives. While this approach offers a tremendous functional advantage to a user, the fulltext download delays caused by the network and queuing in servers make the user-perceived interactive performance poor.

This paper studies how effective caching of articles can be achieved at the client level as well as at intermediate points, namely the gateways that implement the interfaces to the many fulltext archives. A central research question in this approach is: What is the nature of locality in the user access stream to such a digital library? Based on access logs that drive the simulations, it is shown that client-side caching can result in a 20% hit rate. Even at the gateway level temporal locality is observable, but published replacement algorithms are unable to exploit this temporal locality. Additionally, spatial locality can be exploited by loading into the cache all articles in an issue, volume, or journal when a single article is accessed. However, our experiments showed that this improvement introduces considerable overhead. Finally, it is shown that the reason for this cache behavior is the long time distance between re-accesses, which makes caching quite infeasible.}},
  author       = {{Hollmann, Jochen and Ardö, Anders and Stenström, Per}},
  issn         = {{1383-7621}},
  keywords     = {{STRATEGIES; document caching; REPLACEMENT ALGORITHM; PERFORMANCE; H.3 information storage and retrieval; H.3.7 digital libraries; performance}},
  language     = {{eng}},
  number       = {{7}},
  pages        = {{403--416}},
  publisher    = {{Elsevier}},
  journal      = {{Journal of Systems Architecture}},
  title        = {{Effectiveness of caching in a distributed digital library system}},
  url          = {{http://dx.doi.org/10.1016/j.sysarc.2006.11.011}},
  doi          = {{10.1016/j.sysarc.2006.11.011}},
  volume       = {{53}},
  year         = {{2007}},
}

Lund University Publications
