Scalability study of database-backed file systems for High Throughput Computing
(2017) Computer Science and Engineering (BSc)
- Abstract
- The purpose of this project is to study the read performance of transparent
database-backed file systems, a meld of two technologies with seemingly
similar purposes, in relation to conventional file systems. Systems such
as the ARC middleware rely on reading several million files every day,
and as the number of files increases, performance suffers. To study the
capabilities of a database-backed file system, a candidate is chosen and put
to the test. The candidate, ultimately Database File System (DBFS), is
Oracle Database using FUSE to create a transparent file system interface.
DBFS is tested by storing millions of small files in its datafile and
executing a scanning process of the ARC software. With the performance
data gathered from these tests, it was concluded that DBFS, while performing
well on an HDD when compared to ext4 in terms of scalability and read
performance, is simply outperformed by XFS with both small (from 50 000 files)
and large (up to 1 600 000 files) directories.
Please use this URL to cite or link to this publication:
http://lup.lub.lu.se/student-papers/record/8924095
- author
- Trinh, Andy
- organization
- year
- 2017
- type
- M2 - Bachelor Degree
- subject
- keywords
- database-backed file system, dbfs, scalability, xfs, ext4, database, file system, fuse, arc, read performance, alternative storage, rdbms, file system interface
- language
- English
- id
- 8924095
- date added to LUP
- 2017-08-30 04:11:03
- date last changed
- 2018-10-18 10:36:59
@misc{8924095,
  abstract = {{The purpose of this project is to study the read performance of transparent database-backed file systems, a meld of two technologies with seemingly similar purposes, in relation to conventional file systems. Systems such as the ARC middleware rely on reading several million files every day, and as the number of files increases, performance suffers. To study the capabilities of a database-backed file system, a candidate is chosen and put to the test. The candidate, ultimately Database File System (DBFS), is Oracle Database using FUSE to create a transparent file system interface. DBFS is tested by storing millions of small files in its datafile and executing a scanning process of the ARC software. With the performance data gathered from these tests, it was concluded that DBFS, while performing well on an HDD when compared to ext4 in terms of scalability and read performance, is simply outperformed by XFS with both small (from 50 000 files) and large (up to 1 600 000 files) directories.}},
  author = {{Trinh, Andy}},
  language = {{eng}},
  note = {{Student Paper}},
  title = {{Scalability study of database-backed file systems for High Throughput Computing}},
  year = {{2017}},
}