XBase: making your gigabyte disk queriable.

Hongjun Lu, Guoren Wang, Ge Yu, Yubin Bao, Jianhua Lv, Yaxin Yu
DOI: https://doi.org/10.1145/564691.564785
2002-01-01
Abstract: With the rapid development of the Internet and the World Wide Web (WWW), a very large amount of information is available and ready for downloading, most of it free of charge. At the same time, hard disks with large capacity are available at affordable prices. Most of us now routinely dump large numbers of documents of various types onto our computers without much thought. File systems, on the other hand, have changed little over the past decades. Most of them organize files in directories that form a tree structure, and a file is identified by its name and its pathname in the directory tree. Remembering the names of files created some time ago, and digging them out of a disk holding dozens of gigabytes of data in hundreds of thousands of files, is never an easy task. The tools available to help with such a search are still far from satisfactory. XBase (XML-based document BASE) is a prototype system that aims to address this problem. By XML-based, we mean that XML is used to define the metadata. The current version of XBase stores text-based files, including semi-structured data such as XML and HTML, plain text documents (e.g., tex files, computer programs), and files that can be converted into text (e.g., PostScript files, PDF files). In XBase, a file name is optional. Users can simply load a file into XBase without giving a name or the directory where it should be stored. XBase automatically associates it with attributes such as the time when the file was saved, its source, its size, and its type. To retrieve those files, XBase provides three access methods: explorative browsing, querying using query languages, and keyword-based search.
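The idea of loading a file without a name and having attributes attached automatically can be illustrated with a minimal sketch. It is not XBase's actual schema or implementation, which the abstract does not describe: the element names (`document`, `saved`, `source`, `size`, `type`), the content-hash identifier, the extension table, and both function names are assumptions made for illustration only.

```python
import hashlib
import os
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from typing import Optional

# Hypothetical mapping from file extension to a coarse document type.
TYPE_BY_EXTENSION = {".xml": "xml", ".html": "html", ".tex": "text",
                     ".ps": "postscript", ".pdf": "pdf"}


def build_metadata(content: bytes, source: str, name: Optional[str] = None,
                   saved_at: Optional[datetime] = None) -> ET.Element:
    """Build an illustrative XML metadata record for one loaded file."""
    saved_at = saved_at or datetime.now(timezone.utc)
    doc = ET.Element("document")
    # The name is optional, so a content hash serves as the identifier (assumption).
    doc.set("id", hashlib.sha1(content).hexdigest()[:12])
    if name is not None:
        ET.SubElement(doc, "name").text = name
    ET.SubElement(doc, "saved").text = saved_at.isoformat()
    ET.SubElement(doc, "source").text = source
    ET.SubElement(doc, "size").text = str(len(content))
    extension = os.path.splitext(source)[1].lower()
    ET.SubElement(doc, "type").text = TYPE_BY_EXTENSION.get(extension, "text")
    return doc


def keyword_search(store, keyword):
    """Return ids of stored documents whose text contains the keyword.

    `store` is a list of (metadata element, extracted text) pairs.
    """
    needle = keyword.lower()
    return [meta.get("id") for meta, text in store if needle in text.lower()]
```

A record built this way can be filtered by attribute (for example, `rec.findtext("type") == "pdf"`) or matched by keyword against the extracted text, which loosely mirrors the querying and keyword-based access methods mentioned in the abstract.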