The Case for a Structured Approach to Managing Unstructured Data
AnHai Doan, Jeff Naughton, Akanksha Baid, Xiaoyong Chai, Fei Chen, Ting Chen, Eric Chu, Pedro DeRose, Byron Gao, Chaitanya Gokhale, Jiansheng Huang, Warren Shen, Ba-Quy Vuong
DOI: https://doi.org/10.48550/arXiv.0909.1783
2009-09-10
Abstract: The challenge of managing unstructured data represents perhaps the largest data management opportunity for our community since managing relational data. And yet we risk letting this opportunity go by, ceding the playing field to other players, ranging from communities such as AI, KDD, IR, Web, and Semantic Web, to industrial players such as Google, Yahoo, and Microsoft. In this essay we explore what we can do to improve upon this situation. Drawing on the lessons learned while managing relational data, we outline a structured approach to managing unstructured data. We conclude by discussing the potential implications of this approach for managing other kinds of non-relational data, and for the identity of our field.
Databases, Information Retrieval