Social Informatics Data Grid

Bennett Bertenthal, Robert Grossman, David Hanley, Mark Hereld, Sarah Kenny, Gina-Anne Levow, Michael E. Papka, Stephen W. Porges, Kavithaa Rajavenkateshwaran, Rick Stevens, Thomas D. Uram, Wenjun Wu
2007-01-01
Abstract: The Social Informatics Data Grid (SIDGrid) is a new infrastructure designed to transform how social and behavioral scientists collect and annotate data, collaborate and share data, and analyze and mine large data repositories. An important goal of the project is compatibility with existing databases and tools that support the sharing, storage, and retrieval of archival data sets. SIDGrid is built on web and grid services to enable transparent access to data and analysis resources from anywhere and to leverage new and emerging web-based technologies created by a large and growing community of developers around the world. At the heart of the SIDGrid design is a rich data model that captures notions of time, data streams, and semi-structured data attached to these streams, enabling powerful manipulations of multimodal data spread across data resources. Through query and analysis services deployed against the data warehoused in the SIDGrid, users can perform new classes of experiments. Shared data resources available from anywhere over the Web introduce new capabilities to the collection and analysis of data, collaborative annotation among them, while an embedded security model ensures that control over sensitive data is not relinquished. Through a series of workshops at which we engaged members of the broader community, and by cultivating a few collaborative projects, we have steered the development process to provide the most important components and functions first.