Abstract: We are moving towards a distributed, fully interconnected information environment in which data items are accessed from many locations that may be geographically distributed worldwide. In such an environment, data items are usually replicated, i.e., stored in the local databases of multiple processors, to improve data availability and access performance. Improper management of replicated data may generate heavy network traffic and lead to severe performance problems. The problem is especially significant in mobile computing environments, where wireless bandwidth is limited and wireless communication is much more costly than wired network communication. In this thesis, I present several mathematical models for evaluating the network traffic incurred by replicated objects. Convergence is introduced for evaluating replicated data management algorithms under regular network traffic, competitiveness for worst-case analysis, and a stochastic method for the expected-case study. I also present a couple of dynamic data replication algorithms, each designed for a different network architecture. I compare the proposed algorithms with the traditional static algorithm for managing replicated data, both analytically and experimentally. The analytical comparison is based on the measures proposed above, and the experimental comparison is based on simulation results.

Towards a Better Replica Management for Hadoop Distributed File System

An Experimental Evaluation of Performance of A Hadoop Cluster on Replica Management

NGN Management with NGOSS Framework-Based IMS Use Case

Load Balance Optimization with Replication Degree Customization

Heterogeneous Replicas for Multi-dimensional Data Management

An Efficient Replicated System for the Metadata of HDFS

Dynamic Data Storage and Management Strategies for Distributed File System

ERMS: an Elastic Replication Management System for HDFS

Optimize Multidimensional Arrays Queries with Heterogeneous Replica Method

A dynamic optimal replication strategy in data grid environment

Hadoop High Availability Through Metadata Replication

Rer: A Replica Efficiency Based Replication Strategy

A Combination Replication Strategy for Data-Intensive Services in Distributed Geographic Information System

Optimizing Hadoop Block Placement Policy and Cluster Blocks Distribution

A Dynamic Replication Management Strategy in Distributed GIS

Dynamic Replicas Strategy Based on Predicted Popularity

An Optimized Replica Distribution Method in Cloud Storage System

The Impact of Data Replication on Job Scheduling Performance in Hierarchical Data Grid

A Two-Layered Replica Management Method

Replica-aware Data Recovery Performance Improvement for Hadoop System with NVM

Dynamic Data Replication in Distributed Systems