The Architecture Design of MICAPS4 Server System
Ruotong Wang, Jianmin Wang, Xiangdong Huang, Yifeng Dong, Mingsheng Long
DOI: https://doi.org/10.11898/1001-7313.20180101
2018-01-01
Abstract: Meteorological data are typical unstructured data, with volumes reaching dozens of TB per day. In MICAPS3, data pre-processing, storage and access were based on an RDBMS and a file system, which became the bottleneck. To meet MICAPS4 users' need for fast, timely queries of real-time meteorological data, a high-performance storage system for massive meteorological data and a stable 7 × 24 distributed data pre-processing system are designed and built on a non-relational key-value DDBMS, following the multi-dimension model of meteorological data and the observed query behavior of users. MICAPS4 uses a client/server architecture, and the high-performance server cluster is its critical component. Using a distributed key-value data model and a P2P infrastructure, the MICAPS4 server system distributes all real-time data, which arrive at a very high rate, to multiple servers through an automatic load-balancing algorithm. All data are first stored in memory and periodically persisted to hard disk, which reduces the number of disk I/O operations and keeps write pressure low even under heavy read load. To enhance data and system reliability, a distributed architecture and multiple data replicas are used, which also improves system throughput. According to statistics gathered in the production environment, the performance of the MICAPS4 server system is more than 100 times that of MICAPS3. The MICAPS4 server system moves all real-time meteorological data storage from a file system to a database, and from a centralized system to a distributed one. The system became the core production system of the China Meteorological Administration in 2015 and has been deployed nationwide. Under massive data volumes and concurrent access by many users, it shows high stability and excellent read-write performance, and it is also highly scalable and easy to maintain. The MICAPS4 high-performance server system comprises five sub-systems: the distributed storage system, the distributed pre-processing system, the station data polling system, the data query server and the monitoring probe. The distributed storage system provides high-performance access to real-time meteorological data in both random and sequential modes. The distributed pre-processing system implements stream computing over massive real-time meteorological data on a peer-to-peer distributed infrastructure. The station data polling system synchronizes replicas of heterogeneous station observation data across different systems. The data query server performs real-time computation for MICAPS4 clients using a multi-threaded server. The monitoring probe is deployed on each server node and periodically reports host health messages. The overall design of the MICAPS4 server system is described, and the motivation, core technologies and design of each sub-system are introduced.
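The storage ideas named in the abstract (composite keys derived from data dimensions, placement of each key on several servers, and in-memory writes flushed to disk in batches) can be illustrated with a minimal sketch. This is not the MICAPS4 implementation: the key layout, the class and function names, and the use of simple hash-based placement in place of the paper's automatic load-balancing algorithm are all assumptions made for illustration only.

```python
import hashlib


def make_key(data_type, element, level, valid_time):
    """Build a composite key from record dimensions (hypothetical layout)."""
    return f"{data_type}/{element}/{level}/{valid_time}"


def replica_nodes(key, nodes, replicas=2):
    """Pick distinct nodes for a key by hashing it onto the node list.

    Assumption: plain hash placement stands in for the real load balancer.
    """
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    start = int(digest, 16) % len(nodes)
    count = min(replicas, len(nodes))
    return [nodes[(start + i) % len(nodes)] for i in range(count)]


class Node:
    """One storage node: writes land in memory and are flushed in batches."""

    def __init__(self, name):
        self.name = name
        self.memtable = {}    # recent writes, held in memory
        self.disk = {}        # stand-in for persistent storage
        self.disk_writes = 0  # number of batch flushes performed

    def put(self, key, value):
        self.memtable[key] = value

    def flush(self):
        """Persist all buffered writes as one batch (run periodically)."""
        if self.memtable:
            self.disk.update(self.memtable)
            self.memtable.clear()
            self.disk_writes += 1

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        return self.disk.get(key)


class Cluster:
    """A set of nodes that stores every key on several replicas."""

    def __init__(self, names, replicas=2):
        self.names = list(names)
        self.nodes = {n: Node(n) for n in self.names}
        self.replicas = replicas

    def put(self, key, value):
        for name in replica_nodes(key, self.names, self.replicas):
            self.nodes[name].put(key, value)

    def get(self, key, down=()):
        """Read from the first replica that is not marked as down."""
        for name in replica_nodes(key, self.names, self.replicas):
            if name not in down:
                return self.nodes[name].get(key)
        return None

    def flush_all(self):
        for node in self.nodes.values():
            node.flush()
```

In this sketch, many `put` calls between two `flush_all` calls cost each node at most one batch write, which is the sense in which buffering in memory reduces disk I/O, and a read still succeeds when one replica is listed in `down`, which is the sense in which replicas improve reliability.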