Design and implementation of second-hand housing data statistical analysis system

Taizhi Lv, Jun Zhang
DOI: https://doi.org/10.23977/FERM.2021.040408
2021-09-30
Abstract: This paper presents a system for the statistical analysis of second-hand housing price data in Wuxi. The system collects listing data from the web and visualizes it so that price levels can be seen clearly and the influence of each factor on price can be judged. Linear regression is used to examine the relationship between price per square meter and building area, and the KNN algorithm is used to classify residential communities by grade, with the Euclidean and Manhattan distances compared for their effect on the house price classification. The system is written in Python, with PyCharm as the development tool and Python 3.9 as the runtime environment. The Scrapy framework crawls second-hand housing data from the LianJia website and stores it in MongoDB, and MySQL is used to process the related data. After dirty data are cleaned, the lightweight web application framework Flask and ECharts are used for visual analysis on a web page. Finally, linear regression is used to identify the factors related to price, and KNN classification is used to divide residential communities into three grades according to housing price level.
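The two analysis steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the listings, community features, grade labels, and helper names (`fit_line`, `knn_predict`) are all hypothetical and chosen only to show a least-squares fit of unit price against building area, and how the choice between Euclidean and Manhattan distance can change which neighbour KNN treats as nearest.

```python
import math
from collections import Counter


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def euclidean(a, b):
    """Straight-line distance between two feature vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute per-feature differences."""
    return sum(abs(p - q) for p, q in zip(a, b))


def knn_predict(train, query, k, distance):
    """Majority vote among the k training points nearest to query.

    train is a list of (features, label) pairs.
    """
    nearest = sorted(train, key=lambda item: distance(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical listings: building area (m^2) and unit price (10,000 yuan per m^2).
areas = [60.0, 80.0, 100.0, 120.0]
unit_prices = [1.0, 1.2, 1.4, 1.6]
slope, intercept = fit_line(areas, unit_prices)

# Hypothetical communities: (mean unit price, mean area) in arbitrary
# rescaled units, each labelled with one of three grades.
communities = [
    ((13.0, 12.0), "high"),
    ((10.0, 14.0), "medium"),
    ((2.0, 3.0), "low"),
]
query = (10.0, 9.0)

grade_euclidean = knn_predict(communities, query, 1, euclidean)
grade_manhattan = knn_predict(communities, query, 1, manhattan)
print(slope, intercept, grade_euclidean, grade_manhattan)
```

With this toy data the two metrics pick different nearest neighbours for the same query, which is the kind of difference the comparison in the abstract is concerned with. On real data the features would normally be scaled first so that no single feature dominates the distance.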
Computer Science