Towards an automated repository for indexing, analysis and characterization of municipal e-government websites in Mexico

Sergio R. Coria, Leonardo Marcos-Santiago, Christian A. Cruz-Melendez, Juan M. Jimenez-Canseco
DOI: https://doi.org/10.48550/arXiv.2006.14746
2020-06-26
Abstract: This article addresses a problem in the electronic government discipline, with special interest in Mexico: the need for a centralized and up-to-date information source about municipal e-government websites. One reason for this need is the lack of a complete and up-to-date database containing the web addresses (domain names) of the municipal governments that have a website. For diverse reasons, not all Mexican municipalities have one, and a number of those that do present information corresponding not to the current government but to previous ones. The few official lists of municipal websites are not updated frequently enough, and manually determining which municipalities have an operating and valid website at a given moment is a time-consuming process. In addition, website contents do not always comply with legal requirements and are considerably heterogeneous. In turn, the level of evolutionary development of municipal websites is valuable information that can be harnessed for diverse theoretical and practical purposes in the public administration field. Obtaining all these pieces of information requires website content analysis. Therefore, this article investigates the need for, and the feasibility of, automating the implementation and updating of a digital repository to perform diverse analyses of these websites. Its technological feasibility is addressed by means of a literature review on web scraping and by proposing a preliminary manual methodology, which takes into account known, proven techniques and software tools for web crawling and scraping. No new techniques for crawling or scraping are proposed because the existing ones satisfy the current needs. Finally, software requirements are specified in order to automate the creation, updating, indexing, and analysis of the repository.
Computers and Society, Digital Libraries