Automated Image Color Mapping for a Historic Photographic Collection

Taylor Arnold, Lauren Tilton
2024-11-07
Abstract: In the 1970s, the United States Environmental Protection Agency sponsored Documerica, a large-scale photography initiative to document environmental subjects nation-wide. While over 15,000 digitized public-domain photographs from the collection are available online, most of the images were scanned from damaged copies of the original prints. We present and evaluate a modified histogram matching technique based on the underlying chemistry of the prints for correcting the damaged images by using training data collected from a small set of undamaged prints. The entire set of color-adjusted Documerica images is made available in an open repository.
Computer Vision and Pattern Recognition, Applications
What problem does this paper attempt to address?
This paper addresses the repair of color-damaged photographs in the Documerica collection. Most of the digitized images were scanned from damaged copies of the original prints, which left them with a strong red shift that reduces both their aesthetic quality and their persuasiveness. The authors propose and evaluate an algorithm based on histogram matching that uses a small number of undamaged photos as training data to correct the color shift and restore the original color quality.

### Problem Background

Documerica was a large-scale photography initiative run by the United States Environmental Protection Agency (EPA) from 1972 to 1977 to document environmental problems in the United States. More than 15,000 digitized photos from the project are in the public domain, but most were scanned from damaged copies, so their colors are distorted, most visibly by a strong red shift. This distortion affects the aesthetics of the photos and also weakens their value as historical documents.

### Solution

The authors propose a modified histogram matching technique built on the following points:

1. **Understanding the Photographic Material**: Considering the chemistry of the prints, the authors suggest that the color shift may be caused by heat, light, and humidity affecting the three dyes (cyan, magenta, yellow) differently.
2. **CMY Color Channel Correction**: To correct the color shift, the authors process the RGB/CMY color channels directly instead of using color spaces adapted to human vision (such as LAB or LUV).
3. **Avoiding Overfitting**: To avoid overfitting to the color distribution of a single image, the authors build a more general transformation function by taking the median of the histogram transformations across the entire training set (22 pairs of images).
4. **Median Histogram Matching (MHM)**: The method learns three monotonic functions \(\hat{f}_C\), \(\hat{f}_M\) and \(\hat{f}_Y\), one each for the cyan, magenta, and yellow channels, that map input color intensities to output color intensities.

### Results and Verification

The authors evaluate the method through quantitative and qualitative analyses. The results show that the corrected images are closer in color to the original photos and recover the color diversity and aesthetic value of the photos. The authors further support the method through cross-validation and a comparison with manually corrected photos.

### Summary

With this algorithm, the authors aim to restore the color quality of the damaged photos in the Documerica collection without relying completely on perfectly matching reference images, improving the aesthetics and persuasiveness of these historical photos and offering an approach that may carry over to similar problems.
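
The median histogram matching idea described under Solution can be illustrated with a short Python sketch. This is not the authors' implementation: the function names `channel_lut`, `fit_mhm`, and `apply_mhm` are hypothetical, and the sketch assumes 8-bit RGB images, a CMY conversion of `255 - RGB`, a standard CDF-based histogram match per training pair, and a pointwise median across pairs. The paper's exact procedure may differ in any of these details.

```python
import numpy as np


def channel_lut(src, ref):
    """CDF-based histogram matching for one 8-bit channel.

    Returns a 256-entry lookup table mapping each source level to the
    reference level that sits at the same cumulative frequency.
    """
    src_cdf = np.cumsum(np.bincount(src.ravel(), minlength=256)) / src.size
    ref_cdf = np.cumsum(np.bincount(ref.ravel(), minlength=256)) / ref.size
    # First reference level whose CDF reaches the source CDF value.
    return np.clip(np.searchsorted(ref_cdf, src_cdf, side="left"), 0, 255)


def fit_mhm(pairs):
    """Fit one lookup table per CMY channel from (damaged, reference) pairs.

    pairs: iterable of (damaged_rgb, reference_rgb) uint8 arrays, shape (H, W, 3).
    Returns a float array of shape (3, 256).
    """
    luts = []
    for damaged, reference in pairs:
        d_cmy = 255 - damaged      # assumption: CMY as the complement of RGB
        r_cmy = 255 - reference
        luts.append([channel_lut(d_cmy[..., c], r_cmy[..., c]) for c in range(3)])
    # The pointwise median of non-decreasing functions is itself
    # non-decreasing, so the fitted tables stay monotonic.
    return np.median(np.array(luts), axis=0)


def apply_mhm(image_rgb, luts):
    """Apply the fitted per-channel tables to a damaged uint8 RGB image."""
    cmy = 255 - image_rgb
    out = np.empty_like(cmy)
    for c in range(3):
        out[..., c] = np.rint(luts[c][cmy[..., c]]).astype(np.uint8)
    return 255 - out
```

Under these assumptions, `fit_mhm` would be called once on the training pairs, and the resulting tables passed to `apply_mhm` for every damaged scan. Taking the median across pairs, rather than matching each image to a single reference, is what keeps the mapping from following the color distribution of any one training photo.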