Adaptive Quantization for Hashing: An Information-Based Approach to Learning Binary Codes

Caiming Xiong, Wei Chen, Gang Chen, David Johnson, and Jason J. Corso
Department of Computer Science and Engineering, SUNY at Buffalo

Proceedings of the 2014 SIAM International Conference on Data Mining (SDM), pp. 172-180
Chapter DOI: https://doi.org/10.1137/1.9781611973440.20
Published: 2014
eISBN: 978-1-61197-344-0
https://doi.org/10.1137/1.9781611973440
Book Series Name: Proceedings
Book Code: PRDT14
Book Pages: 1-1086

Abstract

Large-scale data mining and retrieval applications have increasingly turned to compact binary data representations as a way to achieve both fast queries and efficient data storage; many algorithms have been proposed for learning effective binary encodings. Most of these algorithms focus on learning a set of projection hyperplanes for the data and simply binarizing the result from each hyperplane, but this neglects the fact that informativeness may not be uniformly distributed across the projections. In this paper, we address this issue by proposing a novel adaptive quantization (AQ) strategy that adaptively assigns varying numbers of bits to different hyperplanes based on their information content. Our method provides an information-based schema that preserves the neighborhood structure of data points, and we jointly find the globally optimal bit-allocation for all hyperplanes. In our experiments, we compare with state-of-the-art methods on four large-scale datasets and find that our adaptive quantization approach significantly improves on traditional hashing methods.
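To make the idea of assigning different numbers of bits to different hyperplanes concrete, here is a minimal sketch of bit allocation under a fixed total budget. It is not the authors' formulation: the function name `allocate_bits`, the `gain` table, and the numbers in it are all hypothetical. `gain[i][b]` stands in for whatever information-based score a method would assign to giving `b` bits to hyperplane `i`, and a small dynamic program picks the allocation with the highest total score.

```python
def allocate_bits(gain, budget):
    """Pick how many bits each projection gets, maximizing the summed score.

    gain[i][b] is a hypothetical score for giving b bits to projection i
    (b = 0 .. max_bits). At most `budget` bits are used in total.
    Returns (bits_per_projection, total_score).
    """
    n = len(gain)
    max_bits = len(gain[0]) - 1
    neg_inf = float("-inf")

    # best[i][t]: best score using the first i projections and exactly t bits.
    best = [[neg_inf] * (budget + 1) for _ in range(n + 1)]
    # choice[i][t]: bits given to projection i-1 in that best solution.
    choice = [[0] * (budget + 1) for _ in range(n + 1)]
    best[0][0] = 0.0

    for i in range(1, n + 1):
        for t in range(budget + 1):
            for b in range(min(max_bits, t) + 1):
                prev = best[i - 1][t - b]
                if prev == neg_inf:
                    continue  # t - b bits cannot be spent on the first i-1 projections
                cand = prev + gain[i - 1][b]
                if cand > best[i][t]:
                    best[i][t] = cand
                    choice[i][t] = b

    # Choose the best reachable total number of bits, then walk back.
    t = max(range(budget + 1), key=lambda u: best[n][u])
    total = best[n][t]
    bits = [0] * n
    for i in range(n, 0, -1):
        bits[i - 1] = choice[i][t]
        t -= bits[i - 1]
    return bits, total


# Made-up scores for three projections, each allowed 0 to 3 bits.
example_gain = [
    [0.0, 1.0, 1.2, 1.3],   # saturates quickly
    [0.0, 3.0, 5.0, 6.0],   # keeps benefiting from extra bits
    [0.0, 0.5, 0.6, 0.65],  # barely informative
]
bits, total = allocate_bits(example_gain, budget=4)
print(bits, total)  # [1, 3, 0] 7.0
```

With these made-up scores, uniform one-bit binarization of all three projections would score 4.5 using three bits, whereas the budgeted allocation concentrates bits on the second projection and drops the third entirely.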

Related

HyperMinHash: MinHash in LogLog space

Streaming Algorithms for Estimating High Set Similarities in LogLog Space

UltraLogLog: A Practical and More Space-Efficient Alternative to HyperLogLog for Approximate Distinct Counting

Min-Max Hash for Jaccard Similarity

A Memory-Efficient Sketch Method for Estimating High Similarities in Streaming Sets

Minwise-Independent Permutations with Insertion and Deletion of Features

Maximum-Margin Hamming Hashing

BitHash: an Efficient Bitwise Locality Sensitive Hashing Method with Applications

Learned Monotone Minimal Perfect Hashing

Minimizing Reconstruction Bias Hashing Via Joint Projection Learning and Quantization

ShockHash: Near Optimal-Space Minimal Perfect Hashing Beyond Brute-Force

ExaLogLog: Space-Efficient and Practical Approximate Distinct Counting up to the Exa-Scale

Compressed Hashing

Maintaining $k$-MinHash Signatures over Fully-Dynamic Data Streams with Recovery

Deep Hashing Via Discrepancy Minimization

Deep Cauchy Hashing For Hamming Space Retrieval

Extensible Max-min Collaborative Retention for Online Mini-batch Learning Hash Retrieval

Accelerated Large Scale Optimization by Concomitant Hashing

A Review for Weighted MinHash Algorithms

Collaborative Learning for Extremely Low Bit Asymmetric Hashing