Abstract: Data mining, or knowledge discovery in databases (KDD), is an interdisciplinary field that integrates techniques from several research areas, including machine learning, statistics, database systems, and pattern recognition, for the analysis of large volumes of possibly complex, highly distributed, and poorly organized data. The prosperity of the data mining field can be attributed to two essential reasons. First, a huge amount of data is collected and stored every day. On the one hand, with the continuing development of advanced technologies in many domains, data is generated at enormous speed. Examples include purchase data from department and grocery stores, bank and credit card transaction data, e-commerce data, Internet traffic data describing the browsing history of Web users, remote sensor data from agricultural satellites, and gene expression data from microarray technology. On the other hand, progress in hardware technology allows today’s computer systems to store very large amounts of data. Second, with these large volumes of data at hand, data owners have a pressing need to turn them into useful knowledge. From a commercial viewpoint, the ultimate goal of data owners is to gain more and pay less in their business activities. Under competitive pressure, they want to enhance their services, develop cost-effective strategies, and target the right group of potential customers. From a scientific viewpoint, when traditional techniques are infeasible for dealing with the raw data, data mining may help scientists in many ways, such as classifying and segmenting data.
By applying the knowledge extracted through data mining, a business analyst may rate customers by their propensity to respond to an offer, a doctor may estimate the probability that an illness will recur, a website publisher may display customized Web pages to individual users according to their browsing habits, and a geneticist may discover novel gene-gene interaction patterns. In this talk, we aim to provide a general picture of the important data mining steps, topics, algorithms, and challenges.
