Stock Movement Prediction and N-dimensional Inter-Transaction Association Rules
Hongjun Lü,Jiawei Han,Ling Feng
1998-01-01
Abstract: 1 Inadequacy in association rule mining for stock movement prediction
Among all the data mining problems, discovering association rules from large databases is probably the most significant contribution from the database community to the field [1, 2, 5, 9, 10, 7]. The most often cited application of association rules is market basket analysis using transaction databases from supermarkets and department stores. We can discover rules like R1: 80% of customers who bought diapers also bought beer (diaper ⇒ beer (20%; 80%)), where 80% is the confidence level of the rule and 20% is the support level of the rule, indicating how frequently the rule holds.
The same concept can be applied to other applications as well. For example, to predict stock market price movement, we can construct a transaction database in which each record (transaction) represents one trading day and contains a list of winners (stocks whose closing price is x% higher than the previous day's closing price, where x% is the trading overhead). Thus we can find rules like R2: When the prices of IBM and SUN go up, 80% of the time the price of Microsoft goes up (on the same day). While rule R2 reflects some relationship among the prices, its role in price prediction is limited. It is rather obvious that traders may be more interested in rules of the following kind: R3: If the prices of IBM and SUN go up, Microsoft's will most likely (80% of the time) go up the next day. Unfortunately, current association rule miners cannot discover rules of this kind.
Since stock movement prediction is time-related, we thought sequential pattern discovery [3] might be of help. To apply sequential pattern mining techniques, we reorganize the database as follows: each stock corresponds to a customer, and transactions are represented by ups and downs. The rules that can be found are like R4: 80% of stocks will go up after 3 consecutive losses. This is not really what we want.
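To make the contrast between the same-day rule R2 and the next-day rule R3 concrete, the following is a minimal Python sketch of the transaction construction described above, together with a support and confidence count for a single given rule. It is an illustration only and not a mining algorithm from this paper: the function names `build_winner_transactions` and `rule_stats`, the `lag` parameter, the strict "more than x%" reading of a winner, and the toy closing prices are all assumptions introduced here. A `lag` of 0 checks the rule body and head in the same transaction, while a `lag` of 1 checks the head in the following day's transaction.

```python
from typing import Dict, FrozenSet, Iterable, List, Sequence, Tuple


def build_winner_transactions(
    closes: Dict[str, Sequence[float]], overhead: float
) -> List[FrozenSet[str]]:
    """Build one transaction per trading day (starting from the second day).

    A stock is a "winner" on a day if its close rose by more than `overhead`
    (a fraction, e.g. 0.01 for 1%) over the previous day's close.
    """
    n_days = len(next(iter(closes.values())))
    transactions = []
    for day in range(1, n_days):
        winners = frozenset(
            name
            for name, prices in closes.items()
            if prices[day] > prices[day - 1] * (1.0 + overhead)
        )
        transactions.append(winners)
    return transactions


def rule_stats(
    transactions: Sequence[FrozenSet[str]],
    body: Iterable[str],
    head: Iterable[str],
    lag: int = 0,
) -> Tuple[float, float]:
    """Return (support, confidence) of: body on day t => head on day t + lag."""
    body_set, head_set = set(body), set(head)
    # Pair each day with the day `lag` positions later; lag=0 pairs a day with itself.
    pairs = list(zip(transactions, transactions[lag:]))
    body_count = sum(1 for first, _ in pairs if body_set <= first)
    both_count = sum(
        1 for first, later in pairs if body_set <= first and head_set <= later
    )
    support = both_count / len(pairs) if pairs else 0.0
    confidence = both_count / body_count if body_count else 0.0
    return support, confidence


# Made-up closing prices for six trading days (illustrative data only).
closes = {
    "IBM": [100, 102, 101, 104, 103, 106],
    "SUN": [50, 52, 51, 53, 52, 54],
    "MSFT": [80, 80, 83, 82, 85, 84],
}
transactions = build_winner_transactions(closes, overhead=0.01)

# R2-style rule: IBM and SUN up => MSFT up on the same day.
print(rule_stats(transactions, {"IBM", "SUN"}, {"MSFT"}, lag=0))  # (0.0, 0.0)
# R3-style rule: IBM and SUN up => MSFT up on the next day.
print(rule_stats(transactions, {"IBM", "SUN"}, {"MSFT"}, lag=1))  # (0.5, 1.0)
```

In this toy data the same-day rule never holds, while the next-day rule holds every time its body occurs, which is exactly the kind of relationship a counter restricted to items within a single transaction would miss.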
There is a fundamental difference between rule R3 and the other rules. The classical association rules express the associations among items purchased by one customer or share price movement within a day, i.e., associations among items within the same transaction record. We call them intra-transaction association rules. Sequential pattern discovery is also intra-transaction mining in nature because each sequence …