Locating White Box Reuse Via Data Mining
Margot Postema, Heinz Schmidt, Xindong Wu
1998-01-01
Abstract:

Figure 1: The Abstraction Technique

ing [1]. Hence, the attribute values are interfaces, which can be expanded to show further values. We will provide a three-dimensional view, similar to a drilling technique in executive information systems. As a demonstration, consider the feature-name example attribute. A first pass over the data indicates that two features may be candidates for abstraction. Both of these features are then selected for further analysis, revealing their attribute values. A second pass over this data identifies additional knowledge and assists the abstraction process. With a tuning aid approach [13], the technique highlights candidate modules for restructuring, allowing the user to request further analysis. The user then either accepts the suggestions and continues the abstraction, or rejects them.

4 Conclusion

Reverse engineering of legacy code is mostly a manual process, aided by tools. We have described the problems of understanding and comprehending legacy code, and outlined the current state of reverse engineering research. The focus is to identify modular units (or objects) within legacy code, for either conversion to new systems or restructuring of current systems. Abstraction and restructuring techniques are mostly manual. Our technique can be used for knowledge discovery of white box reuse in source code that is a candidate for redesign and abstraction to black box components. The technique will not be limited to a specific programming language, and can be modified for areas such as rule discovery in business process reengineering and database mining. The system design incorporates user input to guide the discovery process. We have demonstrated that an all-purpose system with a blackboard architecture can be built, where the user acts as the controller.
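The two-pass, user-controlled process described above can be illustrated with a minimal sketch. It assumes that each feature is described by named sets of attribute values and that a partial match is scored with Jaccard similarity; the paper specifies neither. The names Feature, overlap, first_pass, second_pass, run, and the sample data in FEATURES are all hypothetical and are not taken from the authors' system.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Feature:
    """A source-code feature and its attribute values (hypothetical layout)."""
    name: str
    attributes: dict


def overlap(a, b):
    """Jaccard similarity of two sets: 1.0 is a full match, less is a partial match."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def first_pass(features, attribute, threshold):
    """Pass 1: compare one attribute across all feature pairs and flag candidates."""
    candidates = []
    for left, right in combinations(features, 2):
        score = overlap(left.attributes.get(attribute, frozenset()),
                        right.attributes.get(attribute, frozenset()))
        if score >= threshold:
            candidates.append((left, right, score))
    return candidates


def second_pass(left, right):
    """Pass 2: expand a selected pair and score every attribute they carry."""
    names = sorted(set(left.attributes) | set(right.attributes))
    return {n: overlap(left.attributes.get(n, frozenset()),
                       right.attributes.get(n, frozenset()))
            for n in names}


def run(features, attribute, threshold, controller):
    """The controller (standing in for the user) accepts or rejects each suggestion."""
    accepted = []
    for left, right, _score in first_pass(features, attribute, threshold):
        detail = second_pass(left, right)
        if controller(left.name, right.name, detail):
            accepted.append((left.name, right.name))
    return accepted


# Hypothetical sample data: two near-duplicate routines and one unrelated routine.
FEATURES = [
    Feature("print_invoice", {
        "calls": frozenset({"open_file", "format_line", "close_file"}),
        "params": frozenset({"path", "items"}),
    }),
    Feature("print_receipt", {
        "calls": frozenset({"open_file", "format_line", "close_file", "log"}),
        "params": frozenset({"path", "items"}),
    }),
    Feature("compute_tax", {
        "calls": frozenset({"round"}),
        "params": frozenset({"amount"}),
    }),
]
```

In an interactive tool, the controller callback would present the second-pass detail to the user and return their decision; a call such as run(FEATURES, "calls", 0.5, lambda left, right, detail: min(detail.values()) >= 0.5) merely stands in for that decision with a fixed rule.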
The source code, transformed to three-dimensional attribute values, is used to identify, match (or partially match), and analyse candidates for abstraction. Furthermore, the technique can be applied to other data mining applications.

References

[1] J. Beck and D. Eichmann. Program and interface slicing for reverse engineering. In R.C. Waters & E.J. Chikofsky, editors, Working Conference on Reverse Engineering, IEEE Computer Society Press, p. 54-63, 1993.
[2] T.J. Biggerstaff. The concept assignment problem in program understanding. In R.C. Waters & E.J. Chikofsky, editors, Proceedings Working Conference on Reverse Engineering, IEEE Computer Society Press, p. 27-73, 1993.
[3] Blackboard Technology Group, Inc. http://www.bbtech.com.
[4] R.W. Bowdidge and W.G. Griswold. Automated Support for Encapsulating Abstract Data Types. In Proc. 2nd ACM SIGSOFT Symposium on Foundations of Software Engineering, December, 19, 5, p. 97-110, 1994.
[5] Y.R. Chen, G.S. Fowler and R.S. Wallach. Ciao: A graphical navigator for software and document repositories. In G. Caldiera and K. Bennett, editors, International Conference on Software Maintenance, IEEE Computer Society Press, p. 66-75, 1995.
[6] B. Childs and J. Sametinger. Reuse Measurement with Line and Word Runs. In C. Mingins, R. Duke, and B. Meyer, editors, Technology of Object-Oriented Languages and Systems: TOOLS 21, Monash Printing Services, Melbourne, Australia, p. 91-103, 1996.
[7] W.W. Cohen. Recovering software specifications with inductive logic programming. In Proceedings of the Twelfth National Conference on Artificial Intelligence, AAAI Press, 1, p. 142-148, 1994.
[8] P. Freeman. A conceptual analysis of the Draco approach to constructing software systems. In IEEE Transactions on Software Engineering, 13, p. 830-844, 1987.
[9] G. Maughan. Object-oriented architectural restructuring through abstraction and re-implementation. PhD Dissertation, Department of Software Development, Monash University, 1996.
[10] B. Meyer. Object-oriented Software Construction. Prentice Hall, 1988.
[11] H.L. Ong, S. Long and H.Y. Lee. Machine Discovery of Static Software Reuse Potential Metrics. In Proc. 2nd Singapore Intl. Conf. on Intelligent Systems (SPICIS'94), Singapore, p. B289-B291, 1994.
[12] A. Quilici and D.N. Chin. A cooperative environment for reverse-engineering legacy software. In L. Wills, P. Newcomb and E. Chikofsky, editors, Second Working Conference on Reverse Engineering, IEEE Computer Society Press, Canada, p. 156-165, 1995.
[13] M. Postema, X. Wu and T. Menzies. A Tuning Aid for Discretization in Rule Induction. In H. Lu, H. Motoda and H. Liu, editors, KDD: Techniques and Applications, World Scientific, p. 79-87, 1997.
[14] Reasoning Inc. http://www.reasoning.com, August, 1997.
[15] S. Rugaber, K. Stirewalt and L.M. Wills. Understanding Interleaved Code. In Automated Software Engineering Special Issue: Reverse Engineering, Kluwer Academic Publishers, June, 3, 1/2, p. 47-79, 1996.
[16] H.M. Sneed. Reverse engineering as a bridge to case. In L. Wills, P. Newcomb & E. Chikofsky, editors, Second Working Conference on Reverse Engineering, IEEE Computer Society Press, Canada, p. 300-313, 1995.
[17] M. Weiser. Program slicing. In IEEE Transactions on Software Engineering, IEEE Computer Society, July, SE-10, p. 352-357, 1984.