TY - GEN
T1 - Aspect mining using model-based clustering
AU - McFadden, Renata Rand
AU - Mitropoulos, Frank J.
PY - 2012
Y1 - 2012
N2 - Legacy systems contain critical and complex business code that has been in use for a long time. This code is difficult to understand, maintain, and evolve, in large part due to crosscutting concerns: software system features, such as persistence, logging, and error handling, whose implementation is spread across multiple modules. Aspect-oriented techniques separate crosscutting concerns from the base code, using separate modules called aspects, and thus simplify the legacy code. Aspect mining techniques identify aspect candidates so that the legacy code can then be refactored into aspects. This study shows that model-based clustering using a carefully selected vector-space of features can be more effective than extant aspect mining methods based on heuristic methods such as hierarchical or partitional clustering. Three model-based algorithms were experimentally compared against existing heuristic methods, such as k-means clustering and agglomerative hierarchical clustering, using six different vector-space models. Model-based algorithms performed better in not spreading the methods of the concerns across multiple clusters and were significantly better at partitioning the data such that, given an ordered list of clusters, fewer clusters and methods needed to be analyzed to find all the concerns. In addition, model-based algorithms automatically determined the optimal number of clusters, a great advantage over the heuristic-based algorithms. Lastly, the newly defined vector-space models performed better, relative to aspect mining, than the previously defined vector-space models.
AB - Legacy systems contain critical and complex business code that has been in use for a long time. This code is difficult to understand, maintain, and evolve, in large part due to crosscutting concerns: software system features, such as persistence, logging, and error handling, whose implementation is spread across multiple modules. Aspect-oriented techniques separate crosscutting concerns from the base code, using separate modules called aspects, and thus simplify the legacy code. Aspect mining techniques identify aspect candidates so that the legacy code can then be refactored into aspects. This study shows that model-based clustering using a carefully selected vector-space of features can be more effective than extant aspect mining methods based on heuristic methods such as hierarchical or partitional clustering. Three model-based algorithms were experimentally compared against existing heuristic methods, such as k-means clustering and agglomerative hierarchical clustering, using six different vector-space models. Model-based algorithms performed better in not spreading the methods of the concerns across multiple clusters and were significantly better at partitioning the data such that, given an ordered list of clusters, fewer clusters and methods needed to be analyzed to find all the concerns. In addition, model-based algorithms automatically determined the optimal number of clusters, a great advantage over the heuristic-based algorithms. Lastly, the newly defined vector-space models performed better, relative to aspect mining, than the previously defined vector-space models.
KW - Aspect Mining
KW - Aspect-Oriented Programming
KW - Crosscutting Concerns
KW - Fan-in metric
KW - Heuristic-Based Clustering
KW - Model-Based Clustering
KW - Software Metrics
U2 - 10.1109/SECon.2012.6196984
DO - 10.1109/SECon.2012.6196984
M3 - Conference contribution
AN - SCOPUS:84861494387
SN - 9781467313742
T3 - 2012 Proceedings of IEEE SoutheastCon
BT - 2012 Proceedings of IEEE SoutheastCon, SOUTHEASTCON 2012
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2012 IEEE SoutheastCon, SOUTHEASTCON 2012
Y2 - 15 March 2012 through 18 March 2012
ER -