AspectAssay: A Technique for Expanding the Pool of Available Aspect Mining Test Data Using Concern Seeding

  • David Gerald Moore

    Student thesis: Doctoral Thesis, Doctor of Philosophy

    Abstract

    Aspect-oriented software design (AOSD) enables better and more complete separation of concerns in software-intensive systems. By extracting aspect code and relegating crosscutting functionality to aspects, software engineers can improve the maintainability of their code by reducing code tangling and coupling of code concerns. Further, the number of software defects has been shown to correlate with the number of non-encapsulated nonfunctional crosscutting concerns in a system.

    Aspect-mining is a technique that uses data mining techniques to identify existing aspects in legacy code. Unfortunately, there is a lack of suitably-documented test data for aspect-mining research and none that is fully representative of large-scale legacy systems.

    Using a new technique called concern seeding--based on the decades-old concept of error seeding--a tool called AspectAssay (akin to the radioimmunoassay test in medicine) was developed. The concern seeding technique allows researchers to seed existing legacy code with nonfunctional crosscutting concerns of known type, location, and quantity, thus greatly increasing the pool of available test data for aspect mining research.

    Nine seeding test cases were run on a medium-sized codebase using the AspectAssay tool. Each test case seeded a different concern type (data validation, tracing, and observer) and attempted to achieve target values for each of three metrics: 0.95 degree of scattering across methods (DOSM), 0.95 degree of scattering across classes (DOSC), and 10 concern instances. The results were manually verified for their accuracy in producing concerns with known properties (i.e., type, location, quantity, and scattering). The resulting code compiled without errors and was functionally identical to the original. The achieved metrics averaged better than 99.9% of their target values.

    Following the small tests, each of the three previously mentioned concern types was seeded with a wide range of target metric values on each of two codebases--one medium-sized and one large codebase. The tool targeted DOSM and DOSC values in the range 0.01 to 1.00. The tool also attempted to reach target number of concern instances from 1 to 100. Each of these 1,800 test cases was attempted ten times (18,000 total trials). Where mathematically feasible (as permitted by scattering formulas), the tests tended to produce code that closely matched target metric values.

    Each trial's result was expressed as a percentage of its target value. There were 903 test cases that averaged at least 0.90 of their targets. For each test case's ten trials, the standard deviation of those trials' percentages of their targets was calculated. There was an average standard deviation in all the trials of 0.0169. For the 808 seed attempts that averaged at least 0.95 of their targets, the average standard deviation across the ten trials for a particular target was only 0.0022. The tight grouping of trials for their test cases suggests a high repeatability for the AspectAssay technique and tool.

    The concern seeding technique opens the door for expansion of aspect mining research. Until now, such research has focused on small, well-documented legacy programs. Concern seeding has proved viable for producing code that is functionally identical to the original and contains concerns with known properties. The process is repeatable and precise across multiple seeding attempts and also accurate for many ranges of target metric values.

    Just like error seeding is useful in identifying indigenous errors in programs, concern seeding could also prove useful in estimating indigenous nonfunctional crosscutting concerns, thus introducing a new method for evaluating the performance of aspect mining algorithms.
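
    The repeatability measure described in the abstract (each trial expressed as a fraction of its target, then a standard deviation over a test case's ten trials) can be illustrated with a minimal sketch. This is not the AspectAssay tool's code: the function name and the trial values are hypothetical, and the use of the population standard deviation is an assumption, since the abstract does not say whether the population or sample form was used.

```python
from statistics import mean, pstdev


def summarize_test_case(target, achieved_values):
    """Return (mean ratio, standard deviation of ratios) for one test case.

    Each achieved value is divided by the target, so a ratio of 1.0 means
    the trial hit its target exactly.
    """
    ratios = [achieved / target for achieved in achieved_values]
    # Assumption: population standard deviation over the trials.
    return mean(ratios), pstdev(ratios)


# Hypothetical data: ten trials aiming for a DOSM target of 0.95.
trials = [0.95, 0.94, 0.95, 0.96, 0.95, 0.95, 0.94, 0.95, 0.95, 0.96]
avg_ratio, spread = summarize_test_case(0.95, trials)

# Thresholds taken from the abstract: at least 0.90 or 0.95 of the target.
meets_090 = avg_ratio >= 0.90
meets_095 = avg_ratio >= 0.95
```

    Averaging the per-test-case `spread` values over all test cases, or over only those that pass a threshold, would give figures of the same kind as the 0.0169 and 0.0022 reported above.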
    Date of Award: Jan 1 2013
    Original language: English
    Supervisor: Sumitra Mukherjee (Supervisor), Michael J Laszlo (Advisor), Sumitra Mukherjee (Advisor) & William Tribbey (Advisor)
