Monday, June 3, 2019

Metrics and Models in Software Testing

How do we measure the progress of testing? When do we release the software? Why do we devote more time and resources to testing a particular module? What is the reliability of software at the time of release? Who is responsible for the selection of a poor test suite? How many faults do we expect during testing? How much time and resources are required to test a software? How do we know the effectiveness of a test suite? We may keep on framing such questions without much effort. However, finding answers to such questions is not easy and may require a significant amount of effort. Software testing metrics may help us to measure and quantify many things, which may in turn provide answers to such important questions.

10.1 Software Metrics

"What cannot be measured cannot be controlled" is a reality in this world. If we want to control something, we should first be able to measure it. Therefore, everything should be measurable. If a thing is not measurable, we should make an effort to make it measurable. The area of measurement is very important in every field, and we have mature and established metrics to quantify various things. However, in software engineering this area of measurement is still in its developing stage and may require significant effort to make it mature, scientific and effective.

10.1.1 Measure, Measurement and Metrics

These terms are often used interchangeably. However, we should understand the difference between them. Pressman explained this clearly as [PRES05]: A measure provides a quantitative indication of the extent, amount, dimension, capacity or size of some attributes of a product or process. Measurement is the act of determining a measure.
A metric is a quantitative measure of the degree to which a product or process possesses a given attribute. For example, a measure is the number of failures experienced during testing. Measurement is the way of recording such failures. A software metric may be the average number of failures experienced per hour during testing.

Fenton [FENT04] has defined measurement as: It is the process by which numbers or symbols are assigned to attributes of entities in the real world in such a way as to describe them according to clearly defined rules.

The basic issue is that we want to measure every attribute of an entity, and we should have established metrics to do so. However, we are still in the process of developing metrics for some attributes of various entities used in software engineering.

Software metrics can be defined as [GOOD93]: The continuous application of measurement based techniques to the software development process and its products to supply meaningful and timely management information, together with the use of those techniques to improve that process and its products.

Many things are covered in this definition. Software metrics are related to measures which, in turn, involve numbers for quantification. These numbers are used to produce a better product and to improve its related process. We may like to measure quality attributes such as testability, complexity, reliability, maintainability, capability, portability, enhanceability, usability, etc. for a software. We may also like to measure size, effort, development time and resources for a software.

10.1.2 Applications

Software metrics are applicable in all phases of the software development life cycle. In the software requirements and analysis phase, where the output is the SRS document, we may have to estimate the cost, manpower requirement and development time for the software. The customer may like to know the cost of the software and the development time before signing the contract.
As we all know, the SRS document acts as a contract between customer and developer. The readability and effectiveness of the SRS document may help to increase the confidence level of the customer and may provide better foundations for designing the product. Some metrics are available for cost and size estimation, like COCOMO, the Putnam resource allocation model, the function point estimation model, etc. Some metrics are also available for the SRS document, like number of mistakes found during verification, change request frequency, readability, etc.

In the design phase, we may like to measure the stability of a design, coupling amongst modules, cohesion of a module, etc. We may also like to measure the amount of data input to a software, processed by the software and produced by the software. A count of the amount of data input to, processed in, and output from software is called a data structure metric. Many such metrics are available, like number of variables, number of operators, number of operands, number of live variables, variable spans, module weakness, etc. Some information flow metrics are also popular, like FAN IN, FAN OUT, etc. Use cases may also be used to design metrics, like counting actors, counting use cases, counting number of links, etc. Some metrics may also be designed for various applications of websites, like number of static web pages, number of dynamic web pages, number of internal page links, word count, number of static and dynamic content objects, time taken to search a web page and retrieve the desired information, similarity of web pages, etc.

Software metrics have a number of applications during the implementation phase and after the completion of such a phase. Halstead software size measures are applicable after coding, like token count, program length, program volume, program level, difficulty, estimation of time and effort, language level, etc. Some complexity measures are also popular, like cyclomatic complexity, knot count, feature count, etc.
Software metrics have found a good number of applications during testing. One area is reliability estimation, where popular models are Musa's basic execution time model and the logarithmic Poisson execution time model. The Jelinski Moranda model [JELI72] is also used for the calculation of reliability. Source code coverage metrics are available that calculate the percentage of source code covered during testing. Test suite effectiveness may also be measured. Number of failures experienced per unit of time, number of paths, number of independent paths, number of du paths, percentage of statement coverage and percentage of branch conditions covered are also useful software metrics.

The maintenance phase may have many metrics, like number of faults reported per year, number of requests for changes per year, percentage of source code modified per year, percentage of obsolete source code per year, etc.

We may find a number of applications of software metrics in every phase of the software development life cycle. They provide meaningful and timely information which may help us to take corrective actions as and when required. Effective implementation of metrics may improve the quality of software and may help us to deliver the software on time and within budget.

10.2 Categories of Metrics

There are two broad categories of software metrics, namely product metrics and process metrics. Product metrics describe the characteristics of the product such as size, complexity, design features, performance, efficiency, reliability, portability, etc. Process metrics describe the effectiveness and quality of the processes that produce the software product. Examples are effort required in the process, time to produce the product, effectiveness of defect removal during development, number of defects found during testing and maturity of the process [AGGA08].

10.2.1 Product metrics for testing

These metrics provide information about the testing status of a software product.
The data for such metrics are also generated during testing and may help us to know the quality of the product. Some of the basic metrics are given as:

(i) Number of failures experienced in a time interval
(ii) Time interval between failures
(iii) Cumulative failures experienced up to a specified time
(iv) Time of failure
(v) Estimated time for testing
(vi) Actual testing time

With these basic metrics, we may find some additional metrics as given below:

(i) Average time interval between failures
(ii) Maximum and minimum failures experienced in any time interval
(iii) Average number of failures experienced in time intervals
(iv) Time remaining to complete the testing

We may design similar metrics to find indications about the quality of the product.

10.2.2 Process metrics for testing

These metrics are developed to monitor the progress of testing, the status of design and development of test cases, and the outcome of test cases after execution.

Some of the basic process metrics are given below:

(i) Number of test cases designed
(ii) Number of test cases executed
(iii) Number of test cases passed
(iv) Number of test cases failed
(v) Test case execution time
(vi) Total execution time
(vii) Time spent for the development of a test case
(viii) Total time spent for the development of all test cases

On the basis of the above direct measures, we may design the following additional metrics, which may convert the base metric data into more useful information:

(i) % of test cases executed
(ii) % of test cases passed
(iii) % of test cases failed
(iv) Total actual execution time / total estimated execution time
(v) Average execution time of a test case

These metrics, although simple, may help us to know the progress of testing and may provide meaningful information to the testers and the project manager.

An effective test plan may force us to capture data and convert it into useful metrics for both process and product.
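
The derived process metrics are simple ratios of the base counts. A minimal Python sketch follows. The names `RunCounts` and `derived_process_metrics` are illustrative, and the choice of denominators (executed tests relative to designed tests, passed and failed tests relative to executed tests) is an assumption, since the text does not fix them.

```python
from dataclasses import dataclass


@dataclass
class RunCounts:
    """Base process metrics for one test cycle (hypothetical container)."""
    designed: int               # number of test cases designed
    executed: int               # number of test cases executed
    passed: int                 # number of test cases passed
    failed: int                 # number of test cases failed
    actual_exec_time: float     # total actual execution time
    estimated_exec_time: float  # total estimated execution time, same unit


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def derived_process_metrics(c: RunCounts) -> dict:
    """Convert the base counts into the five derived process metrics."""
    return {
        # Assumption: executed tests are measured against designed tests.
        "pct_executed": 100 * _ratio(c.executed, c.designed),
        # Assumption: passed and failed tests are measured against executed tests.
        "pct_passed": 100 * _ratio(c.passed, c.executed),
        "pct_failed": 100 * _ratio(c.failed, c.executed),
        "actual_vs_estimated_time": _ratio(c.actual_exec_time, c.estimated_exec_time),
        "avg_exec_time_per_test": _ratio(c.actual_exec_time, c.executed),
    }


if __name__ == "__main__":
    # Made-up sample numbers, used only to exercise the function.
    counts = RunCounts(designed=200, executed=150, passed=120, failed=30,
                       actual_exec_time=300.0, estimated_exec_time=400.0)
    for name, value in derived_process_metrics(counts).items():
        print(f"{name}: {value:.2f}")
```

Returning 0.0 for an empty denominator keeps a progress report usable on the first day of testing, when no test case has been executed yet. A project may prefer to report "not available" instead.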
The test plan also guides the organization for future projects and may also suggest changes in the existing processes in order to produce a good quality, maintainable software product.

10.3 Object Oriented Metrics used in Testing

Object oriented metrics capture many attributes of a software and some of them are relevant in testing. Measuring structural design attributes of a software system, such as coupling, cohesion or complexity, is a promising approach towards early quality assessments. There are several metrics available in the literature to capture the quality of design and source code.

10.3.1 Coupling Metrics

Coupling relations increase complexity, reduce encapsulation and potential reuse, and limit understanding and maintainability. The coupling metrics require information about attribute usage and method invocations of other classes. These metrics are given in table 10.1. Higher values of coupling metrics indicate that a class under test will require a larger number of stubs during testing. In addition, each interface will need to be tested thoroughly.

| Metric | Definition | Source |
|---|---|---|
| Coupling between Objects (CBO) | CBO for a class is a count of the number of other classes to which it is coupled. | CHID94 |
| Data Abstraction Coupling (DAC) | Data abstraction is a technique of creating new data types suited for an application to be programmed. DAC = number of ADTs defined in a class. | LI93 |
| Message Passing Coupling (MPC) | It counts the number of send statements defined in a class. | |
| Response for a Class (RFC) | It is defined as the set of methods that can potentially be executed in response to a message received by an object of that class. It is given by $RFC = \lvert RS \rvert$, where $RS$, the response set of the class, is given by $RS = \{M\} \cup_{\text{all } i} \{R_i\}$, with $\{M\}$ the set of all methods in the class and $\{R_i\}$ the set of methods called by method $i$. | CHID94 |
| Information flow-based coupling (ICP) | The number of methods invoked in a class, weighted by the number of parameters of the methods invoked. | LEE95 |
| Information flow-based inheritance coupling (IHICP) | Same as ICP, but only counts method invocations of ancestors of classes. | |
| Information flow-based non-inheritance coupling (NIHICP) | Same as ICP, but only counts method invocations of classes not related through inheritance. | |
| Fan-in | Count of modules (classes) that call a given class, plus the number of global data elements. | BINK98 |
| Fan-out | Count of modules (classes) called by a given module, plus the number of global data elements altered by the module (class). | BINK98 |

Table 10.1 Coupling Metrics

10.3.3 Inheritance Metrics

Inheritance metrics require information about the ancestors and descendants of a class. They also collect information about methods overridden, inherited and added (i.e. neither inherited nor overridden). These metrics are summarized in table 10.3. If a class has a larger number of children (or subclasses), more testing may be required for the methods of that class. The greater the depth of the inheritance tree, the more complex the design, as more methods and classes are involved. Thus, we may test all the inherited methods of a class, and the testing effort will increase accordingly.

| Metric | Definition | Source |
|---|---|---|
| Number of Children (NOC) | The NOC is the number of immediate subclasses of a class in a hierarchy. | CHID94 |
| Depth of Inheritance Tree (DIT) | The depth of a class within the inheritance hierarchy is the maximum number of steps from the class node to the root of the tree and is measured by the number of ancestor classes. | |
| Number of Parents (NOP) | The number of classes that a class directly inherits from (i.e. multiple inheritance). | LORE94 |
| Number of Descendants (NOD) | The number of subclasses (both directly and indirectly inherited) of a class. | |
| Number of Ancestors (NOA) | The number of superclasses (both directly and indirectly inherited) of a class. | TEGA92 |
| Number of Methods Overridden (NMO) | When a method in a subclass has the same name and type signature as in its superclass, then the method in the superclass is said to be overridden by the method in the subclass. | LORE94 |
| Number of Methods Inherited (NMI) | The number of methods that a class inherits from its super (ancestor) class. | |
| Number of Methods Added (NMA) | The number of new methods added in a class (neither inherited, nor overriding). | |

Table 10.3 Inheritance Metrics

10.3.4 Size Metrics

Size metrics indicate the length of a class in terms of lines of source code and methods used in the class. These metrics are given in table 10.4. If a class has a larger number of methods with greater complexity, then more test cases will be required to test that class. When a class with a larger number of methods with greater complexity is inherited, it will require more rigorous testing. Similarly, a class with a larger number of public methods will require thorough testing of the public methods, as they may be used by other classes.

| Metric | Definition | Source |
|---|---|---|
| Number of Attributes per Class (NA) | It counts the total number of attributes defined in a class. | |
| Number of Methods per Class (NM) | It counts the number of methods defined in a class. | |
| Weighted Methods per Class (WMC) | The WMC is the sum of the complexities of all methods in a class. Consider a class K1 with methods M1, ..., Mn that are defined in the class. Let C1, ..., Cn be the complexities of the methods. Then $WMC = \sum_{i=1}^{n} C_i$. | CHID94 |
| Number of public methods (PM) | It counts the number of public methods defined in a class. | |
| Number of non-public methods (NPM) | It counts the number of private methods defined in a class. | |
| Lines Of Code (LOC) | It counts the lines in the source code. | |
Table 10.4 Size Metrics

10.4 What should we measure during testing?

We should measure everything (if possible) which we want to control and which may help us to find answers to the questions given at the beginning of this chapter. Test metrics may help us to measure the current performance of any project. The collected data may become historical data for future projects. This data is very important because, in the absence of historical data, all estimates are just guesses. Hence, it is essential to record the key information about the current projects. Test metrics may become an important indicator of the effectiveness and efficiency of a software testing process and may also identify risky areas that may need more testing.

10.4.1 Time

We may measure many things during testing with respect to time, and some of them are given as:

1) Time required to run a test case
2) Total time required to run a test suite
3) Time available for testing
4) Time interval between failures
5) Cumulative failures experienced up to a given time
6) Time of failure
7) Failures experienced in a time interval

A test case requires some time for its execution. A measurement of this time may help to estimate the total time required to execute a test suite. This is the simplest metric and may be used to estimate the testing effort. We may calculate the time available for testing at any point in time during testing, if we know the total allotted time for testing. Generally, the unit of time is seconds, minutes or hours per test case. Total testing time may be defined in terms of hours. Time needed to execute a planned test suite may also be defined in terms of hours.

When we test a software, we experience failures. These failures may be recorded in different ways, like time of failure, time interval between failures, cumulative failures experienced up to a given time and failures experienced in a time interval.
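
These ways of recording failures are interchangeable: given the time of each failure, the other three can be derived. The Python sketch below illustrates this using the failure times listed in table 10.5. The function names are illustrative assumptions, not part of the text.

```python
def failure_intervals(failure_times):
    """Time between successive failures; the first is measured from time 0."""
    intervals = []
    previous = 0
    for t in failure_times:
        intervals.append(t - previous)
        previous = t
    return intervals


def cumulative_failures(failure_times, checkpoints):
    """Number of failures observed up to and including each checkpoint."""
    return [sum(1 for t in failure_times if t <= c) for c in checkpoints]


def failures_per_interval(failure_times, width, end):
    """Return (checkpoints, cumulative counts, counts per interval).

    The observation period [0, end] is cut into intervals of equal width.
    """
    if width <= 0 or end < width:
        raise ValueError("need 0 < width <= end")
    checkpoints = list(range(width, end + 1, width))
    cumulative = cumulative_failures(failure_times, checkpoints)
    per_interval = [cumulative[0]]
    per_interval += [b - a for a, b in zip(cumulative, cumulative[1:])]
    return checkpoints, cumulative, per_interval


if __name__ == "__main__":
    # Failure times in minutes, as listed in table 10.5.
    times = [12, 26, 35, 38, 50, 70, 106, 125, 155, 200]
    print("intervals:", failure_intervals(times))
    marks, cumulative, per_interval = failures_per_interval(times, width=20, end=200)
    for mark, total, count in zip(marks, cumulative, per_interval):
        print(f"up to {mark:3d} min: {total:2d} cumulative, {count} in interval")
```

Keeping only the raw failure times is the most flexible choice, because the interval width can then be changed later without losing information.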
Consider table 10.5 and table 10.6, where the time based failure specification and the failure based failure specification are given.

| Sr. no. of failure occurrence | Failure time measured in minutes | Failure interval in minutes |
|---|---|---|
| 1 | 12 | 12 |
| 2 | 26 | 14 |
| 3 | 35 | 9 |
| 4 | 38 | 3 |
| 5 | 50 | 12 |
| 6 | 70 | 20 |
| 7 | 106 | 36 |
| 8 | 125 | 19 |
| 9 | 155 | 30 |
| 10 | 200 | 45 |

Table 10.5 Time based failure specification

| Time in minutes | Cumulative failures | Failures in interval of 20 minutes |
|---|---|---|
| 20 | 1 | 1 |
| 40 | 4 | 3 |
| 60 | 5 | 1 |
| 80 | 6 | 1 |
| 100 | 6 | 0 |
| 120 | 7 | 1 |
| 140 | 8 | 1 |
| 160 | 9 | 1 |
| 180 | 9 | 0 |
| 200 | 10 | 1 |

Table 10.6 Failure based failure specification

These two tables give us an idea about the failure pattern and may help us to define the following:

1) Time taken to experience n failures
2) Number of failures in a particular time interval
3) Total number of failures experienced after a specified time
4) Maximum / minimum number of failures experienced in any regular time interval

10.4.2 Quality of source code

We may know the quality of the delivered source code after a reasonable time of release using a metric based on the following weighted defect counts:

WDB: Number of weighted defects found before release
WDA: Number of weighted defects found after release

The weight for each defect is defined on the basis of defect severity and removal cost. A severity is assigned to each defect by testers based on how important or serious the defect is. A lower value of this metric indicates that fewer errors, or less serious errors, were detected.

We may also calculate the number of defects per executed test case. This may also be used as an indicator of source code quality as the source code progresses through the series of test activities [STEP03].

10.4.3 Source Code Coverage

We may like to execute every statement of a program at least once before its release to the customer. Hence, the percentage of source code coverage may be calculated as:

$$\text{Percentage of source code coverage} = \frac{\text{Number of statements of the source code covered by the test suite}}{\text{Total number of statements of the source code}} \times 100$$

A higher value of this metric gives confidence about the effectiveness of a test suite.
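
As a rough illustration of the coverage percentage, the Python sketch below takes coverage to be the statements covered by the test suite divided by the total number of statements, times 100. It assumes that a coverage tool has already reported which statement (line) numbers each test case executed. The data layout and the names `statement_coverage`, `executable_lines` and `lines_hit_per_test` are hypothetical.

```python
def statement_coverage(executable_lines, lines_hit_per_test):
    """Return (coverage percentage, sorted uncovered lines) for a test suite.

    executable_lines: iterable of statement line numbers in the program.
    lines_hit_per_test: dict mapping a test case name to the line numbers
        it executed (assumed to come from a coverage tool).
    """
    executable = set(executable_lines)
    if not executable:
        raise ValueError("the program has no executable statements")
    hit = set()
    for lines in lines_hit_per_test.values():
        hit |= set(lines)
    covered = hit & executable  # ignore reported lines that are not statements
    percentage = 100 * len(covered) / len(executable)
    return percentage, sorted(executable - covered)


if __name__ == "__main__":
    # Made-up sample: a program with statements on lines 1 to 10.
    hits = {
        "test_login": {1, 2, 3, 4},
        "test_logout": {3, 4, 5, 6, 7},
        "test_timeout": {1, 10},
    }
    percentage, uncovered = statement_coverage(range(1, 11), hits)
    print(f"statement coverage: {percentage:.1f}%")
    print("uncovered lines:", uncovered)
```

Returning the uncovered lines together with the percentage shows directly which statements still lack a test case.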
We should write additional test cases to cover the uncovered portions of the source code.

10.4.4 Test Case Defect Density

This metric may help us to know the efficiency and effectiveness of our test cases.

$$\text{Test case defect density} = \frac{\text{Number of failed test cases}}{\text{Number of executed test cases}} \times 100$$

Where
Failed test case: A test case that, when executed, produced an undesired output.
Passed test case: A test case that, when executed, produced a desired output.

A higher value of this metric indicates that the test cases are effective and efficient, because they are able to detect a larger number of defects.

10.4.5 Review Efficiency

Review efficiency is a metric that gives insight into the quality of the review process carried out during verification. The higher the value of this metric, the better the review efficiency.

10.5 Software Quality Attributes Prediction Models

Software quality is dependent on many attributes like reliability, maintainability, fault proneness, testability, complexity, etc. A number of models are available for the prediction of one or more such attributes of quality. These models are especially beneficial for large-scale systems, where testing experts need to focus their attention and resources on problem areas in the system under development.

10.5.1 Reliability Models

Many reliability models for software are available where the emphasis is on failures rather than faults. We experience failures during execution of any program. A fault in the program may lead to failure(s), depending upon the input(s) given to the program with the purpose of executing it. Hence, time of failure and time between failures may help us to find the reliability of software. As we all know, software reliability is the probability of failure free operation of software in a given time under specified conditions. Generally, we consider the calendar time. We may like to know the probability that a given software will not fail in one month or one week, and so on. However, most of the available models are based on execution time.
The execution time is the time for which the computer actually executes the program. Reliability models based on execution time normally give better results than those based on calendar time. In many cases, we have a mapping table that converts execution time to calendar time for the purpose of reliability studies. In order to differentiate both the timings, execution time is represented by $\tau$ and calendar time by $t$.

Most of the reliability models are applicable at the system testing level. Whenever software fails, we note the time of failure and also try to locate and correct the fault that caused the failure. During system testing, software may not fail at regular intervals and may also not follow a particular pattern. The variation in time between successive failures may be described in terms of the following functions:

$\mu(\tau)$: average number of failures up to time $\tau$
$\lambda(\tau)$: average number of failures per unit time at time $\tau$, known as the failure intensity function

It is expected that the reliability of a program increases due to fault detection and correction over time, and hence the failure intensity decreases accordingly.

(i) Basic Execution Time Model

This is one of the popular models of software reliability assessment and was developed by J.D. Musa [MUSA79] in 1979. As the name indicates, it is based on execution time ($\tau$). The basic assumption is that failures may occur according to a non-homogeneous Poisson process (NHPP) during testing. Many examples may be given of real world events where Poisson processes are used. A few examples are:

* Number of users using a website in a given period of time
* Number of persons requesting railway tickets in a given period of time
* Number of e-mails expected in a given period of time

The failures during testing represent a non-homogeneous process, and failure intensity decreases as a function of time. J.D. Musa assumed that the decrease in failure intensity, as a function of the number of failures observed, is constant and is given as

$$\lambda(\mu) = \lambda_0 \left(1 - \frac{\mu}{\nu_0}\right)$$

Where
$\lambda_0$: Initial failure intensity at the start of testing
$\nu_0$: Total number of failures experienced up to infinite time
$\mu$: Number of failures experienced up to a given point in time

Musa [MUSA79] has also given the relationship between failure intensity ($\lambda$) and the mean failures experienced ($\mu$), and it is given in 10.1.

If we take the first derivative of the equation given above, we get the slope of the failure intensity as given below:

$$\frac{d\lambda}{d\mu} = -\frac{\lambda_0}{\nu_0}$$

The negative sign shows that there is a negative slope, indicating a decreasing trend in failure intensity.

This model also assumes a uniform failure pattern, meaning thereby equal probability of failures due to various faults. The relationship between execution time ($\tau$) and mean failures experienced ($\mu$) is given in 10.2.

The derivation of the relationship of 10.2 may be obtained as

$$\frac{d\mu(\tau)}{d\tau} = \lambda(\tau) = \lambda_0 \left(1 - \frac{\mu(\tau)}{\nu_0}\right)$$

Solving this equation with $\mu(0) = 0$ gives

$$\mu(\tau) = \nu_0 \left(1 - \exp\left(-\frac{\lambda_0 \tau}{\nu_0}\right)\right)$$

The failure intensity as a function of time is given in 10.3. This relationship is useful for calculating the present failure intensity at any given value of execution time. We may find this relationship as

$$\lambda(\tau) = \lambda_0 \exp\left(-\frac{\lambda_0 \tau}{\nu_0}\right)$$

Two additional equations are given to calculate the additional failures required to be experienced to reach a failure intensity objective ($\lambda_F$) and the additional time required to reach the objective. These equations are given as

$$\Delta\mu = \frac{\nu_0}{\lambda_0} \left(\lambda_P - \lambda_F\right)$$

$$\Delta\tau = \frac{\nu_0}{\lambda_0} \ln\left(\frac{\lambda_P}{\lambda_F}\right)$$

Where
$\Delta\mu$: Expected number of additional failures to be experienced to reach the failure intensity objective
$\Delta\tau$: Additional time required to reach the failure intensity objective
$\lambda_P$: Present failure intensity
$\lambda_F$: Failure intensity objective

$\Delta\mu$ and $\Delta\tau$ are very interesting metrics to know the additional failures and additional time required to reach a failure intensity objective.

Example 10.1 A program will experience 100 failures in infinite time. It has now experienced 50 failures. The initial failure intensity is 10 failures/hour.
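Use the basic execution time model for the following:

(a) Find the present failure intensity.
(b) Calculate the decrement of failure intensity per failure.
(c) Determine the failures experienced and failure intensity after 10 and 50 hours of execution.
(d) Find the additional failures and additional execution time needed to reach the failure intensity objective of 2 failures/hour.

Solution

Here the total number of failures in infinite time is $\nu_0 = 100$, the number of failures experienced is $\mu = 50$ and the initial failure intensity is $\lambda_0 = 10$ failures/hour. Execution time is denoted by $\tau$.

(a) Present failure intensity can be calculated using the following equation:

$$\lambda(\mu) = \lambda_0 \left(1 - \frac{\mu}{\nu_0}\right) = 10 \left(1 - \frac{50}{100}\right) = 5 \text{ failures/hour}$$

(b) Decrement of failure intensity per failure can be calculated using the following:

$$\frac{d\lambda}{d\mu} = -\frac{\lambda_0}{\nu_0} = -\frac{10}{100} = -0.1 \text{ per hour}$$

(c) Failures experienced and failure intensity after 10 and 50 hours of execution can be calculated as:

(i) After 10 hours of execution

$$\mu(\tau) = \nu_0 \left(1 - \exp\left(-\frac{\lambda_0 \tau}{\nu_0}\right)\right) = 100 \left(1 - e^{-1}\right) \approx 63.21 \text{ failures}$$

$$\lambda(\tau) = \lambda_0 \exp\left(-\frac{\lambda_0 \tau}{\nu_0}\right) = 10 e^{-1} \approx 3.68 \text{ failures/hour}$$

(ii) After 50 hours of execution

$$\mu(\tau) = 100 \left(1 - e^{-5}\right) \approx 99.33 \text{ failures}$$

$$\lambda(\tau) = 10 e^{-5} \approx 0.067 \text{ failures/hour}$$

(d) $\Delta\mu$ and $\Delta\tau$ with a failure intensity objective of $\lambda_F = 2$ failures/hour and present failure intensity $\lambda_P = 5$ failures/hour:

$$\Delta\mu = \frac{\nu_0}{\lambda_0} \left(\lambda_P - \lambda_F\right) = \frac{100}{10} (5 - 2) = 30 \text{ failures}$$

$$\Delta\tau = \frac{\nu_0}{\lambda_0} \ln\left(\frac{\lambda_P}{\lambda_F}\right) = \frac{100}{10} \ln\left(\frac{5}{2}\right) \approx 9.16 \text{ hours}$$

(ii) Logarithmic Poisson Execution Time Model

With a slight modification in the failure intensity function, Musa presented the logarithmic Poisson execution time model. The failure intensity function is given as

$$\lambda(\mu) = \lambda_0 \exp(-\theta \mu)$$

Where
$\theta$: Failure intensity decay parameter, which represents the relative change of failure intensity per failure experienced.

As before, $\lambda_0$ is the initial failure intensity, $\mu$ the mean failures experienced and $\tau$ the execution time.

The slope of failure intensity is given as

$$\frac{d\lambda}{d\mu} = -\theta \lambda_0 \exp(-\theta \mu) = -\theta \lambda$$

The expected number of failures for this model is always infinite at infinite time. The relation for mean failures experienced is given as

$$\mu(\tau) = \frac{1}{\theta} \ln\left(\lambda_0 \theta \tau + 1\right)$$

The expression for failure intensity with respect to time is given as

$$\lambda(\tau) = \frac{\lambda_0}{\lambda_0 \theta \tau + 1}$$

The relationships for the additional number of failures and additional execution time, for present failure intensity $\lambda_P$ and failure intensity objective $\lambda_F$, are given as

$$\Delta\mu = \frac{1}{\theta} \ln\left(\frac{\lambda_P}{\lambda_F}\right)$$

$$\Delta\tau = \frac{1}{\theta} \left(\frac{1}{\lambda_F} - \frac{1}{\lambda_P}\right)$$

When execution time is large, the logarithmic Poisson model may give larger values of failure intensity than the basic model.

Example 10.2 The initial failure intensity of a program is 10 failures/hour. The program has experienced 50 failures. The failure intensity decay parameter is 0.01/failure.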
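Use the logarithmic Poisson execution time model for the following:

(a) Find the present failure intensity.
(b) Calculate the decrement of failure intensity per failure.
(c) Determine the failures experienced and failure intensity after 10 and 50 hours of execution.
(d) Find the additional failures and additional execution time needed to reach the failure intensity objective of 2 failures/hour.

Solution

(a) Present failure intensity can be calculated as follows:

$\lambda_0$ = 10 failures/hour
$\mu$ = 50 failures
$\theta$ = 0.01/failure

Hence

$$\lambda(\mu) = \lambda_0 \exp(-\theta \mu) = 10 \exp(-0.01 \times 50) \approx 6.065 \text{ failures/hour}$$

(b) Decrement of failure intensity per failure can be calculated as

$$\frac{d\lambda}{d\mu} = -\theta \lambda = -0.01 \times 6.065 \approx -0.061 \text{ per hour}$$

(c) Failures experienced and failure intensity after 10 and 50 hours of execution can be calculated as:

(i) After 10 hours of execution

$$\mu(\tau) = \frac{1}{\theta} \ln\left(\lambda_0 \theta \tau + 1\right) = \frac{1}{0.01} \ln(10 \times 0.01 \times 10 + 1) = 100 \ln 2 \approx 69.31 \text{ failures}$$

$$\lambda(\tau) = \frac{\lambda_0}{\lambda_0 \theta \tau + 1} = \frac{10}{2} = 5 \text{ failures/hour}$$

(ii) After 50 hours of execution

$$\mu(\tau) = 100 \ln(10 \times 0.01 \times 50 + 1) = 100 \ln 6 \approx 179.18 \text{ failures}$$

$$\lambda(\tau) = \frac{10}{6} \approx 1.667 \text{ failures/hour}$$

(d) $\Delta\mu$ and $\Delta\tau$ with a failure intensity objective of $\lambda_F = 2$ failures/hour and present failure intensity $\lambda_P = 6.065$ failures/hour:

$$\Delta\mu = \frac{1}{\theta} \ln\left(\frac{\lambda_P}{\lambda_F}\right) = 100 \ln\left(\frac{6.065}{2}\right) \approx 110.94 \text{ failures}$$

$$\Delta\tau = \frac{1}{\theta} \left(\frac{1}{\lambda_F} - \frac{1}{\lambda_P}\right) = 100 \left(\frac{1}{2} - \frac{1}{6.065}\right) \approx 33.51 \text{ hours}$$

(iii) The Jelinski Moranda Model

The Jelinski Moranda model [JELI72] is the earliest and simplest software reliability model. It proposed a failure intensity function in the form of

$$\lambda(t) = \phi (N - i + 1)$$

Where
$\phi$ = constant of proportionality
$N$ = total number of errors present
$i$ = number of errors found by time interval $t_i$

This model assumes that all failures have the same failure rate. It means that the failure rate is a step function and there will be an improvement in reliability after fixing an error. Hence, every failure contributes equally to the overall reliability. Here, failure intensity is directly proportional to the number of errors remaining in a software.

Once we know the value of the failure intensity function using any reliability model, we may calculate reliability using the equation given below:

$$R(t) = e^{-\lambda t}$$

Where $\lambda$ is the failure intensity and $t$ is the operating time. The lower the failure intensity, the higher the reliability, and vice versa.

Example 10.3 A program may experience 200 failures in infinite time of testing. It has experienced 100 failures.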
Use the Jelinski Moranda model to calculate the failure intensity after the experience of 150 failures.

Solution

Total expected number of failures ($N$) = 200
Failures experienced ($i$) = 100
Constant of proportionality ($\phi$) = 0.02

We know

$$\lambda(t) = \phi (N - i + 1) = 0.02 (200 - 100 + 1) = 2.02 \text{ failures/hour}$$

After 150 failures

$$\lambda(t) = 0.02 (200 - 150 + 1) = 1.02 \text{ failures/hour}$$

Failure intensity will decrease with every additional failure experienced.

10.5.2 An example of fault prediction model in practice

It is clear that software metrics can be used to capture the quality of object oriented design and code. These metrics provide ways to evaluate the quality of software, and their use in earlier phases of software development can help organizations in assessing a large software development quickly, at a low cost.

To help in planning and executing testing by focusing resources on the fault prone parts of the design and code, a model that predicts faulty classes should be used. The fault prediction model can also be used to identify classes that are prone to have severe faults. One can use this model with respect to high severity of faults to focus the testing on those parts of the system that are likely to cause serious failures. In this section, we describe models used to find the relationship between object oriented metrics and fault proneness, and how such models can be of great help in planning and executing testing activities [MALH09, SING10].

In order to perform the analysis, we used the public domain KC1 NASA data set [NASA04]. The data set is available on www.mdp.ivv.nasa.gov. The 145
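
Returning to the reliability models of section 10.5.1, their formulas are easy to check numerically. The Python sketch below evaluates the standard forms of the basic execution time model, the logarithmic Poisson model and the Jelinski Moranda model, using the data of examples 10.1 to 10.3. All function and parameter names are chosen only for illustration.

```python
import math


# Basic execution time model (lam0: initial intensity, nu0: total failures).
def basic_intensity(mu, lam0, nu0):
    """Failure intensity after mu failures: lam0 * (1 - mu / nu0)."""
    return lam0 * (1 - mu / nu0)


def basic_failures(tau, lam0, nu0):
    """Mean failures experienced after tau units of execution time."""
    return nu0 * (1 - math.exp(-lam0 * tau / nu0))


def basic_intensity_at(tau, lam0, nu0):
    """Failure intensity after tau units of execution time."""
    return lam0 * math.exp(-lam0 * tau / nu0)


def basic_additional(lam_p, lam_f, lam0, nu0):
    """Additional (failures, time) to go from intensity lam_p to lam_f."""
    return ((nu0 / lam0) * (lam_p - lam_f),
            (nu0 / lam0) * math.log(lam_p / lam_f))


# Logarithmic Poisson model (theta: failure intensity decay parameter).
def log_intensity(mu, lam0, theta):
    """Failure intensity after mu failures: lam0 * exp(-theta * mu)."""
    return lam0 * math.exp(-theta * mu)


def log_failures(tau, lam0, theta):
    """Mean failures experienced after tau units of execution time."""
    return math.log(lam0 * theta * tau + 1) / theta


def log_intensity_at(tau, lam0, theta):
    """Failure intensity after tau units of execution time."""
    return lam0 / (lam0 * theta * tau + 1)


def log_additional(lam_p, lam_f, theta):
    """Additional (failures, time) to go from intensity lam_p to lam_f."""
    return (math.log(lam_p / lam_f) / theta,
            (1 / lam_f - 1 / lam_p) / theta)


# Jelinski Moranda model and the reliability function.
def jm_intensity(phi, n_total, i):
    """Failure intensity phi * (N - i + 1)."""
    return phi * (n_total - i + 1)


def reliability(lam, t):
    """Probability of failure free operation for time t: exp(-lam * t)."""
    return math.exp(-lam * t)


if __name__ == "__main__":
    # Example 10.1: basic model with nu0 = 100, mu = 50, lam0 = 10.
    lam_p = basic_intensity(50, 10, 100)
    print("basic, present intensity:", lam_p)
    print("basic, after 10 h:", basic_failures(10, 10, 100),
          basic_intensity_at(10, 10, 100))
    print("basic, to reach 2/h:", basic_additional(lam_p, 2, 10, 100))
    # Example 10.2: logarithmic Poisson model with theta = 0.01.
    lam_p = log_intensity(50, 10, 0.01)
    print("log, present intensity:", lam_p)
    print("log, after 10 h:", log_failures(10, 10, 0.01),
          log_intensity_at(10, 10, 0.01))
    print("log, to reach 2/h:", log_additional(lam_p, 2, 0.01))
    # Example 10.3: Jelinski Moranda model with phi = 0.02, N = 200.
    print("JM, after 150 failures:", jm_intensity(0.02, 200, 150))
```

Keeping each formula in its own small function makes it simple to compare the two execution time models for the same initial failure intensity.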
