## Histogram Analysis in Performance Testing

In this blog we are going to see some interesting topic Histogram Analysis in Performance Testing. Test analysis is a critical phase in performance testing. It requires experience and some statistical skills to analyze and provide recommendations.

In LoadRunner or Jmeter or Rational Performance Tester or CA LISA provides following parameters in the test results: Average response time, mean, median, standard deviation, percentile, throughput etc.

First we will see the definitions and the MS Excel formula to calculate it. Average  which is the arithmetic mean, and is calculated by adding a group of numbers and then dividing by the count of those numbers. For example, the average of 2, 3, 3, 5, 7, and 10 is 30 divided by 6, which is 5. Formula : =AVERAGE(A1:A20)

Mode: Returns the most frequently occurring, or repetitive, value in an array or range of data. Formula : =MODE(A1:A20)

Median   which is the middle number of a group of numbers; that is, half the numbers have values that are greater than the median, and half the numbers have values that are less than the median. For example, the median of 2, 3, 3, 5, 7, and 10 is 4.

A percentile (or a centile) is a measure used in statistics indicating the value below which a given percentage of observations in a group of observations fall. To find 90th percentile, use =PERCENTILE(A1:A20,0.9)

Histogram is a bar data chart which helps you to create a picture of data distribution. E.g. if you have a data of {1,1,2,4,4,6,8,9,10,10,12}. Number of ones: 2, number of two: 1, number of fours: 2 and so on. But in case if the data is unique, bar chart will be one for all the data. Hence, we create bin, bin is the data range, i.e.

Data range                          Frequency

0-2                                          2

3-5                                          2

6-8                                          2

9-11                                       3

12-14                                     1

If we plot the graph, it will look like below.

Now we will see how to plot histogram graph to analyze the performance testing results. Following data is required and you should have Excel 2003 or higher version:

• Response time
• Timestamp (optional)
• Analysis ToolPak – default Excel Add-in

How to enable Analysis ToolPak add-in in MS Excel 2010?

• Open Excel
• Go to File > Options
• In right side pane, select Excel Add-ins and click on Go
• Check Analysis ToolPak and Analysis ToolPak – VBA
• Click on OK to enable
• Go to Data tab, you could be able to see Data Analysis under Analysis group.

How to plot Histogram analysis in MS Excel 2010?

Download this excel sheet for your reference. In Response Time tab, you could see the data for your analysis. I have around 18000 rows of data in that sheet. First step is you need to calculate Standard Deviation, Average, Median, Min, Max, and 90th/95th percentile as shown below.

You can check the excel formulas for the bin calculation. Now to plot Histogram, go to Data > Data Analysis and then select Histogram, click on OK.

Select the Response Time as a input range and select calculated Bin as a bin range as shown below and then click on OK. Bin and Frequency will get generated in the new sheet (Sheet 4). To plot a chart, select the data range and insert chart by navigating Insert > Scatter chart, Scatter chart looks good for Histogram analysis.

To generate Cumulative data, select Cumulative Percentage in Histogram window as shown above. Below is the final histogram graph which represents the response time spread.

To simply put, list out the response time in column A, find out the maximum response time say 5000. In Column B, create a bin range from 100 to 5000 in the interval of 100 as shown in Your Response Time sheet. Generate a Histogram for that input range.

Below is the screencast on how to do histogram analysis in performance testing: