Metrics used for Evaluating the performance of Web

Printed From: One Stop Testing
Category: Types Of Software Testing @ OneStopTesting
Forum Name: Performance & Load Testing @ OneStopTesting
Forum Description: Discuss all that needs to be known about Performance & Load Testing and its tools.
URL: http://forum.onestoptesting.com/forum_posts.asp?TID=7304
Printed Date: 18Jan2025 at 2:37am


Topic: Metrics used for Evaluating the performance of Web
Posted By: Mithi25
Subject: Metrics used for Evaluating the performance of Web
Date Posted: 09Nov2009 at 12:47am

During a test session, virtual clients generate result data (metrics) as they run

scenarios against an application. These metrics determine the application's

performance, and provide specific information on system errors and individual

functions. Understanding these different metrics will enable you to match them to

the application function and build a more streamlined test plan. (The names may

differ, but the following metrics are provided by most of the popular testing

software.)

 

I. Measuring Scalability and Performance

Depending on your application, scalability and performance testing could be a

priority. For example, if the website is open to the public during a sporting event,

e.g. the Olympics, the application will need to scale effortlessly as millions of people

log on. Conversely, if you are testing an in-house benefits enrollment application, the

number of users is predictable, as is the timeframe for the application's use, so

ensuring availability is a simpler test process.

So, how do you measure the scalability or performance of an application? How do

you compare two sessions and know which session created more stress on the

application? Which statistic should you use to compare two load machines testing the

same application? To start with, it's important to understand what you are testing

when you are looking at site usage.

Hits Per Second

A Hit is a request of any kind made from the virtual client to the application being

tested. The higher the Hits Per Second, the more requests the application is

handling per second.

A virtual client can request an HTML page, image, file, etc. Testing the application

for Hits Per Second will tell you if there is a possible scalability issue with the

application. For example, if the stress on an application increases but the Hits Per

Second does not, there may be a scalability problem in the application.

One issue with this metric is that Hits Per Second treats all requests equally.

Thus a request for a small image and complex HTML generated on the fly will both

be considered as hits. It is possible that out of a hundred hits on the application, the

application server actually answered only one and all the rest were either cached on

the web server or by another caching mechanism.

So, it is very important when looking at this metric to consider what the

application is intended to do and how. Will your users be looking for the same piece of

information over and over again (a static benefit form) or will the same number of

users be engaging the application in a variety of tasks – such as pulling up images,

purchasing items, bringing in data from another site? To create the proper test, it is

important to understand this metric in the context of the application. If you're

testing an application function that requires the site to ‘work,’ as opposed to present

static data, use the pages per second measurement.

Pages Per Second

Pages Per Second measures the number of pages requested from the application per

second. The higher the Pages Per Second, the more work the application is doing per

second. Measuring an explicit request in the script or a frame in a frameset provides

a metric on how the application responds to actual work requests. Thus if a script

contains a Navigate command to a URL, this request is considered a page. If the

HTML that returns includes frames they will also be considered pages, but any other

elements retrieved such as images or JS Files, will be considered hits, not pages.
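
To make the hit/page distinction concrete, here is a minimal sketch. It assumes a hypothetical request log in which every record carries the second it was issued in and a `kind` label; the record layout and the label names are made up for illustration and do not come from any particular testing tool.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Request:
    second: int  # whole second (since test start) in which the request was made
    kind: str    # hypothetical labels: "navigate", "frame", "image", "script"


# Explicit navigations and frames count as pages; every request counts as a hit.
PAGE_KINDS = {"navigate", "frame"}


def per_second_rates(requests):
    """Return (hits, pages) as Counters keyed by second."""
    hits = Counter()
    pages = Counter()
    for req in requests:
        hits[req.second] += 1
        if req.kind in PAGE_KINDS:
            pages[req.second] += 1
    return hits, pages


log = [
    Request(0, "navigate"), Request(0, "image"), Request(0, "image"),
    Request(0, "script"), Request(1, "navigate"), Request(1, "frame"),
    Request(1, "image"),
]
hits, pages = per_second_rates(log)
print(hits[0], pages[0])  # 4 1
print(hits[1], pages[1])  # 3 2
```

Second 0 shows why the two numbers can diverge: four hits, but only one of them is actual page work.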

This measurement is key to the end user's experience of application performance.

There is a correlation between Pages Per Second and the stress inflicted on an

application. If the stress increases, but the Pages Per Second count doesn't, there

may be a scalability issue. For example, if you begin with 75 virtual users requesting

25 different pages concurrently and then scale the users to 150, the Pages Per

Second count should increase. If it doesn't, some of the virtual users aren't getting

their pages. This could be caused by a number of issues and one likely suspect is

throughput.
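
The 75-to-150 comparison can be sketched as a simple check. The function name, the `min_ratio` tolerance and the sample rates are assumptions chosen for illustration; the original only says the count "should increase", so how much growth is enough is a judgment call for your own application.

```python
def scales_with_load(users_before, rate_before, users_after, rate_after, min_ratio=0.8):
    """Return True if the rate grew at least min_ratio times as fast as the user count.

    min_ratio is an illustrative tolerance, not a standard value.
    """
    if users_before <= 0 or rate_before <= 0:
        raise ValueError("baseline users and rate must be positive")
    load_growth = users_after / users_before
    rate_growth = rate_after / rate_before
    return rate_growth >= min_ratio * load_growth


# 75 -> 150 virtual users; the pages-per-second figures are made-up sample numbers.
print(scales_with_load(75, 40.0, 150, 78.0))  # True: the rate nearly doubled
print(scales_with_load(75, 40.0, 150, 42.0))  # False: the rate barely moved
```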

Throughput

This is an important baseline metric and is often used to check that the application

and its server connection are working. Throughput measures the average number of

bytes per second transmitted from the application being tested to the virtual clients

running the test agenda during a specific reporting interval. This metric is the

response data size (sum) divided by the number of seconds in the reporting interval.
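
The calculation just described is short enough to write out. This is a minimal sketch; the function name and the sample sizes are illustrative only.

```python
def throughput_bytes_per_second(response_sizes, interval_seconds):
    """Sum of response sizes (bytes) divided by the reporting interval (seconds)."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return sum(response_sizes) / interval_seconds


# Sample sizes in bytes for the responses received during one 20-second interval.
sizes = [15_000, 2_400, 48_600, 34_000]
print(throughput_bytes_per_second(sizes, 20))  # 5000.0
```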

Generally, the more stress on an application, the more Throughput. There should be

a strong, predictable correlation between these two. If the stress increases, but the

Throughput does not, there may be a scalability issue or an application issue.

Another note about Throughput as a measurement – it generally doesn't provide any

information about the content of the data being retrieved. Thus it can be misleading

especially in regression testing. When building regression tests, leave time in the

testing plan for comparing returned data quality.

Rounds

Another useful scalability and performance metric is the testing of Rounds. Rounds

tells you the total number of times the test agenda was executed versus the total

number of times the virtual clients attempted to execute the Agenda. The more

times the agenda is executed, the more work is done by the test and the application.

There is a correlation between Rounds and the stress placed on an application.

The test scenario the agenda represents influences the rounds measurement.

This metric can provide all kinds of useful information from the benchmarking of an

application to the end-user availability of a more complex application. It is not

recommended for regression testing because each test agenda may have a different

scenario and/or length of scenario.
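
One way to read the "executed versus attempted" comparison is as a completion ratio. The sketch below assumes you already have the two counts from your tool's report; the function name and the sample counts are hypothetical.

```python
def round_completion(attempted, completed):
    """Return the fraction of attempted rounds that ran to completion."""
    if attempted < 0 or completed < 0 or completed > attempted:
        raise ValueError("need 0 <= completed <= attempted")
    if attempted == 0:
        return 0.0
    return completed / attempted


# Sample counts: 200 rounds attempted by the virtual clients, 190 completed.
print(round_completion(200, 190))  # 0.95
```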

II. Application Responses and Availability

Now that we have the scalability and performance issues outlined, which metrics

should be used to measure the user's experience?

Hit Time

Hit time is the average time in seconds it took to successfully retrieve an element of

any kind (image, HTML, etc). The time of a hit is the sum of the Connect Time, Send

Time, Response Time and Process Time. It represents the responsiveness or

performance of the application to the end user. The more stressed the application,

the longer it should take to retrieve an average element. But, like Hits Per Second,

caching technologies can influence this metric. Getting the most from this metric

requires knowledge of how the application will respond to the end user.
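
The composition of a hit's time can be sketched as follows. The class and field names are assumptions for illustration, and the timings are made-up sample values in seconds.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class HitTiming:
    connect: float   # seconds spent connecting
    send: float      # seconds spent sending the request
    response: float  # seconds spent waiting for the response
    process: float   # seconds spent processing it

    @property
    def total(self):
        """Time of one hit: the sum of its four components."""
        return self.connect + self.send + self.response + self.process


def average_hit_time(hits):
    """Average total time over successfully retrieved elements."""
    return mean(h.total for h in hits)


samples = [
    HitTiming(0.25, 0.25, 1.0, 0.5),   # total 2.0
    HitTiming(0.5, 0.25, 2.0, 0.25),   # total 3.0
]
print(average_hit_time(samples))  # 2.5
```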

This is also an excellent metric for application monitoring after deployment. Using

baseline data, an inserted test ‘probe’ can alert the test or QA team when the application slows to

a certain response time.
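
A probe of this kind boils down to comparing recent measurements with the baseline. The sketch below is one possible rule, not how any specific monitoring product works; `slowdown_factor` and the sample numbers are assumptions.

```python
from statistics import mean


def probe_should_alert(baseline_seconds, recent_hit_times, slowdown_factor=2.0):
    """Alert when the recent average hit time exceeds baseline * slowdown_factor.

    slowdown_factor is an illustrative threshold; pick one that suits the application.
    """
    if not recent_hit_times:
        return False  # nothing measured yet, so nothing to alert on
    return mean(recent_hit_times) > baseline_seconds * slowdown_factor


baseline = 0.4  # seconds, taken from an earlier baseline run (sample value)
print(probe_should_alert(baseline, [0.5, 0.5, 0.5]))  # False
print(probe_should_alert(baseline, [1.0, 1.0, 1.0]))  # True
```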

Time to First Byte

This measurement is important because end users often consider a site

malfunctioning if it does not respond fast enough. Time to First Byte measures the

number of seconds it takes a request to return its first byte of data to the test

software's Load Generator. For example, Time to First Byte represents the time it

took after the user pushes the “enter” button in the browser until the user starts

receiving results. Generally, more concurrent user connections will slow the response

time of a request. But there are also other possible causes for a slowed response.

For example, there could be issues with the hardware, system software or memory

issues as well as problems with database structures or slow-responding components

within the application.
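
For a rough feel of this measurement outside a load tool, the sketch below uses Python's standard `http.client`. It is an approximation under stated assumptions: the timer stops when the status line and headers have been read, which is slightly later than the literal first byte, and the measured time includes connection setup. The function name is hypothetical.

```python
import http.client
import time


def time_to_first_byte(host, port, path="/", timeout=10):
    """Approximate seconds from sending a GET until the response starts arriving."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        start = time.perf_counter()
        conn.request("GET", path)       # connects, then sends the request
        response = conn.getresponse()   # returns once status line and headers are read
        elapsed = time.perf_counter() - start
        response.read()                 # drain the body so the connection closes cleanly
        return elapsed
    finally:
        conn.close()
```

It would be called as `time_to_first_byte("localhost", 8080)`, where the host and port are placeholders for the application under test.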

Page Time

Page Time calculates the average time in seconds it takes to successfully retrieve a

page with all of its content. This statistic is similar to Hit Time but relates only to

pages. In most cases this is a better statistic to work with because it deals with the

true dynamics of the application. Since not all hits can be cached, this data is more

helpful in terms of tracking a user's experience (positive or frustrated). It's

important to note that in many testing tools you can turn caching

on or off depending on your application needs.

Generally, the more stress on the site the slower its response. But since stress is a

combination of the number of concurrent users and their activity, greater stress may

or may not impact the user experience. It all depends upon the application's

functions and users. A site with 150 concurrent users looking up benefit information

will differ from a news site during a national emergency. As always, metrics must be

examined within context.

III. Integrity and Failures

Failed Rounds/Failed Rounds Per Second

During a load test it's important to know that the application requests perform as

expected. The Failed Rounds and Failed Rounds Per Second metrics count the number of

rounds that fail.

This metric is an “indicator metric” that provides QA and test with clues to the

application performance and failure status. If you start to see Failed Rounds or

Failed Rounds Per Second, then you would typically look into the logs to see what

types of failures correspond to this metric report. Also, with some software test

packages, you can set the definition of a failed round for an application.

Sometimes, basic image or page missing errors (HTTP 404 error codes) could be set

to fail a round, which would stop the execution of the test agenda at that point and

start at the top of the agenda again, thus not completing that particular round.
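
The "configurable definition of a failed round" can be sketched like this. Each round is modelled as the list of HTTP status codes its steps returned; the `fail_on` set stands in for whatever setting a given test package exposes, and all names here are hypothetical.

```python
def run_round(step_statuses, fail_on=frozenset({404})):
    """Walk the agenda's steps in order and stop at the first status in fail_on.

    Returns (completed, steps_executed).
    """
    executed = 0
    for status in step_statuses:
        executed += 1
        if status in fail_on:
            return False, executed  # round abandoned; remaining steps never run
    return True, executed


rounds = [
    [200, 200, 200],
    [200, 404, 200],  # fails at step 2; step 3 never runs
    [200, 200, 500],  # 500 is not in fail_on here, so the round completes
]
results = [run_round(r) for r in rounds]
failed_rounds = sum(1 for completed, _ in results if not completed)
print(failed_rounds)  # 1
print(results[1])     # (False, 2)
```

Adding 500 to `fail_on` would turn the third round into a failure as well, which is why the Failed Rounds count only makes sense alongside the definition that produced it.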

Failed Hits/Failed Hits Per Second

This metric offers insight into the application's integrity during the load test. An

example of a request that might fail during execution is a broken link or a missing

image from the server. The number of errors should grow only in proportion to the load. If

there are no errors with a low load, the number of errors with a high load should

remain zero. If the percentage of errors only increases during high loads, the

application may have a scalability issue.
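
Comparing error percentages rather than raw error counts can be sketched as below. The function names, the `tolerance` value and the sample counts are illustrative assumptions.

```python
def error_rate(failed_hits, total_hits):
    """Fraction of hits that failed."""
    if total_hits <= 0:
        raise ValueError("total_hits must be positive")
    return failed_hits / total_hits


def error_rate_rose(low_load_rate, high_load_rate, tolerance=0.001):
    """True if the failure rate under high load exceeds the low-load rate by more than tolerance."""
    return high_load_rate - low_load_rate > tolerance


low = error_rate(5, 1_000)      # 0.5% of hits failed at low load
high = error_rate(400, 20_000)  # 2% of hits failed at high load
print(error_rate_rose(low, high))  # True: the percentage climbed with load
print(error_rate_rose(low, error_rate(100, 20_000)))  # False: same 0.5% at 20x the hits
```

The second call is the healthy case: a hundred errors is far more than five, but the percentage is unchanged.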

Failed Connections

This metric is simply the number of connections that were refused by the application

during the test. It usually prompts further tests. A failed connection could mean the

server was too busy to handle all the requests, so it started refusing them. It could

be a memory issue. It could also mean that the user sent bogus or malformed data

to which the server couldn't respond so it refused the connection.

Conclusion

When a testing team pairs exactly what can be learned from each test to the user

experience, testing time can be spent wisely. Working with the application's sponsor

and the development team early in this process is key, as it will impact the types of

metrics needed to examine and ensure performance. Also, this provides the testing

team with the time needed to draw up an efficient test plan, gather any additional

hardware and software tools needed, and schedule the testing. While this isn't

always possible, it should be a team goal, as it tends to produce better results

through comprehensive and thorough testing that is performed early and often.
