Traditional load estimation
Traditionally, servers are provisioned to support a maximum number of concurrent users, and load is often defined in those terms.
Using this approach to run a queue for a website causes a problem, which we solve with rates.
The problem arises because visitors only actually use the webserver while their browsers are loading pages.
Here's an example
Let’s say a site is selling tickets. The visitor journey runs through five pages from start to finish, as the visitor
- arrives at the site,
- selects an event,
- chooses seats and tickets,
- enters payment details and
- sees an order confirmation page.
The entire transaction process takes five minutes, on average.
Let’s say 100 visitors arrive to buy tickets every minute. With an average transaction time of five minutes, at any given moment there are 100 * 5 = 500 people in the transaction process, and the server must be provisioned to support 500 concurrent users.
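The arithmetic above is an instance of Little's Law: the average number of people in a process equals the arrival rate multiplied by the average time each person spends in it. Here is a minimal sketch of that calculation in Python; the function and parameter names are ours, chosen for illustration only.

```python
def concurrent_users(arrivals_per_minute, avg_minutes_in_process):
    """Average number of visitors in the process at once.

    Arrival rate multiplied by average time spent (Little's Law).
    """
    return arrivals_per_minute * avg_minutes_in_process


# Figures from the ticketing example: 100 arrivals per minute, 5 minutes each.
print(concurrent_users(100, 5))  # 500
```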
Counting concurrent users
One might think that one could create a Concurrent User Counter that is incremented every time a new visitor arrives at the home page, and decremented every time a visitor completes the transaction. In theory this would tell you how many concurrent users your web server has at any one time, but it won’t work.
It won’t work because not every visitor completes the transaction. People change their minds, or get distracted, or go away and come back later.
Because the webserver is only interacting with the visitors when their browsers are loading pages, the webserver has no way of knowing when this has happened, or to whom.
Instead, to determine that a visitor is no longer part of the transaction process, the webserver has to wait until no further page requests have arrived from that visitor for a set period, and then time out that visitor’s session. Timeouts must always be set much longer than the average transaction time (say 20 minutes in this example), as some visitors take much longer than others to complete their transaction.
One could add the facility to also decrement one’s notional Concurrent User Counter every time a visitor session times out in this way, but this gives very poor results.
If 10% of the 100 visitors that arrive every minute do not go on to complete their transaction, the 90 that do complete will be active for five minutes on average, and the 10 that don’t will always be considered active for the full 20 minutes.
That’s 90 * 5 + 10 * 20 = 650, so the server will report 650 concurrent users, even though it is actually less busy than when every visitor completes!
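To make the inflated count easy to reproduce, here is a small Python sketch of what such a counter would report. It assumes, as in the example, that every abandoning visitor is counted for the full session timeout; the function and parameter names are hypothetical.

```python
def reported_concurrent_users(completing_per_minute, abandoning_per_minute,
                              avg_transaction_minutes, timeout_minutes):
    """What a counter with session timeouts would report, on average."""
    # Visitors who complete are counted for the average transaction time.
    completing = completing_per_minute * avg_transaction_minutes
    # Assumption: visitors who abandon are counted until their session times out.
    timing_out = abandoning_per_minute * timeout_minutes
    return completing + timing_out


# 90 completing and 10 abandoning per minute, 5-minute transactions, 20-minute timeout.
print(reported_concurrent_users(90, 10, 5, 20))  # 650
```

Setting the abandoning figure to zero brings the result back to the 500 of the original example.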
What about the timing out users?
Furthermore, as many as 10 * 20 = 200 of those concurrent users are not actually using the site and are in the process of timing out, which is over 30% of the reported concurrent users, even though it’s only 10% of the visitors that fail to complete the transaction.
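The share of reported users who are merely timing out can be checked the same way. This is a sketch using the example's figures; the function name is ours.

```python
def timing_out_share(completing_per_minute, abandoning_per_minute,
                     avg_transaction_minutes, timeout_minutes):
    """Fraction of reported concurrent users who are only waiting to time out."""
    completing = completing_per_minute * avg_transaction_minutes
    timing_out = abandoning_per_minute * timeout_minutes
    return timing_out / (completing + timing_out)


share = timing_out_share(90, 10, 5, 20)
print(f"{share:.1%}")  # 30.8%
```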
Now let’s say one wishes to add a queue to this website, controlled by our Concurrent User Counter. Once the site is at capacity, the person at the front of the queue is passed to the site only when the counter decrements. This is called a one-out-one-in queue. Roughly 30% of the time, the person at the front of the queue will be waiting for someone who isn’t going to complete their transaction to finish not completing their transaction.
That is very clearly broken.
There is also the additional technical integration work and load on the website of creating and running the counter, and sharing that information with the queue system. If the sharing mechanism goes down or breaks, the queue gets stuck too.
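To illustrate why a one-out-one-in queue depends so heavily on the counter, here is a deliberately simplified Python sketch. The class and method names are hypothetical and this is not how any particular product is implemented; it only shows that nobody is admitted at capacity until a release signal arrives.

```python
class OneOutOneInGate:
    """Toy admission gate driven by a concurrent user counter."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.active = 0  # the notional Concurrent User Counter

    def try_admit(self):
        """Admit the person at the front of the queue only if there is room."""
        if self.active < self.capacity:
            self.active += 1
            return True
        return False

    def release(self):
        """Called when a visitor completes, or when their session times out."""
        if self.active > 0:
            self.active -= 1


gate = OneOutOneInGate(capacity=2)
print(gate.try_admit())  # True
print(gate.try_admit())  # True
print(gate.try_admit())  # False, the site is at capacity
gate.release()           # without this signal the queue stays stuck
print(gate.try_admit())  # True
```

If release() is never called, whether because a visitor is silently timing out or because the sharing mechanism has failed, try_admit() keeps returning False.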
The solution: Use Rates
All these problems are easily and simply solved by instead sending visitors from the front of the queue to the site at a fixed rate, in this case 100 visitors per minute.
That way there’s no need to measure the actual number of concurrent users, and no need for complex integration with the ticketing website, and no risk of the queue getting stuck.
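As a rough illustration of rate-based release, the sketch below computes when each visitor at the front of the queue would be sent to the site, given only a fixed rate. It is a simplified model with hypothetical names, not a description of a production system; note that it needs no information from the website at all.

```python
def admission_offsets(rate_per_minute, visitors):
    """Seconds after the start at which each queued visitor is released."""
    if rate_per_minute <= 0:
        raise ValueError("rate_per_minute must be positive")
    interval_seconds = 60.0 / rate_per_minute  # gap between consecutive releases
    return [position * interval_seconds for position in range(visitors)]


# At 100 visitors per minute, one visitor is released every 0.6 seconds.
print([round(t, 1) for t in admission_offsets(100, 3)])  # [0.0, 0.6, 1.2]
```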
That’s why we invented and patented the rate-based Virtual Waiting Room for busy websites in 2004.
If you know your web server’s maximum number of concurrent users and the average transaction time or visit duration, you can convert this into a Queue Rate by dividing the number of users by the duration, like this:
Queue rate = Concurrent users / Transaction time
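Here is the same conversion as a short Python sketch, applied to the figures from the ticketing example. The function name and the choice of minutes as the time unit are ours; the resulting rate is in visitors per whatever unit the duration is given in.

```python
def queue_rate(max_concurrent_users, transaction_minutes):
    """Convert a concurrent user limit into a rate in visitors per minute."""
    if transaction_minutes <= 0:
        raise ValueError("transaction_minutes must be positive")
    return max_concurrent_users / transaction_minutes


# 500 concurrent users with a five-minute average transaction time.
print(queue_rate(500, 5))  # 100.0
```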