What is network latency?
Latency is a measure of delay. In a network, latency measures the time it takes for some data to get to its destination across the network. It is usually measured as a round trip delay - the time taken for information to get to its destination and back again. The round trip delay is an important measure because a computer that uses a TCP/IP network sends a limited amount of data to its destination and then waits for an acknowledgement to come back before sending any more. Thus, the round trip delay has a key impact on the performance of the network.
Latency is usually measured in milliseconds (ms).
What are typical values for latency?
Typical approximate values for latency that you might experience include:
- 800ms for satellite
- 120ms for 3G cellular data
- 60ms for 4G cellular data, which is often used for 4G WAN and internet connections
- 20ms for an MPLS network such as BT IP Connect, when using Class of Service to prioritise traffic
- 10ms for a modern Carrier Ethernet network such as BT Ethernet Connect or BT Wholesale Ethernet in the UK
Why is latency important?
People often assume that high performance comes from high bandwidth, but that's not the full picture.
- The bandwidth of a network or a network circuit refers to its capacity to carry traffic. It is measured in bits per second, commonly Megabits per second (Mbps).
- A higher bandwidth means that more traffic can be carried; for example, more simultaneous conversations. It does not imply how fast that communication will take place (although if you attempt to put more traffic over a network than the available bandwidth, you'll get packets of data being discarded and re-transmitted later, which will degrade your performance).
Latency, on the other hand, refers to the length of time it takes for the data that you feed into one end of your network to emerge at the other end. In practice, we usually measure the round trip time: the time for data to get to the far end and back again.
Why is it important to count the time in both directions?
Well, as we'll see below, TCP sends acknowledgements back to the sender, and it turns out that this is critical.
- It's fairly intuitive that a bigger delay means a slower connection.
- However, due to the nature of TCP/IP (the most widely used networking protocol suite), latency has a more complex and far-reaching impact on performance: latency drives throughput.
Latency drives throughput
A network typically carries multiple simultaneous conversations.
- Bandwidth limits the number of those conversations that can be supported.
- Latency drives the responsiveness of the network - how fast each conversation can be had.
- For TCP/IP networks, latency also drives the maximum throughput of a conversation (how much data can be transmitted by each conversation in a given time).
Latency can become a particular problem for throughput because of the way TCP (Transmission Control Protocol) works.
TCP is concerned with making sure all of the packets of your data get to their destination safely, and in the correct order. It allows only a certain amount of data to be transmitted before the sender has to wait for an acknowledgement.
Imagine a network path is a long pipe filling a bucket with water. TCP requires that once the bucket is full, the sender has to wait for an acknowledgement to come back along the pipe before any more water can be sent.
In real life, this bucket is usually 64KB in size. That's 65,535 bytes (i.e. 2^16 - 1) x 8 = 524,280 bits. It's called the TCP Window.
Let's imagine a scenario in which it takes half a second for water to get down the pipe, and another half a second for the acknowledgement to come back ... a latency of 1 second.
In this scenario TCP would prevent you from sending any more than 524,280 bits in any one-second period. The most you could possibly get down this pipe is 524,280 bits per second (bps), or roughly half a megabit per second.
Notice that (barring other issues that may slow things down) the only thing driving this is latency.
Max throughput can never be more than the bucket size divided by the latency.
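To make the rule concrete, here is a minimal Python sketch of the bucket-size-divided-by-latency calculation. The function and constant names are illustrative only, and it assumes the fixed 65,535-byte window described above.

```python
# Illustrative names; assumes the fixed 65,535-byte TCP window from the text.
TCP_WINDOW_BYTES = 65_535


def max_tcp_throughput_bps(rtt_seconds, window_bytes=TCP_WINDOW_BYTES):
    """Upper bound on one TCP conversation's throughput, in bits per second."""
    if rtt_seconds <= 0:
        raise ValueError("round trip time must be positive")
    # One full window (converted to bits) can be sent per round trip.
    return window_bytes * 8 / rtt_seconds


# The one-second scenario above: roughly half a megabit per second.
print(max_tcp_throughput_bps(1.0))  # 524280.0
```

Halving the round trip time doubles the ceiling, which is why latency rather than bandwidth sets the limit here.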
So how does latency impact throughput in real life?
Clearly, if you have latency-sensitive applications then you need to be mindful of the latency of your network. Look out for situations where there might be unexpectedly excessive latency that will impact throughput, such as international circuits.
Another interesting case is with 4G Cellular WAN, where you use the 4G network to create a reliable, high-speed connection to your corporate network, or to the internet. This involves the use of multiple SIMs that are often bonded together into a single, highly reliable connection. In this case, the latency of the bonded connection tends towards the greatest latency of all the individual connections.
If you consider the difference between 3G and 4G in the list above, you'll see that including 3G connections can have a big impact on the overall latency. Read more about 4G WAN in our Guide to 4G WAN.
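As a rough illustration, the sketch below treats the bonded latency as simply the highest latency among the individual links. That is a simplifying assumption based on the "tends towards the greatest latency" behaviour described above, not a model of any particular bonding product. The function names are hypothetical, and the 60ms and 120ms figures are the approximate 4G and 3G values from the list earlier.

```python
# Hypothetical helper names; assumes the 65,535-byte TCP window from the text.
TCP_WINDOW_BITS = 65_535 * 8


def bonded_latency_ms(link_latencies_ms):
    """Approximate the bonded latency as that of the slowest link."""
    return max(link_latencies_ms)


def max_throughput_mbps(latency_ms, window_bits=TCP_WINDOW_BITS):
    """Window size divided by round trip latency, expressed in Mbps."""
    return window_bits * 1000 / latency_ms / 1_000_000


all_4g = [60, 60, 60]   # three 4G SIMs
mixed = [60, 60, 120]   # two 4G SIMs plus one 3G SIM

for label, links in (("4G only", all_4g), ("4G plus one 3G", mixed)):
    latency = bonded_latency_ms(links)
    print(f"{label}: ~{latency} ms, at most "
          f"{max_throughput_mbps(latency):.2f} Mbps per TCP conversation")
```

Under these assumptions, adding a single 3G link halves the throughput ceiling of each TCP conversation.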
Remember, though, that latency is not the only cause of poor application performance. When we researched the root cause of performance issues in our customers' networks, we found that only 30% were caused by the network. The other 70% were caused by issues with the application, database or infrastructure. To get to the bottom of such problems, you often need an Application Performance Audit, or perhaps to set up Critical Path Monitoring on your IT estate. Generally, you'll track latency and other performance-impacting indicators using a Network and Application monitoring toolset. See this post for more on building the best managed network provider monitoring.
Latency impact in action
Have a look at this calculator. Set the slider to some of the latencies listed above, and see what they do to the maximum throughput for TCP/IP traffic.
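If you would rather work the numbers out yourself, the following Python sketch does the same kind of sum for the approximate latencies listed earlier. It assumes the 65,535-byte TCP window described above. The names are illustrative, and the script is not the calculator's own code.

```python
# Illustrative sketch; assumes the fixed 65,535-byte TCP window from the text.
TCP_WINDOW_BITS = 65_535 * 8

# Approximate round trip latencies from the list earlier in this article.
TYPICAL_LATENCY_MS = {
    "Satellite": 800,
    "3G cellular": 120,
    "4G cellular": 60,
    "MPLS": 20,
    "Carrier Ethernet": 10,
}


def max_throughput_mbps(latency_ms, window_bits=TCP_WINDOW_BITS):
    """Window size divided by round trip latency, expressed in Mbps."""
    return window_bits * 1000 / latency_ms / 1_000_000


for name, latency_ms in TYPICAL_LATENCY_MS.items():
    mbps = max_throughput_mbps(latency_ms)
    print(f"{name:<18} {latency_ms:>4} ms  ->  {mbps:6.2f} Mbps")
```

The figures are ceilings for a single TCP conversation; a circuit carrying many conversations can still fill a much larger bandwidth.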