Using Adaptive Read-Ahead to Improve I/O Throughput on Linux
Mulyadi Santosa and Fengguang Wu
For years, disk I/O has been a major source of latency.
CPUs, memory, network connections, and buses keep getting significantly
faster. But what about disks? With the arrival of SATA, disks have gained
some speed, but the fundamental problem remains: disk access is very slow
compared to memory access.
The disk cache was invented to overcome this. When a read
operation is issued, the data is brought into RAM, where the CPU can access
it. Likewise, when data is written to the disk, it is first placed in a buffer
and later pushed to the disk. This is done for two reasons. The first reason
concerns read operations. Here we cache the data in anticipation of further
read requests for the same blocks. According to the principle of temporal
locality, a program tends to read the same data over and over again during a
certain interval. Applied to disk access, this means that on the second access
there is no need to read from the disk; the CPU can get the data from the
disk cache.
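The effect of temporal locality on a cache can be sketched with a small simulation. This is only an illustrative model, not how the kernel implements its cache: the `BlockCache` class, the `fake_disk_read` function, the least-recently-used eviction policy, and the block numbers are all assumptions made for the example.

```python
from collections import OrderedDict


class BlockCache:
    """Toy LRU cache of disk blocks (hypothetical model, not kernel code)."""

    def __init__(self, capacity, read_block):
        self.capacity = capacity        # maximum number of cached blocks
        self.read_block = read_block    # slow path: fetch a block from "disk"
        self.blocks = OrderedDict()     # block number -> data, oldest first
        self.hits = 0
        self.misses = 0

    def read(self, block_no):
        if block_no in self.blocks:
            # Cache hit: no disk access needed.
            self.hits += 1
            self.blocks.move_to_end(block_no)
            return self.blocks[block_no]
        # Cache miss: go to the disk and remember the result.
        self.misses += 1
        data = self.read_block(block_no)
        self.blocks[block_no] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict least recently used
        return data


def fake_disk_read(block_no):
    """Stand-in for a slow disk read (assumption for this sketch)."""
    return f"data-{block_no}".encode()


cache = BlockCache(capacity=4, read_block=fake_disk_read)
# A program that keeps returning to the same two blocks.
for block_no in [7, 7, 7, 3, 7, 3]:
    cache.read(block_no)

print("hits:", cache.hits, "misses:", cache.misses)
```

In this run only the first access to each block reaches the simulated disk; every repeat access is served from memory.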
The second reason concerns write operations. If the CPU
had to wait until the data was actually written to the disk, it would be
blocked for the entire duration of the write. The solution is to use a buffer
as temporary storage. The CPU just puts the data in the buffer and continues
working. There's no need to wait for the lengthy disk access.
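The buffering idea can be illustrated with a minimal write-back sketch. The names `WriteBackBuffer` and `fake_disk_write` are hypothetical, and the explicit `flush()` call stands in for whatever later moment the data is actually pushed to the disk; the real kernel mechanism is considerably more involved.

```python
class WriteBackBuffer:
    """Toy write-back buffer (hypothetical model, not kernel code)."""

    def __init__(self, write_block):
        self.write_block = write_block  # slow path: write a block to "disk"
        self.dirty = {}                 # block number -> pending data

    def write(self, block_no, data):
        # Fast path: only memory is touched, so the caller returns at once.
        self.dirty[block_no] = data

    def flush(self):
        # Later, push every pending block to the disk in block order.
        for block_no in sorted(self.dirty):
            self.write_block(block_no, self.dirty[block_no])
        flushed = len(self.dirty)
        self.dirty.clear()
        return flushed


disk = {}          # simulated disk contents
disk_writes = []   # order in which blocks reached the simulated disk


def fake_disk_write(block_no, data):
    """Stand-in for a slow disk write (assumption for this sketch)."""
    disk_writes.append(block_no)
    disk[block_no] = data


buf = WriteBackBuffer(fake_disk_write)
buf.write(5, b"first")
buf.write(2, b"other")
buf.write(5, b"second")          # replaces the pending copy of block 5
blocks_on_disk_before = len(disk)
flushed = buf.flush()

print("on disk before flush:", blocks_on_disk_before)
print("blocks flushed:", flushed, "disk write order:", disk_writes)
```

Note that the two writes to block 5 result in a single disk write, a side benefit of deferring the work.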
Another feature can help you overcome disk latency:
read-ahead. The basic idea is that if you read a block from a file, there is a good chance that you will read the next block of the
same file. Thus, if the kernel can predict this, it can "read ahead" and fetch the next block(s) early. Then, when the data is
needed, it is already in the disk cache.
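The basic read-ahead idea described above can be sketched as a cache that, on a miss, also fetches the next few blocks. This is a simplified model with a fixed window, not the kernel's actual algorithm: `ReadAheadCache`, `fake_disk_read`, the window size of 4, and the unbounded cache are all assumptions chosen to keep the example short.

```python
class ReadAheadCache:
    """Toy fixed-window read-ahead (hypothetical model, not kernel code)."""

    def __init__(self, read_block, total_blocks, window=4):
        self.read_block = read_block      # slow path: fetch a block from "disk"
        self.total_blocks = total_blocks  # size of the file in blocks
        self.window = window              # extra blocks fetched on a miss
        self.cache = {}                   # block number -> data (never evicted)
        self.hits = 0
        self.misses = 0

    def read(self, block_no):
        if block_no in self.cache:
            self.hits += 1
            return self.cache[block_no]
        self.misses += 1
        # Fetch the requested block plus the next `window` blocks,
        # without running past the end of the file.
        last = min(block_no + self.window, self.total_blocks - 1)
        for n in range(block_no, last + 1):
            self.cache[n] = self.read_block(n)
        return self.cache[block_no]


def fake_disk_read(block_no):
    """Stand-in for a slow disk read (assumption for this sketch)."""
    return f"data-{block_no}".encode()


ra = ReadAheadCache(fake_disk_read, total_blocks=10, window=4)
# A sequential scan of the whole 10-block file.
for block_no in range(10):
    ra.read(block_no)

print("hits:", ra.hits, "misses:", ra.misses)
```

For a sequential scan, only the reads of blocks 0 and 5 have to wait for the simulated disk; the other eight find their data already cached. A random access pattern would gain nothing from this window and would waste the extra reads, which is why predicting the access pattern matters.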