Selecting and Tuning a Scheduler
The obvious first question is this: which scheduler should you use? It's a deceptively difficult question, and the answer depends on many interrelated factors, including the applications running, the size of the files you are reading and writing, the frequency of your file reads and writes, and the pattern of these reads and writes. The only thing that can be said with much certainty is that unless you are using a solid state drive or RAM disk, which can access all files equally fast, the noop scheduler is the worst choice. The other, active schedulers should all perform better than noop with conventional spinning disks.
There's some evidence that when many different types of applications are requesting many different types of disk reads and writes, the deadline scheduler is the best all-round performer. In the end, though, the best course of action is probably to test all three active schedulers and choose the one that gives the best results.
So once you've chosen a scheduler to test, how do you get your system to use it? There are two primary ways to do this: at boot through a configuration file, or on the fly from the command line. The examples we use here work for Red Hat Enterprise Linux but should be similar for any distribution you happen to be using.
To set a scheduler at boot, edit your boot loader's configuration file and add the elevator=<scheduler name> kernel option to the end of the line that specifies the kernel. Scheduler names include "noop", "cfq", "deadline" and "as" (for anticipatory).
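As a rough sketch of the boot-time approach, the helper below appends an elevator= option to every kernel line of a legacy GRUB style configuration file. The function name add_elevator is made up for illustration, and the file path in the usage note is an assumption about a typical Red Hat Enterprise Linux install with legacy GRUB; other boot loaders use different files and formats, and newer kernels may ignore the elevator= option altogether.

```shell
# add_elevator FILE SCHEDULER   (hypothetical helper, not a standard tool)
# Appends "elevator=SCHEDULER" to each "kernel ..." line in FILE that does
# not already carry an elevator= option. A copy is kept as FILE.bak.
add_elevator() {
    conf="$1"
    sched="$2"
    cp "$conf" "$conf.bak" || return 1
    awk -v opt="elevator=$sched" '
        # Only touch kernel lines that have no elevator= option yet
        /^[ \t]*kernel[ \t]/ && !/elevator=/ {
            sub(/[ \t]+$/, "")      # drop trailing whitespace
            $0 = $0 " " opt         # append the option
        }
        { print }
    ' "$conf.bak" > "$conf"
}
```

Assuming the configuration lives in /boot/grub/grub.conf, running add_elevator /boot/grub/grub.conf deadline as root and rebooting should make deadline the default; cat /proc/cmdline shows whether the option reached the kernel.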
Alternatively, to set a given scheduler for disk hda on the fly, bring up a terminal and, as root, write the scheduler's name to the file /sys/block/hda/queue/scheduler.
To verify what scheduler hda is currently using, read the same file back.
You'll see something like:
noop anticipatory [deadline] cfq
which would indicate that the deadline scheduler is currently in use.
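The on-the-fly steps can be sketched as two small shell functions. The names current_scheduler and set_scheduler are invented for this example, and the SYSFS_ROOT variable is an assumption added only so the functions can be tried against a scratch directory instead of the real /sys.

```shell
# current_scheduler DEVICE
# Prints the active scheduler, i.e. the name shown in square brackets.
current_scheduler() {
    sed -n 's/.*\[\([^]]*\)].*/\1/p' \
        "${SYSFS_ROOT:-/sys}/block/$1/queue/scheduler"
}

# set_scheduler DEVICE NAME
# Switches DEVICE to scheduler NAME (needs root on a real system).
set_scheduler() {
    echo "$2" > "${SYSFS_ROOT:-/sys}/block/$1/queue/scheduler"
}

# Example (as root):
#   set_scheduler hda deadline
#   current_scheduler hda
```

A change made this way takes effect immediately but is not remembered across a reboot.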
Once you've chosen and set a scheduler, you can tune it to work optimally with your system by altering various parameters. These parameters differ for each scheduler. The exception to this is the noop scheduler, which actually has no tunable parameters.
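The parameters are set in files located in the iosched directory under each device's queue directory in /sys/block.
For example, the parameter read_expire on a device hda is stored in /sys/block/hda/queue/iosched/read_expire.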
The parameter can be set to 80 (milliseconds) using the command:
echo 80 > /sys/block/hda/queue/iosched/read_expire
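Before changing anything, it can help to see which parameters the active scheduler actually exposes and what they are currently set to. The sketch below prints each file in the device's iosched directory as name=value; the function name show_tunables and the SYSFS_ROOT override are assumptions made for illustration.

```shell
# show_tunables DEVICE
# Lists every tunable of the scheduler currently active on DEVICE.
# Prints nothing for a scheduler with no tunables, such as noop.
show_tunables() {
    dir="${SYSFS_ROOT:-/sys}/block/$1/queue/iosched"
    for f in "$dir"/*; do
        [ -f "$f" ] || continue              # skip an unmatched glob
        printf '%s=%s\n' "${f##*/}" "$(cat "$f")"
    done
}

# Example:
#   show_tunables hda
```

Recording this output before experimenting gives you a simple way to put the defaults back afterwards.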
The deadline scheduler has five tunable parameters:
- read_expire - the most important parameter for the deadline scheduler. It sets the maximum time, in milliseconds, that a read request can wait before it must be serviced.
- write_expire - the maximum time before a write request must be serviced.
- fifo_batch - the number of requests in each batch sent for immediate servicing once they have expired.
- writes_starved - this alters the amount of priority that read requests get over write requests. Since unserviced reads affect performance more than unserviced writes, it's usual that reads get some measure of priority.
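- front_merges - this controls whether the scheduler tries to merge a new request that is contiguous with the start of a request already in the queue (a front merge), as well as with the end of one (a back merge). It is enabled by default in the mainline kernel's deadline scheduler; setting its value to 0 turns front merging off, which can save some work when the workload rarely produces front merges.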
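Several of the deadline parameters above are often adjusted together, so a small helper that applies a list of name=value pairs can be convenient. This is only a sketch: apply_tunables is a made-up name, SYSFS_ROOT is an assumed override for testing, and the values in the example comment are placeholders, not recommendations.

```shell
# apply_tunables DEVICE name=value [name=value ...]
# Writes each value into the matching file under queue/iosched.
# Names the active scheduler does not expose are skipped with a warning.
apply_tunables() {
    dev="$1"
    shift
    dir="${SYSFS_ROOT:-/sys}/block/$dev/queue/iosched"
    for pair in "$@"; do
        name="${pair%%=*}"      # text before the first "="
        value="${pair#*=}"      # text after the first "="
        if [ -w "$dir/$name" ]; then
            echo "$value" > "$dir/$name"
        else
            echo "skipping $name: not a tunable of the active scheduler" >&2
        fi
    done
}

# Example with purely illustrative values (as root):
#   apply_tunables hda read_expire=80 write_expire=4000 fifo_batch=8
```

Values written this way are lost at reboot, so a setup that works well would need to be reapplied from a startup script.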
In addition to read_expire and write_expire, the anticipatory scheduler also includes:
- read_batch_expire - the time spent servicing read requests before servicing pending write requests.
- write_batch_expire - the time spent servicing write requests before servicing pending read requests.
- antic_expire - the time in milliseconds that the scheduler pauses while waiting for a follow-on request from an application before servicing the next request in the queue
The cfq scheduler's tunable parameters include:
- quantum - the number of internal queues that requests are taken from in one cycle and moved to the dispatch queue for processing. The cfq scheduler may have 64 internal queues, but with a quantum of eight it moves requests to the dispatch queue by visiting only the first eight internal queues in one cycle, the next eight in the following cycle, and so on.
- queued - the maximum number of requests allowed in a given internal queue.
Although there are no hard and fast rules, a sensible strategy is probably to benchmark I/O under each scheduler with its default parameter settings first. Then choose the most appropriate scheduler and fine-tune its parameters by referring to application-specific recommendations.
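One way to approach the benchmarking step is a loop that switches to each available scheduler in turn and times the same workload under it. The sketch below assumes you supply your own workload command; bench_schedulers is a hypothetical name, SYSFS_ROOT is an assumed override for testing, and whole-second timing from date is crude, so treat the numbers as a first impression only.

```shell
# bench_schedulers DEVICE COMMAND [ARGS...]
# Runs COMMAND once under every scheduler DEVICE offers, prints the
# elapsed seconds for each, then restores the original scheduler.
bench_schedulers() {
    dev="$1"
    shift
    file="${SYSFS_ROOT:-/sys}/block/$dev/queue/scheduler"
    # Remember the active scheduler (the bracketed name)
    original=$(sed -n 's/.*\[\([^]]*\)].*/\1/p' "$file")
    # Strip the brackets to get the plain list of available schedulers
    for sched in $(sed 's/[][]//g' "$file"); do
        echo "$sched" > "$file"
        start=$(date +%s)
        "$@" > /dev/null 2>&1
        end=$(date +%s)
        echo "$sched $((end - start))s"
    done
    if [ -n "$original" ]; then
        echo "$original" > "$file"
    fi
}

# Example (as root), where ./my_workload.sh is your own test script:
#   bench_schedulers hda ./my_workload.sh
```

The page cache can hide differences between schedulers, so a serious comparison would probably need to clear cached data between runs and repeat each test several times.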
Don't forget that scheduling isn't the be-all and end-all of I/O optimization. Other factors that can affect performance include prefetching, disk capacity and spin rate (because these can affect seek time), and even the file system you choose for a given disk. One thing is certain: whatever performance you are getting now, you can almost certainly improve it if you have the time, the will and the knowledge.
Paul Rubens is an IT consultant and journalist based in Marlow on Thames, England. He has been programming, tinkering and generally sitting in front of computer screens since his first encounter with a DEC PDP-11 in 1979.