[I/O scheduler] Linux disk scheduling policies

Original post | Category: Linux operating system | Author: ballontt | Date: 2013-08-10 19:59:36

There are many disk scheduling algorithms: first come, first served (FCFS), shortest seek time first (SSTF), the SCAN algorithm, and so on.
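To make the difference between FCFS and SSTF concrete, here is a minimal Python sketch that adds up head movement for one request queue. The track numbers and the starting head position are made-up illustration values, not measurements from a real disk.

```python
def fcfs_seek_distance(start, requests):
    """Total head movement when requests are served in arrival order."""
    total, head = 0, start
    for track in requests:
        total += abs(track - head)
        head = track
    return total


def sstf_seek_distance(start, requests):
    """Total head movement when the closest pending track is always served next."""
    pending = list(requests)
    total, head = 0, start
    while pending:
        nearest = min(pending, key=lambda t: abs(t - head))
        pending.remove(nearest)
        total += abs(nearest - head)
        head = nearest
    return total


if __name__ == "__main__":
    queue = [98, 183, 37, 122, 14, 124, 65, 67]  # hypothetical track numbers
    print("FCFS:", fcfs_seek_distance(53, queue))  # prints FCFS: 640
    print("SSTF:", sstf_seek_distance(53, queue))  # prints SSTF: 236
```

SSTF moves the head far less in this example, but a request for a distant track can be postponed for as long as closer requests keep arriving.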

This post introduces the four disk scheduling algorithms supported by Linux:

The Schedulers

There are currently 4 available:

  • Noop Scheduler
  • Anticipatory IO Scheduler ("as scheduler")
  • Deadline Scheduler
  • Complete Fair Queueing Scheduler ("cfq scheduler")

Noop Scheduler

This scheduler only implements request merging.

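Linux 2.4 and earlier kernels had only one I/O scheduling algorithm, with no choice between schedulers; noop is the simplest of the schedulers that can be selected in 2.6.
NOOP stands for No Operation. It implements the simplest possible FIFO queue, and I/O requests are served roughly in arrival order. "Roughly", because on top of the FIFO NOOP also merges adjacent I/O requests, so requests are not served in strict first-in, first-out order. NOOP assumes that I/O requests are optimized or reordered by the driver or by the device itself (as an intelligent controller would do), which can make it the best choice in some SAN environments. NOOP does very little work per request and assumes that I/O will not be a performance problem, which also keeps CPU overhead low. For more complex workloads, however, the cost of poorly ordered I/O falls on the user.
NOOP is generally the best choice for flash devices, RAM disks, and embedded systems.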
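The "FIFO plus merging" behaviour described above can be sketched in a few lines of Python. This is a simplified model, not the kernel's code: the `Request` and `NoopQueue` names are hypothetical, and only back merges (a new request that starts exactly where a queued one ends) are modelled.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Request:
    sector: int  # first sector of the request
    count: int   # number of sectors


class NoopQueue:
    """FIFO queue that merges a new request into a contiguous queued one."""

    def __init__(self):
        self.fifo = deque()

    def add(self, req):
        # Back merge: the new request starts exactly where a queued one ends.
        for queued in self.fifo:
            if queued.sector + queued.count == req.sector:
                queued.count += req.count
                return
        # No merge candidate, so keep plain arrival order.
        self.fifo.append(req)

    def dispatch(self):
        """Hand the oldest request to the device, or None if the queue is empty."""
        return self.fifo.popleft() if self.fifo else None
```

Adding requests for sectors 100-107, 500-507 and 108-115 in that order leaves two queued requests: one for 100-115 and one for 500-507. The third request was merged into the first rather than waiting its turn, which is why the order is only "roughly" FIFO.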

Anticipatory IO Scheduler ("as scheduler")

The anticipatory scheduler is the default scheduler in older 2.6 kernels: if you’ve not specified one, this is the one that will be loaded. It implements request merging, a one-way elevator, read and write request batching, and attempts some anticipatory reads by holding off a bit after a read batch if it thinks a user is going to ask for more data. It tries to optimise for physical disks by avoiding head movements if possible. One downside to this is that it probably gives highly erratic performance on database or storage systems.

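CFQ and DEADLINE focus on serving scattered I/O requests and do nothing special for sequential I/O such as sequential reads. To handle workloads that mix random and sequential I/O, Linux also supports the ANTICIPATORY scheduler. Building on DEADLINE, ANTICIPATORY opens a 6 ms wait window after each read. If the OS receives a read request for an adjacent location within those 6 ms, that request can be served immediately.

The anticipatory scheduler (as) was once the default I/O scheduler of the Linux 2.6 kernel. "Anticipatory" means expected or foreseen, which captures the idea of the algorithm. Put simply, after an I/O has been served the scheduler waits for a default of 6 ms, guessing what the next I/O request from a process will be. This adds considerable latency to random reads and is bad for database applications, while workloads such as web servers do well with it. The algorithm can also be understood as aimed at slow disks, because the real purpose of the "guess" is to reduce head movement time.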
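The 6 ms wait window can be illustrated with a small decision function. This is only a sketch of the idea under stated assumptions: the function name, the `near_sectors` threshold, and the single "far" request are all hypothetical, and the real scheduler uses per-process statistics to decide whether waiting is worthwhile.

```python
def pick_after_read(last_end, finished_at_ms, far_pending, arrivals,
                    window_ms=6, near_sectors=128):
    """Return (sector, dispatch_time_ms) of the request served after a read.

    last_end       -- sector where the previous read ended
    finished_at_ms -- time at which that read completed
    far_pending    -- sector of a request already queued elsewhere on the disk
    arrivals       -- list of (arrival_time_ms, sector) reads that arrive later
    """
    deadline = finished_at_ms + window_ms
    for arrival_ms, sector in sorted(arrivals):
        if arrival_ms > deadline:
            break  # everything else arrives after the window has closed
        if abs(sector - last_end) <= near_sectors:
            # Anticipation paid off: a nearby read showed up in time.
            return sector, max(arrival_ms, finished_at_ms)
    # Window expired: fall back to the queued request, having idled until now.
    return far_pending, deadline
```

With a read ending at sector 1000 at t = 10 ms, a nearby read arriving at t = 13 ms is served immediately and the long seek to the far request is avoided. If the nearby read only arrives at t = 20 ms, the disk has idled for 6 ms for nothing, which is the latency penalty that hurts random-read workloads.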

Deadline Scheduler

The deadline scheduler implements request merging, a one-way elevator, and imposes a deadline on all operations to prevent resource starvation. Because writes return instantly within Linux, with the actual data being held in cache, the deadline scheduler will also prefer readers, as long as the deadline for a write request hasn’t passed. The kernel docs suggest this is the preferred scheduler for database systems, especially if you have TCQ-aware disks, or any system with high disk performance.

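DEADLINE addresses the extreme case in which a purely address-sorted queue starves an I/O request. In addition to the address-sorted I/O queue (the one this post describes under CFQ), DEADLINE keeps separate FIFO queues for reads and for writes. A request waits at most 500 ms in the read FIFO and at most 5 s in the write FIFO. Requests in the FIFO queues take priority over those in the sorted queue, and the read FIFO takes priority over the write FIFO.

The priorities can be written as:
FIFO(Read) > FIFO(Write) > sorted queue (labeled CFQ in this post)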

The deadline algorithm aims to serve any given I/O request with minimal delay, so from this point of view it should be a good fit for DSS applications.
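The priority rule above can be sketched as follows. This is a simplified model of the description in this post, not the kernel implementation (which also batches requests and limits how long writes can be starved). The `DeadlineQueue` class is hypothetical, and the address-sorted queue (what the text calls the CFQ queue) is modelled as a one-way elevator over a plain list.

```python
from dataclasses import dataclass

# Expiry times quoted in the text: 500 ms for reads, 5 s for writes.
EXPIRE_MS = {"read": 500, "write": 5000}


@dataclass
class Req:
    sector: int
    kind: str         # "read" or "write"
    deadline_ms: int  # time after which the request counts as expired


class DeadlineQueue:
    def __init__(self):
        self.pending = []
        self.head = 0  # sector of the most recently dispatched request

    def add(self, sector, kind, now_ms):
        self.pending.append(Req(sector, kind, now_ms + EXPIRE_MS[kind]))

    def dispatch(self, now_ms):
        if not self.pending:
            return None
        chosen = None
        # Expired reads come first, then expired writes.
        for kind in ("read", "write"):
            same_kind = [r for r in self.pending if r.kind == kind]
            if same_kind:
                oldest = min(same_kind, key=lambda r: r.deadline_ms)
                if oldest.deadline_ms <= now_ms:
                    chosen = oldest
                    break
        if chosen is None:
            # Nothing expired: one-way elevator, wrapping to the lowest sector.
            ahead = [r for r in self.pending if r.sector >= self.head]
            chosen = min(ahead or self.pending, key=lambda r: r.sector)
        self.pending.remove(chosen)
        self.head = chosen.sector
        return chosen
```

As long as no deadline has passed, requests are served in sector order. Once the oldest read has waited more than 500 ms it is served next regardless of where the head is, so no request waits indefinitely.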

Complete Fair Queueing Scheduler ("cfq scheduler")

The complete fair queueing scheduler implements both request merging and the elevator, and attempts to give all users of a particular device the same number of IO requests over a particular time interval. This should make it more efficient for multiuser systems. It seems that Novell SLES sets cfq as the scheduler by default, as does the latest Ubuntu release. As of the 2.6.18 kernel, this is the default scheduler in kernel.org releases.

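CFQ stands for Completely Fair Queuing. Its characteristic is that it sorts I/O requests by address instead of serving them in arrival order.
On traditional SAS disks, seeking accounts for most of the I/O response time. CFQ starts from sorting requests by address, so that as many requests as possible are served with as little head movement as possible. With CFQ the throughput of SAS disks improves considerably. Its drawback compared with NOOP is that a request that arrived first is not guaranteed to be served and may starve.
Completely Fair Queuing (cfq) replaced the anticipatory scheduler as the default I/O scheduler of the Linux kernel in 2.6.18. cfq maintains one I/O queue per process and serves the requests from the different processes in round-robin fashion, so every I/O request is treated fairly. This makes cfq well suited to workloads with scattered reads (e.g. OLTP databases). Of the enterprise Linux distributions I know, SuSE Linux seems to have been the first to use cfq by default.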
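The per-process queues and the round-robin service can be sketched like this. It is a deliberately small model: the `CfqSketch` class is hypothetical and serves one request per turn, whereas the real scheduler hands out time slices and honours I/O priorities.

```python
from collections import OrderedDict, deque


class CfqSketch:
    """One FIFO per process, served round-robin, one request per turn."""

    def __init__(self):
        self.queues = OrderedDict()  # pid -> deque of requested sectors

    def add(self, pid, sector):
        self.queues.setdefault(pid, deque()).append(sector)

    def dispatch(self):
        if not self.queues:
            return None
        # Take the process at the front of the rotation.
        pid, queue = next(iter(self.queues.items()))
        sector = queue.popleft()
        del self.queues[pid]
        if queue:
            # Still has work: rejoin at the back of the rotation.
            self.queues[pid] = queue
        return pid, sector
```

If process 1 queues three requests and process 2 queues one, the dispatch order is (1, first), (2, only), (1, second), (1, third). The single request from process 2 does not have to wait behind everything process 1 submitted.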

Changing Schedulers

The most reliable way to change schedulers is to set the kernel option ‘elevator’ at boot time. You can set it to one of "as", "cfq", "deadline" or "noop", to set the appropriate scheduler.

It seems under more recent 2.6 kernels (2.6.11, possibly earlier), you can change the scheduler at runtime by echoing the name of the scheduler into /sys/block/<devicename>/queue/scheduler, where devicename is the base name of the block device, e.g. sda for /dev/sda.
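The same sysfs file can be read and written from a script. Below is a minimal Python sketch, assuming the usual file format in which the available schedulers are listed on one line and the active one is wrapped in square brackets. The helper names are hypothetical, the sample line in the usage note is illustrative, and writing to the file requires root privileges.

```python
from pathlib import Path


def scheduler_path(device):
    """Path of the sysfs scheduler file for a block device such as 'sda'."""
    return Path("/sys/block") / device / "queue" / "scheduler"


def parse_scheduler_line(line):
    """Split a line such as 'noop deadline [cfq]' into (active, available)."""
    names = line.split()
    active = next((n.strip("[]") for n in names if n.startswith("[")), None)
    return active, [n.strip("[]") for n in names]


def current_scheduler(device):
    """Read the active and available schedulers for the device."""
    return parse_scheduler_line(scheduler_path(device).read_text())


def set_scheduler(device, name):
    """Equivalent of echoing the scheduler name into the sysfs file (needs root)."""
    scheduler_path(device).write_text(name)
```

For example, `parse_scheduler_line("noop deadline [cfq]")` returns `("cfq", ["noop", "deadline", "cfq"])`. A change made this way lasts only until the next reboot, which is why the `elevator` boot option remains the more reliable method.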

Which one should I use?

I’ve not personally done any testing on this, so I can’t speak from experience yet. The anticipatory scheduler will be the default one for a reason however – it is optimised for the common case. If you’ve only got single disk systems (ie, no RAID – hardware or software) then this scheduler is probably the right one for you. If it’s a multiuser system, you will probably find cfq or deadline providing better performance, and the numbers seem to back deadline giving the best performance for database systems.

Tuning the IO schedulers

The schedulers may have parameters that can be tuned at runtime. Read the Linux documentation on the schedulers listed in the References section below.
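As a starting point for tuning, the current values can be listed from a script. The sketch below assumes that the active scheduler exposes its tunables as one small file each under /sys/block/<devicename>/queue/iosched/; the helper name is hypothetical, and which files exist depends on the scheduler and kernel version.

```python
from pathlib import Path


def read_tunables(iosched_dir):
    """Return {file name: contents} for every regular file in the directory."""
    result = {}
    for entry in sorted(Path(iosched_dir).iterdir()):
        if entry.is_file():
            result[entry.name] = entry.read_text().strip()
    return result

# Example call on a real system (path assumed, adjust the device name):
#   read_tunables("/sys/block/sda/queue/iosched")
```

Taking the directory as an argument keeps the function easy to try out against any directory of small text files before pointing it at sysfs.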

More information

Read the documents mentioned in the References section below, especially the linux kernel documentation on the anticipatory and deadline schedulers.


ballontt
2013/8/10
---The End---

If you repost this article, please credit the source and include the link. Thanks!

Source: ITPUB blog, http://blog.itpub.net/27425054/viewspace-768224/. Please credit the source when reposting; otherwise legal action may be taken.
