INFO: task blocked for more than 120 seconds.
|This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)|
Under heavy IO load on servers you may see something like:
INFO: task nfsd:2252 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
...probably followed by a call trace that mentions your filesystem, and probably io_schedule and sync_buffer.
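To see which tasks are being reported, and how often, you can pull the task name and PID out of the kernel log. The following is a minimal sketch that assumes the message format shown above; the function names are made up for illustration, and the log text is whatever you feed it (for example the output of dmesg).

```python
import re
from collections import Counter

# Matches the hung-task line as quoted above, e.g.
#   INFO: task nfsd:2252 blocked for more than 120 seconds.
# The task name may itself contain colons (e.g. kworker/0:1), so the PID is
# taken to be the digits directly before " blocked".
HUNG_RE = re.compile(
    r"INFO: task (?P<comm>.+?):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)

def parse_hung_tasks(log_text):
    """Return a list of (command, pid, seconds) tuples found in log_text."""
    return [
        (m.group("comm"), int(m.group("pid")), int(m.group("secs")))
        for m in HUNG_RE.finditer(log_text)
    ]

def count_by_command(log_text):
    """Count how many hung-task reports each command name received."""
    return Counter(comm for comm, _pid, _secs in parse_hung_tasks(log_text))
```

Calling count_by_command() on a saved kernel log and looking at .most_common() gives a quick idea of whether it is always the same service (say, nfsd) that gets stuck.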
Don't worry about how serious such a trace looks: the message is purely informational, unless you have set the kernel.hung_task_panic sysctl (sysctl_hung_task_panic in the kernel source), in which case your host has now panicked. It is still probably something you want to do something about.
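To check how a given host is configured, you can read the two sysctls involved. This is a rough sketch, assuming the knobs live at /proc/sys/kernel/hung_task_timeout_secs and /proc/sys/kernel/hung_task_panic; the function names and the wording of the descriptions are made up for illustration.

```python
from pathlib import Path

def read_hung_task_settings(sysctl_dir="/proc/sys/kernel"):
    """Return {name: int or None} for the hung-task knobs under sysctl_dir."""
    settings = {}
    for name in ("hung_task_timeout_secs", "hung_task_panic"):
        try:
            settings[name] = int((Path(sysctl_dir) / name).read_text().strip())
        except (OSError, ValueError):
            # File missing (detector not built into this kernel) or unreadable.
            settings[name] = None
    return settings

def describe(settings):
    """Summarize what the kernel will do when a task stays blocked."""
    timeout = settings.get("hung_task_timeout_secs")
    if timeout is None:
        return "hung task detector not available"
    if timeout == 0:
        return "hung task reporting disabled"
    if settings.get("hung_task_panic"):
        return f"kernel panics after {timeout}s blocked"
    return f"warning only after {timeout}s blocked"
```

For example, print(describe(read_hung_task_settings())) on the host in question.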
The code for this sits in kernel/hung_task.c, which implements a kernel thread (khungtaskd) that detects tasks stuck in D state, i.e. uninterruptible sleep, which in practice usually means waiting on IO. A task is reported when it has sat there without being scheduled for 120 seconds (the default timeout). The code is relatively new, added somewhere around 2.6.30.
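You can take a similar look from userspace by listing what is in D state right now. The sketch below reads /proc/<pid>/stat, whose third field is the state letter; the function names are made up for illustration. Note that it only looks at processes (not the individual threads under /proc/<pid>/task), and that it is a snapshot: being in D state briefly is normal, and the kernel only complains when a task stays there past the timeout.

```python
import os

def parse_stat(stat_line):
    """Split a /proc/<pid>/stat line into (pid, comm, state)."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so cut at the first '(' and the last ')'.
    left = stat_line.index("(")
    right = stat_line.rindex(")")
    pid = int(stat_line[:left])
    comm = stat_line[left + 1:right]
    state = stat_line[right + 1:].split()[0]
    return pid, comm, state

def tasks_in_d_state(proc_dir="/proc"):
    """Return a sorted list of (pid, comm) for processes currently in D state."""
    blocked = []
    for entry in os.listdir(proc_dir):
        if not entry.isdigit():
            continue  # not a PID directory
        try:
            with open(os.path.join(proc_dir, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited in the meantime, or unexpected format
        if state == "D":
            blocked.append((pid, comm))
    return sorted(blocked)
```

Running tasks_in_d_state() a few times, some seconds apart, shows whether the same PIDs stay stuck.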
- this is probably most likely to happen for a process that was ioniced into the idle class, in which case the message indicates intended, or at least expectable, behaviour for that process under constant IO load
- if not, this can easily mean your IO system is slower than your IO use -- often specifically caused by overhead, such as that from head seeking
- tweaking the Linux IO scheduler for the device may help (see Computer hard drives#Drive_specifics)
- if your load is fairly sequential, you may get some relief from using the noop IO scheduler (instead of cfq)
- if it's relatively random, upping the queue depth may help
- if it happens nightly, it's probably load from some cron job, such as updatedb
- if it happens on a fileserver, you may want to consider spreading to more fileservers, or using a parallel filesystem
- NFS seems to be a common culprit, probably because it is good at filling the writeback cache, which implies blocking while writeback happens, and that is likely to block various other things using the same filesystem. (verify)
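Before changing the IO scheduler or queue depth mentioned above, it helps to see what a device currently uses. The sketch below assumes the usual sysfs layout, where /sys/block/<device>/queue/scheduler lists the available schedulers with the active one in square brackets, and takes "queue depth" to mean queue/nr_requests (that reading is an assumption; some devices also expose a separate device/queue_depth). The function names are made up for illustration. Scheduler names differ per kernel: newer multi-queue kernels offer names such as none and mq-deadline rather than noop and cfq.

```python
from pathlib import Path

def parse_scheduler(text):
    """Return (active, available) from the contents of queue/scheduler.

    The kernel marks the active scheduler with brackets, e.g.
    "noop deadline [cfq]". active is None if nothing is bracketed.
    """
    active = None
    available = []
    for name in text.split():
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]
            active = name
        available.append(name)
    return active, available

def queue_settings(device, sys_block="/sys/block"):
    """Read the scheduler and request queue size for one block device."""
    queue = Path(sys_block) / device / "queue"
    active, available = parse_scheduler((queue / "scheduler").read_text())
    nr_requests = int((queue / "nr_requests").read_text())
    return {"scheduler": active, "available": available, "nr_requests": nr_requests}
```

For a hypothetical device sda, queue_settings("sda") shows the current state. Changing it means writing to the same files as root, for example writing one of the listed scheduler names to /sys/block/sda/queue/scheduler, which lasts until reboot.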