Linux admin notes
| Stuff related to shell use and command line tricks |
| These are primarily notes. This is probably not going to be complete in any real sense, and exists to contain bits of useful information. |
| This is probably going to be split into various parts instead of a page, once it's decided how. |
Sort of a set:
- Linux notes
- Linux admin notes
- Specific or lowish level linux notes,
Reading (linux) system use and health
Some of these utilities are fairly standard to most unices, some of them report information specifically from recent linux kernels, and some OSes have better utilities than these.
Reading top
top is a simple way to get an overview of what your system is doing. Top is also a little hard to read, and gives you a bunch of information you probably won't find interesting until much later.
First things first, look at the output:
top - 21:18:31 up 18 days, 1:16, 13 users, load average: 2.61, 2.47, 2.01
Tasks: 124 total, 4 running, 120 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.0%us, 3.0%sy, 97.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 773500k total, 760236k used, 13264k free, 36976k buffers
Swap: 5992224k total, 689324k used, 5302900k free, 265284k cached
PID USER PR NI VIRT RES S %CPU %MEM TIME+ COMMAND
11192 root 30 5 59136 55m R 43.6 5.5 0:03.30 cc1plus
11199 root 26 5 12896 9244 R 24.8 0.9 0:00.25 cc1plus
11193 root 22 5 4972 3332 S 1.0 0.3 0:00.02 i686-pc-linux-gnu-g++
11197 root 15 0 2100 1140 R 1.0 0.1 0:00.07 top
11198 root 24 5 2144 884 S 1.0 0.1 0:00.01 i686-pc-linux-gnu-g++
1 root 15 0 1496 432 S 0.0 0.0 0:07.40 init [3]
2 root 34 19 0 0 R 0.0 0.0 0:06.24 [ksoftirqd/0]
...and so on
The header can be seen as roughly three parts: 'CPU and IO status', 'memory status' and 'swap status'.
CPU and IO
top - 21:18:31 up 18 days, 1:16, 13 users, load average: 2.61, 2.47, 2.01
Tasks: 124 total, 4 running, 120 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.0%us, 3.0%sy, 97.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Of this, the cpu usage is usually most interesting. The whole thing contains:
- Current time, time since the last restart (uptime), number of logged-in users
- Load average is how many processes were jousting for CPU-time in the last one, five and fifteen minutes (see load average for more details)
- The Tasks line is fairly self explanatory.
- CPU percentages:
- user cpu time is time spent running programs' own code (at niceness 0 or lower).
- system time is the kernel doing things programs asked it to. This has some priority, and is inherent in some jobs.
- nice cpu time is time spent running programs that have a niceness of >0 - those that will back off to let processes with lower niceness use more CPU time.
- wait time means the system is waiting for IO. If this happens the system is usually swapping like crazy, or in some cases doing a lot of work on the hard disk. In general, wait time is bad.
- hi and si: 'hard interrupts' and 'soft interrupts'. They represent driver time, networking and a few other things. They are rarely higher than a few percent.
Zombie processes are not that important. They are processes that are finished, are not using resources anymore (other than a tiny bit of memory in the process table) but have not yet been cleaned up by the process that started them.
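The percentages on the Cpu(s) line can be approximated by hand from the kernel's counters. The following is a minimal sketch, assuming a Linux /proc/stat whose first line lists cumulative ticks in the order user, nice, system, idle, iowait, irq, softirq, steal. The function name cpu_pct is made up for illustration; it takes two samples of that line and prints the share of each category in the interval between them.

```shell
# cpu_pct BEFORE AFTER
# Each argument is one "cpu ..." summary line as found in /proc/stat.
# (Hypothetical helper; field order assumed as described above.)
cpu_pct() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        NR == 1 { for (i = 2; i <= 9; i++) a[i] = $i }   # remember first sample
        NR == 2 {
            total = 0
            for (i = 2; i <= 9; i++) { d[i] = $i - a[i]; total += d[i] }
            if (total == 0) total = 1                     # avoid dividing by zero
            printf "us=%.1f ni=%.1f sy=%.1f id=%.1f wa=%.1f hi=%.1f si=%.1f st=%.1f\n",
                100 * d[2] / total, 100 * d[3] / total, 100 * d[4] / total,
                100 * d[5] / total, 100 * d[6] / total, 100 * d[7] / total,
                100 * d[8] / total, 100 * d[9] / total
        }'
}
```

To use it on a live system, take one sample with grep '^cpu ' /proc/stat, sleep for a second, take another, and pass both to cpu_pct. Note the output puts nice before system, following /proc/stat order rather than top's.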
Memory and swap
Resident
Mem: 773500k total, 760236k used, 13264k free, 36976k buffers
Swap: 5992224k total, 689324k used, 5302900k free, 265284k cached
The total should represent how much memory is usable by the system (installed RAM, minus a little that the kernel reserves for itself).
'Used' and 'free' report extremes: 'used' includes the OS cache and buffers, which are generally very good things and will back off when any program wants memory. So you generally want to ignore these two figures and look at how much leeway there probably is in the space taken by cache and buffers.
Free will generally only be high right after bootup, and right after a memory hog of a program just quit, when the OS cache hasn't seen anything to use that space for.
Swap
Mem: 773500k total, 760236k used, 13264k free, 36976k buffers
Swap: 5992224k total, 689324k used, 5302900k free, 265284k cached
Swap total reflects the collective size of your enabled swap partitions. Usage is just that, and there is always some use, usually things that have not been used, such as allocated memory that has never been accessed, or parts of large executables that have never been used (It only makes sense to swap this out, as it gives more memory to active processes). (See also swappiness)
When the used swap is high, you have an active memory hog, too little memory, or sometimes a program that likes to allocate a lot of memory without using it. You can usually tell the difference by looking at top and seeing whether the figure changes continuously, or more accurately via something like vmstat: if it reports si and so ('swapped in', 'swapped out') as 0 most of the time, the contents of swap are probably not actively used.
Continuous swapping will make your computer sluggish. If this isn't a rare occurrence, it may be wise to invest in a little more memory (or at least see whether it's not a misconfiguration of, say, a database engine; it could have been told it can use more memory than actually present).
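The si/so check described above can be automated roughly as follows. This is a sketch, not a polished tool: swap_activity is a made-up name, and it assumes vmstat's usual layout where a header line starting with "r" names the columns. It looks up the si and so columns by name and counts the samples in which either was nonzero.

```shell
# swap_activity: reads vmstat output on stdin and reports how many
# samples had any swap-in or swap-out traffic. (Hypothetical helper.)
swap_activity() {
    awk '
        $1 == "r" {                          # column header line
            for (i = 1; i <= NF; i++) {
                if ($i == "si") si = i
                if ($i == "so") so = i
            }
            next
        }
        si && $1 ~ /^[0-9]+$/ {              # a data line, once columns are known
            n++
            if ($si + $so > 0) busy++
        }
        END { printf "%d of %d samples showed swap traffic\n", busy, n }'
}
```

Typical use would be vmstat 1 10 | swap_activity. Keep in mind that the first line vmstat prints is an average since boot rather than a current sample.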
Buffers and cache
Mem: 773500k total, 760236k used, 13264k free, 36976k buffers
Swap: 5992224k total, 689324k used, 5302900k free, 265284k cached
The OS cache keeps things in memory that it figures may be used again soon, which is often largely filesystem data and metadata. Cache use counts towards basic memory use, which is why the 'free memory' figure is not that useful - cache will move out of the way for allocation, and when there is memory free, the cache will grow.
'Free memory'
There's a technical difference between memory that is immediately free and memory that is free in the sense that it is almost immediately yielded from the OS cache.
While the OS cache is a good thing to have, particularly since truly unused memory is effectively wasted, you often want to know how much memory there is free to run a program, daemon or whatnot.
You want to know, roughly, (free_memory + current_cache_memory), which you can eyeball from the figures in top, or have calculated for you by running free:
             total       used       free     shared    buffers     cached
Mem:        773500     760236      13264          0      36976     265284
-/+ buffers/cache:     457976     315524
Swap:      5992224     689324    5302900
The 'free' figure on the -/+ buffers/cache line (here 315524) is free + buffers + cached, i.e. the sum you want.
Practically, part of the cache is responsible for the snappiness of the system, for example by avoiding disk IO on consistently checked (system) files.
Usually, some part of the cache tends to be used because there hasn't yet been a reason to throw away the data. In the case above cache hasn't shrunk much because there is still free memory too. This signals either that there haven't been too many cachable requests lately - and/or that an application using ~200MB of memory has just quit.
Fairly few things take advantage of buffers. The figure tends to be quite constant, and fairly small compared to everything else.
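The (free + buffers + cache) sum can also be computed directly from the kernel's own figures. A minimal sketch, assuming the MemFree, Buffers and Cached lines of /proc/meminfo (values in kB); the function name avail_kb is made up for illustration.

```shell
# avail_kb: reads /proc/meminfo-style text on stdin and prints
# MemFree + Buffers + Cached in kB. (Hypothetical helper.)
avail_kb() {
    awk '$1 == "MemFree:" || $1 == "Buffers:" || $1 == "Cached:" { sum += $2 }
         END { print sum + 0 }'
}
```

Run it as avail_kb < /proc/meminfo. Fed the figures from the example above (13264, 36976 and 265284) it prints 315524, matching the -/+ buffers/cache line of free.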
Process lines
(Note that the columns that are shown are configurable, so there are more)
PID USER PR NI VIRT RES S %CPU %MEM TIME+ COMMAND
11192 root 30 5 59136 55m R 43.6 5.5 0:03.30 cc1plus
11199 root 26 5 12896 9244 R 24.8 0.9 0:00.25 cc1plus
11193 root 22 5 4972 3332 S 1.0 0.3 0:00.02 i686-pc-linux-gnu-g++
11197 root 15 0 2100 1140 R 1.0 0.1 0:00.07 top
11198 root 24 5 2144 884 S 1.0 0.1 0:00.01 i686-pc-linux-gnu-g++
1 root 15 0 1496 432 S 0.0 0.0 0:07.40 init [3]
2 root 34 19 0 0 R 0.0 0.0 0:06.24 [ksoftirqd/0]
- PID is process ID. Handy mostly for killing the process (possibly from within top: type 'k', then type the PID, press enter)
- the USER owning the process (usually the one that started it, except for suid-ed executables)
- PRiority: is part niceness, part automatically managed since 2.6 kernels (the scheduler guesses the difference between background and interactive processes, and allows the latter to be snappier)
- NIceness is the willingness to give up time. You can renice from within top (type 'r' and the PID)
- S is process status. The interesting ones are Running and Stopped, and also D, for uninterruptible sleep, since that usually indicates wait time. You can sort by this column to figure out what's causing wait time (descending is useful for most other columns, but sorts D to the bottom).
Memory:
- VIRT refers to the amount of memory the process can address. This includes shared memory, libraries, mmaps, and memory that was reserved but never actually allocated. This value can be huge, and regularly indicates little besides memory behaviour.
- RES is how much of a process is RESident. This is a good indication of how much it uses at all, except when your system is thrashing: swapped-out memory does not count towards this.
- SWAP (not there by default; press 'f', and 'p' in that screen): Amount of memory swapped out. Apparently VIRT=RES+SWAP(verify)
There are others, including SHR (shared pages, interesting in threaded applications), and a division like CODE (or TRS) for size of code and fixed data, DATA (or DRS) for size of instance: data and stack, and LRS (library size). (The alternative names are used by e.g. htop, which you can try if you don't like top)
Note these columns are customizable. I just removed PR, VIRT, and added SWAP.
Load average
Load average, as seen in e.g. top, is an additional measure to indicate how loaded a computer has recently been. Processes counted as active are those that are:
- actively using the CPU,
- scheduled to continue to use the CPU (on a very short term; based on the scheduler)
- uninterruptibly waiting for IO (often disk, sometimes network or other)
A load average number under 1.0 can be roughly seen as a long-term average of CPU usage.
When it's near 1.0 there is probably 100% CPU use, when it's higher, say,
load average: 1.69, 1.70, 1.73
...means there are probably two processes actively using the CPU (and likely sharing its speed), but since it's under 2, one or both are probably not active all the time. Those three figures (for 1, 5 and 15 minutes) are about the same, so the processes are probably fairly consistent at it.
Notes:
- With load factors >1, CPU usage itself may be under 100%, since sleep/waiting is keeping the CPU from being continually used
- When thrashing, the load factor may spike simply because many things are waiting to continue executing while the kernel spends a lot of IO time swapping things in and out
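Comparing the 1-minute and 15-minute figures, as done by eye above, can be sketched like this. It assumes the Linux /proc/loadavg format, where the first three fields are the 1, 5 and 15 minute averages. The function name load_trend and the 0.5 threshold are arbitrary choices for illustration.

```shell
# load_trend: reads a /proc/loadavg-style line on stdin and prints
# "rising", "falling" or "steady". (Hypothetical helper; the 0.5
# threshold is an arbitrary assumption.)
load_trend() {
    awk '{
        d = $1 - $3                      # 1-minute minus 15-minute average
        if (d > 0.5)       print "rising"
        else if (d < -0.5) print "falling"
        else               print "steady"
    }'
}
```

Run it as load_trend < /proc/loadavg. With the 1.69, 1.70, 1.73 example it reports steady; with the 2.61, 2.47, 2.01 from the top header it reports rising.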
CPU use types
Wait time
| This article/section is a stub — probably a pile of half-sorted notes and assertions some of which may well be wrong, and not verified as a whole. Feel free to add or refine. |
When processes are in a state of uninterruptible sleep (shown as status 'D' in top and often in ps), this is usually IO wait time specifically (a.k.a. 'wait time', 'iowait', 'wait'): the time a process spends twiddling its thumbs waiting for the IO system to fulfil a blocking IO request. Often the disk is the major cause of wait time, though it can also come from networking and any other device that does a lot of interaction and/or handles a lot of data.
The resource that is tied up is regularly the hard drive, mostly because you often place the largest bandwidth requirements on it. It happens most when you use the drive from more than one process at a time: multiple concurrent accesses will cause drives to spend a lot more time seeking and less time moving data.
Systems that are busy reading or writing a lot of data, or swapping, will often rattle the hard drive as fast as it'll go. This IO wait time is as unavoidable as the disk access itself is, although in some cases you may save time by doing things sequentially rather than at the same time. It can also indicate that your computer is swapping, and perhaps not caching as much as it could, both of which can often be improved by sticking in more memory.
Past observing that you have IO wait time, you may want to find out what process and what device is so busy. To see whether it's disk IO and to see which disk, you can use something like the following (linux kernels 2.6 and later):
iostat -x 1
Most of the time you can just see which drive is continually moving at least some data. In some cases, say if it's mostly fstatting, you may want to look at the await column, which shows the average amount of time that disk requests stay in queue (waiting, seeking, reading/writing and everything around it), in milliseconds. It's significant if it's much more than, perhaps, twice the drive's seek time: for example, a few hundred milliseconds or more rather than a dozen on a drive with ~7ms seek time.
(If you do not have the iostat utility, it'll probably be in a package called sysstat.)
Finding a specific process is a little harder, because the wait time may not be in the process that causes it. For example, on linux you'll often see the pdflush kernel process (which buffers and flushes data to disk) showing wait time in its process state, but the process that flushed a lot of data to it may or may not be doing so at the moment (offloading iowait is sort of the point of pdflush).
For this reason and others you'll probably want to observe the system over some (shortish) time. One example:
while true; do sleep 0.3; ps -lyfe | grep '^D'; done
(you probably want to put that into a script since it's not trivial to type) (This is similar to using watch except the screen won't be cleared)
You can try to inspect whether a process is doing a lot of IO, and/or see what it's doing, with something like strace -p pid, or sometimes more usefully, use its -c option for a count-n-summarize to see whether it's indeed potentially IO-significant calls that a process is mostly doing (such as open, read, write, recv, poll, stat, send, and others).
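Since a single look at process states is easy to misread, it can help to tally which commands show up in state D over a number of samples. A rough sketch, assuming input lines of the form "STATE PID COMMAND"; the function name tally_d is made up, and only the first word of a command name is used.

```shell
# tally_d: reads "STATE PID COMMAND" lines on stdin (several samples
# concatenated) and prints how often each command was seen in state D,
# most frequent first. (Hypothetical helper.)
tally_d() {
    awk '$1 ~ /^D/ { seen[$3]++ }
         END { for (c in seen) print seen[c], c }' | sort -rn
}
```

One way to feed it on a procps-style ps: for i in $(seq 20); do ps -eo state=,pid=,comm=; sleep 0.3; done | tally_d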
Other system / kernel details
vmstat will give you an overall view of what the kernel is up to, including memory usage, disk blocks use, swap activity, context switches, and more. For an example, see:
vmstat 1
There are some other reports it can give, see its man page.
A related, slightly nicer looking third party app you may want to look at is dstat.
Things like htop, atop and iotop can be used to show IO speed and/or totals of processes, in case you want to figure out what program is busy using IO.
Networking
- ifconfig to see (or configure) the network interfaces
- netstat will list various things about networking and can show e.g.
- open connections (no parameter)
- listens and open connections (-a)
- udp and/or tcp (-u, -t) since you often don't care about all the unix sockets
- routing table (-r) (see also route)
- interface summary (-i)
- statistics (-s)
I use -pnaut (programname, noresolve, listen+connections, udp, tcp).
- ss is similar to netstat
- see also nettop, iptraf, or any of dozens of other network monitor apps.
- arp (arp -n to avoid resolves) to see the ARP table
- route (route -n to avoid resolves) to see the routing table
- iptables to change the IP filtering/nat/mangling tables (see also iptables). Possibly interesting to you are:
- iptables-save, which produces file-saveable text (and is also handy to see all of the iptables state), and
- iptables-restore, which reinstates a file saved through iptables-save.
- iwconfig to see (or configure) the wireless network interfaces
- (Other general wireless tools: iwevent, iwspy, iwlist, iwpriv)
- (Other specific wireless tools: wlanconfig, etc.)
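As a small example of post-processing netstat output, the following sketch counts TCP sockets per state. It assumes the common net-tools layout where TCP lines start with "tcp" (or "tcp6") and the state is the sixth column; tcp_states is a made-up name.

```shell
# tcp_states: reads `netstat -tan` style output on stdin and prints
# a count per TCP state, most common first. (Hypothetical helper.)
tcp_states() {
    awk '$1 ~ /^tcp/ { n[$6]++ }
         END { for (s in n) print n[s], s }' | sort -rn
}
```

Use it as netstat -tan | tcp_states. Header lines and non-TCP sockets are skipped because their first field does not start with "tcp".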
Kernel, drivers
- lsmod lists currently loaded kernel modules (see also modprobe, insmod, rmmod)
- lspci lists PCI devices. Using -v is a little more informative. (see also setpci)
- lsusb lists USB buses and devices on them
Drives and space
It can be handy to see what drives you can get at, and how much space is free on each. Both are possible using df.
- The -h option is useful to see human-readable sizes.
- df -B MiB (or MB) makes df report everything in megabytes, which can be useful when you're watching for usage differences on the order of megabytes per second (e.g. watch -d df -B MiB)
To see where in the directory tree the usage is, use du, detailed elsewhere.
To see an exhaustive list of things that can be mounted, see /etc/fstab (see also fstab).
/etc/mtab lists things that are mounted, more completely than df does, because df reports only things meant for storage, which excludes things like proc, udev/devfs, and usbfs.
To see details about the swap partitions in use, cat /proc/swaps, which is also what swapon -s does.
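If you want to be warned about filesystems that are nearly full rather than read df by eye, something like the following may do. It is a sketch that assumes the POSIX output format of df -P (six columns, with the use percentage fifth and the mount point sixth) and mount points without spaces; full_mounts is a made-up name.

```shell
# full_mounts LIMIT: reads `df -P` output on stdin and prints the mount
# point and use percentage of every filesystem at or above LIMIT percent.
# (Hypothetical helper; mount points containing spaces are not handled.)
full_mounts() {
    awk -v limit="$1" 'NR > 1 {
        pct = $5
        sub(/%/, "", pct)                # "95%" -> "95"
        if (pct + 0 >= limit) print $6, $5
    }'
}
```

For example, df -P | full_mounts 90 lists everything that is at least 90% full.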
Disk stuff
The hardcore, oldschool way of partitioning is fdisk.
The shinier, easier way is something like gparted (useful on a livecd / USB stick), which can also do things like resizing partitions with data still on them.
filesystems
Specific filesystems have specific upsides:
- FAT32 can be read and written by windows, linux, and OSX, but doesn't support storing files larger than 4GB
- ext2 is simple, and fast for various common tasks (but is bad at certain things, such as handling directories with many items)
- ReiserFS, XFS, ext3 and JFS all journal filesystem metadata, meaning that when corruption occurs the filesystem structure is usually easily and automatically recoverable (though the data isn't - data journaling is slow and space-hungry).
- NTFS can now be read-written by linux and windows. Apparently it's sort of getting there in OSX(verify).
It seems reiserfs deals better with a lot of small files and XFS better with large ones -- or was that the other way around?(verify). Sometimes some specific operations are much faster for some filesystem, though that usually doesn't matter much to real-life practice.
- Sun's ZFS looks very useful indeed, with RAID5-like features built in, but isn't available for linux (yet?).
Swap
Total effective swap and usage can be shown using free, top, and similar utilities.
To view what swap is currently being used:
swapon -s
(basically reports /proc/swaps)
You format a partition for swap use with something like:
mkswap /dev/sda2
Note that you can also create a swap file (instead of partition).
Adding it in your fstab is usually easiest. Swap is usually distributed (to get little wait time while swapping), but you can give individual swaps priorities (0-32k, higher is more), for example to
If you want to manually add or remove swap, it's:
swapon /dev/sda2
swapoff /dev/sda2
(note: swapctl in some systems)
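For the swap file variant mentioned above, the file first has to be fully allocated and made private. A minimal sketch of that preparation step follows; the name make_swapfile and the path used in the comments are only examples, and the mkswap/swapon steps need root so they are shown as comments rather than run.

```shell
# make_swapfile PATH SIZE_MB: create a zero-filled file of SIZE_MB
# megabytes that only its owner can read and write.
# (Hypothetical helper; the path and size are up to you.)
make_swapfile() {
    dd if=/dev/zero of="$1" bs=1048576 count="$2" 2>/dev/null &&
    chmod 600 "$1"
}

# Afterwards, as root (example path, not executed here):
#   mkswap /swapfile
#   swapon /swapfile
```

The file is written out with dd rather than created sparse, since swap files are generally expected not to contain holes.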
MBR backup and restore
When you install windows, it overwrites the MBR without asking. If you had a multiboot menu there, tough luck. However, it's not too hard to back it up before and restore it afterwards:
- boot in linux
- Back up the MBR (Master Boot Record), probably to somewhere on your linux disk where you can easily find it:
dd if=/dev/hda of=/mbr bs=512 count=1
- (install windows)
- Boot from a liveCD (knoppix, ubuntu, whatever)
- Mount the linux partition you saved the MBR to
- Restore the MBR:
dd if=/mnt/mylinux/mbr of=/dev/hda bs=512 count=1
Note/warning: You read and write the MBR from the drive device, not one of the partitions on it.
Note that if you didn't back up the MBR, tools like ms-sys let you write various MBRs (mostly windows).
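One thing to be aware of with the 512-byte copy above: in the classic MBR layout the first 446 bytes are boot code, followed by the 64-byte partition table and a 2-byte signature. Restoring all 512 bytes therefore also puts back the old partition table, which is not what you want if the partitioning changed in between. A sketch of a boot-code-only variant, with made-up function names and the device passed in as a parameter:

```shell
# backup_bootcode DEVICE OUTFILE: save only the 446-byte boot code area.
backup_bootcode() {
    dd if="$1" of="$2" bs=446 count=1 2>/dev/null
}

# restore_bootcode INFILE DEVICE: write the boot code back, leaving the
# partition table and signature (bytes 446-511) as they are.
# conv=notrunc only matters when DEVICE is a regular image file.
restore_bootcode() {
    dd if="$1" of="$2" bs=446 count=1 conv=notrunc 2>/dev/null
}
```

Usage would be along the lines of backup_bootcode /dev/hda /bootcode and later restore_bootcode /mnt/mylinux/bootcode /dev/hda. As with the commands above, double-check the device name before writing to it.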
Perhaps-useful-to-knows
- Deleting an open file removes only the directory entry. The data is not freed on disk until the last process that has a handle to it closes that handle. This can be useful to know in the context of large files, log files, and temporary files (since you can create, open, delete a file, and the file handle you still have open now points to an anonymous file on disk that no other process can open()).
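The create, open, delete sequence can be tried from a shell. This is a small demonstration under the assumption of a POSIX shell and mktemp; the function name unlinked_demo and the choice of file descriptors 3 and 4 are arbitrary.

```shell
# unlinked_demo: write to and read from a file after it has been deleted.
# (Hypothetical demonstration; fd numbers 3 and 4 are arbitrary.)
unlinked_demo() {
    tmp=$(mktemp) || return 1
    exec 3>"$tmp" 4<"$tmp"   # one handle for writing, one for reading
    rm "$tmp"                # removes the directory entry only
    echo "still here" >&3    # the write still lands in the (now nameless) file
    read -r line <&4         # and can be read back through the other handle
    exec 3>&- 4<&-           # closing the last handles lets the space be freed
    [ ! -e "$tmp" ] && echo "$line"
}
```

Calling unlinked_demo prints the line that was written after the rm, while the path itself no longer exists.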
Potentially useful tools
| This article/section is a stub — probably a pile of half-sorted notes and assertions some of which may well be wrong, and not verified as a whole. Feel free to add or refine. |
volume management
LVM is basically an organizational tool, a logical layer on top of physical disks that makes it easy to:
- resize logical volumes
- combine disks into a single storage space
- move space between volumes
- take snapshots for (on-site) backup (that work in a diff-like way so need not take much space (!))
It is particularly useful for disk farms, as well as administrators that wish to divide areas on the filesystem into various physical disks (e.g. one volume for /var, one for /home), because storage requirements will likely grow and shift over time.
LVM allows block-based copy-on-write snapshots, so can be used to back up large directories (e.g. database data) but in fact only keep what conceptually are binary diffs. Snapshots are allocated as you create them, and act as mountable filesystems (are not files).
LVM requires kernel support and an administrator that understands it. Note also that the logical volumes won't be readable by OSes that don't understand LVM (...though that matters to personal installations, not servers).
md: software RAID
One argument in the hardware RAID versus software RAID discussion is that once a hardware controller fails, you may have lost all your data, not because another controller won't be able to read it, but because you need to get the same brand and model, since the actual encodings used are proprietary, different per company, and change over time.
That makes it somewhat safer to use software RAID, since it is much easier to duplicate the setup. This comes at the cost of doing all the data processing on the main CPU - which back with PCI systems was potentially slower, since bus saturation with RAID was pretty easy.
Recorded statistics
See Network tools. The somewhat wider tools like cacti also store system statistics.

