Linux admin notes - unsorted and muck

semaphores

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)

SysV semaphores and POSIX semaphores are kernel-managed objects that threaded and multi-process programs use for synchronization, some resource management, IPC, and such.


The below is about SysV semaphores, not about POSIX (named) semaphores (verify)


Trouble

SysV semaphores aren't really owned by a process, which is why ipcs -p doesn't report them, and why it is not hard for programs to leak semaphores.

Leaking here means that they stick around after the originating program closes or crashes, which can lead to the kernel eventually running out and denying other programs semaphores.


Usually, problems related to semaphores/mutexes are related to the semget() call failing.

You may see

  • "No space left on device", which is the standard error string for ENOSPC, which SysV IPC seems to reuse for "can't give out any more"
  • "Invalid argument"
  • "Identifier removed" (an apache message?(verify))
  • "couldn't grab the accept mutex" (an apache message?(verify))

Inspecting and removing

For SysV semaphores (which are considered old style, relative to the new POSIX semaphores)


Probably most interesting for inspection is

ipcs -s -t

The -t isn't necessary, but seeing user and time of latest use is often useful. If you know for sure you can delete them, you can remove semaphores using:

ipcrm -s semid


Some interesting commands (note: the report format/details can vary between systems):

  • ipcs -s
    - inspect current use of semaphores (list)
    • ipcs -s -c
      - mention creators (user/group)
    • ipcs -s -t
      - mention time of last use
    • ipcs -s -i semid
      - more details on a particular semaphore set
  • ipcs -s -u
    - semaphore use (just a summary count)
  • ipcs -s -l
    - report limits
    • the same values as in /proc/sys/kernel/sem
    • and fetchable/settable via /sbin/sysctl (listed as kernel.sem in /sbin/sysctl -a)



Because we had a structural problem with semaphore leaks, we made a script that removes all of the current user's semaphores:

# removes all semaphores belonging to the current user
# or, if root, all of them -- which you probably do NOT want
# (awk splits on runs of whitespace, which is less fragile than cut -d ' ')
for id in $(ipcs -s | awk '$2 ~ /^[0-9]+$/ { print $2 }'); do
   echo "Deleting semid: $id"
   ipcrm -s "$id"
done

You do NOT want to run this as root, because that will remove all semaphores, which will probably break a few services.


In our case, switching to the offending users (using su) and cleaning all of theirs was easiest.

I later made a script that parsed the output of ipcs -s -t and threw away only old semaphores and skipped any owned by non-users (like apache).
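
A minimal sketch of the per-user filtering idea, as an alternative to su-ing to each offending user. It assumes the util-linux column layout of ipcs -s (key, semid, owner, perms, nsems). The function name semids_owned_by is made up for illustration, and since the report format can vary between systems, check your own ipcs output before trusting the column numbers.

```shell
# Print the semids owned by one user, reading `ipcs -s` output on stdin.
# Assumes the util-linux layout: key semid owner perms nsems.
semids_owned_by() {
    awk -v user="$1" '$3 == user && $2 ~ /^[0-9]+$/ { print $2 }'
}

# Dry run: show what would be removed for a (placeholder) user name.
# ipcs -s | semids_owned_by someuser | while read -r id; do
#     echo "would run: ipcrm -s $id"
# done
```

Keeping the removal itself as a separate, explicit step makes it easier to eyeball the list before anything is deleted.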


Semaphore limit details

Very short version: If you're running out of semaphore sets, increase the last number. Unless you've got programs that use a lot of semaphores (most just use a few), the other values are fine.

I decided, fairly arbitrarily, on

250 32000 64 8192


Setting

You can read/set using

sysctl kernel.sem
sysctl -w kernel.sem="250 32000 64 8192"

For a persistent change, you'll want to add/edit /etc/sysctl.conf to mention something like:

kernel.sem = 250 32000 64 8192


Meaning of the values

When tweaking these, it helps to know some details.

Short story: If you need more semaphores, you usually want to increase the number of sets (SEMMNI), and probably scale the system-wide maximum (SEMMNS) along to avoid the case where it could hand out more sets but would reject allocations based on the system max.


The four values are, respectively:

  • SEMMSL – maximum number of semaphores per set or array
    • Each program tends to want more than one semaphore, so the kernel hands out a bunch of them at the same time (called a set or array). Most programs don't use many, so you rarely need to increase this.
    • The max is 65535 (verify)
  • SEMMNS – maximum number of semaphores system-wide.
    • You could say that making this SEMMSL*SEMMNI is most sensible, but since very few processes will use anywhere near the maximum per set, you can often set this to a lowish percentage of that number. In practice, actively using more than 10K is a huge number and typically points to some program seriously misbehaving.
  • SEMOPM – maximum number of operations allowed for one semop call
    • It can make sense to make this the same order of magnitude as SEMMSL, but it only really matters for programs heavily using semaphores (verify)
  • SEMMNI – maximum number of semaphore sets to hand out
    • Probably the most interesting value to increase. A value like 128 can prove a bit conservative, a few hundred more is usually enough, a few thousand should fit most less-usual needs.
    • The max is 65535 (verify)
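
To keep the field order straight, here is a small sketch that labels the four values and computes the SEMMSL*SEMMNI ceiling mentioned above. The function name describe_sem is hypothetical; the input is just the four numbers in the order the kernel reports them.

```shell
# Label the four kernel.sem fields and show the theoretical ceiling
# SEMMSL * SEMMNI that SEMMNS can be compared against.
describe_sem() {
    # Word splitting on $1 is intentional: it holds four whitespace-separated numbers.
    set -- $1
    echo "SEMMSL=$1 SEMMNS=$2 SEMOPM=$3 SEMMNI=$4"
    echo "SEMMSL*SEMMNI=$(( $1 * $4 ))"
}

# On a live system: describe_sem "$(cat /proc/sys/kernel/sem)"
describe_sem "250 32000 64 8192"
```

For the example values this shows that SEMMNS (32000) is far below the ceiling of 2048000, which is the "lowish percentage" situation described in the list.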



An old linux default seems to have been

250     32000   32      128

My home system's defaults seem to have been:

250     32000   32     4096

Other notes

What is the cost?

Generally nothing to worry about.(verify)


Files that are part of boot



init is the root of all processes, started by the kernel (often from /sbin/init). (This structure/convention is taken from SysV. It is not the only one around.)


Init delegates most of its work to other scripts, most of them via runlevel changes (there are other things it reacts to, such as Ctrl-Alt-Del, any UPS interaction, and more).


Init behaves according to /etc/inittab


It seems that inittab typically ties runlevel switches to /etc/init.d/rc, with the actual runlevel as its argument. Any further conventions tend to be part of that script.

Such further conventions include:

  • looking for a directory like /etc/rc3.d (whatever the runlevel argument is)
    • look for K* files for services to kill in this runlevel (sorted by the next two digits in the filename), then...
    • look for S* files for services to start in this runlevel
  • rc[2345].d scripts (so multi-user runlevels) may run /etc/rc.local at the end
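
The K-then-S ordering can be sketched as follows. This only prints the order and does not run anything; the function name rc_plan is made up, and real rc scripts do more (for example skipping services that are already in the right state).

```shell
# Print the order in which an rc-style script would handle one runlevel
# directory: K* (kill) links first, then S* (start) links, each sorted by name.
rc_plan() {
    dir=$1
    for link in "$dir"/K*; do
        if [ -e "$link" ]; then echo "stop  ${link##*/}"; fi
    done
    for link in "$dir"/S*; do
        if [ -e "$link" ]; then echo "start ${link##*/}"; fi
    done
}

# Example: rc_plan /etc/rc3.d
```

The sorting comes for free from shell glob expansion, which is why the two digits after K or S control the order.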

Also seen on some systems:

  • /etc/rc.sysinit
  • /etc/rc.single (stuff for runlevel 1, a.k.a. S)
  • /etc/rc.multi (stuff for regular runlevels, usually 2 through 5)
  • /etc/rc.shutdown, for halts (0) and reboots (6)


rc scripts are good for some early system setup, though keep in mind that modern subsystems like udev are the proper place for most device configuration, particularly if they are hot-pluggable by nature.




Reading in passwords

Reading in passwords is usually done by disabling stdin echo, waiting a bit and/or testing whether that worked, then reading, and re-enabling echo.

In scripts, you can use:

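read -r -s -p "Password: " VARIABLE

Where -s turns off echoing of what is typed, -r keeps backslashes in the input literal, and -p "Password: " is slightly shorter than also having an echo -n "Password: ". Note that -s and -p are bash features, not POSIX sh.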
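
For shells without read -s, the disable-echo, read, re-enable sequence can be done by hand with stty. The following is a sketch under that assumption; read_secret and the _rs_ variable names are made up for illustration, and it does not restore echo if the script is interrupted mid-read.

```shell
# Read one line into the variable named by $1 without echoing it.
# Echo is only touched when stdin is a terminal, so piped input still works.
read_secret() {
    _rs_tty=no
    if [ -t 0 ]; then
        _rs_tty=yes
        _rs_saved=$(stty -g)      # remember the current terminal settings
        stty -echo                # disable echo
    fi
    printf 'Password: ' >&2
    IFS= read -r "$1"
    _rs_status=$?
    if [ "$_rs_tty" = yes ]; then
        stty "$_rs_saved"         # re-enable echo
        printf '\n' >&2           # the Enter keypress was not echoed
    fi
    return "$_rs_status"
}

# Usage: read_secret PASSWORD && echo "got ${#PASSWORD} characters"
```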

Modules




changing hostname



Kernel panic diagnosis

SysRq



SysRq is a key combination that the linux kernel intercepts at any time (even after a panic).


It's probably most useful when the system appears frozen. Depending on the type of lockup, you can sometimes recover, or tell the filesystem to sync so that it's left in a cleaner state before you take off the power. I've also had it be useful where a RAID controller panicked and system commands wouldn't even listen to a reboot anymore.


It needs to be enabled to work, and can be disabled. The current setting can be read from /proc/sys/kernel/sysrq.
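
A small sketch for interpreting that setting: 0 means disabled, 1 means all functions are enabled, and larger values are a bitmask of allowed functions. The function name sysrq_describe is hypothetical, and it does not decode the individual bits.

```shell
# Describe a value read from /proc/sys/kernel/sysrq.
sysrq_describe() {
    case "$1" in
        0) echo "disabled" ;;
        1) echo "all functions enabled" ;;
        *) echo "bitmask $1: only some functions enabled" ;;
    esac
}

# On a live system: sysrq_describe "$(cat /proc/sys/kernel/sysrq)"
```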

SysRq is usually on the same physical button as PrintScreen.

The basic key combo (note that sysrq is used as a modifier):

  • in text mode: Alt+SysRq command
  • in graphical mode: Ctrl+Alt+SysRq command (to avoid it causing a print-screen)

Read this if you don't have a SysRq key

There is also a way to do it without keyboard access, e.g. from ssh:

echo 1 > /proc/sys/kernel/sysrq
echo b > /proc/sysrq-trigger


Command is a letter, given here assuming a QWERTY keyboard (it is the position/keycode that matters, not the letter printed on the physical key). Some of them:

  • r – keyboard to xlate mode (in case it was raw, as e.g. X uses). Handy when X or another raw-mode program has crashed and left the keyboard unusable.

If you think you can recover it (e.g. if it's thrashing but not frozen)

  • f -- force oom_kill
  • 0 through 9 - set log level
  • t -- print tasks
  • n -- reset niceness of high-priority tasks

If you do think it's really frozen:

  • s -- sync (write contents of the disk cache to disk)
  • e – sends SIGTERM to all processes except init.
  • i – sends SIGKILL to all processes except init
  • u – remounts all the filesystems readonly (basically a measure to help you reboot safely)
  • b -- reboot
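
When doing this from ssh rather than the keyboard, the sync, remount, reboot sequence can be sketched as below. The names SYSRQ_TRIGGER, sysrq_send and sysrq_sync_remount_reboot are made up; the trigger path is overridable so the sequence can be rehearsed against a scratch file, and the two-second pause is an arbitrary assumption.

```shell
# Where to write SysRq command letters; overridable so the sequence can be
# rehearsed against a scratch file instead of the real trigger.
SYSRQ_TRIGGER=${SYSRQ_TRIGGER:-/proc/sysrq-trigger}

sysrq_send() {
    printf '%s\n' "$1" > "$SYSRQ_TRIGGER"
}

# Sync, remount read-only, reboot. This reboots immediately when pointed
# at the real trigger as root, so it is only defined here, not called.
sysrq_sync_remount_reboot() {
    for key in s u b; do
        sysrq_send "$key"
        sleep 2    # arbitrary pause to let sync/remount make progress
    done
}
```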


login messages

message of the day

Usually read from /etc/motd

May be re-written automatically at boot, possibly more often, and during some updates.

May have some more management. For example, ubuntu has /etc/update-motd.d/ which it uses to generate /etc/motd. You can tweak it to your own needs, or disable it.


Last login

...comes from SSH itself. It's somewhat useful for security, but if you want it gone, you can configure sshd with:

PrintLastLog no


Random data


The entropy pool is a bunch of bits that are impossible to predict by an attacker, because they are collected from the computer's environment. This makes it useful for a number of cryptographic purposes.


The entropy pool is managed by the kernel, and updated when it finds data (e.g. based on keyboard hits, mouse use, IRQ timings) that passes proper-randomness tests.

Since the pool values quality over quantity and there are few good sources that are available on all computers, you should never count on the entropy pool being fast.

...or large: the entropy pool is typically at least a few hundred bits large, up to a few thousand when there are good sources of randomness.

On headless servers, the entropy pool may replenish much more slowly than on computers you sit at. To the point where keypair generation may be very slow or fail if you do it on a server.



Related devices

  • /dev/random
    • On linux this seems to basically be entropy pool data (but may also be a pseudo-random number generator based on it).
    • Which is slow, good for many crypto purposes, will block until it can provide good enough random data
    • on BSD, this is basically an urandom-like implementation (see below)


  • /dev/urandom
    • Basically a pseudo-random number generator seeded by the entropy pool
    • ...so much faster source than the entropy pool itself (and better than a manually seeded or time-seeded PRNG(verify))
    • fastish, but not written for throughput - for example, if you want to wipe a drive using dd if=/dev/urandom, you may find that it's doing so at ~10MB/s, order of magnitude


If you just want data that avoids short-term patterns (e.g. testing hashes, diffs, various compression algorithms, whatnot), you could read a large enough (say, 64MB) chunk of data from /dev/urandom (ought to take a few seconds at most) and repeat it, or perhaps use openssl rand (verify)
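
The read-a-chunk-and-repeat idea can be sketched like this. The function name make_noise_file and its argument order are made up for illustration; the result is only meant as test data, not as anything cryptographically useful.

```shell
# Build a test file by reading one chunk from /dev/urandom and repeating it.
# Arguments: output file, chunk size in KiB, number of repetitions.
make_noise_file() {
    out=$1 chunk_kib=$2 repeats=$3
    chunk=$(mktemp) || return 1
    dd if=/dev/urandom of="$chunk" bs=1024 count="$chunk_kib" 2>/dev/null
    : > "$out"                      # start with an empty output file
    i=0
    while [ "$i" -lt "$repeats" ]; do
        cat "$chunk" >> "$out"
        i=$((i + 1))
    done
    rm -f "$chunk"
}

# Example: a 64 MiB chunk repeated 16 times gives 1 GiB:
# make_noise_file /tmp/noise.bin 65536 16
```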


If you need a better supply of randomness for your entropy pool, there are some things you can try. There is software that tries to get randomness from sound card noise, from clock bias differences, from sleep-time errors, from fancy expensive entropy generating hardware, etc.



Signal handling



Keep in mind:

  • Signal names are more meaningful than their numbers. Not everything uses the same enumeration (e.g. Solaris)
  • Kernel/OSes may also differ in
    • default handlers
    • which signals can be caught or are unconditional
  • Unless specified otherwise, any signal may
    • be ignored, i.e. discarded
    • be blocked, telling the system to keep it and deliver it later (in most cases, you would only block for a short while, e.g. while doing signal-related housekeeping)
    • have its handler replaced
    • have a handler in addition to the default one
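
In shell scripts, ignoring a signal and replacing its handler are both done with trap. A minimal sketch, run in a child shell so that the signals it sends to itself cannot affect the calling shell (the function name demo_trap is made up):

```shell
# Demonstrate ignoring one signal and replacing the handler of another.
demo_trap() {
    sh -c '
        trap "echo cleaning up; exit 0" TERM   # replace the default action
        trap "" HUP                            # ignore, which is what nohup arranges
        kill -HUP $$                           # discarded
        kill -TERM $$                          # runs the TERM handler above
        echo "not reached"
    '
}

demo_trap
```

The handler runs as soon as the kill command that raised the signal has finished, so the final echo is never executed.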



Terminal, job management

  • SIGHUP, 'Hangup'
    • sent automatically if a hangup/disconnect is detected on the controlling terminal, or a controlling process dies
    • default action is to terminate the process. Ignoring the signal is what nohup does
  • SIGCHLD
    • sent by a process to its parent when it terminates, is stopped, or resumed
    • default is to ignore it


  • SIGINT, interrupt
    • sent when user sends interrupt - usually meaning the Ctrl-C key combo
  • SIGQUIT
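    • sent when user sends quit signal, usually the Ctrl-\ key combo; the default action terminates the process and dumps core. (Ctrl-D is not a signal: it makes the terminal report end-of-file on stdin.)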
  • SIGKILL
    • quit without clean-up operations
    • a process in uninterruptible sleep will not react until it leaves that state
    • a zombie does not react to KILL; it must be reaped by its parent
    • cannot be blocked, handled or ignored
  • SIGTERM, 'terminate',
    • asks program to quit cleanly. That is, programs can register this handler to do exactly that
    • default signal sent by kill, killall
    • rebooting will often move from SIGTERM to SIGKILL if TERM doesn't work quickly enough
  • SIGABRT
    • typically sent to itself (possibly via abort()) as emergency termination.
    • Mostly just makes it easier to jump to cleanup/kill code from anywhere in your program


  • SIGALRM, alarm clock signals
    • used for timers; you can ask for this signal to be delivered in the future, using alarm() or setitimer()
    • see also SIGVTALRM, under deprecated


  • SIGTSTP, 'terminal stop' (Optional in POSIX)
    • default handler suspends the process: its state is kept so that a SIGCONT can continue it, but it gets no CPU until then
    • May also be caught by default SIGTTIN and SIGTTOU handlers
    • Keep in mind that once suspended, a process will only handle SIGCONT or SIGKILL (verify)
  • SIGSTOP (Optional in POSIX)
    • Like SIGTSTP, but handler cannot be changed (verify)
    • Keep in mind that once suspended, a process will only handle SIGCONT or SIGKILL (verify)
  • SIGTTIN (Optional in POSIX)
    • sent when a program tries to read from stdin but is not part of the foreground group
    • default handler sends SIGTSTP to self? (or just has that same effect?) (verify)
  • SIGTTOU (Optional in POSIX)
    • sent when a program tries to write to terminal but is not part of the foreground group
    • default handler sends SIGTSTP to self? (or just has that same effect?) (verify)
  • SIGCONT (Optional in POSIX)
    • sent to continue a process suspended via SIGTSTP, SIGSTOP, SIGTTIN, SIGTTOU

Note on foreground groups: One terminal can serve multiple process groups, but only one process group is in the foreground at a time.




Memory:

  • SIGSEGV
    • sent by the kernel when it notices a memory access that was not part of the process' mapped space. Typically a bug in the offending process's code, related to memory allocation or pointer abuse.
  • SIGBUS
    • sent by the kernel when an address is mapped but does not translate to a valid part of memory hardware (or mapped IO or such)
    • similar to SIGSEGV, but lower-level. May e.g. happen when the disk underlying swap has failed.(verify)


IO:

  • SIGPIPE
    • A process is sent this when it writes into a pipe that is closed. In other words, if two processes are connected via a pipe and the consumer process dies, the producer process is sent this.
    • Source of the "Broken pipe" message
  • See also SIGIO and SIGURG in the non-POSIX list


Others

  • Linux reserves SIGRTMIN through SIGRTMAX for real-time (actual value of both seems to vary, particularly SIGRTMIN)
  • SIGUSR1 and SIGUSR2 - usable by users
  • SIGFPE, 'floating point error'
    • usually caught and handled internally
  • SIGILL - illegal instruction. Not cleared when caught(verify)
  • SIGTRAP - trace trap, mostly used in debuggers. (Optional in POSIX)
  • SIGSYS - bad arguments to system call. (Optional in POSIX)
  • SIGEMT, 'emulate'. (Optional in POSIX)
  • SIGCLD - Child status change. (Optional in POSIX)


Not POSIX, or deprecated in POSIX:

  • SIGWINCH
    • used to signal a window size change. Apparently not used much, but resizable terminals may send it.
  • SIGIO, a.k.a. SIGPOLL(verify)
    • Only sent if O_ASYNC is used on a file descriptor
    • notification about a file descriptor: that it is ready to receive, that it has new data, or that there is an error
  • SIGURG
    • sent when urgent data arrives on a file descriptor. Mostly used for out-of-band data
    • default handler ignores(verify)
  • SIGPWR
    • used to signal that we're on short-term emergency power (e.g. triggered by UPS software)
    • Useful to signal daemons to clean up (and possibly shut down)
  • SIGVTALRM, 'virtual timer'
    • like SIGALRM, but sent some amount of CPU time in the future (instead of wall-clock time), and excluding system code(verify)
    • see also SIGALRM, above
  • SIGPROF - profiling timer expired.
    • like SIGVTALRM, but counts CPU time, including system code (verify)
  • SIGXCPU - exceeded CPU limit (resource limiting)
  • SIGXFSZ - exceeded file size limit (resource limiting)
  • SIGCANCEL - Seems to be used internally in pthreads, to help cancel threads.
  • SIGLOST - signals a resource (e.g. record-lock) is lost (meaning?) (verify)
  • SIGLWP - used by threading (verify)
  • SIGFREEZE - Possibly solaris-specific? (verify)
  • SIGTHAW - seems meant as "we have just resumed the system, this is your chance to do housekeeping before resuming regular operations" Possibly solaris-specific? (verify)
  • SIGWAITING - Possibly solaris-specific?


Unsorted

  • SIGTHR - thread interrupt (verify)
  • SIGINFO



http://www.lindevdoc.org/wiki/Category:Signals
http://www.tutorialspoint.com/unix/unix-signals-traps.htm




Loop devices

A loop device uses a file and presents a block device, which is something you need when you want to mount a filesystem that is stored in a file (typically an encrypted filesystem-in-a-file, or images of a hard drive, CD, DVD, floppy, or such).


Loop devices on various systems:

  • In linux, the devices are /dev/loop0 and so on. Management is done via losetup (util-linux package)
  • In BSD and many derivatives, the loop device is called a virtual node device and often at /dev/vnd0, /dev/rvnd0 or /dev/svnd0. Management is done via vnconfig.
    • except for FreeBSD, which merged the functionality into the memory disk driver (md). Management is done via mdconfig (verify)
  • Solaris calls it lofi, and places the devices at /dev/lofi/1 and so on. Management via lofiadm
  • OSX internalizes the functionality. You don't have to manage it.
  • Windows doesn't natively support this - though there are many available programs for the case of CD/DVD images


In linux

The least-bother way to use one is to let mount do the work. You may have at some point typed:

mount -o loop -t iso9660 image.iso /mnt/myimage


The loop option, when given no arguments, looks for a currently unused /dev/loop device and uses that. You could also specify one explicitly, for example:

mount -o loop=/dev/loop3 -t iso9660 image.iso /mnt/myimage


Even with the last option, it still creates the loop's mapping for you. If you insist on doing this manually

  • Associate loop0 with a specific image
losetup /dev/loop0 /images/image.iso
  • Now /dev/loop0 acts like a block device, and is backed by that iso, so you can mount it:
mount /dev/loop0 /mnt/isoimage

Once you're done, you'll also want to detach the file from the loop device

umount /mnt/isoimage
losetup -d /dev/loop0
# OR make umount do it: (though this is apparently done automatically since 2.6.25(verify))
umount -d /mnt/isoimage
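
If you do go the manual route in a script, it helps to make sure the loop device is detached again when the mount fails. A sketch under that assumption: the helper names run, loop_mount and loop_umount and the DRY_RUN variable are made up, and the read-only mount is a choice that suits ISO images. With DRY_RUN set, the commands are printed instead of executed.

```shell
# Print commands instead of running them when DRY_RUN is non-empty.
run() {
    if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi
}

# Attach image $2 to loop device $1, mount it read-only on $3,
# and detach again if the mount fails.
loop_mount() {
    run losetup "$1" "$2" || return 1
    if ! run mount -o ro "$1" "$3"; then
        run losetup -d "$1"
        return 1
    fi
}

# Undo loop_mount: unmount $2, then free loop device $1.
loop_umount() {
    run umount "$2" && run losetup -d "$1"
}

# Example (as root): loop_mount /dev/loop0 /images/image.iso /mnt/isoimage
```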

It gets more complex when you want to have an encrypted loopback (to store things in an encrypted filesystem).


See also:


Partitions in image file

Wacom tablet notes

Your distro probably has a package that has the drivers. If not, you want http://linuxwacom.sourceforge.net/


X configuration

You still have to configure X yourself. See:

Note that aside from the global config, you can also use wacomcpl or xsetwacom to set behaviour details on general startup, and more interestingly, each user's login.


Sections

cursor: Mouse

To use it as a mouse, use the following:

Add to ServerLayout (regardless of input type):

InputDevice    "cursor1"    "SendCoreEvents"


Now, for USB devices add:

Section "InputDevice"
  Driver        "wacom"
  Identifier    "cursor1"
  Option        "Type"       "cursor"
  Option        "USB"        "on"                  
  Option        "Device"     "/dev/input/event0"  
                   # possibly /dev/input/wacom
EndSection


For the older serial devices, add:

Section "InputDevice"
   Driver        "wacom"
   Identifier    "cursor1"
   Option        "Type"       "cursor"
   Option        "Device"     "/dev/ttyS0"          
                              # (...you probably only have one serial port) 
 EndSection


You also need to tell X about using this mouse in the ServerLayout section. Usually you will have

  • a regular mouse as a "CorePointer"
  • a (few) wacom section(s) configured to "SendCoreEvents" so that it will also act like a mouse.

Even if you don't have another mouse configured, don't set the wacom as a CorePointer - you'll lose pressure sensitivity and all other wacom-specific features (e.g. xidump cursor1, xsetwacom set cursor1 ...). You'll instead get the error:

X Error: 170 BadDevice, invalid or uninitialized input device


To use as more than a mouse, add more sections:

  • with identical device settings
  • with another Type (stylus, eraser, and possibly pad),
  • further behaviour options (most may differ per Type), and
  • mention it as a SendCoreEvents type InputDevice

Note that programs must know about and claim devices to react to pressure and such. Gimp is one of the relatively few that does. xidump is another, which is a test app for the X configuration.

main pen

Option "Type" "stylus"
InputDevice    "stylus"    "SendCoreEvents"

back-side eraser

Option "Type" "eraser"
InputDevice    "eraser"    "SendCoreEvents"

Expresskeys and strip

To get events for the keys and strip on the Intuos3, Cintiq 21UX, Graphire4, and Bamboo, add:

Option  "Type" "pad"
InputDevice    "pad"

No SendCoreEvents, even for the strip, apparently(verify).

Behaviour

Rotate

Option "Rotate" "NONE"       # default
Option "Rotate" "CW"         # 90 degrees right
Option "Rotate" "CCW"        # 90 degrees left
Option "Rotate" "HALF"       # 180 degrees

To keep movement in X and Y direction proportional in Absolute, you need to set the active area to the screen shape:

Option "KeepShape" "On"

Relative or absolute mode

Option "Mode" "Absolute"     #or "Relative"

Defaults:

  • cursor is Relative
  • stylus and eraser are Absolute


Relative mode speed can be multiplied (default 1.0(verify))

Option "Speed" "0.7"


Pressure

Button response threshold:

Option "Threshold" "number"

Default is MaxPressure*3/50.


Pen response curve:

Option "PressCurve" "0,0,100,100"   # linear, default


Buttons

To report presses as different buttons:

Option "ButtonN" "action" # where N is the button number, and action is 0 (disabled), a button number, or a keyboard event

For example, my mouse's scroll is inverted, so:


Report buttons only when tip is pressed:

Option "TPCButton" "on"


Pad and Screen mapping

Active area:

Option "TopX" "number"
Option "TopY" "number"
Option "BottomX" "number"
Option "BottomY" "number"


Multiple pen associations

Compiling a kernel


Manual work

You usually have one or more kernels in /usr/src/linux-someversiondetail, and /usr/src/linux is a symlink to one of them.

The one it links to is the one considered current (usually managed by you, sometimes by a package manager or assisted by one). This is significant only in that some third party kernel-module compilers may assume this link contains the headers/source they should use.


Configuring and compiling a new kernel

If you want the make script to ask you about each new option, run make oldconfig first. This matters mostly when moving to a noticeably different kernel version. You can generally skip it otherwise, particularly if you're sure you won't need the new options, which are easy to overlook in the menu configuration.


Run make menuconfig for a console menu, or make xconfig for a graphical menu to change the configuration.


Hint: it's probably sensible to compile everything you may ever use as modules, that way you don't have to install-and-reboot as much when things change; modules take only filesystem space, no memory. (I wonder why this isn't common as a default)


Moving to a newer (or significantly different) kernel:

  • Copy .config from the current source tree to the new one,
  • link /usr/src/linux to the new source
  • go to this source and run make oldconfig, to be asked explicitly about everything that is new or has changed.
  • Then configure as usual (people tend to use make menuconfig or make xconfig for nice option navigation)
  • and compile/install as usual
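
The first three steps can be sketched as a function. The names run, migrate_kernel_config and DRY_RUN are made up, and both source tree paths are placeholders; with DRY_RUN set, the commands are only printed so you can check them first.

```shell
# Print commands instead of running them when DRY_RUN is non-empty.
run() {
    if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi
}

# $1 = old source tree, $2 = new source tree.
migrate_kernel_config() {
    run cp "$1/.config" "$2/.config" || return 1
    run ln -sfn "$2" /usr/src/linux  || return 1   # repoint the 'current' symlink
    run make -C "$2" oldconfig                     # asks about new/changed options
}

# Example: DRY_RUN=1 migrate_kernel_config /usr/src/linux-old /usr/src/linux-new
```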


Compiling a kernel

After configuration:

  • compile the kernel: make bzImage makes a compressed image (the bz stands for 'big zImage'; it is not related to bzip2). Plain make builds the default targets (on x86 that includes vmlinux, the uncompressed kernel, as well as bzImage and the modules).
  • compile kernel modules: make modules to build the modules, and make modules_install to copy them to directories the system actually uses.
  • install the kernel - using your preferred method.
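
The build steps above, strung together as a sketch. The names run, build_kernel, DRY_RUN and JOBS are made up for illustration, bzImage assumes an x86-style build, and installing the kernel image itself is deliberately left out because the method varies.

```shell
# Print commands instead of running them when DRY_RUN is non-empty.
run() {
    if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi
}

# Build from the current kernel source directory.
# JOBS sets the number of parallel make jobs and defaults to 1.
build_kernel() {
    run make -j"${JOBS:-1}" bzImage || return 1
    run make -j"${JOBS:-1}" modules || return 1
    run make modules_install                     # needs root
}

# Example: cd /usr/src/linux && JOBS=4 build_kernel
```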


Installing a compiled kernel

Note there are sometimes ways to make installation easier. Some distributions allow compilation into a distribution package, and in some the package manager automatically adds a kernel installed that way to the boot menu.


Simple automatic:

The simplest way is probably:

make install

This copies everything you could need to /boot. If you always do this, it replaces the previous kernel, which means you won't have to change boot menu settings, but it is also somewhat risky in case your new kernel fails. I believe it keeps the old kernel around with something like .old appended, but that means you can recover from exactly one make install.

I personally consider make install a little risky for that reason: the .old copy is the only backup/alternative in the boot menu. If you do something wrong twice you won't have a workable kernel. (It takes some skill to do that, but it does give you an unbootable system.)


By package

Various systems allow you to package the compiled kernel for at least organizing your compiled kernels, and sometimes easier installation and removal. For example, debian allows

make deb-pkg

which creates a .deb package. Other options include rpm-pkg, binrpm-pkg, tar-pkg, targz-pkg and tarbz2-pkg.


By copying the kernel

Manually installing a kernel is actually fairly simple. It consists of copying the kernel image to /boot and telling the boot loader about it.

Copy out vmlinuz / bzImage, from a place depending on architecture (e.g. ./arch/i386/boot/bzImage) to some unique and descriptive name in /boot, then point your bootloader towards it.


Manually updating the bootloader

These days grub is the boot loader of choice since it's more flexible and forgiving than the now fairly oldschool lilo

You can edit the boot list manually, but grub makes it a little easier: update-grub (man page: http://www.fifi.org/cgi-bin/man2html/usr/share/man/man8/update-grub.8.gz) looks for all files starting with vmlinuz- and adds menu entries for them.

There are good reasons to do this manually, though, since editing the list yourself means you can ensure the same order and the same default/fallback options, which can be handy on headless servers.


Notes:

  • /boot is mostly there for organisation. Historically it's also there so that you can make sure the partition lies at the start of the hard drive so that very simple boot loaders can access it (aren't required to support LBA addressing)


Boot parameters


Structure of linux in a wide sense

This is quite rough. It will be revised.

Bare booting

In terms of the basic system necessities, we have:

  • The BIOS, which looks for a (storage) device to boot from and hands control to it
  • A bootloader, which gives you the option of choosing one of several kernels (GRUB is now a common bootloader. It is more flexible than the oldschool LILO) (See also bootability)
  • The kernel with its built-in drivers, stored on the filesystem, loaded by the bootloader. Provides interfaces to hardware, handles things like threads, swapping and such. Tends to start a few processes to be able to do all it's configured to do.

Actually, there are more details. See e.g. this page at IBM illustrating the lower levels of the boot process.

Also perhaps more general bootability details.


The system's framework

The parts always provided and/or necessary, started automatically:

  • init, the parent of all processes, started by the kernel. Runs necessities, and things like the shells to log into (e.g. the six fairly standard consoles under Alt-F1 to -F6), and kicks off the rest of the bootup, usually using runlevels as a model.
  • Core executables and libraries that are required by init or services.
  • Something that deals with loading services in some order and considering dependencies between them. Runlevels are still fairly common, adapted from SysV init runlevels. This is what the /etc/init.d directory comes from, and rc0.d to rc6.d, if you have them.
  • Services themselves. Usually, few or none of these are strictly necessary for any of the system's core to work.

Things that run

Many things work not in an always-present way like the kernel does, but more on demand. This includes:

  • Programs, of course. They may hook into libraries, the kernel and (when allowed by the kernel) even hardware fairly directly.
  • Libraries loaded by the system on-use. This refers to most anything you cause to run as a user. Not privileged like the kernel and kernel modules.
  • And, in fact, kernel modules. These are files loaded by the kernel on demand, or possibly on use or by anticipated request ('coldplug'), and means additional drivers. They act like kernel code, but can be loaded and unloaded at will (whereas the kernel is a monolithic unit). Almost all kernel code can be both internal to the kernel and compiled as a module.


The graphical interface is actually a program, usually started from the service management. This may strike Windows users as odd, as they are used to thinking the graphical interface is the system - unless they're old enough to remember Windows 3.1.

Some window managers put a whole other behavioural and functional layer on top of the system. For example, Gnome's filesystem-like configuration manager is a neat idea, and KDE's kioslaves are sometimes really useful, but both of these are essentially just two of the more directly visible libraries.