Keeping a System Running

An ad lib talk by paulv

The discussion was "What topics would we like to have at the SonOfInstallfest?" One strongly desired topic, as it was an immediate need for Chuck Stiver, was "What needs to be done to keep a system in tune and running well after initial installation?".

paulv pipes up, "This needs no big preparation, I'll talk about this right now." Here are pea's rough notes from the discussion:

Boot up things

"don't touch it.... it will continue to work"

In general, "if you don't touch it, it will continue to work", but you can still run out of disk. One common way for this to happen is for /tmp to slowly accumulate cruft. The Debian distribution saves you the trouble of monitoring /tmp usage by clearing /tmp on each reboot. Other distributions (notably RedHat) allow files to linger in /tmp across system restarts. On those systems, wander into /tmp just after a reboot and clear out the files; anything there will be rebuilt if it is needed. This is best done before starting an X desktop session, because a running session keeps live sockets and lock files in /tmp.
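A cautious way to do that sweep is to list the candidates before removing anything. The sketch below is POSIX sh and assumes a find that supports -xdev and -mtime; the function name list_stale and the seven day default are illustrative choices, not something from the talk.

```shell
# list_stale DIR [DAYS]: print regular files under DIR that have not been
# modified for more than DAYS days (default 7). Nothing is deleted here;
# the point is to review the list first.
list_stale() {
    dir=${1:-/tmp}
    days=${2:-7}
    # -xdev keeps find on one filesystem; -type f skips sockets and dirs.
    find "$dir" -xdev -type f -mtime +"$days" -print
}
```

Run it as list_stale /tmp 7, look over the output, and only then remove what you recognize as cruft.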

If you are a KDE user, know that KDE may keep a useful speedup cache in /tmp, in files named ksycoca and ksycocastamp. There is no harm in deleting them; they will be rebuilt on your next invocation of KDE, but KDE startup will be significantly slower until they are.

Another, rarer way to run out of disk space is to run out of inodes. Inodes are table entries that point to the actual files on the disk. When the disk was initially formatted (mkfs ....), a finite number of inodes was created. Usually the default setting is sufficient for most uses; however, if you create many, many teeny-weenie files, you could use up your inode allocation before you use up your data space! This is when df reports free space available but you can't save a file. You just get the dreaded out-of-inodes failure, usually reported as "No space left on device".

pea@plum:~$ df       # How much space?

Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/sda6              8262036   3981984   3860356  51% /
tmpfs                    32944         0     32944   0% /dev/shm
/dev/sda1                30075     21778      6693  77% /boot


pea@plum:~$ df -i    # How's the inodes?

Filesystem            Inodes   IUsed   IFree IUse% Mounted on
/dev/sda6            1050400  235524  814876   23% /
tmpfs                   8236       1    8235    1% /dev/shm
/dev/sda1              16064      46   16018    1% /boot
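To keep an eye on this without reading the table by hand, the IUse% column can be filtered with awk. This is a minimal sketch that assumes the six column df -i layout shown above and mount points without spaces; the function name inode_hogs and the 90 percent default are made up for illustration.

```shell
# inode_hogs [LIMIT]: read `df -i` style output on stdin and print the
# mount point and IUse% of every filesystem at or above LIMIT percent.
inode_hogs() {
    limit=${1:-90}
    awk -v limit="$limit" '
        NR > 1 && $5 ~ /%$/ {           # skip the header and "-" entries
            pct = $5
            sub(/%$/, "", pct)          # "23%" -> "23"
            if (pct + 0 >= limit + 0)
                print $6, $5
        }'
}
```

Typical use would be df -i | inode_hogs 90, which prints nothing when every filesystem is comfortably below the limit.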
	

df ___ -- clear tmp (mostly wipeable, tho there are some speedup files) -- debian clrs tmp, a good thing.

While running

xosview display example

As your system runs there are a few commands that are useful for taking a peek at how things are going.

top
Run from the console, top gives a brief summary of system state. At the top there is a summary of memory and processor loading (how busy is your system).
uptime
How long since your last reboot? What is the load level?
xosview
A graphical meter set showing system status.

"So, just what is LOAD??" Load is simply the number of jobs or processes running or waiting to run at any given time. There are usually three load values given: the 1, 5 and 15 minute averages. A load of 13.4, as shown in the xosview screen snip at left, means that there are thirteen-plus jobs in contention for the processor--busy, busy, busy. A normal desktop system will generally have a load of around one.
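On Linux the same three averages that uptime prints are also available in /proc/loadavg, which is handy for scripts. The sketch below assumes that file's usual layout (three averages followed by other fields); show_load and load_over are hypothetical helper names.

```shell
# show_load: read one line in /proc/loadavg format on stdin and label the
# 1, 5 and 15 minute averages.
show_load() {
    read -r one five fifteen _rest
    printf '1min=%s 5min=%s 15min=%s\n' "$one" "$five" "$fifteen"
}

# load_over LIMIT: succeed when the 1 minute average on stdin exceeds LIMIT.
load_over() {
    read -r one _rest
    # awk does the comparison because sh has no floating point arithmetic.
    awk -v l="$one" -v max="$1" 'BEGIN { exit !(l + 0 > max + 0) }'
}
```

For example, show_load < /proc/loadavg prints the labeled values, and load_over 4 < /proc/loadavg && echo busy complains only when the system is loaded beyond four.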

SWAP A system should have some swap area to function well; Linux will run without any, but then has nowhere to put idle pages when memory gets tight. The general rule for swap size is 1.5 times your memory size. Some swap may be in use at any time, since inactive jobs can be moved to swap. On a desktop system, especially when web browsing with many tabs, swap will be consumed by the inactive tabs or browser windows. Firefox is notorious for squirreling away copies of rendered pages in X backing stores on swap. Close Firefox and regain the swap.
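The free command shows swap use directly; for scripting, the same numbers can be pulled from /proc/meminfo. A minimal sketch, assuming the Linux SwapTotal: and SwapFree: fields reported in kB (swap_used is an illustrative name):

```shell
# swap_used: read /proc/meminfo style text on stdin and print the amount
# of swap currently in use, in kB (SwapTotal minus SwapFree).
swap_used() {
    awk '$1 == "SwapTotal:" { total = $2 }
         $1 == "SwapFree:"  { free = $2 }
         END { print total - free }'
}
```

Running swap_used < /proc/meminfo before and after closing the browser gives a rough idea of how much swap it was holding.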

top -- uptime -- xosview (and brethren) -- load is nbr waiting processes -- free (swappin'??)

Log maintenance

Log files eat disk space; logrotate keeps growth in bounds.

The system and many packages produce log files as they run. These logs are found in /var/log. You have probably taken a peek or two at /var/log/messages to find error messages and bootup information. Over time these logs grow large. There is a tool, logrotate(8), that keeps the logs trimmed. How the logs are trimmed and archived is set up in logrotate's configuration file and directory:

/etc/logrotate.conf
the master logrotate configuration. These are the default settings. Most modern distributions place some sane settings in here.
/etc/logrotate.d/
logrotate configuration for individual packages, such as apache, mail, this and that. When you install one of these log-file-producing packages, part of the installation is the addition of a logrotate config just for that package.

Logs can be compressed, removed, archived, mailed, posted... See man logrotate for the many options that can be set in the configuration files.
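To make that concrete, here is a sketch of what a per-package stanza can look like, wrapped in a small shell function so it can be tried safely. The log path /var/log/myapp.log and the function name write_demo_conf are hypothetical; the directives themselves (weekly, rotate, compress, missingok, notifempty) are standard logrotate options, but the values are only an example.

```shell
# write_demo_conf FILE: write a sample logrotate stanza for a hypothetical
# application log to FILE.
write_demo_conf() {
    cat > "$1" <<'EOF'
/var/log/myapp.log {
    weekly
    rotate 4
    compress
    missingok
    notifempty
}
EOF
}
```

After write_demo_conf /tmp/myapp.conf, running logrotate -d /tmp/myapp.conf does a debug pass that reports what would happen without touching any logs.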

/etc/logrotate.conf -- /etc/logrotate.d/[pkg]

Updates

kernel updates Whatcha gonna do if the upgrade goes south??

Know the package and system management tools for your distribution--each distro has its own methods and traditions in package and system management. The package manager installs the package, configs and such into the right places. The system management tool assures that all the packages work together--this tool handles (and hopefully resolves) dependency issues. In RedHat-land rpm is the package manager; up2date or yum is the system manager. Over in debian-ville, dpkg and apt provide the software management services.

up2date, apt, yum all have helps and tricks -- apt-list[changes|bugs|___] -- RCS!!! Benefit of configing directly in /etc, you know what's there

/etc and version control

Learning and directly editing the configuration files in /etc, instead of using "wizzers", gives a better understanding of how your system fits together. However, there is one tool that is indispensable when futzing with config files: RCS--revision control system. The rcs suite of tools allows changes to be tracked and controlled.
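As a rough illustration of the idea, the sketch below checks a config file into RCS while leaving an editable copy in place. It assumes the rcs package is installed; the function name track and the description text are made up for the example.

```shell
# track FILE "message": check the current contents of FILE into RCS and
# keep a locked, editable working copy in place.
track() {
    file=$1
    msg=$2
    dir=$(dirname "$file")
    # ci stores history in an RCS/ subdirectory next to the file if one exists.
    mkdir -p "$dir/RCS"
    # -l keeps a locked working copy, -t- sets the file description without
    # prompting, -m gives the log message for this revision.
    ci -l -t-"config file" -m"$msg" "$file"
}
```

A typical round would be track /etc/fstab "before adding new disk", then edit, then rcsdiff /etc/fstab to see what changed and rlog /etc/fstab to review the history.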

link/port "just enough RCS" from adam

fsck

Normally, the filesystem check happens automatically when needed at boot time. The occasional check cleans up any stray bits on the disk that might have appeared from application crashes and such. Incomplete and unclean shutdowns also create "fsck needed" situations.

...but what if fsck fails, throws its disc-twiddling head back and quits? Now is the time for manual intervention. With the partition unmounted, run fsck -y /dev/[failed-partition]; all your lost bits will be moved to a lost+found directory on that partition. There may be recoverable files (if you want them bad enough...)

Most distributions use the ext3 journaled filesystem, which is much more tolerant of incomplete shutdowns.

ext2/ext3 -- ext3 modes: journal metadata & journal data (slower)


my brain went to sleep and notes on the following items are very sparse.

Monitors and gauges

hardware condition monitors -- SMART (which works for modern drives) -- RAID status
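The notes are thin here, so this is only a guess at a useful check, not something shown in the talk. For Linux software RAID, /proc/mdstat marks each member as U (up) or _ (missing), so a degraded array can be spotted with awk; degraded_arrays is a hypothetical helper name and the layout assumed is the common "mdN : ..." line followed by a status line.

```shell
# degraded_arrays: read /proc/mdstat style text on stdin and print the name
# of each md array whose member status (e.g. [U_]) shows a missing device.
degraded_arrays() {
    awk '/^md[0-9]+ :/        { name = $1 }
         /\[[U_]*_[U_]*\]/    { print name }'
}
```

Use it as degraded_arrays < /proc/mdstat. For SMART, smartctl -H /dev/[drive] from smartmontools reports the drive's overall health assessment.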

Hopping firewall

some discussion of hopping over or thru firewall (proxying) for what reason, i didn't record... 'expect' was mentioned. hmmmm...

umount

umount -l (lazy unmount: detach now, clean up once it is no longer busy) -- the drive that wouldn't leave....
