Mitch Richling: UNIX System Admin Tools
| Author: | Mitch Richling |
| Updated: | 2025-06-17 |
1. Introduction
You will find here several simple tools that may be of use to UNIX system administrators. My Guide to UNIX System Programming has a few programs that may be useful as well.
2. stats.pl: Compute Statistics
stats.pl started life as a bit of one-line Perl magic, and has grown over the years into what you see here. The idea is simple: bust text data into columns of numeric values, optionally grouped by categorical (factor) variables, and report simple statistical summary information like mean, max, min, count, standard deviation, variance, regression lines, and histograms. The output can be customized, allowing a range of formats from machine-readable ones like CSV and TSV to human-consumable reports using fixed-width tables.
Perhaps the most complex, and useful, feature of the stats.pl script is the powerful set of techniques it uses to extract the data in the first place. This is doubly important for UNIX geeks, who tend to deal with numerous oddly formatted text files on a daily basis. Note that the script is not only capable of using the data it extracts, it is also capable of outputting the filtered and scrubbed data in various formats (like CSV). For many users stats.pl is less of a computational tool, and more of a general purpose "data extractor and filter" allowing them to feed data into tools like R, SAS, or (goodness forbid) Excel.
For simple cases, the script "just works" with the default values; however, more complex examples are easy to find in the day-to-day life of a UNIX system administrator:
- How do I extract the data from vmstat? - The output of vmstat is funny as the second line has the titles while the first and third lines are junk, with the data starting on line four. That sounds painful, but stats.pl makes it easy:
-skipLines=3 -headerLine=2
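The flags above belong to stats.pl itself; as a rough illustration of what -skipLines and -headerLine accomplish, here is a small Python sketch (the function name and the sample text are invented for the example, not taken from the script):

```python
def extract(text, skip_lines=0, header_line=None):
    """Rough sketch of -skipLines/-headerLine: take column titles from one
    line, skip leading junk lines, and keep only pure-numeric data rows."""
    lines = text.splitlines()
    header = lines[header_line - 1].split() if header_line else None
    rows = []
    for line in lines[skip_lines:]:
        try:
            rows.append([float(f) for f in line.split()])
        except ValueError:
            continue  # not a pure-numeric data line; drop it
    return header, rows

# vmstat-like sample: lines 1 and 3 are junk, line 2 has the titles,
# and the data starts on line 4 -- hence skip_lines=3, header_line=2
sample = """procs -----------memory----------
 r  b   swpd   free
junk junk junk junk
 1  0   1024  5000
 0  0   1024  5100
"""
header, rows = extract(sample, skip_lines=3, header_line=2)
print(header)  # ['r', 'b', 'swpd', 'free']
print(rows)    # [[1.0, 0.0, 1024.0, 5000.0], [0.0, 0.0, 1024.0, 5100.0]]
```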
- How do I extract the data from mpstat? - The output of mpstat is another odd one in that the first line, and every fourth line after it, consists of column titles. How kooky is that? We note that each title line has the string CPU and none of the data lines do. So we can use something like this:
-headerLine -notRegex=CPU
- OK. I got the data from mpstat, but how do I get a summary for each CPU? - The CPU is labeled in the output of mpstat in a column called CPU - the column we used in the previous FAQ entry to delete the title lines. All we need do is tell stats.pl about this column. The following options will do the trick:
-headerLine=1 -notRegex=CPU -cats=CPU
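To illustrate what this combination of flags does, here is a Python sketch of the same idea: drop repeated title lines with a regex, then group the rows by one categorical column and summarize each group. The function name and the mpstat-like sample are invented for the example; only the flag semantics come from the text above.

```python
import re
from collections import defaultdict
from statistics import mean

def summarize_by(text, header_line, not_regex, cat_col):
    """Sketch of -headerLine/-notRegex/-cats: filter out repeated title
    lines, then group rows by one categorical column and average the rest."""
    lines = text.splitlines()
    header = lines[header_line - 1].split()
    drop = re.compile(not_regex)
    groups = defaultdict(list)
    for line in lines[header_line:]:
        if drop.search(line):
            continue  # a repeated title line
        row = dict(zip(header, line.split()))
        groups[row[cat_col]].append(row)
    # per-category mean of every numeric column
    return {cat: {c: mean(float(r[c]) for r in rows)
                  for c in header if c != cat_col}
            for cat, rows in groups.items()}

# mpstat-like sample: the title line repeats, and only title lines say "CPU"
sample = """CPU usr sys idle
0 10 5 85
1 20 10 70
CPU usr sys idle
0 12 7 81
1 22 8 70
"""
print(summarize_by(sample, header_line=1, not_regex="CPU", cat_col="CPU"))
```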
- How do I get the data from sar? - The output from sar is more complex. The first three lines are bogus, the fourth line has titles mixed with data, and the last two lines are junk (a blank line and an "Average" line). Still, it isn't too bad telling stats.pl how to get the data. Because this one is so complex, there are different ways to do it. Here are three:
-notRegex=Average -goodColCnt=5
-stopRegex='^$' -skipLines=4
-notRegex='(^$|Average)' -skipLines=4
- How can I get better titles for sar data? - First, see the previous question about how to get the data. Use one of those option sets, and add the following to the command line:
-colNames=time,usr,sys,wio,idle
3. Fast filesystem traversal
The traditional way to traverse a file system is to simply use a recursive algorithm.
This algorithm is generally I/O bound; however, the culprit on modern systems is often I/O latency - not bandwidth. This is particularly true with today's transaction based I/O subsystems and network file systems like NFS. One way to alleviate this bottleneck is to have multiple I/O operations simultaneously in flight. Using this technique on a single CPU Linux box with a local file system only produces marginal performance increases, but when dealing with NFS file systems the speedup can be quite significant. Experiments with multi-CPU hosts utilizing gigabit Ethernet with large NFS servers show incredible performance improvements of well over 50x (20 hours cut down to 20 minutes). This set of programs has been used to traverse hundreds of terabytes of storage distributed across more than a billion files and 100 fileservers in just a few hours.
The idea is to first store every directory given on the command line in a linked list. Then a thread pool is created, and the threads pop entries off of that linked list in the order they were placed in the list (FIFO). Each thread then reads all the entries in the directory it popped off the list, performs user-defined actions on each entry, and stores any subdirectories at the end of the linked list. This algorithm leads to a roughly breadth-first directory traversal. The nature of the algorithm places a heavy load upon the caching systems available in many operating systems. For example, ncsize plays a role in how effective this program is on a Solaris system. Also on Solaris, the number of simultaneous NFS connections dramatically affects performance. Depending on what the optional processing functions are doing, this program can place an incredible load on nscd.
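The queue-plus-thread-pool scheme just described can be sketched in a few lines of Python (the actual tools are a C pthread pool; the function name and thread count below are invented for the example). In CPython the directory-reading syscalls release the GIL, so even this sketch hides I/O latency the way the text describes:

```python
import os
import queue
import threading

def traverse(roots, action, n_threads=8):
    """Sketch of the traversal: seed a FIFO queue with the starting
    directories; workers pop a directory, run a user-defined action on
    each entry, and append subdirectories to the back of the queue."""
    work = queue.Queue()          # FIFO: gives a roughly breadth-first order
    for d in roots:
        work.put(d)

    def worker():
        while True:
            d = work.get()
            if d is None:                             # sentinel: time to exit
                work.task_done()
                return
            try:
                for name in os.listdir(d):
                    path = os.path.join(d, name)
                    action(path)                      # user-defined action
                    if os.path.isdir(path) and not os.path.islink(path):
                        work.put(path)                # subdir goes on the back
            except OSError:
                pass                                  # unreadable dir: skip it
            work.task_done()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    work.join()                   # blocks until every queued directory is done
    for _ in threads:
        work.put(None)            # release the workers
    for t in threads:
        t.join()
```

Note the termination trick: `queue.Queue.join()` waits until every item put on the queue has been matched by a `task_done()`, which works even though the workers keep adding subdirectories while the traversal runs.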
The code base is designed to be customized so that binaries may be easily produced to do special tasks as the need arises. That said, the code linked here is written in C, and makes use of ancient C techniques to provide for tool customization. As a demonstration of how to customize the behaviour, several compile options exist for the code in the archive that generate different binaries that do very different things. Currently the following examples may be compiled right out of the box:
- du - A very fast version of /bin/du. It has no command line options, and simply displays the output of a 'du -sk'.
- dux - A very fast, extended version of /bin/du that displays much more data about the files traversed, including file sizes, number of blocks, files with holes, and lots of other data.
- own - Prints the names of all files in a directory tree that are owned by a specified list of users.
- age - Produces a report regarding the ages of the files in a directory tree.
- noown - Prints the names of all files in a directory tree that are NOT owned by a specified list of users.
- dirgo - Simply lists the files it finds. This is similar to a 'find ./', only it does an almost breadth-first search.