© 2009 Mitch Richling
Before SRM (Solaris Resource Manager) became a standard part of Solaris, it was very difficult to set and enforce
policies that would effectively manage large, shared Solaris servers (thin client servers in particular). Such systems were
plagued with the recurrent problem of a single user suddenly consuming the entire system and denying others a fair
share. This is the environment tyr (named after the Norse god) is designed to monitor and
manage. This program has no fancy user interface, just a simple config file, a reporting tool, and a kernel module. In
normal operation, it sits quietly in the background monitoring activity and acting on violations of the policies contained
in its config file. The policies can be quite complex rule sets based upon user ID, system overall resource availability,
binary name, binary MD5, user group, physical connection (what X11 terminal or SunRay), time of day, date, and network
activity. Some real examples:
Use nice on any process owned by a particular student ID that is consuming 15MB/s of network bandwidth to a lab X11 terminal when someone in the faculty group is trying to run a Maple simulation.
... nice'ed within the last hour, and a new process starts that consumes more than 100% of a CPU for more than 5min, then immediately nice the new process.

Because this tool is tied directly into the kernel, it is able to extract much more accurate information about process
resource consumption than is available via the ps command (CPU statistics to several digits
and memory information to one digit). In addition, the direct tie into the kernel makes it immune to many "root
kit" techniques by which one may attempt to hide resource consumption -- I don't know how many e-mails I have
received from system administrators who install tyr for the first time only to find out that
some critical server they manage has been rooted for months.
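
The policy examples given earlier combine per-process facts (owner, terminal, bandwidth) with system-wide context (what other users are running). As a rough illustration only, the Python sketch below models one such rule as a predicate plus an action. It is not tyr's config syntax or its implementation, neither of which is shown on this page; every name in it (ProcessSample, Rule, evaluate, the user "student42", the "lab-x11" terminal prefix) is a hypothetical stand-in.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class ProcessSample:
    """One observation of a process (hypothetical fields, not tyr's)."""
    pid: int
    user: str
    group: str
    binary: str
    terminal: str
    net_mb_per_s: float


@dataclass(frozen=True)
class Rule:
    """A policy rule: a predicate over (process, all processes) and an action."""
    name: str
    matches: Callable[[ProcessSample, List[ProcessSample]], bool]
    action: str  # e.g. "nice" or "kill"


def group_is_running(group: str, binary: str, samples: List[ProcessSample]) -> bool:
    """True if any process owned by `group` is running `binary`."""
    return any(s.group == group and s.binary == binary for s in samples)


# Assumed encoding of the first example: nice a given student's process that
# pushes 15MB/s or more to a lab X11 terminal while faculty are running Maple.
student_bandwidth_rule = Rule(
    name="student-bandwidth",
    matches=lambda p, everyone: (
        p.user == "student42"
        and p.terminal.startswith("lab-x11")
        and p.net_mb_per_s >= 15.0
        and group_is_running("faculty", "maple", everyone)
    ),
    action="nice",
)


def evaluate(rules: List[Rule], samples: List[ProcessSample]) -> List[Tuple[int, str, str]]:
    """Return (pid, action, rule name) for every process that violates a rule."""
    return [
        (p.pid, r.action, r.name)
        for p in samples
        for r in rules
        if r.matches(p, samples)
    ]


samples = [
    ProcessSample(101, "student42", "students", "ftp", "lab-x11-3", 15.5),
    ProcessSample(102, "prof", "faculty", "maple", "office-1", 0.0),
    ProcessSample(103, "student7", "students", "ftp", "lab-x11-4", 20.0),
]
print(evaluate([student_bandwidth_rule], samples))
```

Keeping the predicate separate from the action means the same condition could drive a different response (say, a report instead of a nice) without being rewritten.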
Not only is the reporting module of the tyr system a more accurate replacement
for ps, it is also capable of much more sophisticated reporting. For example, it can summarize
usage by user, terminal (TTY), or user group. One may also create ad-hoc policies and take ad-hoc actions from the
reporting tool -- like selecting a process ID and killing it. Finally, the reporting module can be set to continuously
display process information as it changes and report any actions that the kernel module is taking to enforce policy.
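
To make the idea of summarizing usage by user, terminal, or group concrete, here is a small Python sketch that totals per-process records under whichever key is requested. It only illustrates the grouping; the record fields (user, tty, group, cpu_pct, mem_mb) and the summarize function are assumptions for this example and do not reflect the reporting tool's actual data or output format.

```python
from collections import defaultdict
from typing import Dict, List


def summarize(records: List[dict], key: str) -> Dict[str, dict]:
    """Total CPU, memory, and process count per distinct value of `key`.

    `key` is one of the hypothetical record fields: "user", "tty", or "group".
    """
    totals = defaultdict(lambda: {"cpu_pct": 0.0, "mem_mb": 0.0, "procs": 0})
    for rec in records:
        bucket = totals[rec[key]]
        bucket["cpu_pct"] += rec["cpu_pct"]
        bucket["mem_mb"] += rec["mem_mb"]
        bucket["procs"] += 1
    return dict(totals)


# Made-up sample data, purely for illustration.
records = [
    {"user": "alice", "tty": "pts/1", "group": "faculty", "cpu_pct": 12.5, "mem_mb": 64.0},
    {"user": "alice", "tty": "pts/2", "group": "faculty", "cpu_pct": 50.0, "mem_mb": 128.0},
    {"user": "bob", "tty": "pts/1", "group": "students", "cpu_pct": 25.0, "mem_mb": 32.0},
]

for name, total in sorted(summarize(records, "user").items()):
    print(name, total)
```

Passing "tty" or "group" as the key regroups the same records by terminal or user group.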
Screenshots (images not available): Resource consumption by user; Reporting on processes; Continuous reporting.