Affinities and ulimit

The Bash ulimit built-in can be used to probe and set the current user limits. Such limits include the amount of memory a process may use or the maximum number of files a process can have open. While ulimit is generally understood to affect a whole session, it can also be used to change the limits of a group of processes by invoking it in, for example, a sub-shell.

However, the ulimit command is quirky (it expects a particular order for parameters, and not all of them may be set on the same command line) and does not seem to be ageing all that well. For one thing, one cannot set the affinity of processes, which would indirectly control how many cores, and which ones, a process can use on a multi-core machine.

Invoking the command with -a will list all current limits with corresponding switches and units:

$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 16383
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 16383
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

To modify the execution environment with ulimit one can either invoke ulimit at the current shell level or use it within a sub-shell as we saw previously. For example, to set soft and hard time limits for a program:

$ ( ulimit -St 1 ; ulimit -Ht 2 ; this-or-that-program )
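To convince yourself that a limit set in a sub-shell stays in the sub-shell, here is a minimal sketch. It assumes Bash and that the hard limit on open files is at least 64; the value 64 is arbitrary and only there for illustration.

```shell
#!/usr/bin/env bash
# Soft limit on open files in the current shell, before anything changes.
before=$(ulimit -Sn)

# Command substitution runs in a sub-shell: lower the soft limit there
# (64 is an arbitrary value) and report what the sub-shell sees.
inside=$(ulimit -Sn 64; ulimit -Sn)

# Back in the parent shell, the limit is untouched.
after=$(ulimit -Sn)

echo "before=$before inside=$inside after=$after"
```

The same holds for any program launched from within the parentheses: it inherits the lowered limit, while the invoking shell keeps its own.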

If you look at the list of options provided by ulimit, one is conspicuously missing. Even though we can set the maximum scheduling priority for a given program, we can’t set its affinity, that is, the CPU(s) on which it will be constrained to run.

This may seem like a useless constraint to put on a process, but it is not necessarily so. For a time-critical application, we may not want a process shifting from core to core, possibly thrashing caches along the way. You may also want to keep a user from hogging all CPU time by using all cores. Another reason is speed stepping.

Speed Stepping will allow a processor to dynamically change its power consumption by adjusting its internal clock speed—and also possibly its core voltage. Idle processors return to a low clock frequency if the computer’s policy is set accordingly.

Suppose that on a machine we have one demanding process running full speed. If one has many cores, all but the cores running the currently demanding process are in low speed and power consumption mode. However, the OS may well decide to migrate the process from its current core to some other core. But the other core is in low speed/power mode so is running the process much slower for a while, at least until the system’s power policy allows that core to kick in high gear and increase its frequency to its maximum. Meanwhile, the recently abandoned core returns to low speed/power mode as it is now idle.

On my current laptop (using a Core 2 Duo P8700), SpeedStep allows the cores to be managed independently. Using Gnome’s CPU frequency monitoring task bar applet, I see that the speeds of the cores vary independently. I also see a demanding process migrate from one core to the other for no apparent reason. I do not know what goes on in the scheduler; maybe if two demanding processes are mapped to the same core, even for a very short period of time, one of the two gets migrated? Either way, I see a process migrate to a slower core for no apparent reason, and the scheduler doesn’t seem to know about SpeedStep.
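If you would rather watch migrations from a terminal than from an applet, something like the following sketch can help. It is Linux-specific and relies on /proc; the helper name last_cpu is my own invention, not a standard command.

```shell
#!/usr/bin/env bash
# Print the CPU a process last ran on (Linux-specific, reads /proc).
# Field 39 of /proc/<pid>/stat is the last-used processor. We first strip
# everything up to the closing parenthesis of the command name, which may
# itself contain spaces, so that field 39 becomes field 37.
last_cpu() {
    sed 's/.*) //' "/proc/$1/stat" | awk '{print $37}'
}

# Sample the current shell a few times; the number may change between samples.
for i in 1 2 3; do
    echo "shell $$ last ran on CPU $(last_cpu $$)"
    sleep 0.1
done
```

Pointing last_cpu at the PID of a long-running, CPU-hungry process and sampling it in a loop should show whether, and how often, it hops between cores.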

The crudest and most direct way of solving the problem is to call the sched_setaffinity function and explicitly set the affinity to a given set of processors. This works perfectly well, but it means that you have to modify your software to take affinity into account, possibly through new command line options. That’s rather cumbersome.

I propose to extend the ulimit built-in to offer control over affinity for processes. Maybe -w (for “working” processor) or -g (for group of processors) would make a good switch? The syntax could be a hex map or a list:

$ ( ulimit -w=2,3 ; this-or-that-program )

Maybe a more general mechanism for architecture-specific limits should be added? I’d be happy for now with an extra option for affinity.

* *

SpeedStepping isn’t only beneficial for laptops and netbooks; it is beneficial for servers and whole data centers as well. I have no serious data on this, but I guess that a typical data center has its peak hours of usage and remains idle a large part of the day. For example, a local service provider will get lots of hits during some peak hours (lunch time, evenings) but comparatively few during the night and early morning. During the idle hours, I guess someone could save a lot of money by using speed stepping. Not only do slower CPUs draw less power, they also dissipate a lot less heat. Heat is a major problem for large data centers, and it results in even more power consumption since cooling is necessary.

* *

I think the ulimit command should be extended to include new functionality such as process affinity. I know that one can just include affinity management in the application itself, but it seems to make more sense to shift the responsibility of setting affinities to the shell, as that does not require modifying an existing piece of software. Furthermore, shifting this responsibility outside the application makes it easier to have power and machine-usage policies managed by system-level scripts.

* *

Readers will point me to taskset, a command-line tool that can be used to provide a CPU list for affinity and launch a program. The thing with taskset, while it does work properly, is that it doesn’t permeate the user’s environment in the same way limits do. Of course, all processes that descend from a process launched with taskset will inherit its affinity, but that doesn’t seem to be session-wide as limits are. Limits are set sometime during login, and all user processes for that session inherit the initially set limits, without the intervention of the user. Of course, we could use taskset to somehow emulate the behavior, but that’d be a special case handled by a sessionrc script or something like that.
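For the record, here is a small sketch of the inheritance I am talking about. It assumes Linux, the taskset tool from util-linux, and a /proc file system; the helper name allowed_cpus is made up for the occasion.

```shell
#!/usr/bin/env bash
# Print the list of CPUs the calling process may run on (Linux /proc).
allowed_cpus() {
    grep '^Cpus_allowed_list' /proc/self/status | cut -f2
}

# First CPU this shell is currently allowed to use (e.g. "0" from "0-7").
cpu=$(allowed_cpus | sed 's/[-,].*//')

# Launch a child shell pinned to that CPU. The grep it starts is a
# grandchild of taskset and still reports the restricted CPU list.
pinned=$(taskset -c "$cpu" bash -c 'grep "^Cpus_allowed_list" /proc/self/status | cut -f2')

echo "parent may use: $(allowed_cpus)"
echo "child pinned to: $pinned"
```

The affinity does trickle down to descendants, as limits do, but somebody still has to wrap the session's first shell in taskset by hand, which is exactly the special case I would like ulimit to absorb.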

4 Responses to Affinities and ulimit

  1. Tom says:

    Have you ever seen the `taskset` program? It may fill the hole you see in `ulimit`.

  2. Steven Pigeon says:

    taskset ...args... bash would work, no doubt. But that’d be a hack you’d have to manage yourself rather than a classic user limit such as, say, maximum program size. I’m pretty sure that the support for affinity is already built in correctly (for Linux, at least) and is only wanting an interface supported by PAM and the like.

  3. […] the user’s interactive processes and cap cores 4 to 7 to half speed (or even less?) and use taskset to “glue” the long-running processes to the slowed-down […]
