Wednesday, March 3, 2010

More on Swapping Vs. Paging

Just using a lot of swap space doesn’t necessarily mean that you need more memory. Here’s how to tell when Linux is happy with the available memory and when it needs more.

Linux novices often find virtual memory mysterious, but with a grasp of the fundamental concepts, it’s easy to understand. With this knowledge, you can monitor your system’s memory utilization using vmstat and detect problems that can adversely affect system performance.
How Virtual Memory Works

Physical memory-the actual RAM installed-is a finite resource on any system. The Linux memory handler manages the allocation of that limited resource by freeing portions of physical memory when possible.

All processes use memory, of course, but each process doesn’t need all its allocated memory all the time. Taking advantage of this fact, the kernel frees up physical memory by writing some or all of a process’ memory to disk until it’s needed again.

The kernel uses paging and swapping to perform this memory management. Paging refers to writing portions, termed pages, of a process’ memory to disk. Swapping, strictly speaking, refers to writing the entire process, not just part, to disk. In Linux, true swapping is exceedingly rare, but the terms paging and swapping often are used interchangeably.

When pages are written to disk, the event is called a page-out, and when pages are returned to physical memory, the event is called a page-in. A page fault occurs when the kernel needs a page, finds it doesn’t exist in physical memory because it has been paged-out, and re-reads it in from disk.

Page-ins are common, normal and are not a cause for concern. For example, when an application first starts up, its executable image and data are paged-in. This is normal behavior.

Page-outs, however, can be a sign of trouble. When the kernel detects that memory is running low, it attempts to free up memory by paging out. Though this may happen briefly from time to time, if page-outs are plentiful and constant, the kernel can reach a point where it’s actually spending more time managing paging activity than running the applications, and system performance suffers. This woeful state is referred to as thrashing.

Using swap space is not inherently bad. Rather, it’s intense paging activity that’s problematic. For instance, if your most-memory-intensive application is idle, it’s fine for portions of it to be set aside when another large job is active. Memory pages belonging to an idle application are better set aside so the kernel can use physical memory for disk buffering.
Using vmstat

vmstat, as its name suggests, reports virtual memory statistics. It shows how much virtual memory there is, how much is free and paging activity. Most important, you can observe page-ins and page-outs as they happen. This is extremely useful.

To monitor the virtual memory activity on your system, it’s best to use vmstat with a delay. A delay is the number of seconds between updates. If you don’t supply a delay, vmstat reports the averages since the last boot and quit. Five seconds is the recommended delay interval.

To run vmstat with a five-second delay, type:

vmstat 5

You also can specify a count, which indicates how many updates you want to see before vmstat quits. If you don’t specify a count, the count defaults to infinity, but you can stop output with Ctrl-C.

To run vmstat with ten updates, five seconds apart, type:

vmstat 5 10

Here’s an example of a system free of paging activity:

procs memory swap io system cpu
r b w swpd free buff cache si so bi bo in cs us sy id
0 0 0 29232 116972 4524 244900 0 0 0 0 0 0 0 0 0
0 0 0 29232 116972 4524 244900 0 0 0 0 2560 6 0 1 99
0 0 0 29232 116972 4524 244900 0 0 0 0 2574 10 0 2 98

All fields are explained in the vmstat man page, but the most important columns for this article are free, si and so. The free column shows the amount of free memory, si shows page-ins and so shows page-outs. In this example, the so column is zero consistently, indicating there are no page-outs.

The abbreviations so and si are used instead of the more accurate po and pi for historical reasons.

Here’s an example of a system with paging activity:

procs memory swap io system cpu
r b w swpd free buff cache si so bi bo in cs us sy id
. . .
1 0 0 13344 1444 1308 19692 0 168 129 42 1505 713 20 11 69
1 0 0 13856 1640 1308 18524 64 516 379 129 4341 646 24 34 42
3 0 0 13856 1084 1308 18316 56 64 14 0 320 1022 84 9 8

Notice the nonzero so values indicating there is not enough physical memory and the kernel is paging out. You can use top and ps to identify the processes that are using the most memory.

You also can use top to show memory and swap statistics. Here is an example of the uppermost portion of a typical top report:

14:23:19 up 348 days, 3:02, 1 user, load average: 0.00, 0.00, 0.00
55 processes: 54 sleeping, 1 running, 0 zombie, 0 stopped
CPU states: 0.0% user, 2.4% system, 0.0% nice, 97.6% idle
Mem: 481076K total, 367508K used, 113568K free, 4712K buffers
Swap: 1004052K total, 29852K used, 974200K free, 244396K cached

For more information about top, see the top man page.
Conclusion

It isn’t necessarily bad for your system to be using some of its swap space. But if you discover your system is often running low on physical memory and paging is causing performance to suffer, add more memory. If you can’t add more memory, run memory-intensive jobs at different times of the day, avoid running nonessential jobs when memory demand is high or distribute jobs across multiple systems if possible.