Memory needs of drbd
Hi there,

Recently we have had a case of DRBD behaving extremely badly here when
faced with high IO activity and memory shortage. We have seen primary
machines lock up hard (no IRQs), sometimes even under light load when more
than a couple of drbd volumes were being used simultaneously.

We have banged our heads for quite a while over the code, until Marcelo
came up with the idea of increasing the values in /proc/sys/vm/freepages.
This file controls the threshold of swapping and how many pages the kernel
will keep only for its internal use (read ~linux/Documentation/sysctl/*).
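
A minimal sketch of the doubling, assuming the file holds a few
whitespace-separated integers (three on the 2.2-era kernels that have it:
min, low and high). The function names are made up for illustration, it
needs root to write, and kernels without this file are not covered.

```python
def doubled_freepages(text):
    """Return the freepages line with every value doubled."""
    values = [int(field) for field in text.split()]
    return " ".join(str(2 * value) for value in values)


def double_freepages(path="/proc/sys/vm/freepages"):
    """Read the current thresholds from path and write back doubled ones."""
    with open(path) as f:
        current = f.read()
    new = doubled_freepages(current)
    with open(path, "w") as f:
        f.write(new + "\n")
    return new
```

The change does not survive a reboot, so it would have to be reapplied
from a boot script to stick.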

That was one of the most striking cases of holy penguin pee I've seen
recently. In tests here with small machines (60MB of RAM) and doubling the
values found in that file, I've been able to stick even 144 dbench threads
on three drbd volumes (48 on each). I did not even get a timeout out of it.
Nothing. I stress tested for some three days, moved well over 30GB into
those volumes (counting the blocks transferred) and it behaved surprisingly
well.

That leads us to conclude that a good part of the problems we see with drbd
under high load are related to the memory it needs when servicing its
requests. Imagine that memory is full of drbd requests, and so the kernel
decides to flush some of them to free some pages. For every page to be
freed, drbd has to allocate memory for another request (to the local device
below drbd) and more memory to send it via TCP. A simplistic guess would be
that every drbd request occupies three times as much memory as a local
request.
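
To make the deadlock concrete, here is a toy model of that guess. It is
not drbd's actual accounting: the function name and the figure of two
extra pages per flush (one for the local request, one for the TCP send)
are assumptions taken from the simplistic estimate above, and flushes are
serialized to keep it short.

```python
def flush_all(dirty, free, extra_per_flush=2):
    """Try to flush `dirty` pages starting with `free` spare pages.

    Each flush must first allocate `extra_per_flush` pages, and only gets
    them back, plus the flushed page itself, once the write completes.
    Returns (dirty pages left, free pages) when flushing finishes or stalls.
    """
    while dirty > 0:
        if free < extra_per_flush:
            break  # nothing left to allocate: no flush can ever start
        free -= extra_per_flush      # local request + TCP buffer
        free += extra_per_flush + 1  # write done: buffers and page released
        dirty -= 1
    return dirty, free


print(flush_all(100, 1))  # (100, 1): stalled with every page still dirty
print(flush_all(100, 2))  # (0, 102): a small reserve lets it drain
```

In this model a reserve that is one page too small stalls everything,
which would be consistent with raising the freepages thresholds making
such a difference.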

I really don't think that the kernel's mechanisms are well tuned for
such different behaviour, and I think we should find some way to limit
drbd's outstanding (concurrent or not) requests so that we don't end up
deadlocked allocating the last bit of memory to try to free some more bits.
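
One way such a limit could look, sketched in user-space Python rather
than kernel code: a counting semaphore caps how many requests may hold
memory at once. RequestLimiter and run_requests are hypothetical names,
and nothing here is taken from the drbd source.

```python
import threading


class RequestLimiter:
    """Allow at most max_in_flight requests to be outstanding at once."""

    def __init__(self, max_in_flight):
        self._slots = threading.BoundedSemaphore(max_in_flight)
        self._lock = threading.Lock()
        self.in_flight = 0
        self.peak = 0

    def __enter__(self):
        self._slots.acquire()  # blocks while the cap is reached
        with self._lock:
            self.in_flight += 1
            self.peak = max(self.peak, self.in_flight)
        return self

    def __exit__(self, exc_type, exc, tb):
        with self._lock:
            self.in_flight -= 1
        self._slots.release()
        return False


def run_requests(limiter, count):
    """Issue count dummy requests from separate threads; return the peak."""
    def worker():
        with limiter:
            pass  # a real request would allocate and send its buffers here

    threads = [threading.Thread(target=worker) for _ in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return limiter.peak
```

With RequestLimiter(4), run_requests(limiter, 32) never reports a peak
above 4. The cap would have to be chosen so that the worst-case memory
for that many requests fits inside the reserve the kernel keeps free.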

What do you all think of it? Can you get DRBD to hang under high load with
many volumes? If so, does doubling the values in /proc/sys/vm/freepages
keep it from hanging?

Cheers!
Fábio
--
$ date; sleep 1; date                        Fábio Olivé Leite
Tue Jan 19 03:14:07 2038                     olive@example.com
Thu Jan  1 00:00:00 1970
$ mail ken -s "Houston we have a problem"