Mailing List Archive

Re: [RFC V2 SLEB 00/14] The Enhanced(hopefully) Slab Allocator
Hi Nick,

On Tue, May 25, 2010 at 10:07 AM, Nick Piggin <npiggin@suse.de> wrote:
> There is nothing to stop incremental changes or tweaks on top of that
> allocator, even to the point of completely changing the allocation
> scheme. It is inevitable that with changes in workloads, SMP/NUMA, and
> cache/memory costs and hierarchies, the best slab allocation schemes
> will change over time.

Agreed.

On Tue, May 25, 2010 at 10:07 AM, Nick Piggin <npiggin@suse.de> wrote:
> I think it is more important to have one allocator than trying to get
> the absolute most perfect one for everybody. That way changes are
> carefully and slowly reviewed and merged, with results to justify the
> change. This way everybody is testing the same thing, and bisection will
> work. The situation with SLUB is already a nightmare because now each
> allocator has half the testing and half the work put into it.

I wouldn't say it's a nightmare, but yes, it could be better. From my
point of view SLUB is the base of whatever the future will be because
the code is much cleaner and simpler than SLAB. That's why I find
Christoph's work on SLEB more interesting than SLQB, for example,
because it's building on top of something that's mature and stable.

That said, are you proposing that even without further improvements to
SLUB, we should go ahead and, for example, remove SLAB from Kconfig
for v2.6.36 and see if we can just delete the whole thing from, say,
v2.6.38?

Pekka
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC V2 SLEB 00/14] The Enhanced(hopefully) Slab Allocator
On Tue, May 25, 2010 at 11:03:49AM +0300, Pekka Enberg wrote:
> Hi Nick,
>
> On Tue, May 25, 2010 at 10:07 AM, Nick Piggin <npiggin@suse.de> wrote:
> > There is nothing to stop incremental changes or tweaks on top of that
> > allocator, even to the point of completely changing the allocation
> > scheme. It is inevitable that with changes in workloads, SMP/NUMA, and
> > cache/memory costs and hierarchies, the best slab allocation schemes
> > will change over time.
>
> Agreed.
>
> On Tue, May 25, 2010 at 10:07 AM, Nick Piggin <npiggin@suse.de> wrote:
> > I think it is more important to have one allocator than trying to get
> > the absolute most perfect one for everybody. That way changes are
> > carefully and slowly reviewed and merged, with results to justify the
> > change. This way everybody is testing the same thing, and bisection will
> > work. The situation with SLUB is already a nightmare because now each
> > allocator has half the testing and half the work put into it.
>
> I wouldn't say it's a nightmare, but yes, it could be better. From my
> point of view SLUB is the base of whatever the future will be because
> the code is much cleaner and simpler than SLAB. That's why I find
> Christoph's work on SLEB more interesting than SLQB, for example,
> because it's building on top of something that's mature and stable.

I don't think SLUB ever proved itself very well. The selling points
were some untestable handwaving about how queueing is bad and jitter
is bad, ignoring the fact that queues could be shortened and periodic
reaping disabled at runtime with SLAB style of allocator. It also
has relied heavily on higher order allocations which put great strain
on hugepage allocations and page reclaim (witness the big slowdown
in low memory conditions when tmpfs was using higher order allocations
via SLUB).


> That said, are you proposing that even without further improvements to
> SLUB, we should go ahead and, for example, remove SLAB from Kconfig
> for v2.6.36 and see if we can just delete the whole thing from, say,
> v2.6.38?

SLUB has not been able to displace SLAB for a long time due to
performance and higher order allocation problems.

I think "clean code" is very important, but by far the hardest thing to
get right is the actual allocation and freeing strategies. So
it's crazy to base such a choice on code cleanliness. If that's the
deciding factor, then I can provide a patch to modernise SLAB and then
we can remove SLUB and start incremental improvements from there.

--
Re: [RFC V2 SLEB 00/14] The Enhanced(hopefully) Slab Allocator
Hi Nick,

On Tue, May 25, 2010 at 11:16 AM, Nick Piggin <npiggin@suse.de> wrote:
> I don't think SLUB ever proved itself very well. The selling points
> were some untestable handwaving about how queueing is bad and jitter
> is bad, ignoring the fact that queues could be shortened and periodic
> reaping disabled at runtime with SLAB style of allocator. It also
> has relied heavily on higher order allocations which put great strain
> on hugepage allocations and page reclaim (witness the big slowdown
> in low memory conditions when tmpfs was using higher order allocations
> via SLUB).

The main selling point for SLUB was NUMA. Has the situation changed?
Reliance on higher order allocations isn't that relevant if we're
anyway discussing ways to change allocation strategy.

On Tue, May 25, 2010 at 11:16 AM, Nick Piggin <npiggin@suse.de> wrote:
> SLUB has not been able to displace SLAB for a long time due to
> performance and higher order allocation problems.
>
> I think "clean code" is very important, but by far the hardest thing to
> get right is the actual allocation and freeing strategies. So
> it's crazy to base such a choice on code cleanliness. If that's the
> deciding factor, then I can provide a patch to modernise SLAB and then
> we can remove SLUB and start incremental improvements from there.

I'm more than happy to take in patches to clean up SLAB but I think
you're underestimating the required effort. What SLUB has going for
it:

- No NUMA alien caches
- No special lockdep handling required
- Debugging support is better
- Cpuset interactions are simpler
- Memory hotplug is more mature
- Many more contributors to SLUB than to SLAB

I was one of the people cleaning up SLAB when SLUB was merged and
based on that experience I'm strongly in favor of SLUB as a base.

Pekka
--
Re: [RFC V2 SLEB 00/14] The Enhanced(hopefully) Slab Allocator
On Tue, May 25, 2010 at 12:19:09PM +0300, Pekka Enberg wrote:
> Hi Nick,
>
> On Tue, May 25, 2010 at 11:16 AM, Nick Piggin <npiggin@suse.de> wrote:
> > I don't think SLUB ever proved itself very well. The selling points
> > were some untestable handwaving about how queueing is bad and jitter
> > is bad, ignoring the fact that queues could be shortened and periodic
> > reaping disabled at runtime with SLAB style of allocator. It also
> > has relied heavily on higher order allocations which put great strain
> > on hugepage allocations and page reclaim (witness the big slowdown
> > in low memory conditions when tmpfs was using higher order allocations
> > via SLUB).
>
> The main selling point for SLUB was NUMA. Has the situation changed?

Well one problem with SLAB was really just those alien caches. AFAIK
they were added by Christoph Lameter (maybe wrong), and I didn't ever
actually see much justification for them in the changelog. noaliencache
can be and is used on bigger machines, and SLES and RHEL kernels are
using SLAB on production NUMA systems up to thousands of CPU Altixes,
and have been looking at working on SGI's UV, and hundreds of cores
POWER7 etc.

I have not seen NUMA benchmarks showing SLUB is significantly better.
I haven't done much testing myself, mind you. But from indications, we
could probably quite easily drop the alien caches setup and do like a
simpler single remote freeing queue per CPU or something like that.
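
A toy model may help picture what "a single remote freeing queue per CPU" could mean. This is only a sketch under stated assumptions, not SLAB code: the names `CpuCache` and `ToyAllocator` are hypothetical, ownership is tracked per CPU rather than per node, and there is no locking or concurrency. It shows only the data flow: an object freed on a foreign CPU goes onto one queue belonging to its home CPU, instead of into a per-node-pair alien cache, and the home CPU drains that queue when its local free list runs dry.

```python
from collections import deque


class CpuCache:
    """Per-CPU state in the toy model (hypothetical, not a kernel structure)."""

    def __init__(self, cpu_id):
        self.cpu_id = cpu_id
        self.local_free = []        # objects this CPU can hand out right away
        self.remote_free = deque()  # objects freed by other CPUs, not yet drained


class ToyAllocator:
    def __init__(self, ncpus, objects_per_cpu):
        self.cpus = [CpuCache(i) for i in range(ncpus)]
        self.owner = {}             # object id -> home CPU (assumption: fixed)
        next_id = 0
        for cache in self.cpus:
            for _ in range(objects_per_cpu):
                self.owner[next_id] = cache.cpu_id
                cache.local_free.append(next_id)
                next_id += 1

    def alloc(self, cpu):
        cache = self.cpus[cpu]
        if not cache.local_free:
            # Local list is empty: pull in everything other CPUs returned.
            cache.local_free.extend(cache.remote_free)
            cache.remote_free.clear()
        return cache.local_free.pop() if cache.local_free else None

    def free(self, cpu, obj):
        home = self.owner[obj]
        if home == cpu:
            self.cpus[cpu].local_free.append(obj)    # local free: fast path
        else:
            self.cpus[home].remote_free.append(obj)  # remote free: one queue per home CPU


allocator = ToyAllocator(ncpus=2, objects_per_cpu=2)
first = allocator.alloc(0)
allocator.free(1, first)  # freed on CPU 1, queued for CPU 0
print("queued for CPU 0:", list(allocator.cpus[0].remote_free))
```

With N CPUs this keeps N remote queues in total, where a full crossbar would need one queue per (source, destination) pair. A real implementation would still need a concurrency-safe queue and a policy for when to drain it.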


> Reliance on higher order allocations isn't that relevant if we're
> anyway discussing ways to change allocation strategy.

Then it's just going through more churn and adding untested code to
get where SLAB already is (top performance without higher order
allocations). So it is very relevant if we're considering how to get
to a single allocator.


> On Tue, May 25, 2010 at 11:16 AM, Nick Piggin <npiggin@suse.de> wrote:
> > SLUB has not been able to displace SLAB for a long time due to
> > performance and higher order allocation problems.
> >
> > I think "clean code" is very important, but by far the hardest thing to
> > get right is the actual allocation and freeing strategies. So
> > it's crazy to base such a choice on code cleanliness. If that's the
> > deciding factor, then I can provide a patch to modernise SLAB and then
> > we can remove SLUB and start incremental improvements from there.
>
> I'm more than happy to take in patches to clean up SLAB but I think
> you're underestimating the required effort. What SLUB has going for
> it:
>
> - No NUMA alien caches
> - No special lockdep handling required
> - Debugging support is better
> - Cpuset interactions are simpler
> - Memory hotplug is more mature

All this I don't think is much problem. It was only a problem because we
put in SLUB and so half these new features were added to it and people
weren't adding them to SLAB.


> - Many more contributors to SLUB than to SLAB

In large part because it is less mature. But also because it seems to be
seen as the allocator of the future.

Problem is that SLUB was never able to prove why it should be merged.
The code cleanliness issue is really trivial in comparison to how much
head scratching and work goes into analysing the performance.

It *really* is not required to completely replace a whole subsystem like
this to make progress. Even if we make relatively large changes,
everyone gets to use and test them, and it's so easy to bisect and
work out how changes interact and change behaviour. Compare that with
the problems we have when someone says that SLUB has a performance
regression against SLAB.


> I was one of the people cleaning up SLAB when SLUB was merged and
> based on that experience I'm strongly in favor of SLUB as a base.

I think we should: modernise SLAB code, add missing debug features,
possibly turn off alien caches by default, chuck out SLUB, and then
require that future changes have some reasonable bar set to justify
them.

I would not be at all against adding changes that transform SLAB to
SLUB or SLEB or SLQB. That's how it really should be done in the
first place.

--
Re: [RFC V2 SLEB 00/14] The Enhanced(hopefully) Slab Allocator
Hi Nick,

On Tue, May 25, 2010 at 12:34 PM, Nick Piggin <npiggin@suse.de> wrote:
>> The main selling point for SLUB was NUMA. Has the situation changed?
>
> Well one problem with SLAB was really just those alien caches. AFAIK
> they were added by Christoph Lameter (maybe wrong), and I didn't ever
> actually see much justification for them in the changelog. noaliencache
> can be and is used on bigger machines, and SLES and RHEL kernels are
> using SLAB on production NUMA systems up to thousands of CPU Altixes,
> and have been looking at working on SGI's UV, and hundreds of cores
> POWER7 etc.

Yes, Christoph and some other people introduced alien caches IIRC for
big iron SGI boxes. As for benchmarks, commit
e498be7dafd72fd68848c1eef1575aa7c5d658df ("Numa-aware slab allocator
V5") mentions AIM.

On Tue, May 25, 2010 at 12:34 PM, Nick Piggin <npiggin@suse.de> wrote:
> I have not seen NUMA benchmarks showing SLUB is significantly better.
> I haven't done much testing myself, mind you. But from indications, we
> could probably quite easily drop the alien caches setup and do like a
> simpler single remote freeing queue per CPU or something like that.

Commit 81819f0fc8285a2a5a921c019e3e3d7b6169d225 ("SLUB core") mentions
kernbench improvements.

Other than these two data points, I unfortunately don't have any as I
wasn't involved with merging of either of the patches. If other NUMA
people know better, please feel free to share the data.

On Tue, May 25, 2010 at 11:16 AM, Nick Piggin <npiggin@suse.de> wrote:
> I think we should: modernise SLAB code, add missing debug features,
> possibly turn off alien caches by default, chuck out SLUB, and then
> require that future changes have some reasonable bar set to justify
> them.
>
> I would not be at all against adding changes that transform SLAB to
> SLUB or SLEB or SLQB. That's how it really should be done in the
> first place.

Like I said, as a maintainer I'm happy to merge patches to modernize
SLAB but I still think you're underestimating the effort especially
considering the fact that we can't afford many performance regressions
there either. I guess trying to get rid of alien caches would be the
first logical step there.
--
Re: [RFC V2 SLEB 00/14] The Enhanced(hopefully) Slab Allocator
On Tue, 25 May 2010, Pekka Enberg wrote:

> I wouldn't say it's a nightmare, but yes, it could be better. From my
> point of view SLUB is the base of whatever the future will be because
> the code is much cleaner and simpler than SLAB.

The code may be much cleaner and simpler than slab, but nobody (to date)
has addressed the significant netperf TCP_RR regression that slub has, for
example. I worked on a patchset to do that for a while but it wasn't
popular because it added some increments to the fastpath for tracking
data.

I think it's great to have clean and simple code, but even considering its
use is a non-starter when the entire kernel is significantly slower for
certain networking loads.

> That's why I find
> Christoph's work on SLEB more interesting than SLQB, for example,
> because it's building on top of something that's mature and stable.
>
> That said, are you proposing that even without further improvements to
> SLUB, we should go ahead and, for example, remove SLAB from Kconfig
> for v2.6.36 and see if we can just delete the whole thing from, say,
> v2.6.38?
>

We use slab internally specifically because of the slub regressions.
Removing it from the kernel at this point would be the equivalent of
saying that Linux cares about certain workloads more than others since
there are clearly benchmarks that show slub to be inferior in pure
performance numbers. I'd love for us to switch to slub but we can't take
the performance hit.
--
Re: [RFC V2 SLEB 00/14] The Enhanced(hopefully) Slab Allocator
On Tue, 25 May 2010, Nick Piggin wrote:

> I don't think SLUB ever proved itself very well. The selling points
> were some untestable handwaving about how queueing is bad and jitter
> is bad, ignoring the fact that queues could be shortened and periodic
> reaping disabled at runtime with SLAB style of allocator. It also
> has relied heavily on higher order allocations which put great strain
> on hugepage allocations and page reclaim (witness the big slowdown
> in low memory conditions when tmpfs was using higher order allocations
> via SLUB).
>

I agree that the higher order allocations are a major problem and slub
relies heavily on them for being able to utilize both the allocation and
freeing fastpaths for a number of caches. For systems with a very large
amount of memory that isn't fully utilized and fragmentation isn't an
issue, this works fine, but for users who use all their memory and do some
amount of reclaim it comes at a significant cost. The cpu slab thrashing
problem that I identified with the netperf TCP_RR benchmark can be heavily
reduced by tuning certain kmalloc caches to allocate higher order slabs,
but that makes it very difficult to run with hugepages and makes the
allocation slowpath even slower. There are commandline workarounds to prevent slub
from using these higher order allocations, but the performance of the
allocator then suffers as a result.
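
The arithmetic behind this trade-off can be sketched roughly as follows. This is an illustration only, not SLUB's actual sizing logic: the 4 KiB page size and the 1500-byte object size are assumptions, per-slab metadata and alignment are ignored, and `slab_layout` is a hypothetical helper. It shows that a slab of 2**order contiguous pages holds more objects, so the allocator has to fetch a fresh slab less often, at the price of needing physically contiguous pages.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def slab_layout(object_size, order, page_size=PAGE_SIZE):
    """Return (objects_per_slab, wasted_bytes) for a slab of 2**order pages.

    Simplification: ignores per-slab metadata, alignment and debug padding.
    """
    slab_bytes = page_size << order
    objects = slab_bytes // object_size
    return objects, slab_bytes - objects * object_size


# Compare slab orders 0..3 for a hypothetical 1500-byte object.
for order in range(4):
    objects, wasted = slab_layout(1500, order)
    print(f"order {order}: {1 << order} page(s), "
          f"{objects} objects per slab, {wasted} bytes unused")
```

Forcing order 0 avoids the need for contiguous multi-page allocations, but each slab then holds only a couple of objects of this size, which is one way to picture why the fastpaths are used less often in that configuration.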

> SLUB has not been able to displace SLAB for a long time due to
> performance and higher order allocation problems.
>

Completely agreed.
--
Re: [RFC V2 SLEB 00/14] The Enhanced(hopefully) Slab Allocator
On Tue, May 25, 2010 at 12:53:43PM +0300, Pekka Enberg wrote:
> Hi Nick,
>
> On Tue, May 25, 2010 at 12:34 PM, Nick Piggin <npiggin@suse.de> wrote:
> >> The main selling point for SLUB was NUMA. Has the situation changed?
> >
> > Well one problem with SLAB was really just those alien caches. AFAIK
> > they were added by Christoph Lameter (maybe wrong), and I didn't ever
> > actually see much justification for them in the changelog. noaliencache
> > can be and is used on bigger machines, and SLES and RHEL kernels are
> > using SLAB on production NUMA systems up to thousands of CPU Altixes,
> > and have been looking at working on SGI's UV, and hundreds of cores
> > POWER7 etc.
>
> Yes, Christoph and some other people introduced alien caches IIRC for
> big iron SGI boxes. As for benchmarks, commit
> e498be7dafd72fd68848c1eef1575aa7c5d658df ("Numa-aware slab allocator
> V5") mentions AIM.

It's quite a change with a lot of things. But there are definitely
other ways we can improve this without having a huge dumb crossbar
for remote frees.


> On Tue, May 25, 2010 at 12:34 PM, Nick Piggin <npiggin@suse.de> wrote:
> > I have not seen NUMA benchmarks showing SLUB is significantly better.
> > I haven't done much testing myself, mind you. But from indications, we
> > could probably quite easily drop the alien caches setup and do like a
> > simpler single remote freeing queue per CPU or something like that.
>
> Commit 81819f0fc8285a2a5a921c019e3e3d7b6169d225 ("SLUB core") mentions
> kernbench improvements.

I haven't measured anything like that. Kernbench for me has never
had slab show anywhere near the profiles (it's always page fault,
teardown, page allocator paths).

Must have been a pretty specific configuration, but anyway I don't
know that it is realistic.


> Other than these two data points, I unfortunately don't have any as I
> wasn't involved with merging of either of the patches. If other NUMA
> people know better, please feel free to share the data.

A lot of people are finding SLAB is still required for performance
reasons. We did not want to change in SLES11 for example because
of performance concerns. Not sure about RHEL6?


> On Tue, May 25, 2010 at 11:16 AM, Nick Piggin <npiggin@suse.de> wrote:
> > I think we should: modernise SLAB code, add missing debug features,
> > possibly turn off alien caches by default, chuck out SLUB, and then
> > require that future changes have some reasonable bar set to justify
> > them.
> >
> > I would not be at all against adding changes that transform SLAB to
> > SLUB or SLEB or SLQB. That's how it really should be done in the
> > first place.
>
> Like I said, as a maintainer I'm happy to merge patches to modernize
> SLAB

I think that would be most productive at this point. I will volunteer
to do it.

As much as I would like to see SLQB be merged :) I think the best
option is to go with SLAB because it is very well tested and very
very well performing.

If Christoph or you or I or anyone have genuine improvements to make
to the core algorithms, then the best thing to do will just be to
make incremental changes to SLAB.


> but I still think you're underestimating the effort especially
> considering the fact that we can't afford many performance regressions
> there either. I guess trying to get rid of alien caches would be the
> first logical step there.

There are several aspects to this. I think the first one will be to
actually modernize the code style, simplify the bootstrap process and
static memory allocations (SLQB goes even further than SLUB in this
regard), and to pull in debug features from SLUB.

These steps should be made without any changes to core algorithms.
Alien caches can easily be disabled and at present they are really
only a problem for big Altixes where it is a known parameter to tune.

From that point, I think we should concede that SLUB has not fulfilled
performance promises, and make SLAB the default.

--
Re: [RFC V2 SLEB 00/14] The Enhanced(hopefully) Slab Allocator
Hi Nick,

On Tue, May 25, 2010 at 1:19 PM, Nick Piggin <npiggin@suse.de> wrote:
>> Like I said, as a maintainer I'm happy to merge patches to modernize
>> SLAB
>
> I think that would be most productive at this point. I will volunteer
> to do it.

OK, great!

> As much as I would like to see SLQB be merged :) I think the best
> option is to go with SLAB because it is very well tested and very
> very well performing.

I would have liked to see SLQB merged as well but it just didn't happen.

> If Christoph or you or I or anyone have genuine improvements to make
> to the core algorithms, then the best thing to do will just be to
> make incremental changes to SLAB.

I don't see the problem in improving SLUB even if we start modernizing
SLAB. Do you? I'm obviously biased towards SLUB still for the reasons
I already mentioned. I don't want to be a blocker for progress so if I
turn out to be a problem, we should consider changing the
maintainer(s). ;-)

> There are several aspects to this. I think the first one will be to
> actually modernize the code style, simplify the bootstrap process and
> static memory allocations (SLQB goes even further than SLUB in this
> regard), and to pull in debug features from SLUB.
>
> These steps should be made without any changes to core algorithms.
> Alien caches can easily be disabled and at present they are really
> only a problem for big Altixes where it is a known parameter to tune.
>
> From that point, I think we should concede that SLUB has not fulfilled
> performance promises, and make SLAB the default.

Sure. I don't care which allocator "wins" if we actually are able to get there.

Pekka
--
Re: [RFC V2 SLEB 00/14] The Enhanced(hopefully) Slab Allocator
Hi David,

On Tue, May 25, 2010 at 1:02 PM, David Rientjes <rientjes@google.com> wrote:
>> I wouldn't say it's a nightmare, but yes, it could be better. From my
>> point of view SLUB is the base of whatever the future will be because
>> the code is much cleaner and simpler than SLAB.
>
> The code may be much cleaner and simpler than slab, but nobody (to date)
> has addressed the significant netperf TCP_RR regression that slub has, for
> example.  I worked on a patchset to do that for a while but it wasn't
> popular because it added some increments to the fastpath for tracking
> data.

Yes and IIRC I asked you to resend the series because while I care a
lot about performance regressions, I simply don't have the time or the
hardware to reproduce and fix the weird cases you're seeing.

Pekka
--
Re: [RFC V2 SLEB 00/14] The Enhanced(hopefully) Slab Allocator
On Tue, May 25, 2010 at 01:45:07PM +0300, Pekka Enberg wrote:
> Hi Nick,
>
> On Tue, May 25, 2010 at 1:19 PM, Nick Piggin <npiggin@suse.de> wrote:
> >> Like I said, as a maintainer I'm happy to merge patches to modernize
> >> SLAB
> >
> > I think that would be most productive at this point. I will volunteer
> > to do it.
>
> OK, great!
>
> > As much as I would like to see SLQB be merged :) I think the best
> > option is to go with SLAB because it is very well tested and very
> > very well performing.
>
> I would have liked to see SLQB merged as well but it just didn't happen.

It seemed a bit counter productive if the goal is to have one allocator.
I think it still has merit but I should really practice what I preach
and propose incremental improvements to SLAB.


> > If Christoph or you or I or anyone have genuine improvements to make
> > to the core algorithms, then the best thing to do will just be to
> > make incremental changes to SLAB.
>
> I don't see the problem in improving SLUB even if we start modernizing
> SLAB. Do you? I'm obviously biased towards SLUB still for the reasons
> I already mentioned. I don't want to be a blocker for progress so if I
> turn out to be a problem, we should consider changing the
> maintainer(s). ;-)

I think it just has not proven itself at this point. We have most
production kernels (at least, the performance sensitive ones that
I'm aware of) running on SLAB, and if it is conceded that lack of
queueing and reliance on higher order allocations are a problem, then
I think it is far better just to bite the bullet now and drop it so
we can have a single allocator, rather than adding SLAB-like queueing
to it and making other big changes. Then make incremental improvements
to SLAB.

I have no problems at all with trying new ideas, but really, they
should be done in SLAB as incremental improvements. Everywhere we
take that approach, things seem to work better than when we do
wholesale rip and replacements.

I don't want Christoph (or myself, or you) to stop testing new ideas,
but really there are almost no good reasons as to why they can't be done
as incremental patches.

With SLAB code cleaned up, there will be even fewer reasons.


> > There are several aspects to this. I think the first one will be to
> > actually modernize the code style, simplify the bootstrap process and
> > static memory allocations (SLQB goes even further than SLUB in this
> > regard), and to pull in debug features from SLUB.
> >
> > These steps should be made without any changes to core algorithms.
> > Alien caches can easily be disabled and at present they are really
> > only a problem for big Altixes where it is a known parameter to tune.
> >
> > From that point, I think we should concede that SLUB has not fulfilled
> > performance promises, and make SLAB the default.
>
> Sure. I don't care which allocator "wins" if we actually are able to get there.

SLUB is already behind the 8 ball here. So is SLQB, I don't mind saying,
because it has had much, much less testing.


--
Re: [RFC V2 SLEB 00/14] The Enhanced(hopefully) Slab Allocator
On Tue, 25 May 2010, Pekka Enberg wrote:
>
> I would have liked to see SLQB merged as well but it just didn't happen.

And it's not going to. I'm not going to merge YASA that will stay around
for years, not improve on anything, and will just mean that there are some
bugs that developers don't see because they depend on some subtle
interaction with the sl*b allocator.

We've got three. That's at least one too many. We're not adding any new
ones until we've gotten rid of at least one old one.

Linus
--
Re: [RFC V2 SLEB 00/14] The Enhanced(hopefully) Slab Allocator
On Tue, May 25, 2010 at 08:13:50AM -0700, Linus Torvalds wrote:
>
>
> On Tue, 25 May 2010, Pekka Enberg wrote:
> >
> > I would have liked to see SLQB merged as well but it just didn't happen.
>
> And it's not going to. I'm not going to merge YASA that will stay around
> for years, not improve on anything, and will just mean that there are some
> bugs that developers don't see because they depend on some subtle
> interaction with the sl*b allocator.
>
> We've got three. That's at least one too many. We're not adding any new
> ones until we've gotten rid of at least one old one.

No, agreed; I realized that a while back (hence I stopped pushing SLQB).
SLAB is simply a good allocator that is very very hard to beat. The
fact that a lot of places are still using SLAB despite the real
secondary advantages of SLUB (cleaner code, better debugging support)
indicates to me that we should go back and start from there.

What is sad is all this duplicate (and unsynchronized and not always
complete) work implementing things in both the allocators[*] and
the split testing base.

As far as I can see, there was never a good reason to replace SLAB
rather than clean up its code and make incremental improvements.

--
Re: [RFC V2 SLEB 00/14] The Enhanced(hopefully) Slab Allocator
Hi Nick,

On Tue, May 25, 2010 at 6:43 PM, Nick Piggin <npiggin@suse.de> wrote:
> As far as I can see, there was never a good reason to replace SLAB
> rather than clean up its code and make incremental improvements.

I'm not totally convinced but I guess we're about to find that out.
How do you propose we benchmark SLAB while we clean it up and change
things to make sure we don't make the same mistakes as we did with
SLUB (i.e. miss an important workload like TPC-C)?

Pekka
--
Re: [RFC V2 SLEB 00/14] The Enhanced(hopefully) Slab Allocator
On Tue, May 25, 2010 at 08:02:32PM +0300, Pekka Enberg wrote:
> Hi Nick,
>
> On Tue, May 25, 2010 at 6:43 PM, Nick Piggin <npiggin@suse.de> wrote:
> > As far as I can see, there was never a good reason to replace SLAB
> > rather than clean up its code and make incremental improvements.
>
> I'm not totally convinced but I guess we're about to find that out.
> How do you propose we benchmark SLAB while we clean it up

Well the first pass will be code cleanups, bootstrap simplifications.
Then looking at what debugging features were implemented in SLUB but not
SLAB and what will be useful to bring over from there.

At this point the aim would be for actual allocation behaviour with
non-debug settings to be unchanged. Hopefully this removes everyone's
(apparently) largest gripe that code is crufty.

Next would be to add some options to tweak queue sizes and disable
cache reaping at runtime, for the benefit of the low-jitter crowd, and
to see if any further hotplug fixes are required.

Then would be to propose incremental improvements to the actual algorithm.
For example, replacing the alien cache crossbar with a lighter weight
or more scalable structure.


> and change
> things to make sure we don't make the same mistakes as we did with
> SLUB (i.e. miss an important workload like TPC-C)?

Obviously it is impossible to make forward progress and also catch
all regressions before release. This fact means that we have to be
able to cope with them as well as possible.

We get two benefits from starting with SLAB. Firstly, we get a larger
testing base. Secondly, we get a simple (i.e. git revert) formula of how
to get from good behaviour to bad behaviour.

I don't anticipate a huge number of functional changes to SLAB here
though. It's surprisingly hard to do better than it. Alien caches are
one area, maybe configurable higher order allocation support, jitter
reduction.

If we do get a big proposed change in the pipeline, then we have to
eat it somehow, but AFAIKS we've still got a better foundation than
starting with a completely new allocator and feeling around in the
dark trying to move it past SLAB in terms of performance.


Re: [RFC V2 SLEB 00/14] The Enhanced(hopefully) Slab Allocator [ In reply to ]
On Tue, May 25, 2010 at 8:19 PM, Nick Piggin <npiggin@suse.de> wrote:
>> I'm not totally convinced but I guess we're about to find that out.
>> How do you propose we benchmark SLAB while we clean it up
>
> Well the first pass will be code cleanups, bootstrap simplifications.
> Then looking at what debugging features were implemented in SLUB but not
> SLAB and what will be useful to bring over from there.

Bootstrap might be easy to clean up but the biggest source of cruft
comes from the deeply inlined, complex allocation paths. Cleaning
those up is bound to cause performance regressions if you're not
careful.
Re: [RFC V2 SLEB 00/14] The Enhanced(hopefully) Slab Allocator [ In reply to ]
On Tue, May 25, 2010 at 08:35:05PM +0300, Pekka Enberg wrote:
> On Tue, May 25, 2010 at 8:19 PM, Nick Piggin <npiggin@suse.de> wrote:
> >> I'm not totally convinced but I guess we're about to find that out.
> >> How do you propose we benchmark SLAB while we clean it up
> >
> > Well the first pass will be code cleanups, bootstrap simplifications.
> > Then looking at what debugging features were implemented in SLUB but not
> > SLAB and what will be useful to bring over from there.
>
> Bootstrap might be easy to clean up but the biggest source of cruft
> comes from the deeply inlined, complex allocation paths. Cleaning
> those up is bound to cause performance regressions if you're not
> careful.

Oh I see what you mean, just straight-line code speed regressions
could bite us when doing cleanups.

That's possible. I'll keep a close eye on generated asm.

Re: [RFC V2 SLEB 00/14] The Enhanced(hopefully) Slab Allocator [ In reply to ]
On Tue, 25 May 2010, Pekka Enberg wrote:

> > The code may be much cleaner and simpler than slab, but nobody (to date)
> > has addressed the significant netperf TCP_RR regression that slub has, for
> > example. I worked on a patchset to do that for a while but it wasn't
> > popular because it added some increments to the fastpath for tracking
> > data.
>
> Yes and IIRC I asked you to resend the series because while I care a
> lot about performance regressions, I simply don't have the time or the
> hardware to reproduce and fix the weird cases you're seeing.
>

My patchset still never attained parity with slab even though it improved
slub's performance for that specific benchmark on my 16-core machine with
64G of memory:

# threads   SLAB     SLUB     SLUB+patchset
16          69892    71592    69505
32          126490   95373    119731
48          138050   113072   125014
64          169240   149043   158919
80          192294   172035   179679
96          197779   187849   192154
112         217283   204962   209988
128         229848   217547   223507
144         238550   232369   234565
160         250333   239871   244789
176         256878   242712   248971
192         261611   243182   255596
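
To make the gap in the table above easier to read, each rate can be expressed as a percentage of the SLAB rate for the same thread count. This is a minimal sketch; the helper name pct_of_slab is hypothetical and only two rows are copied in for illustration.

```shell
#!/bin/bash
# Hypothetical helper: percentage of SLAB's rate reached by another allocator.
# Arguments: <slab_rate> <other_rate>; prints a percentage with one decimal.
pct_of_slab() {
    awk -v slab="$1" -v other="$2" 'BEGIN { printf "%.1f\n", 100 * other / slab }'
}

# Rows copied from the table above: threads, SLAB, SLUB, SLUB+patchset.
while read -r threads slab slub patched; do
    echo "threads=$threads slub=$(pct_of_slab "$slab" "$slub")%" \
         "patched=$(pct_of_slab "$slab" "$patched")%"
done <<'EOF'
32 126490 95373 119731
192 261611 243182 255596
EOF
```

At 32 threads, for instance, SLUB reaches roughly 75% of SLAB's rate and the patched SLUB roughly 95%.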

CONFIG_SLUB_STATS demonstrates that the kmalloc-256 and kmalloc-2048
caches are performing quite poorly without the changes:

cache          ALLOC_FASTPATH   ALLOC_SLOWPATH
kmalloc-256    98125871         31585955
kmalloc-2048   77243698         52347453

cache          FREE_FASTPATH    FREE_SLOWPATH
kmalloc-256    173624           129538000
kmalloc-2048   90520            129500630
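
The counters above can be turned into fastpath hit rates. This is a minimal sketch that assumes every allocation or free is counted in exactly one of the two counters, so the share is fastpath / (fastpath + slowpath). The helper name fastpath_pct is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: share of operations served by the fastpath, assuming
# each operation is counted in exactly one of the two counters.
# Arguments: <fastpath_count> <slowpath_count>; prints a percentage.
fastpath_pct() {
    awk -v fast="$1" -v slow="$2" \
        'BEGIN { printf "%.2f\n", 100 * fast / (fast + slow) }'
}

# Counters copied from the tables above: cache, operation, fastpath, slowpath.
while read -r cache op fast slow; do
    echo "$cache $op fastpath=$(fastpath_pct "$fast" "$slow")%"
done <<'EOF'
kmalloc-256 alloc 98125871 31585955
kmalloc-2048 alloc 77243698 52347453
kmalloc-256 free 173624 129538000
kmalloc-2048 free 90520 129500630
EOF
```

Under that assumption, well under 1% of frees take the fastpath for either cache, while allocations take it roughly 60% to 76% of the time.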

When you have these types of results, it's obvious why slub is failing to
achieve the same performance as slab. With the slub fastpath percpu work
that has been done recently, it might be possible to resurrect my patchset
and get more positive feedback because the penalty won't be as
significant, but the point is that slub still fails to achieve the same
results that slab can with heavy networking loads. Thus, I think any
discussion about removing slab is premature until it's no longer shown to
be a clear winner in comparison to its replacement, whether that is slub,
slqb, sleb, or another allocator. I agree that slub is clearly better in
terms of maintainability, but we simply can't use it because of its
performance for networking loads.

If you want to duplicate these results on machines with a larger number of
cores, just download netperf, run with CONFIG_SLUB on both netserver and
netperf machines, and use this script:

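#!/bin/bash

TIME=60                  # seconds per netperf run
HOSTNAME="<hostname>"    # netserver host; replace before running

NR_CPUS=$(grep -c ^processor /proc/cpuinfo)
echo "NR_CPUS=$NR_CPUS"

# Start $1 concurrent TCP_RR clients and wait for all of them to finish.
run_netperf() {
    for i in $(seq 1 "$1"); do
        netperf -H "$HOSTNAME" -t TCP_RR -l "$TIME" &
    done
    wait
}

# Scale from 1x to 12x NR_CPUS threads, summing the per-client rates.
ITERATIONS=0
while [ "$ITERATIONS" -lt 12 ]; do
    RATE=0
    ITERATIONS=$((ITERATIONS + 1))
    THREADS=$((NR_CPUS * ITERATIONS))
    # Drop header lines (anything containing letters), then take field 6.
    RESULTS=$(run_netperf "$THREADS" | grep -v '[a-zA-Z]' | awk '{ print $6 }')

    for j in $RESULTS; do
        RATE=$((RATE + ${j%%.*}))    # truncate the fractional part
    done
    echo "threads=$THREADS rate=$RATE"
done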