Mailing List Archive

Generic B-tree implementation
Hello folks

I am attaching source files containing a very generic implementation
of B-trees in C. The implementation corresponds to in memory B-Tree
data structure. The B-tree library consists of two files, btree.h and
btree.c. I am also attaching a sample program main.c which should
hopefully make the use of the library clear.

I would be happy to receive inputs from the community for changes
(enchancements) to the library. All the more I would like to help
someone with a project which would take advantage of the B-Tree
implementation.

- Vishal

PS: This code is being released under GPL version 2.

--
Motivation will almost always beat mere talent.
Re: Generic B-tree implementation [ In reply to ]
Vishal Patil <vishpat@gmail.com> wrote:
> I am attaching source files containing a very generic implementation
> of B-trees in C. The implementation corresponds to in memory B-Tree
> data structure. The B-tree library consists of two files, btree.h and
> btree.c. I am also attaching a sample program main.c which should
> hopefully make the use of the library clear.

B-trees are useful mainly when you can get a bunch of pointers in one
swoop, i.e., by reading nodes from disk.

> I would be happy to receive inputs from the community for changes
> (enchancements) to the library. All the more I would like to help
> someone with a project which would take advantage of the B-Tree
> implementation.

Build infrastructure (== library) without clear users won't go very far on
LKML.
--
Dr. Horst H. von Brand User #22616 counter.li.org
Departamento de Informatica Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria +56 32 654239
Casilla 110-V, Valparaiso, Chile Fax: +56 32 797513
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: Generic B-tree implementation [ In reply to ]
Agreed, however if I am not mistaken B-trees are useful even for
virtual memory implementation, for example HP-UX uses B-trees for
managing virtual memory pages.

Also I did not get the statement
"Build infrastructure (== library) without clear users won't go very
far on LKML"

- Vishal


On 7/17/06, Horst von Brand <vonbrand@inf.utfsm.cl> wrote:
> Vishal Patil <vishpat@gmail.com> wrote:
> > I am attaching source files containing a very generic implementation
> > of B-trees in C. The implementation corresponds to in memory B-Tree
> > data structure. The B-tree library consists of two files, btree.h and
> > btree.c. I am also attaching a sample program main.c which should
> > hopefully make the use of the library clear.
>
> B-trees are useful mainly when you can get a bunch of pointers in one
> swoop, i.e., by reading nodes from disk.
>
> > I would be happy to receive inputs from the community for changes
> > (enchancements) to the library. All the more I would like to help
> > someone with a project which would take advantage of the B-Tree
> > implementation.
>
> Build infrastructure (== library) without clear users won't go very far on
> LKML.
> --
> Dr. Horst H. von Brand User #22616 counter.li.org
> Departamento de Informatica Fono: +56 32 654431
> Universidad Tecnica Federico Santa Maria +56 32 654239
> Casilla 110-V, Valparaiso, Chile Fax: +56 32 797513
>


--
Motivation will almost always beat mere talent.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: Generic B-tree implementation [ In reply to ]
On Mon, 17 Jul 2006 23:08:53 EDT, Vishal Patil said:
> Agreed, however if I am not mistaken B-trees are useful even for
> virtual memory implementation, for example HP-UX uses B-trees for
> managing virtual memory pages.

OK, sounds at least somewhat plausible..

> Also I did not get the statement
> "Build infrastructure (== library) without clear users won't go very
> far on LKML"

Your patch would go a lot further if it came as 2 parts:

PATCH 1/2: Add Generic B-tree implementation
PATCH 2/2: Convert mm/foobar.c to track VM pages using B-trees.

Barring an actual patch 2/2, a *clear* explanation of why it would benefit
a *specific* piece of code so somebody else can do it...
RE: Generic B-tree implementation [ In reply to ]
Vishal Patil wrote:
>
> I am attaching source files containing a very generic implementation
> of B-trees in C. The implementation corresponds to in memory B-Tree
> data structure. The B-tree library consists of two files, btree.h and
> btree.c. I am also attaching a sample program main.c which should
> hopefully make the use of the library clear.

Couple of thoughts:

1. red/black b-trees have superior worst case performance as it
relates to rebalancing, and the implementation doesn't add a
lot of complexity:
http://www.nist.gov/dads/HTML/redblack.html

2. Paul Vixie's b-tree implementation has been around since the mid-80's
or so, and simply from an historical perspective is worth a look
(comp.sources.unix anyone?):
http://www.isc.org/index.pl?/sources/devel/func/avl-subs-2.php

3. GCC uses 'splay trees' to good advantage:
http://www.nist.gov/dads/HTML/splaytree.html
which have the property that most-recently referenced nodes
tend to be higher up in the tree.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: Generic B-tree implementation [ In reply to ]
Gary

I said B-Tree and not binary tree, please read the explaination about
B-tree at http://en.wikipedia.org/wiki/B-tree. Also I am aware of AVL
trees.

I never claimed that my implementation is better or anything like
that. I posted the code so that someone in need of the data structure
might use it. Also I would be willing them to help with their project.

- Vishal

On 7/18/06, Gary Funck <gary@intrepid.com> wrote:
>
> Vishal Patil wrote:
> >
> > I am attaching source files containing a very generic implementation
> > of B-trees in C. The implementation corresponds to in memory B-Tree
> > data structure. The B-tree library consists of two files, btree.h and
> > btree.c. I am also attaching a sample program main.c which should
> > hopefully make the use of the library clear.
>
> Couple of thoughts:
>
> 1. red/black b-trees have superior worst case performance as it
> relates to rebalancing, and the implementation doesn't add a
> lot of complexity:
> http://www.nist.gov/dads/HTML/redblack.html
>
> 2. Paul Vixie's b-tree implementation has been around since the mid-80's
> or so, and simply from an historical perspective is worth a look
> (comp.sources.unix anyone?):
> http://www.isc.org/index.pl?/sources/devel/func/avl-subs-2.php
>
> 3. GCC uses 'splay trees' to good advantage:
> http://www.nist.gov/dads/HTML/splaytree.html
> which have the property that most-recently referenced nodes
> tend to be higher up in the tree.
>
>


--
Motivation will almost always beat mere talent.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
RE: Generic B-tree implementation [ In reply to ]
Vishal Patil wrote:
> I said B-Tree and not binary tree, please read the explaination about
> B-tree at http://en.wikipedia.org/wiki/B-tree. Also I am aware of AVL
> trees.
>
> I never claimed that my implementation is better or anything like
> that. I posted the code so that someone in need of the data structure
> might use it. Also I would be willing them to help with their project.

My reason for pointing out the other data strucutres is to note that there
might be search tree representations that are more appropriate for
implementation inside the kernel, and to perhaps encourage you to have
a look at implementing them as well. Red-black trees in particular have
the property that they're reasonably well-balanced, and that the balancing
algorithm makes use of local information. That means that the kernel might
be able to limit the level of locking required to update the tree.

I liked your B-tree implementation, and have saved a copy. Too bad there
isn't the C/C++ equivalent of CPAN (comp.unix.sources is so passe`). Your
B-tree implementation would make a nice addition to an archive of
handy C algorithm implementations.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: Generic B-tree implementation [ In reply to ]
On 7/18/06, Gary Funck <gary@intrepid.com> wrote:
> I liked your B-tree implementation, and have saved a copy. Too bad there
> isn't the C/C++ equivalent of CPAN (comp.unix.sources is so passe`). Your
> B-tree implementation would make a nice addition to an archive of
> handy C algorithm implementations.

FWIW C++ has http://boost.org/... and C has /usr/lib/ :)

-Bob
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: Generic B-tree implementation [ In reply to ]
B-trees are good for parellel updates as well. Anyway it would be
great to have inputs from other folks about how B-trees could help
inside the kernel (if at all)

- Vishal

On 7/18/06, Gary Funck <gary@intrepid.com> wrote:
>
> Vishal Patil wrote:
> > I said B-Tree and not binary tree, please read the explaination about
> > B-tree at http://en.wikipedia.org/wiki/B-tree. Also I am aware of AVL
> > trees.
> >
> > I never claimed that my implementation is better or anything like
> > that. I posted the code so that someone in need of the data structure
> > might use it. Also I would be willing them to help with their project.
>
> My reason for pointing out the other data strucutres is to note that there
> might be search tree representations that are more appropriate for
> implementation inside the kernel, and to perhaps encourage you to have
> a look at implementing them as well. Red-black trees in particular have
> the property that they're reasonably well-balanced, and that the balancing
> algorithm makes use of local information. That means that the kernel might
> be able to limit the level of locking required to update the tree.
>
> I liked your B-tree implementation, and have saved a copy. Too bad there
> isn't the C/C++ equivalent of CPAN (comp.unix.sources is so passe`). Your
> B-tree implementation would make a nice addition to an archive of
> handy C algorithm implementations.
>


--
Motivation will almost always beat mere talent.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: Generic B-tree implementation [ In reply to ]
On Tue, 2006-07-18 at 11:22 -0400, Vishal Patil wrote:
> B-trees are good for parellel updates as well. Anyway it would be
> great to have inputs from other folks about how B-trees could help
> inside the kernel (if at all)

B-trees are mostly used in file systems in the kernel. For example NTFS
and HFS (and I think HPFS) use B-trees for various metadata like
directory indexes for example.

But of course your implementation is purely userspace and cannot be used
in the kernel (you use recursion for example...) so I am not sure how
you envisage to help the kernel with your code...

Best regards,

Anton
--
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer / IRC: #ntfs on irc.freenode.net
WWW: http://linux-ntfs.sf.net/ & http://www-stu.christs.cam.ac.uk/~aia21/

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: Generic B-tree implementation [ In reply to ]
I can get rid of recursions using loops, will need to work a little more on it.

Also I will be working on developing a patch for VM management using
B-trees instead of RB-trees.

- Vishal

On 7/19/06, Anton Altaparmakov <aia21@cam.ac.uk> wrote:
> On Tue, 2006-07-18 at 11:22 -0400, Vishal Patil wrote:
> > B-trees are good for parellel updates as well. Anyway it would be
> > great to have inputs from other folks about how B-trees could help
> > inside the kernel (if at all)
>
> B-trees are mostly used in file systems in the kernel. For example NTFS
> and HFS (and I think HPFS) use B-trees for various metadata like
> directory indexes for example.
>
> But of course your implementation is purely userspace and cannot be used
> in the kernel (you use recursion for example...) so I am not sure how
> you envisage to help the kernel with your code...
>
> Best regards,
>
> Anton
> --
> Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
> Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
> Linux NTFS maintainer / IRC: #ntfs on irc.freenode.net
> WWW: http://linux-ntfs.sf.net/ & http://www-stu.christs.cam.ac.uk/~aia21/
>
>


--
Motivation will almost always beat mere talent.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: Generic B-tree implementation [ In reply to ]
On Wed, Jul 19, 2006 at 09:34:43AM -0400, Vishal Patil wrote:
> I can get rid of recursions using loops, will need to work a little more on
> it.

Before doing the above you may want to learn about all possible malloc
retvals too and to make sure the interface has all needed oom failure
paths that you're obviously missing.

One of the advantages of rbtree vs b-trees (and vs radixtrees too) is
the fact they require zero dynamic metadata allocations of ram. They
use the same trick of list.h to avoid it while still being mostly
generic and sharable library code. Imagine rbtrees like scalable
lists. The kernel usage is quite optimized too, the mmap path for
example does a single lookup and it stores the last "lookup" point
before restarting with an insertion while keeping the mmap_sem (or
mutex renaming of the day) on hold so to avoid the insertion operation
to start over with a second (wasteful) lookup (again very similar to
what you could do if you had list, and the rebalancing is a very
immediate operation too involving only a limited number of pointers).

> Also I will be working on developing a patch for VM management using
> B-trees instead of RB-trees.

Once you start changing those bits, you'll notice the further
requirement of the btrees due to the oom failures in code paths that
are already reasonably complex with vma oom failures.

As speed of cache raises faster than speed of ram, memory seeks tends
to cost more than they did in the past, but I doubt it worth it, most
important especially in the common case of very few vmas. I like the
common case of only a few dozen vmas to be so fast and low
overhead. The corner cases like uml and oracle already use nonlinear,
to also avoid the ram overhead of the vmas, with btree the lowmem
overhead would be even higher (the only 4/8 bytes of overhead of the
rbtrees would even be fixable with David's patch, but nobody
considered it very important so far to eliminate those 4/8 bytes
32bit/64bit per vma, though we can do that in the future). So even if
btree would be faster for those extreme corner cases, it would still
not be a replacement for the nonlinear (I wish there was a decent
replacement for nonlinear, whose only reason to exist seems to be uml
on 64bit archs).

If I would be in you, as a slightly more likely to succeed experiment,
I would be looking into replacing the pagecache radix-tree with a
btree, as long as you can leave intact the tagging properties we have
in the radix-tree needed for finding only dirty elements in the tree
etc... (we use that to avoid separate dirty lists for the pages). You
should also size the order to automatically match the cache size of
the arch (dunno if it's better at compile or run time). I'm no a
radix-tree guru but the btree may save some ram if you've all
pagecache pages scattered all over the place with random access. It
also won't require all levels to be allocated. However it will require
rebalancing, something the radix tree doesn't require, it seems a bit
of a tradeoff, and I suspect the radix-tree will still win in all
common cases. But at least all oom failure paths should already exists
for you, so that should avoid you having to touch much code externally
to your own btree files.

I wish you to have fun with the btrees, I remember I had fun back then
when I was playing with the rbtrees ;).
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: Generic B-tree implementation [ In reply to ]
Andrea

Thank you for your time and a very valuable input. I was thinking of
implementing the VM management using B-trees because I wanted to play
with something interesting in the kernel. However I will definately
look into your idea of page cache as well.

Will keep everyone informed about my progress.

- Vishal


On 7/19/06, Andrea Arcangeli <andrea@suse.de> wrote:
> On Wed, Jul 19, 2006 at 09:34:43AM -0400, Vishal Patil wrote:
> > I can get rid of recursions using loops, will need to work a little more on
> > it.
>
> Before doing the above you may want to learn about all possible malloc
> retvals too and to make sure the interface has all needed oom failure
> paths that you're obviously missing.
>
> One of the advantages of rbtree vs b-trees (and vs radixtrees too) is
> the fact they require zero dynamic metadata allocations of ram. They
> use the same trick of list.h to avoid it while still being mostly
> generic and sharable library code. Imagine rbtrees like scalable
> lists. The kernel usage is quite optimized too, the mmap path for
> example does a single lookup and it stores the last "lookup" point
> before restarting with an insertion while keeping the mmap_sem (or
> mutex renaming of the day) on hold so to avoid the insertion operation
> to start over with a second (wasteful) lookup (again very similar to
> what you could do if you had list, and the rebalancing is a very
> immediate operation too involving only a limited number of pointers).
>
> > Also I will be working on developing a patch for VM management using
> > B-trees instead of RB-trees.
>
> Once you start changing those bits, you'll notice the further
> requirement of the btrees due to the oom failures in code paths that
> are already reasonably complex with vma oom failures.
>
> As speed of cache raises faster than speed of ram, memory seeks tends
> to cost more than they did in the past, but I doubt it worth it, most
> important especially in the common case of very few vmas. I like the
> common case of only a few dozen vmas to be so fast and low
> overhead. The corner cases like uml and oracle already use nonlinear,
> to also avoid the ram overhead of the vmas, with btree the lowmem
> overhead would be even higher (the only 4/8 bytes of overhead of the
> rbtrees would even be fixable with David's patch, but nobody
> considered it very important so far to eliminate those 4/8 bytes
> 32bit/64bit per vma, though we can do that in the future). So even if
> btree would be faster for those extreme corner cases, it would still
> not be a replacement for the nonlinear (I wish there was a decent
> replacement for nonlinear, whose only reason to exist seems to be uml
> on 64bit archs).
>
> If I would be in you, as a slightly more likely to succeed experiment,
> I would be looking into replacing the pagecache radix-tree with a
> btree, as long as you can leave intact the tagging properties we have
> in the radix-tree needed for finding only dirty elements in the tree
> etc... (we use that to avoid separate dirty lists for the pages). You
> should also size the order to automatically match the cache size of
> the arch (dunno if it's better at compile or run time). I'm no a
> radix-tree guru but the btree may save some ram if you've all
> pagecache pages scattered all over the place with random access. It
> also won't require all levels to be allocated. However it will require
> rebalancing, something the radix tree doesn't require, it seems a bit
> of a tradeoff, and I suspect the radix-tree will still win in all
> common cases. But at least all oom failure paths should already exists
> for you, so that should avoid you having to touch much code externally
> to your own btree files.
>
> I wish you to have fun with the btrees, I remember I had fun back then
> when I was playing with the rbtrees ;).
>


--
Motivation will almost always beat mere talent.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: Generic B-tree implementation [ In reply to ]
Folks

Following Andrea's suggestions I have implemented the Linux kernel
page cache using B-Trees. I am attaching the patch along with this
email. Note that the patch was developed against 2.6.16.2. Also the
patch offers the B-tree data structure as a library (like the radix
tree)

Also I haven't made any performance measurements to compare it with
the radix tree implementation. However ideas to do this are most
welcome.

Thanks

- Vishal

On 7/19/06, Vishal Patil <vishpat@gmail.com> wrote:
> Andrea
>
> Thank you for your time and a very valuable input. I was thinking of
> implementing the VM management using B-trees because I wanted to play
> with something interesting in the kernel. However I will definately
> look into your idea of page cache as well.
>
> Will keep everyone informed about my progress.
>
> - Vishal
>
>
> On 7/19/06, Andrea Arcangeli <andrea@suse.de> wrote:
> > On Wed, Jul 19, 2006 at 09:34:43AM -0400, Vishal Patil wrote:
> > > I can get rid of recursions using loops, will need to work a little more on
> > > it.
> >
> > Before doing the above you may want to learn about all possible malloc
> > retvals too and to make sure the interface has all needed oom failure
> > paths that you're obviously missing.
> >
> > One of the advantages of rbtree vs b-trees (and vs radixtrees too) is
> > the fact they require zero dynamic metadata allocations of ram. They
> > use the same trick of list.h to avoid it while still being mostly
> > generic and sharable library code. Imagine rbtrees like scalable
> > lists. The kernel usage is quite optimized too, the mmap path for
> > example does a single lookup and it stores the last "lookup" point
> > before restarting with an insertion while keeping the mmap_sem (or
> > mutex renaming of the day) on hold so to avoid the insertion operation
> > to start over with a second (wasteful) lookup (again very similar to
> > what you could do if you had list, and the rebalancing is a very
> > immediate operation too involving only a limited number of pointers).
> >
> > > Also I will be working on developing a patch for VM management using
> > > B-trees instead of RB-trees.
> >
> > Once you start changing those bits, you'll notice the further
> > requirement of the btrees due to the oom failures in code paths that
> > are already reasonably complex with vma oom failures.
> >
> > As speed of cache raises faster than speed of ram, memory seeks tends
> > to cost more than they did in the past, but I doubt it worth it, most
> > important especially in the common case of very few vmas. I like the
> > common case of only a few dozen vmas to be so fast and low
> > overhead. The corner cases like uml and oracle already use nonlinear,
> > to also avoid the ram overhead of the vmas, with btree the lowmem
> > overhead would be even higher (the only 4/8 bytes of overhead of the
> > rbtrees would even be fixable with David's patch, but nobody
> > considered it very important so far to eliminate those 4/8 bytes
> > 32bit/64bit per vma, though we can do that in the future). So even if
> > btree would be faster for those extreme corner cases, it would still
> > not be a replacement for the nonlinear (I wish there was a decent
> > replacement for nonlinear, whose only reason to exist seems to be uml
> > on 64bit archs).
> >
> > If I would be in you, as a slightly more likely to succeed experiment,
> > I would be looking into replacing the pagecache radix-tree with a
> > btree, as long as you can leave intact the tagging properties we have
> > in the radix-tree needed for finding only dirty elements in the tree
> > etc... (we use that to avoid separate dirty lists for the pages). You
> > should also size the order to automatically match the cache size of
> > the arch (dunno if it's better at compile or run time). I'm no a
> > radix-tree guru but the btree may save some ram if you've all
> > pagecache pages scattered all over the place with random access. It
> > also won't require all levels to be allocated. However it will require
> > rebalancing, something the radix tree doesn't require, it seems a bit
> > of a tradeoff, and I suspect the radix-tree will still win in all
> > common cases. But at least all oom failure paths should already exists
> > for you, so that should avoid you having to touch much code externally
> > to your own btree files.
> >
> > I wish you to have fun with the btrees, I remember I had fun back then
> > when I was playing with the rbtrees ;).
> >
>
>
> --
> Motivation will almost always beat mere talent.
>


--
Motivation will almost always beat mere talent.