Mailing List Archive

Bayes for large heterogeneous user-base (was Re: use_bayes=0 completly disables report function)
On Sat, 21 Apr 2012 14:40:42 -0500 (CDT)
Dave Funk wrote:

> > Bayes does not make much sense in a multi-user, diverse community
> > such as my university department. Makes sense here (small company;
> > small user base).

> I'll have to disagree with that, as a person running a mail server
> for a university college (organizational unit bigger than a
> department) which has thousands of users. Bayes may not be as deadly
> accurate as it would be in a totally homogeneous environment, but it
> is still worthwhile.

+1. Our (commercial) solution includes a nightly-updated Bayes corpus
with tokens from a couple of million messages from more than a million
end users, and it's still extremely accurate at picking out spam. As
you wrote, ham is all over the place, but spam tends to look the same.
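
To illustrate the point, here is a minimal sketch of a single shared token
table trained on ham from very different users. It is a toy naive Bayes
with add-one smoothing, not SpamAssassin's actual Bayes implementation and
not our product's; the class name, the sample messages and the scoring
details are all made up for the example.

```python
import math
from collections import Counter


def tokenize(text):
    # Crude whitespace tokenizer; a real filter does far more than this.
    return text.lower().split()


class SharedBayes:
    """Hypothetical site-wide token table shared by every user."""

    def __init__(self):
        self.spam_tokens = Counter()
        self.ham_tokens = Counter()
        self.nspam = 0
        self.nham = 0

    def learn(self, text, is_spam):
        # Count each token once per message.
        tokens = set(tokenize(text))
        if is_spam:
            self.spam_tokens.update(tokens)
            self.nspam += 1
        else:
            self.ham_tokens.update(tokens)
            self.nham += 1

    def spam_probability(self, text):
        # Naive Bayes in log-odds form with add-one smoothing and equal
        # priors (an assumption made to keep the sketch short).
        log_odds = 0.0
        for tok in set(tokenize(text)):
            p_spam = (self.spam_tokens[tok] + 1) / (self.nspam + 2)
            p_ham = (self.ham_tokens[tok] + 1) / (self.nham + 2)
            log_odds += math.log(p_spam / p_ham)
        return 1.0 / (1.0 + math.exp(-log_odds))


clf = SharedBayes()

# Ham from three unrelated users: almost no vocabulary in common.
clf.learn("lecture notes for thermodynamics seminar", is_spam=False)
clf.learn("quarterly invoice for the plumbing contract", is_spam=False)
clf.learn("grant proposal draft for genome sequencing", is_spam=False)

# Spam sent to all of them: the same few tokens over and over.
clf.learn("cheap pills buy now", is_spam=True)
clf.learn("buy cheap watches now", is_spam=True)
clf.learn("cheap loans buy now online", is_spam=True)

spam_score = clf.spam_probability("buy cheap pills now")
ham_score = clf.spam_probability("notes for the genome seminar")
print(f"spam-like message: {spam_score:.3f}")
print(f"ham-like message:  {ham_score:.3f}")
```

Because the spam tokens repeat across every message while each user's ham
vocabulary is thinly spread, the repeated tokens carry most of the weight,
and the shared table still separates the two test messages cleanly.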