Mailing List Archive

TxRep increases sa-learn processing time exponentially
Hello all,

As you can see from the attached conf, I use Redis to store Bayes and auto-whitelist data, primarily because I run macOS and the disk-based DB storage doesn’t work on APFS.

I recently enabled TxRep. However, enabling it greatly increases the time and resources (memory and CPU load) that sa-learn uses. For example, when I run 'sa-learn --spam --mbox /my/mail.mbox' on a mailbox with 1000-2000 messages, it typically takes 1-2 minutes to finish. Setting 'use_txrep 1' bumps that up to 20-40 minutes while also consuming significantly more memory and CPU.

Is this normal or am I doing something wrong?


# Configure Mail::SpamAssassin::Plugin::TxRep
# http://truxoft.com/resources/txrep.htm
#
use_txrep 1
txrep_factory Mail::SpamAssassin::RedisAddrList

# Configure Mail::SpamAssassin::Plugin::RedisAWL
# https://metacpan.org/pod/Mail::SpamAssassin::Plugin::RedisAWL
auto_whitelist_redis_server 127.0.0.1:6379
auto_whitelist_redis_prefix awl_

# Configure the Bayes learning system
use_bayes 1
use_bayes_rules 1
use_learner 1
bayes_use_hapaxes 1
bayes_learn_to_journal 0
bayes_token_ttl 30d
bayes_seen_ttl 14d

# Configure Redis for Bayes token storage
bayes_store_module Mail::SpamAssassin::BayesStore::Redis
bayes_sql_dsn server=127.0.0.1:6379;database=0

# Configure Bayes auto-learning
bayes_auto_learn 1
bayes_auto_expire 1
bayes_auto_learn_on_error 1
bayes_auto_learn_threshold_spam 12.00
bayes_auto_learn_threshold_nonspam -1.00
Re: TxRep increases sa-learn processing time exponentially
I would open a bug on Bugzilla: this may be a genuine defect, and at
the very least it points to optimization issues.
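
A reproducible timing comparison would make such a report easier to act
on. The following is only a minimal sketch, not a tested procedure. It
assumes sa-learn is on the PATH and that its --cf option can override
use_txrep for a single run (worth verifying against your version);
sa_learn_cmd, timed_run and compare are hypothetical helper names.

```python
import subprocess
import sys
import time


def sa_learn_cmd(mbox, txrep_enabled):
    """Build the sa-learn command line from the report, forcing use_txrep on or off.

    Assumption: a --cf config line overrides the value set in local.cf.
    """
    flag = 1 if txrep_enabled else 0
    return ["sa-learn", "--spam", "--mbox", f"--cf=use_txrep {flag}", mbox]


def timed_run(cmd):
    """Run a command to completion and return the wall-clock seconds it took."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start


def compare(mbox):
    """Time one learning run without TxRep and one with it."""
    results = {}
    for enabled in (False, True):
        results[enabled] = timed_run(sa_learn_cmd(mbox, enabled))
        print(f"use_txrep {int(enabled)}: {results[enabled]:.1f} s", file=sys.stderr)
    return results
```

Calling compare("/my/mail.mbox") would run both variants in sequence.
Messages learned in the first run are likely to be treated as already
seen in the second, so for a fair comparison each run should probably
start from the same database state.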

On 2/11/2019 8:08 AM, Palvelin Postmaster wrote:
> Hello all,
>
> As you can see from the attached conf, I use Redis to store Bayes and auto-whitelist data, primarily because I run macOS and the disk-based DB storage doesn’t work on APFS.
>
> I recently enabled TxRep. However, enabling it greatly increases the time and resources (memory and CPU load) that sa-learn uses. For example, when I run 'sa-learn --spam --mbox /my/mail.mbox' on a mailbox with 1000-2000 messages, it typically takes 1-2 minutes to finish. Setting 'use_txrep 1' bumps that up to 20-40 minutes while also consuming significantly more memory and CPU.
>
> Is this normal or am I doing something wrong?
>
>
> # Configure Mail::SpamAssassin::Plugin::TxRep
> # http://truxoft.com/resources/txrep.htm
> #
> use_txrep 1
> txrep_factory Mail::SpamAssassin::RedisAddrList
>
> # Configure Mail::SpamAssassin::Plugin::RedisAWL
> # https://metacpan.org/pod/Mail::SpamAssassin::Plugin::RedisAWL
> auto_whitelist_redis_server 127.0.0.1:6379
> auto_whitelist_redis_prefix awl_
>
> # Configure the Bayes learning system
> use_bayes 1
> use_bayes_rules 1
> use_learner 1
> bayes_use_hapaxes 1
> bayes_learn_to_journal 0
> bayes_token_ttl 30d
> bayes_seen_ttl 14d
>
> # Configure Redis for Bayes token storage
> bayes_store_module Mail::SpamAssassin::BayesStore::Redis
> bayes_sql_dsn server=127.0.0.1:6379;database=0
>
> # Configure Bayes auto-learning
> bayes_auto_learn 1
> bayes_auto_expire 1
> bayes_auto_learn_on_error 1
> bayes_auto_learn_threshold_spam 12.00
> bayes_auto_learn_threshold_nonspam -1.00


--
Kevin A. McGrail
Member, Apache Software Foundation
Chair Emeritus Apache SpamAssassin Project
https://www.linkedin.com/in/kmcgrail - 703.798.0171
Re: TxRep increases sa-learn processing time exponentially
On 11 Feb 2019, Kevin A. McGrail uttered the following:

> I would open a bug on bugzilla because that could be an issue but it
> certainly points to optimization issues.

This is probably <https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7587>.

--
NULL && (void)
Re: TxRep increases sa-learn processing time exponentially
On 26 Feb 2019, nix@esperi.org.uk said:

> On 11 Feb 2019, Kevin A. McGrail uttered the following:
>
>> I would open a bug on bugzilla because that could be an issue but it
>> certainly points to optimization issues.
>
> This is probably <https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7587>.

(Or, rather, it's probably not *exactly* that because this is a
different backend: but I'd bet there is something similar wrong with
Redis, be it locks or repeated open/close causing massive I/O or
something like that.)

--
NULL && (void)
Re: TxRep increases sa-learn processing time exponentially
check https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7164

My amateur analysis was summarized in this message https://mail-archives.apache.org/mod_mbox/spamassassin-users/201711.mbox/browser




On 26/02/2019 19.30, Nix wrote:
> On 26 Feb 2019, nix@esperi.org.uk said:
>
>> On 11 Feb 2019, Kevin A. McGrail uttered the following:
>>
>>> I would open a bug on bugzilla because that could be an issue but it
>>> certainly points to optimization issues.
>> This is probably <https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7587>.
> (Or, rather, it's probably not *exactly* that because this is a
> different backend: but I'd bet there is something similar wrong with
> Redis, be it locks or repeated open/close causing massive I/O or
> something like that.)
>
Re: TxRep increases sa-learn processing time exponentially
On 27 Feb 2019, David Gessel said:

>
> check https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7164
>
> My amateur analysis was summarized in this message https://mail-archives.apache.org/mod_mbox/spamassassin-users/201711.mbox/browser

Yeah, constantly recreating the factory unconditionally would be a
disaster for performance -- but if the factory is tied to the user in
use, we surely *do* need to keep different factories around for each
user, etc.
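
To make the trade-off concrete, here is a minimal sketch of the
alternative to recreating the factory for every message: cache one
factory per user and build it only on first use. This is an
illustration in Python, not TxRep's actual (Perl) code; FactoryCache
and make_factory are hypothetical names, and the "factory" here is just
a stand-in object.

```python
class FactoryCache:
    """Keep one storage factory per user instead of rebuilding it per message."""

    def __init__(self, make_factory):
        # make_factory(user) stands in for the expensive step
        # (opening a database, connecting to Redis, and so on).
        self._make_factory = make_factory
        self._by_user = {}
        self.created = 0  # how many times the expensive step actually ran

    def get(self, user):
        """Return the cached factory for this user, creating it on first use."""
        if user not in self._by_user:
            self._by_user[user] = self._make_factory(user)
            self.created += 1
        return self._by_user[user]


# Three messages for alice and one for bob need only two factories.
cache = FactoryCache(lambda user: {"owner": user})
for recipient in ["alice", "alice", "bob", "alice"]:
    factory = cache.get(recipient)
print(cache.created)
```

Keying the cache by user avoids handing one user's factory to another,
while still avoiding a reconnect for every message.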
Re: TxRep increases sa-learn processing time exponentially
On 27 Feb 2019, David Gessel told this:

>
> check https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7164
>
> My amateur analysis was summarized in this message https://mail-archives.apache.org/mod_mbox/spamassassin-users/201711.mbox/browser

btw, that's not a message, that's a whole mailbox. :)

One thread in that mailbox talks about sa-learn taking 90 seconds per
token. 90 seconds is 3x the flock timeout for the txrep database, which
is consistent with four lock takeouts, three of them blocking on its own
locks because it doesn't bother to release the locks (perhaps the author
wrongly assumes they nest.)
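
The self-blocking behaviour is easy to reproduce outside SpamAssassin.
The sketch below is a generic Python illustration on a Unix-like
system, not TxRep's code: it takes an exclusive flock through one file
handle, then tries again through a second handle in the same process.
The 30-second timeout is only inferred from "90 seconds is 3x the flock
timeout"; second_lock_blocks and stall_seconds are hypothetical names.

```python
import fcntl
import tempfile


def second_lock_blocks(path):
    """Return True if a second exclusive flock taken by the same process
    through a separate file handle would block (that is, flock locks do not nest)."""
    first = open(path, "a")
    second = open(path, "a")
    try:
        fcntl.flock(first, fcntl.LOCK_EX)
        try:
            # LOCK_NB makes the call fail immediately instead of waiting.
            fcntl.flock(second, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return True
        return False
    finally:
        second.close()
        first.close()


def stall_seconds(self_blocked_attempts, timeout_seconds):
    """Total time lost if each self-blocked attempt waits out the full timeout."""
    return self_blocked_attempts * timeout_seconds


with tempfile.NamedTemporaryFile() as tmp:
    print(second_lock_blocks(tmp.name))

# Four lock attempts, three of which block on the first one:
print(stall_seconds(3, 30))
```

On a local filesystem the second attempt is expected to block, because
flock locks belong to the open file handle rather than to the process.
Three such waits at an assumed 30 seconds each give the 90 seconds
mentioned above.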

(90s/message is precisely what I saw until I hacked up the ugly
blocks-on-its-own-locks fix I cited earlier. Honestly, I suspect TxRep's
lock handling and state handling in general is so much of a tangled mess
that the thing cannot be considered a suitable replacement for the AWL
until it's entirely rewritten. It blocks on its own locks, it is clearly
doing something similar with redis, it reuses other users' configuration
unless you force it to throw away all its cached state for every message
and reconnect to all its dbs again (!)... this is not production-quality
code, sorry. I keep meaning to switch back to the AWL, which might be
less effective but at least doesn't have giant bugs suggestive of
software that is just not fully baked scattered all through it.)

--
NULL && (void)
Re: TxRep increases sa-learn processing time exponentially
Nix,


That's probably a reasonable path for now. I'm using TxRep with the diff I posted, but not on a large mail server. Thanks for the insight.


-David



On 27/02/2019 17.27, Nix wrote:
> On 27 Feb 2019, David Gessel told this:
>
>> check https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7164
>>
>> My amateur analysis was summarized in this message https://mail-archives.apache.org/mod_mbox/spamassassin-users/201711.mbox/browser
> btw, that's not a message, that's a whole mailbox. :)
>
> One thread in that mailbox talks about sa-learn taking 90 seconds per
> token. 90 seconds is 3x the flock timeout for the txrep database, which
> is consistent with four lock takeouts, three of them blocking on its own
> locks because it doesn't bother to release the locks (perhaps the author
> wrongly assumes they nest.)
>
> (90s/message is precisely what I saw until I hacked up the ugly
> blocks-on-its-own-locks fix I cited earlier. Honestly, I suspect TxRep's
> lock handling and state handling in general is so much of a tangled mess
> that the thing cannot be considered a suitable replacement for the AWL
> until it's entirely rewritten. It blocks on its own locks, it is clearly
> doing something similar with redis, it reuses other users' configuration
> unless you force it to throw away all its cached state for every message
> and reconnect to all its dbs again (!)... this is not production-quality
> code, sorry. I keep meaning to switch back to the AWL, which might be
> less effective but at least doesn't have giant bugs suggestive of
> software that is just not fully baked scattered all through it.)
>