TxRep increases sa-learn processing time exponentially
Hello all,

As you can see from the attached conf, I use Redis to store Bayes and auto-whitelist data, primarily because I run macOS and the disk-based DB storage doesn't work on APFS.

I recently enabled TxRep. However, enabling it dramatically increases the time and resources (memory and CPU load) that sa-learn needs. For example, when I run 'sa-learn --spam --mbox /my/mail.mbox' on a mailbox with 1000-2000 messages, it typically takes 1-2 minutes to finish. Setting 'use_txrep 1' bumps that up to 20-40 minutes (roughly 10 to 40 times longer) while also consuming significantly more memory and CPU.

Is this normal or am I doing something wrong?


# Configure Mail::SpamAssassin::Plugin::TxRep
# http://truxoft.com/resources/txrep.htm
#
use_txrep 1
txrep_factory Mail::SpamAssassin::RedisAddrList

# Configure Mail::SpamAssassin::Plugin::RedisAWL
# https://metacpan.org/pod/Mail::SpamAssassin::Plugin::RedisAWL
auto_whitelist_redis_server 127.0.0.1:6379
auto_whitelist_redis_prefix awl_

# Configure the Bayes learning system
use_bayes 1
use_bayes_rules 1
use_learner 1
bayes_use_hapaxes 1
bayes_learn_to_journal 0
bayes_token_ttl 30d
bayes_seen_ttl 14d

# Configure Redis for Bayes token storage
bayes_store_module Mail::SpamAssassin::BayesStore::Redis
bayes_sql_dsn server=127.0.0.1:6379;database=0

# Configure Bayes auto-learning
bayes_auto_learn 1
bayes_auto_expire 1
bayes_auto_learn_on_error 1
bayes_auto_learn_threshold_spam 12.00
bayes_auto_learn_threshold_nonspam -1.00
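
For anyone who wants to reproduce the comparison, here is a rough sketch of how the two runs could be timed. It assumes that sa-learn's --cf option (which passes one extra configuration line for a single run) is enough to switch TxRep off and on, and that the TxRep plugin is already loaded; I have not verified either on this setup. The helper name elapsed_seconds is made up for this example, and the mailbox path is the placeholder from the command above.

```shell
#!/bin/sh
# elapsed_seconds CMD [ARGS...]
# Hypothetical helper: runs the command with its output discarded and
# prints the wall-clock time it took, in whole seconds.
elapsed_seconds() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1 || echo "warning: command exited non-zero" >&2
    end=$(date +%s)
    echo $((end - start))
}

# Mailbox to learn from; override with MBOX=/path/to/file if needed.
MBOX=${MBOX:-/my/mail.mbox}

# Only attempt the comparison when sa-learn and the mailbox are present.
if command -v sa-learn >/dev/null 2>&1 && [ -f "$MBOX" ]; then
    for txrep in 0 1; do
        # Assumption: overriding use_txrep via --cf affects this run only.
        secs=$(elapsed_seconds sa-learn --cf="use_txrep $txrep" --spam --mbox "$MBOX")
        echo "use_txrep $txrep: ${secs}s"
    done
fi
```

Note that Bayes skips messages it has already learned, so the second pass may do less work than the first. Running 'sa-learn --forget --mbox /my/mail.mbox' between the passes, or using a different mailbox of similar size for each, should give a fairer comparison. Either way this touches the live Bayes data, so a test instance is safer.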
Re: TxRep increases sa-learn processing time exponentially
I would open a bug in Bugzilla: this could be a defect, and at the very
least it points to a need for optimization.

On 2/11/2019 8:08 AM, Palvelin Postmaster wrote:
> Hello all,
>
> As you can see from the attached conf, I use Redis to store Bayes and auto-whitelist data, primarily because I run macOS and the disk-based DB storage doesn't work on APFS.
>
> I recently enabled TxRep. However, enabling it dramatically increases the time and resources (memory and CPU load) that sa-learn needs. For example, when I run 'sa-learn --spam --mbox /my/mail.mbox' on a mailbox with 1000-2000 messages, it typically takes 1-2 minutes to finish. Setting 'use_txrep 1' bumps that up to 20-40 minutes (roughly 10 to 40 times longer) while also consuming significantly more memory and CPU.
>
> Is this normal or am I doing something wrong?
>
>
> # Configure Mail::SpamAssassin::Plugin::TxRep
> # http://truxoft.com/resources/txrep.htm
> #
> use_txrep 1
> txrep_factory Mail::SpamAssassin::RedisAddrList
>
> # Configure Mail::SpamAssassin::Plugin::RedisAWL
> # https://metacpan.org/pod/Mail::SpamAssassin::Plugin::RedisAWL
> auto_whitelist_redis_server 127.0.0.1:6379
> auto_whitelist_redis_prefix awl_
>
> # Configure the Bayes learning system
> use_bayes 1
> use_bayes_rules 1
> use_learner 1
> bayes_use_hapaxes 1
> bayes_learn_to_journal 0
> bayes_token_ttl 30d
> bayes_seen_ttl 14d
>
> # Configure Redis for Bayes token storage
> bayes_store_module Mail::SpamAssassin::BayesStore::Redis
> bayes_sql_dsn server=127.0.0.1:6379;database=0
>
> # Configure Bayes auto-learning
> bayes_auto_learn 1
> bayes_auto_expire 1
> bayes_auto_learn_on_error 1
> bayes_auto_learn_threshold_spam 12.00
> bayes_auto_learn_threshold_nonspam -1.00


--
Kevin A. McGrail
Member, Apache Software Foundation
Chair Emeritus Apache SpamAssassin Project
https://www.linkedin.com/in/kmcgrail - 703.798.0171