MySQL
Is

bayes_store_module Mail::SpamAssassin::BayesStore::MySQL

still the current best configuration option for implementing Bayes in SQL? I ask because the documentation I found mentions things like MySQL 4.1, so it’s a rather old document. In fact, it dates from the SA 3.1 days.

I tried searching apache.org for bayes sql and found only a link to minutes from a board meeting. There isn’t a search on https://spamassassin.apache.org/doc.html, and Google takes me to the aforementioned 3.1 documentation.

I did find this https://wiki.apache.org/spamassassin/UsingSQL but there is no mention of bayes there.

Searching THERE for bayes sql brings me to the not useful

https://wiki.apache.org/spamassassin/BayesSqlClearUsers

Searching for bayes_store or bayesstore is also less than useful.

Basically, what I would like to do is have spamd create a user record in SQL when mail is accepted, and then use that record to store the Bayes data for that user (where the user is a Dovecot virtual user named in user@domain.example style).


--
"A politician is a man who approaches every problem with an open mouth.”
Re: MySQL [ In reply to ]
Those are still accurate as not much has changed.  The new 3.4.3 rc3 has
some SQL changes for a last updated field which I've added over the
years for a cron job as well to clear out old entries.

On 7/2/2019 6:41 AM, @lbutlr wrote:
> Is
>
> bayes_store_module Mail::SpamAssassin::BayesStore::MySQL
>
> still the current best configuration option for implementing Bayes in SQL? I ask because the documentation I found mentions things like MySQL 4.1, so it’s a rather old document. In fact, it dates from the SA 3.1 days.
>
> I tried searching apache.org for bayes sql and found only a link to minutes from a board meeting. There isn’t a search on https://spamassassin.apache.org/doc.html, and Google takes me to the aforementioned 3.1 documentation.
>
> I did find this https://wiki.apache.org/spamassassin/UsingSQL but there is no mention of bayes there.
>
> Searching THERE for bayes sql brings me to the not useful
>
> https://wiki.apache.org/spamassassin/BayesSqlClearUsers
>
> Searching for bayes_store or bayesstore is also less than useful.
>
> basically what I would like to do is have spamd create a user record in SQL when mail is accepted and then use that record to store the bayes data for that user (where user is a dovecot virtual user@domain.example styled name).
>
>

--
Kevin A. McGrail
Member, Apache Software Foundation
Chair Emeritus Apache SpamAssassin Project
https://www.linkedin.com/in/kmcgrail - 703.798.0171
Re: MySQL [ In reply to ]
On 2 Jul 2019, at 09:12, Kevin A. McGrail <kmcgrail@apache.org> wrote:
> Those are still accurate as not much has changed. The new 3.4.3 rc3 has
> some SQL changes for a last updated field which I've added over the
> years for a cron job as well to clear out old entries.

Thanks, should I wait?

Looking over both sql/README and sql/README.bayes:

How does spamc build the list of users that it stores prefs for? Does it simply accept the username it is passed after Dovecot verifies that the user is a valid local user, and create a record for that username? I hope so, since that seems like it would be best, but it looks like I have to maintain the username field in the SQL database myself.

The database must contain a table, default name "userpref", with at
least three fields:

username varchar(100) # this is the username whose e-mail is being filtered
preference varchar(30) # the preference (whitelist_from, required_score, etc.)
value varchar(100) # the value of the named preference

but then later it says

Once you have created the database and added the table, just add the required
lines to your global configuration file (local.cf).

which says nothing about adding data TO the table.

I guess the tl;dr version of my question (too late!) is how is the username field populated in the database?


--
ALL WORK AND NO PLAY MAKES BART A DULL BOY ALL WORK AND NO PLAY MAKES
BART A DULL BOY ALL WORK AND NO PLAY MAKES BART A DULL BOY Bart
chalkboard Ep. 1F07
Re: MySQL [ In reply to ]
On 7/2/2019 3:45 PM, @lbutlr wrote:
> Thanks, should I wait?
For the 3.4.3 release? You can use rc3 today if you want.  I have it in
production and it works fine. You can find it at
http://talon2.pccc.com/~kmcgrail/devel/
> Looking over both the sql/README and the sql/README.bayes
>
> how does spamc build the user list that it stores prefs for, does it simply accept the username that it is passed after dovecot verifies the user is a valid local user and create a record for that user name? I mean, I hope so, that seems like it would be best, but it looks like I have to maintain the username field in the SQL database myself.
>
> The database must contain a table, default name "userpref", with at
> least three fields:
>
> username varchar(100) # this is the username whose e-mail is being filtered
> preference varchar(30) # the preference (whitelist_from, required_score, etc.)
> value varchar(100) # the value of the named preference
>
> but then later it says
>
> Once you have created the database and added the table, just add the required
> lines to your global configuration file (local.cf).
>
> which says nothing about adding data TO the table.
>
> I guess the tl;dr version of my question (too late!) is how is the username field populated in the database?

I think you are mixing up the user preference table and the naive
bayesian table.  Apologies if the docs aren't clear.  Please consider
cleaning them up!

Once you create the bayes table, you can put data into it with sa-learn.

You then can configure a cf file to use the bayes table data when new
emails come in.

bayes_store_module              Mail::SpamAssassin::BayesStore::SQL
bayes_sql_dsn                   DBI:mysql:spamassassin:[hostname for the sql server]
bayes_sql_username              [the dbuser]
bayes_sql_password              [the dbpass]
#BAYESIAN Rules
use_bayes                               1
use_bayes_rules                         1
bayes_auto_learn                        0
bayes_expiry_max_db_size                200000
bayes_auto_expire                       1
bayes_learn_to_journal                  1

I would recommend you look into Redis as a backend but for a small
install, Bayes on SQL works fine.

I also recommend you disable auto-expire and set up a cron job for that, so
you don't have delays when new mail arrives.
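
A minimal sketch of that cron approach, assuming bayes_auto_expire is set to 0 in local.cf so that expiry no longer runs during delivery. The script path, schedule and user name are placeholders, and the SA_LEARN override is only a convenience for dry runs, not part of SpamAssassin:

```shell
#!/bin/sh
# Sketch: expire old Bayes tokens from cron instead of during mail delivery.
# Assumes "bayes_auto_expire 0" in local.cf.
# Hypothetical crontab entry (path and schedule are placeholders):
#   15 3 * * * /usr/local/bin/bayes-expire.sh user@domain.example

# Allow the sa-learn binary to be overridden, e.g. SA_LEARN=echo for a dry run.
SA_LEARN="${SA_LEARN:-sa-learn}"

# Force an expiry run on one user's Bayes data.
# $1 = user name as stored by SpamAssassin
bayes_expire() {
    "$SA_LEARN" -u "$1" --force-expire
}

# Expire each user named on the command line (does nothing if none are given).
for user in "$@"; do
    bayes_expire "$user"
done
```

Running it once by hand with SA_LEARN=echo prints the sa-learn arguments without touching the database, which is a cheap way to check the crontab line before enabling it.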

Regards,

KAM

--
Kevin A. McGrail
Member, Apache Software Foundation
Chair Emeritus Apache SpamAssassin Project
https://www.linkedin.com/in/kmcgrail - 703.798.0171
Re: MySQL [ In reply to ]
On 2 Jul 2019, at 14:21, Kevin A. McGrail <kmcgrail@apache.org> wrote:
>> I guess the tl;dr version of my question (too late!) is how is the username field populated in the database?
>
> I think you are mixing up the user preference table and the naive bayesian table. Apologies if the docs aren't clear. Please consider cleaning them up!

I can’t clean them up because I don’t understand them!

The Bayes README says you need to have a user pref database set up first.

<https://svn.apache.org/repos/asf/spamassassin/branches/3.1/sql/README.bayes>
This assumes that you have already created a database for use with spamassassin and setup a username/password that can access that database. (See "Creating A Database", in "sql/README", if you don't have a suitable database ready.)

<https://svn.apache.org/repos/asf/spamassassin/branches/3.1/sql/README>

> Once you create the bayes table, you can put data into it with sa-learn.
>
> You then can configure a cf file to use the bayes table data when new emails come in.

That all seems understandable; it is the issue of populating the initial table with user data, and then having to maintain that user list myself, that I am wondering about.

> I would recommend you look into Redis as a backend but for a small
> install, Bayes on SQL works fine.

Yeah, SQL is going to more than meet my needs once I figure it out. I’m only processing a few thousand emails a day.

> Also recommend you disable auto-expire and setup a cron job for that so you don't have delays when getting new mail.

You mean a cron job to do the expirations I assume?


--
And crawling on the planet's face
Some insects called the human race
Lost in time, lost in space, and meaning
Re: MySQL [ In reply to ]
@lbutlr wrote:
> On 2 Jul 2019, at 14:21, Kevin A. McGrail <kmcgrail@apache.org> wrote:
>>> I guess the tl;dr version of my question (too late!) is how is the username field populated in the database?
>>
>> I think you are mixing up the user preference table and the naive bayesian table. Apologies if the docs aren't clear. Please consider cleaning them up!
>
> I can’t clean them up because I don’t understand them!
>
> The bayes readme says you need to have a user pref database setup first.
>
> <https://svn.apache.org/repos/asf/spamassassin/branches/3.1/sql/README.bayes>
> This assumes that you have already created a database for use with spamassassin and setup a username/password that can access that database. (See "Creating A Database", in "sql/README", if you don't have a suitable database ready.)

This refers to the SQL database username/password, not SA user data
within the database. You only need one user/password, although I
*think* that if you're paranoid and willing to spend the time on it, it's
possible to have separate SQL logins for different users, possibly even
completely separate databases for each user.

SA will fill in fields in the database as needed when messages are
learned; there's no need to maintain a list of your system/SA users. I
think you mentioned earlier you're using virtual users, not system
users, so you'll have to use the -u argument to sa-learn for manual
learning.
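
As a rough sketch of that manual learning step for a virtual user: the mailbox paths below are placeholders, and the learn_for_user helper and SA_LEARN override are hypothetical conveniences rather than anything SpamAssassin ships.

```shell
#!/bin/sh
# Allow the sa-learn binary to be overridden, e.g. SA_LEARN=echo for a dry run.
SA_LEARN="${SA_LEARN:-sa-learn}"

# Train one virtual user's Bayes data from a file or folder of messages.
# $1 = virtual user, $2 = "spam" or "ham", $3 = message file or directory
learn_for_user() {
    "$SA_LEARN" -u "$1" "--$2" "$3"
}

# Example calls (hypothetical paths):
#   learn_for_user user@domain.example spam /var/vmail/domain.example/user/.Junk/cur
#   learn_for_user user@domain.example ham  /var/vmail/domain.example/user/cur
```

The name given to -u should be the same one SA sees when it scans that user's mail, so that learning and scanning use the same Bayes data.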

I'm not sure there's a better way to word that section.

-kgd