Mailing List Archive

Machine learning with or vs. Bayes?
Hi all,

I don't suppose anyone has a neural-net-based SA Machine Learning plugin or external program, to complement or replace Bayes? There are a number of fairly compact Python ML packages that would greatly ease this task nowadays, like TensorFlow. It looks like rspamd has a neural net module... I wonder if it would be relatively portable.

I guess there's a bunch of ML in use for QA/masscheck and auto-scoring... but is there anything for actual rule generation, not just scoring? Or, like Bayes, where the "rule generation" is embedded in the neural net, and it just kicks out a spamminess indicator/probability?

Of course, Gmail and the other big providers have their own ML solutions that seem to be pretty good, though they have an enormous user base and near-infinite resources...

Granted, reliance on Python means it's not embedded in SA, but SA already calls other external programs like pyzor/razor/DCC, so that wouldn't seem to necessarily be a big knock against it.
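
To make the idea concrete, here is a minimal sketch of a standalone scorer that "just kicks out a spamminess probability". It is an illustration only, under several assumptions: it uses plain Python with a single-layer network (i.e. logistic regression over token presence) rather than TensorFlow so that it stays self-contained, the class and function names (TinySpamNet, tokenize) are hypothetical, the training messages are made up, and it does not use any real SA plugin API. Wiring it into SA would still need a wrapper or plugin.

```python
import math
import re
from collections import defaultdict

# Hypothetical tokenizer: lowercase words, digits and a few symbols.
TOKEN_RE = re.compile(r"[a-z0-9$!']+")


def tokenize(text):
    """Lowercase the text and return its set of distinct tokens."""
    return set(TOKEN_RE.findall(text.lower()))


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


class TinySpamNet:
    """Single-layer network (logistic regression) over token presence."""

    def __init__(self, learning_rate=0.5):
        self.weights = defaultdict(float)  # one weight per token seen in training
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict(self, text):
        """Return an estimated spam probability strictly between 0 and 1."""
        z = self.bias + sum(self.weights.get(t, 0.0) for t in tokenize(text))
        return sigmoid(z)

    def train(self, examples, epochs=50):
        """Plain stochastic gradient descent on the log loss."""
        for _ in range(epochs):
            for text, is_spam in examples:
                error = self.predict(text) - (1.0 if is_spam else 0.0)
                step = self.learning_rate * error
                self.bias -= step
                for t in tokenize(text):
                    self.weights[t] -= step


# Made-up toy corpus, for illustration only.
TRAINING = [
    ("cheap pills buy now", True),
    ("win money now click here", True),
    ("meeting agenda for monday", False),
    ("please review the attached patch", False),
]

model = TinySpamNet()
model.train(TRAINING)

if __name__ == "__main__":
    for msg in ("buy cheap pills now", "patch review for monday meeting"):
        print(f"{model.predict(msg):.3f}  {msg}")
```

As with Bayes, there are no explicit rules here: whatever the model has learned lives in the per-token weights, and the only output is a probability that a wrapper could map onto a score.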

Cheers.

--- Amir
Re: Machine learning with or vs. Bayes?
> Of course, Gmail and the other big providers have their own ML solutions that seem to be pretty good, though they have an enormous user base and near-infinite resources...

I would argue, on the contrary, that Gmail performs rather poorly: I have
at least one FP a day, and that is a big no-no. A couple of FNs are not a
problem, but if I miss an important message because it was classified as
spam, I would be really unhappy. So as a result, I have to check the
spam folder manually. It is not efficient!

Olivier
Re: Machine learning with or vs. Bayes?
On Fri, 28 Jun 2019, 07:42 Amir Caspi, <cepheid@3phase.com> wrote:

> Hi all,
>
> I don't suppose anyone has a neural-net-based SA Machine Learning plugin
> or external program, to complement or replace Bayes? There are a number of
> fairly compact Python ML packages that would greatly ease this task
> nowadays, like TensorFlow. It looks like rspamd has a neural net module...
> I wonder if it would be relatively portable.
>
Hi Amir, I am working on developing a plugin with two or three statistical
classifiers (including SVM and neural nets) under the Google Summer of Code
programme, with Kevin McGrail as my mentor.

> I guess there's a bunch of ML in use for QA/masscheck and auto-scoring...
> but is there anything for actual rule generation, not just scoring? Or,
> like Bayes, where the "rule generation" is embedded in the neural net, and
> it just kicks out a spamminess indicator/probability?
>
> Of course, Gmail and the other big providers have their own ML solutions
> that seem to be pretty good, though they have an enormous user base and
> near-infinite resources...
>
> Granted, reliance on Python means it's not embedded in SA, but SA already
> calls other external programs like pyzor/razor/DCC, so that wouldn't seem
> to necessarily be a big knock against it.
>
With Python, as you said, it won't be embedded into SA, and hence I'm worried
about plugin integration. We (my mentors and I) have come up with a couple of
feasible solutions. I will post any updates on the list soon.

Note: if you have any information that you think might help with this issue,
please let me know.


> Cheers.
>
> --- Amir
>

Regards,
Shreyansh Shrivastava
