Mailing List Archive

sa-update not properly parsing urls in MIRRORED.BY files?
hi-

the subject expresses my uneducated hypothesis as to what might be causing a problem i seem to have encountered after upgrading to 3.4.2.

i have an additional channel defined [sought.rules.yerp.org], and updates of this channel seem to have broken upon updating to 3.4.2:

>sa-update -vvv --allowplugins --channelfile /etc/spamassassin/sa-update-conf.d/channels.txt --gpgkeyfile /etc/spamassassin/sa-update-conf.d/sa-update-keys.txt --gpghomedir /var/lib/spamassassin/sa-update-keys
DNS TXT query: 2.4.3.sought.rules.yerp.org -> 3402014020421
Update available for channel sought.rules.yerp.org: -1 -> 3402014020421
DNS A query rules.yerp.org.s3.amazonaws.com/rules/stage failed: NXDOMAIN
DNS AAAA query rules.yerp.org.s3.amazonaws.com/rules/stage failed: NXDOMAIN
channel: could not find working mirror, channel failed
Update failed, exiting with code 4
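
a side note on the lookup names in that output: judging from the log, the update TXT query name is the spamassassin version with its dotted parts reversed, prepended to the channel name (3.4.2 gives 2.4.3.sought.rules.yerp.org), and the mirror list is advertised in a second TXT record under mirrors.<channel>. a minimal python sketch of that naming, where both function names are hypothetical and not taken from sa-update itself:

```python
def update_query_name(sa_version: str, channel: str) -> str:
    """Build the TXT lookup name for a channel's current update version.

    Assumption: the name is the dotted version reversed, then the channel,
    as seen in the log line "2.4.3.sought.rules.yerp.org".
    """
    reversed_version = ".".join(reversed(sa_version.split(".")))
    return f"{reversed_version}.{channel}"


def mirrors_query_name(channel: str) -> str:
    """Build the TXT lookup name that points at the MIRRORED.BY file."""
    return f"mirrors.{channel}"


if __name__ == "__main__":
    print(update_query_name("3.4.2", "sought.rules.yerp.org"))
    print(mirrors_query_name("sought.rules.yerp.org"))
```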

we can see it finds the txt record for mirrors:

>dig mirrors.sought.rules.yerp.org txt +short
"http://yerp.org/rules/MIRRORED.BY"

and successfully retrieves and reads the MIRRORED.BY file, which contains:

>curl 'http://yerp.org/rules/MIRRORED.BY'
http://rules.yerp.org.s3.amazonaws.com/rules/stage/

but then it seems to behave unexpectedly: it appears not to parse the hostname out of the url, and instead attempts to look up everything after the scheme as though it were a hostname ["rules.yerp.org.s3.amazonaws.com/rules/stage"], which of course is invalid and doesn't exist.
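
for comparison, here is a minimal python sketch of what i would expect the parsing step to do: take the url from a MIRRORED.BY line and pass only its host part to the resolver. this is not sa-update's code (sa-update is perl); the function name is hypothetical, and i'm assuming the url is the first whitespace-separated token on the line.

```python
from urllib.parse import urlsplit


def mirror_host(line: str) -> str:
    """Return the hostname from one MIRRORED.BY entry.

    Assumption: the URL is the first whitespace-separated token on the line.
    """
    tokens = line.split()
    if not tokens:
        raise ValueError("empty MIRRORED.BY line")
    host = urlsplit(tokens[0]).hostname
    if host is None:
        raise ValueError(f"no hostname found in {tokens[0]!r}")
    return host


if __name__ == "__main__":
    # Only this name should be sent to DNS, never the host plus path.
    print(mirror_host("http://rules.yerp.org.s3.amazonaws.com/rules/stage/"))
```

with the url above this yields rules.yerp.org.s3.amazonaws.com, which is the name the A and AAAA queries should have carried.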

query logs from the recursive nameserver confirm this:

09-Jan-2019 23:49:04.421 queries: info: client 198.19.20.50#57187 (rules.yerp.org.s3.amazonaws.com/rules/stage): view internal: query: rules.yerp.org.s3.amazonaws.com/rules/stage IN A + (198.19.20.50)
09-Jan-2019 23:49:04.422 queries: info: client 198.19.20.50#39320 (rules.yerp.org.s3.amazonaws.com/rules/stage): view internal: query: rules.yerp.org.s3.amazonaws.com/rules/stage IN AAAA + (198.19.20.50)

if we follow the url correctly, we can see there is a functional mirror:

>curl -LO 'http://rules.yerp.org.s3.amazonaws.com/rules/stage/3402014020421.tar.gz'
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 10462  100 10462    0     0  1712k      0 --:--:-- --:--:-- --:--:-- 2043k

>l
total 12K
-rw-r--r-- 1 root root 11K Jan 9 23:51 3402014020421.tar.gz

so this channel would be working, were the url parsed properly.

is my hypothesis wrong? is this to be expected? if not, how can i figure out why this is happening?

thanks!
Re: sa-update not properly parsing urls in MIRRORED.BY files?
I believe this is a known issue fixed in svn. We need to get 3.4.3 out the
door for this. Are you able to test with the 3.4 branch from svn?

Re: sa-update not properly parsing urls in MIRRORED.BY files?
On 2019-01-10 at 12:05, Kevin A. McGrail wrote:
> I believe this is a known issue fixed in svn. We need to get 3.4.3 out
> the door for this. Are you able to test with the 3.4 branch from svn?

https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7623
Re: sa-update not properly parsing urls in MIRRORED.BY files?
On Jan 10, 2019, at 06.05, Kevin A. McGrail <kmcgrail@apache.org> wrote:
>
> I believe this is a known issue fixed in svn. We need to get 3.4.3 out the door for this. Are you able to test with the 3.4 branch from svn?

thanks. i've done a crude test just grabbing sa-update from svn, with some progress:

>sa-update -v --allowplugins --channelfile /etc/spamassassin/sa-update-conf.d/channels.txt --gpgkeyfile /etc/spamassassin/sa-update-conf.d/sa-update-keys.txt --gpghomedir /var/lib/spamassassin/sa-update-keys
Update available for channel sought.rules.yerp.org: -1 -> 3402014020421
http: (curl) GET http://yerp.org/rules/MIRRORED.BY, success
http: (curl) GET http://rules.yerp.org.s3.amazonaws.com/rules/stage/3402014020421.tar.gz, success
http: (curl) GET http://rules.yerp.org.s3.amazonaws.com/rules/stage/3402014020421.tar.gz.sha512, FAILED, status: exit 22
http: (curl) GET http://rules.yerp.org.s3.amazonaws.com/rules/stage/3402014020421.tar.gz.sha256, FAILED, status: exit 22
http: (curl) GET http://rules.yerp.org.s3.amazonaws.com/rules/stage/3402014020421.tar.gz.asc, success
channel 'sought.rules.yerp.org': could not find working mirror, channel failed
Update failed, exiting with code 4

it parses the url properly now, but still fails. i guess it doesn't like only having the asc file? is my test too crude to be viable?
Re: sa-update not properly parsing urls in MIRRORED.BY files?
listsb wrote on 2019-01-11 05:15:

>> sa-update -v --allowplugins --channelfile
>> /etc/spamassassin/sa-update-conf.d/channels.txt --gpgkeyfile
>> /etc/spamassassin/sa-update-conf.d/sa-update-keys.txt --gpghomedir
>> /var/lib/spamassassin/sa-update-keys
> Update available for channel sought.rules.yerp.org: -1 -> 3402014020421

has this very old channel been brought back to life now? :=)

imho it has not been maintained in many years
Re: sa-update not properly parsing urls in MIRRORED.BY files?
On 10 Jan 2019, at 23:15, listsb wrote:

> On Jan 10, 2019, at 06.05, Kevin A. McGrail <kmcgrail@apache.org>
> wrote:
>>
>> I believe this is a known issue fixed in svn. We need to get 3.4.3
>> out the door for this. Are you able to test with the 3.4 branch from
>> svn?
>
> thanks. i've done a crude test just grabbing sa-update from svn, with
> some progress:
>
>> sa-update -v --allowplugins --channelfile
>> /etc/spamassassin/sa-update-conf.d/channels.txt --gpgkeyfile
>> /etc/spamassassin/sa-update-conf.d/sa-update-keys.txt --gpghomedir
>> /var/lib/spamassassin/sa-update-keys
> Update available for channel sought.rules.yerp.org: -1 ->
> 3402014020421
> http: (curl) GET http://yerp.org/rules/MIRRORED.BY, success
> http: (curl) GET
> http://rules.yerp.org.s3.amazonaws.com/rules/stage/3402014020421.tar.gz,
> success
> http: (curl) GET
> http://rules.yerp.org.s3.amazonaws.com/rules/stage/3402014020421.tar.gz.sha512,
> FAILED, status: exit 22
> http: (curl) GET
> http://rules.yerp.org.s3.amazonaws.com/rules/stage/3402014020421.tar.gz.sha256,
> FAILED, status: exit 22
> http: (curl) GET
> http://rules.yerp.org.s3.amazonaws.com/rules/stage/3402014020421.tar.gz.asc,
> success
> channel 'sought.rules.yerp.org': could not find working mirror,
> channel failed
> Update failed, exiting with code 4
>
> it parses the url properly now, but still fails.

This breakage is a FEATURE, not a bug.

> i guess it doesn't like only having the asc file?

Correct. That channel provides no usable hash file and so cannot work
with sa-update. If you would like a version of sa-update that does not
require hash files, hack it up at will: that's what open source is for.
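
To make the requirement concrete, here is a minimal Python sketch of the kind of check a hash file enables: compare the digest of the downloaded tarball against the published value. It is not sa-update's code. The function name is hypothetical, and it assumes the hash file starts with the hex digest, as `sha256sum` or `sha512sum` output would.

```python
import hashlib
from pathlib import Path


def digest_matches(archive: Path, hash_file: Path, algorithm: str = "sha512") -> bool:
    """Check an archive against a published hash file.

    Assumption: the hash file's first whitespace-separated token is the
    hex digest (the layout produced by sha256sum / sha512sum).
    """
    tokens = hash_file.read_text().split()
    if not tokens:
        return False  # an empty hash file cannot validate anything
    expected = tokens[0].lower()

    digest = hashlib.new(algorithm)
    with archive.open("rb") as fh:
        # Read in chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected
```

For this channel the check cannot even start, because neither 3402014020421.tar.gz.sha512 nor 3402014020421.tar.gz.sha256 exists on the mirror.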

Also, the signature is bad:

$ gpg --verify -v 3402014020421.tar.gz.asc
gpg: armor header: Version: GnuPG v1.4.10 (GNU/Linux)
gpg: assuming signed data in '3402014020421.tar.gz'
gpg: Signature made Tue Feb 4 16:48:02 2014 EST
gpg: using DSA key DC85341F6C6191E3
gpg: Note: signature key DC85341F6C6191E3 expired Wed Aug 9 19:29:42 2017 EDT
gpg: Note: signature key DC85341F6C6191E3 expired Wed Aug 9 19:29:42 2017 EDT
gpg: Note: signature key DC85341F6C6191E3 expired Wed Aug 9 19:29:42 2017 EDT
gpg: using pgp trust model
gpg: BAD signature from "Justin Mason Signing Key (Code Signing Only) <signingkey@jmason.org>" [expired]
gpg: binary signature, digest algorithm SHA1, key algorithm dsa1024


And finally: that rule channel has not been updated in almost 4 years
and almost surely will never be updated again. Trying to use sa-update
with it is pointless and dangerous and so it SHOULD break. Even if the
theory and praxis behind the final round of generation and scoring of
the SOUGHT rules were valid in 2014, the rules would be essentially worthless
against the mythical average mailstream of 2019. They may or may not be
useful for any particular mailstream today but in any case they are
unmaintained and unsupported. No one should use them without local
testing and ongoing local oversight of their performance against one's
local mailstream.
Re: sa-update not properly parsing urls in MIRRORED.BY files?
Bill Cole wrote:
> On 10 Jan 2019, at 23:15, listsb wrote:
>> Update available for channel sought.rules.yerp.org: -1 -> 3402014020421

> And finally: that rule channel has not been updated in almost 4 years
> and almost surely will never be updated again.

I'm pretty sure it's been longer than that even. Last time I checked
closely it was empty; absolutely no __ rules and the scored metas were
"meta SOUGHT_1 (0)".

Even if it downloads and validates, it's not actually doing anything,
and hasn't been for years.

-kgd
Re: sa-update not properly parsing urls in MIRRORED.BY files?
On Wed, Jan 09, 2019 at 11:59:36PM -0500, listsb wrote:
>
> >sa-update -vvv --allowplugins ...

Just a general note, I would never ever use --allowplugins unless it's your
personal channel. There is no reason why official channels should ever
distribute plugins as it would be basically remote code run as root.
Re: sa-update not properly parsing urls in MIRRORED.BY files?
On Fri, 11 Jan 2019 10:22:13 -0500
Kris Deugau wrote:

> Bill Cole wrote:
> > On 10 Jan 2019, at 23:15, listsb wrote:
> >> Update available for channel sought.rules.yerp.org: -1 ->
> >> 3402014020421
>
> > And finally: that rule channel has not been updated in almost 4
> > years and almost surely will never be updated again.
>
> I'm pretty sure it's been longer than that even.

I downloaded it yesterday, and it will be 5 years old in a few weeks.


> Last time I checked
> closely it was empty; absolutely no __ rules and the scored metas
> were "meta SOUGHT_1 (0)".

There's nothing left in 20_sought.cf, but two of the three SOUGHT_FRAUD
meta rules are still in 20_sought_fraud.cf.

Sought rules were never intended to have any long-term value. They
aren't general spam signs; they were autogenerated rules based on
fairly long phrases found in recent spam.
Re: sa-update not properly parsing urls in MIRRORED.BY files?
On 11 Jan 2019, at 10:22, Kris Deugau wrote:

> Bill Cole wrote:
>> On 10 Jan 2019, at 23:15, listsb wrote:
>>> Update available for channel sought.rules.yerp.org: -1 ->
>>> 3402014020421
>
>> And finally: that rule channel has not been updated in almost 4 years
>> and almost surely will never be updated again.
>
> I'm pretty sure it's been longer than that even.

Correct. Almost 5, according to the internal & signature timestamps. My
mistake was a symptom of it being early January...

> Last time I checked closely it was empty; absolutely no __ rules and
> the scored metas were "meta SOUGHT_1 (0)".

$ grep score 20*
20_sought.cf:score JM_SOUGHT_1 0
20_sought_fraud.cf:score JM_SOUGHT_FRAUD_1 0
20_sought_fraud.cf:score JM_SOUGHT_FRAUD_2 3.0
20_sought_fraud.cf:score JM_SOUGHT_FRAUD_3 3.0

> Even if it downloads and validates, it's not actually doing anything,
> and hasn't been for years.

Testing 282 simple but long body rules against every message is not
free.

The danger in the SOUGHT rules still being a part of SA 'lore' is that
they are a bit of abandoned attack surface. It's still possible to
download the tarball and forcibly install it or to use an obsolete or
modified sa-update to do so. If Justin lost control of the channel or
(less likely) turned malicious, the channel could be revived and turned
against a relatively inattentive subset of people using SA.

Breaking an unmaintained zombie rules channel was a fortuitous side effect
of sa-update switching from SHA1 to SHA256 and SHA512.
Re: sa-update not properly parsing urls in MIRRORED.BY files?
On Jan 11, 2019, at 00.24, Bill Cole <sausers-20150205@billmail.scconsult.com> wrote:
>
> On 10 Jan 2019, at 23:15, listsb wrote:
>
>> On Jan 10, 2019, at 06.05, Kevin A. McGrail <kmcgrail@apache.org> wrote:
>>>
>>> I believe this is a known issue fixed in svn. We need to get 3.4.3 out the door for this. Are you able to test with the 3.4 branch from svn?
>>
>> thanks. i've done a crude test just grabbing sa-update from svn, with some progress:
>>
>>> sa-update -v --allowplugins --channelfile /etc/spamassassin/sa-update-conf.d/channels.txt --gpgkeyfile /etc/spamassassin/sa-update-conf.d/sa-update-keys.txt --gpghomedir /var/lib/spamassassin/sa-update-keys
>> Update available for channel sought.rules.yerp.org: -1 -> 3402014020421
>> http: (curl) GET http://yerp.org/rules/MIRRORED.BY, success
>> http: (curl) GET http://rules.yerp.org.s3.amazonaws.com/rules/stage/3402014020421.tar.gz, success
>> http: (curl) GET http://rules.yerp.org.s3.amazonaws.com/rules/stage/3402014020421.tar.gz.sha512, FAILED, status: exit 22
>> http: (curl) GET http://rules.yerp.org.s3.amazonaws.com/rules/stage/3402014020421.tar.gz.sha256, FAILED, status: exit 22
>> http: (curl) GET http://rules.yerp.org.s3.amazonaws.com/rules/stage/3402014020421.tar.gz.asc, success
>> channel 'sought.rules.yerp.org': could not find working mirror, channel failed
>> Update failed, exiting with code 4
>>
>> it parses the url properly now, but still fails.
>
> This breakage is a FEATURE, not a bug.
>
>> i guess it doesn't like only having the asc file?
>
> Correct. That channel provides no usable hash file and so cannot work with sa-update. If you would like a version of sa-update that does not require hash files, hack it up at will: that's what open source is for.

thanks, it was not knowing about the change from sha1 to sha2 that was the red herring for me. since an sha1 hash is still published, that wasn't failing prior to upgrading. on a related but different note, it's interesting that with an expired gpg key, it wasn't failing before upgrading. i don't run sa-update with --nogpg.

in any case, at least the upgrade process exposed a channel in the config that had long since been forgotten about.