These devs have a history of incompetence in handling bugs. They also have an incompetently administered bug tracker, which now blocks unregistered visitors from simply viewing bug reports.
The linked bug:
Testing was done using this workflow as an offgrid user:
- Carry a laptop into an Internet cafe to fetch email and run sa-update
- From home without Internet: process the mail (which would take too long in a cafe)
- Also from offgrid home: periodically run sa-learn on false negatives and reprocess
/transparency issue/
Someone approaching SA for the first time would naturally expect sa-update to need Internet, but not SA’s scoring. The fact that SA needs an Internet uplink to score content defies reasonable expectations and fails the “principle of least astonishment”. The man page and docs in /usr/share/doc/spamassassin give no clues that Internet is surprisingly required for scoring.
When I first discovered that SA’s scoring tool was accessing the net, I wrapped it with torsocks to limit the leaking of personal info. That worked at a time when Internet was always available to me. Note that torsocks is a hack; it’s not as proper as an app with native proxy support.
Torsocks is also a somewhat futile hack because DNS lookups are often done using UDP. In an attempt to mitigate that risk, I tried a Tor middlebox:
firejail --net=vnet0 --dns=\"$(ip address show dev vnet0 | awk '/inet\>/{gsub(/[/].*/,""); print $2 }')\" /usr/bin/spamassassin
But that also failed, even when Internet was available, and I did not keep notes on how or why.
When “torsocks spamassassin” is executed without a WAN, it behaves poorly. When a msg is scored, the output is unfriendly nonsense from the standpoint of an end user who does not know Perl:
===✂----------------------------------------
1767261476 PERROR torsocks[17588]: socks5 libc connect: Connection refused (in socks5_connect() at socks5.c:202)
1767261476 PERROR torsocks[17588]: socks5 libc connect: Connection refused (in socks5_connect() at socks5.c:202)
Use of uninitialized value in subroutine entry at /usr/lib/x86_64-linux-gnu/perl5/5.36/NetAddr/IP/Lite.pm line 647.
Bad arg length for NetAddr::IP::Util::mask4to6, length is 0, should be 32 at /usr/lib/x86_64-linux-gnu/perl5/5.36/NetAddr/IP/Lite.pm line 647.
Compilation failed in require at /usr/lib/x86_64-linux-gnu/perl5/5.36/NetAddr/IP.pm line 8.
BEGIN failed--compilation aborted at /usr/lib/x86_64-linux-gnu/perl5/5.36/NetAddr/IP.pm line 8.
Compilation failed in require at /usr/share/perl5/Mail/SpamAssassin/NetSet.pm line 26.
BEGIN failed--compilation aborted at /usr/share/perl5/Mail/SpamAssassin/NetSet.pm line 26.
Compilation failed in require at /usr/share/perl5/Mail/SpamAssassin/Conf.pm line 88.
BEGIN failed--compilation aborted at /usr/share/perl5/Mail/SpamAssassin/Conf.pm line 88.
Compilation failed in require at /usr/share/perl5/Mail/SpamAssassin.pm line 71.
BEGIN failed--compilation aborted at /usr/share/perl5/Mail/SpamAssassin.pm line 71.
Compilation failed in require at /usr/bin/spamassassin line 78.
BEGIN failed--compilation aborted at /usr/bin/spamassassin line 78.
procmail: [17579] Thu Jan 1 10:57:56 2026
procmail: Program failure (111) of "torsocks"
procmail: Rescue of unfiltered data succeeded
procmail: [17579] Thu Jan 1 10:57:56 2026
procmail: No match on "^X-Spam-Status: Yes"
===✂----------------------------------------
It’s bizarre that compilation would be in play at this stage. The above also took painfully long, and no score was generated.
Testing in a less messy environment yielded better results:
===✂----------------------------------------
$ firejail --net=none /usr/bin/spamassassin -t < /usr/share/doc/spamassassin/examples/sample-spam.txt
…
X-Spam-Checker-Version: SpamAssassin 4.0.1 (2024-03-25) on localhost
X-Spam-Flag: YES
X-Spam-Level: **************************************************
X-Spam-Status: Yes, score=1000.0 required=5.0 tests=BAYES_40,GTUBE,NO_RECEIVED,
NO_RELAYS autolearn=no autolearn_force=no version=4.0.1
…
===✂----------------------------------------
The workaround for me will be to prefix with “firejail --net=none”. But that’s not ideal because it means the procmail scripts must be altered between torsocks and firejail every time WAN availability changes.
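A minimal sketch of how that switching could be automated, so that procmail calls one wrapper instead of being edited by hand. Everything here is an assumption rather than anything SA or torsocks provides: the function names (`socks_port_open`, `choose_prefix`, `run_sa`) are made up for illustration, and the SOCKS listener is assumed to be at Tor's default of 127.0.0.1:9050. The probe only checks whether something accepts connections on the SOCKS port (the torsocks errors above were "Connection refused"). That is not the same as the WAN being reachable, because Tor can be running with no uplink. The `/dev/tcp` probe is a bash feature.

```shell
# Assumed location of the local Tor SOCKS listener (Tor's default port).
TOR_HOST="${TOR_HOST:-127.0.0.1}"
TOR_PORT="${TOR_PORT:-9050}"

# Succeeds if something accepts a TCP connection on host $1, port $2.
# Uses bash's /dev/tcp pseudo-device; the subshell closes the fd on exit.
socks_port_open() {
    (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

# Prints the command prefix to put in front of spamassassin.
choose_prefix() {
    if socks_port_open "$TOR_HOST" "$TOR_PORT"; then
        echo "torsocks"
    else
        echo "firejail --net=none"
    fi
}

# Hypothetical entry point for a procmail recipe.
run_sa() {
    # Word splitting of the prefix is intentional here.
    $(choose_prefix) /usr/bin/spamassassin "$@"
}
```

A wrapper script that sources these functions and ends with `run_sa "$@"` could then replace the literal “torsocks spamassassin” in the procmail recipe. A stricter probe would be needed to catch the case where Tor is up but the uplink is not.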
What does the “-t” flag do? That appears in USAGE.gz but the man page does not disclose any CLI options. What other CLI options are there?
SA can probably be configured to skip tests that need a WAN, but it’s likely insurmountable for a novice user to quickly derive an advanced configuration like that. If possible, the docs should disclose a sample config for offline mode of operation.
Or even better, add a simple CLI flag “--offline”.
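To illustrate what such a sample offline config might look like, here is an untested sketch. It assumes the option names `dns_available`, `skip_rbl_checks`, `use_razor2` and `use_pyzor` behave as I understand them from the Mail::SpamAssassin::Conf and plugin documentation. The helper name `write_offline_cf` is hypothetical, and the list is not claimed to cover every network test SA might run.

```shell
# Hypothetical helper: append an offline-scoring fragment to the
# SpamAssassin config file given as $1.
write_offline_cf() {
    cat >> "$1" <<'EOF'
# offline scoring (assumed settings, verify against the SA docs)
dns_available no
skip_rbl_checks 1
ifplugin Mail::SpamAssassin::Plugin::Razor2
use_razor2 0
endif
ifplugin Mail::SpamAssassin::Plugin::Pyzor
use_pyzor 0
endif
EOF
}
```

After something like `write_offline_cf ~/.spamassassin/user_prefs`, rerunning the sample-spam test under “firejail --net=none” would be one way to check whether scoring still completes without lookups.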
/several bugs and enhancement requests enumerated/
1. The man page and docs should state that Internet is used for scoring, and the comprehensive docs in /usr/share/doc should state the rationale.
2. SpamAssassin should be configurable to support a proxy. Ideally, users should have a choice of SOCKS and HTTP proxies. Note that DNS lookups often use UDP, which the Tor network will not carry. So if a proxy is supplied, SA should also take care to use TCP.
3. When a task requires a WAN and the WAN is unreachable, SA should at the very least give useful information and terminate more gracefully when torsocks is used.
4. Or better than item 3 above, SA should continue scoring in a degraded state. It should be able to give a score without a WAN and perhaps add a warning header stating that the scoring was degraded by the lack of connectivity.
5. Document all CLI options (e.g. -t) in the man page.
6. Document a sample offline configuration.
7. Add an offline mode of operation that can be switched on the CLI.
Perhaps torsocks is somehow deceiving SA about WAN availability. If SA is not going to be smart about that scenario, then the proxy option is needed. A proxy option is needed anyway, in fact. And the proxy option should be prominently documented and encouraged because there are security compromises when running with default configs over the clearnet.
Comment from the idiot (Bill Cole) who closed the bug:
No SpamAssassin Bug is described here.
You are free to take this structureless conversational topic to the SpamAssassin Users mailing list.
#rspamd is probably a better tool. SA devs have been incompetent as long as I remember.
