WEBlog -- Wouter's Eclectic Blog

Fri, 08 Apr 2011

Blocking newsletter spam

It's incredible how many people are of the misguided belief that just because I happen to run a company, I am automatically interested in their newsletter about whatever it is that they are doing, no matter how far it is removed from the kinds of things my company actually does.

Are these people spammers? Yes, definitely, and I don't want to do business with them. But there's a major difference between this kind of mail and your common Nigerian scammer or counterfeit blue pill "salesperson". Unlike the latter, some newsletter spammers are interested in forming a genuine business relationship with my company. They're going about it the wrong way, but that doesn't necessarily mean they're trying to trick me into doing something that would not be in my best interest; they're not just after my money.

Although their methods are wrong, that does not mean they're entirely clueless. Some of these unwanted newsletters are sent with VERP-style return paths, which suggests that if the mail bounces at SMTP time, I would no longer receive their junk. So bouncing them is what I do. Exim makes this very easy:

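acl_check_mail:
  # Reject at MAIL FROM: time if the envelope sender's domain is listed.
  # A setting that spans several lines needs a trailing backslash on each
  # line that is continued.
  deny
	message = Your domain has been blacklisted
	log_message = domain blacklisted
	condition = ${lookup{$sender_address_domain}\
			wildlsearch{/etc/exim4/blacklist-domains}\
			{true}{false}}
  accept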

What this does is use a wildlsearch lookup to check whether the domain of the envelope sender (i.e., as specified in the MAIL FROM: SMTP command) is listed in the /etc/exim4/blacklist-domains file. Since we use a wildlsearch, we can use a leading * as a wildcard: *grep.be would mean 'grep.be, or any of its subdomains' (strictly speaking, any domain that ends in grep.be), whereas *.grep.be would mean 'any of the grep.be subdomains'. I need the wildcard because at least one of the people I've blacklisted that way sends their newsletter through a distributed service, and the VERP-style return path is based upon the server that actually communicates with my system; others have a subdomain for the newsletter, but don't use it (or use a different one) for regular mail. If I'm not interested in their spam, I'm probably also not interested in their other mail, hence the wildcard. (Is this overkill? Maybe, but I don't care; I don't do business with spammers.)
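To make the difference between the two patterns concrete, here is a minimal Python sketch of the leading-asterisk matching described above. It is not Exim's code: the function names are made up for illustration, and it ignores everything else a wildlsearch file can contain (regular expressions, data after the key).

```python
from typing import Iterable


def matches(pattern: str, domain: str) -> bool:
    """Return True if `domain` matches one blacklist `pattern`.

    Only the leading-asterisk form is modelled: "*suffix" means
    "ends with suffix". Any other pattern must match exactly.
    """
    pattern = pattern.lower()
    domain = domain.lower()
    if pattern.startswith("*"):
        return domain.endswith(pattern[1:])
    return domain == pattern


def is_blacklisted(domain: str, patterns: Iterable[str]) -> bool:
    """Return True if any pattern in the list matches the domain."""
    return any(matches(p, domain) for p in patterns)


# "*grep.be" covers the bare domain and its subdomains ...
print(matches("*grep.be", "grep.be"))        # True
print(matches("*grep.be", "news.grep.be"))   # True
# ... but, being a plain suffix match, also unrelated domains.
print(matches("*grep.be", "notgrep.be"))     # True
# "*.grep.be" covers subdomains only.
print(matches("*.grep.be", "grep.be"))       # False
print(matches("*.grep.be", "news.grep.be"))  # True
```

If the over-matching of the first form matters, listing both grep.be and *.grep.be is the stricter alternative.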

This ACL is then activated for the SMTP MAIL FROM: command by pointing the acl_smtp_mail option at it (search for acl_smtp_mail in the Exim specification). This makes it impossible for the spammer to reach postmaster@ from the same domain, too, but that doesn't matter; they can always use a different address.

One might wonder why I'm using this kind of domain-based blacklisting rather than a regular Bayesian spam filter, or anything of the sort. The reason is fairly simple: the general format of these newsletters is distinctly different from regular spam. For instance, some of these newsletter spammers are in fact competitors who didn't bother to check who they're sending mail to. As a result, their newsletters contain key words that also appear in mails which I send to my regular customers; if I were to classify them as spam in my Bayesian classifier, that would increase the chance of the classifier misclassifying a mail from an actual customer as spam. Most of them are also very similar in format to newsletters that I did consciously subscribe to, and which are therefore not spam.

Finally, bouncing mail rather than blackholing it or filing it in a separate folder (as I have SpamAssassin do) has the added advantage of making it clear to a newsletter spammer that their junk is not wanted. Most (though certainly not all) will then remove me from their newsletter, saving me bandwidth and processing power. And since we do this at MAIL FROM: time, rather than upon completion of the RCPT TO: or DATA commands, I'm not actually giving away any information that they don't have, either.

Sat, 08 Sep 2007

EMIRATES HERITAGE LOTTERY

Some spam was sent to a Debian list with that subject (sorry, no link, that would increase their PageRank). I guess spammers are now trying to combine two strategies in one: I have 1.000.000.000.000.000.000 US$ that I want to get rid of, but I can't possibly do so without getting into trouble. Therefore, I declare that you have won the LOTTERY!!!1! Now please claim your prize, and help me out of my problems. Thank you.

Whether it'll work is another matter, of course.

Mon, 26 Jun 2006

Comment spams

The nice thing about an NIH comment post thing is that you can change it if people start to abuse it.

I was starting to get more and more comment spam. Some spammer obviously must have taken a look at my comment submission form, and written some code to post his junk there. Even though they made a little mistake that would make it very easy to identify their posts (they URL-encoded the data in a field where that was not necessary, which also meant the post would not have been visible had I approved it), I didn't feel like adding much code to special-case one particular spammer.

So instead, I changed my comment form to rename a particular field; the code now returns a 403-style, rather empty page with a remark of Sod off, spammer if the field is filled in under its old name. When I start to receive spam again, changing the code to do the same thing once more is pretty easy.
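The renamed-field trick can be sketched in a few lines of Python. This is an illustration under assumptions, not the blog's actual code: the post does not say which field was renamed or what the form handler looks like, so the field names, the status codes other than 403, and the handle_submission function are all hypothetical.

```python
# Hypothetical field names; the post does not say which field was renamed.
OLD_FIELD = "comment"
NEW_FIELD = "comment_text"


def handle_submission(form):
    """Return (status, body) for a submitted comment form given as a dict."""
    if form.get(OLD_FIELD, "").strip():
        # The current form no longer has this field, so only a script
        # written against the old form would fill it in.
        return 403, "Sod off, spammer"
    text = form.get(NEW_FIELD, "").strip()
    if not text:
        return 400, "Empty comment"
    return 200, "Comment received"


# A script replaying the old form is turned away ...
print(handle_submission({"comment": "buy pills"}))
# ... while a visitor using the current form gets through.
print(handle_submission({"comment_text": "Nice post!"}))
```

When the spam returns, renaming the field again means changing only the two constants (and the form itself).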

It's been a few weeks now. And I didn't have to change anything about my comment submission policy, like some other people had to do.

Isn't that nice? :-)

Fri, 05 May 2006

Bluetooth spam

I have a laptop that supports Bluetooth. By using gnome-bluetooth (which unfortunately isn't packaged yet; and no, I don't have the time to start doing more packages), I can accept data transfers over Bluetooth to my laptop, which sometimes is handy.

Some people seem to think it's funny to start sending me unsolicited files while I'm on the train, hacking away on my laptop. I swear, they're spamming me. Of course, it's not something that can't be solved by running killall gnome-obex-server, but it's annoying.

I'm thinking of writing my own obex server (or modifying g-o-s) so that the box that appears when it asks whether I want to accept some file has the option to "spam back", i.e., send the same file to the sending device over and over again. But then, that wouldn't be ethical, would it?

Mon, 23 Jan 2006

Retaliating to spam.

How do you fight spam?

Not through regulation. Spammers will find loopholes and exploit them. You can plug the loophole, but then they'll find another one. And some spammers will just not care, and simply try to avoid the cops. Sure, eventually they'll get caught, but for every spammer caught there are ten more waiting.

Not through technology; that's an arms race. Improve your filters, and the spammers will just modify their mails to slip through again.

I don't think it is possible to fight spammers. But you can take revenge...

Let's look at a typical Nigerian scam:

From: foo@bar.baz
To: wouter@grep.be
Subject: PLEASE HELP
Date: some_date_in_the_future

Hi,

My name is <the name of a relative of someone rich who died>.

I have <some problem involving an amount of money that I cannot get
at>.

Could you please <do something which will cost you a lot of money>?

Thanks.

PS: please, reply on my private email address <scammer@scam.com>.

The important phrase is the last one: if you could please be so kind as to reply to their "private" email address. Obviously that email address is a throwaway one; but one would expect that they must read it and hope someone replies to it before the amount of spam on that address gets too high to reasonably be able to read it.

Which is the catch. I wonder how those spammers would react if I started adding their "private" email addresses to some database and, rather than throwing away mails which are so obviously spam, started forwarding them to random addresses from that pool...

Thu, 22 Dec 2005

Whoa

I just got spammed by the government.

Well, almost so. I received a mail from the Belgische BrandwondenStichting. That's a Belgian Government-sponsored and -founded organisation created with the intention of providing information and prevention regarding fire and fire-inflicted wounds. Apparently, they discovered the "power" of the Internet, and sent me some email warning me about the danger of fireworks.

I couldn't care less; I do like to watch fireworks, but I don't usually set them off myself. But I do think it's worrisome that my own government starts sending out this kind of thing...

Hah.

Even though my blog doesn't enable comments by default (instead everything is moderated), someone still thought it was a good idea to spam on a (totally unrelated, though as it happens the first at the time) post on my blog for their blog aggregator that is totally content-less at this point in time (and no, I will not post a link).

That's the first time this has happened in all the time I've had comments on my blog, so moderating them seems to work well. Which obviously is nice.

Sat, 28 May 2005

Finally a Patch that works!

A mail with that as subject appeared on the debian-l10n-dutch mailing list today.

Unfortunately it has nothing to do with computers.

Tue, 24 May 2005

Spam filtering.

A few days ago, I opened my ~/.mutt/muttrc and added the following:

macro index,pager S s=spam/gemist<enter><enter>

=spam/gemist is my folder for, well, missed spam. False negatives. Mails in that folder get read by a cron job that feeds them to sa-learn --spam, and I found I was getting way too many of those lately. I was already tagging them per mailbox, so that I could throw them into the false negatives box all at once, but that still took way too many keystrokes. Hence, the macro to throw spam away with only one keystroke (well, two if you count shift).
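The cron job itself isn't shown here, so the following Python sketch is only a guess at its shape. The function names and the example path are hypothetical, and the folder format is an assumption: a single file is treated as an mbox (sa-learn's --mbox option), anything else is handed to sa-learn as-is.

```python
import os
import subprocess


def sa_learn_command(folder):
    """Build the sa-learn call that trains on `folder` as spam."""
    cmd = ["sa-learn", "--spam"]
    if os.path.isfile(folder):
        # Assumption: a single file is an mbox-format mailbox.
        cmd.append("--mbox")
    cmd.append(folder)
    return cmd


def learn_missed_spam(folder):
    """Run sa-learn on the folder; raises CalledProcessError on failure."""
    subprocess.run(sa_learn_command(folder), check=True)


# From cron one would call something like the following; the path is
# made up, since the post only gives mutt's "=spam/gemist" shorthand:
#     learn_missed_spam(os.path.expanduser("~/Mail/spam/gemist"))
```

A real job would also have to empty or move the folder afterwards, so that the same mails are not fed to sa-learn on every run.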

While doing this and looking at my mails more closely than normal, I found out why I was getting too many of those, too: the score of the BAYES_99 test (meaning, the Bayesian filter was 99 to 100% sure that the mail was spam) was set to 1.9! Freakin' idiot. Modified it in my local configuration to say 4.0 rather than 1.9, and suddenly a lot more spam was caught.
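Why the change matters is simple arithmetic: SpamAssassin adds up the scores of all rules that hit and compares the total to a threshold. The sketch below assumes the stock threshold of 5.0 (the post does not say what was configured), and OTHER_RULE with its 1.5 points is made up for illustration.

```python
# Assumption: SpamAssassin's default threshold; the post does not give one.
REQUIRED_SCORE = 5.0


def is_spam(rule_scores):
    """A mail counts as spam once its summed rule scores reach the threshold."""
    return sum(rule_scores.values()) >= REQUIRED_SCORE


# OTHER_RULE and its 1.5 points are hypothetical.
before = {"BAYES_99": 1.9, "OTHER_RULE": 1.5}
after = {"BAYES_99": 4.0, "OTHER_RULE": 1.5}

print(is_spam(before))  # False: about 3.4 points in total
print(is_spam(after))   # True: 5.5 points in total
```

With BAYES_99 at 1.9, a mail needs more than three further points from other rules before it is caught; at 4.0, one modest extra hit is enough.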

Happy happy, joy joy.

Not sure whether this is my bug or whether it's in the package; I'll have to review that.

Thu, 19 May 2005

Why email address munging is harmful.

See?

If you can figure out what's happening there without cheating (thus, without looking at other versions of that mail), then please lend me your crystal ball sometime.