Keeping Spam Out of the Network
Accepting an e-mail implies that the message transfer agent (MTA) has accepted responsibility [1] for performing onward delivery. In some countries this now has legal implications. In most cases the legal requirements include keeping an archived copy of every e-mail that passes through the network. Given that an estimated 65 to 90 percent [2] of all e-mail today is spam, companies can end up archiving terabytes of spam!
Unfortunately most MTAs today accept and queue first, then dequeue and scan before onward delivery. This leads many people to opt for something called accept-and-drop in an effort to reduce spam: if the e-mail is found to be spam after it has been accepted, it is simply discarded. Under some legislation this could be considered illegal. Even worse is the case of a false positive, where a legitimate e-mail is discarded.
In order to effectively combat spam, it is necessary to stop the spam before it enters the network.
Let’s imagine for the moment that you, the readers, and I, the author, run a mail server called mx.my-domain.example. [3] A sensible approach for us will include:
- Blocklists
- Scan before acceptance
- SPF
Other techniques, such as throttling and temporary deferral, are used as well, but they warrant a discussion of their own.
Before looking at these techniques, here is a recap of what a normal SMTP conversation [4] looks like:
220 mx.my-domain.example is ready
EHLO yours.example
250 Pleased to meet you yours.example. Please proceed.
MAIL FROM:<you@yours.example>
250 OK
RCPT TO:<me@my-domain.example>
250 OK
DATA
354 Please proceed
An email in RFC2822 format is sent followed by a single dot
.
250 OK
QUIT
221 Goodbye
Using a blocklist
The sooner a spammer can be stopped the better. Therefore, the first line of defense is a real-time blocklist. This allows for the connection to be refused immediately by using a 5xx code instead of the friendly 220.
511-This IP is blacklisted for sending spam.
511-Please contact your service provider.
511 Service providers may visit http://www.mcafee.com for an antispam solution
At this point your MTA should also close the connection even though it strictly violates RFC2821. We are not dealing with people who are playing fair here!
Some people advocate that the connection should be dropped without a reply, but this must be done only when one is dead sure that the connecting IP is a compromised host. The connection may come from a blacklisted ISP’s mail server. Refusing the connection with a 5xx-style message pushes the responsibility back to the sender. If a legitimate customer of an ISP is affected by this, they can complain to the ISP. The latter will need to take care of the problem or face losing customers.
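To make the connection-time check concrete, here is a minimal sketch in Python. It assumes the blocklist is published as a DNS zone (the common DNSBL arrangement, which this article does not otherwise go into), and the zone name dnsbl.example is purely a placeholder for whatever feed you subscribe to. The function names are mine, not part of any MTA. The resolver is passed in as a parameter so the logic can be tried without touching the network.

```python
import ipaddress
import socket

# Hypothetical zone name used only for illustration; substitute the
# zone of whichever blocklist feed you actually subscribe to.
BLOCKLIST_ZONE = "dnsbl.example"


def dnsbl_query_name(client_ip, zone=BLOCKLIST_ZONE):
    """Build the DNS name to look up for an IPv4 client address."""
    addr = ipaddress.IPv4Address(client_ip)  # raises ValueError on bad input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(client_ip, resolve=socket.gethostbyname):
    """Return True if the blocklist zone has an entry for this client."""
    try:
        resolve(dnsbl_query_name(client_ip))
    except OSError:
        # Name not found or lookup failed: treat the client as not
        # listed (a fail-open choice made for this sketch).
        return False
    return True


def greeting(client_ip, resolve=socket.gethostbyname):
    """Pick the first line the server sends after the TCP connect."""
    if is_listed(client_ip, resolve):
        return "511 This IP is blacklisted for sending spam."
    return "220 mx.my-domain.example is ready"
```

Because the decision is made before the 220 greeting, a listed client never gets as far as EHLO. Failing open on lookup errors is a design choice of the sketch: a resolver outage then lets mail through to the later checks rather than refusing everyone.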
Blocklists are effective up to a point. An IP address cannot be blacklisted when its ratio of spam to ham is low. It also takes time before an IP can be deemed to be spamming. The next step is to look inside the e-mail itself.
Scanning the content
Most spam e-mails are relatively small, and a good antispam content-analysis engine can scan them in very little time. This makes it ideal to scan the e-mail in the DATA phase, before sending the reply to the final dot. An RFC2821-compliant client will wait up to 10 minutes [5] for that reply, which leaves enough time to do the scanning. If the sender is a spambot that will not wait around that long and goes away, we’ve won yet again! If it is a very big file that cannot be scanned in a short time, it is probably not spam anyway.
220 mx.my-domain.example is ready
EHLO yours.example
250 Pleased to meet you yours.example. Please proceed.
MAIL FROM:<you@yours.example>
250 OK
RCPT TO:<me@my-domain.example>
250 OK
DATA
354 Please proceed
A spam in RFC2822 format is sent followed by a single dot
.
553 This is spam!
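The decision behind that final reply can be sketched in a few lines of Python. This is only an illustration of the ordering: handle_data, toy_scan and the list standing in for the mail queue are names I have made up, and toy_scan is a placeholder for a real content-analysis engine. The point is that the message is queued only after the scan has passed, so a rejected message is never accepted in the first place.

```python
def toy_scan(message):
    """Placeholder for a real antispam engine: True means spam."""
    return b"cheap pills" in message.lower()


def handle_data(message, scan, queue):
    """Reply to the end of DATA, queueing the message only if it is clean.

    `message` is the raw message as bytes, `scan` is any callable that
    returns True for spam, and `queue` is a list standing in for the
    MTA's delivery queue.
    """
    if scan(message):
        # Rejected before we accept responsibility: nothing is queued,
        # so there is nothing to deliver or archive.
        return "553 This is spam!"
    queue.append(message)
    return "250 OK"
```

Compare this with accept-and-drop, where the append would happen first and the scan afterwards: by then the 250 has already been sent and the message is ours.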
Once again, having detected that it is spam, we can push the problem back to the sender. We also close the connection at this time. If it is a spambot, it will go away; if it is an ISP, then the provider has to determine why the spam was accepted in the first place. Whatever the case, the ball is now back in the sender’s court and we can be sure that we have less spam on our network. At least that is what we thought.
SPF
As any mail admin knows, spoofing the sender address is easier than clicking a link in your browser. Let’s look at what has happened on our network:
220 mx.my-domain.example is ready
EHLO yours.example
250 Pleased to meet you yours.example. Please proceed.
MAIL FROM:<me@my-domain.example>
250 OK
RCPT TO:<me@my-domain.example>
250 OK
DATA
354 Please proceed
A spam in RFC2822 format is sent followed by a single dot
.
553 This is spam!
We have rejected the spam, but the sender address has been faked to point back to us. If the sending IP were a real MTA, it would create a bounce message and connect back to us:
220 mx.my-domain.example is ready
EHLO yours.example
250 Pleased to meet you yours.example. Please proceed.
MAIL FROM:<>
250 OK
RCPT TO:<me@my-domain.example>
250 OK
DATA
354 Please proceed
A bounce in RFC3462 format is sent followed by a single dot
.
250 OK
QUIT
221 Goodbye
Because of the way the bounce message is encoded, there is a good chance that the original body is only partially included, or not included at all. As such we might not detect the bounce as spam. We can counter the spoofing of our domain name using Sender Policy Framework (SPF), a useful but limited technique. In reality its only benefit is the prevention of sender spoofing. If we had implemented SPF checking on our mail server (and published SPF records for our domain), we could have caught the spammer earlier.
220 mx.my-domain.example is ready
EHLO yours.example
250 Pleased to meet you yours.example. Please proceed.
MAIL FROM:<me@my-domain.example>
517 SPF Failed. Go away!
Unfortunately this will still create a bounce if the sender is a well-behaved MTA. At least now we can mail abuse@yours.example and ask them nicely to implement SPF. If the sending MTA had SPF checking enabled, it would not have accepted the e-mail in the first place.
Unfortunately spammers can also create SPF records for their own domains, especially the all-permissive “+all” kind, which means that any address is allowed to send e-mail from that domain.
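To show both the check and the “+all” loophole, here is a deliberately simplified sketch of SPF evaluation in Python. It understands only the ip4 and all terms and takes the record as a string; fetching the TXT record from DNS and the remaining mechanisms of the real specification (include, a, mx and so on) are left out. The function names and the sample records are my own assumptions for illustration.

```python
import ipaddress

QUALIFIERS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}


def check_spf(record, client_ip):
    """Evaluate a simplified SPF record against the connecting IP.

    Only `ip4:` and `all` terms are handled; any other term is skipped,
    which a real implementation must not do.
    """
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return "none"  # not an SPF record at all
    ip = ipaddress.ip_address(client_ip)
    for term in terms[1:]:
        qualifier = "+"  # the default when no qualifier is written
        if term[0] in QUALIFIERS:
            qualifier, term = term[0], term[1:]
        if term.lower() == "all":
            return QUALIFIERS[qualifier]
        if term.lower().startswith("ip4:"):
            network = ipaddress.ip_network(term[4:], strict=False)
            if ip in network:
                return QUALIFIERS[qualifier]
    return "neutral"  # nothing matched


def reply_to_mail_from(record, client_ip):
    """Map the SPF result onto the reply to MAIL FROM."""
    if check_spf(record, client_ip) == "fail":
        return "517 SPF Failed. Go away!"
    return "250 OK"
```

With a record such as "v=spf1 ip4:192.0.2.0/24 -all", a client at 198.51.100.7 claiming to be from our domain is refused at MAIL FROM. With "v=spf1 +all" the same client passes, which is exactly why SPF on its own does not stop a spammer who controls the domain.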
This is the point where people like to point out that DomainKeys is the solution. Just remember that DomainKeys operates on the message itself, which means it can only happen in the DATA phase. Use of SPF will eliminate a good percentage of spam before it gets to the DATA phase.
Implementing this in the MTA
As I mentioned earlier, most MTAs will queue first, and then scan. Some mainstream MTA software such as Postfix and Courier will allow filtering before acceptance, but implementing that is far from trivial. Even commercial offerings tend to fail in this respect. To date the only commercial offering I have seen that can do this and really do this well is the McAfee Secure Internet Gateway appliance. (OK, I might be biased, because I was responsible for getting that technology implemented.)
One thing many people tend to miss is the need for good feeds. Getting the infrastructure in place is one thing, but what really drives it is good-quality feeds for blocklists and up-to-date rules for content-scanning antispam engines. That is where I am very fortunate to be able to work with some dedicated and clever people at McAfee who can provide these kinds of services.
In conclusion
Blocking spam before acceptance in the DATA phase releases us from legal requirements, frees up resources, and pushes the problem back to the sender. It is a simple approach that can be used to reduce spam traffic on the internal network, but it relies on having good antispam feeds in place.
As a side note, I’ll leave you to ponder whether this approach can be applied to e-mail containing malware, a topic best left for another day.
Notes
1 See RFC2821 section 4.1.1.4.
2 The amount varies according to who measures it and where it is measured on the network.
3 .example is a special reserved TLD for use in documentation or as examples (RFC2606 section 2).
4 The discerning mail admin will notice that I have removed bits from the SMTP conversation. This is for clarity. I have also left out any use of DSN extensions.
5 See RFC2821 section 4.5.3.2.