MailCorral Documentation

5. Interoperability

5.1 Test Suite

Most email viruses employ similar propagation techniques and are, consequently, very similar in external appearance to one another. That being the case, it is possible to design an email virus filter that removes viruses based on outward appearance. This is highly desirable, since it will ensure that, even when a new virus comes along, the filter will be able to screen for it.

On the other hand, the penalty for failing to detect a virus in an email message is high. We therefore want to know that we've covered all the bases in MailCorral, by being sure that we detect all of the important propagation techniques.

The Email Filter Validation Suite consists of generic email messages which can be passed through an email filter to test all of the popular methods of virus propagation. If the filter detects each message and handles it correctly, it is probably ready for prime time.

Each time we make any changes to MailCorral, we pass all of the email messages from the validation suite through it and examine the results. Each time we discover a new type of virus that must be specifically scanned for, we construct a test message and add it to the validation suite. In this manner, we guard against introducing new bugs and against reintroducing old ones.
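The regression procedure described above can be sketched as a small harness. This is only an illustration in Python: the layout of the real validation suite is not described here, so the cases are passed in as (label, raw message, expected verdict) tuples, and toy_filter is a hypothetical stand-in that judges a message purely by the file names of its attachments. It is not MailCorral's detection logic.

```python
from email import message_from_bytes

# Hypothetical list of executable-looking extensions, for illustration only.
SUSPECT_EXTENSIONS = (".exe", ".scr", ".pif", ".vbs")

def toy_filter(raw_message):
    """Toy stand-in for the filter under test.

    Returns "reject" if any MIME part carries an attachment whose file
    name ends in a suspect extension, otherwise "accept".
    """
    msg = message_from_bytes(raw_message)
    for part in msg.walk():
        name = part.get_filename() or ""
        if name.lower().endswith(SUSPECT_EXTENSIONS):
            return "reject"
    return "accept"

def run_suite(cases, filter_fn):
    """Pass every test message through filter_fn and collect mismatches.

    cases is an iterable of (label, raw_message, expected_verdict).
    The result is a list of (label, expected, actual) for each failure,
    so an empty list means the whole suite passed.
    """
    failures = []
    for label, raw, expected in cases:
        actual = filter_fn(raw)
        if actual != expected:
            failures.append((label, expected, actual))
    return failures
```

Rerunning run_suite after every change, and adding a new case whenever a new kind of virus turns up, mirrors the process described above.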

5.2 Working With SpamCorral

MailCorral can redirect all received spam to the spam corral, instead of delivering it to the spammer's intended victim. All in all, this is not a bad plan, but it is possible that a piece of spam could prove valuable (stranger things have happened). It is even possible that a non-spam message might be misidentified as spam and rustled into the corral by mistake.

The ultimate decision, about whether a piece of spam is valuable or not, is best left up to the intended recipient. After all, they are really the only ones who know whether they actually want to see the spammer's message or not. But, if the recipient is shown every piece of spam and asked to make a decision about whether they want to see it or not, the solution is no better than the problem. The compromise solution is to only ask them once or twice a day, in a single message, and to provide a summary that contains sufficient information to allow them to decide, quickly and easily, whether they want to see the spam or not.

To accomplish this goal, a spam handling package, called SpamCorral, works hand in glove with MailCorral to provide spam notifications, through a program which can be run periodically by a cron job. When this program is run, it will send email notification messages to all of the recipients of spam. It summarizes each of the messages received since the last time it was run, giving the sender's address, the subject, the delivery date/time and the associated spam statistics (as a percentage, with 100% being the threshold for classification of a message as spam). When a user receives the notification, they can optionally reply to the message (using their mailer's reply function), retaining the description of any pieces of spam which they wish to see and deleting the description of those which they don't. The action is simple and natural, in that it is just like replying to any other piece of email that they receive.
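One way to read the percentage in the notification is as the message's score relative to the classification threshold. The following is a minimal sketch in Python, assuming the spam arbitron reports a numeric score and a numeric threshold. The function name and the example values are illustrative and are not taken from SpamCorral.

```python
def spam_percentage(score, threshold):
    """Express a spam score as a percentage of the classification threshold.

    Assumption: 100% corresponds to score == threshold, matching the
    description of the notification summary above.
    """
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    return 100.0 * score / threshold
```

Under this reading, with a threshold of 5.0, a score of 7.5 would be shown as 150% and a score of 2.5 as 50%.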

A second program in the spam handling package, the spam handling robot, listens for messages sent to it, as replies to notification messages, requesting that spam be extracted from the mail corral and remailed to the original recipient. Upon verification of the sender's right to remail the spam, the corralled messages will be remailed to them but, this time, they will pass directly through the sendmail filter unscathed. The operation of the spam handlers is automatic and unattended, simply responding to requests from the recipients to remail all of the spam that they ask to see. No intervention by administrative personnel is required. Furthermore, no important or interesting messages are ever dropped by accident. The recipient has final say in all decisions.

As was mentioned above, MailCorral works very closely with SpamCorral, putting the spam into the corral, where it awaits its final disposition at the user's behest. Furthermore, when the spam handler remails spam that has been released by the recipient, MailCorral does some special processing (very minimal in nature) to carry out delivery of the spam with a minimum of fuss.

5.3 Creating Your Own Spam Handler

If MailCorral is asked to redirect received spam to the spam corral, you can choose to write your own spam handling programs to process the spam that is placed therein. Here are some notes on how to go about this.

Corralled spam is completely formatted and ready for remailing. If any viruses or other obnoxious entities were found within the message, in addition to its being spam, these have already been removed. Spam headers have been inserted as have any descriptive messages. To remail the message for delivery, all that need be done is to invoke sendmail and pass it the message exactly as it is stored.

When spam is remailed, be aware that there is a header in the message (SPAMHDRBYPASS in smfopts.h) that contains a key which will instruct the filter to bypass processing of the message the second time around. This header contains a copy of the message date/time, encrypted using the bypass tag key (SPAMKEY in smfopts.h). For the message to be successfully remailed, without additional processing, this header must be kept intact, the date/time must not be changed from when the message was originally sent and two other criteria must be met. The mailer that remails the spam must be "local" and the name of the remailer must not contain an '@' (i.e. the message must be from the local domain). All remailed spam will be tagged by the filter as "[SPAM]" in the subject line.
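As a rough sketch of the remailing step, the Python fragment below hands a stored message to sendmail byte for byte, so that the bypass header and the date/time stay intact, and refuses a remailer name that contains an '@'. The sendmail path, the function names and the "spamrobot" sender used in the usage note are assumptions made for illustration. Note also that sendmail generally honours -f only for trusted users, so the account that runs the handler may need to be configured accordingly.

```python
import subprocess

SENDMAIL = "/usr/sbin/sendmail"  # assumed location; adjust for your system

def build_remail_command(sender, recipient):
    """Build the sendmail argument list for remailing one corralled message.

    sender must be a local name without '@', per the bypass criteria above.
    """
    if "@" in sender:
        raise ValueError("remailer name must be local (no '@'): %r" % sender)
    # -i: do not treat a line containing only '.' as the end of the message
    # -f: set the envelope sender
    return [SENDMAIL, "-i", "-f", sender, recipient]

def remail(path, sender, recipient):
    """Pipe the stored message to sendmail exactly as it is stored."""
    with open(path, "rb") as fh:
        raw = fh.read()
    subprocess.run(build_remail_command(sender, recipient), input=raw, check=True)
```

A handler might call remail(path, "spamrobot", "jblow"), where the recipient comes from the envelope rather than from the message headers.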

The name of the original recipient of a message, from the envelope, is included in the actual name under which the spam is stored in the corral. Remailing should be done with this name, not the "To" name in the message headers. Note that, if there were multiple recipients on a piece of spam's envelope, one copy of the message is stored without a recipient name. A symbolic link is then made to the stored message for each of the envelope recipients. Thus, any program that processes corralled spam must consider symbolic links as well as files and must disregard files that have no recipient included in their name. Here are a couple of examples:

  -rw------- 1 1650  Aug 3 14:14  spam_to_jblow_3D4C1D90
  lrwxrwxrwx 1 48  Aug 3 14:21  spam_to_jblow_3D4C1F0D -> spam_to__3D4C1F0D
  lrwxrwxrwx 1 48  Aug 3 14:21  spam_to_jdoe_3D4C1F0D -> spam_to__3D4C1F0D
  -rw------- 1 1644  Aug 3 14:21  spam_to__3D4C1F0D

To process the above files, send out three pieces of remailed spam, two to "jblow" and one to "jdoe". A single copy of the first message ("spam_to_jblow_3D4C1D90") is sent to "jblow". Multiple copies of the second message ("spam_to__3D4C1F0D") are sent to "jblow" and "jdoe". If you want to figure out when to delete the messages, any message with a recipient name in its file name can be deleted immediately, as can any link. File names with no recipient name in them can only be deleted when no links pointing to them remain.
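The rules above can be sketched in Python as follows. The file name pattern is inferred from the example listing (spam_to_, then the recipient, then an underscore and a hexadecimal identifier), and the function and type names are made up for this illustration. Treat it as a starting point under those assumptions rather than as the way SpamCorral itself does it.

```python
import os
import re
from collections import namedtuple

# Pattern inferred from the listing above: spam_to_<recipient>_<hex id>,
# where <recipient> is empty for the shared copy of a multi-recipient message.
NAME_RE = re.compile(r"^spam_to_(?P<rcpt>.*)_(?P<id>[0-9A-Fa-f]+)$")

Entry = namedtuple("Entry", "recipient path is_link")

def scan_corral(corral_dir):
    """Return (deliverable, shared) for a corral directory.

    deliverable: files or symbolic links whose name carries a recipient.
    shared: regular files with no recipient in their name; these are
            never remailed directly, only through the links to them.
    """
    deliverable, shared = [], []
    for name in sorted(os.listdir(corral_dir)):
        m = NAME_RE.match(name)
        if m is None:
            continue  # not a corralled message
        path = os.path.join(corral_dir, name)
        if m.group("rcpt"):
            deliverable.append(Entry(m.group("rcpt"), path, os.path.islink(path)))
        elif not os.path.islink(path):
            shared.append(path)
    return deliverable, shared

def deletable_shared(corral_dir):
    """Shared (recipient-less) files that no remaining link points to."""
    deliverable, shared = scan_corral(corral_dir)
    targets = {os.path.realpath(e.path) for e in deliverable if e.is_link}
    return [p for p in shared if os.path.realpath(p) not in targets]
```

For the listing above, scan_corral yields three deliverable entries (two for "jblow", one for "jdoe") and one shared file. deletable_shared returns that shared file only once both links have been removed.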

5.4 Using SpamAssassin To Classify Spam

If you wish to use SpamAssassin as your spam arbitron, to classify messages that are spam, you need to first install the package (either by selecting the version that comes with your OS distro or by downloading and installing the source from: https://spamassassin.apache.org/downloads.cgi). Follow the instructions for installing SpamAssassin, including the spamd daemon. Make sure to install the system startup script in the appropriate place (e.g. /etc/init.d) and verify that spamd is set to automatically start at boot time, in sequence before sendmail (and terminate after sendmail at system shutdown). Turn the startup script on so that spamd starts automatically at boot time.

The spamd options that you should consider using are:

-c   Create user preferences files (e.g. in the ~/.spamassassin directory) where auto whitelist and Bayesian statistics will be kept for each individual user.
-d   Daemonize, that is to say run as a daemon, waiting for work (in this case from MailCorral). You must specify this option.
-L   Use local tests only. Do not look up addresses, etc. in DNS. This makes spamd much quicker to judge, although it may result in more false negatives.
-p portnum   The port number to listen on. By default, spamd listens on port 783. If you'd rather use some other port number, specify it here. The rest of the MailCorral documentation makes mention of port 2527 so you might wish to make use of that number.
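Once spamd is running (for instance, started with something like "spamd -d -c -p 2527"), it can be handy to check that it is actually answering on the chosen port before pointing MailCorral at it. The Python sketch below sends the PING request of the spamc/spamd protocol and looks for a PONG reply. The host, the port default and the function names are assumptions for illustration, and the protocol version string in the request may need adjusting for your SpamAssassin release.

```python
import socket

def is_pong(reply):
    """True if reply looks like a spamd answer to PING, e.g. b'SPAMD/1.5 0 PONG'."""
    return reply.startswith(b"SPAMD/") and b"PONG" in reply

def spamd_alive(host="127.0.0.1", port=2527, timeout=5.0):
    """Send a PING to spamd and report whether it answered with PONG.

    Any connection problem (refused, timed out, ...) counts as not alive.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as s:
            s.sendall(b"PING SPAMC/1.2\r\n\r\n")
            reply = s.recv(256)
    except OSError:
        return False
    return is_pong(reply)
```

If you kept the default port, call spamd_alive(port=783) instead.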

Earlier versions of SpamAssassin required that the "-a" parameter be used with spamd to cause auto whitelisting to be enabled. With the latest versions of SpamAssassin, auto whitelisting is turned on by default, and is either enabled or disabled with the "use_auto_whitelist" option in the configuration file. The "-a" parameter is deprecated and should be removed when running the later versions of SpamAssassin. On the other hand, if you are still using an earlier version, by all means, turn on auto whitelisting with "-a".

A system-wide auto-whitelist can be used, by setting the auto_whitelist_path and auto_whitelist_file_mode configuration commands as you like to define where the whitelists will be stored:

auto_whitelist_path   /var/spool/spamassassin/auto-whitelist
auto_whitelist_file_mode   0666

Note that, if SpamAssassin is used as your spam arbitron, MailCorral uses the SpamAssassin whitelist and local configuration files to implement a fast path lookup for whitelisted and blacklisted mail senders. This eliminates the need to send received mail messages to SpamAssassin purely to find out whether they are white or blacklisted, which can result in a considerable speedup in handling white and blacklisted messages.

To implement this feature, MailCorral must know where SpamAssassin is installed. Normally, it is installed in a well-known location and this feature should just work by default. However, if you install SpamAssassin somewhere other than the usual spot, you will probably need to change the path names found in the module spamfilter.c and recompile MailCorral. Here is the list of path/file names that MailCorral usually looks for:

/usr/local/share/spamassassin/60_whitelist.cf
/usr/share/spamassassin/60_whitelist.cf
/var/lib/spamassassin/*/60_whitelist.cf
/usr/local/etc/spamassassin/local.cf
/usr/pkg/etc/spamassassin/local.cf
/usr/etc/spamassassin/local.cf
/etc/mail/spamassassin/local.cf
/etc/spamassassin/local.cf
~/.spamassassin/user_prefs

The way that MailCorral searches the list is intuitive, except for the path/file names that have an asterisk in them. In those cases, the entire directory path at and below the point at which the asterisk occurs is searched for the file named. This supports SpamAssassin's use of versioned white and blacklists.
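The asterisk rule can be illustrated with a short Python sketch. It reflects one reading of the description above, in which everything at and below the directory preceding the asterisk is searched for the final file name. It is not the code in spamfilter.c, and the function name is invented for the example.

```python
import os

def find_config_files(pattern):
    """Return the files matched by one entry of the search list.

    An entry without an asterisk matches itself if the file exists.
    For an entry containing an asterisk, the whole directory tree rooted
    at the path before the asterisk is walked, and every file with the
    entry's final name is returned.
    """
    pattern = os.path.expanduser(pattern)  # handles the ~/ entry
    if "*" not in pattern:
        return [pattern] if os.path.isfile(pattern) else []
    root = pattern.split("*", 1)[0]
    wanted = os.path.basename(pattern)
    found = []
    for dirpath, _dirs, files in os.walk(root):
        if wanted in files:
            found.append(os.path.join(dirpath, wanted))
    return sorted(found)
```

With this sketch, find_config_files("/var/lib/spamassassin/*/60_whitelist.cf") would pick up a copy of 60_whitelist.cf in any versioned subdirectory, however deeply nested.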

5.5 Using ClamAV To Detect Viruses

If you wish to use ClamAV as your virus arbitron, to detect messages that contain viruses, you should first install it on your system (either by selecting the version that comes with your OS distro or by downloading and installing the source from: https://www.clamav.net/downloads). Follow the instructions for installing ClamAV, including the clamd daemon. Make sure to install the system startup script in the appropriate place (e.g. /etc/init.d) and verify that clamd is set to automatically start at boot time, in sequence before sendmail (and terminate after sendmail at system shutdown). Turn the startup script on so that clamd starts automatically at boot time.

The clamd options (which can only be set in the configuration file "/etc/clam/clamav.conf") are pretty much OK, by default. The only options that we suggest that you should consider using are:

PidFile /var/run/clamav/clamd.pid   ClamAV may put its PID file somewhere where traditional startup scripts can't find it. We suggest that you force it to use "/var/run/clamav/clamd.pid" as this is where traditional startup scripts expect PID files to be.
TCPSocket portnum   The TCP port number to listen on. By default, clamd doesn't listen on any TCP port so you must specify this parameter along with a port number. The rest of the MailCorral documentation makes mention of port 2528 so you might wish to make use of that number.
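A quick way to confirm that both suggested settings are present is to read the configuration file and look for them. The Python sketch below assumes the usual "Option value" line format with "#" starting a comment. The function names are illustrative, and the path to pass in is whatever configuration file your installation actually uses.

```python
def parse_clamd_conf(text):
    """Return {option: value} for 'Option value' lines.

    Blank lines and anything after '#' are ignored. If an option is
    repeated, the last occurrence wins (a simplification).
    """
    options = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        parts = line.split(None, 1)
        options[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return options

def missing_settings(options, required=("PidFile", "TCPSocket")):
    """List the suggested options that are absent or have no value."""
    return [name for name in required if not options.get(name)]
```

For example, reading the file into a string and calling missing_settings(parse_clamd_conf(text)) returns an empty list when both PidFile and TCPSocket are set.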

Make sure that you keep ClamAV up to date, as viruses are constantly evolving. Also, be sure to run freshclam at regular intervals (at least once or twice daily) to keep the virus signatures file up to date. This is your first line of defence against new viruses. ClamAV will notify you, in your log file, about updates. If you run a log watcher program like "logwatch", you should receive email notification whenever a new version of ClamAV is available. Once again, keep it up to date!