DNA Analysis Spots E-mail Spam
Biologists at IBM's Watson Research Centre
have devised an anti-spam filter based on the way scientists
analyse genetic sequences. It has proved 96.5% effective at identifying spam.
The algorithm helps to determine automatically the properties
of a protein, such as function and structure, directly from the
string of letters that represents its sequence. Pattern-discovery
algorithms of this kind are applicable to a vast range of problems.
One property of the algorithm is that it spots any pattern that
occurs two or more times, wherever the occurrences fall in the message.
It can be trained so that it will not be fooled by cunning
replacements of "S" with "$", a common
ploy used by spammers to bypass conventional e-mail filters.
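
To illustrate the substitution problem only: the sketch below maps a few look-alike characters back to letters before any matching is done. It is not IBM's method, which is described as being trained on patterns rather than relying on a fixed table. The table contents and the name `normalize` are assumptions made for this example; only the "$" for "S" swap comes from the article.

```python
# Hypothetical look-alike table. Only "$" -> "s" is mentioned in the article;
# the other entries are illustrative assumptions.
LOOKALIKES = str.maketrans({"$": "s", "0": "o", "@": "a"})

def normalize(text: str) -> str:
    """Lower-case the text and undo the look-alike substitutions in the table."""
    return text.lower().translate(LOOKALIKES)

print(normalize("$PECIAL 0FFER"))  # special offer
```

A fixed table like this is easy to evade with a substitution it does not list, which is one reason a filter that learns patterns from examples is attractive.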
Further, the method builds up a database of known spam
patterns and constantly adds new patterns it spots. It compares
this vocabulary with e-mail that it knows is not spam.
So an incoming message subjected to this pattern analysis will
be rejected if it contains a large proportion of the known
spam patterns.
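
The training and rejection steps described above can be sketched roughly as follows. This is a simplified stand-in, not the IBM algorithm: it assumes that a "pattern" is a fixed-length run of characters, that a pattern counts as spam vocabulary when it appears in at least two spam messages and in no known-good message, and that a message is rejected when more than half of its patterns are in that vocabulary. The names `ngrams`, `train` and `spam_score`, the length of 5 and the 0.5 threshold are all hypothetical.

```python
def ngrams(text: str, n: int = 5) -> set:
    """Return the set of all length-n character substrings of the text."""
    text = text.lower()
    return {text[i:i + n] for i in range(len(text) - n + 1)}

def train(spam: list, ham: list, n: int = 5) -> set:
    """Collect patterns seen in two or more spam messages and in no good message."""
    counts = {}
    for msg in spam:
        for g in ngrams(msg, n):  # each pattern counts once per message
            counts[g] = counts.get(g, 0) + 1
    ham_patterns = set()
    for msg in ham:
        ham_patterns |= ngrams(msg, n)
    return {g for g, c in counts.items() if c >= 2 and g not in ham_patterns}

def spam_score(message: str, patterns: set, n: int = 5) -> float:
    """Fraction of the message's patterns that are in the spam vocabulary."""
    grams = ngrams(message, n)
    if not grams:
        return 0.0
    return len(grams & patterns) / len(grams)

# Toy training data, invented for this example.
patterns = train(
    spam=["cheap meds online now", "buy cheap meds today"],
    ham=["meeting moved to today", "lunch online order"],
)
THRESHOLD = 0.5  # assumed cut-off for "a large proportion"
print(spam_score("cheap meds here", patterns) > THRESHOLD)   # True
print(spam_score("see you at lunch", patterns) > THRESHOLD)  # False
```

Removing any pattern that also appears in known-good mail is what keeps ordinary words from counting against a legitimate message.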
The system has to go through some more
pilot studies and testing before it is let loose to protect
inboxes.
Source: Wista Innovation
