2003-11-26
Abstract

Like most advertising flyers found in postal mailboxes, millions of emails -- now classically referred to as spam -- fill email inboxes around the world every day. Spam can be considered the most annoying form of cyber-pollution, targeting all of us with tons of unsolicited emails. Those emails usually contain advertisements, and spammers are paid to spread as many of them as possible.

Though spam should generally not be considered a real cyber attack, it may be difficult to distinguish between virus-contaminated emails, phishing scams and bothersome ads (those containing tricky JavaScript or specially forged HTML used to track recipients). Moreover, spammers slow down the servers receiving legitimate emails and may cause availability problems. While spammers earn money by bothering people, employees and netsurfers lose time by receiving unsolicited emails -- in some cases, hundreds per day. Companies may lose money too, through lost productivity, bandwidth charges, purchasing blacklists, and so on. Typical solutions against this cyber-plague are to filter incoming emails by using content analysis or blacklists, and to fix poorly configured servers.

This paper will evaluate the usefulness of honeypots in fighting spammers. The first part of the article gives some background information on spam. Then, we will try to understand how honeypots may detect, slow and stop such activities while promoting a clean Internet. Finally, we will conclude with some future perspectives.

1.0 Introduction to spammers

1.1 What is Spam?

While spam is the name of a food dish containing "mystery meat" [ref 0], it is also what people call the unsolicited emails they receive on the Internet. The origin of the common use of this name is a Monty Python sketch [ref 1] where the word "spam" becomes so pervasive that you cannot hear anything else (Vikings singing the praises of spam, a waitress repeating spam [ref 2]).
The idea is that if Internet users were simply flooded by spam, nobody would be able to distinguish spam from normal emails. A security platoon could say with humor that the first casualty of spam is innocence.

In this paper, we will use the word spam to describe UBE (Unsolicited Bulk E-mail) and UCE (Unsolicited Commercial E-mail). Examples and logs given in this paper are inspired by real-life events, but they were modified to retain anonymity.

1.2 How spammers work

Spam is sent because it has become a paid form of mass advertising on the Internet. Spammers' work can be cut into different categories, covered in turn below: harvesting email addresses, using open proxies and using open relays.
We would need a book to describe everything to do with spam in enough detail, and the Internet is already full of excellent resources on the subject, so let's focus on the important issues.

1.2.1 Email addresses get harvested

The first need for spammers is to get an up-to-date list of targets. Many different ways exist to collect thousands and thousands of email addresses on the Internet. When you post messages to Usenet, for example, your address becomes available to simple, automatic programs that look at the headers of every message posted. By saving specific fields (From:, Reply-To:), spammers may easily build huge lists of potential targets. Another way of harvesting addresses is through poorly configured mailing lists that give out the list of their subscribers. A third technique is based again on simple, automatic programs, this time ones crawling Web pages on the Internet. For each HTML Web page found, such a program will check for mailto: links ("send me an email by clicking here") and will follow the Web links on the page to continue this sort of evil seeking.
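To see what such a crawler takes away from a single page, the third technique can be sketched in a few lines of Python. This is only an illustrative sketch, useful for auditing your own pages: the names `MailtoScanner` and `scan_page` and the sample page are assumptions made for this example, not part of any real harvesting tool.

```python
from html.parser import HTMLParser


class MailtoScanner(HTMLParser):
    """Collects what a harvesting bot would take from one HTML page."""

    def __init__(self):
        super().__init__()
        self.addresses = []  # addresses found in mailto: links
        self.links = []      # other links the bot would follow next

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        if href.lower().startswith("mailto:"):
            # Drop the scheme and any "?subject=..." suffix.
            address = href[len("mailto:"):].split("?", 1)[0]
            if address:
                self.addresses.append(address)
        elif href:
            self.links.append(href)


def scan_page(html):
    """Return (addresses, links) as seen by a naive harvester."""
    scanner = MailtoScanner()
    scanner.feed(html)
    return scanner.addresses, scanner.links


# Hypothetical page used only for this illustration.
page = (
    '<a href="mailto:webmaster@example.org">send me an email</a> '
    '<a href="/contact.html">contact</a>'
)
addresses, links = scan_page(page)
print(addresses)
print(links)
```

Running this against your own pages shows exactly which addresses are exposed to the simplest kind of harvester.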
Figure 1: Harvesting email addresses

You may also want to read other documents, such as [ref 3], for more detailed explanations about the harvesting of email addresses.

1.2.2 Open proxies

Spammers may either connect directly to a remote mail server, or bounce through open proxies. The role of a Web proxy, for example, is to do the job of a Web client for someone else. When a Web client connects to a proxy, it asks for a Web page somewhere on the Internet. The proxy then fetches this Web page itself and returns the data to the client. In the logs of the remote Web server, we can usually see only the IP address of the proxy that made the Web requests.

An open proxy is a proxy service opened to the world for almost any kind of request, allowing anybody to remain anonymous while crawling the net. Such proxies are used a lot in the underground: blackhat people, warez people, etc. Open proxies are also useful for many spammers, because they allow them to stay anonymous while sending their unwanted emails. In a TCP session recorded by Snort [ref 4], showing a remote proxy check probably launched by Earthlink, the client connects to the proxy on TCP port 8080 and asks not for a Web page but for a TCP session with a remote SMTP server (207.69.200.120), by means of the HTTP CONNECT method. The rest of this TCP session is SMTP, sent directly to the SMTP server (HELO, MAIL FROM, RCPT TO, DATA, QUIT).
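On the defensive side, the first line of such a session is enough to recognize the abuse. The following Python sketch parses an HTTP request line and reports whether it is a CONNECT aimed at an SMTP port. The function names are hypothetical, and port 25 is assumed here as the SMTP port; the recorded session itself is not reproduced.

```python
def parse_connect_target(request_line):
    """Return (host, port) for an HTTP CONNECT request line, else None."""
    parts = request_line.strip().split()
    if len(parts) != 3 or parts[0].upper() != "CONNECT":
        return None
    host, sep, port = parts[1].rpartition(":")
    if not sep or not host or not (port.isascii() and port.isdigit()):
        return None
    return host, int(port)


def is_smtp_tunnel(request_line, smtp_ports=(25,)):
    """True when the proxy is asked to open a raw session to a mail port."""
    target = parse_connect_target(request_line)
    return target is not None and target[1] in smtp_ports


# The SMTP server address comes from the article; port 25 is an assumption.
print(is_smtp_tunnel("CONNECT 207.69.200.120:25 HTTP/1.0"))
print(is_smtp_tunnel("GET http://www.example.org/ HTTP/1.0"))
```

A proxy administrator could apply the same test to access logs to spot clients that use the proxy as a bridge to mail servers.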
Using a proxy server is an efficient way for a spammer to gain anonymity. As proxy owners may keep logs, spammers may fear that their IP address could be recorded (remote proxy log). Usually, spammers bet that badly configured proxies don't keep logs. This fear of logs is why they sometimes use chains of proxies to improve their odds -- they connect to an open proxy server (TCP session), then ask it to connect to another known open proxy server (CONNECT a.b.c.d:3128), etc. For example:
Figure 2: Open relays and spammers

The longer the chain, the stealthier they become, but they lose time, as each bounce adds its own delay.

1.2.3 Open relays

An open relay (sometimes called an insecure relay or a third-party relay) is a Mail Transfer Agent (MTA) that accepts third-party relaying of e-mail messages even though they are not destined for its domain. As they forward emails that are neither to nor from a local user, open relays are used by spammers to route large volumes of unsolicited emails.

Such a poorly configured MTA lends its system and network resources to the remote abuser who is getting paid to send out spam. Usually, an organization that unwittingly relays spam may be blacklisted on international lists (RBL, etc.). That would annoy internal users, because they could no longer use their own email properly. A big ISP that ends up blacklisted would probably lose clients and money.

2.0 Honeypots versus spammers

To quote the leader of the Honeynet Project, Lance Spitzner [ref 5], a honeypot is an information system resource whose value lies in unauthorized or illicit use of that resource.

In this chapter, we will see if it's possible to use honeypot technologies in the cases described above: address harvesting, open proxies and open relays.
2.1 Honeypots and harvesting

One of the first phases of a spammer's work is the harvesting of email addresses. Here we will focus on harvesting through Web pages, which may be the easiest case to solve for those trying to defend against spam. Without claiming that honeypots can fool spammers during this phase, there are some efficient techniques that don't exactly correspond to the definition of classical honeypots. This is the concept: while spammers browse the Web, if they read Web pages with fake email addresses, they will feed their database with invalid targets. Purists may say that this is not exactly a honeypot, so let's say it's like adding one spoon of honey to your Web pages.

During automatic harvesting of valid email addresses on the Web, spammers can sometimes be recognized by the tools they use, by checking the User-Agent field sent by their client [ref 6]. Some people have decided to either block a specific User-Agent known to be used by spammers, or transparently redirect those Web clients to fake Web pages containing tons of fake email addresses. The trouble is that it's very easy for spammers to change the User-Agent. So those same people defending against spam then decided to create Web links on their pages that would be invisible to a human reader (e.g. white characters on a white background) but visible to a spambot following every link read in the HTML source. Such a Web page, waiting for spambots, dynamically creates fake email addresses to fool the spammers.

One idea could be to create tons of fake addresses. A good example is a piece of freeware called Wpoison [ref 7]. This CGI script, added to your Web site, generates fake email addresses that look like real ones. A live demo can be tried on this Web site [ref 8]. Another technique could be to create a fake address containing specifically chosen information.
The day this email address is used as a target of spam, the owner will be able to determine the IP used by the spammer.
This script will dynamically generate a mailto: link, containing a fake email address with the IP of the current Web client and the date. For example:
If the Web client is a spambot, it will add 80.13.aa.bb_03-11-17-spamming@frenchhoneynet.org to its database of potential targets. Now suppose that a spammer uses this database. He will probably send an email to this virtual address. The mail server administrator can then filter incoming emails by looking at the recipients (on your MTA or possibly on your MUA [Mail User Agent]). If you receive an email addressed to 80.13.aa.bb_03-11-17-spamming@frenchhoneynet.org, then you know for certain that 80.13.aa.bb is the IP address that was used on November 17, 2003. More than that, you know that this address was a spam harvesting source.
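The tagged-address technique can be sketched as a pair of Python functions: one builds the trap address when a page is served, the other decodes a recipient seen later by the mail server. This is a minimal sketch that only follows the address format of the example above; the function names, the sample IP 192.0.2.10 and the reading of the two-digit year as 20yy are assumptions made for illustration.

```python
import re
from datetime import date

TRAP_DOMAIN = "frenchhoneynet.org"  # domain taken from the article's example

# Format from the article: <client IP>_<yy-mm-dd>-spamming@<domain>
TRAP_PATTERN = re.compile(
    r"^(?P<ip>\d{1,3}(?:\.\d{1,3}){3})"
    r"_(?P<yy>\d{2})-(?P<mm>\d{2})-(?P<dd>\d{2})-spamming@"
    + re.escape(TRAP_DOMAIN)
    + r"$"
)


def build_trap_address(client_ip, day):
    """Build the fake address served to the Web client at client_ip."""
    return f"{client_ip}_{day:%y-%m-%d}-spamming@{TRAP_DOMAIN}"


def parse_trap_recipient(recipient):
    """Return (harvester_ip, harvest_date), or None for ordinary recipients."""
    match = TRAP_PATTERN.match(recipient.strip().lower())
    if match is None:
        return None
    try:
        # Assumption: two-digit years are read as 20yy.
        day = date(2000 + int(match["yy"]), int(match["mm"]), int(match["dd"]))
    except ValueError:
        return None
    return match["ip"], day


address = build_trap_address("192.0.2.10", date(2003, 11, 17))
print(address)
print(parse_trap_recipient(address))
```

The first function would be called from the page that emits the mailto: link, and the second from a recipient filter on the MTA.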
Though those techniques seem interesting, they will only work with naive spambots, which are probably not the ones used by skilled spammers. The more sophisticated spammers may use open proxies to crawl the net, in which case the dynamically created email address will only help with finding such proxies, and the spammer will keep his anonymity.

2.2 Honeypots and open proxies

One of the main paths used by spammers to reach mail servers goes through open proxies that accept and freely transmit requests. Those open proxies act as screens for the spammers hiding behind them.

So, would it be so difficult to set up a fake open proxy in a honeypot? No, and that's what we are going to look at. By looking at your firewall logs, you'll probably notice attempts to access TCP ports like 3128 or 8080, the proxy ports mentioned earlier.
Many basement-dwelling people, "courageously" hiding behind their monitors and using tools they don't understand, scan the net to map all interesting services. Some of them share their information in public lists of proxies on the Internet (just use Google and search for things like "open proxies list"). After connecting to a TCP port that answers, sending a few packets is enough to tell whether the proxy is open or not (will it accept the request and go anywhere?). What if we set up some honeypots that answer positively to incoming requests? We'll be able to fool some spammers. My favorite honeypot, made by Niels Provos, is called Honeyd [ref 9]. To create a fake relay server, simulating open proxies and an open mail relay, you could use a Honeyd configuration file.
This will ask Honeyd to simulate an OpenBSD 2.9 computer with the IP 192.168.1.66 and three open TCP ports: 25, 3128 and 8080. For each incoming request on those ports, Honeyd will launch the appropriate fake service (sendmail.sh, squid.sh, proxy.sh). If those services want to see what was sent by spammers, they just have to read data from STDIN. To reply to the spammers, they just have to write data to STDOUT (like a classical inetd process). To fool the remote spammer, we'll have to simulate part or all of the dialogue.

As an interesting proof of concept, we will look at a tool called Bubblegum Proxypot [ref 10], which is a sharp, small tool. Its only goal is to fool active spammers by simulating an open proxy. Unlike Honeyd, which may be used to simulate anything you need, it cannot simulate anything else, cannot change its IP stack behavior, etc. Though it's a simpler tool, it quickly teaches us many things about spammers.

Depending on his skill, the spammer will either simply check that the proxy is open, or perhaps try to see if it is working properly. Remember that the spammer's goal is to make money. Thus spammers cannot afford to lose much time sending thousands of emails out for nothing. On my temporary honeypots, I saw both of the above behaviors. With Proxypot, you can choose one of three possible configurations to fool the spammers.
I personally used the option smtp2 and received thousands of spam messages through it.
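The STDIN/STDOUT convention described above for fake services can be illustrated with a small Python sketch of a pretend SMTP dialogue. It is not the sendmail.sh script shipped with Honeyd nor the logic of Proxypot; the function names, the hostname and the reply texts are assumptions, and no message is ever delivered.

```python
def fake_smtp_replies(lines, hostname="mail.example.org"):
    """Yield SMTP-looking replies to a spammer's commands without relaying."""
    yield f"220 {hostname} ESMTP ready"
    in_data = False
    for raw in lines:
        line = raw.rstrip("\r\n")
        if in_data:
            # Swallow the message body until the lone "." terminator.
            if line == ".":
                in_data = False
                yield "250 Message accepted for delivery"
            continue
        verb = line.split(" ", 1)[0].upper()
        if verb in ("HELO", "EHLO"):
            yield f"250 {hostname}"
        elif verb in ("MAIL", "RCPT"):
            yield "250 OK"
        elif verb == "DATA":
            in_data = True
            yield "354 End data with <CR><LF>.<CR><LF>"
        elif verb == "QUIT":
            yield f"221 {hostname} closing connection"
            return
        else:
            yield "500 Command unrecognized"


def serve(stdin, stdout):
    """inetd-style loop: commands arrive on stdin, replies go to stdout."""
    for reply in fake_smtp_replies(stdin):
        stdout.write(reply + "\r\n")
        stdout.flush()


# Hypothetical session used only to show the dialogue.
session = [
    "HELO client.example.net\r\n",
    "MAIL FROM:<sender@example.com>\r\n",
    "RCPT TO:<victim@example.net>\r\n",
    "DATA\r\n",
    "Subject: test\r\n",
    ".\r\n",
    "QUIT\r\n",
]
for reply in fake_smtp_replies(session):
    print(reply)
```

Under an inetd-like launcher, calling `serve(sys.stdin, sys.stdout)` would run the same dialogue against a live connection; a real honeypot would also log every line it receives.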
Credits
Thanks to Niels Provos for his ideas and reviewing.
About the Author
View more articles by Laurent Oudot on SecurityFocus.
[ref 1] Monty Python, The SPAM sketch
[ref 2] The Infamous Monty Python Spam Skit, in streaming RealVideo
[ref 3] Uri Raz, How do spammers harvest email addresses?
[ref 4] Snort Intrusion Detection System
[ref 5] Lance Spitzner, "Honeypots, tracking the hackers", 2002
[ref 6] http://diveintomark.org/archives/2003/02/26/how_to_block_spambots_ban_spybots_and_tell_unwanted_robots_to_go_to_hell
[ref 7] Wpoison, a CGI to annoy harvesters with spam bots
[ref 9] Niels Provos, Honeyd the daemon to build honeypots
[ref 10] Proxypot, a fake proxy daemon to fool spammers
[continued in Part 2]
