Security: June 2006 Archives

Way back in 1999, I was looking for a packet analyzer. I was familiar with EtherPeek for the Macintosh from a few years before, and I found that the AG Group was producing EtherPeek for Windows, too. The AG Group is now WildPackets, and they are exceedingly helpful to anyone that has to troubleshoot data networks. AG Group always offered some cool network freebies: IP Subnet Calculator, netTools and a great protocol reference chart.

One of their people, J. Scott Haugdahl, has an excellent book, Network Analysis and Troubleshooting, which offers a bottom-up review of the OSI 7-layer model . (Which one are you: All People Say They Need Data Processing or Please Do Not Throw Sausage Pizza Away?)

I liked EtherPeek and the book so much that I bought both and paid out of my own pocket even though my job was managing the network. Of course, this was back in the day when running tcpdump required you to know your IRQ, DMA and chip set (i.e. DEC Tulip). My job at the time was helping change a campus network from Netware to TCP/IP when Windows and Macintosh didn't even install a TCP/IP stack by default. We went from three-and-a-half network protocols (two different Netware frame types) to one and a half (we still had a couple of AppleTalk issues.) Each computer was on the Internet with a public IP address and no firewall. The ping of death still worked against most machines, and we also got hit with Smurf and Trinoo attacks that would disrupt all online activity.

WildPackets makes some excellent packet analyzers for wired and wireless networks. Now their base-level product is free: OmniPeek Personal. While I have been using Ethereal since my old version of EtherPeek became obsolete because it was on my ancient Dell laptop, I missed EtherPeek because it was the first packet analyzer I really got to know well. I could create filters and find exactly what I needed to find. EtherPeek also had good summary statistical functions, which could tell me who was producing the most traffic on my networks. Omnipeek Personal is better than my copy of EtherPeek was because it includes some expert analysis about bad packets and delayed response times. It also produces HTML statistics just like the original, and it has a better interface than Ethereal, using color to show differences between packets.

For those of you that underestimate the power of color, try printing a Google or Mapquest map in black and white and one in color and see which one is easier to read while you're driving. OmniPeek makes it easier to read your packet stats and is easier on your eyes than Ethereal. It's also supposed to do wireless captures -- I'll update when I get a compatible chipset wireless card.

In dealing with financial activities, our law enforcement/intelligence community is someplace between Get Smart and Mission:Impossible, depending on which story you read in the newspaper.

Eighty-one people in 17 states used a California woman's Social Security number, according to the AP on June 18, 2006. You'd think the IRS or Social Security Administration would notice that 81 jobs falls outside the normal range of jobs. Maybe even past 3 standard deviations above the mean number of jobs that people hold in a given time period.

"They knew what was happening but wouldn't do anything," said Schmierer, 33, a housewife in this San Francisco suburb. "One name, one number; why can't they just match it up?"

Then on June 23, the New York Times breaks a story about how the Treasury is overseeing a CIA program that monitors data going through the Society for Worldwide Interbank Financial Telecommunications. What do these reporters think FinCen does? They look for the same kind of activity that the NY Times-revealed program does, except they've been doing it a lot longer than the CIA.

A former compliance officer for a major brokerage once told me that you might get away with insider trading once. After that the investigators would know the people with whom you attended kindergarten and might be in a position to give you insider information. That's link analysis.

The last time I bought AMEX traveler's cheques it took half an hour because of the paperwork required by the bank to satisfy post-9/11 financial tracking regulations, so it doesn't surprise me that the intelligence community is monitoring international transactions. (The paperwork is so tedious that I'm going to carry cash more often than not.)

Our government can access tons and tons of data about every transaction that travels across our borders, but without efficient algorithms for flagging suspicious activity, it will all be useless. Placing every tax return and W-2 statement into a single data warehouse would be academic. Yahoo and Google probably generate more data in a week than all our tax returns and W-2s annually. You would think the Social Security Administration would be able to see the fraud in their systems. The ACLU wouldn't even be able to argue that our government isn't allowed to look at its own data.

Once you loaded the data, a few queries could spit out suspicious Social Security Number users in a day or two. Again, the budget for this would be under a million or two.

Counting Web Attacks

| | Comments (0) | TrackBacks (0)

I see a lot of 404 errors in my Apache logs. A 404 error is a file not found, e.g. someone has requested a file that's not there. Often it means I made a typo in a configuration or HTML someplace. More often, it means someone someplace is probing my server for weak web applications.

Linux and open source software have made it easy to add web applications running under Apache and MySQL. The problem is as more and more sites start using these cool web applications, hackers are able to find holes in them. The developers fix the holes and release patches, but many webmasters don't apply the patches. Thus I see probes like the one below in my Apache logs:

212.83.253.101 - - [19/Jun/2006:09:24:49 -0400] "GET /a1b2c3d4e5f6g7h8i9/nonexistentfile.php HTTP/1.0" 404 320 "-" "-"
212.83.253.101 - - [19/Jun/2006:09:24:49 -0400] "GET /adxmlrpc.php HTTP/1.0" 404 294 "-" "-"
212.83.253.101 - - [19/Jun/2006:09:24:49 -0400] "GET /adserver/adxmlrpc.php HTTP/1.0" 404 303 "-" "-"
212.83.253.101 - - [19/Jun/2006:09:24:49 -0400] "GET /phpAdsNew/adxmlrpc.php HTTP/1.0" 404 304 "-" "-"
212.83.253.101 - - [19/Jun/2006:09:24:50 -0400] "GET /phpadsnew/adxmlrpc.php HTTP/1.0" 404 304 "-" "-"
212.83.253.101 - - [19/Jun/2006:09:24:50 -0400] "GET /phpads/adxmlrpc.php HTTP/1.0" 404 301 "-" "-"
212.83.253.101 - - [19/Jun/2006:09:24:50 -0400] "GET /Ads/adxmlrpc.php HTTP/1.0" 404 298 "-" "-"
212.83.253.101 - - [19/Jun/2006:09:24:50 -0400] "GET /ads/adxmlrpc.php HTTP/1.0" 404 298 "-" "-"
212.83.253.101 - - [19/Jun/2006:09:24:50 -0400] "GET /xmlrpc.php HTTP/1.0" 404 292 "-" "-"
212.83.253.101 - - [19/Jun/2006:09:24:51 -0400] "GET /xmlrpc/xmlrpc.php HTTP/1.0" 404 299 "-" "-"
212.83.253.101 - - [19/Jun/2006:09:24:51 -0400] "GET /xmlsrv/xmlrpc.php HTTP/1.0" 404 299 "-" "-"
212.83.253.101 - - [19/Jun/2006:09:24:51 -0400] "GET /blog/xmlrpc.php HTTP/1.0" 404 297 "-" "-"
212.83.253.101 - - [19/Jun/2006:09:24:51 -0400] "GET /drupal/xmlrpc.php HTTP/1.0" 404 299 "-" "-"
212.83.253.101 - - [19/Jun/2006:09:24:52 -0400] "GET /community/xmlrpc.php HTTP/1.0" 404 302 "-" "-"
212.83.253.101 - - [19/Jun/2006:09:24:52 -0400] "GET /blogs/xmlrpc.php HTTP/1.0" 404 298 "-" "-"
212.83.253.101 - - [19/Jun/2006:09:24:52 -0400] "GET /blogs/xmlsrv/xmlrpc.php HTTP/1.0" 404 305 "-" "-"
212.83.253.101 - - [19/Jun/2006:09:24:52 -0400] "GET /blog/xmlsrv/xmlrpc.php HTTP/1.0" 404 304 "-" "-"
212.83.253.101 - - [19/Jun/2006:09:24:52 -0400] "GET /blogtest/xmlsrv/xmlrpc.php HTTP/1.0" 404 308 "-" "-"
212.83.253.101 - - [19/Jun/2006:09:24:53 -0400] "GET /b2/xmlsrv/xmlrpc.php HTTP/1.0" 404 302 "-" "-"
212.83.253.101 - - [19/Jun/2006:09:24:53 -0400] "GET /b2evo/xmlsrv/xmlrpc.php HTTP/1.0" 404 305 "-" "-"
212.83.253.101 - - [19/Jun/2006:09:24:53 -0400] "GET /wordpress/xmlrpc.php HTTP/1.0" 404 302 "-" "-"
212.83.253.101 - - [19/Jun/2006:09:24:53 -0400] "GET /phpgroupware/xmlrpc.php HTTP/1.0" 404 305 "-" "-"

This is a probe, not an attack. There's nothing illegal about requesting files that aren't on my server, is there? But if I touch /var/www/html/adxmlrpc.php, we may find out what happens next. Note that most of these requests, while probing for different applications, share one thing in common: RPC on PHP.

The below is chart of probes by date and request on this webserver. There's not enough space to list each one as it corresponds to the color. (MS Excel shows me data point details info on mouseover in my pivot table.)

Attacks by Application

Data warehousing and data marts would be simple to construct if only the data were in a standard format. Five years from now, businesses will take OLAP for granted. (OLAP is a fancy way of saying we're going to automate the sums and averages of your sales data over time so you don't have to do all that stuff in Excel any more.) Five to ten years from now, businesses will live or die by their data mining algorithms. (I classify DM as a step above standard OLAP.) Before this can happen, the data have to be available in a usable form.

I come from an information security background, thus I spend far too much time poring over computer logs: web server access logs, firewall logs, Windows event logs, not to mention /var/log/*. I have learned lots of stupid log tricks, like using logwatch, grep (my favorite), Snare to send Windows logs to syslog, and now, Microsoft's free Logparser tool. Logparser has poor documentation but will certainly pay you back for time taken to learn to use it. There's even a non-Microsoft site dedicated to logparser.

Note: Syslog does not store data in 3NF rows. If you want to be able to sort by fields with destPort, sourcePort, sourceIP, destIP, without doing text search, you'll be doing a LOT of ETL work.

This week I was thinking about replacing my firewall/router (a Netopia R9100 with the hardware VPN upgrade that I trade off with a Linksys WRT-54GS (v3) when I'm not paranoid about using wireless.) And yes, I'm not supposed to tell you that, but it doesn't really make a difference if we're both using nmap. So I looked at firewall vendors websites to learn what I could about logging capabilities. I'm slightly less concerned about security in my home lab than I am about collecting data on attacks. Firewalls have been around for over ten years now, so you'd think they would have logging down.

Watchguard: several logging options, including syslog and XML, SNMP costs extra.

Juniper/NetScreen: syslog, SNMP, NetIQ (If I feel like paying for that, too.)

Checkpoint: "Eventia Reporter™ is a complete reporting system that delivers in-depth network security activity and event information from Check Point log data." This means I can look at CheckPoint logs, but I can't correlate them to anything else. This Checkpoint vs. Cisco page is also interesting.

SonicWall: "ViewPoint®, Local Log, Syslog, WebTrends" I can pay extra for SonicWall's "Viewpoint" product, but I still can't correlate SonicWall logs to any other logs. One SonicWall includes a "secure" switch in their firewall: I would love to see what happens when I try an arp spoof. (If I wanted a switch, I would buy one.)

Cisco PIX: SNMP, Syslog, and AAA ("Authentication, Authorization, and Accounting Support") It does Cisco logging. It also has a CLI. (Command-Line Interface.) Unless Cisco starts giving me free hardware, I'm not sure why I'd use a PIX. If I blow a command, my network is not secure. A CLI is fine when it's obvious if a command is working or not, as with routing, but with firewalls, it makes me nervous. Then again, you should test every port after entering a rule change on your firewall.

Microsoft ISA Server: "ISA Server 2004 provides detailed security and access logs in standard data formats, such as delimited text files, Microsoft SQL Server databases, or SQL Server 2000 Desktop Engine (MSDE) databases."

I don't even like software firewalls, but Microsoft makes it easy for me. At $1,500 plus $250 for decent software, Watchguard is more expensive than ISA server. Checkpoint and Juniper won't even tell me how much their products cost. Sonicwall, Watchguard, and ISA Server are all priced on CDW.

If firewall data are this disparate, I can't imagine what a pain it must be to build data warehouses with data from other sources. Current firewall products seem to create their own silos and make it difficult to track intruders across a network rather than just at the perimeter.

Using syslog, MS SQL 2005, SQL Server Analysis Services, and MS Excel, I can build a cube with my firewall log violations and then import the cube into Excel and produce pivot tables. While this might seem more complicated than it needs to be, I could produce a daily scorecard of attacks. The only catch is that I need a firewall that logs to SQL server or a syslog to SQL server connector. The syslog => SQL connection would be tough because my router/firewall doesn't do uniform syslog notifications. I know enterprise-level firewalls do much better logging, like the Watchguard X-series which I was fond of just because I could make them do almost anything. The last time I checked, though, they still cost $1,500 for the base model plus $500 for the appropriate software.

With the Watchguard's new XML logging, I could create a SQL Server Integration Services package to import the data regularly. From there, I could get SQL Server Analysis services to process my cube each night. Then I use Microsoft Sharepoint's Scorecard or OLAP web part to display statistics. Best of all, I wouldn't have to mess with doing my own manual extract-transform-load (ETL) of my router log data.

The graph below represents a simple count of attacks by port on my router. Port 0 corresponds to ICMP. (I don't respond to ping requests.) The rest of the ports are closed, except for port 80, which you're using now. I ban a few IPs on port 80 because they won't stop posting junk trackbacks onto my blog. The ports are in alphabetical order rather than numerical order because I must store them in text fields rather than numerical fields in the database. If the port numbers aren't text then SSAS will OLAP them and I'll end up with the sum of all ports, which is nonsense but nevertheless might make a good statistic for MBA-types. While the graphic may not be all that impressive, the scalability is. Using SQL and SSAS, I could track probes and attacks on hundreds of firewalls at a time, track trends over time, and even predict the level of future probes.

Probes by Port

Being assigned a data warehousing/data mining project for class sounds like fun, but where am I supposed to get a data set? I can buy a database of all area codes and exchanges with latitude and longitude, but I would still have to simulate a hundred million records to address scalability and query optimization issues. Then I could find out if my estimations of the size of records is within a factor of ten, but the networks I see still wouldn't be "real" and I would have no idea if that's what real social networks looked like. (As an undergrad English Lit major, I was reading 18th Century epistolary novels instead of taking Data Structures like my Computer Science major classmates. The sad part is that Data Strutures would have been more interesting.)

Fortunately, data magically appear on my Linux box every day.

Each morning at four am, logwatch runs on my Fedora Core 4 (Red Hat Linux) box. It tells me how many times nonexistent files on my webserver have been requested, and how many router firewall violation attempts have been logged. It also tells me how many times Apache logged a "method not allowed" 405 code. I have several daily log files that give me useful information on attacks. The problem is that there are so many attacks that if I banned every IP that looked for a web application hole or probed a port I wouldn't have time for anything else.

So it makes sense to look for attack source IP (Internet Protocol) addresses that probe my router AND request holes in web apps. To do this I need three files: my router log from syslog; and two greps of all my Apache logs. (grep -h will suppress file names at the beginning of each line) looking for 404 and 405 errors. This gives me three tables, from which I can do inner joins on source IP in each. Of course, I have do do some tedious data cleanup to get the text log files into Excel and from there Access. (I always underestimate the time it takes to clean up data.) From Access, I'm going to go to SQL 2005, Analysis Services, and build a cube. From there I should be able to "see" the attacks using Pivot Tables in Microsoft Excel.

If I see a source IP in my router log and Apache error logs, then it's probably worth banning. Correlating IP addresses to identify those involved in multiple methods of attack takes me from hundreds of IP addresses down to six.

About this Archive

This page is a archive of entries in the Security category from June 2006.

Security: May 2006 is the previous archive.

Security: April 2007 is the next archive.

Find recent content on the main index or look in the archives to find all content.