Checking for intrusions in Apache Web server log analysis

By Brad Causey | Apr 22, 2009

It seems every new device, appliance and even desktop software program has the capability to generate logs or text-based data. There are a number of challenges associated with managing the onslaught of log data.

The first is centrally storing and gathering these logs; luckily, there are a number of available products for this. Logs are usually shipped off to a syslog, log management or SIM system that is centrally located in the network. So the big question is: How do you sift through Web server log data and find relevant security information?

Although there are many open source and commercial applications that perform some level of log analysis, one thing is usually common among them: regular expressions (regex). A regular expression is a string of characters that describes a search pattern, allowing nearly any scripting language or search tool to perform fast, advanced searches against large amounts of text data. There are a few variations of regex formats, and the ones most commonly used by scripting languages are called Perl-derivative regular expressions. These include the regex formats for the .NET Framework, Python, Java, JavaScript and, of course, Perl. By using this type of regex in combination with any scripting language or search tool, you can quickly and efficiently parse large amounts of data for meaningful information.

One of the most common log formats we tend to see issues in is Apache, or httpd. These Web server logs tend to hide a number of secrets that are vital to find, such as attack attempts, successful attack signatures, and even precursor activities to an impending attack.

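We will focus on the use of regex with egrep. Egrep (equivalent to grep -E) uses a very simple syntax for searching files and is readily present on nearly every Unix-like operating system in common environments today. (Windows users can download a free version from a variety of sources.)

Keep in mind that egrep uses POSIX extended regular expressions. The core constructs, such as alternation, grouping, character classes and repetition, carry over to most programs and scripting languages that support regex, although Perl-derivative engines add shorthand and features that egrep does not understand.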
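To illustrate that portability, here is a minimal sketch of an egrep-style filter in Python. The function name grep_lines and the second sample entry are hypothetical, made up for illustration; the pattern itself would work unchanged with egrep.

```python
import re

def grep_lines(pattern, lines, ignore_case=False):
    """Return the lines that contain a match for pattern, as egrep would."""
    flags = re.IGNORECASE if ignore_case else 0
    compiled = re.compile(pattern, flags)
    # search() matches anywhere in the line, which is egrep's default behavior
    return [line for line in lines if compiled.search(line)]

# The first entry is the article's sample; the second is an invented example.
sample = [
    '10.10.10.10 - frank [10/Oct/2007:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326',
    '10.10.10.11 - - [10/Oct/2007:13:56:02 -0700] "GET /missing.html HTTP/1.0" 404 209',
]

# Alternation with (a|b) is written the same way in egrep and in Python
hits = grep_lines(r'" (403|404) ', sample)
for line in hits:
    print(line)
```

The same search from a shell would be egrep '" (403|404) ' access_log.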

For this article, we'll look at Apache logs. But the concepts applied via egrep, regex and httpd logs can be used across hundreds of other platforms, tools and log types. Understanding what is dangerous and how to search for it is a great step toward recognizing security issues within your organization.

Step one: Web log format

In order to create expressions to analyze the contents of these logs, we need to understand the log entry structure. Apache writes a server access log, usually in /etc/httpd/logs and typically named something like access_log.

You can configure httpd (Apache) to send these logs to a syslog or SIM system; if so, your log format may be different from the default. Apache stores newline-delimited entries, one per request, in access_log in the following format:

10.10.10.10 - frank [10/Oct/2007:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326

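Let's break this down section by section. The first value, 10.10.10.10, is simply the client IP address; if HostnameLookups is enabled, the client's hostname is logged in its place. The hyphen that follows is a placeholder for the client identity reported by identd, which is rarely available, and frank is the user ID of the client as determined by HTTP authentication. Next, we have the date and time stamp, 10/Oct/2007:13:55:36 -0700. This is obviously important for correlation purposes.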
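Correlating entries across systems is easier once the time stamp is a real date object. The following is a rough sketch, not part of Apache itself, that parses the sample time stamp and converts it to UTC. The names APACHE_TIME_FORMAT and parse_apache_time are hypothetical, and the %b directive assumes an English locale for month abbreviations.

```python
from datetime import datetime, timezone

# Day/Month/Year:Hour:Minute:Second followed by a UTC offset such as -0700
APACHE_TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"

def parse_apache_time(stamp):
    """Turn an access_log time stamp into a timezone-aware datetime."""
    return datetime.strptime(stamp, APACHE_TIME_FORMAT)

local = parse_apache_time("10/Oct/2007:13:55:36 -0700")

# Normalizing to UTC lets entries from servers in different zones be compared
utc = local.astimezone(timezone.utc)
print(utc.isoformat())  # 2007-10-10T20:55:36+00:00
```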

Next, we have the HTTP request line. This is especially helpful because it gives us details about what request was made by the client. In this case, GET /apache_pb.gif HTTP/1.0 indicates a GET request, made over HTTP/1.0, for the image file named apache_pb.gif that is located in the root of the httpd Web server's directory.
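As a small illustration, the quoted request can be split into its three parts. This is only a sketch: split_request_line is a hypothetical helper, and it assumes a well-formed request with exactly two single spaces. Anything else returns None, which can itself be worth a closer look during a review.

```python
def split_request_line(request):
    """Split 'METHOD target PROTOCOL' into a dict, or return None if malformed."""
    parts = request.split(" ")
    if len(parts) != 3:
        # Malformed or unusual requests do not fit the three-part shape
        return None
    method, target, protocol = parts
    return {"method": method, "target": target, "protocol": protocol}

request = split_request_line("GET /apache_pb.gif HTTP/1.0")
print(request)
```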

Finally, the server status code, 200, indicates the request was completed successfully. The last bit of information, 2326, is simply the size in bytes of the object returned to the client for that request, not counting the response headers.
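Putting the pieces together, the whole entry can be described with a single regular expression. The sketch below assumes the default format shown above. The names CLF_PATTERN and parse_entry, along with the group names, are hypothetical, and the pattern does not handle requests that contain escaped double quotes.

```python
import re

# One named group per field of the sample entry, in order
CLF_PATTERN = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)$'
)

def parse_entry(line):
    """Return the fields of one access_log entry as a dict, or None if no match."""
    match = CLF_PATTERN.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    fields["status"] = int(fields["status"])
    # Apache logs "-" instead of a number when no content was returned
    fields["size"] = 0 if fields["size"] == "-" else int(fields["size"])
    return fields

entry = parse_entry(
    '10.10.10.10 - frank [10/Oct/2007:13:55:36 -0700] '
    '"GET /apache_pb.gif HTTP/1.0" 200 2326'
)
print(entry["host"], entry["user"], entry["status"], entry["size"])
```

Entries that return None do not fit the expected format, which usually means either a customized LogFormat or a line that deserves manual inspection.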

 
 
