Reversing firmware part 1
This article explores several strategies for reversing firmware, with examples, and closes with some best practices.
Embedded systems and firmware
Embedded systems are everywhere: in mobile phones, cameras, TVs, smart cards, and other automated devices. They have become an integral part of our lives and have made them more comfortable and easier.
But how do these embedded devices work? They run firmware, which is a set of instructions executed on the hardware's processor. Firmware is typically a small binary image, often built around a customized or stripped-down operating system such as embedded Linux or Windows. It is designed and developed to perform a predetermined set of functions. A detailed introduction to firmware and its architecture is beyond the scope of this article. We are more interested here in analyzing firmware from a security standpoint.
For this article, we'll be using the following tools:
- binwalk
- dd, a standard Unix utility for copying raw data, used here to carve sections out of a firmware file
- Firmware Mod Kit
Firmware architecture
Most firmware architectures fall into these categories:
- Full firmware: This mostly consists of an OS (Linux, Windows, etc.) together with a kernel, bootloader, libraries, userland utilities such as BusyBox, and the applications developed on top of them.
- Partial firmware: One or more of the above components is missing. The application may run directly with kernel privileges, may use a custom OS, or the image may contain only associated files.
Commonly encountered building blocks include:
- Popular OS: Linux, Windows, Cisco IOS, Symbian, etc.
- Popular filesystem formats: CramFS, SquashFS, etc.
- Popular bootloaders: U-Boot, RedBoot, etc.
- Compression and archive formats: Zip, LZMA, Tar, etc.
After unpacking the firmware we may find the following: bootloaders, kernels, filesystem images, user apps, and web servers. We need to extract the filesystem images in order to analyze them.
Analysis
We'll use binwalk here. It comes as part of a BackTrack 5 (BT5) installation by default. If it is not in your distro, please update BackTrack as per the instructions on the BackTrack website. Binwalk is a tool for examining binary files. It searches a file for known signatures and patterns and reports what it finds; however, the results need to be analyzed for correctness, as the tool may produce a lot of false positives. It lists the starting offset of each section it recognizes in the firmware, along with details such as size and compression or encryption type. A sample:
Here we can see the compression format or archive type used (LZMA), the size, and other properties. Binwalk can also report filesystem formats such as SquashFS, CramFS, NTFS, etc. The most popular ones in firmware are SquashFS and CramFS. We need to unpack the archives to examine them further, which may give us information about bootloaders, kernels, web servers, filesystems, etc. Let's try the "strings" and "hexedit" tools on the above binary file.
The "strings" command lists the sequences of printable characters it finds in a file. Since it returns readable strings here, the file is most likely not encrypted:
Using hexedit, we may try to identify headers:
Unfortunately, no headers are identified. I wanted to show the above examples on a simple firmware file because sometimes the analysis does not produce any leads. The reconnaissance phase may fail in that sense; however, the key takeaway is that the above file is not encrypted, which may be a security issue.
Now let's move to a more realistic firmware file, the D-Link router firmware. This can be downloaded from the D-Link website. Let's examine this file and hope it will give us more results.
Let's run the "strings" command again, as strings <filename>. It may give us some clues. The output suggests that the code was written in C/C++, and the fact that it is readable again indicates that the file is not encrypted. Unfortunately, we are not able to get the bootloader information due to some error. We checked the rest of the output for anything else of interest but found nothing beyond the C/C++ hint. Okay, let's move on.
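To make the header check concrete, here is a minimal Python sketch of what a hex viewer shows for the first bytes of a file. It is not how hexedit itself is implemented; the function name `hexdump` and the sample bytes are hypothetical and chosen only for illustration.

```python
def hexdump(data: bytes, width: int = 16) -> str:
    """Render bytes as an offset, a hex column, and a printable ASCII column."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Non-printable bytes are shown as dots, as most hex viewers do.
        ascii_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part:<{width * 3 - 1}}  |{ascii_part}|")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical sample, not taken from the firmware in this article:
    # a SquashFS magic value followed by padding and some text.
    sample = b"hsqs" + bytes(12) + b"ABC"
    print(hexdump(sample))
```

A recognizable magic value or vendor string near offset 0 is the kind of lead a header inspection is looking for; when nothing stands out, as above, other tools have to provide the clues.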
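The idea behind the strings check can be sketched in a few lines of Python. This is an approximation, assuming that "printable" means ASCII bytes 0x20 to 0x7e and that runs shorter than four characters are ignored; the function name `extract_strings` is hypothetical.

```python
import re
from typing import List


def extract_strings(data: bytes, min_len: int = 4) -> List[str]:
    """Return runs of printable ASCII that are at least min_len bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


if __name__ == "__main__":
    # Hypothetical bytes standing in for a firmware image.
    blob = b"\x00\x01GET /index.html\x00ab\x00\xffroot:x:0\x00"
    for text in extract_strings(blob):
        print(text)
```

If a large image yields many meaningful words, paths, and messages, it is a reasonable hint that the data is stored in the clear. An encrypted or compressed region tends to yield only short, random-looking runs.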
Let's look at the results given by hexedit. It also doesn't immediately produce anything interesting:
Let's run binwalk on the file:
Binwalk provides us with some interesting information, though we need to be careful about false positives. The compression type is LZMA, and the filesystem is packed with SquashFS. The results show a Realtek firmware header with a creation date in the past, and its image size of 1543680 bytes (about 1.47 MiB) is slightly less than the total file size of 1543708 bytes. The 28-byte difference would be consistent with a small header in front of the image. All of these indicate that it is a valid result. The information about encryption seems to be a false positive, as we already saw that we were able to read the strings in clear text. The SquashFS filesystem seems to be valid, since its size is well below the actual file size and its creation date is in the past.
We'll now use a tool called "dd," a standard Unix utility that copies raw blocks of data and is often used in forensic investigations to extract a chunk of data from a disk. Here we use it to carve the filesystem out of the firmware file.
Running binwalk on the carved file should now report the SquashFS filesystem at offset 0, which indicates that we have successfully carved out the filesystem.
Alternatively, we can use another tool called Firmware Mod Kit, which can be used to extract various filesystem types such as SquashFS and CramFS. It can be downloaded from:
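The two ideas in this step, signature scanning and sanity-checking a hit, can be sketched as follows. This is only an illustration and not binwalk's implementation: the signature table holds a handful of well-known magic values, and the names `SIGNATURES`, `scan`, and `plausible` are hypothetical. The offset used in the plausibility check is also an assumption, since the article does not state it.

```python
from typing import Dict, List, Tuple

# A few well-known magic values; a real signature database is far larger.
SIGNATURES: Dict[str, bytes] = {
    "SquashFS (little endian)": b"hsqs",
    "SquashFS (big endian)": b"sqsh",
    "gzip": b"\x1f\x8b\x08",
    "LZMA (common default properties)": b"\x5d\x00\x00",
    "uImage header": b"\x27\x05\x19\x56",
    "CramFS (little endian)": b"\x45\x3d\xcd\x28",
}


def scan(data: bytes) -> List[Tuple[int, str]]:
    """Return (offset, name) for every occurrence of a known magic value."""
    hits = []
    for name, magic in SIGNATURES.items():
        start = data.find(magic)
        while start != -1:
            hits.append((start, name))
            start = data.find(magic, start + 1)
    return sorted(hits)


def plausible(offset: int, claimed_size: int, file_size: int) -> bool:
    """A hit is more credible when the size it claims fits inside the file."""
    return offset >= 0 and claimed_size > 0 and offset + claimed_size <= file_size


if __name__ == "__main__":
    # Hypothetical blob: 28 bytes of header, then a SquashFS magic value.
    blob = bytes(28) + b"hsqs" + bytes(16) + b"\x1f\x8b\x08"
    for offset, name in scan(blob):
        print(f"{offset:<10d}{name}")
    # Sizes quoted in the article; the offset of 28 is an assumption.
    print(plausible(28, 1543680, 1543708))
```

Short magic values match by accident in large files, which is one reason false positives are common. Checks such as "does the claimed size fit in the file" and "is the timestamp in the past" help separate real hits from noise.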
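The carving step amounts to copying everything from a given offset into a new file, comparable to a dd invocation with `if=`, `of=`, `bs=1`, and `skip=<offset>`. The Python sketch below shows the same operation; the function name `carve`, the file names, and the offset of 28 are hypothetical, and the demo builds its own small input file rather than using the firmware from the article.

```python
import os
import tempfile
from typing import Optional


def carve(src_path: str, dst_path: str, offset: int,
          length: Optional[int] = None, chunk_size: int = 64 * 1024) -> int:
    """Copy length bytes (or everything to EOF) starting at offset; return bytes copied."""
    copied = 0
    remaining = length
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        src.seek(offset)
        while remaining is None or remaining > 0:
            want = chunk_size if remaining is None else min(chunk_size, remaining)
            chunk = src.read(want)
            if not chunk:
                break
            dst.write(chunk)
            copied += len(chunk)
            if remaining is not None:
                remaining -= len(chunk)
    return copied


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        firmware = os.path.join(workdir, "firmware.bin")
        carved = os.path.join(workdir, "firmware.squash")
        # Hypothetical image: a 28-byte header followed by SquashFS data.
        with open(firmware, "wb") as handle:
            handle.write(b"H" * 28 + b"hsqs" + b"payload")
        print(carve(firmware, carved, offset=28), "bytes carved")
        with open(carved, "rb") as handle:
            # The magic value should now sit at offset 0 of the carved file.
            print(handle.read(4))
```

Checking that the expected magic value sits at the very start of the output is a quick way to confirm that the offset was right before handing the file to an extraction tool.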
https://code.google.com/p/firmware-mod-kit/
This toolkit contains various tools, such as unsquash_all, uncramfs_all, extract-sh, build-firmware.sh, firmware-modification-kit.sh, etc., for different types of tasks. Let's try to extract the files from the filesystem we carved out above, firmware.squash. Since it is a SquashFS filesystem, we'll use the tool called unsquash_all.sh:
Let's check the folder where the entire filesystem has been extracted:
Now we can browse all the files and folders; the most interesting ones may be under "etc," such as shadow or passwd files. Here are the contents of the passwd file:
There is a wealth of information in /etc/rc, which contains startup scripts:
One interesting thing we can see is that the firmware starts some web services from the "webs" binary with root privileges. Let's run the "strings" command on this binary to identify hardcoded literals. We will redirect the output into a text file for easy viewing:
squashfs-root/bin# strings webs >> out.txt
The contents of out.txt give a wealth of information. They include paths and settings that point to folders and subfolders worth inspecting, such as:
/lan/ethernet/ip
/var/index
security/firewall/httpAllow
/sys/language
bin/date
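Picking path-like entries out of a long strings dump by eye is tedious, so a small filter helps. The sketch below assumes a simple heuristic (segments of letters, digits, dots, underscores, and hyphens separated by slashes); the names `PATH_LIKE` and `path_like` are hypothetical, and the heuristic will miss or over-match some entries.

```python
import re
from typing import Iterable, List

# One or more "segment/" parts followed by a final segment, optional leading slash.
PATH_LIKE = re.compile(r"^/?(?:[A-Za-z0-9_.-]+/)+[A-Za-z0-9_.-]+$")


def path_like(lines: Iterable[str]) -> List[str]:
    """Keep only the lines that look like filesystem or configuration paths."""
    return [line.strip() for line in lines if PATH_LIKE.match(line.strip())]


if __name__ == "__main__":
    # Entries quoted in the article, mixed with hypothetical noise.
    sample = [
        "/lan/ethernet/ip",
        "hello world",
        "security/firewall/httpAllow",
        "%s",
        "bin/date",
    ]
    for entry in path_like(sample):
        print(entry)
```

In practice the input would be the lines of out.txt, for example `path_like(open("out.txt"))`.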
Potential user inputs:
Here %s is a printf-style format specifier that marks where a string value, possibly supplied by the user, is inserted.
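Strings that carry a %s conversion can be collected automatically as candidates for closer review. The following Python sketch uses a simplified pattern for printf-style conversions; it is a heuristic, the names `SPEC`, `format_specifiers`, and `string_sinks` are hypothetical, and the sample lines are made up rather than taken from the webs binary.

```python
import re
from typing import Iterable, List

# "%%" is matched first so that an escaped percent sign is never
# mistaken for the start of a conversion.
SPEC = re.compile(
    r"%%|%[-+#0]*\d*(?:\.\d+)?(?:hh|h|ll|l|L|z|j|t)?[diouxXeEfgGcsp]"
)


def format_specifiers(text: str) -> List[str]:
    """Return the printf-style conversions found in text, ignoring '%%'."""
    return [spec for spec in SPEC.findall(text) if spec != "%%"]


def string_sinks(lines: Iterable[str]) -> List[str]:
    """Return the lines that contain at least one string conversion such as %s."""
    return [
        line for line in lines
        if any(spec.endswith("s") for spec in format_specifiers(line))
    ]


if __name__ == "__main__":
    # Hypothetical literals of the kind a strings dump may contain.
    sample = ["login %s", "count=%d", "50%% done", "plain text"]
    print(string_sinks(sample))
```

A hit does not prove a vulnerability. It only marks a place where externally supplied text may be formatted, which is worth tracing in the binary.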
Remote telnet on the non-standard port 5457:
It is also possible to modify the firmware file and repackage it, which we will explore in the next part of this article.
Conclusion
Some general recommendations for secure firmware:
- Encryption: It makes reverse engineering of the firmware considerably harder. The firmware might be stored in encrypted form and only decrypted when it is to be executed, or it might be decrypted during the firmware update process.
- Signing: This ensures that the firmware has not been corrupted or modified in transit. It is important because a malicious user must not be able to alter the firmware originally delivered by the firmware producer.
- No hard-coded credentials: Hard-coded accounts can serve as backdoors to devices.
- Code obfuscation: It makes static and runtime analysis more difficult.
- Restricted distribution: Make it difficult for unauthorized users to obtain firmware updates, and keep update images as little exposed as possible.
- If the device is capable of being networked, ensure that no unnecessary services are running and that it can also alert and log when firmware has been updated.
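The hard-coded credentials recommendation can be checked on an extracted filesystem by auditing its passwd file. The Python sketch below assumes the usual seven-field, colon-separated passwd layout; the function name `audit_passwd` and the sample entries are hypothetical and are not the contents of the firmware examined above.

```python
from typing import List, Tuple


def audit_passwd(text: str) -> List[Tuple[str, str]]:
    """Return (account, finding) pairs for risky entries in passwd-format text."""
    findings = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) < 7:
            continue  # not a well-formed passwd entry
        name, password, uid = fields[0], fields[1], fields[2]
        if password == "":
            findings.append((name, "empty password"))
        elif password != "x" and not password.startswith(("*", "!")):
            # "x" defers to the shadow file; "*" and "!" mark locked accounts.
            findings.append((name, "password hash stored in passwd"))
        if uid == "0" and name != "root":
            findings.append((name, "UID 0 account other than root"))
    return findings


if __name__ == "__main__":
    # Hypothetical entries for illustration only.
    sample = (
        "root:$1$abcdefgh$XXXXXXXXXXXXXXXXXXXXXX:0:0:root:/:/bin/sh\n"
        "nobody:x:99:99:Nobody:/:/bin/false\n"
        "admin::0:0:admin:/:/bin/sh\n"
    )
    for account, finding in audit_passwd(sample):
        print(f"{account}: {finding}")
```

Any account that ships inside the image with a fixed hash or an empty password is the same on every device running that firmware, which is what makes it a backdoor candidate.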
In the next part of the article, we'll see how to modify and repackage an existing firmware file.