ABSTRACT
The output of a disassembler is used for many different purposes (e.g., debugging and reverse engineering). Disassemblers therefore represent the first link in a long chain of stages on which any high-level analysis of machine code depends. In this paper we demonstrate that many disassemblers fail to decode certain instructions correctly, and thus that this first link is very weak. We present a methodology, called N-version disassembly, that uses differential analysis to verify the correctness of disassemblers. Given a set of n - 1 disassemblers, we use them to decode fragments of machine code and compare their outputs against each other. To further corroborate the output of these disassemblers, we developed a special instruction decoder, the nth, that delegates the decoding to the CPU, the ideal decoder. We tested eight of the most popular disassemblers for Intel x86 and found bugs in each of them.
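The differential comparison described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the three toy decoders (`dis_a`, `dis_b`, `dis_c`), their one-byte opcode tables, and the majority-vote rule are hypothetical and are not taken from the paper. The CPU-assisted nth decoder is not modeled here.

```python
from collections import Counter
from typing import Callable, Dict, List, Optional, Tuple

# A decoding result: (mnemonic, instruction length in bytes), or None on failure.
Decoding = Optional[Tuple[str, int]]
Decoder = Callable[[bytes], Decoding]


def make_table_decoder(table: Dict[int, str]) -> Decoder:
    """Build a toy decoder for one-byte opcodes from an opcode -> mnemonic table."""
    def decode(code: bytes) -> Decoding:
        if not code or code[0] not in table:
            return None  # the toy decoder does not know this opcode
        return (table[code[0]], 1)
    return decode


def find_outliers(decoders: Dict[str, Decoder], code: bytes) -> List[str]:
    """Return the names of decoders whose output differs from the most common one."""
    results = {name: decode(code) for name, decode in decoders.items()}
    majority, _ = Counter(results.values()).most_common(1)[0]
    return sorted(name for name, result in results.items() if result != majority)


# Hypothetical decoders: dis_c is deliberately missing the int3 opcode (0xCC).
FULL_TABLE = {0x90: "nop", 0xC3: "ret", 0xCC: "int3"}
DECODERS: Dict[str, Decoder] = {
    "dis_a": make_table_decoder(FULL_TABLE),
    "dis_b": make_table_decoder(FULL_TABLE),
    "dis_c": make_table_decoder({0x90: "nop", 0xC3: "ret"}),
}

if __name__ == "__main__":
    for opcode in (0x90, 0xC3, 0xCC):
        outliers = find_outliers(DECODERS, bytes([opcode]))
        print(f"{opcode:#04x}: disagreeing decoders = {outliers}")
```

A disagreement only signals that at least one decoder is suspect. Majority voting is used here as a simple stand-in for an oracle; the abstract instead relies on a decoder that delegates to the CPU to settle which output is correct.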
Index Terms
- N-version disassembly: differential testing of x86 disassemblers