Introducing Model Checking for Malware Analysis

Model checking is an industrial strength technology used in software and protocol verification. Unfortunately, existing tools are complex and have a steep learning curve, making it complex to experiment with model checking.

Mr. Waffles is a bare-bones CTL model checker. You just need a Python interpreter, some additional modules, and you can start playing with graph logic.

Quick Example
>>> from checker import *
>>> g = PredicatedGraph()
>>> g.add_edge(1, 2)
>>> g.add_predicate(1, 'first')
>>> g.add_predicate(2, 'next')
>>> g.check('EX next')
set([1])

Mr. Waffles uses an extension of the Computation Tree Logic with reverse operators and free variables. With the reverse operators, it becomes easy to prove properties on the past (e.g. paths that lead to a certain node) and free variables allow for more concise specifications. We also added proofs over finite paths, making the PredicatedGraph structure naturally fit control-flow graphs (since you are most probably interested in finite program runs).

A real-worl example: malware analysis. We build a graph representing the program and check if it conforms to a malicious specification. For instance, Kinder et al. define the behavior of the Klez worm with the following formula: 'and GetModuleFileNameA(_, x, null) (EU !def(x) CopyFileA(x))'. We can then use Mr. Waffles to check that a sample graph exhibits Klez's behavior.

>>> g = PredicatedGraph()
>>> g.add_edge(1,2)
>>> g.add_edge(1,3)
>>> g.add_edge(3,4)
>>> g.add_edge(4,4)
>>> g.add_edge(3,5)
>>> g.add_edge(5,4)
>>> g.add_edge(5,6)
>>> g.add_predicate(1, 'entrypoint')
>>> g.add_predicate(2, 'CreateFile(pouet)')
>>> g.add_predicate(3, 'GetModuleFileNameA(handle, pFile, null)')
>>> g.add_predicate(4, 'def(handle)')
>>> g.add_predicate(5, 'use(pFile)')
>>> g.add_predicate(6, 'CopyFileA(pFile)')
>>> g.dot('klez')
>>> klez = 'and GetModuleFileNameA(_, x, null) (EU !def(x) CopyFileA(x))'
>>> g.check_with_freevars(klez)
set([3])

The return value set([3]) means "there is a path starting at node 3 in graph g satisfies the klez formula", therefore the malware has been successfully detected.

The g.dot('klez') statement above produced the following image (using the optional pydot module):

License: Mr. Waffles is distributed under the terms of the GPLv2


Reference Papers

P. Beaucamps and J.-Y. Marion, On Behavioral Detection »

D. Lacey et al., Proving Correctness of Compiler Optimizations by Temporal Logic »

J. Kinder et al., Detecting Malicious Code by Model Checking »

About

Developed by Daniel Reynaud (Nancy University - Loria) in collaboration with the High Sec Lab and ESEC R&D.

Q: Why is it called Mr. Waffles?
A: Because everybody loves waffles.