learning-whitebox

This repository contains the code described in my master's thesis, "Learning white-box: Applying machine learning to identify white-box related functions in binaries", which can be found on the TU/e website. The belaida script loads a trained SVM model and classifies each function in a binary as white-box related or not (see the thesis). It's a proof of concept!

The repository was initially intended to hold scripts that obfuscate Chow's white-box implementation. However, it ended up containing source code from multiple repositories, which was needed to create the obfuscated variants. Some files were merged into one source file. Ownership is as follows:

Pre-requisites:

Building with debug symbols:

./build-variants.sh

To include debug symbols in the binaries, pass the --add-symbols argument to the build script. The llvm-obfuscated binary will not include symbols.

Other files:

  • belaida.py: script that runs in IDA Pro 8.0+ and classifies white-box related functions in a binary
  • rebel-ida.py: extracts features from samples; runs in IDA Pro 8.0
  • classifier-hyperparameter-tuning.py: tunes the classifier's hyperparameters, exactly as the name says
  • variants/*.sh: scripts used to create obfuscated variants of the implementations mentioned above
  • *.pkl files: trained SVM model in pickle format
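
For readers who want to experiment with the pickled model outside IDA Pro, the following is a minimal sketch of the load-and-classify step, not the code from belaida.py. It assumes the pickle holds a scikit-learn-style estimator with a `predict` method and that a label of 1 means "white-box related"; the helper names `load_model` and `classify_functions`, and the shape of the feature dictionary, are hypothetical.

```python
import pickle
from pathlib import Path
from typing import Any, Dict, Sequence


def load_model(path: str) -> Any:
    """Unpickle a trained model. Only load pickle files you trust."""
    with Path(path).open("rb") as handle:
        return pickle.load(handle)


def classify_functions(
    model: Any, features: Dict[str, Sequence[float]]
) -> Dict[str, bool]:
    """Map each function name to True if the model flags it.

    Assumption: `model.predict` takes a list of feature rows and
    returns one label per row, with 1 meaning white-box related.
    """
    names = list(features)
    if not names:
        return {}
    rows = [list(features[name]) for name in names]
    predictions = model.predict(rows)
    return {name: bool(label) for name, label in zip(names, predictions)}
```

A call might look like `classify_functions(load_model("model.pkl"), extracted)`, where `model.pkl` stands in for one of the *.pkl files and `extracted` maps function names to feature vectors in the same order used during training.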