Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(1) Output raw test cases for fuzzer. (2) Prioritize solving some ann… #1

Open
wants to merge 5 commits into
base: debug
Choose a base branch
from

Conversation

leonbett
Copy link
Owner

@leonbett leonbett commented Jul 6, 2020

…otated edges. (3)Server mode, which only works when interpreter instance is not deleted.

…otated edges. (3)Server mode, which only works when interpreter instance is not deleted.
@leonbett leonbett requested a review from laurentsimon July 6, 2020 21:52
replayKTest(0), replayPath(0), usingSeeds(0),
atMemoryLimit(false), inhibitForking(false), haltExecution(false),
ivcEnabled(false), debugLogBuffer(debugBufferString) {

enteredDevMain = false; // Must be reset; previous Executor instance will have set it to true, leading to problems with undefined memory in uclibc before user main is reached.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the error message you get?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I comment this line, it will print
"addConstraint1: offset=2undefined_offset=2
KLEE: ERROR: libc/string/memcpy.c:29: read non-initialized memory"
for the second input.
I think the reason is in this line: https://github.com/leonbett/klee/blob/debug2/lib/Core/Memory.cpp#L378

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes. you need to install my patched uclibc. Once I added support of non-initialized read in KLEE, it started finding some bugs in uclibc itself :/

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting :)
I've reverted my changes and used your uclibc, but then it's detecting another undefined read:
KLEE: ERROR: /root/git/klee/runtime/POSIX/fd_init.c:78: read non-initialized memory, which is this line klee_prefer_cex(s, s->st_dev == defaults->st_dev);

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, that's because I have not tested all possible options... :/ If you check https://github.com/klee/klee/blob/master/runtime/POSIX/fd_init.c#L117, you'll see that defaults->st_dev is never set, so KLEE thinks it's undefined. The call to stat64() calls into kernel and initializes the s variable, but KLEE is not aware of this.
So we need to tell KLEE to ignore the undefinedness and use the values as concrete values. This can be done with klee_ignore_undefined(), see how it's used in https://github.com/lmrs2/klee-uclibc/blob/619421b12979bcea0b838338df37f23b3af46e8e/libc/termios/tcgetattr.c
This should work. If it does, I'll patch my KLEE fork...

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this did the trick.

@@ -14,6 +14,9 @@
#include <set>
#include <string>
#include <vector>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this will work if you make install. The lib/Core/ files won't be copied in the installation folder. Maybe you could just declare classes as:

class TimingSolver;
class StatsTracker;
class SpecialFunctionHandler;

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'make install' seems to work. With the forward declarations, there will be errors in main.cpp when one of these classes is used, e.g.

specialFunctionHandler->prepare(preservedFunctions);

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch. I did not see it was actually used. I thought it was only passed as a pointer to the constructor.

replayKTest(0), replayPath(0), usingSeeds(0),
atMemoryLimit(false), inhibitForking(false), haltExecution(false),
ivcEnabled(false), debugLogBuffer(debugBufferString) {

enteredDevMain = false; // Must be reset; previous Executor instance will have set it to true, leading to problems with undefined memory in uclibc before user main is reached.
kmodule = _kmodule;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fix indentation

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Executor::Executor(LLVMContext &ctx,
const InterpreterOptions &opts,
InterpreterHandler *ih,
TimingSolver* s,
Copy link
Collaborator

@laurentsimon laurentsimon Jul 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

be consistent with naming. Some variables start with _, some don't.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. I've removed the '_' and initialized the variables using the initializer list.

/// Points to the merging searcher of the searcher chain,
/// `nullptr` if merging is disabled
MergingSearcher *mergingSearcher = nullptr;

// For prioritized concolic execution
void add_forked_state_to_concolic_priority_list(ExecutionState* otherState);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe keep the naming as the rest of KLEE, like addForkedStateToConcolicPriorityList

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure!

@@ -2935,7 +2912,9 @@ void Executor::doDumpStates() {
}

void Executor::run(ExecutionState &initialState) {
bindModuleConstants();
if (tries==0)
bindModuleConstants(); // Only do this once, because we can reuse it.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we move this function outside of the Executor, maybe in klee/main.cpp?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see your point. It requires some refactoring but I'll try :)

@@ -2998,12 +2977,33 @@ void Executor::run(ExecutionState &initialState) {
}

klee_message("seeding done (%d states remain)", (int) states.size());
klee_message("<CONC> sizeof prio2: %lu", prio2.size());
Copy link
Collaborator

@laurentsimon laurentsimon Jul 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious why we're using a priority list f we're dumping these after we've replayed the seeds. I can see it may be useful later, though. Currently we seem to have deferred the state-dump to after the replay is complete. Is that correct?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the test cases will only be dumped after replay. The idea is that the solver should only be involved in order to follow the branches as determined by the seed during replay, and when replay is done, it should only try to flip those branches corresponding to the states in prio2, then prio1, and not try to solve prio0 branches at all. This is probably not the ideal/right way to do it though?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering what we lose if we dump the states during the replay (as in my original code) vs waiting till the end. If there's a good reason to defer, that's fine.

Copy link
Owner Author

@leonbett leonbett Jul 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently it is not very useful indeed, so I will revert to your original code that dumps the states during replay.

interpreterHandler->incPathsExplored(); // not sure this is correct
}*/

klee_message("states size() after: %lu\n", states.size());

if (OnlySeed) {
doDumpStates();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this call will duplicate some of the dumping we do above. In this line R1050, we're adding to our prio list but we're not killing the state, which makes me think KLEE will have a state that will be dumped above and in the doDumpStates() call.

interpreterHandler->incPathsExplored(); // not sure this is correct
}*/

klee_message("states size() after: %lu\n", states.size());

if (OnlySeed) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are we passing --only-seed when running KLEE?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, currently I run it like so:
/path/to/klee/build/bin/klee --libc=uclibc --posix-runtime -server-seed-dir=/root/git/test/inotify_test/watch -sync-dir=/root/git/test/syncdir --named-seed-matching --max-solver-time=5 -only-replay-seeds -only-seed -allow-seed-extension -allow-seed-truncation -allocate-determ=true -solver-backend=z3 /path/to/mytest.bc --sym-stdin 1000 &> dump

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok. So we should check we're not dumping the same state twice (see my #1 (comment)). To speed things up (after you've debugged), you can also use -only-output-states-covering-new. But your code should also work without this option, so let's not use it yet - it's just an optimization.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, what is the sync-dir option for? I have not used it myself.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the info!
sync-dir is an option I've added (https://github.com/leonbett/klee/blob/debug2/tools/klee/main.cpp#L140). It's the location where raw test cases are output for a fuzzer to pick them up.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok got it! Maybe we should call these --stateful-sync-indir and --stateful-sync-outdir. It's not important for now.


globalObjects.clear();
globalAddresses.clear();

if (statsTracker)
statsTracker->done();

tries++;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we move bindModuleConstants() to KLEE's main.cpp, we may be able to get rid of the tries variable.

std::string newSeed = blocking_get_new_seed_file();
errs() << "read seed: " << newSeed << "\n";
errs() << "iteration " << i << "\n";
MemoryObject::counter = 0; // same MemoryObject ids for all runs for debugging memory layout
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good enough for testing. I wonder if it'd be cleaner to have a setter/getter function in MemoryObject to set/get this variable; instead of accessible like this.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely, but I think I will remove it altogether

watchfd = inotify_add_watch(notifyfd, ServerSeedDir.c_str(), IN_CREATE);
if (watchfd < 0)
return cleanup_monitor_directory(1, "inotify_add_watch");

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

an alternative to the cleanup_monitor_directory() function is to use:

int result = -1;

if (notify < 0)
   goto end;
...
if (watchfd < 0)
  goto end;

// everything ok
result = 0;

end:
  // do the cleanup
  if (result == -1) {.... }
  return result;

}

void Executor::add_forked_state_to_concolic_priority_list(ExecutionState* otherState){
Instruction* lastInstr = otherState->prevPC->inst;
Copy link
Collaborator

@laurentsimon laurentsimon Jul 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we want to have some prioritization for seed scheduling, we should add our own searcher in https://github.com/klee/klee/blob/master/lib/Core/Searcher.cpp
Currently we're only using this list to dump the state, if I understand correctly.

/// @brief Required by klee::ref-managed objects
mutable class ReferenceCounter _refCount;

public:
static int counter;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

counters are unsigned values ;)

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can blame the klee authors :)

@@ -520,6 +518,7 @@ Executor::~Executor() {
}

/***/
bool enteredDevMain = false;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can set this to true to test. It set it to false before I understood why I was getting undefined mem error before reaching the main function... then I realized I needed this extra klee_ignore_undefined function + uclibc patches.

@@ -115,6 +115,7 @@ void klee_init_fds(unsigned n_files, unsigned file_length,
struct stat64 s;

stat64(".", &s);
klee_ignore_undefined(&s, sizeof(s));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cool. I'll make similar changes to my KLEE branch.

@leonbett leonbett requested a review from cl-g July 24, 2020 09:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants