Hi,
I am writing a open source tool that decompiles binary code.
I have it working quite well, taking the binary code and outputting a
.c program.
I call it "libbeauty" and it is hosted here:
http://revenge.berlios.de/
I am having a problem crafting an algorithm.
Here is some example simple code:
int m;
int n;
n = 0;
loop:
m = data[n];
n++;
If n <= 9 goto loop;
Now, I can easily tell by looking at the code that the size of the
"data" array is 10. n counts from 0 to 9.
So, by observation, one can determine that we should have at the top
of the code: int data[10];
I am trying to come up with an algorithm that will discover the size
of the data array.
I have come up with several methods, but none of them are pretty.
E.g.
Using constraints.
"n=0" means that at the point of "loop:" n has initialised to 0.
One can determine that between "loop:" and the if statement, n is
increasing and never decreasing. (but there are exceptions to
consider, e.g. wrap around)
>From the if statement one can determine that when reaching "loop:" n
is always less or equal to 9.
So, when one reaches the "m=data[n]" one has the following constraints.
1) n initialised to 0;
2) n always increases
3) n is less than 10.
We therefore have both bounds, the lower bound and the upper bound,
and can therefore say "int data[10]"
The above constraints method works very well for this example, but it
will gets less productive if the code is more complex than a simple
loop.
Can anyone come up with a better algorithm?
Kind Regards
James