
comp.lang.c

Finding a string in executable

Angus

4/1/2011 10:08:00 AM

I have a very simple program as below:

int main(){
   char* mystring = "ABCDEF";
   return 0;
}

I have built this program without any debugging symbols included. If
I open the program in a hex editor I cannot find the string ABCDEF.
Should this string not be stored sequentially in some area of the
executable?
7 Answers

Angel

4/1/2011 10:20:00 AM

On 2011-04-01, Angus <anguscomber@gmail.com> wrote:
> I have a very simple program as below:
>
> int main(){
> char* mystring = "ABCDEF";
> return 0;
> }
>
> I have built this program without any debugging symbols included. If
> I open the program in a hex editor I cannot find the string ABCDEF.
> Should this string not be stored sequentially in some area of the
> executable?

That is machine- and compiler-dependent; the C standard says nothing
about it. On some exotic platform, the string might even be compressed,
encrypted or fragmented.

On a 32-bit Linux system with gcc, I can see the string with the
'strings' program.

# strings blah
/lib/ld-linux.so.2
__gmon_start__
libc.so.6
_IO_stdin_used
__libc_start_main
GLIBC_2.0
PTRh
[^_]
ABCDEF


--
The perfected state of a spam server is a smoking crater.
- The Crater Corrolary to Rule #4

Mark Bluemel

4/1/2011 11:00:00 AM

On Apr 1, 11:08 am, Angus <anguscom...@gmail.com> wrote:
> I have a very simple program as below:
>
> int main(){
>    char* mystring = "ABCDEF";
>    return 0;
>
> }
>
> I have built this program without any debugging symbols included.  If
> I open the program in a hex editor I cannot find the string ABCDEF.
> Should this string not be stored sequentially in some area of the
> executable?

You don't use the string, so there's no reason why the compiler
shouldn't optimise it away, is there?

Angus

4/1/2011 11:02:00 AM

On Apr 1, 11:20 am, Angel <angel+n...@spamcop.net> wrote:
> On 2011-04-01, Angus <anguscom...@gmail.com> wrote:
>
> > I have a very simple program as below:
>
> > int main(){
> >    char* mystring = "ABCDEF";
> >    return 0;
> > }
>
> > I have built this program without any debugging symbols included.  If
> > I open the program in a hex editor I cannot find the string ABCDEF.
> > Should this string not be stored sequentially in some area of the
> > executable?
>
> That is machine- and compiler-dependant, the C standard says nothing
> about it. Perhaps on some exotic platform, the string might be compressed,
> encrypted or fragmented.
>
> On a 32 bit Linux system with gcc, I can see the string with the
> 'strings' program.
>
> # strings blah
> /lib/ld-linux.so.2
> __gmon_start__
> libc.so.6
> _IO_stdin_used
> __libc_start_main
> GLIBC_2.0
> PTRh
> [^_]
> ABCDEF
>
> --
> The perfected state of a spam server is a smoking crater.
> - The Crater Corrolary to Rule #4

I think the compiler optimised the string away, i.e. the string wasn't
used so it was simply removed. If you follow the declaration with
puts(mystring) then you do see the string in the exe.

The reason for the question is to work out why different ways of
setting up a string seem to show different behaviour (on the MS
compiler anyway).

Refined question is:
#include <stdio.h>

int main(){
   char* mystring = "ABCDEFGHIJKLMNO";
   puts(mystring);

   char otherstring[15];
   otherstring[0]  = 'a';
   otherstring[1]  = 'b';
   otherstring[2]  = 'c';
   otherstring[3]  = 'd';
   otherstring[4]  = 'e';
   otherstring[5]  = 'f';
   otherstring[6]  = 'g';
   otherstring[7]  = 'h';
   otherstring[8]  = 'i';
   otherstring[9]  = 'j';
   otherstring[10] = 'k';
   otherstring[11] = 'l';
   otherstring[12] = 'm';
   otherstring[13] = 'n';
   otherstring[14] = 'o';
   puts(otherstring);

   return 0;
}


Compiler was MS VC++.

Whether I build this program with or without optimisations I can find
the string "ABCDEFGHIJKLMNO" in the executable using a hex editor.

However, I cannot find the string "abcdefghijklmno".

What is the compiler doing that is different for otherstring?


The hex editor I used was Hexedit, but I tried others and still
couldn't find otherstring. Does anyone have any idea why not, or how
to find it?

By the way I am not doing this for hacking reasons.

Mark Bluemel

4/1/2011 11:09:00 AM

On Apr 1, 12:02 pm, Angus <anguscom...@gmail.com> wrote:
> On Apr 1, 11:20 am, Angel <angel+n...@spamcop.net> wrote:
>
>
>
> > On 2011-04-01, Angus <anguscom...@gmail.com> wrote:
>
> > > I have a very simple program as below:
>
> > > int main(){
> > >    char* mystring = "ABCDEF";
> > >    return 0;
> > > }
>
> > > I have built this program without any debugging symbols included.  If
> > > I open the program in a hex editor I cannot find the string ABCDEF.
> > > Should this string not be stored sequentially in some area of the
> > > executable?
>
> > That is machine- and compiler-dependant, the C standard says nothing
> > about it. Perhaps on some exotic platform, the string might be compressed,
> > encrypted or fragmented.
>
> > On a 32 bit Linux system with gcc, I can see the string with the
> > 'strings' program.
>
> > # strings blah
> > /lib/ld-linux.so.2
> > __gmon_start__
> > libc.so.6
> > _IO_stdin_used
> > __libc_start_main
> > GLIBC_2.0
> > PTRh
> > [^_]
> > ABCDEF
>
> > --
> > The perfected state of a spam server is a smoking crater.
> > - The Crater Corrolary to Rule #4
>
> I think the compiler optimised the string away - ie the string wasn't
> used so it just removed it.  If you follow with puts(mystring) then
> you do see the string in the exe.
>
> Reason for question is to work out why declaration of string seems to
> show different behaviour (on MS compiler anyway).
>
> Refined question is:
> #include <stdio.h>
>
> int main(){
>    char* mystring = "ABCDEFGHIJKLMNO";
>    puts(mystring);
>
>   char otherstring[15];
>   otherstring[0]  = 'a';
>   otherstring[1]  = 'b';
>   otherstring[2]  = 'c';
>   otherstring[3]  = 'd';
>   otherstring[4]  = 'e';
>   otherstring[5]  = 'f';
>   otherstring[6]  = 'g';
>   otherstring[7]  = 'h';
>   otherstring[8]  = 'i';
>   otherstring[9]  = 'j';
>   otherstring[10] = 'k';
>   otherstring[11] = 'l';
>   otherstring[12] = 'm';
>   otherstring[13] = 'n';
>   otherstring[14] = 'o';
>   puts(otherstring);
>
>   return 0;
>
> }
>
> Compiler was MS VC++.
>
> Whether I build this program with or without optimisations I can find
> the string "ABCDEFGHIJKLMNO" in the executable using a hex editor.
>
> However, I cannot find the string "abcdefghijklmno"
>
> What is the compiler doing that is different for otherstring?
>
What is otherstring initialised to (note, I didn't ask what value is
eventually assigned to otherstring)?

Note also that otherstring is not a legal C string as it is not
(guaranteed to be) null-terminated.

Ben Bacarisse

4/1/2011 11:15:00 AM

Angus <anguscomber@gmail.com> writes:

> I have a very simple program as below:
>
> int main(){
> char* mystring = "ABCDEF";
> return 0;
> }
>
> I have built this program without any debugging symbols included. If
> I open the program in a hex editor I cannot find the string ABCDEF.
> Should this string not be stored sequentially in some area of the
> executable?

The string is not used, so at least one explanation presents itself: the
optimiser has noticed that there is no need to store the string.

Other explanations are possible. Because the string is ABCDEF, I am
worried you might be looking for hexadecimal ABCDEF (the three bytes
AB CD EF), which is not at all the same as looking for the hexadecimal
representation of the characters ABCDEF.

--
Ben.

puppi

4/2/2011 3:20:00 PM

On Apr 1, 8:02 am, Angus <anguscom...@gmail.com> wrote:
> On Apr 1, 11:20 am, Angel <angel+n...@spamcop.net> wrote:
>
>
>
> > On 2011-04-01, Angus <anguscom...@gmail.com> wrote:
>
> > > I have a very simple program as below:
>
> > > int main(){
> > >    char* mystring = "ABCDEF";
> > >    return 0;
> > > }
>
> > > I have built this program without any debugging symbols included.  If
> > > I open the program in a hex editor I cannot find the string ABCDEF.
> > > Should this string not be stored sequentially in some area of the
> > > executable?
>
> > That is machine- and compiler-dependant, the C standard says nothing
> > about it. Perhaps on some exotic platform, the string might be compressed,
> > encrypted or fragmented.
>
> > On a 32 bit Linux system with gcc, I can see the string with the
> > 'strings' program.
>
> > # strings blah
> > /lib/ld-linux.so.2
> > __gmon_start__
> > libc.so.6
> > _IO_stdin_used
> > __libc_start_main
> > GLIBC_2.0
> > PTRh
> > [^_]
> > ABCDEF
>
> > --
> > The perfected state of a spam server is a smoking crater.
> > - The Crater Corrolary to Rule #4
>
> I think the compiler optimised the string away - ie the string wasn't
> used so it just removed it.  If you follow with puts(mystring) then
> you do see the string in the exe.
>
> Reason for question is to work out why declaration of string seems to
> show different behaviour (on MS compiler anyway).
>
> Refined question is:
> #include <stdio.h>
>
> int main(){
>    char* mystring = "ABCDEFGHIJKLMNO";
>    puts(mystring);
>
>   char otherstring[15];
>   otherstring[0]  = 'a';
>   otherstring[1]  = 'b';
>   otherstring[2]  = 'c';
>   otherstring[3]  = 'd';
>   otherstring[4]  = 'e';
>   otherstring[5]  = 'f';
>   otherstring[6]  = 'g';
>   otherstring[7]  = 'h';
>   otherstring[8]  = 'i';
>   otherstring[9]  = 'j';
>   otherstring[10] = 'k';
>   otherstring[11] = 'l';
>   otherstring[12] = 'm';
>   otherstring[13] = 'n';
>   otherstring[14] = 'o';
>   puts(otherstring);
>
>   return 0;
>
> }
>
> Compiler was MS VC++.
>
> Whether I build this program with or without optimisations I can find
> the string "ABCDEFGHIJKLMNO" in the executable using a hex editor.
>
> However, I cannot find the string "abcdefghijklmno"
>
> What is the compiler doing that is different for otherstring?
>
> The hex editor I used was Hexedit - but tried others and still
> couldn't find otherstring.  Anyone any ideas why not or how to find?
>
> By the way I am not doing this for hacking reasons.

The other string, "abcdefghijklmno", simply doesn't exist in the .exe!
You are assigning the values 'a', 'b', 'c', ... to the elements of
otherstring[] one at a time. That's different from an initialization, as
in the case of *mystring. For mystring, the compiler allocates space in
the executable for "ABCDEFGHIJKLMNO" and places the string there. For
otherstring, it simply allocates space for the array and that's it; the
characters are "inserted" into the array at runtime. The characters 'a',
'b', 'c', ... are probably there in the .exe, but they are not pure
data. They are part of the code. That's why you don't simply see a string.

What will be seen is truly compiler- and architecture-dependent. On my
PC, with an AMD64 processor, running Linux and compiling with gcc to an
ELF64 executable, I see under a hex editor:
E.a.E.b.E.c.E.d.E.e.E.f.E.g.E.h.E.i.E.j.E.k.E.l.E.m.E.n.E.o
(where the dot represents a non-printable character).
Each of those "E.[character]" sequences is part of an instruction that
sets one character of otherstring[].

I said these characters are PROBABLY there in the .exe because the
compiler could have used a different approach: it could store the value
'a' somewhere (probably in a register) and increment that value at each
assignment to produce the next character (since the letters are
sequential).

Anyway, that's not really a C question. You'd be better off posting
this question in comp.lang.asm.x86.

Thad Smith

4/2/2011 5:44:00 PM

On 4/1/2011 4:02 AM, Angus wrote:

> Refined question is:
> #include<stdio.h>
>
> int main(){
> char* mystring = "ABCDEFGHIJKLMNO";
> puts(mystring);
>
> char otherstring[15];
> otherstring[0] = 'a';
> otherstring[1] = 'b';
> otherstring[2] = 'c';
> otherstring[3] = 'd';
> otherstring[4] = 'e';
> otherstring[5] = 'f';
> otherstring[6] = 'g';
> otherstring[7] = 'h';
> otherstring[8] = 'i';
> otherstring[9] = 'j';
> otherstring[10] = 'k';
> otherstring[11] = 'l';
> otherstring[12] = 'm';
> otherstring[13] = 'n';
> otherstring[14] = 'o';
> puts(otherstring);
>
> return 0;
> }
>
>
> Compiler was MS VC++.
>
> Whether I build this program with or without optimisations I can find
> the string "ABCDEFGHIJKLMNO" in the executable using a hex editor.
>
> However, I cannot find the string "abcdefghijklmno"
>
> What is the compiler doing that is different for otherstring?

For starters, your program doesn't define a string "abcdefghijklmno". If you
want some insight, I suggest looking at an assembly listing of the generated
code (or disassembly in a debugger) to see what your compiler generates. Other
compilers may generate different code.

--
Thad