[LCP] debugging a SIGSEGV
Karthik Vishwanath
karthikv at Alum.Dartmouth.ORG
Mon Feb 7 13:22:02 UTC 2005
Thanks for the responses -- I have not had any luck resolving this
issue yet. Here's some more information on the program and the environment
it's being run on: the code is written in standard C and implements a Monte
Carlo type algorithm. The code requires < ~sizeof(double)*200000 bytes.
The code is compiled using gcc (3.3.4) on Debian Linux (kernel
2.4.18-1-k7) on an Athlon XP 1800+ with 1G of memory. It's plain,
vanilla number-crunching code; it does not invoke pthread_atfork() or any
other such function, and it's compiled/linked using -Wall -ggdb -lm as flags.
The program crashes after different times, depending on its inputs (an
input file from the command line sets the "number of groups" that the code
needs to simulate). Earlier, I had noticed that it used to exit with
SIGSEGV after processing every single group from the input file and saving
the data from its run; the program just crashed instead of exiting
gracefully. There is now a second, different input file that makes it
crash (with a SIGSEGV) at the end of its first run -- no data is saved.
I tried using strace -fF on the program with both of these inputs, and
the SIGSEGV line in the strace output is the same in both cases:
26811 --- SIGSEGV (Segmentation fault) @ 0 (0) ---
26811 +++ killed by SIGSEGV +++
Any clues as to what this means?
Jack - how does one end up trashing the libc stack? Can you provide some
examples that might do so? I am trying Valgrind next...
Thanks,
-K
On Sun, 6 Feb 2005, Jack Lloyd wrote:
>
> I've seen this happen (SIGSEGV after main returns) a few times. It's been a
> while, but ISTR that in the end I always found I was trashing some internal
> memory structure. For example, if you trash something in the stack of the libc
> startup code (called __libc_start_main in glibc, I think), you'll get a crash
> after main returns and __libc_start_main resumes.
>
> In C++ this can also happen if a destructor for a global object does something
> stupid; looking at the stack trace at the time of failure will probably diagnose
> this (you'll see in the call chain a GCC generated function for destroying
> global objects).
>
> A tool that may help tracking this down is Valgrind. If nothing else, a clean
> Valgrind run will eliminate the possibility that it's a memory error.
>
> -Jack
>
> On Sun, Feb 06, 2005 at 02:32:48PM -0500, Karthik Vishwanath wrote:
> > My program quits _after_ processing the last lines of main() with a
> > SIGSEGV (the last lines of main() are printf() statements). I compiled
> > the program using the -ggdb flag and used ddd to execute the program to
> > get more info on where the code dies, and here's what ddd tells me:
> > "Program received signal SIGSEGV, Segmentation fault. 0x4012be5b in
> > __register_atfork () from /lib/libc.so.6 (gdb) "
> >
> > Can anyone tell me what is going on, and why this could be happening (I
> > have checked writes to malloc'd *s very carefully and I don't think it's
> > because of accessing an uninitialized memory location etc.)? Any/all
> > pointers (no pun intended) toward throwing some light on how to get rid of
> > the segmentation fault would be much appreciated.
> >
> > Thanks,
> >
> > -K
> >
> >
> >
> > _______________________________________________
> > This is the Linux C Programming List
> > : http://lists.linux.org.au/listinfo/linuxcprogramming List