[LCP]Address out of bounds error on Linux 7.3
Mike and/or Penny Novack
stepbystepfarm at mtdata.com
Tue Aug 19 22:44:01 UTC 2003
Let an old IBM mainframe programmer explain how a program can work on
one machine but not another (or work until a trivial change in data
storage has been made and not thereafter).
In the implementation of most operating systems, the check for "out of
bounds" isn't by exact bytes of storage defined but in terms of "pages"
of some specified size. It's simply easier to do it that way (on the
mainframes with hardware protection by page, MUCH easier). Remember,
it's virtual memory getting translated to real memory somehow, and
that's usually by "page". In other words, if your program "owns" any
part of a page, it owns all of it, and it won't be considered "oob" if
it steps on the portion of that page which is undefined (to the
program).
Many times a puzzled programmer has brought this sort of problem to me.
I remember very well one bug of this sort which survived in production
for over 20 years. It caused much merriment when it finally crashed the
application after a minor change (the size of the buffers was
increased), because by then the programmer originally responsible for
the bad code was the senior vice president in charge of all DP and
supposedly safely beyond "winning" the purple wiener (a little "trophy"
passed to whoever caused the latest serious production hang, to be kept
until it could be passed on to the next "winner").
So.......first rule out the possibility that you have indeed stepped
out of bounds, in spite of the fact that the program works on another
system. The other system's page size might be different, it might be
"paged" while the new system is "exact", it might assign space in a
different order, etc. Any of these can cause an actual "oob" error to
be noticed on one system/machine but not another. And remember the
rule: that's NOT a bug in the system which failed to catch the error,
because the only guarantee is that a "correct" system works with
correct code (all bets are off about what happens with bad code).
An undefined pointer or runaway subscript wouldn't be the cause (that's
always a hang), but subscripting just a little too far could work on one
system but not another. How about the assignment of space for your
structures? (I don't know your style; I rarely use space IN my programs
but allocate space at runtime, so if it were mine I would check the LAST
item in physical allocation.)
Mike