[LCP]Address out of bounds error on Linux 7.3

Mike and/or Penny Novack stepbystepfarm at mtdata.com
Tue Aug 19 22:44:01 UTC 2003


Let an old IBM mainframe programmer explain how a program can work on 
one machine but not another (or work until a trivial change in data 
storage has been made and not thereafter).

In most operating systems, the check for "out of bounds" is not made
against the exact bytes of storage a program has defined but in terms of
"pages" of some fixed size. It's simply easier to do it that way (on the
mainframes, with hardware protection by page, MUCH easier). Remember,
virtual memory is getting translated to real memory somehow, and that's
usually done by "page". In other words, if your program "owns" any part
of a page it owns all of it, and it won't be considered "oob" if it
steps on the portion of that page it never defined.

Many times a puzzled programmer has brought this sort of problem to me.
I remember very well one bug of this sort which survived in production
for over 20 years and caused much merriment when it finally crashed the
application after a minor change (the size of the buffers was
increased). By then the programmer originally responsible for the bad
code was the senior vice president in charge of all DP and supposedly
safely beyond "winning" the purple wiener (a little "trophy" passed to
whoever caused the latest serious production hang, to be kept until it
could be passed on to the next "winner").
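
To make the page arithmetic concrete, here is a minimal sketch in C. It
only computes how much "slack" lies between the end of a buffer and the
end of the page holding it; it does not perform any bad access itself.
The function names and the 4096-byte fallback are assumptions made for
this illustration, not anything from the original program.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Page size reported by the OS. The 4096 fallback is an assumption
 * used only if the query fails. */
size_t system_page_size(void)
{
    long p = sysconf(_SC_PAGESIZE);
    return p > 0 ? (size_t)p : 4096;
}

/* Bytes between the end of a buffer and the end of the page that
 * holds its last byte. A buffer ending exactly on a page boundary
 * has no slack at all. */
size_t page_slack(uintptr_t start, size_t size, size_t page_size)
{
    assert(page_size > 0);
    uintptr_t rem = (start + size) % page_size;
    return rem == 0 ? 0 : page_size - rem;
}

/* 1 if writing `overrun` bytes past the end of the buffer would touch
 * the next page, 0 if the overrun stays inside the buffer's last page
 * (where a page-granular check would not notice it). */
int overrun_crosses_page(uintptr_t start, size_t size,
                         size_t overrun, size_t page_size)
{
    return overrun > page_slack(start, size, page_size);
}
```

With 4096-byte pages, a 100-byte buffer at the start of a page has 3996
bytes of slack, so an 8-byte overrun stays inside the page. Grow that
buffer to 4096 bytes and the slack drops to zero, so the same 8-byte
overrun lands on the next page. Whether that next page is protected
depends on what else happens to be mapped there.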

So, first rule out that you have indeed stepped out of bounds, in spite
of the fact that the program works on another system. The page size
might be different there, that system might check by "page" while the
new system is "exact", space might be assigned in a different order,
etc. Any of these can cause an actual "oob" error to be noticed on one
system/machine but not on another. And remember the rule: that's NOT a
bug in the system which failed to catch the error, because the only
guarantee is that a "correct" system works with correct code (all bets
are off about what happens with bad code).

An undefined pointer or a runaway subscript wouldn't be the cause (that
would always hang), but subscripting just a little too far could work on
one system and not on another. How about the assignment of space for
your structures? (I don't know your style; I rarely define space IN my
programs but allocate it at runtime, so if it were mine I would check
the LAST one in physical allocation.)
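
One possible way to catch a "little too far" subscript without relying
on page protection is to put a known byte pattern just past each
runtime allocation and check it later. This is only a sketch of that
idea: guarded_alloc, guard_intact, GUARD_LEN and GUARD_BYTE are
hypothetical names invented here, and the technique only notices writes
that land in the guard area and change it.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_LEN  16    /* hypothetical guard size in bytes */
#define GUARD_BYTE 0xA5  /* hypothetical fill pattern */

/* Allocate `size` usable bytes followed by GUARD_LEN pattern bytes.
 * Returns NULL on failure; release the result with free(). */
void *guarded_alloc(size_t size)
{
    if (size > SIZE_MAX - GUARD_LEN)
        return NULL;                      /* size + GUARD_LEN would wrap */
    unsigned char *p = malloc(size + GUARD_LEN);
    if (p == NULL)
        return NULL;
    memset(p + size, GUARD_BYTE, GUARD_LEN);
    return p;
}

/* 1 if the guard bytes after the first `size` bytes are untouched,
 * 0 if something wrote past the end of the usable area. */
int guard_intact(const void *buf, size_t size)
{
    assert(buf != NULL);
    const unsigned char *guard = (const unsigned char *)buf + size;
    for (size_t i = 0; i < GUARD_LEN; i++) {
        if (guard[i] != GUARD_BYTE)
            return 0;
    }
    return 1;
}
```

Calling guard_intact on each structure after the suspect code has run
(starting with the last one allocated) would show which one was
overrun, regardless of where the page boundaries happen to fall.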

Mike



