[LCP] Rounding questions
Greg Black
gjb at gbch.net
Fri Sep 6 21:01:05 UTC 2002
Chuck Martin wrote:
| On Fri, Sep 06, 2002 at 07:21:46AM +1000, Greg Black wrote:
| > Steve Baker wrote:
| > | well). If you look in /usr/include/features.h you'll find all the "features"
| > | you can select from.
| >
| > And only if your system has features.h and all the rest of
| > the panoply. None of the BSD systems I have access to here has
| > this.
|
| If this is true, I probably don't want to use round() then.
I'd say that was a good decision :-)
| > Given that the functionality of round() is trivial to implement
| > with the C89 floor() and ceil() functions, it would be much
| > simpler to implement it in terms of those.
|
| That was the way this program worked originally, but then I found
| a bug that I haven't been able to figure out, and I thought maybe
| using round() would be a better solution. Maybe you can shed some
| light on this bug. Here is the relevant part of the code as it was
| originally:
I've elided all the code and explanation for several reasons:
1. It's almost always true that changes of behaviour in a
program that result from random printf() calls being
inserted or removed are an indicator that there really
is a bug in the program -- somewhere other than where
attention is being paid. The changes to the stack caused
by the inserted calls just migrate the impact of the bug
to some other place.
2. I don't have the time or the interest to wade through a lot
of code chasing what is probably a random bug in some other
part of the program. If you can't create a small sample
program with the problem, it usually means that you haven't
yet understood the nature of the problem.
3. I do have some suggestions that you might like to try in an
attempt to narrow down the problem space.
Despite the above, the particular symptoms you've described
might be a bug in your libraries or compiler -- although that
remains a very small probability.
| Any suggestions are welcome.
OK, here's the way I would approach this if it were my problem.
First, remove all the extraneous stuff that you've put in and
restore any other changes you've made -- as far as you can.
It's best to track this down through the most original code that
can illustrate the problem in case it's a bug that needs to be
reported to the original author.
Then, having made sure that the bug is still present, take the
following steps, making sure at each step that the bug is still
present. First, and always when debugging, make sure that all
compiler optimisations are OFF. If the problem disappears, it's
either an optimiser bug or undefined behaviour in the code that
the optimiser exposes, which narrows the search considerably.
Next, add a single sentinel printf() immediately before and
after the code that causes the problem -- no floating point
stuff, just simple strings (e.g., "trace start" and "trace
end"). Then run the program under ktrace or truss or whatever
trace scaffolding your system supports and, if the bug persists,
check very carefully the system calls between the sentinels.
You may find that something is being done that you didn't expect
and that might be your problem.
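A sketch of what those sentinels might look like. trace_mark()
and suspect_code() are hypothetical names standing in for your
own. The one thing worth noting is the fflush(): stdout is
usually buffered, so without it the marker may not show up as a
write() call at the point where it was issued.

```c
#include <stdio.h>

/*
 * Hypothetical helper: print a plain-string marker and flush it so
 * the write shows up in the system call trace right away.
 * Returns 0 on success, -1 on failure.
 */
static int trace_mark(const char *label)
{
    if (fputs(label, stdout) == EOF || fputc('\n', stdout) == EOF)
        return -1;
    return (fflush(stdout) == 0) ? 0 : -1;
}

/* Stand-in for the code that misbehaves; replace with the real thing. */
static double suspect_code(double x)
{
    return x / 3.0;
}

/* Bracket the suspect code with the two sentinels. */
static double run_traced(double x)
{
    double result;

    trace_mark("trace start");
    result = suspect_code(x);
    trace_mark("trace end");
    return result;
}
```

On Linux the tracer would be strace. Whatever system calls appear
between the two marker writes belong to the code in between.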
As somebody else mentioned, FPU registers and C doubles are not
the same size on i386 machines (the x87 registers hold 80 bits,
a double 64), so subtle differences may arise because of that.
And sometimes math library functions do odd
things unless you turn the right knobs (see the IEEE-754 doco
for your system for more on this).
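To illustrate the register-size point: on x87 hardware a quotient
may be held in an 80-bit register, so comparing a stored double
against a freshly computed expression can fail even though both
look identical in the source. The sketch below forces each value
through a 64-bit double first. to_double() and same_quotient()
are names invented here, and using volatile to force the store is
an assumption about what compilers do in practice.

```c
/*
 * Hypothetical helper: push a value through memory so that any extra
 * register precision is dropped and only a 64-bit double remains.
 */
static double to_double(double x)
{
    volatile double v = x;
    return v;
}

/* Compare a / b with itself after both sides have been narrowed. */
static int same_quotient(double a, double b)
{
    double q = to_double(a / b);

    return q == to_double(a / b);
}
```

If a comparison in the real program starts behaving once both
sides are narrowed like this, excess precision is a likely
suspect.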
Another thing to know is that most C programmers are pretty
vague when it comes to corner cases of math functions and most
mathematicians are really hopeless C programmers -- so there are
odd things in the intersection of C code and math code that are
frequently very difficult to pin down.
Finally, do be aware that the most likely cause is still an
unrelated bug somewhere else in the code and that this will
take a non-trivial amount of work to uncover if it's the case.
I think many of us would be interested to hear the outcome of
your debugging efforts when you have some news.
Greg
More information about the linuxCprogramming
mailing list