[LC++]optimizing array access: x86 instruction timings
Dr Mark H Phillips
mark at austrics.com.au
Thu Jan 29 15:48:02 UTC 2004
On Thu, 2004-01-29 at 16:57, Paul Gearon wrote:
> All I can offer is hearsay, but I'm pretty sure I've seen docs on 486s
> which said that all addressing modes take the same period of time.
Yes, I read that somewhere too, but I have also read the opposite.
Web "information" can be frustrating sometimes.
> Either way, it's very difficult to quantify these time periods due to
> the nature of pipelined, superscalar architectures. Depending on what
> else you're doing in your loop you may also find that out-of-order
> execution and pipelining could make any efficiencies like this
> completely redundant. If the addl were followed by an unrelated
> long-running operation then the addl might even occur in 0 effective
> time! :-)
Hmmm. Are there _ever_ situations where choosing a pointer-incrementing
solution over index-based addressing is clearly better?
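
For concreteness, the two loop styles in question can be sketched roughly
as below. This is only an illustration, assuming the simple case of
initialising a block of ints; the function names fill_indexed and
fill_pointer are made up for this sketch and are not taken from the code
discussed earlier in the thread.

```cpp
#include <cassert>
#include <cstddef>

// Index-based form: each store addresses a[i], i.e. base + i * sizeof(int).
// (Hypothetical name, for illustration only.)
void fill_indexed(int* a, std::size_t n, int value) {
    assert(a != nullptr || n == 0);
    for (std::size_t i = 0; i < n; ++i) {
        a[i] = value;
    }
}

// Pointer-incrementing form: the pointer itself is advanced on each
// iteration and compared against a one-past-the-end pointer.
// (Hypothetical name, for illustration only.)
void fill_pointer(int* a, std::size_t n, int value) {
    assert(a != nullptr || n == 0);
    for (int* p = a, *end = a + n; p != end; ++p) {
        *p = value;
    }
}
```

Both functions leave the array in the same state, and an optimising
compiler may well turn them into the same machine code, so any difference
has to be checked in the generated assembler or by measurement.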
> In other words, the CPU is designed in such a way as to make questions
> like this purely hypothetical. If it really *is* an issue then the best
> solution is to try both methods and profile them. However, I'd really
> be surprised to discover that this was a bottleneck in any program.
You are right: profiling is the final answer (any tips on good profiling
tools/methodologies?), but it would be nice to be able to do some level
of reasoning about efficiency.
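
As a starting point for the "try both and measure" approach, a very small
timing helper might look like the sketch below. It assumes a C++11 (or
later) compiler for std::chrono and lambdas; the name best_time_ns is
invented here, and taking the fastest of several runs is just one common
way to reduce noise, not the only sensible one.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical helper: call work() `repeats` times and return the fastest
// observed wall-clock duration of a single call, in nanoseconds.
template <typename Work>
std::int64_t best_time_ns(Work work, int repeats) {
    assert(repeats > 0);
    using clock = std::chrono::steady_clock;  // monotonic, suited to intervals
    std::int64_t best = -1;
    for (int r = 0; r < repeats; ++r) {
        const auto start = clock::now();
        work();
        const auto stop = clock::now();
        const std::int64_t ns =
            std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start)
                .count();
        if (best < 0 || ns < best) {
            best = ns;  // keep the least-disturbed run
        }
    }
    return best;
}
```

Here work would be a lambda wrapping whichever loop is being measured.
The array needs to be large enough, and its contents actually used
afterwards, or the optimiser may remove the very loop being timed.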
> Now that I think about it, all you're doing is initialising a block of
True. I should point out that this was only one of several array
addressing situations I want to analyse. I wanted to work out
the principles at work and this was the simplest case (which makes
reading the assembler code easier).