| home / programming / optimize / speedup / chap10 / 3 | [previous] [next] |
|
You can avoid the complex do/switch logic by unrolling Duff's
Device into two loops. So instead of the original, do this (see Listing
10.17):
function duffFastLoop8(iterations) {
// from an anonymous donor to Jeff Greenberg's site
var testVal=0;
var n = iterations % 8;
while (n--)
{
testVal++;
}
n = parseInt(iterations / 8);
while (n--)
{
testVal++;
testVal++;
testVal++;
testVal++;
testVal++;
testVal++;
testVal++;
testVal++;
}
}
This technique is about 36 percent faster than the original Duff's
Device on IE5 Mac. Even better, optimize the loop constructs by converting
the while decrement to a do while pre-decrement like
this (see Listing 10.18):
function duffFasterLoop8(iterations) {
var testVal=0;
var n = iterations % 8;
if (n>0) {
do
{
testVal++;
}
while (--n); // n must be greater than 0 here
}
n = parseInt(iterations / 8);
do
{
testVal++;
testVal++;
testVal++;
testVal++;
testVal++;
testVal++;
testVal++;
testVal++;
}
while (--n);
}
This optimized Duff's Device is 39 percent faster than the original
and 67 percent faster than a normal for loop (see Table 10.3).
|
500,125 Iterations |
Normal |
Duff's Device |
Duff's Fast |
Duff's Faster |
|
Total time (ms) |
1437 |
775 |
493 |
469 |
|
Cycle time (ms) |
0.00287 |
0.00155 |
0.00099 |
0.00094 |
To test the effect of different degrees of loop unrolling, I tested large iteration loops with between 1 and 15 identical statements for the Faster Duff's Device. Table 10.4 shows the results.
|
Duff's Faster |
1 Degree |
2 |
3 |
4 |
5 |
6 |
7 |
8 |
9 |
10 |
11 |
12 |
13 |
14 |
15 |
|
Total time (ms) |
925 |
661 |
576 |
533 |
509 |
490 |
482 |
469 |
467 |
457 |
453 |
439 |
437 |
433 |
433 |
|
Cycle time (ms) |
0.00184 |
0.00132 |
0.00115 |
0.00106 |
0.00101 |
0.00097 |
0.00096 |
0.00093 |
0.00093 |
0.00091 |
0.00090 |
0.00087 |
0.00087 |
0.00086 |
0.00086 |
As you can see in Table 10.4, the effect diminishes as the degree of
loop unrolling increases. Even after two statements, the time to loop
through many iterations is less than 50 percent of a normal for
loop. Around seven statements, the time is cut by two-thirds. Anything
over eight reaches a point of diminishing returns. Depending on your requirements,
I recommend that you choose to unroll critical loops by between four and
eight statements for Duff's Device.
If you have two loops in close proximity that use the same number of iterations (and don't affect each other), you can combine them into one loop. So instead of this:
for (i=0; i<j; i++) {
sumserv += serv(i);
}
for (i=0; i<j; i++) {
prodfoo *= foo(i);
}
Do this:
for (i=0; i<j; i++) {
sumserv += serv(i);
prodfoo *= foo(i);
}
Fusing loops avoids the additional overhead of another loop control structure and is more compact.
| home / programming / optimize / speedup / chap10 / 3 | [previous] [next] |
From Speed Up Your Site: Web Site Optimization, Chapter 10.
Copyright 2003. Reproduced by permission of New Riders.
Created: January 27, 2003
Revised: January 27, 2003
URL: http://webreference.com/programming/optimize/speedup/chap10/3/4.html