Getting (possibly) 500% speed gain on divisions
David Landman - 19/Aug/2005 (but it's still true!)
David Landman - 19/Aug/2005 (but it's still true!)
[SHOWTOGROUPS=4,20]
One of my fellow students once made a remark about the old days when a division was much faster if it was written as multiplication.
This means, instead of dividing with a 100, you multiply with 0.01 (1/100).
Because some programs of mine have a lot of divisions in their core loops, I investigated the difference.
I created the following test program.
Resulted:
On an old P3 900Mhz:
Odd Number division:
On a new P4 2.3 Ghz:
Odd Number division:
in my PC:
D:\RADStudioProjetos\Win32\Debug>Project2.exe (in Release mode a little bit difference to "minus")
Odd Number division:
D:\RADStudioProjetos\Win64\Debug>Project2.exe (in Release mode a little bit difference to "minus")
Odd Number division:
The results are variating, on simple numbers like 0.01 the speedup is allways working, but somehow the very complex numbers tend to be slower sometimes.
I use this tip allow when working with percentage.
[/SHOWTOGROUPS]
One of my fellow students once made a remark about the old days when a division was much faster if it was written as multiplication.
This means, instead of dividing with a 100, you multiply with 0.01 (1/100).
Because some programs of mine have a lot of divisions in their core loops, I investigated the difference.
I created the following test program.
Код:
program DivVsMult;
{$APPTYPE CONSOLE}
uses
SysUtils,
Windows;
var
Start,Stop, Start2, Stop2, Freq:int64;
i : integer;
t : real;
CpuSpeed : integer;
begin
{ TODO -oUser -cConsole Main : Insert code here }
{ Cpu Speed fastes cpu = 1 slower => 10
it's just to determin the number of time to do the loop
Maxint div CpuSpeed is calculated }
if ParamCount = 1 then
CpuSpeed := StrToIntDef(ParamStr(1),1)
else
CpuSpeed := 10;
Writeln('Simple Number division:');
Writeln('Calculating');
QueryPerformanceFrequency(freq);
QueryPerformanceCounter(Start);
for i:=0 to MaxInt div CpuSpeed do
t := i * (1/100);
QueryPerformanceCounter(Stop);
Writeln(Format('First Pass Result: %f',[t]));
{ This is needed because the compiler would optimize,
and would notice the result of the loop isn't used at all,
so therefor the result is useless.. so depending on the compiler, it will
choose what to do with it, this disables that optimization }
QueryPerformanceCounter(Start2);
for i:=0 to MaxInt div CpuSpeed do
t := i * (1/100);
QueryPerformanceCounter(Stop2);
Writeln(Format('Second Pass Result: %15.6f',[t]));
{ This is needed because the compiler would optimize,
and would notice the result of the loop isn't used at all,
so therefor the result is useless.. so depending on the compiler, it will
choose what to do with it, this disables that optimization }
Writeln('Done, Results:');
Writeln(Format('/ 100 Time: %6.4f seconds'+#13#10+
'/ 100 Clock: %d ticks'+#13#10+
'* 0.01 Time: %6.4f seconds'+#13#10+
'* 0.01 Clock: %d ticks',
[(Stop-Start) / freq,
(Stop-Start),
(Stop2-Start2) / freq,
(Stop2-Start2)]));
Writeln;
Writeln('Odd Number division:');
QueryPerformanceCounter(Start);
for i:=0 to high(i) div CpuSpeed do
t := i / 556;
QueryPerformanceCounter(Stop);
Writeln(Format('First Pass Result: %15.6f',[t]));
{ This is needed because the compiler would optimize,
and would notice the result of the loop isn't used at all,
so therefor the result is useless.. so depending on the compiler, it will
choose what to do with it, this disables that optimization }
QueryPerformanceCounter(Start2);
for i:=0 to high(i) div CpuSpeed do
t := i * (1/556);
QueryPerformanceCounter(Stop2);
Writeln(Format('Second Pass Result: %15.6f',[t]));
Writeln(Format('/ 556 Time: %6.4f seconds'+#13#10+
'/ 556 Clock: %d ticks'+#13#10+
'* (1/556) Time: %6.4f seconds'+#13#10+
'* (1/556) Clock: %d ticks',
[(Stop-Start) / freq,
(Stop-Start),
(Stop2-Start2) / freq,
(Stop2-Start2)]));
Writeln(Format(' (1/556) = %15.14f (approximate)',[1/556]));
// Readln;
end.
Resulted:
On an old P3 900Mhz:
Код:
Simple Number division:
Calculating
First Pass Result: 2147483,64
Second Pass Result: 2147483,64
Done, Results:
/ 100 Time: 10,3319 seconds
/ 100 Clock: 36983482 ticks
* 0.01 Time: 2,0378 seconds
* 0.01 Clock: 7294251 ticks
Odd Number division:
Код:
First Pass Result: 386238,0647482014610000
Second Pass Result: 386238,0647482014610000
Done, Results:
/ 556 Time: 10,0735 seconds
/ 556 Clock: 36058581 ticks
* (1/556) Time: 2,0446 seconds
* (1/556) Clock: 7318775 ticks
(1/556) = 0,0017985611510791 (approximate)
On a new P4 2.3 Ghz:
Код:
Simple Number division:
Calculating
First Pass Result: 2147483.64
Second Pass Result: 2147483.64
Done, Results:
/ 100 Time: 4.6227 seconds
/ 100 Clock: 16547055 ticks
* 0.01 Time: 1.0782 seconds
* 0.01 Clock: 3859508 ticks
Odd Number division:
Код:
First Pass Result: 386238.064748
Second Pass Result: 386238.064748
Done, Results:
/ 556 Time: 4.5820 seconds
/ 556 Clock: 16401425 ticks
* (1/556) Time: 12.1746 seconds
* (1/556) Clock: 43579366 ticks
(1/556) = 0.00179856115108 (approximate)
in my PC:
- Intel i7 4770K, 8GB RAM)
- Windows 10 1909 (18363.815) 64bits
- RAD Studio 10.3.3 Rio Arch
D:\RADStudioProjetos\Win32\Debug>Project2.exe (in Release mode a little bit difference to "minus")
Код:
Simple Number division:
Calculating
First Pass Result: 2147483,64
Second Pass Result: 2147483,640000
Done, Results:
/ 100 Time: 0,4676 seconds
/ 100 Clock: 4676032 ticks
* 0.01 Time: 0,4487 seconds
* 0.01 Clock: 4487480 ticks
Odd Number division:
Код:
First Pass Result: 386238,064748
Second Pass Result: 386238,064748
/ 556 Time: 0,9980 seconds
/ 556 Clock: 9979994 ticks
* (1/556) Time: 0,4545 seconds
* (1/556) Clock: 4544878 ticks
(1/556) = 0,00179856115108 (approximate)
D:\RADStudioProjetos\Win64\Debug>Project2.exe (in Release mode a little bit difference to "minus")
Код:
Simple Number division:
Calculating
First Pass Result: 2147483,64
Second Pass Result: 2147483,640000
Done, Results:
/ 100 Time: 0,4611 seconds
/ 100 Clock: 4610651 ticks
* 0.01 Time: 0,4490 seconds
* 0.01 Clock: 4490457 ticks
Odd Number division:
Код:
First Pass Result: 386238,064748
Second Pass Result: 386238,064748
/ 556 Time: 1,2865 seconds
/ 556 Clock: 12864881 ticks
* (1/556) Time: 0,4482 seconds
* (1/556) Clock: 4481993 ticks
(1/556) = 0,00179856115108 (approximate)
The results are variating, on simple numbers like 0.01 the speedup is allways working, but somehow the very complex numbers tend to be slower sometimes.
I use this tip allow when working with percentage.
[/SHOWTOGROUPS]
Последнее редактирование: