memblock vs dot vs dot with unlocked pixels [speed test]

Daniel TGC

Retired Moderator

Joined: 19th Feb 2007

Location: TGC

Posted: 28th Dec 2011 12:28 Edited at: 28th Dec 2011 12:33

Hi guys,

I've often heard that memblocks are super fast, but I don't know anyone who has actually gone out of their way to test their speed. So I pulled a fractal program from the snippets board and did some experiments and optimization on the code I found. I quickly discovered that moving the sync command out of the loop sped the program up by as much as 10 times. I then tested memblocks against the dot command, which gave three methods to compare:

1. Dot command without unlocking and locking pixels
2. Dot command with unlocking and locking pixels
3. Memblock command
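
For anyone wondering about the sync change mentioned above, here is a minimal sketch of the idea. The original snippet from the snippets board isn't shown in this thread, so where its sync sat is an assumption; the point is only that one sync after the loops replaces one per row (or per pixel).

```basic
rem Sketch only: where the original snippet called sync is an assumption.
sync on : sync rate 0
for y = 0 to 479
   for x = 0 to 639
      dot x, y, rgb(255, 255, 255)
   next x
   rem A sync placed here would refresh the whole screen once per row (slow).
next y
rem One sync after both loops shows the finished image in a single refresh.
sync
wait key
end
```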

The results are as follows:

1. DOT command without unlocking = 610ms to 630ms
2. DOT command with unlocking = 157ms to 177ms
3. Memblock method = 149ms to 151ms

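So it appears that memblocks are indeed a little bit faster than DOT with locked pixels (roughly 10% on these timings), and about four times faster than DOT without locking.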

But the question is, can you make this program run any faster?

Memblock version

Sync On
Sync : Sync
Global width as DWORD
Global height as DWORD
Global bit as DWORD
c as dword
make memblock from bitmap 1, 0
width = memblock dword(1, 0)
height = memblock dword(1, 4)
bit = memblock dword(1, 8)

time = timer()

for y= 0 to 479
   for x= 0 to 639
      a#=0.0
      c#=0.0
      s#=x/100.0-3.0
      t#=y/80.0-3.0
      n=0
      repeat
         inc n
         h#=s#*s#
         k#=t#*t#
         c#=s#*(h#-3.0*k#)+0.5
         a#=t#*(3.0*h#-k#)
         t#=a#
         s#=c#
      until n=16 or fastabs(c#)>9.0 or fastabs(a#)>9.0
      if fastabs(c#)<9.0 or fastabs(a#)<9.0
         select n
            case 1 : c=rgb(255,255,255) : endcase
            case 2 : c=rgb(255,255,0): endcase
            case 3 : c=rgb(255,0,255): endcase
            case 4 : c=rgb(0,255,255): endcase
            case 5 : c=rgb(255,0,0): endcase
            case 6 : c=rgb(0,255,0): endcase
            case 7 : c=rgb(0,0,255): endcase
            case 8 : c=rgb(255,200,0): endcase
            case 9 : c=rgb(0,255,200): endcase
            case default : c=rgb(0,0,0) : endcase
         endselect
         memloc = XYConvert(x, y)
         write memblock dword 1, memloc, c
      endif
   next x
next y

make bitmap from memblock 0, 1
sync
totaltime = timer() - time

Set Window Title str$(totaltime)
wait key
end

function fastabs(n#)
   if n# < 0 then n# = -n#
endfunction n#

Function XYConvert(a, b)
    Offset = 12
    a = a * 4
    b = b * width * 4
    output = Offset + a + b
EndFunction output
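
A note on what XYConvert is doing: a bitmap memblock starts with a 12-byte header (width, height and depth as three dwords), followed by the pixels, 4 bytes each at 32-bit depth. So pixel (10, 2) on a 640-wide bitmap sits at 12 + 2*640*4 + 10*4 = 5172. The sketch below is an untested variation, not part of the timed program: it works out each row's base offset once instead of calling a function per pixel. The names rowbytes and rowbase are made up for illustration, and it assumes a 32-bit 640x480 display like the code above.

```basic
rem Sketch: fill the screen through a memblock, computing each row's
rem base offset once per row (assumes a 32-bit, 640x480 display).
sync on : sync rate 0
sync : sync
make memblock from bitmap 1, 0
width = memblock dword(1, 0)

c as dword
c = rgb(255, 0, 0)

rem 12-byte header (width, height, depth), then 4 bytes per pixel
rowbytes = width * 4
for y = 0 to 479
   rowbase = 12 + y * rowbytes
   for x = 0 to 639
      write memblock dword 1, rowbase + x * 4, c
   next x
next y

make bitmap from memblock 0, 1
sync
wait key
end
```

Whether this actually beats the function call would need timing on your own machine.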

Dot with Unlocked pixels

sync on : sync rate 0
time = timer()
sync : sync

c as dword

for y= 0 to 479
   lock pixels
   for x= 0 to 639
      a#=0.0
      c#=0.0
      s#=x/100.0-3.0
      t#=y/80.0-3.0
      n=0
      repeat
         inc n
         h#=s#*s#
         k#=t#*t#
         c#=s#*(h#-3.0*k#)+0.5
         a#=t#*(3.0*h#-k#)
         t#=a#
         s#=c#
      until n=16 or fastabs(c#)>9.0 or fastabs(a#)>9.0
      if fastabs(c#)<9.0 or fastabs(a#)<9.0
         select n
            case 1 : c=rgb(255,255,255) : endcase
            case 2 : c=rgb(255,255,0): endcase
            case 3 : c=rgb(255,0,255): endcase
            case 4 : c=rgb(0,255,255): endcase
            case 5 : c=rgb(255,0,0): endcase
            case 6 : c=rgb(0,255,0): endcase
            case 7 : c=rgb(0,0,255): endcase
            case 8 : c=rgb(255,200,0): endcase
            case 9 : c=rgb(0,255,200): endcase
            case default : c=rgb(0,0,0) : endcase
         endselect
         dot x,y,c
      endif
   next x
   unlock pixels
next y
sync
totaltime = timer() - time
Set Window Title str$(totaltime)
wait key
end

function fastabs(n#)
   if n# < 0 then n# = -n#
endfunction n#

Dot without unlocked pixels

sync on : sync rate 0
time = timer()
sync : sync

c as dword

for y= 0 to 479
   for x= 0 to 639
      a#=0.0
      c#=0.0
      s#=x/100.0-3.0
      t#=y/80.0-3.0
      n=0
      repeat
         inc n
         h#=s#*s#
         k#=t#*t#
         c#=s#*(h#-3.0*k#)+0.5
         a#=t#*(3.0*h#-k#)
         t#=a#
         s#=c#
      until n=16 or fastabs(c#)>9.0 or fastabs(a#)>9.0
      if fastabs(c#)<9.0 or fastabs(a#)<9.0
         select n
            case 1 : c=rgb(255,255,255) : endcase
            case 2 : c=rgb(255,255,0): endcase
            case 3 : c=rgb(255,0,255): endcase
            case 4 : c=rgb(0,255,255): endcase
            case 5 : c=rgb(255,0,0): endcase
            case 6 : c=rgb(0,255,0): endcase
            case 7 : c=rgb(0,0,255): endcase
            case 8 : c=rgb(255,200,0): endcase
            case 9 : c=rgb(0,255,200): endcase
            case default : c=rgb(0,0,0) : endcase
         endselect
         dot x,y,c
      endif
   next x
next y
sync
totaltime = timer() - time
Set Window Title str$(totaltime)
wait key
end

function fastabs(n#)
   if n# < 0 then n# = -n#
endfunction n#

My system specs here:

+ Code Snippet

Windows 7 64-bit Premium
Sempron 145 @ 2.8GHz core unlocked to AMD Athlon II X2 4400e @ 3.06GHz
nVidia 8800GTS 320MB

Max P

Joined: 23rd Jan 2010

Location:

Posted: 28th Dec 2011 17:25 Edited at: 28th Dec 2011 17:30

Putting the lock pixels/unlock pixels outside the y loop will be a bit faster.

sync on : sync rate 0
time = timer()
sync : sync

c as dword
lock pixels
for y= 0 to 479
   for x= 0 to 639
      a#=0.0
      c#=0.0
      s#=x/100.0-3.0
      t#=y/80.0-3.0
      n=0
      repeat
         inc n
         h#=s#*s#
         k#=t#*t#
         c#=s#*(h#-3.0*k#)+0.5
         a#=t#*(3.0*h#-k#)
         t#=a#
         s#=c#
      until n=16 or fastabs(c#)>9.0 or fastabs(a#)>9.0
      if fastabs(c#)<9.0 or fastabs(a#)<9.0
         select n
            case 1 : c=rgb(255,255,255) : endcase
            case 2 : c=rgb(255,255,0): endcase
            case 3 : c=rgb(255,0,255): endcase
            case 4 : c=rgb(0,255,255): endcase
            case 5 : c=rgb(255,0,0): endcase
            case 6 : c=rgb(0,255,0): endcase
            case 7 : c=rgb(0,0,255): endcase
            case 8 : c=rgb(255,200,0): endcase
            case 9 : c=rgb(0,255,200): endcase
            case default : c=rgb(0,0,0) : endcase
         endselect
         dot x,y,c
      endif
   next x
next y
unlock pixels
sync
totaltime = timer() - time
Set Window Title str$(totaltime)
wait key
end

function fastabs(n#)
   if n# < 0 then n# = -n#
endfunction n#

Results:
1. DOT command without unlocking = 1051 ms
2. DOT command with unlocking = 208 ms
3. Memblock method = 210 ms
4. DOT command with unlocking (lock outside loop) = 201 ms

Quote: "So it appears that memblocks are indeed a little bit faster."

Seems like locking the pixels is faster for me.

MrValentine

AGK Backer

Joined: 5th Dec 2010

Playing: FFVII

Posted: 28th Dec 2011 20:42

How does Occlusion culling work?

How about fastsync?

WLGfx

Joined: 1st Nov 2007

Location: NW United Kingdom

Posted: 28th Dec 2011 21:35

Some interesting results.

Kevin Picone

Joined: 27th Aug 2002

Location: Australia

Posted: 28th Dec 2011 22:41

+ Code Snippet

     until n=16 or fastabs(c#)>9.0 or fastabs(a#)>9.0
      if fastabs(c#)<9.0 or fastabs(a#)<9.0

With stuff like this, you can potentially save yourself some call overhead by calculating the value(s) once per pass instead of calling the function in both the until and the if.

e.g.

+ Code Snippet


     abs_c#=fastabs(c#)
     abs_a#=fastabs(a#)

     until n=16 or abs_c#>9.0 or abs_a#>9.0
      if abs_c#<9.0 or abs_a#<9.0

      endif

Inlining the abs function might be better here:

+ Code Snippet


     abs_c#=c#
     if c#<0 then abs_c#=-c#

     rem or possibly it's quicker in DBPro like so
     if a#<0
        abs_a#=-a#
     else
        abs_a#=a#
     endif

     until n=16 or abs_c#>9.0 or abs_a#>9.0
      if abs_c#<9.0 or abs_a#<9.0

You could try rearranging the comparisons, as there are bound to be some variations that are more optimal in DBPro than others. Boolean expressions might be better; it just depends on the compiler again.

The repeat until can probably be replaced with a For/next and unrolled a bit.

+ Code Snippet


      rem counting from 1 keeps n equal to the iteration count, as in the repeat/until
      for n=1 to 16
         h#=s#*s#
         k#=t#*t#
         c#=s#*(h#-3.0*k#)+0.5
         a#=t#*(3.0*h#-k#)
         t#=a#
         s#=c#

         abs_c#=fastabs(c#)
         abs_a#=fastabs(a#)

         if abs_c#>9.0 or abs_a#>9.0 then exit
      next n

By stepping in groups of 2, say, you can avoid some more loop overhead. Groups of 4 are often a happy medium between code size and loop overhead.

+ Code Snippet


      rem two iterations per pass; n is the count of the first one in each pair
      for n=1 to 16 step 2
         h#=s#*s#
         k#=t#*t#
         c#=s#*(h#-3.0*k#)+0.5
         a#=t#*(3.0*h#-k#)
         t#=a#
         s#=c#

         abs_c#=fastabs(c#)
         abs_a#=fastabs(a#)

         if abs_c#>9.0 or abs_a#>9.0 then exit

         h#=s#*s#
         k#=t#*t#
         c#=s#*(h#-3.0*k#)+0.5
         a#=t#*(3.0*h#-k#)
         t#=a#
         s#=c#

         abs_c#=fastabs(c#)
         abs_a#=fastabs(a#)

         if abs_c#>9.0 or abs_a#>9.0
            rem second iteration of the pair, so bump n to keep the count right
            inc n
            exit
         endif
      next n

An array is probably better for the colour table than a select/case, particularly as DBPro doesn't precalculate function calls with constant arguments. So RGB() is executed every time, even though the function is producing a constant result...

So this stuff

+ Code Snippet

         select n
            case 1 : c=rgb(255,255,255) : endcase
            case 2 : c=rgb(255,255,0): endcase
            case 3 : c=rgb(255,0,255): endcase
            case 4 : c=rgb(0,255,255): endcase
            case 5 : c=rgb(255,0,0): endcase
            case 6 : c=rgb(0,255,0): endcase
            case 7 : c=rgb(0,0,255): endcase
            case 8 : c=rgb(255,200,0): endcase
            case 9 : c=rgb(0,255,200): endcase
            case default : c=rgb(0,0,0) : endcase
         endselect
         dot x,y,c

becomes something like this:

+ Code Snippet

      if abs_c#<9.0 or abs_a#<9.0
         dot x,y,Palette(n)
      endif

etc.
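
The Palette() array isn't defined anywhere in the thread, so here is one way it might be set up, using the same colours as the select/case. This is a sketch: the array size and the loop variable i are assumptions. It is sized 0 to 17 so that any of the loop variants above can index it safely, with every unused slot set to black like the default case.

```basic
rem Build the colour table once, before the x/y loops.
dim Palette(17) as dword

rem Anything not listed below is black, matching "case default".
for i = 0 to 17
   Palette(i) = rgb(0, 0, 0)
next i

Palette(1) = rgb(255, 255, 255)
Palette(2) = rgb(255, 255, 0)
Palette(3) = rgb(255, 0, 255)
Palette(4) = rgb(0, 255, 255)
Palette(5) = rgb(255, 0, 0)
Palette(6) = rgb(0, 255, 0)
Palette(7) = rgb(0, 0, 255)
Palette(8) = rgb(255, 200, 0)
Palette(9) = rgb(0, 255, 200)
```

With that in place, RGB() runs 27 times at startup rather than once per pixel drawn.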
