Speeding up loops?

DaveW

Active Member
Licensed User
Longtime User
As part of my program I have to convert an array of data into an image. The code works fine, but because of the large number of points in the array the process takes a very long time. One medium-sized test file is 43200 points (360 x 120), and on my device it takes 233 seconds (nearly 4 min) to go through the data! This is not really acceptable, and I wondered if anyone has any bright ideas for reducing the time. The only thing I can think of that might help is transferring the code to a small DLL. I am prepared to give that a go, but only if it really will help (I have never written DLLs before).

The code I am using is:
B4X:
   'palette is an imagelist of 1 x 200 pixel images for the colorschemes
   drawer.New2(Img.Image,B4PObject(5))
   brush1.New1(cBlack)
   bX = 0 'the position of a block to draw on the image
   bY = 0
   bw = 1 'the size of the block drawn (each datapoint needs to be drawn as 1 pixel wide & 4 high)
   bh = 4
   Cols = arraylen(dataarray(),1) 'the array of source values
   Rows = arraylen(dataarray(),2)
   For y = 0 To Rows-1
       label1.Text = " Drawing row " & y & " of " & Rows
       form1.Refresh 'slows it down but when you are waiting 4 mins you need some feedback!
       For x = 0 To Cols-1
           brush1.Color = palette.Pixel(ColorScheme,0,dataarray(x,y))
           drawer.FillRectangle2(brush1.Value,bx,by,bw,bh)
           bX = bX + bw
       Next
       bY = bY + bh
       bX = 0
   Next

By the way, I have tried using the FastSetPixel method but that turned out to be slower than drawing a rectangle.

David.
 

agraham

Expert
Licensed User
Longtime User
The only obvious change I would suggest is moving the colour fetching out of the inner loop, as that is probably expensive. You might gain a little by not informing the user so often, but most of the time is taken in the inner loop and there is nothing else to move out.
B4X:
' prefetch the colours
  Dim PixelCols(palette.Count) ' actually a ReDim, the original is in Globals
  For i = 0 To palette.Count - 1
    PixelCols(i) = palette.Pixel(ColorScheme,0,i)
  Next
  For y = 0 To Rows-1
    If y Mod 10 = 0 Then ' inform the user less often
      label1.Text = " Drawing row " & y & " of " & Rows
      DoEvents ' "might" be quicker than a full refresh
    End If
    For x = 0 To Cols-1
      brush1.Color = PixelCols(dataarray(x,y))
      drawer.FillRectangle2(brush1.Value,bx,by,bw,bh)
      bX = bX + bw
    Next
    bY = bY + bh
    bX = 0
  Next
 

DaveW

Hi Andrew,

I can't move the colour fetch out as it is different for every datapoint. I tried stripping everything out - so only the empty loop was left and just that takes 50% of the time. So it seems to me that the only option is to use something other than loops in B4PPC - which is why I suggested making a DLL. I just don't know if that would really make a difference - and how hard it would be.

David.
 

agraham

"I can't move the colour fetch out as it is different for every datapoint."
I meant pre-fetch all 200 colours? Did you look closely at my amended code?

"I tried stripping everything out - so only the empty loop was left and just that takes 50% of the time."
Maths, and therefore loops, do take a long time in Basic4ppc, especially on the device.


"I just don't know if that would really make a difference"
Dramatically with direct bitmap manipulation. My ImageLibEx gives you an idea. BitmapEx.Mirror, Flip, RotateLeft and RotateRight are done by direct bitmap manipulation and are an indication as to the maximum sort of speed that might be obtained.

"and how hard it would be."
If you need to ask ....!
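
To illustrate the idea of direct bitmap manipulation mentioned above, here is a minimal sketch in Python rather than Basic4ppc or C#. It is not the code inside ImageLibEx or any Basic4ppc library. The names `render_blocks`, `data` and `table` are hypothetical, and the data is assumed to be stored row by row (the thread's `dataarray(x,y)` is indexed column first). The point is only that each output scanline is built once and copied into a flat pixel buffer, with no per-point drawing call.

```python
def render_blocks(data, table, bw=1, bh=4):
    """Expand a grid of palette indices into a flat pixel buffer.

    data  : list of rows, each a list of palette indices (hypothetical layout)
    table : list mapping palette index -> colour value
    bw/bh : block width and height in pixels for each data point
    """
    rows = len(data)
    cols = len(data[0])
    width = cols * bw
    height = rows * bh
    pixels = [0] * (width * height)
    for y, row in enumerate(data):
        # Build one scanline for this data row...
        line = []
        for value in row:
            line.extend([table[value]] * bw)
        # ...and copy it into the bh output rows it covers.
        for dy in range(bh):
            start = (y * bh + dy) * width
            pixels[start:start + width] = line
    return width, height, pixels


width, height, pixels = render_blocks([[0, 1], [2, 3]], [10, 11, 12, 13])
print(width, height, len(pixels))
```

In a real library the buffer would be the bitmap's own memory, which is where most of the speed-up would be expected to come from.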
 

DaveW

Oops, sorry, I did not study your code. However, I had done a lot of testing and already knew that any tweaking would have only a minor effect. The loop itself is a major factor.

I have been looking at other threads discussing SharpDevelop, so I think I will give that a go. Wish me luck!

David.
 

DaveW

I just tried your suggestion of prefetching the colours and it was about 4% slower than the old direct method! Using DoEvents instead of form1.Refresh shaved 16% off the time.
 

agraham

"I just tried your suggestion of prefetching the colours and it was about 4% slower than the old direct method!"
I'm utterly astonished, having looked at the amount of low level code that is executed for each call of palette.Pixel. You are doing this optimised compiled, aren't you?

The old way does 43200 2D array lookups, 43200 calls to palette.Pixel including setting the parameters and 43200 assignments to brush1.Color.
brush1.Color = palette.Pixel(ColorScheme,0,dataarray(x,y))


My suggestion does 200 calls to palette.Pixel including setting the parameters and 200 1D array assignments.
PixelCols(i) = palette.Pixel(ColorScheme,0,i)

The inner loop now does 43200 2D array lookups, 43200 1D array lookups instead of calls to palette.Pixel, and 43200 assignments to brush1.Color.
brush1.Color = PixelCols(dataarray(x,y))

I really don't understand how a 1D array lookup can be slower than a method call to palette.Pixel.
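
The call counts above can be checked with a small sketch. This is Python, not Basic4ppc, and `palette_pixel` is a hypothetical stand-in for palette.Pixel that only counts how often it is called and returns a dummy colour. The grid is 360 x 120 with values in 0..199, matching the figures quoted in the thread.

```python
calls = 0  # number of times the stand-in for palette.Pixel is invoked


def palette_pixel(scheme, x, y):
    """Hypothetical stand-in for palette.Pixel: counts calls, returns a dummy colour."""
    global calls
    calls += 1
    return (scheme * 1000 + y) & 0xFFFFFF


def colours_direct(data, scheme):
    # One palette call per data point, as in the original loop.
    return [[palette_pixel(scheme, 0, v) for v in row] for row in data]


def colours_prefetched(data, scheme, palette_size=200):
    # One palette call per palette entry, then plain list lookups.
    table = [palette_pixel(scheme, 0, i) for i in range(palette_size)]
    return [[table[v] for v in row] for row in data]


cols, rows = 360, 120
data = [[(x + y) % 200 for x in range(cols)] for y in range(rows)]

calls = 0
direct = colours_direct(data, 1)
direct_calls = calls

calls = 0
prefetched = colours_prefetched(data, 1)
prefetched_calls = calls

print(direct_calls, prefetched_calls)  # 43200 200
print(direct == prefetched)            # True
```

Both versions produce the same colours. Whether the saved calls translate into saved time depends on how expensive the call is compared with the rest of the loop body, which has to be measured on the target.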
 

DaveW

The results were based on comparisons running in the IDE on the PC, not optimized compiled code on the device. I was just looking at relative values rather than absolute ones, so I did not consider it significant (though perhaps it would be?). Also, the average difference on repeated runs was probably less; however, on the whole, my original method was slightly faster than your new one.
 

agraham

"though perhaps it would be?"
It certainly would be. The overheads when optimised compiled are far lower than when running in the IDE. Also all maths in Basic4ppc is done in double-length floating point and desktops have hardware floating point whereas devices have to do it in software. This makes any reduction in floating point operations far more important on a device than on the desktop. You really should be doing this optimised compiled on a device to get valid measurements if that is where your app will end up running.
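
As a rough sketch of how such measurements might be taken, assuming Python purely for illustration (the helper name `time_runs` is made up): run the same routine several times and look at the best and median times, not a single run. The same pattern applies whatever the language, as long as it is run in the environment where the app will actually be used.

```python
import statistics
import time


def time_runs(fn, repeats=5):
    """Run fn several times and return (best, median) wall-clock seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return min(timings), statistics.median(timings)


def workload():
    # Placeholder for the drawing loop being compared.
    total = 0
    for i in range(43200):
        total += i % 200
    return total


best, median = time_runs(workload)
print("best %.6f s, median %.6f s" % (best, median))
```

A difference of a few percent between two variants is only meaningful if it is larger than the spread between repeated runs of the same variant.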
 

DaveW

"Try this library and demo app. It fills apparently instantaneously on my HTC Diamond"
Thanks Andrew, I will take a look.

However I have written my own DLL (really hacked one of yours - I hope you don't mind ;)) to do the display. As in your case it seems to be almost instantaneous. I have it working on the PC but have yet to try it on the device.
 

DaveW

FYI, the device version of the DLL works - and makes a big difference. Before, the device was taking about 3 mins to create the image; now it's < 20 sec. The other minute was spent reading the XML file - so now I will have to try to fold that into the DLL as well! Not bad I think for a first stab at C# & DLLs - and in less than 24 hrs :)
 