B4J Question Testing Performance Between Types and Maps--You Won't Believe The Results!!!

cklester

Well-Known Member
Licensed User
How's that for a clickbait title?! šŸ¤£

OK, now down to serious business:

I've got an app I'm building where I could easily make some portions of the data as a Type() or store it in a map.

I'll either have a map of Types() or a map of maps.

I wanted to know which one was more performant, as time will be of the essence in some use-cases.

Here's the little test program I wrote:

(To run it, just "File>New>Console (Non-UI)," and paste the above code in the "Main" source. Then press "F5.")

B4X:
'Non-UI application (console / server application)
#Region Project Attributes
    #CommandLineArgs:
    #MergeLibraries: True
#End Region

Sub Process_Globals
    Type Asset (best_bid As Float, best_ask As Float, lastPrice As Float)
End Sub

'pulled from mcqueccu's post, here: https://www.b4x.com/android/forum/threads/b4x-uuid-version-4-generator.110975/#content
Sub UUIDv4 As String
    Dim sb As StringBuilder
    sb.Initialize
    For Each stp As Int In Array(8, 4, 4, 4, 12)
        If sb.Length > 0 Then sb.Append("-")
        For n = 1 To stp
            Dim c As Int = Rnd(0, 16)
            If c < 10 Then c = c + 48 Else c = c + 55
            If sb.Length = 19 Then c = Asc("8")
            If sb.Length = 14 Then c = Asc("4")
            sb.Append(Chr(c))
        Next
    Next
    Return sb.ToString.ToLowerCase
End Sub

'slightly modified from Peter's code, here: https://www.b4x.com/android/forum/threads/randomly-shuffle-a-string-array.39435/#content
Public Sub ShuffleArray(IntArray() As Int) As Int()
    Dim ArrayVal As String
    Dim Random As Int
    For i = 0 To IntArray.Length - 1
        Random = Rnd(i, IntArray.Length)
        ArrayVal = IntArray(i)
        IntArray(i) = IntArray(Random)
        IntArray(Random) = ArrayVal
    Next
    Return IntArray
End Sub

'create one million assets with random bids/asks/prices
'build one map with types, one with maps
'perform a test of map.get() and pulling all the data into a StringBuilder string
'see who is fastest...

Sub AppStart (Args() As String)
    Private testMapWithType As Map
    Private testMapWithMap As Map

    Private testRunsCount As Int = 999999

    Private numbers(1000000) As Int

    Private listOfNames As List
    listOfNames.Initialize

    testMapWithMap.Initialize
    testMapWithType.Initialize

    Private i As Int

    Private timeStart As Long
    Private timeEnd As Long

    'build the one million assets  
    timeStart = DateTime.Now
    For i = 0 To 999999
        Private assetName As String
        Private thisAsset As Asset
        Private AssetMap As Map
        AssetMap.Initialize
      
        Private best_ask, best_bid, lastPrice As Float
      
        best_ask = Rnd(1,999999)
        best_bid = best_ask * 0.99
        lastPrice = ( best_ask + best_bid ) / 2
      
        thisAsset.best_ask = best_ask
        thisAsset.best_bid = best_bid
        thisAsset.lastPrice = lastPrice

        AssetMap.Put("best_ask",best_ask)
        AssetMap.Put("best_bid",best_bid)
        AssetMap.Put("lastPrice",lastPrice)
      
        assetName = UUIDv4
      
        Do While testMapWithMap.ContainsKey(assetName)
            assetName = UUIDv4
        Loop

        listOfNames.Add(assetName)
        numbers(i) = i
        testMapWithMap.Put(assetName,AssetMap)
        testMapWithType.Put(assetName,thisAsset)
    Next
    timeEnd = DateTime.Now
    Log("Built one million unique assets")
    Log(timeStart)
    Log(timeEnd)
    Log(timeEnd- timeStart)

    numbers = ShuffleArray(numbers)
    Private assetNamesList As List
    assetNamesList.Initialize

    'shuffle all the asset names
    For i= 0 To testRunsCount
        assetNamesList.Add(listOfNames.Get(numbers(i)))
    Next
  
    Log("Testing Maps with Types")
    Private newAsset As Asset
    Private newMap As Map
    newMap.Initialize
  
    Private disp As StringBuilder
    disp.Initialize
  
    timeStart = DateTime.Now
    For i=0 To testRunsCount
        'I suspect this is faster than disp.Remove(0,disp.Length), but don't know for sure...
        disp.Initialize
        newAsset = testMapWithType.Get(assetNamesList.get(i))
        disp.Append(assetName).Append(" -> ").Append(newAsset.best_ask).Append(" / ").Append(newAsset.best_bid).Append(" / ").Append(newAsset.lastPrice)
    Next
    timeEnd = DateTime.Now
    Log($"Length of disp: ${disp.Length}"$)
    Log($"Last disp: ${disp.ToString}"$)
    Log($"Last asset: ${assetName}"$)
    Log(timeStart)
    Log(timeEnd)
    Log(timeEnd - timeStart)

    Log("Testing Maps with Maps")
    disp.Initialize

    timeStart = DateTime.Now
    For i=0 To testRunsCount
        disp.Initialize
        newMap = testMapWithMap.Get(assetNamesList.get(i))
        disp.Append(assetName).Append(" -> ").Append(newMap.get("best_ask")).Append(" / ").Append(newMap.Get("best_bid")).Append(" / ").Append(newMap.Get("lastPrice"))
    Next
    timeEnd = DateTime.Now
    Log($"Length of disp: ${disp.Length}"$)
    Log($"Last disp: ${disp.ToString}"$)
    Log($"Last asset: ${assetName}"$)
    Log(timeStart)
    Log(timeEnd)
    Log(timeEnd - timeStart)
  
End Sub

OK, so a couple of issues:
  1. Why does timeStart and timeEnd never change values?! I am assigning a new DateTime.Now value to each variable each time.
  2. There is a discrepancy in the results. The displayed final string is different for both runs, but they should be the same!
  3. Also, some values are more precise in one set than the other! Looks like maps aren't holding the precise value.
Those issues are somewhat important, because precision is important, but what is most important is that I have the timing correct.

If the timing is correct, the results show that it does not matter (for a dataset of one million items) whether or not you store Types() in the map or maps in the map. You can retrieve the entire dataset in less than a second...

Were you surprised?! And impressed?! (Hopefully there are no glaring bugs that will invalidate the results.)

It would be great if someone would review the code for correctness and try to fix the glitches... šŸ˜ I'm sure there is much to learn!

Thank you!

First Bug Fixed! - DateTIme.Now is a Long, not a Float
 
Last edited:

agraham

Expert
Licensed User
Longtime User
I'm primarily testing whether or not it is faster to extract fields of data from a type or from a map
Extracting fields from a type will be faster than getting values for a key in a map. Accessing a field in a type is a direct indexing operation whereas a map is at least a two step operation in finding the key through its hash code and then getting it's value.
 
Upvote 1

William Lancee

Well-Known Member
Licensed User
Longtime User
Hello
Your test is impressive and seductive, BUT...

1. You are testing whether a map of 5 items is slower than a type of 5 items.
2. the million part is irrelevant except to magnify any difference
3. Type is not really a computational issue, it is an organizing factor for the compiler
4. You are really testing how fast is a Map of 5 items. I am sure that such a small map does not use hashing, and is very fast.
5. Your results confirm this

Thank you for the exercise.
 
Upvote 0

cklester

Well-Known Member
Licensed User
Hello
Your test is impressive and seductive, BUT...

1. You are testing whether a map of 5 items is slower than a type of 5 items.

I'm primarily testing whether or not it is faster to extract fields of data from a type or from a map. The results seem to show extracting from a type is slightly faster.

2. the million part is irrelevant except to magnify any difference

Yes, I knew the transactions were going to be fast (microseconds), so I iterate over enough times to see a difference.

3. Type is not really a computational issue, it is an organizing factor for the compiler
4. You are really testing how fast is a Map of 5 items. I am sure that such a small map does not use hashing, and is very fast.
5. Your results confirm this

Thank you for the exercise.

Thank you for the feedback!
 
Upvote 0

cklester

Well-Known Member
Licensed User
Extracting fields from a type will be faster than getting values for a key in a map. Accessing a field in a type is a direct indexing operation whereas a map is at least a two step operation in finding the key through its hash code and then getting it's value.

Ah, interesting. So, then, it's just a matter of determining the utility of one vs the other per context.
 
Upvote 0
Top