B4J Library [B4X][ListOfArrays] wLOAExtras.b4xlib for Evaluating Expressions of Columns in ListOfArrays - and More

The wLOAExtras library has two classes:
wLOAEval: class modeled after https://www.b4x.com/android/forum/threads/b4x-eval-expressions-evaluator.54629/
wLOAExtras: the works

wLOAExtras has 3 evaluation methods: EvalReplace, EvalAppend and EvalCreate.
They are the same except for where the results end up: as a replacement of the original columns, appended as extra columns, or in a new LOA.
Think about the evaluator as an interpreter for single lines of B4X code.
For example, if 'wLOA' is an instance of wLOAExtras and A, B, C and D are names of columns of a LOA:
B4X:
Dim LOA2 As ListOfArrays = wLOA.evalCreate(LOA1, Array(Result), "A + B * (C - D)")
The arguments are as follows:
1. The source LOA that holds all the data
2. A columns specification which is very flexible
- Column names As List
- Column names As Array of Objects as in above example
- Column name As String will be converted to Array(theString)
- Null or "" which will be converted to Array(), which signifies "all columns in the LOA"
3. The expression to be evaluated
- The operators can be any of + - * / & mod (surrounded by spaces - mod is internally changed to "|")
- literal strings - surrounded by SINGLE quotes - are turned into tokens as a first step
- numerical values are treated as expected
- The function calls can be any of the standard native B4X plus some extras
Min Max Sin Cos Tan Abs Floor Ceil Round Sqrt ATan Power Logarithm And Nand Or Nor Not
Asc Chr Length Substring2 Substring Replace ToLowerCase ToUpperCase ToCamelCase ToTitleCase Contains
Indexof Indexof2 Lastindex Lastindexof2 Standardize Normalize DegreesToRadians RadiansToDegrees

- Note that Min Max And Nand Or Nor can have more than two arguments
- If a specified operation or function cannot be applied to its operands, the result will be the Null object.
- All angles in trigonometry functions will be in degrees.
- All ? in the expression will be replaced by the name of the current column being processed
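
To make the "?" placeholder concrete, here is a minimal sketch in plain B4X of how one expression template could expand into one expression per column. ExpandTemplate and DemoExpand are hypothetical helpers written for this illustration. They are not part of wLOAExtras and do not show how the library does it internally.

```b4x
'Sketch only: ExpandTemplate is a hypothetical helper, not a wLOAExtras method.
'It shows how a single template yields one expression per processed column.
Sub ExpandTemplate(Template As String, ColNames As List) As List
    Dim expanded As List
    expanded.Initialize
    For Each colName As String In ColNames
        'every ? becomes the name of the column being processed
        expanded.Add(Template.Replace("?", colName))
    Next
    Return expanded
End Sub

Sub DemoExpand
    Dim exprs As List = ExpandTemplate("(? - Min(?, 10)) * 2", Array("A", "B"))
    Log(exprs.Get(0)) '(A - Min(A, 10)) * 2
    Log(exprs.Get(1)) '(B - Min(B, 10)) * 2
End Sub
```

So with a column spec of A and B, the same template is evaluated once with A and once with B.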

wLOAExtras has 5 convenient creation functions - particularly useful for creating test LOAs.
They are: Constants, Sequence, Collect, RandomInt, RandomDouble
You specify the shape of the table: nrows and ncols.
Collect takes what is provided in the Array(...) and cycles through the items to fill the table.

You can also shuffle the items afterwards in place with ShuffleRows, ShuffleCols

wLOAExtras has methods for selecting rows. A new LOA with new rows is created:
a. when an evaluation expression (as specified above) is true. [RowsIfTrue]
b. when all items in a row are valid: non-Null and with an object type equal to
the column's predominant object type (more than 95% of items have this type). [RowsIfValid]
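
The "predominant object type" rule can be sketched in plain B4X as follows. PredominantType is a hypothetical helper for illustration only. The real RowsIfValid may work differently, and the handling of Null items here is my assumption.

```b4x
'Hypothetical helper, not part of wLOAExtras.
'Returns the type name shared by more than 95% of the items, or "" if there is none.
Sub PredominantType(Items As List) As String
    If Items.Size = 0 Then Return ""
    Dim counts As Map
    counts.Initialize
    For Each item As Object In Items
        Dim itemType As String = "Null"
        'GetType must not be called on Null, so Null items are counted separately
        If item <> Null Then itemType = GetType(item)
        counts.Put(itemType, counts.GetDefault(itemType, 0) + 1)
    Next
    For Each candidate As String In counts.Keys
        If candidate <> "Null" And counts.Get(candidate) / Items.Size > 0.95 Then Return candidate
    Next
    Return ""
End Sub
```

An item would then count as valid when it is not Null and GetType(item) equals the value returned for its column.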

You can also create new LOAs by selecting columns from another LOA (in any order), or by excluding columns (keeping order): SelectCols and ExcludeCols

Two functions are extremely useful in statistical work and machine learning projects. Both produce new LOAs with new rows:
a. Recode: changes a value into another using a Map. The newly created LOA has new rows.
b. One-Hot: transforms one column of categorical data into multiple columns, one per category, each holding 1 where the value equals that category and 0 otherwise
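
For readers new to these two transformations, here is a rough sketch of both ideas on plain Lists. RecodeList and OneHotList are hypothetical helpers, not wLOAExtras methods, and passing unmapped values through unchanged is an assumption made for the sketch.

```b4x
'Hypothetical helpers on plain Lists, not wLOAExtras methods.

'Recode: replace each value that is a key in Codes by its mapped value.
Sub RecodeList(Values As List, Codes As Map) As List
    Dim result As List
    result.Initialize
    For Each v As String In Values
        result.Add(Codes.GetDefault(v, v)) 'unmapped values pass through (an assumption)
    Next
    Return result
End Sub

'One-hot: builds one List of 0/1 flags per category, keyed by category name.
Sub OneHotList(Values As List, Categories As List) As Map
    Dim columns As Map
    columns.Initialize
    For Each cat As String In Categories
        Dim flags As List
        flags.Initialize
        For Each v As String In Values
            If v = cat Then flags.Add(1) Else flags.Add(0)
        Next
        columns.Put(cat, flags)
    Next
    Return columns
End Sub
```

With the values red, blue, red and the categories red and blue, OneHotList gives the flags 1, 0, 1 for red and 0, 1, 0 for blue.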

Column names in newly created LOAs in this library are generated automatically based on a schema you specify.
"c": c_1 c_2 c_3 etc. [This is the default if you don't specify anything]
"C": C1 C2 C3 etc.
"_": _1_ _2_ _3_ etc.
"A": A B C ... Z AA BB CC etc.
"a": a b c ... z aa bb cc etc.
You can make your own schema by calling wLOA.NameSchema with a callback reference to a Sub.
That sub should change a column index into a name string. The sub 'applyStyle()' in wLOA may help to do that.
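
As an illustration of such a Sub, here is a minimal sketch that reproduces the "A" pattern listed above (A to Z, then AA BB CC and so on). LetterName is a hypothetical name. The signature that wLOA.NameSchema actually expects and the 0-based index are assumptions on my part, so check the wLOA source before using it.

```b4x
'Hypothetical naming Sub: the signature wLOA.NameSchema expects is an assumption.
'Maps a 0-based column index to A..Z, then AA BB CC ..., then AAA BBB ...
Sub LetterName(Index As Int) As String
    Dim letter As String = Chr(Asc("A") + (Index Mod 26))
    Dim repeats As Int = Index / 26 + 1 'how many times the letter is repeated
    Dim sb As StringBuilder
    sb.Initialize
    For i = 1 To repeats
        sb.Append(letter)
    Next
    Return sb.ToString
End Sub
```

LetterName(0) returns "A", LetterName(25) returns "Z" and LetterName(27) returns "BB".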

After a LOA is created, you can change any of its column names with the RenameColumns method which uses a Map.

There is a fancy wLOA.Display method that tries to keep the columns aligned in the Logs.
It can display any number of rows and format numerical data.
For the curious among you who looked at the examples running in B4J: where did the name of the LOA come from?
You'll have to unzip the wLOAExtras.b4xlib and look at the wLOA module in the Display method.

There is informative error handling for unknown column names and unknown function references.

There are many DateTime functions. This subject will be discussed in a later post.

The ListOfArrays is an excellent object to use for vector and matrix specification for
a wide variety of tools. For example:
summary statistics: sum mean variance stdDev frequencies median percentiles
matrix operations: reshape transpose inverse
other: regression breakdown graphs grid

I invite you to contribute additional items to this list.
If you are interested, take a look at the source code for wLOAExtras:
copy wLOAExtras.b4xlib to a new folder, rename it to wLOAExtras.zip and unzip it.

Post here with suggestions for improvement or if you encounter a bug. You may also PM me.

Note: I have tested in B4J console, B4XPages (both B4J and B4A) and B4A default templates.
 

Attachments

  • testlib.zip
    8.4 KB

William Lancee

Well-Known Member
Licensed User
Longtime User
Here is a B4XPages project with many examples of using the wLOAExtras library.
Make sure to add the library (see #1) to your Additional folder.
 

Attachments

  • examples.zip
    10.7 KB

Some comments about DateTime methods in wLOAExtras.

The first group of methods converts date strings to DateTime Longs: ToDate, ToTime and ToDateTime.
Strings must match the DateTime.DateFormat (and DateTime.TimeFormat) that is set before these methods are used.
For ToDateTime, strings must be in [DateFormat]_T_[TimeFormat]. Ex. "1 JAN 2026_T_12:30"

These methods return Date Longs corresponding to Date Strings in the column spec.
"colspec" can be: 1. column names as List; 2. column names as Array; 3. column name as String
If colspec is Null or Array() or "" the operations will be applied to all columns
The original is unaffected. The result has the same number of rows as the original LOA.

If the date/time string is invalid (does not conform to the specified DateTime formats) the result will be Null.
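
For reference, this is roughly what parsing a date string into a Long looks like with the standard B4X DateTime object, independent of wLOAExtras. ParseDateOrNull and DemoParse are hypothetical helpers. The sketch does not handle the library's own "_T_" separator, and the month name parsing depends on the device locale.

```b4x
'Plain B4X, independent of wLOAExtras. ParseDateOrNull is a hypothetical helper.
'Returns the ticks as a Long, or Null if the string does not match DateTime.DateFormat.
Sub ParseDateOrNull(DateString As String) As Object
    Try
        Return DateTime.DateParse(DateString)
    Catch
        Return Null
    End Try
End Sub

Sub DemoParse
    DateTime.DateFormat = "d MMM yyyy" 'must be set before parsing
    DateTime.TimeFormat = "HH:mm"
    Dim ticks As Long = DateTime.DateTimeParse("1 Jan 2026", "12:30")
    Log(DateTime.Date(ticks) & " " & DateTime.Time(ticks))
    If ParseDateOrNull("not a date") = Null Then Log("invalid date string")
End Sub
```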

Date Longs can also be created by specifying Years, Months, Days, Hours, Minutes, Seconds; if a value is omitted it is set to 0.
DateAndTime(LOA As ListOfArrays, colspec As Object, params() As Object) As ListOfArrays

The following methods convert Date Longs in columns into date strings in a new LOA:
AsDate AsTime AsDateTime

All the standard DateTime methods are implemented
DayOfYear DayOfMonth DayOfWeek Weekday Week Year Month MonthName Hour Minute Second

Some extra methods are:
Month3lets Month3caps Weekday3lets Weekday3caps Quarter

Three methods are taken from DateUtils:
AddPeriod DaysBetween Monthdays (For a month and year in a column of Date Longs - Feb, 2024 => 29)

Why are DateTime functions not implemented in the expressions used in EvalReplace, EvalAppend, and EvalCreate?
I always wanted to say this: "It was a design decision and functionality is not affected"
 

Version 1.03 Post #1 is updated

1. Added features to accommodate objects in ListOfArrays that are not scalar.
2. Fine tuned error reporting - now you get the B4J source line # instead of Java line #
3. Added a statistical summary of numerical columns

Scalar types are: Int, Float, Double, Long, Byte, Short, String, Boolean, Char
The first example uses an Array to store the x and y of a Point (as you could do on a canvas or bitmap).
B4X:
    'create some random data points (x,y tuples)
    RndSeed(42)  'to keep demo testing consistent
    Dim data(6) As Object
    For i = 0 To 5
        data(i) = Array(Rnd(0, 600), Rnd(0, 600))
    Next
    Dim TuplesLOA As ListOfArrays = wLOA.collect(3, 2, data)
    wLOA.Display(TuplesLOA, 0, 0)
#If DemoLog
________ TuplesLOA __________
A                 B          
――――――――――――――――――――――――
530,363     48,284
570,325     305,518
319,293     182,302
―――――――――――― #rows=3 #cols=2
#End If

    'iterate through TuplesLOA and compute Distance between the two columns
    Dim ls As LOASet = TuplesLOA.CreateLOASet
    Dim lst As List: lst.Initialize
    Do While ls.NextRow
        Dim A() As Object = ls.GetValue("A")
        Dim B() As Object = ls.GetValue("B")
        lst.Add(Sqrt((A(0) - B(0)) * (A(0) - B(0)) + (A(1) - B(1)) * (A(1) - B(1))))
    Loop
    TuplesLOA.AddColumn("Distance", lst)
    wLOA.Display(TuplesLOA, 0, 1)
#If DemoLog
________ TuplesLOA __________
A                 B     Distance
――――――――――――――――――――――――
530,363     48,284      488.4    
570,325     305,518     327.8    
319,293     182,302     137.3    
―――――――――――― #rows=3 #cols=3
#End If

The second example uses the Point class instead of a tuple array (the Point class is included in Demo1.03.zip).
I also tested how fast it would be on my system in release mode.
It took 243 msecs to compute distance and slope for the two columns A and B of a LOA with 1 million rows.

B4X:
    Dim n As Int = 1000000
    RndSeed(42)  'to keep demo testing consistent - i.e. same points as data above, same results
    Dim pts(n) As Point
    For i = 0 To n - 1
        pts(i) = pt.new(Rnd(0, 600), Rnd(0, 600))
    Next
 
    Dim markTime As Long = DateTime.now
    Dim PointsLOA As ListOfArrays = wLOA.collect(n, 2, pts)
    Dim lst1 As List = wLOA.emptyLst
    Dim lst2 As List = wLOA.emptyLst
    Dim ls As LOASet = PointsLOA.CreateLOASet
    Do While ls.NextRow
        Dim PA As Point = ls.GetValue("A")
        Dim PB As Point = ls.GetValue("B")
        lst1.Add(PA.distance(PB))
        lst2.Add(PA.slope(PB))
    Loop
    PointsLOA.AddColumn("Distance", lst1)
    PointsLOA.AddColumn("Slope", lst2)
    Log($"Time to do above task on 1000000 rows: ${DateTime.Now - markTime} msecs"$)
    '243 msecs on 11th Gen Intel(R) Core(TM) i9-11900K @ 3.50GHz

    wLOA.Display(PointsLOA, 5, 3)

#If DemoLog
________ PointsLOA __________
A                       B      Distance       Slope        
――――――――――――――――――――――――
(530, 363)     (48, 284)      488.431         0.164        
(570, 325)     (305, 518)     327.832         -0.728      
(319, 293)     (182, 302)     137.295         -0.066      
(276, 492)     (476, 32)      501.597         -2.300      
(456, 570)     (343, 209)     378.272         3.195        
―――first 5 rows―― #rows=1000000 #cols=4

Time to do above task on 1000000 rows: 243 msecs
#End If

The next example shows a statistical summary of numerical data columns
B4X:
    Dim SampleLOA As ListOfArrays = wLOA.randomInt(100, 3, Array(1, 10))
    SampleLOA.SetValue(3, 1, Null)
    SampleLOA.SetValue(5, 2, "This is unsual")
    wLOA.display(SampleLOA, 7, -1)
#If DemoLog
________ SampleLOA __________
A             B             C                        
――――――――――――――――――――――――
2             5             5                        
3             9             10                      
6             7             9                        
8             --            7                        
1             1             5                        
7             4             This is unsual
2             1             9                        
―――first 7 rows―― #rows=100 #cols=3
#End If

    Dim StatsLOA As ListOfArrays = wLOA.StatsNumerical(SampleLOA, Array(), 5, "%")
    wLOA.display(StatsLOA, 0, -1)
#If DemoLog
________ StatsLOA __________
Statistic           A             B             C      
――――――――――――――――――――――――
Number Valid       100            99            99    
Minimum              1             1             1      
Maximum             10            10            10    
Sum                553           544           559  
Mean               5.5           5.5           5.6  
Variance          9.08          8.40          9.05
Std. Dev.         3.01          2.90          3.01
Bin Size          1.80          1.80          1.80
Bin #1             21%           20%           23%  
Bin #2             21%           20%           14%  
Bin #3             13%           17%           14%  
Bin #4             24%           25%           26%  
Bin #5             21%           17%           22%  
1%tile               1             1             1      
5%tile               1             1             1      
10%tile              1             1             1      
25%tile              2             1             1      
33.3%tile            3             3             3      
50%(Median)          3             4             4      
66.7%tile            6             6             6      
75%tile              7             7             8      
90%tile              8             8             8      
95%tile             10             9             9      
99%tile             10            10            10    
―――――――――――― #rows=24 #cols=4
#End If
    Log(wLOA.GetStdDev(StatsLOA))
#If DemoLog
{A=3.0132870739275717, B=2.8974786272520454, C=3.0078575035218136}
#End If

This example illustrates error reporting:
B4X:
    'Create a 5 x 3 table of random numbers with default column names
    Dim LOA4 As ListOfArrays = wLOA.randomDouble(5, 3, 5)
    wLOA.evalReplace(LOA4, Array(), "XX + 360 * ?")
#If DemoLog
ERR: Unknown item 'xx' in expression:  XX + 360 * ? [b4xmainpage line 99]
#End If

Attached are the examples as a B4XPages project, tested on B4J and B4A.
Make sure to add the latest version of wLOAExtras.b4xlib (see post #1) to your Additional folder.
 

Attachments

  • Demo1.03.zip
    14.4 KB