B4J Question How to read file and data elements when file layout is provided with data file at the time of the run?

Kris.D · Sep 26, 2022

Hi,
I'm looking for idea how to solve issue of reading file and data elements not knowing how this file looks prior to the run.
I will always get 2 files, data file and another file with the layout of the data.
The data file is fixed length file. For example if Last Name is 30 bytes it will always be 30 bytes. (last name + trailing spaces)
But I have no idea where to start with this?
If I know my data prior to compile I think I could create Custom Type object or something like that but what do I do in this case?

Thank you.

This is simple example of data and layout.

DATA
First Name	Last Name	Member Date	Member Number	Member Status	Member Fee
John	Smith	5/10/2019	120501	A	3.00
Ann	Miller	8/2/2020	120502	A	3.00
Rich	Johnson	4/27/2020	120507	I	0.00
Mark	Wilson	9/14/2021	120504	A	3.00
Joe	Garcia	5/5/2022	120505	A	4.00
Debie	Jones	7/19/2022	120509	A	4.00

LAYOUT
	Field Name	Field Type	Field Length
1	First Name	Character	35
2	Last Name	Character	35
3	Member Date	Date
4	Member Number	Double	9
5	Member Status	Character	1
6	Member Fee	Decimal	3.2

Andrew (Digitwell) · Sep 26, 2022

This actually look very similar to Erels formbuilder.
You should be able tyo use this as a basis to do what you would like to do here

[B4X] Forms Builder - Designer for B4XPreferencesDialog

FormsBuilder is a desktop tool that helps with building templates for B4XPreferencesDialog: https://www.b4x.com/android/forum/threads/b4x-b4xpreferencesdialog-cross-platform-forms.103842/#post-651047 The output of FormsBuilder is a json file that is loaded with...

www.b4x.com

Or you could use B4xTable for this.

[B4X] B4XTable - Cross platform, sortable, searchable, customizable table

B4XTable is a paged based table. The UI is made of a horizontal xCLV. The data is stored in an in-memory SQLite database. How to use? 1. Add a B4XTable with the designer. 2. Add the columns: B4XTable1.AddColumn("US County", B4XTable1.COLUMN_TYPE_NUMBERS) B4XTable1.AddColumn("Name"...

www.b4x.com

Depending on how you want your GUI to appear

emexes · Sep 27, 2022

Kris.D said:
Hi,
I'm looking for idea how to solve issue of reading file and data elements not knowing how this file looks prior to the run.
If I know my data prior to compile I think I could create Custom Type object or something like that but what do I do in this case?

I would have two lists or arrays that mirror your two files. Arrays have the advantage that the index is known, eg if you have an array of five descriptors for five fields, then the first field is index 0, the second field is index 1, ... the last field is index 4 (= the five fields - 1).

Or you could use a map, with the field number as the key, eg:

B4X:

Type FieldDescriptorType( _
    FNumber As Int, _
    FName As String, _
    FType As String, _
    FLength As String, _
    DecimalPlaces As Int _
)

and then read the LAYOUT file line by line (skipping the header line, or using it to confirm that the file has the expected columns), and fill in the descriptor fields accordingly, including implied lengths eg:

- date is 8 or 10 or 11 characters (yyyymmdd, dd/mm/yy, dd-mmm-yyyy ?)
- 3.2 decimal is 6 characters, or maybe 7 characters if the +/- sign is not included in the 3.

Once you've filled out the FieldDescriptor Map, then (i) you know how many fields are in each record (FieldDescriptor.Length) and (ii) you can calculate the fixed record length of the DATA file, by summing the field lengths:

B4X:

Dim RecordLength As Int = 0
For Each fd As FieldDescriptorType In FieldDescriptorMap    'doesn't matter which order they're summed in
    RecordLength = RecordLength + fd.FLength
Next

And then "all you have to do is" stroll through the DATA file reading RecordLength-byte blocks and splitting them up according to the FLengths, interpreting each field according to its FType and adding them to an array, and then adding the completed array to a List of records.

On a related note: old DBF files have a structure similar to having your LAYOUT + DATA files but as a single file, and can be read by most spreadsheet programs.

Kris.D · Sep 27, 2022

emexes said:
I would have two lists or arrays that mirror your two files. Arrays have the advantage that the index is known, eg if you have an array of five descriptors for five fields, then the first field is index 0, the second field is index 1, ... the last field is index 4 (= the five fields - 1).

Or you could use a map, with the field number as the key, eg:

B4X:

Type FieldDescriptorType( _ FNumber As Int, _ FName As String, _ FType As String, _ FLength As String, _ DecimalPlaces As Int _ )

and then read the LAYOUT file line by line (skipping the header line, or using it to confirm that the file has the expected columns), and fill in the descriptor fields accordingly, including implied lengths eg:

- date is 8 or 10 or 11 characters (yyyymmdd, dd/mm/yy, dd-mmm-yyyy ?)
- 3.2 decimal is 6 characters, or maybe 7 characters if the +/- sign is not included in the 3.

Once you've filled out the FieldDescriptor Map, then (i) you know how many fields are in each record (FieldDescriptor.Length) and (ii) you can calculate the fixed record length of the DATA file, by summing the field lengths:

B4X:

Dim RecordLength As Int = 0 For Each fd As FieldDescriptorType In FieldDescriptorMap 'doesn't matter which order they're summed in RecordLength = RecordLength + fd.FLength Next

And then "all you have to do is" stroll through the DATA file reading RecordLength-byte blocks and splitting them up according to the FLengths, interpreting each field according to its FType and adding them to an array, and then adding the completed array to a List of records.

On a related note: old DBF files have a structure similar to having your LAYOUT + DATA files but as a single file, and can be read by most spreadsheet programs.

Thank you, this is what I was looking for.

emexes · Sep 27, 2022

emexes said:
stroll through the DATA file reading RecordLength-byte blocks and splitting them up according to the FLengths

For this bit, you do have to go through the fields in their position-in-record order eg Field 1, Field 2, Field 3, ...

unlike:

B4X:

For Each fd As FieldDescriptorType In FieldDescriptorMap    'doesn't matter which order they're summed in

So now I am inclined to say that all the groups-of-fields stuff should be arrays rather than maps or lists, with the index being the field number minus 1, or perhaps the arrays being one element longer than necessary and leaving element 0 unused, so that the array index and the field number are the same.

Probably best to still read the field descriptors into a Map, because you don't know the number of fields beforehand. Once you've read all the field descriptors and know how many there are, then just copy them into an array eg:

B4X:

Dim NumFields As Int = FieldDescriptorMap.Length
Dim FieldInfo(NumFields) as FieldDescriptorType
For FieldNumber = 1 to NumFields
    FieldInfo(FieldNumber - 1) = FieldDescriptorMap.Get(FieldNumber)
Next

hmm or perhaps you could do the following, which doesn't rely on a Map key and so you could just use a List instead:

B4X:

Dim NumFields As Int = FieldDescriptorMap.Length
Dim FieldInfo(NumFields) as FieldDescriptorType
For Each fd As FieldDescriptorType In FieldDescriptorMap    'doesn't matter which order they're copied in
    FieldInfo(fd.FNumber - 1) = fd    'array index = field number - 1
Next

emexes · Sep 27, 2022

Something like this will split a fixed-length record up into fields:

lol I started writing just the RecordToFields function, but one thing led to another and before I knew what was happening... :

B4X:

Sub Process_Globals
    
Type FieldDescriptorType( _
    FNumber As Int, _
    FName As String, _
    FType As String, _
    FLength As String, _
    DecimalPlaces As Int _
)
    
End Sub


Sub RecordToFields(RecordString As String, FieldInfo() As FieldDescriptorType) As String()

    Dim NumFields As Int =  FieldInfo.Length
  
    Dim FieldStrings(NumFields) As String
    
    Dim StartPos As Int = 0
    For FieldIndex = 0 To NumFields - 1
        Dim EndPos As Int = StartPos + FieldInfo(FieldIndex).FLength
        If EndPos > RecordString.Length Then    'ideally would have check RecordString length is correct before trying to split into fields
            EndPos = RecordString.Length
        End If
        FieldStrings(FieldIndex) = RecordString.SubString2(StartPos, EndPos)    '.Trim
        StartPos = EndPos
    Next
    
    Return FieldStrings
    
End Sub


Sub StringArrayToSV(S() As String, Delimiter As String, Separator As String) As String
    
    Dim sb As StringBuilder
    sb.Initialize
    
    For I = 0 To S.Length - 1
        If I <> 0 Then
            sb.Append(Separator)
        End If
        sb.Append(Delimiter & S(I) & Delimiter)
    Next
    
    Return sb.ToString
    
End Sub


Sub StringArrayToCSV(S() As String) As String
    
    Return StringArrayToSV(S, """", ", ")
    
End Sub


Sub AppStart (Args() As String)
    
    Log("Hello world!!!")
    
    'test sample data:
    
    Dim FieldInfo(3) As FieldDescriptorType
    
    FieldInfo(0).FLength = 5
    FieldInfo(1).FLength = 10
    FieldInfo(2).FLength = 20
    
    'three different ways of doing the same thing:
    
    Dim F() As String = RecordToFields("abcdefghijklmnopqrstuvwxyz123456789", FieldInfo)
    Log(StringArrayToCSV(F))
    
    Dim RS As String = "[F1=][FIELD 2=][FIELD THREE=======]junk"
    Dim F() As String = RecordToFields(RS, FieldInfo)
    Dim CSV As String = StringArrayToCSV(F)
    Log(CSV)
    
    Dim RS As String = "FRED FLINTSTONEBEDROCK             junk"
    Log(StringArrayToCSV( RecordToFields(RS, FieldInfo) ))
    
End Sub

Log output:

Waiting for debugger to connect...
Program started.
Hello world!!!
"abcde", "fghijklmno", "pqrstuvwxyz123456789"
"[F1=]", "[FIELD 2=]", "[FIELD THREE=======]"
"FRED ", "FLINTSTONE", "BEDROCK             "

Kris.D · Sep 27, 2022

emexes said:

Something like this will split a fixed-length record up into fields:

lol I started writing just the RecordToFields function, but one thing led to another and before I knew what was happening... :

B4X:

Sub Process_Globals
   
Type FieldDescriptorType( _
    FNumber As Int, _
    FName As String, _
    FType As String, _
    FLength As String, _
    DecimalPlaces As Int _
)
   
End Sub


Sub RecordToFields(RecordString As String, FieldInfo() As FieldDescriptorType) As String()

    Dim NumFields As Int =  FieldInfo.Length
 
    Dim FieldStrings(NumFields) As String
   
    Dim StartPos As Int = 0
    For FieldIndex = 0 To NumFields - 1
        Dim EndPos As Int = StartPos + FieldInfo(FieldIndex).FLength
        If EndPos > RecordString.Length Then    'ideally would have check RecordString length is correct before trying to split into fields
            EndPos = RecordString.Length
        End If
        FieldStrings(FieldIndex) = RecordString.SubString2(StartPos, EndPos)    '.Trim
        StartPos = EndPos
    Next
   
    Return FieldStrings
   
End Sub


Sub StringArrayToSV(S() As String, Delimiter As String, Separator As String) As String
   
    Dim sb As StringBuilder
    sb.Initialize
   
    For I = 0 To S.Length - 1
        If I <> 0 Then
            sb.Append(Separator)
        End If
        sb.Append(Delimiter & S(I) & Delimiter)
    Next
   
    Return sb.ToString
   
End Sub


Sub StringArrayToCSV(S() As String) As String
   
    Return StringArrayToSV(S, """", ", ")
   
End Sub


Sub AppStart (Args() As String)
   
    Log("Hello world!!!")
   
    'test sample data:
   
    Dim FieldInfo(3) As FieldDescriptorType
   
    FieldInfo(0).FLength = 5
    FieldInfo(1).FLength = 10
    FieldInfo(2).FLength = 20
   
    'three different was of doing the same thing:
   
    Dim F() As String = RecordToFields("abcdefghijklmnopqrstuvwxyz123456789", FieldInfo)
    Log(StringArrayToCSV(F))
   
    Dim RS As String = "[F1=][FIELD 2=][FIELD THREE=======]junk"
    Dim F() As String = RecordToFields(RS, FieldInfo)
    Dim CSV As String = StringArrayToCSV(F)
    Log(CSV)
   
    Dim RS As String = "FRED FLINTSTONEBEDROCK             junk"
    Log(StringArrayToCSV( RecordToFields(RS, FieldInfo) ))
   
End Sub

Log output:

Waiting for debugger to connect...
Program started.
Hello world!!!
"abcde", "fghijklmno", "pqrstuvwxyz123456789"
"[F1=]", "[FIELD 2=]", "[FIELD THREE=======]"
"FRED ", "FLINTSTONE", "BEDROCK             "

Emexes, first you give me idea which got me very excited because it trigger my brain cells. Now you give me solution that is clever and cleaner what I would put together. Thank you.

B4J Question How to read file and data elements when file layout is provided with data file at the time of the run?

Kris.D

Member

Andrew (Digitwell)

Well-Known Member

[B4X] Forms Builder - Designer for B4XPreferencesDialog

[B4X] B4XTable - Cross platform, sortable, searchable, customizable table

emexes

Expert

Kris.D

Member

emexes

Expert

emexes

Expert

Kris.D

Member

Similar Threads