Slow Replace in Table

Zenerdiode

Active Member
Licensed User
I'm carrying out a search-and-replace function, but program execution is slow for the amount of data that I want to crunch. Please have a look and advise if there is a better way to do things. This app is to be used on the desktop only. Ultimately the data will be a few MB.

I thought the writes to the screen - ProgressBar and % complete - may have been slowing it down, but commenting them out makes no difference. (Pity, because I would have used a 1 second timer to do the screen updates instead.)

I've removed 63 of the 64 'IF' queries but that makes no difference either. (I would have been able to easily chop the 'IFs' in two but there is no point either.)

Is it because, within my For...Next loop, it takes a while to find the next row of the table?

Don't get me wrong, if you do a manual search and replace in Notepad - I could probably do it faster on paper(!), but if you load the file into MS Excel it is almost instantaneous.

The benefit of using a Table control is that I can load the file as is, with <TAB> separation, and easily save it with <,> separation.

EDIT: 'IF' replaced with 'SELECT'
 

Attachments

  • Convert.zip
    27.3 KB

agraham

Expert
Licensed User
Longtime User
I've removed 63 of the 64 'IF' queries but that makes no difference either.
You must have removed different IFs. I got a dramatic speedup by removing most of the "If Table1.Cell(Table1.ColName(4),i)= ...." statements.

You are losing performance on all the IFs that fail before the one that succeeds. I would try using Select ... Case instead. I don't know how the IDE/legacy compiler handles it, but the code emitted by the optimised compiler uses a single index into a jump table to select the target code, and so should be many times faster than all those If ... Else If ... statements.
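
To make the difference concrete, here is a minimal sketch of the two lookup styles. It is written in Python rather than Basic4ppc, purely as an illustration: the 64 codes and labels are made up, and a dictionary only stands in for the idea of a jump table. It does not show what either compiler actually emits.

```python
# Hypothetical operand codes; the real file uses names like "RCK1SLT02DI0001 ".
CODES = [f"CODE{i:02d}" for i in range(64)]
LOOKUP = {code: f"label-{i}" for i, code in enumerate(CODES)}


def resolve_chain(code):
    """Compare against each code in turn, like a long If ... Else If chain."""
    comparisons = 0
    for i, candidate in enumerate(CODES):
        comparisons += 1
        if code == candidate:
            return f"label-{i}", comparisons
    return None, comparisons


def resolve_table(code):
    """Single keyed lookup, comparable in spirit to indexing a jump table."""
    return LOOKUP.get(code)


print(resolve_chain("CODE63"))   # ('label-63', 64)
print(resolve_table("CODE63"))   # label-63
```

The chain pays for every failed comparison before the match, so the last code costs 64 comparisons, while the keyed lookup costs about the same for every code.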

DoEvents takes time as well. It might be worth trying something like "If (i Mod 10) = 0 Then DoEvents" to only call it every 10 times round the loop.
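
The same throttling idea, sketched in Python as an illustration only. `process_rows` and `refresh` are hypothetical names, with `refresh` standing in for whatever screen update (DoEvents, progress bar) the loop would otherwise perform on every row.

```python
def process_rows(rows, refresh, every=10):
    """Process rows, calling refresh() only on every `every`-th iteration."""
    refresh_calls = 0
    for i, row in enumerate(rows):
        # ... per-row work would go here ...
        if i % every == 0:
            refresh(i)
            refresh_calls += 1
    return refresh_calls


# 3,500 rows with a refresh every 10th row.
print(process_rows(range(3500), refresh=lambda i: None))  # 350
```

With 3,500 rows this cuts the number of screen updates from 3,500 to 350. Whether that matters depends on how expensive each update really is.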
 

Zenerdiode

Active Member
Licensed User
Thanks agraham. I've used the Select...Case and it has made a *marginal* improvement.

There are 3,500 records of 60 bytes each in the sample file and it's taking 24 seconds to do the search/replace function. That's only about 150 records per second. That just sounds incredibly slow - I'm using a 2.0GHz Core Duo and would have thought even 150,000 per second is slow.
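
For reference, the arithmetic behind those figures, as a small Python sketch. The 3 MB target is a hypothetical stand-in for "a few MB", and the projection assumes the run time scales linearly with the record count, which may be optimistic.

```python
RECORDS = 3500        # records in the sample file
RECORD_BYTES = 60     # bytes per record
TABLE_SECONDS = 24    # measured time for the search/replace

rate = RECORDS / TABLE_SECONDS          # records per second


def estimate_seconds(total_bytes):
    """Projected run time at the measured rate, assuming linear scaling."""
    return (total_bytes / RECORD_BYTES) / rate


print(round(rate))                          # 146
print(RECORDS * RECORD_BYTES)               # 210000
print(round(estimate_seconds(3_000_000)))   # 343
```

So the sample is about 210 KB, and a hypothetical 3 MB file would take nearly six minutes at this rate, even before any worse-than-linear effects.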
 

Erel

B4X founder
Staff member
Licensed User
Longtime User
This task is not suited for a table control.
Each time you update a field the table needs to update its internal indexing.
I've updated the code to read the input file one line at a time, update this line and save it to the output file.
It takes me 0.06 seconds (roughly 60,000 records per second):
* Several select cases are removed from the following code because this post was too long (more than 10000 characters).
B4X:
Sub Globals
    'Declare the global variables here.
    Dim pars(0)
End Sub

Sub App_Start
    Form1.Show
End Sub

Sub Button1_Click
    OpenDialog1.Filter = "CDS Files|*.txt"
    If OpenDialog1.Show <> cCancel Then
        OrigFileName=OpenDialog1.File
        TextBox1.Text=OrigFileName
        If TextBox2.Text="" Then
            TextBox2.Text=StrInsert(OrigFileName,(StrLength(OrigFileName)-4)," [Resolved]")
            TextBox2.Text=SubString(TextBox2.Text,0,(StrLength(TextBox2.Text)-3))&"csv"
        End If
    End If
End Sub

Sub Button2_Click
    SaveDialog1.Filter = "CSV Files|*.csv"
    If SaveDialog1.Show <> cCancel Then
        TextBox2.Text=SaveDialog1.File
    End If
End Sub

Sub Button3_Click
    If TextBox1.Text="" OR TextBox2.Text="" Then
        Return
    End If
    Label4.Text="Decoding..."
    Label4.Visible=True
    Label6.Text=""
    Label6.Visible=True
    DoEvents
    Table1.LoadCSV(TextBox1.Text,Chr(9),False,True)
    Msgbox(Table1.RowCount)
    t=Now
    FileOpen(IN,textbox1.Text,cRead)
    FileOpen(OUT,textbox2.Text,cWrite)
    line = FileRead(IN)
    Do While line <> EOF
        pars() = StrSplit(line,cTab)
        Select pars(4)
            Case "RCK1SLT02DI0001 "
                If pars(5)="OPEN     " Then
                    pars(5)="Down"
                Else
                    pars(5)="Up"
                End If
                pars(4)="SBN TR"
            Case "RCK1SLT02DI0002 "
                If pars(5)="OPEN     " Then
                    pars(5)="Down"
                Else
                    pars(5)="Up"
                End If
                pars(4)="SBM TR"
            Case "RCK1SLT02DI0003 "
                If pars(5)="OPEN     " Then
                    pars(5)="Down"
                Else
                    pars(5)="Up"
                End If
                pars(4)="(TEST) R"
            Case "RCK1SLT02DI0004 "
                If pars(5)="OPEN     " Then
                    pars(5)="Down"
                Else
                    pars(5)="Up"
                End If
                pars(4)="RECPR"
            Case "RCK1SLT02DI0005 "
                If pars(5)="OPEN     " Then
                    pars(5)="Up"
                Else
                    pars(5)="Down"
                End If
                pars(4)="(CON) SR"
            Case "RCK1SLT02DI0006 "
                If pars(5)="OPEN     " Then
                    pars(5)="Down"
                Else
                    pars(5)="Up"
                End If
                pars(4)="HJPR"
            Case "RCK1SLT02DI0007 "
                If pars(5)="OPEN     " Then
                    pars(5)="Down"
                Else
                    pars(5)="Up"
                End If
                pars(4)="(UP) KR"
            Case Else
                pars(4)=pars(4)&" <Unknown Operand>"
        End Select
        lineout = pars(0) & "," & pars(1) & "," & pars(2) & "," & pars(3) & "," & pars(4) & "," & pars(5)
        FileWrite(OUT,lineout)
        line = FileRead(IN)
    Loop
    FileClose(IN)
    FileClose(OUT)
    Label4.Text="Done"
    Msgbox("Done in "&Format((Now-t)/10000000,"F2")&" seconds.","Decode",cMsgboxOK,cMsgboxExclamation)
    TextBox1.Text=""
    TextBox2.Text=""
    Label4.Visible=False
    Label6.Visible=False
End Sub
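
The same streaming approach can be sketched with the repeated Case blocks folded into one lookup table. This is an illustrative Python version, not the Basic4ppc code above. `OPERANDS` and `convert` are hypothetical names, only two of the operand entries are reproduced, and every input line is assumed to have at least six tab-separated fields.

```python
import io

# Operand code -> (replacement label, state if the field reads OPEN, state otherwise).
OPERANDS = {
    "RCK1SLT02DI0001 ": ("SBN TR", "Down", "Up"),
    "RCK1SLT02DI0005 ": ("(CON) SR", "Up", "Down"),
}


def convert(src, dst):
    """Stream tab-separated lines from src to dst as comma-separated lines."""
    count = 0
    for line in src:
        pars = line.rstrip("\n").split("\t")
        entry = OPERANDS.get(pars[4])
        if entry is None:
            # Same fallback as the Case Else branch: flag the operand, keep the state.
            pars[4] += " <Unknown Operand>"
        else:
            label, if_open, otherwise = entry
            pars[5] = if_open if pars[5] == "OPEN     " else otherwise
            pars[4] = label
        dst.write(",".join(pars[:6]) + "\n")
        count += 1
    return count


# In-memory stand-ins for the input and output files.
sample = io.StringIO(
    "a\tb\tc\td\tRCK1SLT02DI0001 \tOPEN     \n"
    "a\tb\tc\td\tRCK1SLT02DI0005 \tOPEN     \n"
    "a\tb\tc\td\tXYZ\tCLOSED   \n"
)
result = io.StringIO()
convert(sample, result)
print(result.getvalue(), end="")
```

Each line is read, rewritten and written once, so no table structure has to be kept up to date while the data changes.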
 

agraham

Expert
Licensed User
Longtime User
This task is not suited for a table control.
It certainly isn't. Assignment to a cell is astonishingly expensive!

Each time you update a field the table needs to update its internal indexing.
And then some! I've looked at what's going on inside the CLR with Reflector and each assignment builds a new DataView from the underlying data, which is expensive in itself, and then goes on to index into that to make the changes. I can't find where the changes are reflected back to the underlying table but that's got to be expensive as well. I'm straying out of my field of expertise here but I guess that a lot of this expense is to maintain data integrity in multi-user situations where the tables are tied to a multi-user database and might possibly be subject to simultaneous updates.
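
A toy model of why that pattern hurts, in Python. This is not the actual DataView code. `RebuildingTable` is a made-up class that only assumes each cell write first rebuilds a view over every row, and it counts how many rows get touched in total.

```python
class RebuildingTable:
    """Toy table that rebuilds a full view of its rows on every cell write."""

    def __init__(self, rows):
        self.rows = [list(r) for r in rows]
        self.rows_touched = 0

    def set_cell(self, row, col, value):
        view = list(self.rows)           # stand-in for building a fresh view
        self.rows_touched += len(view)   # cost grows with the table size
        view[row][col] = value           # the view shares row lists, so self.rows is updated


def rewrite_all(n):
    """Write one cell in each of n rows and return the total rows touched."""
    table = RebuildingTable([["x"] for _ in range(n)])
    for i in range(n):
        table.set_cell(i, 0, "y")
    return table.rows_touched


print(rewrite_all(3500))   # 12250000
```

Under that assumption, updating every row of a 3,500-row table touches 3,500 x 3,500 rows in total, so the cost grows with the square of the file size rather than linearly.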
 

Zenerdiode

Active Member
Licensed User
This task is not suited for a table control.

It certainly isn't. Assignment to a cell is astonishingly expensive!

Erel & Andrew, many thanks to both of you for your efforts with this. The difference in using the file directly without using the table is phenomenal. I just used the Table control through my inexperience as it produced the results I wanted; albeit at colossal expense of processing cycles. I also hadn't used the Select...Case methods before, again just going with what I knew from other languages. (Hence all of those IFs...)

Forever a :sign0104:
 