B4J Question Best way to read a CSV file?

Mark Read

Well-Known Member
Licensed User
Longtime User
I have an SPS control system which continuously generates a CSV file and sends it to a remote PC. The new data is appended to the end of the file as a block.

NH4_ISE_ZU;18.04.2017 14:51;1229,861;1;42843618791;0
HEIZ_SOLL;18.04.2017 14:51;60;1;42843618791;0
VISU_PV_1_SOLL;18.04.2017 14:51;0;1;42843618791;0
PV_1_2;18.04.2017 14:51;100;1;42843618791;0
pH_501;18.04.2017 14:51;11,30208;1;42843618791;0
T_411;18.04.2017 14:51;52,95139;1;42843618791;0
pH_502;18.04.2017 14:51;7,413195;1;42843618791;0
P2_IST_2;18.04.2017 14:51;0;1;42843618791;0
T_401;18.04.2017 14:51;10,40038;1;42843618791;0
T_402;18.04.2017 14:51;10,20037;1;42843618791;0
T_403;18.04.2017 14:51;17,90065;1;42843618791;0
T_404;18.04.2017 14:51;47,60172;1;42843618791;0
T_405;18.04.2017 14:51;53,40193;1;42843618791;0
T_406;18.04.2017 14:51;16,10058;1;42843618791;0
T_407;18.04.2017 14:51;14,30052;1;42843618791;0
T_408;18.04.2017 14:51;18,70068;1;42843618791;0
T_409;18.04.2017 14:51;53,90195;1;42843618791;0
T_410;18.04.2017 14:51;61,60223;1;42843618791;0
ZÄHL_GÄRREST;18.04.2017 14:51;0;1;42843618791;0
P4_IST_2;18.04.2017 14:51;658,2755;1;42843618791;0

On the remote PC I would like a B4J program to read the file and put the data into an Excel file containing pre-prepared charts, so the data can be viewed in real time.

My question is how to read the file:

1. Read the whole file each time and overwrite the data?

or

2. Read only the last block and append the data to the bottom of the Excel worksheet?

I know how many readings are in a block. The data is acquired roughly every 10 seconds for a period of 2 hours. Which method would give the best result?
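For reference, option 1 in its simplest form could be something like the minimal sketch below (folder and file names are placeholders); option 2 would instead keep track of how much of the file has already been processed and read only the new part.

B4X:
Sub ReadWholeFile As List
    ' Option 1: re-read every row of the CSV on each pass (path is a placeholder)
    Return File.ReadList("C:\SPS", "data.csv")
End Sub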

Many thanks
 

eps

Expert
Licensed User
Longtime User
Read the whole file each time and overwrite, but... I would also keep a copy or two in a backup folder below all that, just in case!

I think this approach is best: the file could get quite bloated over time, and opening it just to read the last few entries could get slower and slower.
 
Upvote 0

eps

Expert
Licensed User
Longtime User
Another thought - as Erel asked: what is the maximum file size, and how often are updates received? That should let you work out how big the file will get and how quickly it will grow.
 
Upvote 0

Mark Read

Well-Known Member
Licensed User
Longtime User
What is the maximum file size?

This is only restricted by the hard disk on the PC - that means up to 1 TB. I have a sample file; one block of it is shown in post #1.

Readings every 10 seconds for 3½ hours = 27500 lines of data and a file size of 1456 KB.

In reality I would expect no more than double this time for a test.
 
Upvote 0

Mark Read

Well-Known Member
Licensed User
Longtime User
The good news is that the CSV file is a simple one which could be parsed easily.

That should not be a problem.

The main problem I see is that you need to deal with partial rows, as you can read the file while the other process writes it.

What about making a copy of the main file and then parsing that file? That way the original is only used for a few seconds.

Is there a unique field in each row?

Not quite sure what you mean. Each row of a block starts with the field name, e.g. T_401 is temperature sensor 401 (which does not mean there are 401 temperature sensors!).

The SPS system collects the data from all the inputs and appends it as a block to the CSV file. I can set the interval for this; 10 seconds would be a standard time, faster would be better. The format of a data row is:

Sensor name; date and time; current value; validity (normally always 1); elapsed time in minutes

Also, the numbers are in German format: 3,14 instead of 3.14, but that is my problem for parsing.
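For illustration, splitting one of those rows could look like this minimal sketch (the variable names are placeholders; Knoppi's reply below shows the date and decimal handling in more detail):

B4X:
Sub ParseRow(line As String)
    ' Example row: "T_401;18.04.2017 14:51;10,40038;1;42843618791;0"
    Dim fields() As String = Regex.Split(";", line)
    Dim sensor As String = fields(0)
    Dim timestamp As String = fields(1)
    Dim value As Double = fields(2).Replace(",", ".")    ' German decimal comma, no thousands separator assumed
    Dim valid As Boolean = (fields(3) = "1")
    Log(sensor & " = " & value & " at " & timestamp & " (valid: " & valid & ")")
End Sub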
 
Upvote 0

Knoppi

Active Member
Licensed User
Longtime User
Also, the numbers are in German format: 3,14 instead of 3.14, but that is my problem for parsing.
B4X:
Sub TestDate
    Dim sDate As String = "18.04.2017 14:51"

    Dim df As String = DateTime.DateFormat        ' save the current DateFormat
    DateTime.DateFormat = "dd.MM.yyyy HH:mm"      ' set the German DateFormat
    Dim nDate As Long = DateTime.DateParse(sDate) ' parse the date string
    Log("DateTime in ticks = " & nDate)
    DateTime.DateFormat = df                      ' restore the DateFormat

    Log(DateTime.Date(nDate) & "  " & DateTime.Time(nDate))
End Sub

Sub TestNumber
    'If there is no thousands separator:
    Dim sNum As String = "1229,861"
    Log(sNum.Replace(",", "."))

    'Otherwise swap the two separators via a placeholder:
    Dim sNum2 As String = "1.229,861"
    sNum2 = sNum2.Replace(",", "#")
    sNum2 = sNum2.Replace(".", ",")
    sNum2 = sNum2.Replace("#", ".")
    Log(sNum2)
End Sub
 
Upvote 0

OliverA

Expert
Licensed User
Longtime User
What about making a copy of the main file and then parsing that file?
Even with copying, the copy could happen during an append to (or creation of) the file, so the resulting copy may not end with a complete last line. Note that this does not just mean you missed the last line of the originating file: because the copy happened during a write, it may not contain all the information (which should be captured by the next copy), and the last line you were able to copy (not necessarily the last line of the originating file) could be incomplete. It should not prevent you from importing the information; it's just something to check for and be aware of during your import. Even though this is an extreme case and most of the time the copy will work, not checking for it would introduce strange bugs (strange in the sense of not happening all the time).
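One way to guard against this, sketched here under the assumption that every complete row ends with a line break, is to parse only up to the last line break and carry the trailing fragment over to the next pass:

B4X:
Sub ExtractCompleteLines(chunk As String) As List
    ' Keep only rows terminated by a line break; the rest may be a partial row
    Dim complete As List
    complete.Initialize
    Dim lastBreak As Int = chunk.LastIndexOf(Chr(10))
    If lastBreak = -1 Then Return complete    ' no complete row in this chunk yet
    For Each s As String In Regex.Split(Chr(10), chunk.SubString2(0, lastBreak))
        Dim row As String = s.Replace(Chr(13), "").Trim
        If row.Length > 0 Then complete.Add(row)
    Next
    ' chunk.SubString(lastBreak + 1) is the leftover fragment to prepend next time
    Return complete
End Sub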
 
Upvote 0

Erel

B4X founder
Staff member
Licensed User
Longtime User
Start with something like this:

1. Check the file size.
2. Sleep(1000)
3. Check the file size again. If it is the same as the previous read and larger than the last stored file size, then read the new data with RandomAccessFile.ReadBytes.
4. Store the file size (KVS is a good option).
5. Convert the bytes to a string and parse it.
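A minimal sketch of those steps might look like this (the path is a placeholder, kvs is assumed to be an initialized KeyValueStore, and error and partial-line handling are omitted):

B4X:
Sub PollFile
    ' Steps 1-3: check the size, wait, check again
    Dim size1 As Long = File.Size("C:\SPS", "data.csv")
    Sleep(1000)
    Dim size2 As Long = File.Size("C:\SPS", "data.csv")
    Dim lastSize As Long = kvs.GetDefault("lastSize", 0)
    ' Only read when the writer has paused and new data exists
    If size2 = size1 And size2 > lastSize Then
        Dim raf As RandomAccessFile
        raf.Initialize("C:\SPS", "data.csv", True)    ' True = read-only
        Dim count As Int = size2 - lastSize
        Dim buffer(count) As Byte
        raf.ReadBytes(buffer, 0, count, lastSize)    ' read only the newly appended bytes
        raf.Close
        ' Step 4: store the new file size
        kvs.Put("lastSize", size2)
        ' Step 5: convert the bytes to a string and parse it
        Dim newData As String = BytesToString(buffer, 0, count, "UTF8")
        Log(newData)
    End If
End Sub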
 
Upvote 0

Mark Read

Well-Known Member
Licensed User
Longtime User
read the new data with RandomAccessFile.ReadBytes

I would have used TextReader, as I have never used RandomAccessFile before. If I understand correctly, it will let me read just the new data added to the file, as long as I keep track of a pointer. Is this correct?
 
Upvote 0

Mark Read

Well-Known Member
Licensed User
Longtime User
KeyValueStore v2.
Thank you, something new again.

My planned approach would have been:

1. Copy the data file from the SPS control.
2. Open the Excel file.
3. Open the copy with TextReader.
4. Jump to the end of the file.
5. Read and parse the data backwards into an array (roughly array(30,3) in size), as I know the block size (number of rows) - see the sketch below.
6. Send the array contents to Excel.
7. Repeat.

Am I thinking too complicated?
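For step 5, a minimal sketch could be to read the whole copy into a list and keep only the last block of rows (path and file name are placeholders):

B4X:
Sub ReadLastBlock(blockSize As Int) As List
    ' Read the copied file and keep only the last blockSize rows
    Dim allRows As List = File.ReadList("C:\SPS", "copy.csv")
    Dim block As List
    block.Initialize
    For i = Max(0, allRows.Size - blockSize) To allRows.Size - 1
        block.Add(allRows.Get(i))
    Next
    Return block
End Sub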
 
Upvote 0

Mark Read

Well-Known Member
Licensed User
Longtime User
Many thanks for the help. I had not thought about a list. No need to reinvent the wheel!
 
Upvote 0

Mark Read

Well-Known Member
Licensed User
Longtime User
I have the program working: parsing 27500 lines of data and importing them into Excel takes 1.6 seconds. I am happy with this.

I only have one problem: it only works when the Excel file is closed, which means I cannot see the data in real time. I will search and see whether I can put data into an open file.

Many thanks for all the help.
 
Upvote 0