B4J Question MySQL Problem

Harris · Jun 11, 2017

LoadMast Table -
pkbigint(20) AI
mastidbigint(20)
loadtypeint(11)
acttypeint(11)
recdatebigint(20)
etc...

The pk is autoinc.

The problem: Records are inserted with RLC from remote devices.
If the device sends the data but does not get a response, the records are sent again until it does get a response from the server. REPLACE INTO won't work here because of the auto inc pk.

What practical solution can one do to avoid duplicate records?

Thanks

Cableguy · Jun 11, 2017

What are the chances of having identical entries? (Not duplicates resulting from several retries but individually sent and acknowledged)?
If they're null, then I would only save the data if no duplicates existed...

Harris · Jun 11, 2017

Cableguy said:
What are the chances of having identical entries? (Not duplicates resulting from several retries but individually sent and acknowledged)?
If they're null, then I would only save the data if no duplicates existed...

It is difficult to say. Only one or two of 25 devices (exact same app/version) behave in this fashion.
It is odd because all the records are inserted at the same time (timestamp field shows this).
This would asume that the data was in triplicate on the device when the batch command was issued. But, I cannot see how that was possible in my code - since others operate normally.
Guess I am barking up the wrong tree?

I could probably post process the table, find duplicates and add the pk to a list, then delete them.
I would need to build a structure to store where I last left off - so I don't do the massive table(s) from the beginning each time.
OR, do this on the device side prior to batching - much smaller set of recent data to work with - since table is emptied when response from server is sent.
But, device side cleaning would still show server side issue if JobDone was not raised - after success of insert on server side.

Erel · Jun 12, 2017

You can use this code to disable the auto retry feature of OkHttp:

B4X:

'Replace the existing method in HttpUtils2Service
Sub Initialize
   If hc.IsInitialized = False Then
     TempFolder = File.DirTemp
     Dim jo As JavaObject = hc
     Dim builder As JavaObject = jo.RunMethod("sharedInit", Array("hc"))
     builder.RunMethod("retryOnConnectionFailure", Array(False))
     jo.SetField("client", builder.RunMethod("build", Null))    
     TaskIdToJob.Initialize
   End If
End Sub

Edit: never modify OkHttpUtils2 source. Instead: https://www.b4x.com/android/forum/t...edirections-with-okhttputils2.118469/#content

OliverA · Jun 12, 2017

Harris said:
It is odd because all the records are inserted at the same time (timestamp field shows this).

Harris said:
I could probably post process the table, find duplicates and add the pk to a list, then delete them.

How do you know they are duplicates? Because of the timestamp alone? Because of some other id (device id - i'm guessing here) and timestamp? If so, create a UNIQUE KEY on those two or three fields. The insert/update will then fail on duplicates of those fields.

Link: Creating UNIQUE KEY using CREATE TABLE
Link: Creating UNIQUE KEY using ALTER TABLE

Harris · Jun 13, 2017

OliverA said:
How do you know they are duplicates? Because of the timestamp alone? Because of some other id (device id - i'm guessing here) and timestamp? If so, create a UNIQUE KEY on those two or three fields. The insert/update will then fail on duplicates of those fields.

Link: Creating UNIQUE KEY using CREATE TABLE
Link: Creating UNIQUE KEY using ALTER TABLE

"Each field" in the record is EXACTLY like the one before (when it duplicates) - except for the primary key which is created on the server.
I am guessing that a UNIQUE key is a secondary index on the table. Yes, I could create one based on several fields, but would a 'REPLACE INTO' work here as well (with second index) ?
I don't want the batch to fail (not insert anything when dupe is found - which happens now on non indexed tables - without the REPLACE INTO statement), but REPLACE the same record and carry on with new, valid stuff.

I have another table that doesn't use a PK (primary key, auto inc). REPLACE INTO works well since an AI index (primary key) does not interfere with it. YET, good practice states that each table SHOULD have a pk to be optimized.

Essentially, add all NEW data. Don't duplicate if record ( a series of fields have the exact same data - a new index to maintain). Overwrite (REPLACE) if it does exist (no harm, no foul).

In my experience, (other, secondary) indexes are subject to corruption. When they get out of sort - all hell breaks loose. I try and keep data retrieval and puts simple ( updates, inserts) .

Also, as seen in the past, the additional indexes (can) grow larger than the table they are working on...???

I know, this is open ended.. What is best for what.

Thanks for your expert input and still seek alternatives (as Erels post above).

OliverA · Jun 13, 2017

Harris said:
YET, good practice states that each table SHOULD have a pk to be optimized.

Yes, but sometimes that rule may not fit. Private Keys (PKs) are very useful in joining tables and in clustering. Is this table used in a lot of joins? Are you using clustering?

Harris said:
seek alternatives (as Erels post above).

Ok, I'm a little on thin ice here. But what if this actually prevents data from coming through? I mean, currently you are getting duplicate data, but what if the proposed solution means data loss (since no retry attempts are made)? Can you live with data loss? Of course I may get the wrong idea of what the solution actually does and this will not result in lost/missed data.

Harris said:
I have another table that doesn't use a PK (primary key, auto inc). REPLACE INTO works well since an AI index (primary key) does not interfere with it.

Why not model this table after the previous one, since this seems to resolve your issue? If you still need a table with a PK in it, you could create another table and then import the non-pk records into it after some time interval such as:

Anything older than 5 minutes (arbitrary time I pulled out of a hat)
Older than the retry period (if one can determine what the retry period/duration is).

Just some more ideas...

B4J Question MySQL Problem

Harris

Expert

Cableguy

Expert

Harris

Expert

Erel

B4X founder

OliverA

Expert

Harris

Expert

OliverA

Expert

Similar Threads