Android Question SQLite remove duplicate records.[SOLVED]

Discussion in 'Android Questions' started by Roger Daley, Oct 12, 2019 at 8:30 AM.

  1. Roger Daley

    Roger Daley Well-Known Member Licensed User

    Hi All,

    Can anyone point me towards an example of removing duplicate records in a db?
    I have searched for methods/commands in SQLite and DBUtils but nothing stood out to me.
    Trying to match every record with every other record through iteration does not appeal and I'm sure there is a cleaner way do this.
    NOTE: When I say duplicate records, I mean the data in all columns of both records is the same not just matching the "Names" column.

    Regards Roger
     
  2. Pendrush

    Pendrush Well-Known Member Licensed User

    Create same table (copy table as new) and then add matching fields as primary key(s), then INSERT IGNORE all records form old table to new table.
     
    aeric likes this.
  3. Roger Daley

    Roger Daley Well-Known Member Licensed User

    Hi Pendrush,

    Thanks for the reply, you wouldn't have an example of what you wrote? I should have mentioned I am still learning how to spell SQL. I keep reading the tutorials but they appear to be arcane works of the druids.:(

    Regards Roger
     
  4. aeric

    aeric Active Member Licensed User

  5. mangojack

    mangojack Well-Known Member Licensed User

    Roger Daley likes this.
  6. Roger Daley

    Roger Daley Well-Known Member Licensed User

    Thanks I will read up on what's there.
    Thanks mangojack.
    I think tomorrow I'll go back to square one with Erel's tutorials and Klaus' booklets. It's late here and the more I look at it the less I understand.

    Regards Roger
     
  7. rraswisak

    rraswisak Active Member Licensed User

    Try this set of query statements:

    CREATE TABLE table_temp AS SELECT DISTINCT * FROM table_duplicate;
    DELETE FROM table_duplicate;
    INSERT INTO table_duplicate SELECT * FROM table_temp;
    DROP TABLE table_temp;

    or

    CREATE TABLE table_temp AS SELECT DISTINCT * FROM table_duplicate;
    DROP TABLE table_duplicate;
    CREATE TABLE table_duplicate AS SELECT * FROM table_temp;
    DROP TABLE table_temp;

    or

    CREATE TABLE table_temp AS SELECT DISTINCT * FROM table_duplicate;
    DROP TABLE table_duplicate;
    ALTER TABLE table_temp RENAME TO table_duplicate;

     
    Last edited: Oct 12, 2019 at 3:06 PM
    OliverA and Roger Daley like this.
  8. mangojack

    mangojack Well-Known Member Licensed User


    I tried the above code supplied by @rraswisak but could not get it to copy Distinct records ... Maybe I was doing it wrong .

    Anyway , as a part of learning new .. I had a play with Unique, Insert Or Ignore.

    The attached example is a BASIC example of extracting distinct / unique records from one table to another etc.

    It might be of interest to you or others. If so it can be adapted to suit.

    I cant emphasize more the need to backup your DB 's before you apply any of this ! I am no expert in this Field.


    @Roger Daley .. you mention DBUtils above , I dont know how reliant you are on this in your project/s , But personally when i was new to B4X and knew nothing
    of SQLite ect ... I played with DBUtils as well . I found I was spending more time trying to learn the workings of the Util. rather than SQL itself.

    I ditched it and concentrated on SQLite directly .. happy that I did. This is my Personal Opinion.

    Cheers.
     

    Attached Files:

    Roger Daley likes this.
  9. Roger Daley

    Roger Daley Well-Known Member Licensed User

    Hi All

    Attempted to use the code below, with the error:
    java.lang.ClassCastException: android.database.sqlite.SQLiteCursor cannot be cast to java.util.List
    at line:

    For i = 0 To Table1.Size - 1
    EDIT: 14 Oct. the line:
    Table1 = SQL1.ExecQuery("SELECT distinct * FROM sites")
    is the real problem. You can't use a LIST [Table1] for a SQL Query.


    Most of code in the sub is from the existing code which loads a CSV ito the db. I can't see why it works in the existing code but not in this.
    The error line has no meaning to me.

    Code:
    Sub SQLDuplicate                
        
    Private Table1 As List
        
    Private Table2 As List
        
    Private Items() As String
        Table1.Initialize
        Table2.Initialize
       
        Table1.Clear
        Table1 = SQL1.ExecQuery(
    "SELECT distinct * FROM sites")
       
        
    For i = 0 To Table1.Size - 1
            
    Private Map1 As Map
            Map1.Initialize
            Items = Table1.Get(i)
            Map1.Clear
            Map1.Put(
    "SiteName", Items(0))
            Map1.Put(
    "Longitude", Items(1))
            Map1.Put(
    "Latitude", Items(2))
            Table2.Add(Map1)
        
    Next
        SQL1.ExecNonQuery(
    "DELETE FROM sites")
        DBUtils.InsertMaps(SQL1, 
    "sites", Table2)
    End Sub
    Any clues happily received.

    Regards Roger
     
    Last edited: Oct 13, 2019 at 10:03 PM
  10. emexes

    emexes Well-Known Member Licensed User

    You don't have to match every record with every other record, ie n^2 operations; the simplest non-bug-prone method would be to go through each record and do a lookup for likely duplicates using keys like name, birthdate, telephone number, and then check just those lookup results for fully-duplicate matches on all other fields (and delete any found). This will be more like a n log n operation.

    Another way is to select all records, sorted by all fields, and then all duplicates will be grouped together when you scan through the sorted results in order. eg (psuedocode from another dialect of BASIC):
    Code:
    if SortResult.Count Then
        ThisRecord = SortResult.First
        
    Do Until SortResult.End
            LastRecord = ThisRecord
            ThisRecord = SortResult.Next

            
    If ThisRecord.FirstName = LastRecord.FirstName Then
            
    If ThisRecord.LastName.Trim.UpperCase = LastRecord.LastName.Trim.UpperCase
            
    If ThisRecord.Telephone.Filter("[09]*") = LastRecord.Telephone.Filter("[09]*"Then

                SortResult.Delete    
    'current record, but leaves cursor in position re: .Next and .End

            
    End If
            
    End If
            
    End If
        
    Loop
    End If
     
    Roger Daley and mangojack like this.
  11. rraswisak

    rraswisak Active Member Licensed User

    actualy it works for me

    base on above statement the first my thought is using distinct (*) with all column

    duplicate.gif

    I attach working sample project to prove it
     

    Attached Files:

    fredo, AnandGupta, OliverA and 4 others like this.
  12. mangojack

    mangojack Well-Known Member Licensed User

    Not really necessary ... As stated above, It was most likely something I was doing wrong. :)

    Running this "CREATE TABLE table_temp AS SELECT DISTINCT * FROM MyTable"
    for some reason just copied over all records ... Including exact duplicates.

    I did not spend much time on it to figure out why.
     
    rraswisak likes this.
  13. rraswisak

    rraswisak Active Member Licensed User

    what a coincidence that i face same problem until i realize that i still open/browse the original table, not the new one :D
     
  14. Roger Daley

    Roger Daley Well-Known Member Licensed User


    I am not at the computer at the moment but looking at the code as text on the phone it looks brilliant in it's simplicity. I obviously stuffed up the syntax when I tried it.
    It will probably be tomorrow before I can try again but will report on the results.

    Regards Roger
     
    emexes likes this.
  15. Roger Daley

    Roger Daley Well-Known Member Licensed User

    rraswisak
    Many thanks for your working example. For me it was a great tutorial on SQL and I was correct, my first attempts to use your suggestions screwed up the syntax. Working examples are the best teachers, I now have a little bit better understanding of SQL although I don't know what/why the "Transaction" statements do.


    From your example I derived the following SUB and it is working perfectly.


    Regards Roger



    Code:
    Sub SQLDuplicate(SQLD As SQL, TableName As String)   
        SQLD.BeginTransaction
        SQLD.ExecNonQuery(
    "CREATE TABLE table_temp AS SELECT DISTINCT * FROM "& TableName)
        SQLD.ExecNonQuery(
    "DROP TABLE "& TableName)
        SQLD.ExecNonQuery(
    "ALTER TABLE table_temp RENAME TO "& TableName)
        SQLD.TransactionSuccessful
        SQLD.EndTransaction
    End Sub
     
    mangojack likes this.
Loading...
  1. This site uses cookies to help personalise content, tailor your experience and to keep you logged in if you register.
    By continuing to use this site, you are consenting to our use of cookies.
    Dismiss Notice