I need to verify that a text file (a word list which might contain thousands of items) contains no duplicate entries.
To check whether there are any duplicates, I am using the Hashtable object from Agraham's Collection library as follows:
B4X:
FileOpen(c2, "MyTextFile.txt", cRead)
s = FileRead(c2)
Do Until s = EOF
    If hash.ContainsKey(s) Then
        'duplicate found: take note of the key (word) and do something
    Else
        hash.Add(s, strAt(s, 0))
    End If
    s = FileRead(c2)
Loop
FileClose(c2)
The above works because hashtable keys must be unique: if I tried to add a duplicate key, I would get an error, so I test with ContainsKey first.
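For comparison, the same single-pass technique can be sketched in Python rather than B4X (illustrative only, assuming one word per line); the set plays the role of the hashtable's keys:

```python
# Single-pass duplicate check using a hash-based set,
# the same idea as the Hashtable approach above.
def find_duplicates(words):
    seen = set()          # hash-based membership test, O(1) on average
    duplicates = []
    for word in words:
        if word in seen:
            duplicates.append(word)   # take note of the duplicate word
        else:
            seen.add(word)
    return duplicates

# "apple" appears twice, so it is reported once as a duplicate
print(find_duplicates(["apple", "pear", "apple", "plum"]))  # prints ['apple']
```

Either way, the whole file is read exactly once and each word costs only one hash lookup, so this approach is already close to optimal for this problem.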
Do you have any other, faster suggestions for checking for duplicates? I thought about loading the text file into two separate arrays and then checking one array's words against the other's, but I think that would be slower.
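The two-array idea would indeed be slower: comparing every word against every other word is a nested loop, on the order of n² comparisons, while the hashtable does one roughly constant-time lookup per word. A back-of-the-envelope comparison (Python, illustrative arithmetic only):

```python
# Cost comparison for n = 90000 words:
# nested-loop check: ~n*(n-1)/2 pairwise comparisons
# hashtable check:   ~n lookups
n = 90000
nested_comparisons = n * (n - 1) // 2   # about 4 billion comparisons
hash_lookups = n                        # 90000 lookups
print(nested_comparisons // hash_lookups)  # prints 44999
```

So for 90,000 words the nested-loop approach does roughly 45,000 times more work than the hashtable approach.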
As mentioned above, I am talking about a lot of words, possibly 80,000-90,000 items.
Any advice would be appreciated.
rgds,
moster67