DW Contributor

Joined: 21 Mar 2003 Posts: 175 Location: UK
|
Posted: Sun Sep 11, 2005 4:47 pm Post subject: Help speeding up duplicate removal |
|
|
The code I came up with is this
| Code: |
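repeat
  rem Rewind list 2 so the search starts from its first item
  if @greater(@count(2),0)
    list seek,2,0
  end
  rem Look for the item in list 2, then confirm with an exact compare
  if @match(2,@item(1))
    if @not(@equal(@item(1),@item(2),EXACT))
      list add,2,@item(1)
    end
  else
    list add,2,@item(1)
  end
  rem Drop the item just processed from list 1
  list delete,1
until @equal(@count(1),0)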
|
When this code is given a large list (10,000 lines or more), it takes a long time to remove the dupes.
Is there a faster way to do it, or a freeware tool that I can run hidden to do the same thing?
Thank you |
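For anyone wanting the external-tool route asked about above, here is a minimal sketch in Python (not VDS), assuming the list has been saved as a plain text file with one item per line. The names `dedupe_lines` and `dedupe_file` and the file names are hypothetical. Like the loop above, it keeps the first occurrence of each line and compares exactly (case-sensitively), but it records the lines already seen in a set, so each lookup takes roughly constant time instead of a scan through list 2.

```python
import sys


def dedupe_lines(lines):
    """Return lines with later duplicates removed, keeping first-seen order."""
    seen = set()
    unique = []
    for line in lines:
        if line not in seen:  # exact, case-sensitive comparison
            seen.add(line)
            unique.append(line)
    return unique


def dedupe_file(src_path, dst_path):
    """Read src_path, drop duplicate lines, write the result to dst_path."""
    with open(src_path, encoding="utf-8") as src:
        lines = src.read().splitlines()
    with open(dst_path, "w", encoding="utf-8") as dst:
        for line in dedupe_lines(lines):
            dst.write(line + "\n")


if __name__ == "__main__":
    # Hypothetical usage: python dedupe.py input.txt output.txt
    if len(sys.argv) == 3:
        dedupe_file(sys.argv[1], sys.argv[2])
```

Unlike a sort-based approach, this leaves the surviving lines in their original order.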
SnarlingSheep Professional Member


Joined: 13 Mar 2001 Posts: 759 Location: Michigan
|
Posted: Sun Sep 11, 2005 5:07 pm Post subject: |
|
|
I might not be reading the code right, but it looks like list 2 ends up with everything list 1 has in it, except for the duplicates that are already in list 2.
So couldn't you put list 1 and list 2 items into a 3rd list, and then use LIST SORT,3? Seems like it'd be the same end result to me. _________________ -Sheep
My pockets hurt... |
DW Contributor

Joined: 21 Mar 2003 Posts: 175 Location: UK
|
Posted: Sun Sep 11, 2005 5:39 pm Post subject: |
|
|
What it should be doing is this:
list1 starts with all the data and list2 is empty.
I check that the current top item in list1 is not in list2.
If it's not, I add it to list2; if it does exist, I just delete it from list1.
Do you mean I can just put all my data into one list and sort it to remove all the dupes?
Does this bug/feature still exist? |
SnarlingSheep Professional Member


Joined: 13 Mar 2001 Posts: 759 Location: Michigan
|
Posted: Mon Sep 12, 2005 1:37 am Post subject: |
|
|
Ah, yeah you should be able to use LIST SORT,1.
Give it a try and see if it works for what you want.. _________________ -Sheep
My pockets hurt... |
DW Contributor

Joined: 21 Mar 2003 Posts: 175 Location: UK
|
Posted: Mon Sep 12, 2005 6:49 am Post subject: |
|
|
I don't think the sort bug works anymore. I tried it as you said, but all it did was sort my list.
So any other ideas?
Maybe an external DOS duplicate checker of some sort? Does anyone know of one?
Any ideas welcome. |
Dr. Dread Professional Member


Joined: 03 Aug 2001 Posts: 1065 Location: Copenhagen, Denmark
|
Posted: Mon Sep 12, 2005 2:13 pm Post subject: |
|
|
DW wrote: I dont think the sort bug works anymore.
I think it does.... But you gotta create the list as sorted: LIST CREATE,1,SORTED
Greetz
Dr. Dread _________________ ~~ Alcohol and calculus don't mix... Don't drink and derive! ~~
String.DLL * advanced string processing |
Dr. Dread Professional Member


Joined: 03 Aug 2001 Posts: 1065 Location: Copenhagen, Denmark
|
Posted: Mon Sep 12, 2005 2:16 pm Post subject: |
|
|
PS: Take care with large VDS lists - if you do a LIST LOADFILE, it will probably break around 100,000 items.
Perhaps use an @ok() check after the list load.
Greetz
Dr. Dread _________________ ~~ Alcohol and calculus don't mix... Don't drink and derive! ~~
String.DLL * advanced string processing |
DW Contributor

Joined: 21 Mar 2003 Posts: 175 Location: UK
|
Posted: Tue Sep 13, 2005 6:41 am Post subject: |
|
|
Thanks, will let you know what happened when I have had time to check it out. |
Serge Professional Member


Joined: 04 Mar 2002 Posts: 1480 Location: Australia
|
Posted: Tue Sep 20, 2005 1:34 am Post subject: |
|
|
If I may clarify what was said above: if you copy a list into a list that was created with the SORTED option, all duplicates are automatically deleted.
You do not need to sift through the list item by item to check whether each one is duplicated.
This works with VDS 5; I am writing a program now that uses this feature to remove dupes.
serge _________________
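To illustrate why no item-by-item check is needed, here is a minimal sketch in Python (not VDS) of a container that behaves the way the posts above describe a SORTED list: it keeps its items in order and silently skips anything already present. The class `SortedUniqueList` and the sample data are hypothetical, and the sketch only models the described behaviour; it does not show how VDS implements it. It compares case-sensitively, which may differ from what VDS does.

```python
import bisect


class SortedUniqueList:
    """A list that stays sorted and silently ignores duplicate items."""

    def __init__(self):
        self._items = []

    def add(self, item):
        # Binary search for the position where the item belongs.
        pos = bisect.bisect_left(self._items, item)
        if pos < len(self._items) and self._items[pos] == item:
            return False  # already present, so skip it
        self._items.insert(pos, item)
        return True

    def items(self):
        return list(self._items)


# Copying an unsorted list with duplicates into the sorted container.
source = ["pear", "apple", "pear", "fig", "apple"]
target = SortedUniqueList()
for entry in source:
    target.add(entry)
print(target.items())  # prints ['apple', 'fig', 'pear']
```

The trade-off is that the result comes back in sorted order rather than the original order, which matches what was reported earlier in the thread about LIST SORT reordering the list.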