forum.vdsworld.com Forum Index forum.vdsworld.com


Ok seeking suggestions for URL filtering

 
marty
Professional Member


Joined: 10 May 2001
Posts: 789

Posted: Mon Apr 24, 2006 2:42 pm    Post subject: Ok seeking suggestions for URL filtering

Ok guys, I need advice and ideas over here...

Here is what I am trying to achieve and what I have done so far.

Application info - URL filtering (300,000 URLs)

Code done so far:

- The VDS app loads the TXT file that contains the 300,000 URLs into an internal list created by VDSLIST.DLL (more efficient than the original VDS one)
- The VDS app checks IE and Firefox every 0.9 sec for bad URLs
If IE is used: I capture the IE URL first (from the comboEX class), then scan my internal list. If found, I send another webpage by DDE to block the site.
If Firefox is used: I capture the URL using DDE, then check the internal list. If found, I send another webpage by DDE to block it.


Now this seems to be the most efficient way I have found so far, BUT it's slow on an older machine (PII 400): it sometimes takes up to 10 seconds before my app sends back the block page... :(

Here is what I have also tried so far:

- Used String.dll instead of VDSList.dll. A bit slower.
- Used SQLite. Also slow.

Thanks in advance...
Skit3000
Admin Team


Joined: 11 May 2002
Posts: 2166
Location: The Netherlands

Posted: Mon Apr 24, 2006 3:41 pm

How about creating multiple lists of URLs? :) Let's say one list for all URLs which start with "a", one for "b", et cetera.
_________________
[ Add autocomplete functionality to your VDS IDE windows! ]
For Dutch beginners with VDS: have a look at this tutorial too!
marty
Professional Member


Joined: 10 May 2001
Posts: 789

Posted: Mon Apr 24, 2006 3:48 pm

Yes, that's an idea Prakash gave me this morning, and I will certainly see if I can do that.

Problem is, the list is one big file, and I would have to do that manually every time there is an update from the guys that maintain that list.

Good suggestion :)
Skit3000
Admin Team


Joined: 11 May 2002
Posts: 2166
Location: The Netherlands

Posted: Mon Apr 24, 2006 3:55 pm

Another way, which might be a good solution, is to load the list into VDS (make sure it is sorted alphabetically) and then create "indexes". Just create a second list, and put the 1000th, 2000th, 3000th, etc. item from the first one into the second. Now, when searching, look in the second list to find between which indexed items the current URL belongs, and start searching the first list from that item.

Edit: little example added :)

Code:
# Your 300,000-item list (kept sorted so the index below is valid)
list create,1,sorted
list add,1,111
list add,1,222
list add,1,333
list add,1,444
list add,1,555
list add,1,666
list add,1,777

# Index list: every %%steps-th item of list 1
list create,2
%i = 0
%%steps = 3
while @greater(@count(1),%i)
  list add,2,@item(1,%i)
  %i = @sum(%i,%%steps)
wend

info Indexed items:@cr()@text(2)

# Locate between which items of list 2 your item falls
%%item = 555
%i = 0
# Adjust this while loop so that text can be compared
while @greater(%%item,@item(2,%i))
  %i = @succ(%i)
wend
# Step back one index entry to get the start position in list 1;
# stay at 0 when the item sorts before the first index entry
if @greater(%i,0)
  %i = @prod(@pred(%i),%%steps)
end
# Search the first list for your item from pos %i
list seek,1,%i
if @match(1,%%item)
  info Found your item: @item(1) at line @index(1) of list 1
end


Note that you should replace the "while @greater(%%item,@item(2,%i))" line with a string compare function that checks whether %%item sorts before or after @item(2,%i)... :)
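
For readers who want to try the idea outside VDS, here is a minimal sketch of the same index-list lookup in Python, not VDS. It uses plain string comparison, which is the change the note above asks for. The names `build_index`, `contains` and `STEP`, and the sample host names, are made up for illustration; the stride of 3 mirrors the example above, while the real list would use something like 1000.

```python
from bisect import bisect_right

STEP = 3  # hypothetical stride; the post suggests 1000 for the real list


def build_index(sorted_urls, step=STEP):
    """Return every step-th entry of the sorted list (the 'second list')."""
    return sorted_urls[::step]


def contains(sorted_urls, index, url, step=STEP):
    """Check whether url is in sorted_urls while scanning only one block."""
    # Count index entries that sort at or before url (string comparison).
    block = bisect_right(index, url) - 1
    if block < 0:
        return False  # url sorts before the first entry of the list
    start = block * step
    end = min(start + step, len(sorted_urls))
    # Exact-match scan of a single block of at most `step` entries.
    return url in sorted_urls[start:end]


# Made-up sample data; the list must be sorted before indexing.
urls = sorted([
    "bad.example", "evil.example", "naughty.org", "spam.example",
    "worse.example", "xxx.example", "zzz.example",
])
index = build_index(urls)
print(contains(urls, index, "naughty.org"))   # True
print(contains(urls, index, "good.example"))  # False
```

Each lookup touches the small index plus one block, instead of walking the whole list.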

marty
Professional Member


Joined: 10 May 2001
Posts: 789

Posted: Mon Apr 24, 2006 5:55 pm

Thanks Skit :) Will look into that.

I will also see if I can do a proxy with VDS. The HTTPX extension is not stable, so I will look at other solutions.
Serge
Professional Member


Joined: 04 Mar 2002
Posts: 1480
Location: Australia

Posted: Tue Apr 25, 2006 3:19 am

marty,

i would use that idea

Quote:
Yes, that's an idea Prakash gave me this morning, and I will certainly see if I can do that.

Problem is the list is one big file and I would have to manually do that every time there is an update from the guys that maintain that list.


1. i would write a little application that would sort the list you receive from your third party into alphabetic files, i.e. 'a.txt' for all URLs that start with 'a', and so on... this would not be hard to do

2. i would build into your program a little routine to check what letter the URL starts with and then only run the check against that file

now, given that lots of URLs start with 'www.', i would totally remove it and work with the rest of the domain ... even from the third party file ... this would lead to a smaller file/list than otherwise, e.g. 'www.naughty.org' would become 'naughty.org' and so on...
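
a rough sketch of the two steps above plus the 'www.' stripping, written in Python rather than VDS so it can be run anywhere. the helper names (`normalize`, `build_buckets`, `is_blocked`) are made up for illustration, and it assumes the list holds bare host names, one per entry, with no 'http://' prefix. it keeps the buckets in memory instead of writing 'a.txt', 'b.txt', and so on, but the grouping is the same.

```python
from collections import defaultdict


def normalize(url):
    """Lower-case a host name and drop a leading 'www.' (hypothetical helper)."""
    host = url.strip().lower()
    if host.startswith("www."):
        host = host[4:]
    return host


def build_buckets(urls):
    """Group normalized hosts by their first character, like a.txt, b.txt, ..."""
    buckets = defaultdict(set)
    for url in urls:
        host = normalize(url)
        if host:  # skip blank entries
            buckets[host[0]].add(host)
    return buckets


def is_blocked(buckets, url):
    """Look only in the bucket that matches the first character of the host."""
    host = normalize(url)
    return bool(host) and host in buckets.get(host[0], ())


# Made-up sample entries standing in for the third party file.
buckets = build_buckets(["www.naughty.org", "bad.example", "WWW.Nasty.example"])
print(is_blocked(buckets, "naughty.org"))       # True
print(is_blocked(buckets, "www.good.example"))  # False
```

because the buckets are rebuilt from the one big file by a program, nothing has to be split by hand when the maintainers send an update.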

just a thought

serge

marty
Professional Member


Joined: 10 May 2001
Posts: 789

Posted: Tue Apr 25, 2006 4:12 pm

Thanks Serge for the suggestions! :)
vdsalchemist
Admin Team


Joined: 23 Oct 2001
Posts: 1448
Location: Florida, USA

Posted: Tue Apr 25, 2006 4:50 pm

Marty,
I agree with Serge's suggestions, since I understand how the list commands and functions work. If the list has many values that share the same prefix, the list commands and functions take longer to find the item of interest, since they work off binary trees. The shared prefixes cause the tree to become one-sided, since insertion uses a string compare function that reports whether the new string is <, >, or = to a string already in the tree. If the new string is < the previously added string, the tree grows to the right; if it is > the previous string, the tree grows to the left. If the strings are equal, most implementations return an error, which is handled in different ways depending on the binary tree package being used and the programmer implementing it.
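
To illustrate the one-sided tree, here is a small Python sketch (not VDS) that assumes a plain, unbalanced binary search tree; whether the VDS list DLLs are really built this way is not confirmed here. In this sketch the tree degenerates because keys with a common prefix arrive in sorted order, and which side counts as "left" or "right" is simply an implementation choice. The key format and all names are made up for the demonstration.

```python
import random


class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Insert key into an unbalanced BST iteratively; duplicates are ignored."""
    if root is None:
        return Node(key)
    node = root
    while True:
        if key == node.key:
            return root
        side = "left" if key < node.key else "right"
        child = getattr(node, side)
        if child is None:
            setattr(node, side, Node(key))
            return root
        node = child


def height(root):
    """Number of levels in the tree, counted level by level (no recursion)."""
    frontier = [root] if root is not None else []
    levels = 0
    while frontier:
        levels += 1
        frontier = [c for n in frontier for c in (n.left, n.right) if c is not None]
    return levels


def build(keys):
    root = None
    for key in keys:
        root = insert(root, key)
    return root


# 1000 made-up keys that share a prefix and are already in sorted order.
keys = ["www.site%04d.example" % i for i in range(1000)]
sorted_height = height(build(keys))        # every key goes to the same side

shuffled = keys[:]
random.Random(0).shuffle(shuffled)         # same keys, random arrival order
shuffled_height = height(build(shuffled))

print(sorted_height, shuffled_height)
```

With sorted input the tree is as tall as the number of keys, so a lookup degrades to a linear scan; the shuffled tree is far shallower.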

With that said, I am implementing a Dictionary command and function in GadgetX that will allow you to do really fast name = value lookups. I will send you a demo ASAP.
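
As a rough idea of what a name = value lookup buys you, here is a Python sketch using the built-in `dict`. This is not the GadgetX Dictionary (its commands are not shown in this thread), only a stand-in for the concept. The function name `load_dictionary` is hypothetical, and the stored value is just the line number of the entry in the source file.

```python
def load_dictionary(lines):
    """Map each non-empty, lower-cased line to its 1-based line number."""
    table = {}
    for lineno, line in enumerate(lines, start=1):
        name = line.strip().lower()
        if name:
            table.setdefault(name, lineno)  # keep the first occurrence
    return table


# Made-up sample lines standing in for the 300,000-line TXT file.
table = load_dictionary(["naughty.org\n", "bad.example\n", "\n", "naughty.org\n"])
print(table.get("bad.example"))   # 2
print(table.get("good.example"))  # None
```

A hash-based lookup like this takes roughly the same time whether the table holds 300 or 300,000 names, and shared prefixes such as 'www.' do not slow it down.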
