[Open Source]: Internet Spider

 
FreezingFire
Admin Team


Joined: 23 Jun 2002
Posts: 3508

Posted: Sun Mar 02, 2003 4:05 pm    Post subject: [Open Source]: Internet Spider

Here's a project which I think would be cool to develop: a link crawler for
indexing pages on the internet. It's not something I will have a lot of use
for but I think it would be interesting to see where we end up.

Here's a bit of code I was working on but didn't know how to finish.
I don't know how to extract the URLs from the pages.

Code:
  option scale,96
  option fieldsep,"|"
  option decimalsep,"."
  external vdsipp.dll

  %%URL      = http://www.vdsworld.com/
  REM -- Set the script to run for five minutes --
  %%StopTime = @fadd(@datetime(n),5)
 
  INTERNET HTTP,CREATE,1
  INTERNET HTTP,THREADS,1,ON
 
  INTERNET HTTP,PROTOCOL,1,1
  INTERNET HTTP,USERAGENT,1,VDSWORLD Internet Spider
  REM -- List 1 collects the URLs found; list 2 holds the page content --
  list create,1
  list create,2

:evloop
  if @equal(@datetime(n),%%StopTime)
    goto close
  end
  wait event
  goto @event()

:BUTTON1BUTTON
  dialog disable,button1
  dialog disable,edit1
  INTERNET HTTP,GETHEADER,1,%%URL
  goto evloop


  REM -- Just in case we want future header processing in this script --
:HTTP1ONGETHEADERDONE 
  REM -- Get the page --
  INTERNET HTTP,GET,1,%%URL
  GOTO EVLOOP

:HTTP1ONGETDONE
  list assign,2,@internet(http,content,1)
  gosub ProcessPage
  goto evloop

:ProcessPage
  REM -- I don't know how to extract the URLs from the page --
  REM if @match(1,http://)
  REM   list add,1,@item(1, @match(1,http://))
  REM end
  exit

:CLOSE
  rem ** Always destroy the client protocols before exiting
  rem your script to prevent errors and crashes; also use
  rem a STOP in case a download is occurring **
  INTERNET HTTP,STOP,1
  INTERNET HTTP,DESTROY,1
  exit
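
The script above stops by testing two clock readings for equality, and it still lacks the loop that feeds newly found links back into the spider. As a rough illustration of both ideas, here is a minimal sketch in Python (not VDS). The names `crawl`, `fetch`, `extract_links`, and `time_limit_s` are all hypothetical and not part of the original script; `fetch` and `extract_links` are passed in so the sketch runs without network access.

```python
import time
from collections import deque


def crawl(start_url, fetch, extract_links, time_limit_s=300.0, clock=time.monotonic):
    """Breadth-first crawl until the queue is empty or the time limit passes.

    fetch(url) returns a page; extract_links(page) returns an iterable of URLs.
    Both are hypothetical callables supplied by the caller.
    """
    # Compare against a deadline with "<" instead of testing for equality,
    # so the loop still stops if the exact stop reading is never observed.
    deadline = clock() + time_limit_s
    queue = deque([start_url])
    seen = {start_url}   # URLs already queued, to avoid revisiting
    visited = []         # URLs actually fetched, in order
    while queue and clock() < deadline:
        url = queue.popleft()
        page = fetch(url)
        visited.append(url)
        for link in extract_links(page):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited
```

With an in-memory dictionary standing in for the web, `crawl("a", pages.__getitem__, lambda links: links)` visits each reachable key once.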

_________________
FreezingFire
VDSWORLD.com
Site Admin Team
Skit3000
Admin Team


Joined: 11 May 2002
Posts: 2166
Location: The Netherlands

Posted: Sun Mar 02, 2003 4:46 pm

Try something like this to extract the URL...

Code:
if @match(1,http://)
  %%link = @item(1)
  REM -- @pos returns where ".htm" starts, so add 3 to keep the extension --
  %%link = @substr(%%link,@pos(http://,%%link),@sum(@pos(.htm,%%link),3))
end
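
The same substring idea can be sketched in Python (not VDS) for comparison. Note the swap: instead of ending at ".htm", this version ends each URL at the next quote, space, or angle bracket, so links that do not end in ".htm" are also picked up. The function name `extract_http_urls` and the terminator set are assumptions made for illustration.

```python
def extract_http_urls(line, terminators=('"', "'", " ", ">", "<")):
    """Return every substring of line that starts with http:// and runs
    up to (not including) the next terminator character."""
    urls = []
    start = line.find("http://")
    while start != -1:
        # Find the nearest terminator after the start of the URL.
        end = len(line)
        for ch in terminators:
            pos = line.find(ch, start)
            if pos != -1:
                end = min(end, pos)
        urls.append(line[start:end])
        # Continue searching after the URL just extracted.
        start = line.find("http://", end)
    return urls
```

This also handles several links on one line, which a single position lookup per line would miss.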
ShinobiSoft
Professional Member


Joined: 06 Nov 2002
Posts: 790
Location: Knoxville, Tn

Posted: Sun Mar 02, 2003 6:35 pm

I would recommend finding the opening <a ...> and closing </a> anchor tags
first, and then extracting the URL from the href attribute of that anchor string.
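
To illustrate the anchor-tag approach, here is a minimal sketch in Python (not VDS) using the standard library's `html.parser` and `urllib.parse.urljoin`. The names `AnchorCollector` and `extract_links` are hypothetical, and resolving relative links against a base URL is an added assumption rather than something discussed above.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class AnchorCollector(HTMLParser):
    """Collects the href value of every <a> tag, resolved against base_url."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return the list of link targets found in anchor tags of html."""
    parser = AnchorCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.links
```

Anchors without an href (for example, named anchors) are skipped, and relative paths such as "/forum/" become absolute URLs.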

_________________
Bill Weckel
ShinobiSoft Software

"The way is known to all, but not all know it."