Converting Very Large HTML files to text equivalents

 
Henrik
Valued Newbie


Joined: 09 Jul 2000
Posts: 35
Location: Copenhagen, Denmark

Posted: Sun Mar 10, 2002 4:43 am    Post subject: Converting Very Large HTML files to text equivalents

Hi guys!

I want to convert very large HTML files (1 - 10 MB) to their text-only equivalents.

Because of the list loadfile memory bug in VDS, I tried to do this by reading the file in chunks with the code below.

The script uses blowfish.dll, the VDSINET DLL and the VDSBIN DLL.

Code:

  %%KEY = MYKEY

  title WebAnalyzer: Export Report as Text

  external @path(%0)export.dll
  external @path(%0)html.dll
  external @path(%0)loader.dll
  option errortrap,error

  DIALOG CREATE,WebAnalyzer: Export As Text,100,100,503,47,NOSYS,ONTOP
  DIALOG ADD,TEXT,text1,6,4,100,16,Choose files ...
  DIALOG ADD,PROGRESS,PROGRESS1,12,134,358,24
  DIALOG ADD,TEXT,TEXT2,22,5,,,
  DIALOG SHOW

  list create,1
  list create,2

:START
  list clear,1
  %%TEXTBUFFER =
  %%INFILE = @filedlg("WebAnalyzer Reports (*.wra)|*.wra",Select Report to Save as Text,%1,)
  if @not(%%INFILE)
    goto close
  end

  %%filename = @filedlg("All files (*.*)|*.*",Select Output Text file,@path(%%INFILE)@name(%%INFILE).txt,SAVE)

  if %%filename
    file delete,%%FILENAME
    dialog set,text1,Exporting - Step 1 ...
    %%TEMP = @Blowfish(DecryptFile,%%Key,%%INFILE,c:\@name(%%FILENAME).tmp)
    fileio open,c:\@name(%%FILENAME).tmp,RW,denynone
    fileio seek,0,start

    dialog set,text1,Exporting - Step 2 ...

    %%FP = -1
    repeat
      fileio seek,@succ(%%FP),start
      %%READ = @fileio(read,30000)
      %%HTML = @FILEIO(HEX2STRING, %%READ)
      %%TEXT = @net(html,%%HTML)
      %%TEXTBUFFER = %%TEXTBUFFER%%TEXT
      %%FP = @sum(%%FP,30000)
      rem info %%FP
      rem info %%TEXTBUFFER
      gosub UPDATE_PROGRESSBAR
      dialog set,text2,%%FP / @file(%%INFILE,Z) Bytes
    until @greater(%%FP,@file(%%INFILE,Z))
    dialog set,text1,Exporting - Step 3 ...
    wait 1
    rem %%TEXTBUFFER = @net(html,%%TEXTBUFFER)
    list add,1,%%TEXTBUFFER
    list savefile,1,%%FILENAME
    fileio close
    dialog set,text1,Exporting - Done !
  end
  goto close

rem ***
rem *** ERROR HANDLER
rem ***
:error
  if @equal(@error(E),901)
    warn Error with opening the file. File may be in use!
  else
    if @equal(@error(E),902)
      warn Error with creating the file.
    else
      if @equal(@error(E),903)
        warn Error with reading from the file.
      else
        if @equal(@error(E),904)
          warn Error with writing to the file.
        else
          warn Unexpected error @error(E)!
        end
      end
    end
  end
  goto evloop

rem ***
rem *** CLOSE
rem ***
:close
  file delete,c:\@name(%%FILENAME).tmp
  fileio close
  exit

rem ***
rem *** UPDATE PROGRESSBAR - OK
rem ***
:UPDATE_PROGRESSBAR
  %%PROGRESS = @format(@fmul(@fdiv(%%FP,@file(%%INFILE,Z)),100),3.0)
  dialog set,progress1,%%PROGRESS
  exit



The problem is that only part of the converted text ends up in the output file, far from the whole thing.
Does anyone have any idea what's wrong?

Or maybe what this really needs is a DLL that takes an input HTML file of any size and writes its text-only equivalent to an output file?

Thanks!

Henrik

_________________
Henrik Skov
Email: henrikskov@mail.dk
Zeitenwanderer
Newbie


Joined: 26 Feb 2002
Posts: 10

Posted: Sun Mar 10, 2002 10:45 pm    Post subject: www.jafsoft.com

Not sure if I understand you correctly, but apart from your own script, do you know the wonderful tool asctohtml?

It would do the job you describe, and it does it very well.
Have a look at www.jafsoft.com if you want more details.

Say hello to the author from his German translator if you find a guestbook. :)

Cheers,

George
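
A follow-up sketch for the original question, for anyone who finds this thread later. Judging from the script alone, two things look suspicious: the loop converts each 30000-byte chunk on its own, so a tag that straddles a chunk boundary is handed to the converter in two halves, and the loop end and the progress bar use the size of %%INFILE (the encrypted .wra file) rather than the size of the decrypted .tmp file that is actually being read. Whether either of these explains the short output is a guess. The Python below is not VDS and does not use any of the DLLs from the script; it only illustrates the idea of feeding chunks to a parser that keeps incomplete tags buffered between reads and writes text out as it goes. The names TextExtractor, html_to_text and chunk_size are made up for this example, and the decryption step is left out entirely.

```python
import io
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Writes the text content of an HTML stream to a file-like object."""

    SKIP = {"script", "style"}                   # content dropped entirely
    BREAK = {"br", "p", "div", "tr", "li", "h1", "h2", "h3"}  # start a new line

    def __init__(self, out):
        super().__init__(convert_charrefs=True)  # turns &amp; into & etc.
        self.out = out
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
        elif tag in self.BREAK:
            self.out.write("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.write(data)


def html_to_text(src, dst, chunk_size=30000):
    """Read src in chunks and write the text-only equivalent to dst.

    The parser keeps any incomplete tag buffered until the next chunk
    arrives, so chunk boundaries may fall anywhere in the markup.
    """
    parser = TextExtractor(dst)
    while True:
        chunk = src.read(chunk_size)
        if not chunk:                            # empty read means end of input
            break
        parser.feed(chunk)
    parser.close()                               # flush whatever is still buffered


if __name__ == "__main__":
    # Tiny chunk size on purpose, so that tags get split across reads.
    demo_in = io.StringIO("<p>Hello <b>wor</b>ld</p>")
    demo_out = io.StringIO()
    html_to_text(demo_in, demo_out, chunk_size=4)
    print(repr(demo_out.getvalue()))
```

For real files, src and dst would be objects returned by open() in text mode with a suitable encoding. Because the text is written as it is produced, memory use stays roughly at one chunk instead of growing with a buffer like %%TEXTBUFFER.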