[Open Source]: Simple OCR (Optical Character Recognition)

 
Skit3000
Admin Team


Joined: 11 May 2002
Posts: 2166
Location: The Netherlands

Posted: Sun Apr 09, 2006 6:54 pm    Post subject: [Open Source]: Simple OCR (Optical Character Recognition)

Because I make a lot of (paper) notes at school, I started looking for a way to scan them into my computer. I tried some standard applications that do OCR and ICR, but none of them gave good results. That is why I started writing my own OCR/ICR engine.

Normal letters are difficult for computers to tell apart: an "a" looks like a "d", a "c" like an "e" or an "o", and so on. To get around this problem, I created my own characters. The version of the program that comes with this post recognizes eight different characters and outputs the number corresponding to each one. Just as with Morse code and Braille, you can combine two characters to get more combinations (8*8=64). For now my program does not do this, since it is far from finished and it may still turn out to be possible to let it read normal letters.
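
To make the 8*8=64 idea concrete, here is a rough Python sketch (not part of the VDS program, which only outputs single digits). It assumes the recognizer emits the digits 1 to 8 and pairs them up; the names SYMBOLS, pair_to_index and index_to_pair are made up for this illustration.

```python
# Hypothetical illustration: combine two recognized symbols (digits "1".."8")
# into one of 8 * 8 = 64 values, and split such a value back into its pair.
SYMBOLS = "12345678"


def pair_to_index(first, second):
    """Map an ordered pair of symbols to a value in the range 0..63."""
    return SYMBOLS.index(first) * len(SYMBOLS) + SYMBOLS.index(second)


def index_to_pair(value):
    """Inverse of pair_to_index: return the (first, second) symbols."""
    high, low = divmod(value, len(SYMBOLS))
    return SYMBOLS[high], SYMBOLS[low]
```

With 64 values there would be room for the alphabet in both cases plus digits, but which pair stands for which letter is a choice the post leaves open.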

I used an ordinary ballpoint pen to write my characters on ordinary lined paper. For better recognition results you can use a black fineliner.

Here is the source code; maybe some people can improve it or use it in their own projects... :) The images used can be found in the attached zip file. :)

Code:
  # Turn the output bitmap on or off
  %%DebugOutput = on

  #define command,image
  #define function,image,IsPixel,RecognizeChar,IsCharPixel

  # Enter your registration info for the vdsimage.dll below
  # vdsimage.dll is a part of Tommy Sools' VDSDLL3
  external @path(%0)vdsimage.dll

  option decimalsep,.
  %%Inputfile = @path(%0)Input.bmp

  title Basic OCR

  DIALOG CREATE,Basic OCR,-1,0,850,500
  DIALOG ADD,STYLE,sBLUE,,,,BLUE,
  DIALOG ADD,STYLE,sRED,,,,RED,
  DIALOG ADD,STYLE,sGRAY,,,,SILVER,
  DIALOG ADD,STYLE,sOUTPUT,Courier New,8,,,
  DIALOG ADD,LINE,LINE1,348,-2,854,150
  DIALOG ADD,BITMAP,bInput,0,0,852,350,%%Inputfile
  DIALOG ADD,TEXT,Leftmargin,0,113,1,350,,,sRED,CLICK,HSPLIT
  DIALOG ADD,TEXT,Rightmargin,0,801,1,350,,,sBLUE,CLICK,HSPLIT
  DIALOG ADD,TEXT,Firstline,90,0,852,1,,,sRED,CLICK,VSPLIT
  DIALOG ADD,TEXT,Helpline1,119,0,852,1,,,sGRAY
  DIALOG ADD,TEXT,Helpline2,148,0,852,1,,,sGRAY
  DIALOG ADD,TEXT,Secondline,177,0,852,1,,,sBLUE,CLICK,VSPLIT
  DIALOG ADD,BUTTON,bStart,366,12,192,24,Start processing
  DIALOG ADD,BUTTON,bStop,396,12,192,24,Stop processing
  DIALOG ADD,PROGRESS,pStatus,450,12,192,18,0
  DIALOG ADD,EDIT,eOutput,366,246,486,102,How to use:@cr()@lf()@cr()@lf()Put the horizontal red line on the first line of your picture. Put the blue line on the fourth line. (Note the helplines will take positions on the second and third line) Now specify the left and right margin with the two vertical lines and press the Start processing button!,,MULTI,WRAP,SCROLL,TABS,sOUTPUT
  DIALOG ADD,MENU,File,Open bitmap,Close
  DIALOG ADD,STATUS,Statusbar,
  DIALOG ADD,BITMAP,bChars,366,756,42,108,@path(%0)Chars.bmp
  DIALOG SHOW

  dialog disable,bStop

:Evloop
  wait event
  goto @event()

:LeftmarginCLICK
  # Move the left margin line
  %%MousedownX = @diff(@mousepos(X),@dlgpos(Leftmargin,L))
  while @mousedown(L)
    if @greater(@diff(@dlgpos(Rightmargin,L),30),@diff(@mousepos(X),%%MousedownX))
      dialog setpos,Leftmargin,0,@diff(@mousepos(X),%%MousedownX)
    end
  wend
  goto Evloop

:RightmarginCLICK
  # Move the right margin line
  %%MousedownX = @diff(@mousepos(X),@dlgpos(Rightmargin,L))
  while @mousedown(L)
    if @greater(@diff(@mousepos(X),%%MousedownX),@sum(@dlgpos(Leftmargin,L),30))
      dialog setpos,Rightmargin,0,@diff(@mousepos(X),%%MousedownX)
    end
  wend
  if @greater(@dlgpos(Rightmargin,L),@dlgpos(,W))
    dialog setpos,Rightmargin,0,@diff(@dlgpos(,W),10)
  end
  goto Evloop

:FirstlineCLICK
  # Move the first horizontal line + the helplines
  %%MousedownY = @diff(@mousepos(Y),@dlgpos(Firstline,T))
  while @mousedown(L)
    if @greater(@diff(@dlgpos(Secondline,T),10),@diff(@mousepos(Y),%%MousedownY))
      dialog setpos,Firstline,@diff(@mousepos(Y),%%MousedownY)
      # Adjust helplines
      dialog setpos,Helpline1,@sum(@dlgpos(Firstline,T),@div(@diff(@dlgpos(Secondline,T),@dlgpos(Firstline,T)),3))
      dialog setpos,Helpline2,@sum(@dlgpos(Firstline,T),@prod(@div(@diff(@dlgpos(Secondline,T),@dlgpos(Firstline,T)),3),2))
    end
  wend
  goto Evloop

:SecondlineCLICK
  # Move the second horizontal line + the helplines
  %%MousedownY = @diff(@mousepos(Y),@dlgpos(Secondline,T))
  while @mousedown(L)
    if @greater(@diff(@mousepos(Y),%%MousedownY),@sum(@dlgpos(Firstline,T),10))
      dialog setpos,Secondline,@diff(@mousepos(Y),%%MousedownY)
      # Adjust helplines
      dialog setpos,Helpline1,@sum(@dlgpos(Firstline,T),@div(@diff(@dlgpos(Secondline,T),@dlgpos(Firstline,T)),3))
      dialog setpos,Helpline2,@sum(@dlgpos(Firstline,T),@prod(@div(@diff(@dlgpos(Secondline,T),@dlgpos(Firstline,T)),3),2))
    end
  wend
  if @greater(@dlgpos(Secondline,T),@dlgpos(bInput,H))
    dialog setpos,Secondline,@diff(@dlgpos(bInput,H),10)
  end
  goto Evloop

:Open BitmapMENU
  # Open a new file
  %%Temp = @filedlg(*.bmp)
  if @file(%%Temp)
    %%Inputfile = %%Temp
    dialog set,bInput,%%Inputfile
  end
  goto Evloop

:bStartBUTTON
  dialog disable,bStart
  dialog enable,bStop
  dialog clear,eOutput
 
  image open,%%Inputfile

  %%ImageHeight = @image(height)
  %%ImageWidth = @image(width)

  %%Firstline = @dlgpos(Firstline,T)
  %%Secondline = @dlgpos(Secondline,T)
  %%Leftmargin = @dlgpos(Leftmargin,L)
  %%Rightmargin = @dlgpos(Rightmargin,L)

  %%LineHeight = @div(@diff(@dlgpos(Secondline,T),@dlgpos(Firstline,T)),3)
  %%LineWidth = @diff(%%Rightmargin,%%Leftmargin)

  %%CurrentLine = 0
  %%CurrentLineY = %%Firstline

  %%TotalLines = @div(@diff(%%ImageHeight,%%Firstline),%%LineHeight)

  dialog set,pStatus,@format(@fmul(@fdiv(1,%%TotalLines),100),0.0)
  dialog set,Statusbar,Decoding line 1 of %%TotalLines

  # Loop through all lines or until the stop button is pressed
  while @both(@greater(%%ImageHeight,@sum(%%CurrentLineY,%%LineHeight)),@unequal(@event(),bStopBUTTON))
    %%CurrentColumn = 0
    %%CurrentColumnX = %%Leftmargin
   
    %%LeftCharBorder = -1
    %%RightCharBorder = -1
    # Loop through all columns from %%Leftmargin to %%Rightmargin
    while @greater(%%Rightmargin,%%CurrentColumnX)
      # Loop through all pixels of the current column to see if it is empty or not.
      # Empty pixels have a value of 9500000 or higher. Non-empty pixels have a value
      # below 9500000 (see IsPixel).
      %%NonEmptyPixel = 0
      %%CurrentPixel = 0
      %%CurrentPixelY = %%CurrentLineY
      while @both(@greater(@sum(%%CurrentLineY,%%LineHeight),%%CurrentPixelY),@equal(%%NonEmptyPixel,0))
        %%Pixel = @IsPixel(%%CurrentColumnX,%%CurrentPixelY)
        # Check if the pixel is empty or not
        if %%Pixel
          %%NonEmptyPixel = @succ(%%NonEmptyPixel)
        end
        %%CurrentPixel = @succ(%%CurrentPixel)
        %%CurrentPixelY = @sum(%%CurrentLineY,%%CurrentPixel)
      wend
      if @greater(%%NonEmptyPixel,0)
        # This column contains a pixel (begin of character)
        if @equal(%%RightCharBorder,-1)
          if @equal(%%LeftCharBorder,-1)
            # The left border is determined
            %%LeftCharBorder = %%CurrentColumnX
            %%RightCharBorder = -1
          end
        end
      else
        # This column is empty (end of character)
        if @unequal(%%LeftCharBorder,-1)
          %%RightCharBorder = @succ(%%RightCharBorder)
          if @equal(%%RightCharBorder,1)
            # The right border is determined
            %%RightCharBorder = @pred(%%CurrentColumnX)
            if @greater(@diff(%%RightCharBorder,%%LeftCharBorder),5)
              %%TempTop = %%CurrentLineY
              %%TempBottom = @sum(%%CurrentLineY,%%LineHeight)
              %%TempLeft = %%LeftCharBorder
              %%NonEmptyPixel = 0
              # Determine the exact top
              while @equal(%%NonEmptyPixel,0)
                while @both(@greater(%%RightCharBorder,%%TempLeft),@equal(%%NonEmptyPixel,0))
                  if @ispixel(%%TempLeft,%%TempTop)
                    %%NonEmptyPixel = 1
                  end
                  %%TempLeft = @succ(%%TempLeft)
                wend
                %%TempLeft = %%LeftCharBorder
                %%TempTop = @succ(%%TempTop)
              wend
              %%TempLeft = %%LeftCharBorder
              %%NonEmptyPixel = 0
              # Determine the exact bottom
              while @equal(%%NonEmptyPixel,0)
                while @both(@greater(%%RightCharBorder,%%TempLeft),@equal(%%NonEmptyPixel,0))
                  if @ispixel(%%TempLeft,%%TempBottom)
                    %%NonEmptyPixel = 1
                  end
                  %%TempLeft = @succ(%%TempLeft)
                wend
                %%TempLeft = %%LeftCharBorder
                %%TempBottom = @pred(%%TempBottom)
              wend
              # Check if a space should be added before the character
              if @greater(%%LeftCharBorder, @sum(%%LastRightBorder,@diff(%%RightCharBorder,%%LeftCharBorder),10))
                # Check if the character is not the first one in a line
                if %%LastRightBorder
                  dialog set,eOutput,@dlgtext(eOutput)" "
                end
              end
              # Recognize character
              dialog set,eOutput,@dlgtext(eOutput)@RecognizeChar(%%TempTop,%%TempBottom,%%LeftCharBorder,%%RightCharBorder)
              if @equal(%%DebugOutput,on)
                image line,%%TempTop,%%LeftCharBorder,@diff(%%RightCharBorder,%%LeftCharBorder),0,BLACK
                image line,%%TempBottom,%%LeftCharBorder,@diff(%%RightCharBorder,%%LeftCharBorder),0,BLACK
                image line,%%TempTop,%%LeftCharBorder,0,@diff(%%TempBottom,%%TempTop),BLACK
                image line,%%TempTop,%%RightCharBorder,0,@diff(%%TempBottom,%%TempTop),BLACK
              end
              %%LastRightBorder = %%RightCharBorder
            end
            %%LeftCharBorder = -1
            %%RightCharBorder = -1
          end
        end
      end
      %%CurrentColumn = @succ(%%CurrentColumn)
      %%CurrentColumnX = @sum(%%Leftmargin,%%CurrentColumn)
    wend
    %%CurrentLine = @succ(%%CurrentLine)
    %%CurrentLineY = @sum(%%Firstline,@prod(%%CurrentLine,%%LineHeight))
    %%LastRightBorder =
    dialog set,eOutput,@dlgtext(eOutput)@cr()@lf()
    dialog set,Statusbar,Decoding line @succ(%%CurrentLine) of %%TotalLines
    dialog set,pStatus,@format(@fmul(@fdiv(@succ(%%CurrentLine),%%TotalLines),100),0.0)
  wend
  dialog set,Statusbar,Done decoding
  if @equal(%%DebugOutput,on)
    image save,@path(%0)DebugOutput.bmp
  end
  dialog enable,bStart
  dialog disable,bStop
  info Done
  goto Evloop

:CloseMENU
:Close
  exit

:RecognizeChar
  # This function gets the top, bottom, left and right pixel positions of the
  # found character. Use these coordinates to "see" which character was written down.
  # %1 = top
  # %2 = bottom
  # %3 = left
  # %4 = right
  %3 = @succ(%3)
  %4 = @pred(%4)
 
  # Char 1, 3, 4, 6, 7 & 8
  # Left Top
  if @IsCharPixel(%3,%1)
    # Left Bottom
    if @IsCharPixel(%3,%2)
      # Right Top
      if @IsCharPixel(%4,%1)
        # Right Bottom
        if @IsCharPixel(%4,%2)
          # = Left Top + Left Bottom + Right Top + Right Bottom
          exit 7
        else
          # = Left Top + Left Bottom + Right Top + <> Right Bottom
          exit 1
        end
      # Right Bottom
      elsif @IsCharPixel(%4,%2)
        # = Left Top + Left Bottom + Right Bottom
        exit 3
      else
        # If not Right Top & Right Bottom
        if @both(@not(@IsCharPixel(%4,%1)),@not(@IsCharPixel(%4,%2)))
          # = Left Top + Left Bottom + <> Right Top + <> Right Bottom
          exit 8
        end
      end
    else
      # Right Top
      if @IsCharPixel(%4,%1)
        # Right Bottom
        if @IsCharPixel(%4,%2)
          # = Left Top + Right Top + Right Bottom
          exit 4
        end
      # Right Bottom
      elsif @IsCharPixel(%4,%2)
        # = Left Top + Right Bottom
        exit 6
      end
    end
  # Char 2 & 5
  # Right Top
  elsif @IsCharPixel(%4,%1)
    # Right Bottom
    if @IsCharPixel(%4,%2)
      # Left Bottom
      if @IsCharPixel(%3,%2)
        # = Right Top + Right Bottom + Left Bottom
        exit 2
      end
    # Left Bottom
    elsif @IsCharPixel(%3,%2)
      # = Right Top + Left Bottom
      exit 5
    end
  end
  # Character was not recognized, return an error X
  exit X

:IsCharPixel
  # This function returns 1 if the specified pixel or one of its four direct neighbours
  # (left, right, above, below) is not empty, and Null if all of them are empty. It is a
  # greedier version of the IsPixel function; it returns 1 more often.
  # %1 = x
  # %2 = y
  %5 = @image(pixel,@pred(%1),%2)
  %6 = @image(pixel,@succ(%1),%2)
  %7 = @image(pixel,%1,@succ(%2))
  %8 = @image(pixel,%1,@pred(%2))
  %9 = @image(pixel,%1,%2)
  if @greater(10000000,%5)@equal(%5,BLACK)@greater(10000000,%6)@equal(%6,BLACK)@greater(10000000,%7)@equal(%7,BLACK)@greater(10000000,%8)@equal(%8,BLACK)@greater(10000000,%9)@equal(%9,BLACK)
    # Pixel is not empty
    exit 1
  end
  exit

:IsPixel
  # This function returns 1 if the specified pixel is not empty and at least one of its
  # left, right or lower neighbours is not empty either. Otherwise it returns Null.
  # %1 = x
  # %2 = y
  %9 = @image(pixel,%1,%2)
  if @greater(9500000,%9)@equal(%9,BLACK)
    # Pixel is not empty, check its neighbours; left, right and below
    %7 = @image(pixel,@pred(%1),%2)
    %8 = @image(pixel,@succ(%1),%2)
    %9 = @image(pixel,%1,@succ(%2))
    if @greater(9500000,%7)@equal(%7,BLACK)@greater(9500000,%8)@equal(%8,BLACK)@greater(9500000,%9)@equal(%9,BLACK)
      exit 1
    end
    exit
  end
  exit
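
Read as a table, RecognizeChar checks which of the four corners of the character's bounding box contain ink and maps each combination to a digit. Below is a rough Python restatement of that table, only as an illustration: the names CORNER_TABLE and recognize are made up here, and the digit assignments are taken from the comments in RecognizeChar.

```python
# Keys are (left_top, left_bottom, right_top, right_bottom); True means the
# corner contains ink. Values are the digits returned by the VDS RecognizeChar.
CORNER_TABLE = {
    (True, True, True, True): "7",
    (True, True, True, False): "1",
    (True, True, False, True): "3",
    (True, True, False, False): "8",
    (True, False, True, True): "4",
    (True, False, False, True): "6",
    (False, True, True, True): "2",
    (False, True, True, False): "5",
}


def recognize(left_top, left_bottom, right_top, right_bottom):
    """Return the digit for a corner pattern, or "X" if it is not recognized."""
    key = (bool(left_top), bool(left_bottom), bool(right_top), bool(right_bottom))
    return CORNER_TABLE.get(key, "X")
```

Four corners give 16 possible patterns, of which eight are used; the other eight (for example left top plus right top only) fall through to "X", just as in the VDS function.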


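The character borders in the bStartBUTTON loop are found by walking the columns of one text line from left to right: the first column that contains ink starts a character, the first empty column after it ends the character, and anything not wider than 5 pixels is thrown away as noise. A minimal Python sketch of that scan, assuming the line has already been turned into a grid of booleans (True = ink); find_char_spans and min_width are names chosen for this example only.

```python
def find_char_spans(ink, min_width=5):
    """Return (left, right) column spans of characters in one text line.

    ink is a list of rows, each a list of booleans (True = ink).
    A span is kept only if right - left is greater than min_width,
    mirroring the width check in the VDS code.
    """
    width = len(ink[0]) if ink else 0
    spans = []
    left = None  # column where the current character started, if any
    for x in range(width):
        column_has_ink = any(row[x] for row in ink)
        if column_has_ink:
            if left is None:
                left = x
        elif left is not None:
            right = x - 1
            if right - left > min_width:
                spans.append((left, right))
            left = None
    return spans
```

Like the VDS loop, this sketch only closes a character when it reaches an empty column, so a character touching the right margin is not reported.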

Attachment: ocr.png (screenshot of the main program, 250.97 KB)



Attachment: ocr.zip (source code, used images and DLLs, 528.62 KB)


_________________
[ Add autocomplete functionality to your VDS IDE windows! ]
Voor Nederlandse beginners met VDS: bekijk ook eens deze tutorial!
Skit3000
Admin Team


Joined: 11 May 2002
Posts: 2166
Location: The Netherlands

Posted: Mon Apr 10, 2006 8:12 pm

I have included another example with this post. I use 100 DPI grayscale scans as input, which gives pretty good results (100% recognition if you write the characters properly :)). One thing you should avoid is scanning paper that has writing on the other side, because it shines through a little. :)
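
Whether show-through gets picked up depends on how a pixel is judged "not empty". The posted IsPixel function treats a pixel as ink when its value is below 9500000 and at least one of its left, right or lower neighbours is below that value too. A hedged Python sketch of the same test, assuming the scan is available as a grid of integer pixel values and that positions outside the grid count as empty (both assumptions of this example, as are the names used):

```python
INK_THRESHOLD = 9_500_000  # the value IsPixel compares against in the posted code


def is_pixel(grid, x, y, threshold=INK_THRESHOLD):
    """Return True if (x, y) is ink and a left, right or lower neighbour is ink.

    grid is a list of rows of integer pixel values; lower values are darker.
    """
    def ink_at(cx, cy):
        # Positions outside the grid are treated as empty (an assumption).
        if 0 <= cy < len(grid) and 0 <= cx < len(grid[cy]):
            return grid[cy][cx] < threshold
        return False

    if not ink_at(x, y):
        return False
    return ink_at(x - 1, y) or ink_at(x + 1, y) or ink_at(x, y + 1)
```

Requiring a neighbour filters out isolated dark specks, but faint writing from the back of the page that falls below the threshold in connected strokes would still count as ink.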


Attachment: Input 2.bmp (an example of a document you can scan, 329.71 KB)
Attachment: screenshot.JPG (screenshot of the results, 63.06 KB)



Dr. Dread
Professional Member


Joined: 03 Aug 2001
Posts: 1065
Location: Copenhagen, Denmark

Posted: Tue Apr 11, 2006 10:58 am

Very impressive!

Dread

_________________
~~ Alcohol and calculus don't mix... Don't drink and derive! ~~

String.DLL * advanced string processing
FreezingFire
Admin Team


Joined: 23 Jun 2002
Posts: 3508

Posted: Mon May 01, 2006 11:58 pm

Skit -- That's really amazing... The power of VDS is limitless :D
_________________
FreezingFire
VDSWORLD.com
Site Admin Team
WidgetCoder
Contributor


Joined: 28 May 2002
Posts: 126
Location: CO, USA

Posted: Tue May 02, 2006 4:28 am

Wow, this is truly ingenious. Good accuracy at an acceptable speed! This is certainly proof positive that with a few tools and a lot of knowledge you can solve any problem, whereas I usually attack my problems with a lot of tools and too little knowledge... ;)
Skit3000
Admin Team


Joined: 11 May 2002
Posts: 2166
Location: The Netherlands

Posted: Tue May 02, 2006 3:52 pm

I wrote the same code in Visual Basic .NET and VB6 as well to see if I could speed it up, but both took about twice as long as VDS to get the same results... :)
All times are GMT