[Gambas-user] Wordcount in a TextEdit

Ron Onstenk ronstk at ...239...
Sat Apr 19 20:37:52 CEST 2008


On Saturday 19 April 2008 11:27, Maximillian Von Kloisterheim wrote:
> richard terry wrote:
> > Max, did you try just reading it out as plain text, doing something like 
> > stripping out all ascii 32's and then counting - this would fail if things 
> > like word , but overall should be ok.
> > 
> > Regards
> > 
> > Richard
> > 
> 
> Hi Richard, yes, that's what I did, and it does work, but it doesn't 
> take into consideration double spaces, words with numbers in them and 
> numbers themselves, to mention but a few.
> 
> The count doesn't have to be deadly accurate, but it needs to be close. 
> I've noticed a whole lot of difference between different programs and the 
> way they count their words, so I don't need to be 100%, just as long as 
> I know the criteria.
> 
> As it is I'm making do with the bits of code that Ron posted, but I'm 
> also looking into the possibility of counting words as you type (Just 
> like the spelling), and I'm seriously thinking of making the time to 
> create a new editor control that has such functionality built in.
> 
> There is also the possibility of SHELLing out to AWK to get the job 
> done, but just like the spelling, while this may be accurate, it's not 
> easy to get it to work on a per-word basis as you type.
> 
> Regards
> 
> Max
> 

Hint:
keep a mirror buffer with the text.
keep count variable
keep lastkey variable

check for 'space', 'cr', ',' or '.' when a key is pressed
' only these complete a word, after all

if newkey <> lastkey then ' prevents repeated invocations on e.g. double spaces

1)
  put textbox in tmp buffer
  then replace cr with space (the replace part of my example code)
  (optionally delete all numbers from tmp in some way)
  compare mirror buffer against textbox/tmp buffer
  if different
    call 'wc' 'awk' or my bits of code with textbox/tmp as source.
    if count different
      store new count
      copy textbox/tmp to mirror buffer
    end
  end

2)
  put textbox in tmp buffer
  then replace cr with space (the replace part of my example code)
  (optionally delete all numbers from tmp in some way)
  compare mirror buffer against textbox/tmp buffer

  if different
    tmpcount=0
    split the difference on spaces into an array
      (as in my example code)
    foreach 
      check for valid word
      if valid increment tmpcount
    next

    if len(mirror) < len(tmp) then 
      count=count + tmpcount
    end 

    if len(mirror) > len(tmp) then
      count=count - tmpcount
      (or do a full scan with wc/awk/code and store the new count)
    end

    copy tmp to mirror buffer

  end
 
end ' lastkey<>newkey
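
For illustration only, way 1) could look roughly like the Python sketch below. It is not Gambas and not the example code referred to above. The names MirrorCounter, on_key and full_count are made up for this sketch, and the rule that a word is any token containing at least one letter is an assumption standing in for whatever criteria you settle on. Unlike the outline, the sketch refreshes the mirror after every recount, even when the count did not change.

```python
import re

# Keys that can complete a word, per the hint above.
DELIMITERS = {" ", "\n", ",", "."}


def full_count(text):
    """Count whitespace-separated tokens that contain at least one letter.

    Assumed word rule: pure numbers and stray punctuation are not words.
    """
    return sum(1 for tok in text.split() if re.search(r"[^\W\d_]", tok))


class MirrorCounter:
    """Recount the whole text, but only when a delimiter key arrives and
    the normalized text differs from the mirror buffer."""

    def __init__(self):
        self.mirror = ""
        self.count = 0
        self.last_key = ""

    def on_key(self, key, text):
        # key: the character just typed; text: editor content after the key.
        if key in DELIMITERS and key != self.last_key:
            # Replace line breaks with spaces before comparing.
            tmp = text.replace("\r", " ").replace("\n", " ")
            if tmp != self.mirror:
                self.count = full_count(tmp)
                self.mirror = tmp
        self.last_key = key
        return self.count


counter = MirrorCounter()
typed = ""
for ch in "hello  world, 42 ok.":
    typed += ch
    counter.on_key(ch, typed)
print(counter.count)
```

The second of the two spaces after "hello" triggers no recount, because it repeats the previous key.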

I think keeping a mirror buffer and a count to compare against the
current text will speed things up.
Way 2) does the time-consuming check/split on a smaller text/array,
and both ways run only when a possible word has been completed.
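
A rough Python sketch of the idea behind way 2), under some assumptions: the buffers already have line breaks replaced by spaces, and a word is any token containing at least one letter. The function names are invented for this sketch. It also differs from the outline in one respect: instead of choosing add or subtract by comparing lengths, it counts the words in the changed region of both the old and the new text and applies the difference, so deletions need no full rescan.

```python
import re


def count_words(text):
    """Count whitespace-separated tokens containing at least one letter."""
    return sum(1 for tok in text.split() if re.search(r"[^\W\d_]", tok))


def changed_span(old, new):
    """Return (start, old_end, new_end) bounding the region that differs."""
    limit = min(len(old), len(new))
    start = 0
    while start < limit and old[start] == new[start]:
        start += 1
    old_end, new_end = len(old), len(new)
    while (old_end > start and new_end > start
           and old[old_end - 1] == new[new_end - 1]):
        old_end -= 1
        new_end -= 1
    return start, old_end, new_end


def widen(text, start, end):
    """Grow [start, end) outward until bounded by spaces or the text ends,
    so that no word is cut in half."""
    while start > 0 and text[start - 1] != " ":
        start -= 1
    while end < len(text) and text[end] != " ":
        end += 1
    return start, end


def update_count(count, mirror, tmp):
    """Adjust count for the edit that turned mirror into tmp."""
    start, old_end, new_end = changed_span(mirror, tmp)
    old_start, old_end = widen(mirror, start, old_end)
    new_start, new_end = widen(tmp, start, new_end)
    removed = count_words(mirror[old_start:old_end])
    added = count_words(tmp[new_start:new_end])
    return count - removed + added


mirror = "one two three"
tmp = "one two3 4 three five"
count = update_count(count_words(mirror), mirror, tmp)
print(count, count_words(tmp))
```

Only the words touching the edit are split and checked. The rest of the text is compared character by character but never tokenized again.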

BTW
I used the second way in the past in an assembly program for something like 
this, so this is from very old memory.


Ron



