[Gambas-user] Wordcount in a TextEdit
Ron Onstenk
ronstk at ...239...
Sat Apr 19 20:37:52 CEST 2008
On Saturday 19 April 2008 11:27, Maximillian Von Kloisterheim wrote:
> richard terry wrote:
> > Max, did you try just reading it out as plain text, doing something like
> > stripping out all ascii 32's and then counting - this would fail if things
> > like word , but overall should be ok.
> >
> > Regards
> >
> > Richard
> >
>
> Hi Richard, yes, that's what I did, and it does work, but it doesn't
> take into consideration double spaces, words with numbers in them and
> numbers themselves, to mention but a few.
>
> The count doesn't have to be deadly accurate, but it needs to be close.
> Iv noticed a whole lot of difference between different programs and the
> way they count their words, so I don't need to be 100%, just as long as
> I know the criteria.
>
> As it is I'm making do with the bits of code that Ron posted, but I'm
> also looking into the possibility of counting words as you type (Just
> like the spelling), and I'm seriously thinking of making the time to
> create a new editor control that has such functionality built in.
>
> There is also the possibility of SHELLing out to AWK to get the job
> done, but just like the spelling, while this may be accurate, its not
> easy to get it to work on a per-word basis as you type.
>
> Regards
>
> Max
>
Hint:
Keep a mirror buffer with the text.
Keep a count variable.
Keep a lastkey variable.
Check for 'space', 'cr', ',' or '.' when a key is pressed;
only these make a word valid, after all.

if newkey <> lastkey then ' prevents invocations on e.g. double spaces

1)
  put the textbox contents in a tmp buffer
  then replace cr with space (the replace part of my example code)
  (optionally delete all numbers from tmp in some way)
  compare the mirror buffer against the tmp buffer
  if different
    call 'wc', 'awk' or my bits of code with the tmp buffer as source
    if the count is different
      store the new count
      copy tmp to the mirror buffer
    end
  end

2)
  put the textbox contents in a tmp buffer
  then replace cr with space (the replace part of my example code)
  (optionally delete all numbers from tmp in some way)
  compare the mirror buffer against the tmp buffer
  if different
    tmpcount = 0
    split the difference on spaces into an array
    (as in the example code)
    for each item
      check for a valid word
      if valid, increment tmpcount
    next
    if len(mirror) < len(tmp) then
      count = count + tmpcount
    end
    if len(mirror) > len(tmp) then
      count = count - tmpcount
      or use a full scan with wc/awk/code and store the new count
    end
    copy tmp to the mirror buffer
  end

end ' lastkey <> newkey
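
To make the flow of 1) concrete, here is a minimal sketch in Python rather
than Gambas. It is not the example code referred to above: the names
WordCounter, on_key and count_words are hypothetical, and the rule for what
counts as a valid word (a whitespace-separated token containing at least one
letter) is an assumption. Unlike the outline, it refreshes the mirror
whenever the text differs, not only when the count changes.

```python
import re

DELIMITERS = {" ", "\n", ",", "."}  # keys that can complete a word


def count_words(text):
    # Assumed rule: a word is a whitespace-separated token with a letter in it,
    # so bare numbers such as "123" are skipped.
    return sum(1 for token in text.split() if re.search(r"[^\W\d_]", token))


class WordCounter:
    """Hypothetical helper: recounts only when a delimiter key changes the text."""

    def __init__(self):
        self.mirror = ""   # normalised text as of the last recount
        self.count = 0     # last known word count
        self.lastkey = ""  # previous key, used to skip repeats (double spaces)

    def on_key(self, newkey, text):
        """Call after each keypress with the key and the full editor text."""
        previous, self.lastkey = self.lastkey, newkey
        if newkey not in DELIMITERS or newkey == previous:
            return self.count  # nothing that could finish a new word
        # Replace line breaks with spaces before comparing.
        tmp = text.replace("\r\n", " ").replace("\n", " ").replace("\r", " ")
        if tmp != self.mirror:
            self.count = count_words(tmp)  # full recount (wc, awk or own code)
            self.mirror = tmp
        return self.count
```

The full recount still runs over the whole text, but only when a delimiter
key actually changed something, which is the point of keeping the mirror.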
I think using a mirror buffer and a stored count for comparison against the
current text will speed things up.
The way in 2) does the time-consuming check/split on a smaller text/array,
and both ways only do any work when a possible word has been entered.
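
A rough Python sketch of the idea behind 2) follows. It is a variation, not
the method exactly as outlined: instead of adding or subtracting tmpcount
depending on which buffer is longer, it finds the region where the mirror and
the new text differ, widens it to whitespace boundaries, and adjusts the count
by the words in the new region minus the words in the old one. The function
names are hypothetical, and a word is assumed to be any whitespace-separated
token.

```python
def count_words(text):
    # Assumption: any whitespace-separated token counts as a word.
    return len(text.split())


def changed_region(old, new):
    """Return (start, old_end, new_end) around the part that differs."""
    # Length of the common prefix.
    start = 0
    limit = min(len(old), len(new))
    while start < limit and old[start] == new[start]:
        start += 1
    # Common suffix, never overlapping the prefix.
    old_end, new_end = len(old), len(new)
    while old_end > start and new_end > start and old[old_end - 1] == new[new_end - 1]:
        old_end -= 1
        new_end -= 1
    # Widen to whitespace boundaries so only whole words are compared.
    while start > 0 and not old[start - 1].isspace():
        start -= 1
    while old_end < len(old) and not old[old_end].isspace():
        old_end += 1
        new_end += 1
    return start, old_end, new_end


def update_count(count, mirror, tmp):
    """Adjust a known word count for mirror so that it matches tmp."""
    start, old_end, new_end = changed_region(mirror, tmp)
    removed = count_words(mirror[start:old_end])
    added = count_words(tmp[start:new_end])
    return count - removed + added
```

Only the changed region is split and counted, so the cost per keystroke
depends on the size of the edit rather than on the size of the document. A
stricter validity check (for example, ignoring numbers) would go into
count_words.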
BTW, I used the second way in the past in an assembly program for something
like this, so this is from very old memory.
Ron