[Gambas-user] New experimental highlighting component

Benoît Minisini benoit.minisini at gambas-basic.org
Fri Oct 20 02:42:29 CEST 2023


Hi,

I recently committed a new component named 'gb.highlight' that aims to 
replace the current 'gb.eval.highlight' component.

In that component, text highlighting is defined not by Gambas code, but 
by a definition file.

The definition file is translated into Gambas code, which is compiled at 
runtime and performs the highlighting.

The definition file has a (relatively) simple syntax.

For example, here is the HTML highlighting file:

--8<--------------------------------
doctype{Preprocessor}:
   from <!DOCTYPE to >
comment:
   from <!-- to -->
entity{Operator}:
   match &[A-Za-z]+;
   match &#[0-9]+;
markup{Function}:
   match <[a-zA-Z0-9]+ to >
   attribute{Datatype}:
     match [a-zA-Z0-9-]+
   equal{Normal}:
     symbol =
   value{String}:
     from " to "
     from ' to '
     string.entity{Escape}:
       match &[A-Za-z]+;
       match &#[0-9]+;
   value.unquoted{String}:
     match [^"'`=<>\s]+
markup.close{Function}:
   match </[A-Za-z0-9]+\s*>
--8<--------------------------------

Lines that end with a ":" introduce a highlighting state. That state 
has a name, and an associated color written between '{' and '}'. If 
there is no explicit color, the state name is used, with the first 
letter converted to uppercase.
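
To make the naming rule concrete, here is a small sketch in Python (for 
illustration only, it is not the component's code). The function name 
and the regular expression are my own assumptions about the header 
syntax shown above. How a dotted name such as 'markup.close' would be 
capitalized without an explicit color is not covered by the post.

```python
import re

# Hypothetical pattern for a state header such as "entity{Operator}:".
# It is guessed from the examples in this post, not taken from gb.highlight.
STATE_HEADER = re.compile(r"^\s*([A-Za-z0-9_.]+)(?:\{([A-Za-z0-9_]+)\})?:\s*$")

def parse_state_header(line):
    """Return (state_name, color_name), or None if the line is not a header."""
    m = STATE_HEADER.match(line)
    if m is None:
        return None
    name, color = m.group(1), m.group(2)
    if color is None:
        # No explicit color: reuse the state name, first letter uppercased.
        color = name[:1].upper() + name[1:]
    return name, color
```

With this sketch, "entity{Operator}:" gives the color 'Operator', and 
"comment:" falls back to 'Comment'.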

After a state, you have one or more "commands" that define which text is 
associated with that state.

Here is the list of commands currently available:

* from BEFORE to AFTER -> Everything between the BEFORE text and the 
AFTER text, limits included.

* from BEFORE -> Everything from the BEFORE text until the end of the line.

* between ... -> Like "from ...", but the BEFORE and AFTER limits keep 
the state of the parent. This command does not seem to work properly at 
the moment.

* match REGEXP -> The text that matches the Perl regular expression.

* match BEFORE to AFTER -> Everything between the BEFORE and AFTER 
regular expressions.

* symbol SYMBOL1 SYMBOL2 ... SYMBOLN -> To match symbols (usually 
operators in a programming language).

* word WORD1 WORD2 ... WORDN -> To match words (usually keywords in a 
programming language).
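
As an illustration of the two "from" forms, here is a rough Python 
sketch (not the component's code). The helper name 'from_to' is 
hypothetical, and running an unterminated region to the end of the text 
is my assumption: the post does not say what happens in that case.

```python
def from_to(text, pos, before, after=None):
    """Return the end offset of a region starting at pos, or None if
    the BEFORE text is not found at pos."""
    if not text.startswith(before, pos):
        return None
    start = pos + len(before)
    if after is None:
        # "from BEFORE": the region runs until the end of the line.
        end = text.find("\n", start)
        return len(text) if end < 0 else end
    # "from BEFORE to AFTER": the AFTER limit is part of the region.
    end = text.find(after, start)
    # Assumption: an unterminated region runs to the end of the text.
    return len(text) if end < 0 else end + len(after)
```

For instance, from_to("x <!-- hi --> y", 2, "<!--", "-->") covers the 
whole comment, both limits included.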

The commands must be indented with any number of spaces.

After the commands, you can define other states. If these states have 
the same indent (same number of leading spaces) as the commands, then 
these states are nested. They are checked only in the context of 
their parent state.

So indentation is important: it defines the nesting level.
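
Here is a minimal Python sketch of how the nesting could be derived from 
the indentation (illustration only, not the component's parser). The 
function name is hypothetical. It relies on the rule stated earlier that 
a line ending with ":" is a state header, and it assumes the file is 
indented with spaces only.

```python
def nest_states(lines):
    """Return (state_name, parent_name) pairs, parent_name being None
    for top-level states."""
    stack = []    # (indent, name) of the states that are still open
    result = []
    for line in lines:
        text = line.strip()
        if not text.endswith(":"):
            continue  # a command or a blank line, not a state header
        indent = len(line) - len(line.lstrip(" "))
        # Close every state that is at least as indented as this one.
        while stack and stack[-1][0] >= indent:
            stack.pop()
        name = text[:-1].split("{")[0]
        result.append((name, stack[-1][1] if stack else None))
        stack.append((indent, name))
    return result
```

Applied to the HTML definition above, 'attribute' and 'value' get 
'markup' as parent, 'string.entity' gets 'value', and 'markup.close' is 
a top-level state again.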

If you have more than one command in a state, then each command is 
checked with a logical 'or'. In other words, the state is identified by 
any of the matching commands.
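
The 'or' rule can be pictured with the two "match" commands of the 
'entity' state from the HTML file. This is a Python sketch for 
illustration only: the helper name is hypothetical, and Python's 're' 
module stands in for the Perl regular expressions the component uses.

```python
import re

# The two "match" commands of the "entity" state in the HTML definition.
ENTITY_COMMANDS = [r"&[A-Za-z]+;", r"&#[0-9]+;"]

def match_state(commands, text, pos):
    """Return the end offset of the first command that matches at pos,
    or None if no command of the state matches there."""
    for pattern in commands:
        m = re.compile(pattern).match(text, pos)
        if m:
            return m.end()
    return None
```

Both "&amp;" and "&#160;" are then recognized as 'entity', each by a 
different command, whereas a lone "&" is not.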

Here is the JavaScript highlighting definition file:

--8<---------------------------------------------------------------
documentation:
   from /** to */
comment:
   from /* to */
   from //
string:
   from " to "
   from ' to '
   escape:
     match \\[ntrbf0'"\\]
sstring{String}:
   from ` to `
   escape:
     match \\[ntrbf0'"{\\]
   subst{Preprocessor}:
     between { }
regexp{Datatype}:
   match /.*/[a-z]*
   escape:
     match \\?
number:
   match [+-]?[0-9.]*
   match 0x[0-9a-fA-F]*
keyword:
   word break case catch class const continue debugger default delete do else enum export extends finally for get if import in instanceof let new return set super switch throw try typeof var void while with yield
function:
   word function
constant:
   word false null this true undefined NaN Infinity
operator:
   symbol { } . >= + << ! = >>= ; == - >> ~ += >>>= ( , != * >>> && -= &= ) < === % & || *= |= [ > !== ++ | ? %= ^= -- ^ : <<= ] <= / /=
identifier:
   match [A-Za-z_$][A-Za-z_$0-9]*
--8<---------------------------------------------------------------

And that's all!

As in 'gb.eval.highlight', the API is the TextHighlighter class. At 
the moment, only the TextHighlighter.ToHTML() method and the method 
that creates a highlighter from a definition file have been 
implemented.

At the moment, the color names are those found in highlighting theme 
files located in the Gambas IDE source code.

So to test, you can do something like this:

--8<---------------------------------------------------------------
Dim hTextHighlighter As TextHighlighter

hTextHighlighter = TextHighlighter.FromFile("html.highlight")

File.Save("~/test.html", hTextHighlighter.ToHTML(File.Load("page.html")))
--8<---------------------------------------------------------------

This is experimental: a lot still has to be done, and some things will 
change for sure.

But I wanted to make it public as soon as possible to get people's 
comments, and because I am relatively proud of it. :-)

Regards,

-- 
Benoît Minisini.

