I am trying to write a lexical analyzer for a school assignment. Comments must start with a double slash "//", and after it any kind of character may follow: exclamation and question marks, arithmetic symbols, spaces or special characters. The problem is that many of these symbols are also used in regular expression syntax. The regular expression I currently have is:
[a-zA-ZÀ-ÖØ-öø-ÿ_0-9 \n\t\r,.;]+
I took part of this regular expression from another answer. As it stands, the regular expression accepts strings like:
//Hola, que tal;
//Hola.;,ooo
//
//1234
Comments that it does not accept are, for example:
//Hola--------
//Comentario+-+-*
//Hola/+
Although the result is acceptable for short comments, there are times when you want to draw a matrix or an array with characters inside a comment, which this regular expression does not allow.
What I recommend is splitting what you want to analyze into parts. For example, if your goal is to analyze Java code, you can have a regex that finds only comments:
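A minimal sketch of such an expression, assuming the form `\/{2,}[\s\S]*`, which is reconstructed from the description that follows and is not quoted from the original answer. The class and method names (`CommentRegexDemo`, `isComment`) are hypothetical and used only for illustration.

```java
import java.util.List;
import java.util.regex.Pattern;

public class CommentRegexDemo {
    // Assumed pattern: two or more slashes, then any characters at all
    // (whitespace \s or non-whitespace \S), zero or more times.
    static final Pattern COMMENT = Pattern.compile("\\/{2,}[\\s\\S]*");

    // True when the whole text is a comment according to the assumed pattern.
    static boolean isComment(String text) {
        return COMMENT.matcher(text).matches();
    }

    public static void main(String[] args) {
        List<String> samples = List.of(
                "//Hola, que tal;", "//", "//Hola--------",
                "//Comentario+-+-*", "//Hola/+", "Hola");
        for (String sample : samples) {
            System.out.println(sample + " -> " + isComment(sample));
        }
    }
}
```

Because `[\s\S]` also matches line breaks, this pattern would swallow every following line when applied to a whole file. In a lexer that reads the full input, something like `[^\n]*` after the slashes may be a better fit, so that the comment stops at the end of the line.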
What this expression does is match the character `/` with 2 or more occurrences, escaped with `\`, since `/` is reserved as a delimiter in some regex syntaxes. In the next part we tell it to match any whitespace character (`\s`) or any non-whitespace character (`\S`), with 0 or more occurrences.

With this other expression you can match the start of a class declaration in Java:
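A minimal sketch of a class-start expression, assuming the form `\s*public\s+class\s+(\w+)\s*\{?\s*`. This is a reconstruction and not the original answer's expression. It uses `\s+` between the keywords so that `publicclass` is rejected, which is a choice of this sketch. The names `ClassStartDemo` and `className` are hypothetical.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ClassStartDemo {
    // Assumed pattern: optional leading whitespace, "public", whitespace,
    // "class", whitespace, the class name, then an optional opening brace.
    static final Pattern CLASS_START =
            Pattern.compile("\\s*public\\s+class\\s+(\\w+)\\s*\\{?\\s*");

    // Returns the class name when the line is a public class declaration.
    static Optional<String> className(String line) {
        Matcher matcher = CLASS_START.matcher(line);
        return matcher.matches() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(className("  public class Lexer {"));
        System.out.println(className("publicclass Lexer"));
        System.out.println(className("class Lexer {"));
    }
}
```

This sketch deliberately ignores modifiers such as `final` or `abstract`, generics, and `extends`/`implements` clauses. Each of those could get its own small expression, in keeping with the advice to split the work into parts.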
Here we indicate that there must be 0 or more whitespace characters, then the word `public`, again 0 or more whitespace characters, and so on.

Don't try to do everything with a single expression. Try to handle one line at a time; otherwise the expression becomes very heavy, and the regex engine of the language you use could take a long time compiling and evaluating it.
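The line-at-a-time idea can be sketched as follows. This is an illustration under assumptions, not the original answer's code: the pattern `//.*` and the names `LineByLineDemo` and `comments` are hypothetical, and the sketch naively treats a `//` inside a string literal as a comment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LineByLineDemo {
    // Assumed pattern: "//" followed by anything up to the end of the line.
    static final Pattern LINE_COMMENT = Pattern.compile("//.*");

    // Splits the source into lines and collects the comment found on each one.
    static List<String> comments(String source) {
        List<String> found = new ArrayList<>();
        for (String line : source.split("\\R")) { // \R matches any line break
            Matcher matcher = LINE_COMMENT.matcher(line);
            if (matcher.find()) {
                found.add(matcher.group());
            }
        }
        return found;
    }

    public static void main(String[] args) {
        String source = "int a = 1; //Hola--------\nint b;\n//Comentario+-+-*";
        comments(source).forEach(System.out::println);
    }
}
```

Since each line is matched on its own, the expression stays small and never has to reason about line breaks.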
Another recommendation is to use online tools for quick tests. Some more complex expressions can only be checked in the language itself, but in general these online tools cover most cases.
Reference: Online Tool