Question

Máxima Alekz

Asked: 2020-12-05 18:08:35 +0800 CST

JavaScript RegEx - Capture text between specific characters

I have the following text:

Hi, I was calling to ask you to -b-please- take the -b-kids- to -b-school-

How can I use JavaScript to capture the text inside -b-...-, even when it is glued to other text, as in hello-b-Sofia-?

Just what's inside. Well, I'd like to make that text bold, italic, or strikethrough.

Something like:

Hi, I was calling to ask you to <b>please</b> take the <b>kids</b> to <b>school</b>

I had found a regular expression that worked, but only in PHP, and I don't know much about regular expressions.

Considering that hyphens are allowed inside -b-...-, I would like to know which method is more insightful or efficient, with an explanation of why, of course. I would also like to know in which circumstances indexOf is better and in which RegExp is.

Regarding Mariano's Response

The regular expression works very well, although it only works if the content does not have hyphens within it, as can be seen with fav-or.

I think this regular expression is even more complex than it normally looks. Take the string: "I use many --- because -- I am - rebellious -.-."

Then it would be:

-b-I use many --- because -- I am - rebellious -.-.-

Now, there should be a rule that it should always look for the last existing hyphen before another -b-, not the first; the given regular expression matches the first occurrence.

If that "last" hyphen is not found, there is no match and the text stays as normal text. And thanks for being the fastest cowboy in the west :v

Regarding Montoro's answer

It sounds great to "make life more complicated" sometimes or maybe all the fucking time, because I usually find problems with everything.

The solution with indexOf is faster in execution than RegExp , although in terms of code handling it is a bit complex. I do not understand the use of some -1 (I do not understand much). It sounds crazy, but it really works even with the use of hyphens inside.

And Lol, I usually use JQuery :)

2 Answers

Mariano · Answer 1 · 2020-12-05T18:27:16+08:00

You can use replace(regexp, replacement) with the following regular expression:

/-b-([^-]+)-/g

and replacing it with <b>$1</b>

var texto = "Hola, llamaba para pedirte el -b-favor- de que lleves los -b-niños- a la -b-escuela-",
    regex = /-b-([^-]+)-/g,
    reemp = "<b>$1</b>",
    resultado = document.getElementById("resultado");

resultado.innerHTML = texto.replace(regex,reemp);

<p id="resultado"></p>

Description:

-b- matches the literal text.
([^-]+) is Group 1, which matches:
- [^-]+ 1 or more characters that are not hyphens (-).
- (the closing hyphen) matches the literal text.
Modifier g: find all matches, not just the first one.

Group 1, in addition to matching the text that is between the hyphens, also creates a capture. When replacing, $1 contains the value of that capture.
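A minimal sketch of the same idea without the DOM, in case the captured texts themselves are wanted rather than the replaced string. The English sample sentence and the variable names are assumptions made for illustration, not part of the answer.

```javascript
// Sample text (an assumption for illustration; any string using -b-...- works).
const text = "Hi, please take the -b-kids- to -b-school-, hello-b-Sofia-";
const regex = /-b-([^-]+)-/g;

// $1 in the replacement string is the content of capture group 1.
const html = text.replace(regex, "<b>$1</b>");

// matchAll yields one match per occurrence; index 1 of each match is group 1.
const captured = Array.from(text.matchAll(regex), (m) => m[1]);

console.log(html);
console.log(captured);
```

Note that the -b- glued to "hello" is matched as well, since the pattern does not require whitespace before it.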

Avoid HTML tags within the syntax:

Also, within the -b-...- syntax there should be no HTML tags, so as not to "break" the structure. One possible way around it is to match only constructs that contain no <, using the regex:

/-b-([^-<]+)-/g

Any of these expressions works in any Perl-like regex dialect, so they will work in JavaScript, PHP, or any of the other commonly used languages.

Include dashes within the syntax:

And if we wanted to make it a bit more complicated: how would we allow hyphens within the bold text? We could require them to be escaped with a \. In that case we would use:

/-b-([^-<\\]*(?:\\.[^-<\\]*)*)-/g

This structure uses the technique known as unrolling the loop: the \ is added to the characters excluded from the normal ones, and the pattern then matches a backslash followed by any character (\\.) and more normal characters.

Final code:

var texto = String.raw`Hola, llamaba para pedirte el -b-fav\-or- de que lleves los -b-niños- a la -b-e\-s\-c\-u\-e\-l\-a- hoy`,
    regex = /-b-([^-<\\]*(?:\\.[^-<\\]*)*)-/g,
    reemp = "<b>$1</b>",
    resultado = document.getElementById("resultado");

resultado.innerHTML = texto.replace(regex,reemp);

<p id="resultado"></p>
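Note that with the replacement string "<b>$1</b>" the escaping backslashes stay in the output (fav\-or keeps its \). If that is not wanted, one possible approach is to pass a function to replace and strip the backslash from each escaped character. This is only a sketch under that assumption about the desired output; the helper name boldify is hypothetical.

```javascript
// Same unrolled-loop regex as in the final code above.
const regex = /-b-([^-<\\]*(?:\\.[^-<\\]*)*)-/g;

// Hypothetical helper: wraps each -b-...- in <b> and removes escaping backslashes.
function boldify(text) {
  return text.replace(regex, (match, content) => {
    // Turn every escaped character "\x" back into "x" inside the bold text.
    const unescaped = content.replace(/\\(.)/g, "$1");
    return "<b>" + unescaped + "</b>";
  });
}

console.log(boldify(String.raw`el -b-fav\-or- de -b-niños-`));
```

The second parameter of the callback receives capture group 1, which plays the same role as $1 in the string form.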

Answers to edited question:

I would like to know which method is more insightful or efficient

I'm not going to answer the general question, since it's based on opinions, but I will compare it with @AlvaroMontoro's answer, which is excellent and which I recommend giving a +1 vote. It is also worth clarifying that the proposed implementations seek different results (we are comparing apples with oranges; see the point below).

In a general comparison with the examples used, the difference is approximately 9% (on the order of 6μs), which I would not call relevant for JavaScript. However, it all depends on the text being processed. For example, with a longer text (6 paragraphs), regex can be approximately twice as efficient (comparison in JSPerf). It is probably also possible to design tests with texts that favor lastIndexOf().

only works if the content has no hyphens inside it

This is not correct. As discussed in this answer, to allow hyphens within the syntax, they must be escaped with a backslash ( \).

Works with '-b-fav\-or-'

Demo on regex101.com

there should be a rule that it should always look for the last existing hyphen before another -b- ; not the first

Why do I think it is not convenient to search for the last occurrence of a hyphen? I think it's the wrong decision, since it leaves no way to close the syntax effectively. Let's consider this example:

-b-Title:-
And now in the text I have no way to use a hyphen, because otherwise -here-
it would be taken as the end of the bold text

If this were the syntax used in SO posts, we could not use hyphens after the last bold section, because we would have no way to close it. If it were used in user-entered text, I wouldn't know how to document its use. Instead, I think it's much more efficient (and more common) to require hyphens to be escaped: "-b-por fav\-or-".

However, if you are still looking to match the last occurrence, I would ask for clarification in the question as to how a hyphen can be used after the last bold.
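For comparison, a "last hyphen before the next -b-" rule could be sketched as a regex like the one below. It is an illustration of the trade-off described above, with made-up sample strings, not a recommended solution. It relies on a tempered greedy token: (?:(?!-b-).)* consumes characters greedily without crossing into another -b-, and -(?!b-) keeps the closing hyphen from being the first hyphen of the next -b-.

```javascript
// Greedy up to the next "-b-", then backtrack to the last hyphen before it.
const lastHyphen = /-b-((?:(?!-b-).)*)-(?!b-)/g;

const toBold = (text) => text.replace(lastHyphen, "<b>$1</b>");

// Internal hyphens are kept without escaping...
const ok = toBold("a -b-x-y- z -b-w- end");
// ...but a later, unrelated hyphen pulls everything before it into the bold text.
const surprising = toBold("-b-Title:- some text with a hyphen -here- and more");

console.log(ok);
console.log(surprising);
```

The second call shows the problem: the bold section ends at the hyphen after "here", not at the one after "Title:".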

I think this regular expression is even more complex than it normally looks

It is a myth that a longer regular expression is less efficient. You hear it often, but it is false, and many times the opposite is true. In fact, the technique used is very common, and you can read about it in more detail in:

Section 6.7. Unrolling the Loop in the book "Mastering Regular Expressions" by Jeffrey Friedl.
Unrolling the loop (the previously referenced article).
Mimic an Alternation Quantified by a Star (rexegg.com)
Using regexes, how to efficiently match strings between double quotes with embedded double quotes? (SW)
Regex match anything up to word - without non-greedy operators (SO)

Note: I could have presented a shorter version, /-b-(([^-\\]|\\.)*)-/g, but I preferred to include a much more efficient, higher-quality one (here, longer is more efficient).

It basically consists of using:

normal* ( special normal* )*

Where normal is any character except -, \ and <, and special is any character preceded by a backslash, so that \- can be matched.

Mechanism:

1. It tries to match as many normal characters as possible: [^-<\\]*
2. From there, it tries to match an escape, \\. , followed by more normal characters, [^-<\\]*
3. It repeats step 2 as many times as necessary (once for each escape present in the text).
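To check that the unrolled form and the shorter alternation accept the same text, both can be run over a few inputs. This is a small sketch with made-up sample strings; the alternation here also excludes < (unlike the abbreviated version in the note above) so that the two patterns are directly comparable.

```javascript
// Unrolled: normal* ( special normal* )*
const unrolled = /-b-([^-<\\]*(?:\\.[^-<\\]*)*)-/g;
// Alternation: ( normal | special )*
const alternation = /-b-((?:[^-<\\]|\\.)*)-/g;

// Sample inputs (assumptions for illustration).
const samples = [
  String.raw`el -b-fav\-or- de`,
  String.raw`-b-e\-s\-c- y -b-niños-`,
  "no bold here",
  "-b-unclosed",
];

for (const s of samples) {
  const a = s.replace(unrolled, "<b>$1</b>");
  const b = s.replace(alternation, "<b>$1</b>");
  // Prints whether both patterns agree, followed by the result.
  console.log(a === b, a);
}
```

This only compares results, not speed; how much faster the unrolled form is depends on the engine and the input.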

And thanks for being the fastest cowboy in the west.

I didn't mean to :-) I believe in quality above all other things.

Alvaro Montoro · Answer 2 · 2020-12-05T19:27:26+08:00

I know that the question cries out for regular expressions, and that using them will simplify your life a lot ( Mariano's solution is very elegant and barely occupies a single line)... but sometimes I like to complicate my life :P

Regular expressions are powerful and flexible... but that also makes them slow. If you're looking for a specific string, indexOf will work too. Based on that, I have made a small algorithm that, inside a loop and sequentially:

Finds the string -b- and replaces it with <b>
Finds the next - and replaces it with </b>

Note: indexOf returns the position (index) where the first occurrence of the searched substring begins. If it is not found, it returns -1. For example, take the string "hola caracola". If we call .indexOf("ola"), the result will be 1, which is the index where the searched substring first appears (remember that in JavaScript the first position is 0). And if we call .indexOf("adios"), the result will be -1, because the searched string is not found.
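The values from the note can be checked directly. The sketch below also shows the optional second argument of indexOf, the position where the search starts, which is what a call like indexOf("-", pos+3) relies on to skip the opening -b-. The variable names are only illustrative.

```javascript
const s = "hola caracola";

const found = s.indexOf("ola");        // first occurrence starts at index 1
const missing = s.indexOf("adios");    // not present, so -1
const fromIndex = s.indexOf("ola", 2); // search starts at index 2, so it finds the later one

console.log(found, missing, fromIndex); // 1 -1 10
```

This is why the loops compare the result with -1: it is the only way indexOf signals "not found".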

The code isn't very pretty or as clean as Mariano's solution, but testing with JSPerf, its performance seems to be comparable.

This would be the code:

var texto = "Hola, llamaba para pedirte el -b-favor- de que lleves los -b-niños- a la -b-escuela-",
    ini = 0,
    pos = 0,
    texto2 = "",
    resultado = document.getElementById("resultado");

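// Caution: assumes every -b- has a closing hyphen; otherwise this loop never ends.
while ((pos = texto.indexOf("-b-")) > -1) {
  // Closing hyphen: first "-" after the 3 characters of "-b-".
  var posguion = texto.indexOf("-", pos+3);
  texto2 += texto.substring(ini, pos) +"<b>"+ texto.substring(pos+3, posguion) +"</b>";
  // Keep only the part that has not been processed yet.
  texto = texto.substring(posguion+1);
}
// Append whatever follows the last bold section.
texto2 += texto;

resultado.innerHTML = texto2;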

<div id="resultado"></div>

Edit: Mariano pointed out that the code had a problem if the markup was not closed correctly (if there was a -b- without a - after it)... and he was right. So I changed the code a bit so that an additional check avoids an infinite loop.

The result looks like this:

var texto = "Hola, llamaba para pedirte el -b-favor- de que lleves los -b-niños- a la -b-escuela-",
    ini = 0,
    pos = 0,
    texto2 = "",
    resultado = document.getElementById("resultado");

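while ((pos = texto.indexOf("-b-")) > -1) {
  // Closing hyphen: first "-" after the opening "-b-" (-1 if there is none).
  var posguion = texto.indexOf("-", pos+3);
  if (posguion > -1) {
    texto2 += texto.substring(ini, pos) + "<b>" + texto.substring(pos+3, posguion) + "</b>";
    texto = texto.substring(posguion+1);
  } else {
    // Unclosed -b-: the bold runs to the end, so nothing is left to process.
    texto2 += texto.substring(ini, pos) + "<b>" + texto.substring(pos+3) + "</b>";
    texto = "";
  }
}
// Append whatever follows the last bold section.
texto2 += texto;

resultado.innerHTML = texto2;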

<div id="resultado"></div>

It assumes that if a -b- is left without its closing hyphen, the text is bold until the end of the string. And the results in JSPerf seem to remain comparable.

Máxima Alekz correctly commented that my code did not allow internal hyphens. A workaround, if you want to allow them, is to traverse the string backwards instead of forwards. To do this, instead of indexOf we would use lastIndexOf.

Note: lastIndexOf returns the position (index) where the last occurrence of the searched substring begins. If it is not found, it returns -1. For example, take the string "hola caracola". If we call .lastIndexOf("ola"), the result will be 10, which is the index where the searched substring appears for the last time. And if we call .lastIndexOf("adios"), the result will be -1, because the searched string is not found.

What the algorithm does now is look for the last -b- in the string and pair it with the last - found after it. If no hyphen is found, the end of the string is considered the end of the bold text.
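A quick check of those values, next to indexOf on the same string for contrast (variable names are only illustrative):

```javascript
const s = "hola caracola";

const last = s.lastIndexOf("ola");      // last occurrence starts at index 10
const first = s.indexOf("ola");         // first occurrence starts at index 1
const missing = s.lastIndexOf("adios"); // not present, so -1

console.log(last, first, missing); // 10 1 -1
```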

The code would look like this:

var texto = "Hola, llamaba para pedirte el -b-favor- de que lleves los -b-niños -y niñas-- a la -b-escuela-",
    ini = 0,
    pos = 0,
    posguion = -1, // declared here so it does not leak as an implicit global
    texto2 = "",
    resultado = document.getElementById("resultado");

while ((pos = texto.lastIndexOf("-b-")) > -1) {
  if ((posguion = texto.lastIndexOf("-")) > -1 && posguion > pos+3) {
    texto2 = "<b>" +  texto.substring(pos+3, posguion) + "</b>" + texto.substring(posguion+1) + texto2;

  } else {
    texto2 = "<b>" + texto.substring(pos+3) + "</b>" + texto2;
  }
  texto = texto.substring(0, pos);
}

texto2 = texto + texto2;
resultado.innerHTML = texto2;

<div id="resultado"></div>

And here are the results in JSPerf, which are still similar to the ones above.
