Question

RazerJs

Asked: 2020-12-07 06:29:00 +0800 CST

Do not select intermediate characters in a regex

I was writing a regex that extracts the tags from the HTML in a string, so that only the text between them is left. Ex:

"<a href='#'>go to <b>start</b> page</a>" 
matches: <a href='#'>, <b>, </b> and </a>
result = go to start page

"<div>prueba</div>"
matches: <div> and </div>
result = prueba

My regex is the following:

var reg = /(\<.+\>)|(\<\/.+\>)|(\<.+\/>)/g

It is designed so that a tag of the form <tag>, </tag> or <tag/> matches, and I then call replace with my regex so that only the text remains. But it also matches the characters in between. I have tried several things using (?:) so that it does not capture the characters between the two tags, but it doesn't work for me.

I also tried with :

\<.+\>(?:.)+\<\/.+\>

I would like to know, if possible, how to avoid matching the characters in the middle of a regex.

Regex tests

1 Answer

Angel Fraga Parodi · Answer 1 · 2020-12-07T07:29:43+08:00

I had to do something similar; if I remember correctly, I used something like this.

var content = [
   "<a required='ok' asdas='asdasd'>1<b>2</b></a>", // 12
   "<a required>1<b>2</b></a>", // 12
   '<a required href="asdasdasdasd"       >1<b>2</b></a>', // 12
   "<a>1<b>2</b></a> 3 </c>", // 12 3
   "<a-1>1<b>2</b></a1> 3 </c>", // 12 3
   "<a-1>1<b>2</b></a1> 3 <d-2-0> hello </d-2-0>  </c>", // 12 3 hello  
   "<a-1>1<b>2</b></a1> 3 <d-2-0> hello </d-2-0> c </c>", // 12 3 hello c
]; 

// Lazy quantifier: .+? stops at the first '>' instead of running to the last one
var reg = /<.+?>/g

// Print each input next to the result of stripping its tags
content.forEach(s => console.log(s + ' => ', s.replace(reg, '')))
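
To show why the pattern in the question swallows the text in between, here is a minimal sketch that compares a greedy quantifier, the lazy one used above, and a negated character class. The names greedy, lazy, negated and stripTags are only illustrative, and the sample string is the first example from the question.

```javascript
// Sample input taken from the question.
var html = "<a href='#'>go to <b>start</b> page</a>";

// Greedy: .+ extends to the LAST '>' in the string, so everything is one match.
var greedy = /<.+>/g;
// Lazy: .+? stops at the FIRST '>' that follows each '<'.
var lazy = /<.+?>/g;
// Negated class: [^>]+ can never cross a '>', so no lazy quantifier is needed.
var negated = /<[^>]+>/g;

// Hypothetical helper that removes every tag-like substring.
function stripTags(s) {
  return s.replace(negated, '');
}

console.log(html.match(greedy).length); // 1 (the whole string)
console.log(html.match(lazy));          // [ "<a href='#'>", '<b>', '</b>', '</a>' ]
console.log(html.replace(lazy, ''));    // go to start page
console.log(stripTags(html));           // go to start page
```

Note that (?:) only makes a group non-capturing; it does not change what the pattern matches, which is why it did not help here. A regex like these also cannot tell a stray < in the text apart from the start of a tag.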

EDIT:

If what you want is the content of an HTML element as text, that is, with the tags removed, you can get it through the innerText or textContent property. You could even create an element in memory, assign the content to its innerHTML, and then read the properties mentioned above.

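// Look up the element explicitly rather than relying on the implicit global created by its id
var container = document.getElementById('container');
console.log('innerText', container.innerText)
console.log('textContent', container.textContent)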

<div id="container" class="mi-clase">
  My content
  <br>
  <p foo="bar"> more content </p>
  <p data-otro-atributo="foo" >here (  ) go two spaces</p>
</div>

One drawback of innerText is that it collapses several consecutive spaces into a single one.

I advise using textContent instead.

NOTE

The example code above operates on an existing element. The following example builds everything in memory from a text string.

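function extraerTexto(contenido) {
 // Detached element: it is never attached to the document
 var contenedor = document.createElement('div');
 // Let the browser's HTML parser interpret the markup
 contenedor.innerHTML = contenido;
 // textContent returns the text of all descendants, without tags
 return contenedor.textContent;
}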

var content = [
   "<a>1<b>2</b></a>", // 12
   "<a>1<b>2</b></a> 3 </c>", // 12 3
   "<a-1>1<b>2</b></a1> 3 </c>", // 12 3
   "<a-1>1<b>2</b></a1> 3 <d-2-0> hello </d-2-0>  </c>", // 12 3 hello  
   "<a-1>1<b>2</b></a1> 3 <d-2-0> hello </d-2-0> < </c>", // 12 3 hello < 
   `  My content
  <br>
  <p foo="bar"> more content </p>
  <p data-otro-atributo="foo" >here (  ) go two spaces</p>`
];

 
content.forEach(s => console.log(extraerTexto(s)))
