
Question

Asked: 2020-02-19 09:39:04 +0800 CST

How to use regex to get a name between spaces in my text file?


I have a fairly long text file with names and other data. Let's say the data looks like this:

[spaces]Elnombre1[spaces](237)
[spaces]Elnombre3(237)
[spaces]Elnombre4(17)

I just need to get the names. Normally each name has spaces before it and sometimes after it, followed by a number in parentheses.

Also, I need to add some text inside parentheses (any text will do).

Expected result:

nombremio123textoquepuse(losparentesis)

I tried with:

with open("e.txt", 'r+') as f:
    texto = re.sub('^\s+([a-zA-Z-0-9]+)\s*', f.read())
    f.seek(0)
    f.write(texto)
    f.truncate()

Is there a way to do this by reading a text file and rewriting it with the corrected data?

3 Answers


abulafia · Answer 1 · 2020-02-19T10:35:39+08:00

One idea: split the line at the opening parenthesis and then call strip() on the first element to remove any spaces it may have at the beginning and at the end. For example:

lineas = [
  '   Elnombre1     (237)',
  '      Elnombre3(237)',
  '    Elnombre4(17)' 
  ]

nombres = []

for linea in lineas:
    nombre = linea.split('(')[0].strip()
    nombres.append(nombre)

print(nombres)

Output:

['Elnombre1', 'Elnombre3', 'Elnombre4']

Update

To work on a file, if the number of lines is not huge, one approach is to read it first, process the lines while accumulating the results in a list, and write them out afterwards.

nombres = []
with open("e.txt", "r") as f:
    for linea in f:
        nombre = linea.split('(')[0].strip()
        nombres.append(nombre)
with open("e.txt", "w") as f:  # Careful! This overwrites the file
    for nombre in nombres:
        f.write("{}\n".format(nombre))

The drawback of this approach is that you have to have the results in memory before writing them. This shouldn't be a problem unless the file is monstrous, but if it were a problem then it would be better to open two files (the original for reading and the results for writing) and write the lines as they are processed instead of storing them in a list. In the end, once the files are closed, you could rename the output file and give it the same name as the input.
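The two-file variant described above could look roughly like this. It is only a sketch: the function name extraer_nombres is made up for illustration, and it assumes the temporary file can be created in the same directory as the input.

```python
import os
import tempfile

def extraer_nombres(ruta):
    """Rewrite the file at `ruta`, keeping only the name from each line."""
    directorio = os.path.dirname(os.path.abspath(ruta))
    # Write to a temporary file in the same directory, one line at a time,
    # so the results never have to be held in memory all at once.
    with open(ruta, "r") as entrada, tempfile.NamedTemporaryFile(
        "w", dir=directorio, delete=False
    ) as salida:
        for linea in entrada:
            nombre = linea.split('(')[0].strip()
            salida.write("{}\n".format(nombre))
    # Both files are closed here; give the output the name of the input.
    os.replace(salida.name, ruta)

# Usage (overwrites the file): extraer_nombres("e.txt")
```

Keeping the temporary file in the same directory means the final rename stays on one filesystem.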

Update 2

In a later edit of the question, the asker also wants to extract what is inside the parentheses and add extra text (which I understand goes right after the name).

For this type of processing, it becomes preferable to build a regular expression that captures the parts of the line that are of interest. However, regular expressions are known to be a thorny subject, and there is already another answer that shows how to use them, so I will show the "handmade" solution here (although it is not the one I would recommend in general).

To extract what is inside the parentheses, we can take advantage of the fact that we have already split the line at the (, so the rest of the line will be in element [1]. Just remove the last character (which will be the )) to get what was inside the parentheses. For example:

lineas = [
  '   Elnombre1     (237)',
  '      Elnombre3(225)',
  '    Elnombre4(17)' 
  ]

texto_prefijado = "textoquepuse"
nombres = []

for linea in lineas:
    trozos = linea.split('(')
    nombre = trozos[0].strip()
    numero = trozos[1][:-1] 
    nombres.append("{}{}{}".format(nombre, texto_prefijado, numero))

print(nombres)

['Elnombre1textoquepuse237', 'Elnombre3textoquepuse225', 'Elnombre4textoquepuse17']
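For comparison, the regex-based variant that this answer mentions as generally preferable might be sketched as follows. The pattern and the helper name reformatear are assumptions for illustration, not part of the answer; the sketch assumes the name contains no spaces or parentheses, and it also tolerates a trailing newline, which the [:-1] slice above does not.

```python
import re

# Hypothetical pattern: optional whitespace, a name without spaces or
# parentheses, optional whitespace, then a number in parentheses.
PATRON = re.compile(r"^\s*([^\s()]+)\s*\((\d+)\)\s*$")

def reformatear(linea, texto_prefijado="textoquepuse"):
    """Return name + inserted text + number, or None if the line does not match."""
    m = PATRON.match(linea)
    if m is None:
        return None
    nombre, numero = m.groups()
    return "{}{}{}".format(nombre, texto_prefijado, numero)

lineas = ['   Elnombre1     (237)', '      Elnombre3(225)\n', '    Elnombre4(17)']
print([reformatear(linea) for linea in lineas])
```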

bl4ckdrvg0n · Answer 2 · 2020-02-20T04:00:30+08:00

One way to get the name using a regular expression could be the following:

import re

lineas = [
    '    Elnombre1     (237)',
    '    Elnombre3(237)',
    '    Elnombre4(17)'
]

# Raw string so that \s reaches the regex engine unchanged
nombres = [re.findall(r'^\s+([a-zA-Z-0-9]+)\s*', x)[0] for x in lineas]

print(nombres)

Output:

['Elnombre1', 'Elnombre3', 'Elnombre4']

Explanation

The regular expression ^\s+([a-zA-Z-0-9]+)\s* anchors at the start of the line (^), requires one or more whitespace characters (\s+), then captures a run of letters, digits or hyphens (the parentheses () capture the name), and finally allows zero or more whitespace characters (\s*).


Mariano · Answer 3 · 2020-02-22T14:15:58+08:00

Replace while capturing part of the text

Taking as an example:

texto = "        El nombre 1         (237)"

So we match, in order: the beginning of the text ^, optional spaces  *, any number of characters .*?, optional spaces  *, the parentheses with the number \(\d+\), and the end of the text $.

I used .*? with the ? at the end to tell it to match "as little as possible". This is a lazy quantifier, and this way it does not consume the spaces that come after the name.
In that same construction I used a dot (.), which matches any character except a newline, but you could perfectly well restrict it to whatever you want, for example:
[\w .,;!áéíóúüñ]*?, or any character except spaces [^ ]*?, etc.

What we're going to do is capture the name. When parentheses are used in a regular expression, the matched text is captured and saved, so that text can be used in the replacement, using \1.

Regex:

^ *(.*?) *\(\d+\)$

Replacement:

\1(otrotexto)

Keep in mind: otrotexto should not contain backslashes (\), or you should escape them as \\.

Code:

import re

texto = "        El nombre 1         (237)"
entre_parentesis = "otrotexto"

regex = r"^ *(.*?) *\(\d+\)$"
subst = r"\1(" + entre_parentesis + r")"

resultado = re.sub(regex, subst, texto)

if resultado:
    print(resultado)

Result:

El nombre 1(otrotexto)

Demo:

https://ideone.com/s16Wrx
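Regarding the backslash caveat above, here is a small sketch of two ways to make the replacement safe. The value C:\temp is a made-up example of text containing a backslash; the regex is the same one used above.

```python
import re

texto = "        El nombre 1         (237)"
entre_parentesis = r"C:\temp"  # hypothetical text containing a backslash

regex = r"^ *(.*?) *\(\d+\)$"

# Option 1: pass a function. Its return value is inserted literally,
# so backslashes in entre_parentesis need no escaping.
resultado = re.sub(regex, lambda m: m.group(1) + "(" + entre_parentesis + ")", texto)
print(resultado)

# Option 2: keep the template string, but double the backslashes first.
subst = r"\1(" + entre_parentesis.replace("\\", "\\\\") + ")"
print(re.sub(regex, subst, texto))
```

Without either precaution, the \t in the example text would be interpreted as a tab by the replacement template.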

Replace inner spaces with _

To convert "El nombre 1 (237)" to "El_nombre_1(otrotexto)" we pass a function as the replacement argument of re.sub(). Let's use a lambda.

import re

texto = "El nombre 1 (237)"
sep   = "_"
subst = "(otrotexto)"
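# Optional spaces followed by either the final "(number)" or a word
regex = r" *(?:\(\d+\)$|([^ ]+))"

texto = re.sub(
    regex,
    # Word matched: keep it, prefixed with sep unless it starts the text.
    # No word (group 1 is None): the "(number)" matched, so emit subst.
    lambda m: (sep if m.start() else "") + m.group(1) if m.group(1) else subst,
    texto,
)

print(texto)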

El_nombre_1(otrotexto)
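Since the question asks about rewriting a whole file, here is a rough sketch of applying the first regex of this answer to every line at once. The function name reescribir is made up for illustration, and the pattern is slightly adapted (an extra  * before $) because the sample data in the question seems to have trailing spaces.

```python
import re

# With re.MULTILINE, ^ and $ match at the start and end of every line,
# so one call to sub() processes the whole file content.
REGEX = re.compile(r"^ *(.*?) *\(\d+\) *$", re.MULTILINE)

def reescribir(ruta, entre_parentesis="otrotexto"):
    """Replace '  name  (number)' with 'name(entre_parentesis)' on each line of the file."""
    with open(ruta, "r") as f:
        contenido = f.read()
    # A function replacement keeps entre_parentesis literal (no backslash issues).
    nuevo = REGEX.sub(lambda m: m.group(1) + "(" + entre_parentesis + ")", contenido)
    with open(ruta, "w") as f:  # Careful: this overwrites the file
        f.write(nuevo)

# Usage: reescribir("e.txt", "losparentesis")
```

Lines that do not match the pattern are left untouched.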
