I am processing text and want to remove several kinds of elements: punctuation marks, exclamation marks, URLs, and so on. In this case I want to remove the HTML character references used for accented letters (the &# sequences).
I am trying to change part of a string with the replace() method of Java's String class, but the substring is not replaced, as I verified by printing the string right after the call.
Suppose this is the line from which I want to remove the &#:
LOOOOL TA PHOTO =D !you as aimé
This is the code I wrote, which does not achieve that:
    public class Procesamiento {
        private String cadena_;

        Procesamiento(String archivo) throws Exception {
            FileReader f = new FileReader(archivo);
            BufferedReader b = new BufferedReader(f);
            FileWriter ficheroPreVocabulario = new FileWriter("/home/alien/Escritorio/PLN_IAA/preVocabulario.txt");
            PrintWriter pw1 = new PrintWriter(ficheroPreVocabulario);
            while ((cadena_ = b.readLine()) != null) {
                String a = cadena_;
                // Double escape needed: the first \ escapes the second, so the regex
                // engine receives \s (whitespace) rather than a literal s.
                String[] linea = a.split("\\s+");
                for (int i = 0; i < linea.length; i++) {
                    eliminaHashtags(linea[i]);
                }
                pw1.println(String.join(" ", linea));
            }
            b.close();
        }

        void eliminaHashtags(String cadena_) throws Exception {
            if (cadena_.contains("&#")) {
                cadena_.replace("&#", "");
                // I tested it: the &# is not removed
                System.out.println(" la cadena_ despues de eliminar hashtag " + cadena_);
            }
        }
    }
All imports have been done and the program does not complain about that part. Help is appreciated. Thank you!
According to the documentation, Strings are constant: you cannot change a String's value once it has been created. What you have to do is capture the return value of replace().
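A minimal sketch of that behaviour, in case it helps. The sample string "aim&#233;" is only an assumption, since the original line in the question no longer shows its &# sequence, and ReplaceDemo and stripMarker are hypothetical names used for illustration.

```java
class ReplaceDemo {
    // Returns a new String with every "&#" removed; the argument is left untouched.
    static String stripMarker(String text) {
        return text.replace("&#", "");
    }

    public static void main(String[] args) {
        String original = "aim&#233;";        // hypothetical sample input

        original.replace("&#", "");           // result is discarded, so nothing changes
        System.out.println(original);         // prints "aim&#233;"

        String cleaned = stripMarker(original);
        System.out.println(cleaned);          // prints "aim233;"
    }
}
```

Note that removing only "&#" leaves the digits and the semicolon behind, so the search text would need to be adjusted if the whole reference should go.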
Any String method that appears to modify the text actually returns a new String, so its result has to be assigned to a variable. The original object does not change.
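Applied to the code in the question, one possible fix looks like the sketch below. Assigning the result to the parameter inside eliminaHashtags would not be enough, because Java passes the reference by value and the array element would still point to the old String. So here the method returns the new String and the loop stores it back into the array. ProcesamientoSketch and limpiaLinea are hypothetical names, the file handling is left out, and the sample input is an assumption.

```java
class ProcesamientoSketch {
    // Returns a copy of the token without "&#". No contains() check is needed:
    // replace() returns an equal string when there is nothing to replace.
    static String eliminaHashtags(String token) {
        return token.replace("&#", "");
    }

    // Splits a line on whitespace, cleans every token and joins them again.
    static String limpiaLinea(String linea) {
        String[] tokens = linea.split("\\s+");
        for (int i = 0; i < tokens.length; i++) {
            tokens[i] = eliminaHashtags(tokens[i]); // store the result back in the array
        }
        return String.join(" ", tokens);
    }

    public static void main(String[] args) {
        // Hypothetical input line; in the original program this comes from readLine().
        System.out.println(limpiaLinea("LOOOOL TA PHOTO =D aim&#233;"));
    }
}
```

In the original constructor, the equivalent change would be linea[i] = eliminaHashtags(linea[i]); inside the for loop, with eliminaHashtags declared to return String.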