I have a data frame like this:
x <- data.frame(col=c("X 100", "X 200", "casa", "B", "X 150", "C","X234"))
col
1 X 100
2 X 200
3 casa
4 B
5 X 150
6 C
7 X234
Wherever an X followed by a number appears, I want to put a hyphen between them, so that it looks like this:
col
1 X-100
2 X-200
3 casa
4 B
5 X-150
6 C
7 X-234
I do the following:
library("stringr")
x$col2 <- str_replace(x$col, "(\\X)(\\d{3})", "\\1-\\2")
    col   col2
1 X 100 X -100
2 X 200 X -200
3  casa   casa
4     B      B
5 X 150 X -150
6     C      C
7  X234  X-234
And I'm left with that blank space in the middle that I can't get rid of.
I think there is a conceptual misunderstanding here: you are expecting a result that this pattern cannot produce. Let's look at a very simple example: replacing the pattern (\\X) with \\1 in a single string.
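As a minimal sketch of such an example (the input string "X 100" is an assumption, taken from your data):

```r
library(stringr)

# Assumed input: one value from the question's data.
# The pattern matches a single character (the first one, "X") and the
# replacement puts that same character back, so nothing visibly changes.
simple <- str_replace("X 100", "(\\X)", "\\1")
simple
```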
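How does str_replace() work in this example? It looks for the pattern \\X, which here matches only the X, and \\1 replaces that X with the same X. The rest of the string, which does not fit the pattern, is not modified.
In your case, you call str_replace() with the pattern (\\X)(\\d{3}) and the replacement \\1-\\2.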
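One way to check what that pattern really captures is to inspect the groups with str_match(). This is only a sketch on a single value from your data:

```r
library(stringr)

# str_match() returns the full match followed by each capture group.
# In stringr's ICU engine, \X matches any single character (a grapheme
# cluster), so group 1 is whatever sits right before the three digits.
m <- str_match("X 100", "(\\X)(\\d{3})")
m[1, ]  # full match " 100", group 1 " ", group 2 "100"

# The space is captured as group 1 and written back before the hyphen.
res <- str_replace("X 100", "(\\X)(\\d{3})", "\\1-\\2")
res     # "X -100"
```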
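What that call actually does is replace the single character matched by \\X with itself (\\1) and the three digits with -\\2, that is, the same number preceded by a hyphen. Note that in stringr's regex engine (ICU), \\X is not a literal X: it matches any single character (strictly, a grapheme cluster). For "X 100" the character right before the digits is the space, so the space is captured as group 1 and written back, which gives "X -100".
What you are looking for can be achieved if the pattern makes a complete match of everything you want to rewrite: the X, the optional space and the digits, capturing only the X and the digits.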
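A minimal sketch of such a complete-match pattern, assuming the space between the X and the digits is optional and that there are exactly three digits (the exact pattern is my assumption):

```r
library(stringr)

x <- data.frame(
  col = c("X 100", "X 200", "casa", "B", "X 150", "C", "X234"),
  stringsAsFactors = FALSE
)

# (X) captures the literal X, \\s? consumes an optional space without
# capturing it, and (\\d{3}) captures the three digits.
x$col2 <- str_replace(x$col, "(X)\\s?(\\d{3})", "\\1-\\2")
x$col2
```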
It is not a big change, but the pattern now covers everything you want to rewrite, that is, the X, the space and the three digits. All of that is replaced by the new string, and with this you get rid of the space.
If the number of digits after the X is always 3, then that code works perfectly. But if the number of digits can differ from 3, that regular expression is not the best choice.
Your question says "where an X and a number appear", so the right approach is to first add a condition that checks whether the element contains a number. I am also extending the data with one more element, X434545, which has more than 3 digits. This can be solved in base R, and also with the stringr package. I hope this helps.
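A sketch of both variants, assuming the condition is "the element contains a digit" and that any spaces between the X and the digits should be dropped. The names has_number, col2_base and col2_stringr are only illustrative:

```r
library(stringr)

# Data from the question plus "X434545", which has more than three digits.
x <- data.frame(
  col = c("X 100", "X 200", "casa", "B", "X 150", "C", "X234", "X434545"),
  stringsAsFactors = FALSE
)

# Base R: condition first (does the element contain a digit?), then
# put a hyphen between the X and the digits, dropping any spaces.
has_number <- grepl("[0-9]", x$col)
x$col2_base <- ifelse(has_number,
                      sub("^X *([0-9]+)", "X-\\1", x$col),
                      x$col)

# stringr: the same logic with str_detect() and str_replace().
x$col2_stringr <- ifelse(str_detect(x$col, "\\d"),
                         str_replace(x$col, "^(X)\\s*(\\d+)", "\\1-\\2"),
                         x$col)
x
```

Using + instead of {3} lets the pattern accept any number of digits, so X434545 is handled the same way as X234.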