Hi, I need to convert a CamelCase string to a hyphenated one. I've been trying a few regular expressions, but I can't find one that splits the words for me. The idea is to take a CamelCase string:
Input:
'HolaMundoCruel'
Output:
'hola-mundo-cruel'
Thank you
Let's try to match the beginning of each word. There are two types of words in Pascal notation:
Words that start with a capital letter, followed by at least one lowercase letter.
In this case, we only need to verify that the capital is followed by one lowercase letter (that is all that matters for putting a hyphen in front of it and converting it to lowercase).
There could also be digits between the two letters, so we allow for those as well.
Acronyms (consecutive capital letters).
This matches one uppercase letter, followed by more uppercase letters or digits: [A-Z][A-Z\d]*. In addition, it must be followed by another capital letter or the end of the text: (?=[A-Z]|$). That way, we prevent it from consuming the first letter of the next word. For example, it matches HTML in HTMLFormateado and HTML in FormatoHTML.
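To see the two word types in action, here is a small sketch. The acronym pattern and its lookahead come from the text above; the pattern for the first type, [A-Z]\d*[a-z], is my reading of the description (a capital, optional digits, one lowercase letter) and should be treated as an assumption, as should the variable names.

```python
import re

# Type 1 (assumed pattern): a capital, optional digits, then one lowercase letter.
WORD_START = r"[A-Z]\d*[a-z]"

# Type 2 without the lookahead: it swallows the capital of the next word.
ACRONYM_GREEDY = r"[A-Z][A-Z\d]*"

# Type 2 with the lookahead: must be followed by a capital or the end of the text.
ACRONYM = r"[A-Z][A-Z\d]*(?=[A-Z]|$)"

word_starts = re.findall(WORD_START, "HolaMundoCruel")
greedy = re.findall(ACRONYM_GREEDY, "HTMLFormateado")
bounded = re.findall(ACRONYM, "HTMLFormateado")
at_end = re.findall(ACRONYM, "FormatoHTML")

print(word_starts)  # ['Ho', 'Mu', 'Cr']
print(greedy)       # ['HTMLF']
print(bounded)      # ['HTML']
print(at_end)       # ['HTML']
```

The greedy version returns HTMLF, which is exactly the problem the lookahead solves: the regex backtracks one character so that the F stays available as the start of the next word.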
Combining the two previous expressions into one, we are left with:
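The combined expression is not shown here, so the following is a reconstruction rather than the original: it assumes both alternatives share the leading [A-Z] and are joined with a non-capturing group. PASCAL_WORD and word_matches are hypothetical names used only for this sketch.

```python
import re

# Assumed combination of the two cases:
#   [A-Z]                       a capital letter, then either
#   \d*[a-z]                    optional digits and one lowercase letter (a normal word), or
#   [A-Z\d]*(?=[A-Z]|$)         the rest of an acronym, bounded by a capital or the end
PASCAL_WORD = re.compile(r"[A-Z](?:\d*[a-z]|[A-Z\d]*(?=[A-Z]|$))")


def word_matches(text):
    """Return the text matched at the start of each word."""
    return [m.group() for m in PASCAL_WORD.finditer(text)]


print(word_matches("HolaMundoCruel"))  # ['Ho', 'Mu', 'Cr']
print(word_matches("HTMLFormateado"))  # ['HTML', 'Fo']
print(word_matches("FormatoHTML"))     # ['Fo', 'HTML']
```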
This expression already matches all the cases. If we replace each match with r"-\g<0>" (a hyphen followed by the text that was matched), a hyphen is inserted before every word, including the first one.
Do not insert hyphens at the beginning of the text
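To make the problem concrete, here is a minimal sketch of the plain replacement. The pattern is an assumed reconstruction of the combined expression (a capital followed by either a lowercase letter or the rest of an acronym), and the names are hypothetical.

```python
import re

# Assumed pattern: start of a normal word, or a whole acronym.
PASCAL_WORD = re.compile(r"[A-Z](?:\d*[a-z]|[A-Z\d]*(?=[A-Z]|$))")

# \g<0> refers to the whole match, so every word start gets a hyphen in front.
naive = PASCAL_WORD.sub(r"-\g<0>", "HolaMundoCruel")

print(naive)          # -Hola-Mundo-Cruel
print(naive.lower())  # -hola-mundo-cruel
```

The leading hyphen appears because the first word matches like any other, which is what the next step fixes.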
To keep it from inserting a hyphen at the beginning, we pass a function as the replacement argument and check, on each replacement, whether match.start() is 0. If it's the first word (it starts at position 0), we don't add a hyphen; otherwise we prepend one. Inside the function, we use str.lower() to convert the match to lowercase.
Final code
Convert from PascalCase to kebab-case.
We use exactly the same logic as in the previous code, with a lambda.
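A minimal sketch of that logic, assuming the combined pattern is a capital followed by either a lowercase letter or the rest of an acronym. The name pascal_to_kebab and the exact pattern are my assumptions, not necessarily the original code.

```python
import re

# Assumed pattern: start of a normal word, or a whole acronym.
PASCAL_WORD = re.compile(r"[A-Z](?:\d*[a-z]|[A-Z\d]*(?=[A-Z]|$))")


def pascal_to_kebab(text):
    """Convert from PascalCase to kebab-case."""
    return PASCAL_WORD.sub(
        # No hyphen when the match starts at position 0; lowercase the matched text.
        lambda m: ("-" if m.start() else "") + m.group().lower(),
        text,
    )


for sample in ("HolaMundoCruel", "HTMLFormateado", "FormatoHTML"):
    print(sample, "->", pascal_to_kebab(sample))
# HolaMundoCruel -> hola-mundo-cruel
# HTMLFormateado -> html-formateado
# FormatoHTML -> formato-html
```

Only the matched text needs lowercasing, because everything the pattern skips is already lowercase or a digit.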
Tests:
Result:
Demo: http://ideone.com/Xd9mUw
You can use re.sub, which lets you replace each match (in this case an uppercase letter inside the string) with another given string (in this case '-'). To remove the capitals you can use the lower method of the str class.
Another alternative is to use re.finditer to split the words (this also works if we want a list of the words contained in the CamelCase string). With that, it's enough to join them again using the join() method of str.
Output of both:
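Neither snippet is shown above, so here is a minimal sketch of the two alternatives and what each returns for the example string. The patterns and the function names kebab_with_sub and kebab_with_finditer are assumptions, not the original code.

```python
import re


def kebab_with_sub(text):
    # Insert '-' before every uppercase letter that is not at the start,
    # then lowercase the whole string.
    return re.sub(r"(?<!^)(?=[A-Z])", "-", text).lower()


def kebab_with_finditer(text):
    # Each word is a capital followed by any run of non-capitals.
    words = [m.group() for m in re.finditer(r"[A-Z][^A-Z]*", text)]
    return "-".join(words).lower()


print(kebab_with_sub("HolaMundoCruel"))       # hola-mundo-cruel
print(kebab_with_finditer("HolaMundoCruel"))  # hola-mundo-cruel
```

Both versions treat every capital as the start of a new word, so an acronym such as HTML would come out as h-t-m-l; they are fine for simple input like the example.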