To explain: I want \w in a regular expression to match the characters [a-zA-Z_0-9áéíóúüñÁÉÍÓÚÜÑ], which, as you know, are the ones considered part of a word in Spanish.
I thought this depended on the locale, but apparently it doesn't: even when I change CurrentCulture before running the regular expression, it still doesn't return matches that include accented letters.
Thread.CurrentThread.CurrentCulture = New CultureInfo("es-ES")
I have also tried the alternative \p{L}, with the same result.
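For reference, the attempt described above can be reproduced with a minimal sketch along these lines. The sample string and the module name are hypothetical, chosen only for illustration, and the patterns use a `+` quantifier so that whole words are printed rather than single characters.

```vb
Imports System
Imports System.Globalization
Imports System.Text.RegularExpressions
Imports System.Threading

Module WordCharRepro
    Sub Main()
        ' Hypothetical sample text; not taken from the real application.
        Dim sample As String = "El niño comió pingüino"

        ' Switch the thread culture to Spanish (Spain) before matching.
        Thread.CurrentThread.CurrentCulture = New CultureInfo("es-ES")

        ' Try both patterns mentioned above and print every match found.
        For Each pattern As String In {"\w+", "\p{L}+"}
            Console.WriteLine("Pattern: " & pattern)
            For Each m As Match In Regex.Matches(sample, pattern)
                Console.WriteLine("  " & m.Value)
            Next
        Next
    End Sub
End Module
```

Running this with and without the CurrentCulture line makes it easy to compare whether the culture setting changes what each pattern matches.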
I have looked through the options that can be passed when running regular expressions in .NET, and there is one called RegexOptions.CultureInvariant, but it doesn't help me because it only affects how case-insensitive string comparisons are performed.
https://msdn.microsoft.com/en-us/library/yd1hzczs(v=vs.110).aspx#Anchor_11
I could use an ad hoc character class, [a-zA-Z_0-9áéíóúüñÁÉÍÓÚÜÑ], but it is longer and less readable, and since I want to support more languages in the future, I would have to write a different one for each language.
What I am asking is whether .NET has a per-language list of the characters that are part of a word, so that just passing the locale, for example "es-ES", would make it interpret \w or \p{L} as [a-zA-Z_0-9áéíóúüñÁÉÍÓÚÜÑ].