I really don't know if it's UTF-8 or something else, but I'm receiving JSON from a server that contains encoded characters (accented letters and the like) that I want to convert to normal text. Some of the fields in that JSON are mission, vision and values:
"mision": "<p>Esta es una visi&oacute;n de prueba de SQL SOftware</p>",
"vision": "<p>Y esta es la visi&oacute;n</p><span style=\"color:#666666;font-family:arial, sans-serif;font-size:14px;line-height:22px;text-align:justify;background-color:#ffffff;\"></span>",
"valores": "<p>Descripci&oacute;n<br /></p>",
How can I convert this strangely formatted data to normal text? I tried it this way but it didn't work:
var deco = Encoding.Default.GetBytes(payload);
var json = Encoding.UTF8.GetString(deco);
The .NET Framework offers System.Net.WebUtility.HtmlDecode and System.Web.HttpUtility.HtmlDecode, either of which you can use for this. Both methods return the same result. The most obvious difference is their availability across .NET versions, the most noticeable case being that WebUtility is available for UWP and HttpUtility is not.
Keep in mind that you may have to apply the conversion several times, because in your example there are cases where one pass will not be enough.
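As a minimal sketch of a single decoding pass, the snippet below runs WebUtility.HtmlDecode over the "valores" value from the question. The Program class and variable names are only illustrative.

```csharp
using System;
using System.Net;

class Program
{
    static void Main()
    {
        // Sample value copied from the "valores" field in the question.
        string raw = "<p>Descripci&oacute;n<br /></p>";

        // One pass turns the named entity &oacute; into the character ó.
        // The HTML tags themselves are left untouched.
        string decoded = WebUtility.HtmlDecode(raw);

        Console.WriteLine(decoded);
    }
}
```

After this pass, decoded holds `<p>Descripción<br /></p>`; how it is displayed depends on the console encoding.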
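In the case of the first string, ...visi&oacute;n..., the first decode converts it to visión, and a second decode still gives visión. A second pass only changes the result when the entity itself arrived escaped, for example &amp;oacute;, which decodes first to &oacute; and then to ó.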
Below is a segment of the code I used to test
At first glance it seems that decoding twice is enough, but I recommend testing with different strings to confirm it.
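Rather than hard-coding two passes, one option is to decode until the text stops changing. This is not the original test code, only a sketch: the HtmlText class, the DecodeFully method and the maxPasses limit are hypothetical names introduced for illustration.

```csharp
using System;
using System.Net;

static class HtmlText
{
    // Hypothetical helper: repeats HtmlDecode until a pass makes no change,
    // or until maxPasses is reached (a safety limit chosen arbitrarily here).
    public static string DecodeFully(string input, int maxPasses = 5)
    {
        string current = input;
        for (int i = 0; i < maxPasses; i++)
        {
            string next = WebUtility.HtmlDecode(current);
            if (next == current)
            {
                break; // nothing left to decode
            }
            current = next;
        }
        return current;
    }
}

class Program
{
    static void Main()
    {
        // Assumed double-encoded input: the ampersand of &oacute; was itself escaped.
        Console.WriteLine(HtmlText.DecodeFully("visi&amp;oacute;n"));
    }
}
```

Note that decoding to a fixed point can go too far if the text is meant to contain a literal entity (for instance, documentation that shows `&amp;` on purpose), so it is worth checking against real payloads.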
If, after decoding, you want to remove the HTML elements, Ravi Thapliyal's answer (from the English site) may be helpful. It uses a regular expression for that purpose.
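The expression from that answer is not reproduced here. As a rough sketch of the idea, assuming simple tags like the ones in the question, a pattern such as `<[^>]+>` can be removed with Regex.Replace. The HtmlStripper class and StripTags method are hypothetical names.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

static class HtmlStripper
{
    // Hypothetical helper: removes anything that looks like <...>.
    // This is a simplification, not an HTML parser; it can misfire on
    // text that contains literal '<' or '>' characters.
    public static string StripTags(string html)
    {
        return Regex.Replace(html, "<[^>]+>", string.Empty);
    }
}

class Program
{
    static void Main()
    {
        // Sample value copied from the "valores" field in the question.
        string decoded = WebUtility.HtmlDecode("<p>Descripci&oacute;n<br /></p>");
        Console.WriteLine(HtmlStripper.StripTags(decoded));
    }
}
```

For markup more complex than these fragments, a dedicated HTML parser is usually a safer choice than a regular expression.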