Sanitize and clean HTML for safe consumption as plain text.
var plainText = Textorize.HtmlToPlainText("<span>I contain html</span><p>convert me</p>");
// plainText = "I contain html\nconvert me\n"
Converts HTML input to a safe plain text representation that contains no HTML. Content inside style and script tags is removed completely, and HTML entities are explicitly converted to their Unicode characters. Invalid HTML is handled on a best-effort basis to produce a reasonable plain text equivalent.
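As a rough illustration of the rules above, the sketch below passes a fragment containing a style block, a script block, and an entity through Textorize.HtmlToPlainText. The Demo class, the Convert helper, and the input string are made up for this example, and a using directive for the package's namespace may be needed. Going by the description, the result should keep the text "Fish & chips" and contain none of the CSS or JavaScript. Exact whitespace is deliberately not asserted.

```csharp
using System;

public static class Demo
{
    // Hypothetical helper: a thin wrapper around the library call shown above.
    public static string Convert(string html) => Textorize.HtmlToPlainText(html);

    public static void Main()
    {
        // Made-up input: style block, paragraph with an entity, script block.
        var html = "<style>p { color: red; }</style>"
                 + "<p>Fish &amp; chips</p>"
                 + "<script>alert('hi');</script>";

        var text = Convert(html);

        // Checks based on the documented behavior, not on verified output.
        Console.WriteLine(text.Contains("Fish & chips")); // entity decoded
        Console.WriteLine(text.Contains("alert"));        // script content
        Console.WriteLine(text.Contains("color"));        // style content
    }
}
```

If the library behaves as described, the first check reports that the decoded text is present and the other two report that script and style content is absent.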
Keep in mind the following equivalence:
Textorize(input) == Textorize(HtmlEncode(Textorize(input)))
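One way to read the equivalence: HTML-encoding the plain text output and converting it again should give back the same plain text. The sketch below checks this for a single input. It assumes that Textorize(...) in the line above is shorthand for Textorize.HtmlToPlainText(...) and that HtmlEncode can be stood in for by System.Net.WebUtility.HtmlEncode; the README does not say which encoder is meant. The RoundTrip class and its Holds method are hypothetical names, and a using directive for the package's namespace may be needed.

```csharp
using System;
using System.Net;

public static class RoundTrip
{
    // Hypothetical check: returns true when re-encoding and re-converting
    // the plain text output reproduces that output unchanged.
    public static bool Holds(string input)
    {
        var once = Textorize.HtmlToPlainText(input);
        // Assumption: WebUtility.HtmlEncode plays the role of HtmlEncode.
        var encoded = WebUtility.HtmlEncode(once);
        var twice = Textorize.HtmlToPlainText(encoded);
        return once == twice;
    }

    public static void Main()
    {
        // Made-up input whose plain text contains characters that need encoding.
        Console.WriteLine(Holds("<p>1 &lt; 2 &amp; 3 &gt; 2</p>"));
    }
}
```

A check like this can be useful when plain text is stored, encoded for display, and later passed through the converter again, since the property suggests no content is lost or double-decoded along the way.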
For more examples, see the test suite.
PM> Install-Package Textorizer
> dotnet add package Textorizer
Dual licensed
MIT
https://opensource.org/licenses/MIT
Unlicense