HTML Entity Encoder / Decoder
Convert special characters like <, >, & and " to their HTML entity equivalents, or decode them back.
What is HTML entity encoding?
HTML entity encoding converts characters that have special meaning in HTML, such as <, >, &, and ", into safe named or numeric references that browsers display as text rather than interpreting as markup. For example, the less-than sign < becomes &lt;. A browser that receives &lt;script&gt; displays the literal text <script> instead of executing a script element. This one mechanism is the foundation of safe HTML output and cross-site scripting (XSS) prevention.
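To make the mechanism concrete, here is a minimal hand-written sketch in Python. The function name encode_html is hypothetical and this is not the code behind this tool; it only illustrates the substitution, including why the ampersand has to be replaced first.

```python
def encode_html(text: str) -> str:
    """Replace the four markup-significant characters with named entities."""
    # The ampersand goes first; otherwise the "&" introduced by the later
    # replacements would itself be encoded a second time.
    replacements = [
        ("&", "&amp;"),
        ("<", "&lt;"),
        (">", "&gt;"),
        ('"', "&quot;"),
    ]
    for char, entity in replacements:
        text = text.replace(char, entity)
    return text


print(encode_html("<script>"))       # &lt;script&gt;
print(encode_html('Tom & "Jerry"'))  # Tom &amp; &quot;Jerry&quot;
```

In practice a maintained library routine is usually a better choice than a hand-rolled table; the sketch is only meant to show what such a routine does.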
XSS — why HTML encoding matters for security
Cross-site scripting (XSS) is one of the most common web vulnerabilities. It occurs when user-supplied text is inserted into an HTML page without being encoded, allowing attackers to inject <script> tags or event handlers that run in the victim's browser. A simple example: if a comment field stores <script>document.cookie</script> and the page displays it unencoded, every visitor who loads that page executes the attacker's script. HTML encoding all untrusted output before inserting it into HTML is the core defence for HTML text and quoted attribute values; other contexts, such as inline JavaScript or URLs, need their own context-specific encoding. Modern frameworks like Blazor, React, and Angular encode output by default; the risk arises when developers deliberately bypass encoding with raw HTML rendering.
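The comment scenario can be sketched with Python's standard html.escape. The render_comment helper and the comment class name are made up for illustration; a real application would normally rely on its template engine's automatic encoding instead.

```python
import html


def render_comment(comment: str) -> str:
    """Return an HTML fragment with the untrusted comment encoded."""
    # html.escape turns <, >, &, and quotes into character references,
    # so the browser shows the payload as text instead of running it.
    return f'<p class="comment">{html.escape(comment)}</p>'


payload = "<script>document.cookie</script>"
print(render_comment(payload))
# <p class="comment">&lt;script&gt;document.cookie&lt;/script&gt;</p>
```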
Named entities vs numeric references
HTML entities can be written in two ways. Named entities use a human-readable name prefixed with & and ending with ; (for example, &amp; for &, &copy; for ©, and &nbsp; for a non-breaking space). Numeric references use the Unicode code point in decimal (&#169;) or hexadecimal (&#xA9;) form. Named entities are more readable; numeric references work for any Unicode character whether or not it has a named form.
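As a small illustration, the following Python sketch decodes the three spellings of the copyright sign with the standard html.unescape and builds numeric references from code points. The numeric_reference helper is a hypothetical name introduced for this example.

```python
import html

# The named, decimal, and hexadecimal forms decode to the same character.
for reference in ("&copy;", "&#169;", "&#xA9;"):
    print(reference, "->", html.unescape(reference))


def numeric_reference(char: str, hexadecimal: bool = False) -> str:
    """Build a numeric character reference from one character's code point."""
    code_point = ord(char)
    return f"&#x{code_point:X};" if hexadecimal else f"&#{code_point};"


print(numeric_reference("©"))        # &#169;
print(numeric_reference("©", True))  # &#xA9;
print(numeric_reference("™", True))  # &#x2122;
```

Because the numeric form is derived purely from the code point, it works for characters such as ™ regardless of whether a named entity exists.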
What Encode actually produces — it's narrower than you'd expect
We tested this directly, and Encode and Decode aren't symmetric here. Decode handles the full range of named entities: &copy;, &trade;, &mdash;, and &apos; all convert back to their characters correctly. Encode is narrower: it only touches < > & ", plus the apostrophe (emitted as a numeric reference, not the named &apos;) and a handful of other characters it renders numerically rather than by name. Characters like ™ and — pass through Encode completely untouched. They're valid UTF-8 already, so HTML doesn't require encoding them, but it does mean Encode won't produce the named entities below even though Decode understands them:
- &lt; → < (less-than sign; essential for displaying HTML tags as text)
- &gt; → > (greater-than sign)
- &amp; → & (ampersand; must always be encoded in HTML attributes and text)
- &quot; → " (double quote; needed inside HTML attribute values)
- &apos; → ' (apostrophe; Decode accepts this, Encode produces a numeric reference instead)
- &nbsp; → non-breaking space (prevents line breaks between words)
- &copy; → © (copyright symbol; Decode-only, Encode leaves a literal © untouched)
- &trade; → ™ (trademark symbol; Decode-only, Encode leaves a literal ™ untouched)
- &mdash; → — (em dash; Decode-only, Encode leaves a literal — untouched)
- &hellip; → … (ellipsis)
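A similar asymmetry can be observed in Python's standard library, which makes it a convenient way to see the pattern. This is an analogy only: it is not this tool's implementation, and the exact set of characters this tool encodes may differ from what html.escape does.

```python
import html

# Encoding is narrow: only &, <, >, and the two quote characters change.
# Python writes the apostrophe as the hexadecimal reference &#x27;.
text = "© ™ — ' <b>"
encoded = html.escape(text)
print(encoded)  # © ™ — &#x27; &lt;b&gt;

# Decoding is broad: named entities for many characters are understood.
decoded = html.unescape("&copy; &trade; &mdash; &apos; &hellip;")
print(decoded)  # © ™ — ' …

# The round trip still recovers the original text.
print(html.unescape(encoded) == text)  # True
```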
How to use this tool
- Paste your text into the input editor.
- Click Encode to convert special characters to HTML entities.
- Click Decode to convert HTML entities back to their original characters.
- Copy the safe output and use it in your HTML template.
When to use HTML entity encoding
Encode any text that originated from outside your application — user input, database values, API responses, URL parameters — before inserting it into HTML. In server-rendered applications like ASP.NET Core Razor or Blazor, the Razor template engine encodes output by default when you use @variable. Only bypass encoding when you intentionally want to render HTML from a trusted source, using constructs like @((MarkupString)value) in Blazor or Html.Raw() in Razor Pages — and only after thoroughly sanitising the HTML with a library like HtmlSanitizer.
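The encode-by-default, opt-out-explicitly pattern can be sketched in Python as follows. TrustedMarkup and to_html are hypothetical names chosen to mirror the idea behind MarkupString; this is not how Razor or Blazor are implemented, and the sanitising step itself is assumed to have happened before a value is wrapped.

```python
import html
from dataclasses import dataclass


@dataclass(frozen=True)
class TrustedMarkup:
    """Hypothetical marker for HTML that has already been sanitised."""
    value: str


def to_html(value: object) -> str:
    """Encode by default; emit raw HTML only for explicitly trusted markup."""
    if isinstance(value, TrustedMarkup):
        return value.value
    return html.escape(str(value))


print(to_html("<b>bold?</b>"))                # &lt;b&gt;bold?&lt;/b&gt;
print(to_html(TrustedMarkup("<b>bold</b>")))  # <b>bold</b>
print(to_html(42))                            # 42
```

Making the unsafe path require a distinct type keeps every bypass visible in code review, which is the main appeal of this design.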