Frequently asked questions about HTML escaping and encoding

Can I avoid escaping text in HTML?

You can avoid escaping most text, as long as the page declares the right encoding. The characters < > & are the exception: they have a special meaning in HTML, so we strongly recommend always writing at least these as the HTML entities &lt; &gt; &amp;.

If you do not escape, you must declare the correct encoding for your web page. We recommend UTF-8, declared with a meta tag in the <head> section of your HTML page.

Example of a UTF-8 meta tag:
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
.....
</head>
There are a few problems with the non-escaping approach:
Visitors can override the encoding specified by the page in their browser.
You still have to escape < > & manually to keep them from breaking your markup.
Your HTML editor can corrupt special characters: a UTF-8 encoded document opened in some editors is saved back as ISO-8859-1.
All of these problems can be avoided by using the correct escape codes.
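
To illustrate what "using the correct escape codes" means, here is a minimal sketch in Java. The class and method names (HtmlEscaper, escapeHtml) are our own invention for this example and are not part of any library or of the escape tool itself. It writes & < > " as named entities and every character above code 127 as a numeric entity, so the result is pure ASCII and displays the same whatever encoding the browser or editor assumes.

```java
// Hypothetical helper for illustration only; not taken from any library.
final class HtmlEscaper {

    // Returns a pure-ASCII version of the text that is safe to place in HTML.
    static String escapeHtml(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);      // full code point, also beyond the BMP
            i += Character.charCount(cp);
            switch (cp) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default:
                    if (cp > 127) {
                        // Numeric entity, e.g. code point 233 becomes &#233;
                        out.append("&#").append(cp).append(';');
                    } else {
                        out.append((char) cp);
                    }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints: if (a &lt; b &amp;&amp; b &gt; c)
        System.out.println(escapeHtml("if (a < b && b > c)"));
    }
}
```

The ampersand is handled in the same pass as the other characters, so an entity that is already present in the input is escaped again; run the text through such a helper only once.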


How do I post source code or display HTML tags on a web page?

You can post source code on an HTML page by using our online HTML escape tool.
Simply paste your code into the escape tool, then insert the escaped source code into your web page.
There is one problem, however: browsers collapse leading whitespace, so all indentation is lost. We recommend restoring the indentation afterwards with the &nbsp; entity (non-breaking space).

When posting in WordPress, escape the source code and paste it into the text editor instead of the WYSIWYG editor.
We also recommend placing your source code inside a <code></code> element.
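
The steps above (escape the code, restore the indentation with &nbsp;, wrap the result in a <code> element) could be automated roughly as follows. This is only a sketch under our own assumptions: the class CodeSnippetFormatter and its methods are hypothetical, lines are assumed to end with \n, a tab is counted as four spaces, and line breaks are emitted as <br> because a plain <code> element does not preserve them.

```java
// Hypothetical helper for illustration only; not the escape tool's own code.
final class CodeSnippetFormatter {

    // Escapes & < > so the browser shows them instead of interpreting them.
    // The ampersand must be replaced first, or the other entities would be escaped twice.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Turns leading spaces and tabs into &nbsp; entities, then escapes the rest of the line.
    static String formatLine(String line) {
        StringBuilder indent = new StringBuilder();
        int i = 0;
        while (i < line.length() && (line.charAt(i) == ' ' || line.charAt(i) == '\t')) {
            // Assumption: one tab is rendered as four non-breaking spaces.
            indent.append(line.charAt(i) == '\t' ? "&nbsp;&nbsp;&nbsp;&nbsp;" : "&nbsp;");
            i++;
        }
        return indent + escape(line.substring(i));
    }

    // Wraps the whole snippet in a <code> element, one <br> per line break.
    static String toHtml(String source) {
        StringBuilder out = new StringBuilder("<code>");
        String[] lines = source.split("\n", -1);
        for (int n = 0; n < lines.length; n++) {
            if (n > 0) {
                out.append("<br>\n");
            }
            out.append(formatLine(lines[n]));
        }
        return out.append("</code>").toString();
    }

    public static void main(String[] args) {
        System.out.println(toHtml("if (a < b) {\n  x();\n}"));
    }
}
```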

Loading data using Ajax messes up my characters.

Make sure the page you receive the data from uses the correct encoding.
We recommend UTF-8, and the encoding should be set server side. In Java,
the following line sets the encoding of the response to UTF-8:
response.setContentType("text/html; charset=utf-8");
We also recommend wrapping the data in a format such as JSON, or in a similar wrapper format.
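
One reason wrapping the data in JSON can help is that JSON allows any character to be written as an ASCII-only escape (a backslash, the letter u and four hex digits), so the payload no longer depends on the encoding of the response. The sketch below shows the idea. JsonAscii and quote are hypothetical names of our own, and in a real application you would normally let a JSON library do this for you.

```java
// Hypothetical helper for illustration only; a JSON library would normally do this.
final class JsonAscii {

    // Returns the value as a JSON string literal that contains only printable ASCII.
    static String quote(String value) {
        StringBuilder out = new StringBuilder("\"");
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c == '"' || c == '\\') {
                // Characters with a special meaning inside a JSON string.
                out.append('\\').append(c);
            } else if (c < 0x20 || c > 0x7e) {
                // Control and non-ASCII characters become four-digit hex escapes.
                // Characters outside the BMP are emitted as two escapes (a surrogate
                // pair), which is how JSON represents them.
                out.append(String.format("\\u%04x", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.append('"').toString();
    }

    public static void main(String[] args) {
        System.out.println(quote("caf\u00e9"));
    }
}
```

On the receiving side, the browser's JSON parser turns the escapes back into the original characters.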

When forms are submitted, special characters are messed up.

This problem can have several causes. As a rule, follow these guidelines when submitting forms:
Always use POST. Forms submitted with GET carry their parameters in the URL, where special characters are frequently mangled.
Specify the encoding of the parameters on the receiving page. In Java, this is done with the following line, before reading any parameters:
request.setCharacterEncoding("UTF-8");
Almost all languages have a similar way of setting the encoding.
Always set the encoding on the page that posts the parameters. This can be done with a simple meta tag, normally something similar to:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
This can also be done server side. In Java, use the line:
response.setContentType("text/html; charset=utf-8");
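
To see why both sides must agree on the encoding, the following sketch simulates a form submission with the standard java.net.URLEncoder and java.net.URLDecoder classes (the Charset overloads used here require Java 10 or later). The class FormEncodingDemo and the sample word are our own; the point is only that parameters encoded as UTF-8 come out garbled when the receiving side assumes ISO-8859-1.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class for illustration only.
final class FormEncodingDemo {

    // What a browser sends for a page that is declared as UTF-8.
    static String encode(String typed) {
        return URLEncoder.encode(typed, StandardCharsets.UTF_8);
    }

    // Receiving side that was told the parameters are UTF-8.
    static String decodeCorrectly(String sent) {
        return URLDecoder.decode(sent, StandardCharsets.UTF_8);
    }

    // Receiving side that wrongly assumes ISO-8859-1.
    static String decodeWrongly(String sent) {
        return URLDecoder.decode(sent, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String typed = "sm\u00f8rrebr\u00f8d";       // contains the letter o with a stroke
        String sent = encode(typed);
        System.out.println(sent);                    // prints: sm%C3%B8rrebr%C3%B8d
        System.out.println(decodeCorrectly(sent).equals(typed)); // prints: true
        System.out.println(decodeWrongly(sent).equals(typed));   // prints: false
    }
}
```

With the wrong charset, each two-byte character is read as two separate characters, which is the typical "messed up" result seen after a form submission.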

We have also seen Apache/Tomcat servers with mod_rewrite that mistreated POST parameters, and servers that were running with a 7-bit character set.
Such problems can be hard to fix; the solution is either to contact a skilled admin or to search Google for answers.

Why should I always use UTF-8?

UTF-8 has some great advantages: it can represent every Unicode character, and it is compatible with "old school" ASCII characters (character codes below 128).
We live in a multicultural world with a lot of special characters,
but most of the world uses the standard characters of the English-speaking world.
A UTF-8 character takes one byte when its character code is below 128, and two to four bytes when its character code is 128 or above.
Since most text on the Internet is written with standard English characters, a UTF-8 document is almost the same size as a plain ASCII document, yet it still supports special characters.
The UTF-8 character set is a true superset of the ASCII character set,
and therefore documents in ASCII format do not need conversion.
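
The byte counts above are easy to check with the standard String.getBytes method. In this small sketch the class Utf8Sizes and its method names are our own, and the non-ASCII sample characters are written as Unicode escapes so that the source file itself stays plain ASCII.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class for illustration only.
final class Utf8Sizes {

    // Number of bytes the text occupies when stored as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // True when the text produces exactly the same bytes in ASCII and in UTF-8.
    static boolean sameBytesAsAscii(String s) {
        return Arrays.equals(s.getBytes(StandardCharsets.US_ASCII),
                             s.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        System.out.println(utf8Length("A"));                   // prints: 1
        System.out.println(utf8Length("\u00e9"));              // e with acute accent, prints: 2
        System.out.println(utf8Length("\u20ac"));              // euro sign, prints: 3
        System.out.println(sameBytesAsAscii("Hello, world"));  // prints: true
    }
}
```

The last line shows why an existing ASCII document needs no conversion: its bytes are already valid UTF-8.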