- Treat truncated multi-byte characters as an error.
- Don't allow ASCII control characters to appear in the middle of a
multi-byte character.
- Handle ~ escapes according to the HZ standard (RFC 1843).
- Treat unrecognized ~ escapes as an error.
- Multi-byte characters (between ~{ ~} escapes) are GB2312, not CP936.
(CP936 is an extended version from MicroSoft, but the RFC does not
state that this extended version of GB should be used.)