mirror of
https://github.com/php/doc-zh.git
synced 2026-03-25 23:52:17 +01:00
267 lines
7.4 KiB
XML
267 lines
7.4 KiB
XML
<?xml version="1.0" encoding="utf-8"?>
|
||
<!-- $Revision$ -->
|
||
<!-- EN-Revision: 9b1673cf114a1e10c4563ab9223cb56aed552b89 Maintainer: Gregory Status: ready -->
|
||
<!-- CREDITS: mowangjuanzi, Luffy -->
|
||
<refentry xmlns="http://docbook.org/ns/docbook" xml:id="function.utf8-decode">
|
||
<refnamediv>
|
||
<refname>utf8_decode</refname>
|
||
<refpurpose>
|
||
将字符串从 UTF-8 转换为 ISO-8859-1,替换无效或者无法表示的字符。
|
||
</refpurpose>
|
||
</refnamediv>
|
||
|
||
<refsynopsisdiv>
|
||
&warn.deprecated.function-8-2-0;
|
||
</refsynopsisdiv>
|
||
|
||
<refsect1 role="description">
|
||
&reftitle.description;
|
||
<methodsynopsis>
|
||
<modifier role="attribute">#[\Deprecated]</modifier>
|
||
<type>string</type><methodname>utf8_decode</methodname>
|
||
<methodparam><type>string</type><parameter>string</parameter></methodparam>
|
||
</methodsynopsis>
|
||
<para>
|
||
该函数将字符串 <parameter>string</parameter> 从 <literal>UTF-8</literal> 编码转换为
|
||
<literal>ISO-8859-1</literal>。字符串中不是有效 UTF-8 字节或者不存在于 <literal>ISO-8859-1</literal>
|
||
的 <literal>UTF-8</literal> 字符(即 <literal>U+00FF</literal> 以上的码点)将转化为 <literal>?</literal>。
|
||
</para>
|
||
|
||
<note>
|
||
<para>
|
||
Many web pages marked as using the <literal>ISO-8859-1</literal> character
|
||
encoding actually use the similar <literal>Windows-1252</literal> encoding,
|
||
and web browsers will interpret <literal>ISO-8859-1</literal> web pages as
|
||
<literal>Windows-1252</literal>. <literal>Windows-1252</literal> features
|
||
additional printable characters, such as the Euro sign
|
||
(<literal>€</literal>) and curly quotes (<literal>“</literal>
|
||
<literal>”</literal>), instead of certain <literal>ISO-8859-1</literal>
|
||
control characters. This function will not convert such
|
||
<literal>Windows-1252</literal> characters correctly. Use a different
|
||
function if <literal>Windows-1252</literal> conversion is required.
|
||
</para>
|
||
</note>
|
||
</refsect1>
|
||
|
||
<refsect1 role="parameters">
|
||
&reftitle.parameters;
|
||
<para>
|
||
<variablelist>
|
||
<varlistentry>
|
||
<term><parameter>string</parameter></term>
|
||
<listitem>
|
||
<para>
|
||
UTF-8 编码的字符串。
|
||
</para>
|
||
</listitem>
|
||
</varlistentry>
|
||
</variablelist>
|
||
</para>
|
||
</refsect1>
|
||
|
||
<refsect1 role="returnvalues">
|
||
&reftitle.returnvalues;
|
||
<para>
|
||
返回 <parameter>string</parameter> 的 ISO-8859-1 翻译。
|
||
</para>
|
||
</refsect1>
|
||
|
||
<refsect1 role="changelog">
|
||
&reftitle.changelog;
|
||
<para>
|
||
<informaltable>
|
||
<tgroup cols="2">
|
||
<thead>
|
||
<row>
|
||
<entry>&Version;</entry>
|
||
<entry>&Description;</entry>
|
||
</row>
|
||
</thead>
|
||
<tbody>
|
||
<row>
|
||
<entry>8.2.0</entry>
|
||
<entry>
|
||
弃用此函数。
|
||
</entry>
|
||
</row>
|
||
<row>
|
||
<entry>7.2.0</entry>
|
||
<entry>
|
||
This function has been moved from the XML extension to the core of PHP.
|
||
In previous versions, it was only available if the XML extension was installed.
|
||
</entry>
|
||
</row>
|
||
</tbody>
|
||
</tgroup>
|
||
</informaltable>
|
||
</para>
|
||
</refsect1>
|
||
|
||
<refsect1 role="examples">
|
||
&reftitle.examples;
|
||
<example>
|
||
<title>基础示例</title>
|
||
<programlisting role="php">
|
||
<![CDATA[
|
||
<?php
|
||
// Convert the string 'Zoë' from UTF-8 to ISO 8859-1
|
||
$utf8_string = "\x5A\x6F\xC3\xAB";
|
||
$iso8859_1_string = utf8_decode($utf8_string);
|
||
echo bin2hex($iso8859_1_string), "\n";
|
||
|
||
// Invalid UTF-8 sequences are replaced with '?'
|
||
$invalid_utf8_string = "\xC3";
|
||
$iso8859_1_string = utf8_decode($invalid_utf8_string);
|
||
var_dump($iso8859_1_string);
|
||
|
||
// Characters which don't exist in ISO 8859-1, such as
|
||
// '€' (Euro Sign) are also replaced with '?'
|
||
$utf8_string = "\xE2\x82\xAC";
|
||
$iso8859_1_string = utf8_decode($utf8_string);
|
||
var_dump($iso8859_1_string);
|
||
?>
|
||
]]>
|
||
</programlisting>
|
||
&example.outputs;
|
||
<screen>
|
||
<![CDATA[
|
||
5a6feb
|
||
string(1) "?"
|
||
string(1) "?"
|
||
]]>
|
||
</screen>
|
||
</example>
|
||
</refsect1>
|
||
|
||
<refsect1 role="notes">
|
||
&reftitle.notes;
|
||
<note>
|
||
<title>弃用和替代方案</title>
|
||
<para>
|
||
从 PHP 8.2.0 开始,<emphasis>弃用</emphasis>此函数,并将在未来的版本中删除。应检查现有用途并用适当的替代方案。
|
||
</para>
|
||
<para>
|
||
类似的功能可以通过 <function>mb_convert_encoding</function> 实现,支持 ISO-8859-1 和许多其他字符编码。
|
||
<informalexample>
|
||
<programlisting role="php">
|
||
<![CDATA[
|
||
<?php
|
||
$utf8_string = "\xC3\xAB"; // 'ë' (e with diaeresis) in UTF-8
|
||
$iso8859_1_string = mb_convert_encoding($utf8_string, 'ISO-8859-1', 'UTF-8');
|
||
echo bin2hex($iso8859_1_string), "\n";
|
||
|
||
$utf8_string = "\xCE\xBB"; // 'λ' (Greek lower-case lambda) in UTF-8
|
||
$iso8859_7_string = mb_convert_encoding($utf8_string, 'ISO-8859-7', 'UTF-8');
|
||
echo bin2hex($iso8859_7_string), "\n";
|
||
|
||
$utf8_string = "\xE2\x82\xAC"; // '€' (Euro sign) in UTF-8 (not present in ISO-8859-1)
|
||
$windows_1252_string = mb_convert_encoding($utf8_string, 'Windows-1252', 'UTF-8');
|
||
echo bin2hex($windows_1252_string), "\n";
|
||
?>
|
||
]]>
|
||
</programlisting>
|
||
&example.outputs;
|
||
<screen>
|
||
<![CDATA[
|
||
eb
|
||
eb
|
||
80
|
||
]]>
|
||
</screen>
|
||
</informalexample>
|
||
</para>
|
||
<para>
|
||
根据安装的扩展,其他有效选项是
|
||
<methodname>UConverter::transcode</methodname> 和 <function>iconv</function>。
|
||
</para>
|
||
<para>
|
||
以下都给出相同的结果:
|
||
<informalexample>
|
||
<programlisting role="php">
|
||
<![CDATA[
|
||
<?php
|
||
$utf8_string = "\x5A\x6F\xC3\xAB"; // 'Zoë' in UTF-8
|
||
$iso8859_1_string = utf8_decode($utf8_string);
|
||
echo bin2hex($iso8859_1_string), "\n";
|
||
|
||
$iso8859_1_string = mb_convert_encoding($utf8_string, 'ISO-8859-1', 'UTF-8');
|
||
echo bin2hex($iso8859_1_string), "\n";
|
||
|
||
$iso8859_1_string = iconv('UTF-8', 'ISO-8859-1', $utf8_string);
|
||
echo bin2hex($iso8859_1_string), "\n";
|
||
|
||
$iso8859_1_string = UConverter::transcode($utf8_string, 'ISO-8859-1', 'UTF8');
|
||
echo bin2hex($iso8859_1_string), "\n";
|
||
?>
|
||
]]>
|
||
</programlisting>
|
||
&example.outputs;
|
||
<screen>
|
||
<![CDATA[
|
||
5a6feb
|
||
5a6feb
|
||
5a6feb
|
||
5a6feb
|
||
]]>
|
||
</screen>
|
||
</informalexample>
|
||
针对无效的字符串或者不能用 ISO 8859-1 表示的字符串,指定 <literal>'?'</literal> 作为
|
||
<methodname>UConverter::transcode</methodname> 的 <literal>'to_subst'</literal>
|
||
选项,将获得同 <function>utf8_decode</function> 相同的结果。
|
||
<informalexample>
|
||
<programlisting role="php">
|
||
<![CDATA[
|
||
<?php
|
||
$utf8_string = "\xE2\x82\xAC"; // € (Euro Sign) does not exist in ISO 8859-1
|
||
$iso8859_1_string = UConverter::transcode(
|
||
$utf8_string, 'ISO-8859-1', 'UTF-8', ['to_subst' => '?']
|
||
);
|
||
var_dump($iso8859_1_string);
|
||
?>
|
||
]]>
|
||
</programlisting>
|
||
&example.outputs;
|
||
<screen>
|
||
<![CDATA[
|
||
sring(1) "?"
|
||
]]>
|
||
</screen>
|
||
</informalexample>
|
||
</para>
|
||
</note>
|
||
</refsect1>
|
||
|
||
<refsect1 role="seealso">
|
||
&reftitle.seealso;
|
||
<para>
|
||
<simplelist>
|
||
<member><function>utf8_encode</function></member>
|
||
<member><function>mb_convert_encoding</function></member>
|
||
<member><methodname>UConverter::transcode</methodname></member>
|
||
<member><function>iconv</function></member>
|
||
</simplelist>
|
||
</para>
|
||
</refsect1>
|
||
|
||
</refentry>
|
||
<!-- Keep this comment at the end of the file
|
||
Local variables:
|
||
mode: sgml
|
||
sgml-omittag:t
|
||
sgml-shorttag:t
|
||
sgml-minimize-attributes:nil
|
||
sgml-always-quote-attributes:t
|
||
sgml-indent-step:1
|
||
sgml-indent-data:t
|
||
indent-tabs-mode:nil
|
||
sgml-parent-document:nil
|
||
sgml-default-dtd-file:"~/.phpdoc/manual.ced"
|
||
sgml-exposed-tags:nil
|
||
sgml-local-catalogs:nil
|
||
sgml-local-ecat-files:nil
|
||
End:
|
||
vim600: syn=xml fen fdm=syntax fdl=2 si
|
||
vim: et tw=78 syn=sgml
|
||
vi: ts=1 sw=1
|
||
-->
|