mirror of https://github.com/php/php-src.git synced 2026-04-29 19:23:22 +02:00

54 Commits

Author SHA1 Message Date
Sara Golemon 30a2bd1d11 Another (and hopefully last) major streams commit.
This moves unicode conversion to the filter layer
(rather than at the lower streams layer)
unicode_filter.c has been moved from ext/unicode to main/streams
as it's an integral part of the streams unicode conversion process.

There are now three ways to set encoding on a stream:

(1) By context
$ctx = stream_context_create(NULL,array('encoding'=>'latin1'));
$fp = fopen('somefile', 'r+t', false, $ctx);

(2) By stream_encoding()
$fp = fopen('somefile', 'r+');
stream_encoding($fp, 'latin1');

(3) By filter
$fp = fopen('somefile', 'r+');
stream_filter_append($fp, 'unicode.from.latin1', STREAM_FILTER_READ);
stream_filter_append($fp, 'unicode.to.latin1', STREAM_FILTER_WRITE);

Note: Methods 1 and 2 are convenience wrappers around method 3.
2006-03-29 01:20:43 +00:00
Andrei Zmievski 3eee3a5fd6 Fix collator instantiation. 2006-03-28 04:33:29 +00:00
Andrei Zmievski cbbfebc428 Fix typos. 2006-03-28 03:28:08 +00:00
Andrei Zmievski b36d2dfef6 Rewrite unicode_encode() and unicode_decode() functions. Apply the new
conversion error semantics.
2006-03-27 03:19:30 +00:00
Andrei Zmievski db50082fe9 Add unicode_get_error_mode() and unicode_get_subst_char(). 2006-03-26 21:22:59 +00:00
Derick Rethans ad6a972de3 - Implemented basic collation support. For some reason "new Collator" gives segfaults when the object's collation resource is used.
- The following example shows what is implemented:

<?php
$orig = $strings = array(
    'côte',
    'cote',
    'côté',
    'coté',
    'fluße',
    'flüße',
);

echo "German phonebook:\n";
$c = collator_create( "de@collation=phonebook" );
foreach($c->sort($strings) as $string) {
    echo $string, "\n";
}
echo $c->getAttribute(Collator::FRENCH_COLLATION) == Collator::ON
    ? "With" : "Without", " french accent sorting order\n";

echo "\nFrench with options:\n";
$c = collator_create( "fr" );
$c->setAttribute(Collator::CASE_FIRST, Collator::UPPER_FIRST);
$c->setAttribute(Collator::CASE_LEVEL, Collator::ON);
$c->setStrength(Collator::SECONDARY);
foreach($c->sort($strings) as $string) {
    echo $string, "\n";
}
echo $c->getAttribute(Collator::FRENCH_COLLATION) == Collator::ON
    ? "With" : "Without", " french accent sorting order\n";
?>
2006-03-26 11:06:24 +00:00
Andrei Zmievski 1709428494 Implement to-Unicode conversion error behavior. Note the adjusted APIs. 2006-03-26 06:19:24 +00:00
Andrei Zmievski c254b21cca Add protos. 2006-03-26 03:33:10 +00:00
Andrei Zmievski 930bde5897 * Remove unicode.from_error_mode and unicode.from_subst_char from INI
settings.
* Add unicode_set_error_mode() and unicode_set_subst_char() functions to
  manipulate these global settings.
2006-03-26 01:48:33 +00:00
Andrei Zmievski fe0cccc003 Use intern->type for break iterator. 2006-03-24 21:06:36 +00:00
Antony Dovgal 9cee8be28e first check for NULL, then use the pointer 2006-03-24 10:21:56 +00:00
Derick Rethans 3056defb26 - Moved strtotitle to ext/standard and implemented the fallback case to
non-unicode with ucwords. There is also an implementation for unicode ucwords
  but that returns different results than strtotitle as it uppercases the
  first character of every word, and doesn't *titlecase* a word. The test case
  shows that.
2006-03-22 10:20:20 +00:00
Derick Rethans 7f7300ae0b - Update windows file too (not tested, but should work). 2006-03-21 13:57:16 +00:00
Derick Rethans c86cf4fbea - Make ext/unicode an extension that is always there and can not be disabled. 2006-03-21 13:56:50 +00:00
Sara Golemon 48798021b5 Refactor streams layer for PHP6.
Don't be frightened by the size of this commit.
A significant portion of it is restoring the read buffer semantics back
to what PHP4/5 use.  (Or a close approximation thereof).

See main/streams/streams.c and ext/standard/file.c for a set of
UTODO comments covering work yet to be done.
2006-03-13 04:40:11 +00:00
Andrei Zmievski 20301a153f Should use word break iteration instead of title, as title one has been
deprecated since Unicode 3.2.
2006-03-02 20:40:45 +00:00
Dmitry Stogov c366cc6d1a Nuke int32_t (everywhere except streams layer) and signed/unsigned warnings 2006-03-02 13:12:45 +00:00
Dmitry Stogov e3b7f3fd0d Unicode support: MS Visual C compatibility 2006-02-26 11:57:14 +00:00
Marcus Boerger bb924b320f - Add test 2006-02-18 17:13:39 +00:00
Marcus Boerger f81239a2b3 - Change to offsetof as suggested by Clayton 2006-02-17 08:24:56 +00:00
Marcus Boerger 45820dd7ea - Little speedup + first test 2006-02-15 21:34:21 +00:00
Marcus Boerger 4e172c21a5 - Change unicode_enabled() to unicode_semantics() per Andrei's suggestion 2006-02-13 19:55:17 +00:00
Dmitry Stogov 09ca61c125 Made server wide switch for unicode on/off (according to PDM). 2006-02-13 10:23:59 +00:00
Marcus Boerger 83f1271312 - Add unicode_enabled() to check whether unicode_semantics is on 2006-02-13 09:20:19 +00:00
Andrei Zmievski 5418ae7976 Implement character/word/line/sentence iterators and the reverse
counterparts.
2006-02-11 00:16:43 +00:00
Andrei Zmievski 086dec2719 Make ReverseTextIterator a separate class. 2006-02-10 00:23:29 +00:00
Andrei Zmievski dfd6f3e3b8 Reverse iteration for combining sequences. 2006-02-08 00:16:50 +00:00
Andrei Zmievski 1a6b00fc01 Implement reverse iteration for codeunits and codepoints. Combining
sequences are next.

# This is ugly, though.
# foreach (new TextIterator($a, TextIterator::CODE_POINT|TextIterator::REVERSE) as $k => $c) {
#    var_dump("$k: $c");
# }
# Any suggestions?
2006-02-07 20:01:28 +00:00
Andrei Zmievski f71a7cb1f4 Implement combining sequences support in TextIterator. 2006-02-07 00:13:54 +00:00
Andrei Zmievski 071835ea59 - Fix up a bunch of stuff.
- Register TextIterator type constants.

# Not sure if I like them as class constants. Cleaner, but also longer
# to type.
2006-02-06 22:58:10 +00:00
Andrei Zmievski 9e07b59f9c Some TODO items. 2006-02-06 18:18:41 +00:00
Andrei Zmievski 1e9fa8c10a Make TextIterator fast again, now that we don't have to worry about
references.
2006-02-06 17:42:28 +00:00
Marcus Boerger c67d8b2152 - Iterator API was changed 2006-02-05 23:31:47 +00:00
Andrei Zmievski 589d28e429 Implement Traversable instead of Iterator. 2006-02-04 00:41:42 +00:00
Andrei Zmievski fe5aac2f41 Add code unit ops. 2006-02-04 00:35:37 +00:00
Andrei Zmievski 4a3bf22b81 Abstract the iterator interface so that we can add new types. 2006-02-04 00:23:52 +00:00
Andrei Zmievski 94e3087be7 Gah. In order to avoid memory corruption when using references in
foreach() this code is necessary. But it makes iterator 6x slower. We
should keep thinking about how to optimize it.
2006-02-03 23:50:42 +00:00
Andrei Zmievski aa7ed0788c Guard against assign-by-ref. 2006-02-03 21:53:05 +00:00
Andrei Zmievski 682ec6e25e Rewrite to use C-level iterators for performance. Also, cache the string
in the iterator object for immutability.
2006-02-03 00:09:19 +00:00
Antony Dovgal d63e26191a fix win32 snapshots 2006-02-02 14:45:54 +00:00
Sebastian Bergmann f3ddda4229 Fix Andrei. 2006-02-02 06:01:27 +00:00
Andrei Zmievski d887f2238b Remove debug message. 2006-02-02 00:05:21 +00:00
Andrei Zmievski 2b763aa305 Check for intern->text before destroying it. 2006-02-01 23:53:53 +00:00
Andrei Zmievski d4c929764a Proof-of-concept for TextIterator. Much more work to be done here. 2006-02-01 23:50:50 +00:00
foobar 251c5173fd bump year and license version 2006-01-01 13:10:10 +00:00
foobar a208d9a966 - Nuke php3 legacy 2005-12-06 02:28:26 +00:00
foobar 9b7a28e9a2 fix configure help 2005-11-10 08:04:57 +00:00
Derick Rethans 6e3d5a9e22 - Rename icu_loc* to i18n_loc*
- Added i18n_strtotitle (name not yet final) - work in progress.
2005-09-14 14:56:01 +00:00
Sara Golemon 23b729dc15 sizeof(char) != sizeof(UChar). Don't tell ucnv_toUnicode it has more space than it really does 2005-08-25 08:59:24 +00:00
foobar dde3f89dd4 Nuked EOL from error message 2005-08-18 12:56:36 +00:00