One faulty test in the suite passed only because UTF-16 had no identify
filter. Once the filter was added, the problem with the test was exposed.
- Make everything less gratuitously verbose
- Don't litter the code with lots of unneeded NULL checks (for things which
will never be NULL)
- Don't return success/failure code from functions which can never fail
- For encoding structs, don't use pointers to pointers to pointers for the
list of alias strings. Pointers to pointers (2 levels of indirection)
is what actually makes sense. This gets rid of some extraneous
dereference operations.
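
As a rough illustration of the two-level layout, here is a minimal sketch. The
struct, its fields, and the helper (example_encoding, example_has_alias) are
hypothetical names made up for this note, not the actual libmbfl definitions.
It also assumes every encoding carries a NULL-terminated alias list, possibly
empty.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified encoding struct (not the real libmbfl one). */
typedef struct {
    const char *name;
    const char **aliases; /* NULL-terminated array of C strings: 2 levels */
} example_encoding;

static const char *example_utf8_aliases[] = {"utf8", NULL};
static const example_encoding example_utf8 = {"UTF-8", example_utf8_aliases};

/* Walk the alias list directly; each step is a single dereference.
 * Assumption: enc->aliases is never NULL. */
static int example_has_alias(const example_encoding *enc, const char *alias)
{
    for (const char **p = enc->aliases; *p != NULL; p++) {
        if (strcmp(*p, alias) == 0) {
            return 1;
        }
    }
    return 0;
}
```

With a third level of indirection, the struct would hold a pointer to the
array pointer, and every lookup would pay one extra dereference for no benefit.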
The check ensures that the decoded codepoint is between 0x10000 and 0x10FFFF,
which is the range that a UTF-16 surrogate pair can encode.
However, just looking at the code, it's obvious that this will be true.
First of all, 0x10000 is added to the decoded codepoint on the previous
line, so how could it be less than 0x10000?
Further, even if the 20 data bits already decoded were 0xFFFFF (all ones),
when you add 0x10000, it comes to 0x10FFFF, which is the very top of the
valid range. So how could the decoded codepoint be more than 0x10FFFF?
It can't.
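
The arithmetic can be sketched as follows. This is a hypothetical helper
(example_decode_surrogate_pair) written for this note, not the libmbfl code. It
assumes the caller has already confirmed that the two code units are a high
and a low surrogate.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: combine a high and a low surrogate into a codepoint. */
static uint32_t example_decode_surrogate_pair(uint16_t hi, uint16_t lo)
{
    /* Precondition (assumed to be checked by the caller). */
    assert(hi >= 0xD800 && hi <= 0xDBFF);
    assert(lo >= 0xDC00 && lo <= 0xDFFF);

    /* 10 data bits from each surrogate: 20 bits, so 0x00000..0xFFFFF. */
    uint32_t bits = ((uint32_t)(hi & 0x3FF) << 10) | (uint32_t)(lo & 0x3FF);

    /* Smallest result: 0x00000 + 0x10000 = 0x10000.
     * Largest result:  0xFFFFF + 0x10000 = 0x10FFFF. */
    return bits + 0x10000;
}
```

Because the masked value never exceeds 20 bits, no range check on the result
can ever fail.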
This is a default destructor for mbfl_convert_filter structs. The thing is: there
isn't really anything that needs to be done to those structs before freeing them.
The default destructor just zeroed out some fields, but there's no reason why
we should actually do that.
Man, I can be pedantic sometimes. Tiny little things like misspelled words just
hurt me inside. So while it's not really a big deal, I couldn't leave these typos
alone...
These were unused, and almost certainly will never be used:
- MBFL_ENCTYPE_MWC4BE
- MBFL_ENCTYPE_MWC4LE
- MBFL_ENCTYPE_SHFTCODE
- MBFL_ENCTYPE_ENC_STRM
Some encodings were marked with the latter two flags, but nothing ever
_checked_ them.
This is just a very silly feature of mbstring -- you can compile the source files with
HAVE_MBSTRING undefined, and it will all just compile to (almost) nothing. What is the
use of this? Why compile the source files and link against them if you don't want the
mbstring extension? It doesn't make any kind of sense.
Very interesting... it turns out that when Valgrind support was enabled,
`#include "config.h"` from within mbstring was actually including the file "config.h"
from Valgrind, and not the one from mbstring!!
This is because -I/usr/include/valgrind was added to the compiler invocation _before_
-Iext/mbstring/libmbfl.
Make sure we actually include the file which was intended.
This function uses various subfunctions to convert case of Unicode wchars.
Previously, these subfunctions would store the case-converted characters in
a buffer, and the parent function would then pass them (byte by byte) to
the next filter in the filter chain.
Rather than passing around that buffer, it's better for the subfunctions to
directly pass the case-converted bytes to the next filter in the filter chain.
This speeds things up nicely.
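
The shape of the change can be sketched roughly like this. The types and
functions below (example_filter, example_collect, example_upper) are
hypothetical and heavily simplified; they are not the real mbfl_convert_filter
API, and the case mapping only covers ASCII plus one multi-character example.

```c
#include <stddef.h>

/* Hypothetical, simplified filter (not the real mbfl_convert_filter). */
typedef struct example_filter example_filter;
struct example_filter {
    void (*output)(int c, example_filter *self);
    int received[8]; /* what a collecting sink has seen so far */
    size_t count;
};

/* A sink that stands in for "the next filter in the chain". */
static void example_collect(int c, example_filter *self)
{
    if (self->count < 8) {
        self->received[self->count++] = c;
    }
}

/* Subfunction: emits straight to the next filter. There is no intermediate
 * buffer for the parent function to copy out of afterwards. */
static void example_upper(int c, example_filter *next)
{
    if (c == 0xDF) {
        /* U+00DF uppercases to two characters, "SS". */
        next->output('S', next);
        next->output('S', next);
    } else if (c >= 'a' && c <= 'z') {
        next->output(c - 'a' + 'A', next);
    } else {
        next->output(c, next);
    }
}
```

A mapping that yields several characters simply calls the next filter several
times, so the caller no longer needs to know how many characters came back.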
Rather than using a magic boolean parameter to choose different behavior of
the subfunction, inline it. The code size doesn't really grow anyway, and
these will soon be trimmed down further.