When a tar phar is created, `phar_open_from_fp()` is also called, but
since the file has just been created, none of the format checks can
succeed, so we continue to loop, but must not check again for the
format. Therefore, we bring back the old `test` variable.
Closes GH-9620.
The phar wrapper needs to uncompress the file; the uncompressed file
might be compressed, so the wrapper implementation loops. This raises
potential DOS issues regarding too deep or even infinite recursion (the
latter are called compressed file quines[1]). We avoid that by
introducing a recursion limit; we choose the somewhat arbitrary limit
`3`.
This issue has been reported by real_as3617 and gPayl0ad.
[1] <https://honno.dev/gzip-quine/>
It is insufficient to check whether the `base` is contained in `fname`;
we also need to ensure that `fname` is properly separated. And of
course, `fname` has to start with `base`.
Firstly, we must not forget to set appropriate error codes for "manual"
checks in `virtual_file_ex()`.
Secondly, we must not call `php_error_docref2()` for warnings regarding
unary functions; thus, we introduce `php_win32_docref1_from_error()`.
Closes GH-6872.
When Phars are flushed, a new temporary file is created for each entry
which should be compressed, and the `compressed_filesize` is retrieved.
Afterwards, the Phar manifest is written, and only after that the files
are copied to the actual Phar. So for each such entry there is an open
temp file, what easily exceeds the limit.
Therefore, we use a single temporary file for all entries, and store
the start offset in the otherwise unused `header_offset` member. We
ensure that the `cfp` members are properly set to NULL even if flushing
fails, to avoid use after free scenarios.
This solution is based on a suggestion by @lserni[1].
Closes GH-6643.
[1] <https://github.com/box-project/box2/issues/80#issuecomment-77147371>
The default encoding of filenames in a ZIP archive is IBM Code Page
437. Phar, however, only supports UTF-8 filenames. Therefore we have
to mark filenames as being stored in UTF-8 by setting the general
purpose bit 11 (the language encoding flag).
The effect of not setting this bit for non ASCII filenames can be seen
in popular tools like 7-Zip and UnZip, but not when extracting the
archives via ext/phar (which is agnostic to the filename encoding), or
via ext/zip (which guesses the encoding). Thus we add a somewhat
brittle low-level test case.
Closes GH-6630.
When extracting compressed files from an uncompressed Phar, we must not
use the direct file pointer, but rather get an uncompressed file
pointer.
We also add a test to show that deflated and stored entries are
properly extracted.
This also fixes#79912, which appears to be a duplicate of #69279.
Co-authored-by: Anna Filina <afilina@gmail.com>
Closes GH-6599.
We must not assume that the first end of central dir signature in a ZIP
archive actually designates the end of central directory record, since
the data in the archive may contain arbitrary byte patterns. Thus, we
better search from the end of the data, what is also slightly more
efficient.
There is, however, no way to detect the end of central directory
signature by searching from the end of the ZIP archive with absolute
certainty, since the signature could be part of the trailing comment.
To mitigate, we check that the comment length fits to the found
position, but that might still not be the correct position in rare
cases.
Closes GH-6507.
`phar_path_check()` already strips a leading slash, so we must not
attempt to strip the trailing slash from an now empty directory name.
Closes GH-6508.
Apparently, there are broken tarballs out there which are actually in
ustar format, but did not write the `ustar` marker. Since popular tar
tools like GNU tar and 7zip have no issues dealing with such tarballs,
Phar should also be more resilient.
Thus, when the first checksum check of a tarball in (presumed) in old-
style format fails, we check whether the checksum would be suitable for
ustar format; if so, we treat the tarball as being in ustar format.
Closes GH-6479.
Phar signatures practically are of limited size; for the MD5 and SHA
hashes the size is fixed (at most 64 bytes for SHA512); for OpenSSL
public keys there is no size limit in theory, but "64 KiB ought to be
good enough for anybody". So we check for that limit, to avoid fatal
errors due to out of memory conditions.
Since it is neither possible to have the signature compressed in the
ZIP archive, nor is it possible to manually add a signature via Phar,
we use ZipArchive to create a suitable archive for the test on the fly.
Closes GH-6474.
Currently ./configure --enable-phar --program-suffix=7.4 will
result in binaries named php7.4 and phar but should instead
result in php7.4 and phar7.4
Closes GH-5650.
We have to properly clean up in case phar_flush() is failing.
We also make the expectation of the respective test case less liberal
to avoid missing such bugs in the future.
The php_stream_read() and php_stream_write() functions now return
an ssize_t value, with negative results indicating failure. Functions
like fread() and fwrite() will return false in that case.
As a special case, EWOULDBLOCK and EAGAIN on non-blocking streams
should not be regarded as error conditions, and be reported as
successful zero-length reads/writes instead. The handling of EINTR
remains unclear and is internally inconsistent (e.g. some code-paths
will automatically retry on EINTR, while some won't).
I'm landing this now to make sure the stream wrapper ops API changes
make it into 7.4 -- however, if the user-facing changes turn out to
be problematic we have the option of clamping negative returns to
zero in php_stream_read() and php_stream_write() to restore the
old behavior in a relatively non-intrusive manner.