1
0
mirror of https://github.com/php/php-src.git synced 2026-04-16 04:21:18 +02:00
Commit Graph

273 Commits

Author SHA1 Message Date
Máté Kocsis
1f48feebb9 Improve some TypeError and ValueError messages
Closes GH-5377
2020-04-14 14:38:45 +02:00
Máté Kocsis
c5fb4f0794 Generate function entries from stubs for a couple of extensions
Migrates ext/standard, ext/tidy, ext/tokenizer,
ext/xml, ext/xml_reader, and ext/xml_writer. Closes GH-5381.
2020-04-14 11:49:02 +02:00
Alex Dowad
80598f1250 Syntax errors caused by unclosed {, [, ( mention specific location
Aside from a few very specific syntax errors for which detailed exceptions are
thrown, generally PHP just emits the default error messages generated by bison on syntax
error. These messages are very uninformative; they just say "Unexpected ... at line ...".

This is most problematic with constructs which can span an arbitrary number of lines, such
as blocks of code delimited by { }, 'if' conditions delimited by ( ), and so on. If a closing
delimiter is missed, the block will run for the entire remainder of the source file (which
could be thousands of lines), and then at the end, a parse error will be thrown with the
dreaded words: "Unexpected end of file".

Therefore, track the positions of opening and closing delimiters and ensure that they match
up correctly. If any mismatch or missing delimiter is detected, immediately throw a parse
error which points the user to the offending line. This is best done in the *lexer* and not
in the parser.

Thanks to Nikita Popov and George Peter Banyard for suggesting improvements.

Fixes bug #79368.
Closes GH-5364.
2020-04-14 11:22:23 +02:00
Máté Kocsis
3709e74b5e Store default parameter values of internal functions in arg info
Closes GH-5353. From now on, PHP will have reflection information
about default values of parameters of internal functions.

Co-authored-by: Nikita Popov <nikita.ppv@gmail.com>
2020-04-08 18:37:51 +02:00
Nikita Popov
5a09b9fb0f Add PhpToken class
RFC: https://wiki.php.net/rfc/token_as_object

Relative to the RFC, this also adds a __toString() method,
as discussed on list.

Closes GH-5176.
2020-03-26 11:09:18 +01:00
George Peter Banyard
c9db32271a Remove deprecated (real) cast
Closes GH-5220
2020-03-12 15:40:21 +01:00
Tyson Andre
3d8342aa3c [skip ci] Skip 2 tokenizer tests if tokenizer isn't loaded
`./configure --disable-tokenizer` can disable tokenizer

Closes GH-5184
2020-02-16 19:22:22 -05:00
Nikita Popov
8a4068988b Clarify that token_get_all() never returns false
It can only fail in TOKEN_PARSE mode, in which case it will throw.
2020-02-14 16:50:12 +01:00
Nikita Popov
f8d795820e Reindent phpt files 2020-02-03 22:52:20 +01:00
Máté Kocsis
2015c7a48e Fix another batch of indentation in tests 2020-02-02 23:33:40 +01:00
Máté Kocsis
8b36be268d Fix indentation and whitespaces in tests
In preparation for GH-5074
2020-01-31 17:47:14 +01:00
Christoph M. Becker
dabc28d182 Fix #78880: Spelling error report
We fix the most often occuring typos according to a recent codespell
report[1] in tests, code comments and documentation.

[1] <https://fossies.org/linux/test/php-src-master-f8f48ce.191129.tar.gz/codespell.html>.
2019-12-21 11:58:00 +01:00
Máté Kocsis
27e83d0fb8 Add union return types for function stubs 2019-11-11 14:54:55 +01:00
Nikita Popov
4e563e6c95 Merge branch 'PHP-7.4'
* PHP-7.4:
  Fix handling of overflowing invalid octal in tokenizer
2019-10-14 16:37:27 +02:00
Nikita Popov
641f9615cc Fix handling of overflowing invalid octal in tokenizer
If token_get_all() is used, we still need to correctly distinguish
LNUMBER vs DNUMBER here for backwards compatibility.
2019-10-14 16:36:27 +02:00
Nikita Popov
d44cf9b539 Replace "unexpected character" warning with ParseError
Closes GH-4767.
2019-10-04 11:28:58 +02:00
Nikita Popov
7e77617533 Merge branch 'PHP-7.4' 2019-09-30 10:42:47 +02:00
Nikita Popov
19e7e4b197 Fixed bug #78604
<?php followed by EOF is valid since PHP 7.4.
2019-09-30 10:41:14 +02:00
Nikita Popov
1593f79ecd Merge branch 'PHP-7.4' 2019-09-28 21:30:15 +02:00
Tyson Andre
7e0f7b7677 Reduce memory used by token_get_all()
Around a quarter of all strings in array tokens would have a string that's one
character long (e.g. ` `, `\`, `1`)

For parsing a large number of php files,
The memory increase dropped from 378374248 to 369535688 (2.5%)

Closes GH-4753.
2019-09-28 21:29:58 +02:00
Gabriel Caruso
5d6e923d46 Remove mention of PHP major version in Copyright headers
Closes GH-4732.
2019-09-25 14:51:43 +02:00
Nikita Popov
3372a24b99 Merge branch 'PHP-7.4' 2019-09-14 12:10:15 +02:00
Nikita Popov
3f76f9416f Fix double-free on invalid large octal with separators
To clean up the mess here a bit, check for invalid octal digits
with an explicit loop instead of mixing this into the string to
number conversion.

Also clean up some type usage.
2019-09-14 12:10:06 +02:00
Christoph M. Becker
68edbbfe76 Merge branch 'PHP-7.4'
* PHP-7.4:
  Add the last missing SKIPIF
2019-09-04 08:54:07 +02:00
Fabien Villepinte
ced5bb7d88 Add the last missing SKIPIF 2019-09-04 08:53:35 +02:00
Nikita Popov
a8792887d9 Merge branch 'PHP-7.4' 2019-08-27 22:01:59 +02:00
Nikita Popov
e5c7f71004 Don't specify precedence for T_INC/T_DEC
As these do not operate on expressions, precedence is meaningless
for them.
2019-08-27 21:59:56 +02:00
Stephen Reay
cfbcd4caef Arginfo stubs for tokenizer 2019-08-11 19:03:04 +02:00
Nikita Popov
ce972ba349 Merge branch 'PHP-7.4' 2019-07-16 11:54:40 +02:00
Nikita Popov
c9acc90186 Support <?php followed by EOF
This is an annoying edge-case for canonicalization.
2019-07-16 11:53:48 +02:00
Nikita Popov
193bcf9650 Merge branch 'PHP-7.4' 2019-07-15 12:52:18 +02:00
Nikita Popov
9ad094e371 Emit T_BAD_CHARACTER for unexpected characters
Avoid having holes in the token stream which are annoying and
inefficient to reconstruct on the consumer side.
2019-07-15 12:51:01 +02:00
Nikita Popov
a89b959320 Merge branch 'PHP-7.4' 2019-07-12 17:22:58 +02:00
Nikita Popov
0d568b9fd5 Don't split T_INLINE_HTML at partial PHP tag
If <?php occurs without required trailing whitespace, we should keep
it as part of a single T_INLINE_HTML region.
2019-07-12 17:22:11 +02:00
Nikita Popov
dc18af96f9 Merge branch 'PHP-7.4' 2019-06-17 12:44:42 +02:00
George Peter Banyard
b2d6d29632 Remove unnecessary short_open_tags use in tokenizer test 2019-06-17 12:43:00 +02:00
Nikita Popov
d570620917 Merge branch 'PHP-7.4' 2019-06-11 16:21:57 +02:00
George Peter Banyard
7f5f277cf2 Remove unnecessary short_open_tag INI directive in tests
Closes GH-4249.
2019-06-11 16:14:10 +02:00
Peter Kokot
4251600203 Merge branch 'PHP-7.4'
* PHP-7.4:
  Enhance the tokenizer data generator script
2019-05-14 21:51:51 +02:00
Peter Kokot
b4dec0e11d Enhance the tokenizer data generator script
Changes:
- executable from any location (for example, project root)
- some minor common shell scripts CS fixes
- error reporting done based on the presence of the parser file
2019-05-14 21:50:29 +02:00
Peter Kokot
2cf90bb2f0 Merge branch 'PHP-7.4'
* PHP-7.4:
  Normalize comments in *nix build system m4 files
2019-05-12 18:51:50 +02:00
Peter Kokot
75fb74860d Normalize comments in *nix build system m4 files
Normalization include:
- Use dnl for everything that can be ommitted when configure is built in
  favor of the shell comment character # which is visible in the output.
- Line length normalized to 80 columns
- Dots for most of the one line sentences
- Macro definitions include similar pattern header comments now
2019-05-12 18:43:03 +02:00
Nikita Popov
79f41944ba Merge branch 'PHP-7.4' 2019-05-02 15:07:04 +02:00
Nikita Popov
f3e5bbe6f3 Implement arrow functions
Per RFC: https://wiki.php.net/rfc/arrow_functions_v2

Co-authored-by: Levi Morrison <levim@php.net>
Co-authored-by: Bob Weinand <bobwei9@hotmail.com>
2019-05-02 15:04:03 +02:00
Nikita Popov
0720313bd4 Merge branch 'PHP-7.4' 2019-03-28 09:31:18 +01:00
Nikita Popov
7f72d771e8 Revert "Switch to bison location tracking"
This reverts commit e528762c1c.

Dmitry reports that this has a non-trivial impact on parsing
overhead, especially on 32-bit systems. As we don't have a strong
need for this change right now, I'm reverting it.

See also comments on
e528762c1c.
2019-03-28 09:29:08 +01:00
Peter Kokot
90b70b2626 Merge branch 'PHP-7.4'
* PHP-7.4:
  Fix tokenizer_data_gen.sh for non-posix bison
2019-03-24 01:54:46 +01:00
Guilliam Xavier
516d27775d Fix tokenizer_data_gen.sh for non-posix bison
And run it to update tokenizer_data.c after recent changes in
zend_language_parser.y that reordered some tokens
2019-03-24 01:53:59 +01:00
Nikita Popov
6b7545c225 Merge branch 'PHP-7.4' 2019-03-21 16:28:19 +01:00
Nikita Popov
e528762c1c Switch to bison location tracking
Locations for AST nodes are now tracked with the help of bison
location tracking. This is more accurate than what we currently do
and easier to extend with more information.

A zend_ast_loc structure is introduced, which is used for the location
stack. Currently it only holds the start lineno, but can be extended
to also hold end lineno and offset/column information in the future.

All AST constructors now accept a zend_ast_loc* as first argument, and
will use it to determine their lineno. Previously this used either the
CG(zend_lineno), or the smallest AST lineno of child nodes.

On the parser side, the location structure for a whole rule can be
obtained using the &@$ character salad.
2019-03-21 16:27:48 +01:00