PhpSpreadsheet/tests
oleibman 5de82981d8
Html Reader Not Handling non-ASCII Data Correctly (#2943)
* Html Reader Not Handling non-ASCII Data Correctly

Fix #2942. Code was changed by #2894 because PHP8.2 will deprecate how it was being done. See linked issue for more details. Dom loadhtml assumes ISO-8859-1 in the absence of a charset attribute or equivalent, and there is no way to override that assumption. Sigh. The suggested replacements are unsuitable in one way or another. I think this will work with minimal disruption (replace ampersand, less than, and greater than with entities representing illegal characters, then use htmlentities, then restore ampersand, less than, and greater than).

* Better Implementation

Use regexp to escape non-ASCII. Less kludgey, less reliant on the vagaries of the PHP maintainers.

* Additional Tests

Test non-ASCII outside of cell contents: sheet title, image alt attribute.

* Apply Same Change in Second Location

Forgot to change loadFromString.

* Additional Test

Confirm escaped ampersand is handled correctly.
2022-07-16 22:08:44 -07:00
..
PhpSpreadsheetTests Html Reader Not Handling non-ASCII Data Correctly (#2943) 2022-07-16 22:08:44 -07:00
data Html Reader Not Handling non-ASCII Data Correctly (#2943) 2022-07-16 22:08:44 -07:00
bootstrap.php 100% Coverage for Calculation/DateTime (#1870) 2021-02-27 20:43:22 +01:00