* Keep Calculated String Results Below 32K
This is the result of an investigation into issue #2884 (see also PR #2913). It is, unfortunately, not a fix for the original problem; see the discussion in that PR for why I don't think there is a practical fix for that specific problem at this time.
Excel limits strings to 32,767 characters. We already truncate strings to that length when they are added to the spreadsheet. However, we have been able to exceed that length as a result of the concatenation operator (Excel truncates); as a result of the CONCATENATE or TEXTJOIN functions (Excel returns #CALC!); or as a result of the REPLACE, REPT, or SUBSTITUTE functions (Excel returns #VALUE!). This PR changes PhpSpreadsheet to return the same value as Excel in these cases. Note that Excel 2003 truncates in all those cases; I don't think there is a way to differentiate that behavior in PhpSpreadsheet.
However, LibreOffice and Gnumeric do not have that limit; if they have a limit at all, it is much higher. It would be fairly easy to use existing settings to differentiate between Excel and LibreOffice/Gnumeric in this respect. I have not done so in this PR because I am not sure how useful that is, and I can easily see it leading to problems (read in a LibreOffice spreadsheet with a 33K cell and then output to an Excel spreadsheet). Perhaps it should be handled with an additional opt-in setting.
I changed the maximum size from a literal to a constant in the one place where it was already being enforced (Cell/DataType). I am not sure that is the best place for it to be defined; I am open to suggestions.
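The three Excel behaviors described above can be sketched roughly as follows. This is only an illustration: the constant and function names are hypothetical and are not PhpSpreadsheet's actual API, and it assumes the mbstring extension is available.

```php
<?php
// Hypothetical names throughout; this is not the library's implementation.
const MAX_STRING_LENGTH = 32767;

// Concatenation operator: Excel silently truncates the result.
function concatOperatorSketch(string $left, string $right): string
{
    return mb_substr($left . $right, 0, MAX_STRING_LENGTH, 'UTF-8');
}

// CONCATENATE / TEXTJOIN: Excel returns #CALC! when the result is too long.
function concatenateSketch(string ...$parts): string
{
    $result = implode('', $parts);

    return mb_strlen($result, 'UTF-8') > MAX_STRING_LENGTH ? '#CALC!' : $result;
}

// REPT (REPLACE and SUBSTITUTE behave alike): Excel returns #VALUE!.
function reptSketch(string $text, int $count): string
{
    // A negative count is also rejected here; str_repeat would throw on it.
    if ($count < 0 || mb_strlen($text, 'UTF-8') * $count > MAX_STRING_LENGTH) {
        return '#VALUE!';
    }

    return str_repeat($text, $count);
}
```

Checking the length before calling `str_repeat` avoids building a very large string only to discard it.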
* Implement Some Suggestions
... from @MarkBaker.
Fix #2897. We have been relying on iconv/mb_convert_encoding to detect invalid UTF-8, but all techniques designed to validate UTF-8 seem to accept FFFE and FFFF. This PR explicitly converts those characters to FFFD (the Unicode replacement character) before validating the rest of the string. It also substitutes one or more FFFD when it detects invalid UTF-8 character sequences.
A comment in the code being changed stated that it doesn't handle surrogates. It is right not to do so. The only case where we should see surrogates is reading UTF-16. Additional tests are added to an existing test reading a UTF-16 Csv to demonstrate that surrogates are handled correctly, and that FFFE/FFFF are handled reasonably.
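A minimal sketch of the sanitizing order described above: first map FFFE/FFFF to FFFD, then validate, then substitute FFFD for whatever is still invalid. The function name is hypothetical, the mbstring extension is assumed, and the actual code in the PR may differ.

```php
<?php
// Hypothetical helper; not the library's actual implementation.
function sanitizeUtf8Sketch(string $text): string
{
    // U+FFFE and U+FFFF are well-formed UTF-8 as far as the validators are
    // concerned, so they are replaced with U+FFFD explicitly.
    $text = str_replace(["\xEF\xBF\xBE", "\xEF\xBF\xBF"], "\xEF\xBF\xBD", $text);

    if (mb_check_encoding($text, 'UTF-8')) {
        return $text;
    }

    // Re-encode UTF-8 to UTF-8 so that invalid byte sequences are replaced
    // by the substitute character, temporarily set to U+FFFD.
    $previous = mb_substitute_character();
    mb_substitute_character(0xFFFD);
    $clean = mb_convert_encoding($text, 'UTF-8', 'UTF-8');
    mb_substitute_character($previous);

    return $clean;
}
```

How many FFFD characters replace a given invalid sequence depends on the PHP version, which is consistent with the "one or more" wording above.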
Fix #2908. When support for two-cell anchors was added for drawings, we neglected to adjust the second cell address when rows or columns are added or deleted. It also appears that "twoCell" and "oneCell" were entered as lower-case literals when support for the editAs attribute was subsequently added.
* Ignore square-$-brackets prefix in format string
* Test for square-$-brackets prefix in format string issue fixed
* Fix for phpstan compliance
* Additional assert for checking number format of tested source cell
File from https://www.rondebruin.nl/win/s2/win003.htm. I have been in conversation with the author, who has no objection to its use. I have not actually opened the file in Excel (at least not with macros enabled); I am using it merely to demonstrate that the ribbon data is read and written correctly. Test added; no source code changed. This should slightly increase coverage for Reader/Xlsx (moderate), Writer/Xlsx (slight), and Spreadsheet (substantial). Note that this file has no Ribbon Bin objects, so some coverage is still lacking.
Mostly new tests, some code annotations, some minor code changes:
- RichText clone logic is wrong
- TextElement doesn't have object properties, doesn't need clone
Disabled it earlier because its reliance on an external site not under our control was causing problems. URL in spreadsheet is now changed to point to an image in phpspreadsheet.readthedocs.io, which should be more reliable. Test is re-enabled.
Fix #2810. When repairing some Phpstan diagnostics, I used `?:` rather than `??` in a few places.
Two different Html modules are affected. The Ods Reader is also affected, but its problem is with the sheet title rather than cell contents. And, as it turns out, the Ods Reader was already not handling sheets with a title of `0` correctly: it made a truthy test before setting the sheet title. That is now changed to truthy or numeric. Other readers are not susceptible to this problem. Tests are added.
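The difference between the two operators for a value of `0` can be seen in a short sketch. The variable and function names below are hypothetical and only illustrate the "truthy or numeric" test mentioned above.

```php
<?php
$cellValue = '0';

// `?:` tests truthiness, and the string '0' is falsy in PHP.
$withElvis = $cellValue ?: 'fallback';

// `??` only tests for null (or unset), so '0' survives.
$withCoalesce = $cellValue ?? 'fallback';

// Hypothetical helper showing a "truthy or numeric" test for a sheet title.
function sheetTitleOrDefaultSketch(?string $title, string $default): string
{
    return ($title || is_numeric($title)) ? (string) $title : $default;
}
```

With this test a title of `0` is kept, while `null` and the empty string still fall back to the default.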
* Real Errors Identified in Calculation by Scrutinizer
Before Scrutinizer broke, I took a look at the remaining 43 errors which it categorized as 'major'. Most of these were false positives, but, in the case of Calculation and Reader/Xlsx/Chart, I was able to determine that its analysis of some of the problems was correct. There is little point addressing the false positives until it starts working again, but we should fix the real errors.
This PR addresses the real errors in Calculation.
- A test for `$pCellParent` should have been a test for `$pCellWorksheet`.
- A test for `$operand1Data['reference']` should have been a test for `$operand1Data['value']`.
- A test for `$operand2Data['reference']` should have been a test for `$operand2Data['value']`.
* Fix Attempt to Erroneously Call trim on Array
Fix #2780. A warning message is issued when getting the calculated value of a cell with value `=INDIRECT(ADDRESS(ROW(),COLUMN()-2))/$C$4`. This appears to be the case for all recent (and probably not so recent) releases; it is not a result of changes to the code base. Fix added to this PR because the erring section of code was proximate to code already changed in the PR. Test added.
* Minor Code Changes
Apply some suggestions from @MarkBaker
Fixed that bug, using a slightly faster algorithm for the sort index than the simple fix would have used, and modified the tests that didn't have the correct expected result :-(
Fix #2768. DateFormatter handles only one of six special formats for time intervals `[h] [hh] [m] [mm] [s] [ss]`. This PR extends support to the rest. There should be no more than one of these in any format string. Although it certainly could make sense to treat `[d] [dd]` in the same manner, Excel does not seem to support those.
Interesting observations - hours and minutes are truncated (presumably because they may be followed by minutes and seconds), but seconds are rounded. Also, there are some floating point issues, which fortunately showed up for the example in the original issue. There, the time interval was 1.15, which should evaluate to a minutes value of 1656 (as it does in Excel). However, on my system it evaluated to 1655 because of a rounding error in the 13th decimal place. To overcome this, values are rounded to 10 decimal places before truncating.
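The round-then-truncate approach can be sketched as below, using the 1.15 interval from the issue. The helper name and signature are hypothetical, and non-negative intervals are assumed.

```php
<?php
// Hypothetical helper: converts an interval in days to whole hours, minutes
// or seconds. Rounding to 10 decimal places first absorbs floating point
// noise such as 1655.99999999999977 before the value is truncated.
function elapsedUnitsSketch(float $days, int $unitsPerDay, bool $roundResult = false): int
{
    $value = round($days * $unitsPerDay, 10);

    return (int) ($roundResult ? round($value) : floor($value));
}

$interval = 1.15;
$hours = elapsedUnitsSketch($interval, 24);            // [h]: truncated, 27
$minutes = elapsedUnitsSketch($interval, 1440);        // [m]: truncated, 1656
$seconds = elapsedUnitsSketch($interval, 86400, true); // [s]: rounded, 99360
```

Without the intermediate `round(..., 10)`, `floor($interval * 1440)` may come out as 1655 on some systems, which is the discrepancy described above.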
* Fix reading of files in the root of a zip
Xlsx.php relies on dirname($filename) for path generation. When the path is a bare filename (i.e. files in the root of the zip file), dirname($filename) returns a relative path to the current directory ("."). This is ok for filesystems, but not when accessing contents in a zip file.
Xlsx documents with files in the root of the zip container are not common, but legit. I've found it to happen in files generated by Google Campaign Manager 360.
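The `dirname()` behavior described above, and one possible way to work around it, can be sketched as follows. The helper name is hypothetical and this is not necessarily how Xlsx.php resolves the problem.

```php
<?php
// dirname() returns '.' when the path contains no directory separator.
$rootDir = dirname('workbook.xml');
$nestedDir = dirname('xl/workbook.xml');

// Hypothetical helper: directory prefix for a zip entry name, using an empty
// prefix instead of './' for entries in the root of the archive.
function zipDirPrefixSketch(string $entryName): string
{
    $dir = dirname($entryName);

    return $dir === '.' ? '' : $dir . '/';
}

// Relationship file names that can be looked up inside the zip.
$rootRels = zipDirPrefixSketch('workbook.xml') . '_rels/workbook.xml.rels';
$nestedRels = zipDirPrefixSketch('xl/workbook.xml') . '_rels/workbook.xml.rels';
```

Zip entry names have no notion of a current directory, so a name like `./_rels/workbook.xml.rels` would not match the stored entry `_rels/workbook.xml.rels`.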
* Update Xlsx.php
* Update CHANGELOG.md
* Add files via upload
* Create XlsxRootZipFilesTest.php
* Update XlsxRootZipFilesTest.php
* Add files via upload
* Delete rootZipFiles.xlsx
* Update XlsxRootZipFilesTest.php
* Update Xlsx.php
These changes have already been implemented twice, and been regressed twice. I'll try once more (with a different approach), then give up ...
As configured, Phpstan running under Php7 reports no errors. However, running under Php8, it reports 100 (!) errors. The vast majority of these are due to two reasons:
- renaming parameters in Php builtin functions in preparation for named parameters.
- using the new class GdImage rather than type resource as the argument type for many image-based functions.
Regardless of the cause, this will be a problem sooner or later. This PR is an attempt to get ahead of that problem. For source members, it mostly adds annotations or updates doc-blocks. Only 2 members have changes to executable code, and these are very minor - BitWise and Writer/Xlsx. For test members, all baseline errors are deleted and the code is fixed. Php7 and Php8 both report no errors with this configuration.
Note that this method is used when translating Excel functions between en and other locale languages, as well as when converting formulae between different spreadsheet formats (e.g. Ods to Excel)
Nor is this a perfect solution, as there may still be issues when function calls have array arguments that themselves contain function calls; but it's still better than the current logic
Fix #2499, which see for details of an obscure problem affecting both PhpSpreadsheet and Excel. Add support for a palette contained in workbook styles. This seems to be a very rare occurrence, so allow it only when the palette contains exactly 64 entries. If there are other possibilities, we'll presumably have a new workbook to guide us how to handle them. Also add some tests for specification of indexed color without a palette, another rarity (no in-range examples amongst our current files). Also change one private static array, initialized once at run-time and never changed, to a constant.
Implement array-enabling for the ERROR.TYPE() function
Extract ERROR.TYPE() function tests into separate test file
Extract error function tests into separate test files
And thus complete the implemented Information functions
* First steps toward array-enabling the information functions
Also includes moving unit tests out from Functions and into a separate, dedicated Information folder
* Resolve issue with IF(), branch pruning and calculation cache (ensure that we don't convert the if condition to a bool before we've tested to see if it evaluates to an error)
More refactoring
* Start work on array-enabling the Lookup and Reference functions
Requires a new method (`evaluateArrayArgumentsSubsetFrom()`) in the `ArrayEnabled` Trait to handle functions where the arguments that need special array handling are trailing rather than leading arguments
* Split Information functions into a dedicated class and namespace and categorise as Value or Error
* Refactor all error functions into the new ExcelError class