* More Scrutinizer Catch Up
Continue the work of PR #3043 by attending to the 12 remaining 'new' issues.
* Php 8.1 Problem
One new null-instead-of-string problem.
A lot of easily fixed problems throughout Writer/Xlsx/*, mostly supplying int rather than string as input to WriteAttribute/WriteElement. There are, in fact, so many opportunities, that I will split it over 2 or 3 PRs. But this first one will get Phpstan baseline down to the goal on its own.
Some of the other problems are also easily fixed. In particular, the docBlocks in Style/ConditionalFormatting/ConditionalDataBar do not allow for null values, and should.
The setup for unit testing in Github in the "Install dependencies" log reports 11 members as "does not comply with with psr-4 autoloading standard." In each case, it is because the test namespace does not match the directory; in most cases, it was caused by the member being moved from one directory to another without changing the namespace declaration. No harm results from these problems, but there's also no reason to not correct them.
Shared/Font is hardly covered in unit tests (as opposed to Style/Font which is completely covered). And it presented some good opportunities for code optimization. I wrote and tested the new unit tests first, then optimized the code and confirmed that everything still works.
There is still a bit of a gap with "exact" measurements. I had tests ready, but had to withdraw them when I discovered they weren't quite portable (see https://github.com/php/php-src/issues/9073).
* Fix Spreadsheet Copy, Disable Clone, Improve Coverage
This PR was supposed to be merely to increase coverage in Spreadsheet. However, in doing so, I discovered that neither clone nor copy worked correctly. Neither had been covered in the test suite. Copy not only did not work, it broke the source spreadsheet as well. I tried to debug and got nowhere; I even tried using myclabs/deep-copy which is already in use in the test suite, but it failed as well. However, write and reload ought to work just fine for copy. It can't be used for clone; however, since copy does what clone ought to do, there's no reason why clone needs to be used, so __clone is changed to throw an exception if attempted.
One other source change was needed, an obvious bug where an if condition uses 'or' when it should use 'and'. Also, one docblock declaration needed a change. Aside from that, the rest of this PR is test cases, and overall coverage passes 89% for the first time.
* Clone is Okay After All
But copy wasn't, changing it to just return clone. Perhaps save and reload will be needed instead at some point, but not yet.
* An Error I Cannot Reproduce
PHP8.1 unit test says error because GdImage can't be serialized. I can't reproduce this error on any of my test systems. I have no idea why GdImage is even involved. Using try/catch to see if it helps.
* Weird Failures in Github
I thought restoring clone was a good idea. That left me in a state where, after one change, copy/clone no longer worked on Github (unable to reproduce on any of my test systems). After a second change, copy worked but clone didn't, again unable to reproduce. So, reverting to original version - copy does save and reload, clone throws exception.
* Big Memory Leak in One Test
Many tests leak a little by not issuing disconnectWorksheets at the end. CellMatcherTest leaks a lot by opening a spreadsheet many times. Phpunit reported a high watermark of 390MB; fixing CellMatcherTest brings that figure down to 242MB. I have also changed it to use a formal assertion in many cases where markTestSkipped was (IMO inappropriately) used.
* Another Leak
Issue2301Test lacks a disconnect. Adding one reduces HWM from 242MB to 224.
* Html Reader Not Handling non-ASCII Data Correctly
Fix#2942. Code was changed by #2894 because PHP8.2 will deprecate how it was being done. See linked issue for more details. Dom loadhtml assumes ISO-8859-1 in the absence of a charset attribute or equivalent, and there is no way to override that assumption. Sigh. The suggested replacements are unsuitable in one way or another. I think this will work with minimal disruption (replace ampersand, less than, and greater than with entities representing illegal characters, then use htmlentities, then restore ampersand, less than, and greater than).
* Better Implementation
Use regexp to escape non-ASCII. Less kludgey, less reliant on the vagaries of the PHP maintainers.
* Additional Tests
Test non-ASCII outside of cell contents: sheet title, image alt attribute.
* Apply Same Change in Second Location
Forgot to change loadFromString.
* Additional Test
Confirm escaped ampersand is handled correctly.
* Move Gridlines from Chart to Axis
This could, I hope, be my last major change to Chart for a while. When I first noticed this problem, I thought it would be a breaking change. However, although this change establishes some deprecations, I don't think it breaks anything. Major and minor gridlines had only been settable by the Chart constructor. This PR moves them where they belong, to Axis (eexisting Chart constructor code will still work). This allows them to be specified from both X and Y axis.
Chart is now entirely covered except for 2 statements, one deprecated and one that I just can't figure out. 99.71% for Charts, 88.96% overall. All references to the Chart directory in Phpstan baseline are eliminated.
* Minor Fixes, Unit Tests
Line style color type should default to null not prstClr.
Chart X-axis and Y-axis should alway be Axis, never null.
Add some unit tests.
* More Tests, Some Improvements
Make it easier to change line styles, adding an alternate method besides a setter function with at least a dozen parameters.
Fix#2897. We have been relying on iconv/mb_convert_encoding to detect invalid UTF-8, but all techniques designed to validate UTF-8 seem to accept FFFE and FFFF. This PR explicitly converts those characters to FFFD (Unicode substitution character) before validating the rest of the string. It also substitutes one or more FFFD when it detects invalid UTF-8 character sequences.
A comment in the code being change stated that it doesn't handle surrogates. It is right not to do so. The only case where we should see surrogates is reading UTF-16. Additional tests are added to an existing test reading a UTF-16 Csv to demonstrate that surrogates are handled correctly, and that FFFE/FFFF are handled reasonably.
* Ignore square-$-brackets prefix in format string
* Test for square-$-brackets prefix in format string issue fixed
* Fix for phpstan compliance
* Additional assert for checking number format of tested source cell
File from https://www.rondebruin.nl/win/s2/win003.htm. I have been in conversation with the author, who has no objection to its use. I have not actually opened the file in Excel (at least not with macros enabled); I am using it merely to demonstrate that the ribbon data is read and written correctly. Test added; no source code changed. This should slightly increase coverage for Reader/Xlsx (moderate), Writer/Xlsx (slight), and Spreadsheet (substantial). Note that this file has no Ribbon Bin objects, so some coverage is still lacking.
* Add Support to Chart/Axis and Gridlines for Shadow
Continuing the work from #2865. Support is added for Shadow properties for Axis and Gridlines, and Glow and SoftEdges are extended to Gridlines. Tests are added. Some chart tests are moved from Reader/Xlsx and Writer/Xlsx so that most chart tests are under a single directory.
This is a minor breaking change. Since the support for these properties was just added, it can't really affect much in userland. Some properties had been stored in the form which the XML requires them rather than as the user would enter them to Excel. So, for example, setting the Glow size to 10 points would have caused it to be stored internally as 127,000. This change will store the size internally as 10, obviously making the appropriate conversion when reading from or writing to XML. This makes unit tests much simpler, and I think this is also what a user would expect, especially considering the difficulties in keeping track of the trailing zeros.
* More Tests
Confirm value change between internal and xml.
* Still More Tests
Add a little more coverage, and use a neat trick suggested by @MarkBaker in the discussion of PR #2724 to greatly simplify MultiplierTest.
Mostly new tests, some code annotations, some minor code changes:
- RichText clone logic is wrong
- TextElement doesn't have object properties, doesn't need clone
Continuing the work from #2828, #2841, #2846, and #2852. This is probably my last change in this area for a while.
Bubble charts can have bubbles of different sizes. Phpspreadsheet had not supported this. Openpyxl comes with sample code to generate such a chart. I was especially drawn to that solution because its namespace usage would have been unexpected before 2852. And it turned out to come with other surprises - use of absolute paths in the .rels files (PhpSpreadsheet expected only relative), use of a one-cell anchor to place the chart (PhpSpreadsheet expected two-cell anchor or absolute positioning), plaintext in the legend (Phpspreadsheet expected RichText), no cached values for chart data. Excel handles the file okay, and this PR makes sure PhpSpreadsheet does as well. This file is now Samples/Templates/32readwriteBubbleChart2, and is used for both generating a sample output file and in formal tests. A new sample in the 33* series demonstrates how to code these.
Disabled it earlier because its reliance on an external site not under our control was causing problems. URL in spreadsheet is now changed to point to an image in phpspreadsheet.readthedocs.io, which should be more reliable. Test is re-enabled.
Fix#2840 (and also #2839 but that's Q&A, not an issue). Csv Reader does not populate cells which contain null string. This PR provides an option for the reader to store null strings as it does with any other string.
This test has always been problematic in that it depends on an external site not under the control of PhpSpreadsheet, and can therefore break at any time. As it has done this morning. Disable it until a better test is available.
Fix#2810. Repairing some Phpstan diagnostics, used `?:` rather than `??` in a few places.
2 different Html modules are affected. Also, Ods Reader, but its problem is with sheet title rather than cell contents. And, as it turns out, Ods Reader was already not handling sheets with a title of `0` correctly - it made a truthy test before setting sheet title. That is now changed to truthy or numeric. Other readers are not susceptible to this problem. Tests are added.
See #1285, which went stale and was closed, but recently received some positive feeback. It seems easy to implement, and the only other plaintext file format, Html, allows loading from a string. Those two reasons combined suggest that we should do it.
* Real Errors Identified in Calculation by Scrutinizer
Before Scrutinizer broke, I took a look at the remaining 43 errors which it categorized as 'major'. Most of these were false positives, but, in the case of Calculation and Reader/Xlsx/Chart, I was able to determine that its analysis of some of the problems was correct. There is little point addressing the false positives until it starts working again, but we should fix the real errors.
This PR addresses the real errors in Calculation.
- A test for `$pCellParent` should have been a test for `$pCellWorksheet`.
- A test for `$operand1Data['reference']` should have been a test for `$operand1Data['value']`.
- A test for `$operand2Data['reference']` should have been a test for `$operand2Data['value']`.
* Fix Attempt to Erroneously Call trim on Array
Fix#2780. Warning message is being issued when getting calculated value of cell with value `=INDIRECT(ADDRESS(ROW(),COLUMN()-2))/$C$4`. This appears to be the case for all recent (and probably not so recent) releases; it is not a result of changes to the code base. Fix added to this PR because the erring section of code was proximate to code already changed in the PR. Test added.
* Minor Code Changes
Apply some suggestions from @MarkBaker
PHP 8.2 is supposed to deprecate the use of `['self', 'functionname']` for callables, suggesting the use of `[self::class, 'functionname']` instead. We made this change in a recent PR, and, while I'm thinking about it, I'll fix the remaining 2 modules with this construction. Vlookup is already adequately covered in unit tests. Reader/Xls/MD5 is not; a unit test is added.
* Fix reading of files in the root of a zip
Xlsx.php relies in dirname($filename) for path generation. When path is a bare filename (i.e. files in the root of the zip file), dirname($filename) returns a relative path to the current directory ("."). This is ok for filesystems, but not when accesing contents in a zip file.
Xlsx documents with files in the root of the zip container are not common, but legit. I've found it to happen in files generated by Google Campaign Manager 360.
* Update Xlsx.php
* Update Xlsx.php
* Update CHANGELOG.md
* Add files via upload
* Create XlsxRootZipFilesTest.php
* Update XlsxRootZipFilesTest.php
* Add files via upload
* Delete rootZipFiles.xlsx
* Update XlsxRootZipFilesTest.php
* Update Xlsx.php
These changes have already been implemented twice, and been regressed twice. I'll try once more (with a different approach), then give up ...
As configured, Phpstan running under Php7 reports no errors. However, running under Php8, it reports 100 (!) errors. The vast majority of these are due to two reasons:
- renaming parameters in Php builtin functions in preparation for named parameters.
- using the new class GdImage rather than type resource as the argument type for many image-based functions.
Regardless of the cause, this will be a problem sooner or later. This PR is an attempt to get ahead of that problem. For source members, it mostly adds annotations or updates doc-blocks. Only 2 members have changes to executable code, and these are very minor - BitWise and Writer/Xlsx. For test members, all baseline errors are deleted and the code is fixed. Php7 and Php8 both report no errors with this configuration.
With the deprecation of `auto_detect_line_endings` in Php8.1, there have been some tickets (issue #2609 and PR #2438). Although the deprecation message is suppressed, users with a homegrown error handler may still see it. I am not very concerned about that symptom, but I imagine that there will be more similar tickets in future. This PR adds a new property/method to Reader/CSV to allow the user to avoid the deprecated code, at the negligible cost of being unable to read a CSV with Mac line endings even on a Php version that could support it.
Fix#2499, which see for details of an obscure problem affecting both PhpSpreadsheet and Excel. Add support for palette contained in workbook styles. This seems to be a very rare occurrence, so allow it only when the palette contains exactly 64 entries. If there are other possibilities, we'll presumably have a new workbook to guide us how to handle them. Also add some tests for specification of indexed color without palette, another rarity (no in-range examples amongst our current files). Also change one private static array, initialized once at run-time and never changed, to a constant.
Fix#2542. Xlsx Reader is expecting a `sz` tag when reading RichText, but it is not required, and PhpSpreadsheet issues a warning message when it is missing.
* WIP Namespacing Phase 2 - Styles
This is part 2 of a several-phase process to permit PhpSpreadsheet to handle input Xlsx files which use unexpected namespacing. The first phase, introduced as part of release 1.19.0, essentially handled the reading of data. This phase handles the reading of styles. More phases are planned.
It is my intention to leave this in draft status for at least a month. This will give time for additional testing, by me and, I hope, others who might be interested.
This fixes the same problem addressed by PR #2458, if it reaches mergeable status before I am ready to take this out of draft status. I do not anticipate any difficult merge conflicts if the other change is merged first.
This change is more difficult than I'd hoped. I can't get xpath to work properly with the namespaced style file, even though I don't have difficulties with others. Normally we expect:
```xml
<stylesheet xmlns="http://whatever" ...
```
In the namespaced files, we typically see:
```xml
<x:stylesheet xmlns:x="http://whatever" ...
```
Simplexml_load_file specifying a namespace handles the two situations the same, as expected. But, for some reason that I cannot figure out, there are significant differences when xpath processes the result. However, I can manipulate the xml if necessary; I'm not proud of doing that, and will gladly accept any suggestions. In the meantime, it seems to work.
My major non-standard unit test file had disabled any style-related tests when phase 1 was installed. These are now all enabled.
* Scrutinizer
Its analysis is wrong, but the "errors" it pointed out are easy to fix.
* Eliminate XML Source Manipulation
Original solution required XML manipulation to overcome what appears to be an xpath problem. This version replaces xpath with iteration, eliminating the need to manipulate the XML.
* Handle Some Edge Cases
For example, Style file without a Fills section.
* Restore RGB/ARGB Interchangeability
Fix#2494. Apparently EPPlus outputs fill colors as `<fgColor rgb="BFBFBF">` while most output fill colors as `<fgColor rgb="FFBFBFBF">`. EPPlus actually makes more sense. Regardless, validating length of rgb/argb is a recent development for PhpSpreadsheet, under the assumption that an incorrect length is a user error. This development invalidates that assumption, so restore the previous behavior.
In addition, a comment in Colors.php says that the supplied color is "the ARGB value for the colour, or named colour". However, although named colors are accepted, nothing sensible is done with them - they are passed unchanged to the ARGB value, where Excel treats them as black. The routine should either reject the named color, or convert it to the appropriate ARGB value. This change implements the latter.
* Xlsx Reader Merge Range For Entire Column(s) or Row(s)
Fix#2501. Merge range can be supplied as entire rows or columns, e.g. `1:1` or `A:C`. PhpSpreadsheet is expecting a row and a column to be specified for both parts of the range, and fails when the unexpected format shows up.
The code to clear cells within the merge range is very inefficient in terms of both memory and time, especially when the range is large (e.g. for an entire row or column). More efficient code is substituted. It is possible that we can get even more efficient by deleting the cleared cells rather than setting them to null. However, that needs more research, and there is no reason to delay this fix while I am researching.
When Xlsx Writer encounters a null cell, it writes it to the output file. For cell merges (especially involving whole rows or columns), this results in a lot of useless output. It is changed to skip the output of null cells when (a) the cell style matches its row's style, or (b) the row style is not specified and the cell style matches its column's style.
* Scrutinizer
See if these changes appease it.
* Improved CellIterators
Finally figured out how to improve efficiency here, meaning that there is no longer a reason to change Writer/Xlsx, so restore that.
* No Change for CellIterator
I had thought a change was needed for CellIterator, but it isn't.
* Allow single-cell checks on conditional styles, even when the style is configured for a range of cells
* Work on the CellMatcher logic to evaluate Conditionals for a cell based on its value, and identify which conditional styles should be applied
* Refactor style merging and cell matching for conditional formatting into separate classes; this should make it easier to test, and easier to extend for other CF expressions subsequently
* Added support for containsErrors and notContainsErrors
* Initial work on a wizard to help simplify created Conditional Formatting rules, to ensure that the correct expressions are set
* Further work on extending the Conditional Formatting rules to cover more of the options that are available in MS Excel
* Prevent phpcs-fixer from removing class @method annotations, used to identify the signature for magic methods used in Wizard classes
* Implement `fromConditional()`` method to allow the creation of a CF Wizard from an existing Conditional
* Ensure that xlsx Reader picks up the timePeriod attribute for DatesOccurring CF Rules
* Allow Duplicates/Uniques CF Rules to be recognised in the Xlsx Reader
* Basic Xlsx reading of CF Rules/Styles from <extLst><ext><ConditinalFormattings> element, and not just the <ConditinalFormatting> element of the worksheet
* Add some validation for operands passed to the CF Wizards
- remove any leading ``=` from formulae, because they'll be embedded into other formulae
- unwrap any string literals from quotes, because that's also handled internally
Handle cross-worksheet cell references in cellReferences and Formulae/Expressions
* re-baseline phpstan
* Update Change Log with details of the CF Improvements
Fix#2488. When Excel sees this situation, it leaves the value of the cell as null rather than casting to the specified DataType. It doesn't really make sense to change setValueExplicit to adopt this convention; it should be sufficient to recognize the situation in the Reader and act there. The same sort of situation might apply to strings, but I don't see any practical difference between null string and null even if so.