Commit Graph

373 Commits

Author SHA1 Message Date
Mark Baker 23e2d702ff
Merge branch 'master' into Xls-Reader-Conditional-Formatting 2022-03-19 16:21:14 +01:00
MarkBaker be8c444951 More minor tweaks 2022-03-19 16:15:48 +01:00
MarkBaker c73bb612e0 Unit tests for Xls Reader DataValidation 2022-03-19 12:04:14 +01:00
MarkBaker 45c08d6cd4 Initial work on reading conditional styles for the Xls Reader
Successfully reading the CF ranges and CF rules; not yet reading the styles
2022-03-18 21:09:32 +01:00
MarkBaker 21b784f200 Add basic support for Error functions when the error is #SPILL! or #CALC! 2022-03-18 17:44:26 +01:00
MarkBaker 9b3c3f4adf Merge branch 'master' into Xls-Reader-Conditional-Formatting 2022-03-17 12:11:19 +01:00
MarkBaker ec15c7a6de more minor tweaks 2022-03-16 23:03:19 +01:00
MarkBaker 4881e2ae9e Validate that lookup arrays are actually arrays 2022-03-16 21:30:19 +01:00
MarkBaker 6f84780bb9 Some work on refactoring the ReferenceHelper to extract the logic for updating cell references. This is a preliminary step toward allowing updates to absolute cell references, required to update Conditional Formatting rules. 2022-03-16 14:07:13 +01:00
MarkBaker cb5a451aaf Initial work on Reading Conditional Formatting from Xls files 2022-03-15 13:25:23 +01:00
oleibman 68158c8120
Phpstan Differences from Php7 to Php8, Again (#2665)
These changes have already been implemented twice, and been regressed twice. I'll try once more (with a different approach), then give up ...

As configured, Phpstan running under Php7 reports no errors. However, running under Php8, it reports 100 (!) errors. The vast majority of these are due to two reasons:
- renaming parameters in Php builtin functions in preparation for named parameters.
- using the new class GdImage rather than type resource as the argument type for many image-based functions.

Regardless of the cause, this will be a problem sooner or later. This PR is an attempt to get ahead of that problem. For source members, it mostly adds annotations or updates doc-blocks. Only 2 members have changes to executable code, and these are very minor - BitWise and Writer/Xlsx. For test members, all baseline errors are deleted and the code is fixed. Php7 and Php8 both report no errors with this configuration.
2022-03-11 23:28:30 -08:00
MarkBaker cba0e13b2a Escape double quotes in a string value unless it's an empty string value 2022-03-05 12:44:44 +01:00
MarkBaker f3d5028518 Work on setting up locale-aware formatted number conversion for the Csv Reader
Unit tests for locale-aware boolean conversion for Csv Reader
2022-03-02 08:53:29 +01:00
MarkBaker 99f488efc6 Resolve `translateSeparator()` method to handle separators (row and column) for array functions as well as for function argument separators; and cleanly handle nesting levels
Note that this method is used when translating Excel functions between en and other locale languages, as well as when converting formulae between different spreadsheet formats (e.g. Ods to Excel)

Nor is this a perfect solution, as there may still be issues when function calls have array arguments that themselves contain function calls; but it's still better than the current logic
2022-02-27 13:01:09 +01:00
MarkBaker 444d0fd77c Unit tests for merge ranges for Ods Reader/Writer 2022-02-26 22:51:15 +01:00
MarkBaker ef4029df63 Refactor ISO data type validation from cell to shared date; add extra checks for invalid dates; and appropriate unit tests 2022-02-26 14:13:12 +01:00
oleibman 9cf526a920
Reading Xlsx With Supplied Palette (#2595)
Fix #2499, which see for details of an obscure problem affecting both PhpSpreadsheet and Excel. Add support for palette contained in workbook styles. This seems to be a very rare occurrence, so allow it only when the palette contains exactly 64 entries. If there are other possibilities, we'll presumably have a new workbook to guide us how to handle them. Also add some tests for specification of indexed color without palette, another rarity (no in-range examples amongst our current files). Also change one private static array, initialized once at run-time and never changed, to a constant.
2022-02-23 22:09:22 -08:00
Thorsten Ho 0cb60a5098
Fix XLSX broken vertical align font style (#2619)
* Fix XLSX broken vertical align font style

* Add fix information to changelog

* Fix phpcs issues
2022-02-23 20:23:59 -08:00
Mark Baker 0ee4d96576
Array-enable the ISFORMULA() function (#2610)
Implement array-enabling for the ERROR.TYPE() function
Extract ERROR.TYPE() function tests into separate test file
Extract error function tests into separate test files

And thus complete the implemented Information functions
2022-02-20 19:32:13 +01:00
Mark Baker 35b65bef8c
First steps toward array-enabling the information functions (#2608)
* First steps toward array-enabling the information functions

Also includes moving unit tests out from Functions and into a separate, dedicated Information folder

* Resolve issue with IF(), branch pruning and calculation cache (ensure that we don't convert the if condition to a bool before we've tested to see if it evaluates to an error)
More refactoring
2022-02-20 16:46:25 +01:00
Mark Baker c10d86eb9c
Start work on array-enabling the Lookup and Reference functions (#2602)
* Start work on array-enabling the Lookup and Reference functions

Requires a new method (`evaluateArrayArgumentsSubsetFrom()`) in the `ArrayEnabled` Trait to handle functions where the arguments that need special array handling are trailing rather than leading arguments
2022-02-19 18:36:50 +01:00
Mark Baker d5dc58d20e
Extract information functions (#2605)
* Split Information functions into a dedicated class and namespace and categorise as Value or Error
* Refactor all error functions into the new ExcelError class
2022-02-19 13:53:17 +01:00
Mark Baker 0371ccb686
Convert all relevant Logical functions to support array arguments (#2600) 2022-02-18 02:56:23 +01:00
oleibman 5bf0656e92
Xlsx Reader Warning When No sz Tag for RichText (#2550)
Fix #2542. Xlsx Reader is expecting a `sz` tag when reading RichText, but it is not required, and PhpSpreadsheet issues a warning message when it is missing.
2022-02-12 06:43:29 -08:00
oleibman ad5532e2f4
Namespacing Phase 2 - Styles (#2471)
* WIP Namespacing Phase 2 - Styles

This is part 2 of a several-phase process to permit PhpSpreadsheet to handle input Xlsx files which use unexpected namespacing. The first phase, introduced as part of release 1.19.0, essentially handled the reading of data. This phase handles the reading of styles. More phases are planned.

It is my intention to leave this in draft status for at least a month. This will give time for additional testing, by me and, I hope, others who might be interested.

This fixes the same problem addressed by PR #2458, if it reaches mergeable status before I am ready to take this out of draft status. I do not anticipate any difficult merge conflicts if the other change is merged first.

This change is more difficult than I'd hoped. I can't get xpath to work properly with the namespaced style file, even though I don't have difficulties with others. Normally we expect:
```xml
<stylesheet xmlns="http://whatever" ...
```
In the namespaced files, we typically see:
```xml
<x:stylesheet xmlns:x="http://whatever" ...
```

Simplexml_load_file specifying a namespace handles the two situations the same, as expected. But, for some reason that I cannot figure out, there are significant differences when xpath processes the result. However, I can manipulate the xml if necessary; I'm not proud of doing that, and will gladly accept any suggestions. In the meantime, it seems to work.

My major non-standard unit test file had disabled any style-related tests when phase 1 was installed. These are now all enabled.

* Scrutinizer

Its analysis is wrong, but the "errors" it pointed out are easy to fix.

* Eliminate XML Source Manipulation

Original solution required XML manipulation to overcome what appears to be an xpath problem. This version replaces xpath with iteration, eliminating the need to manipulate the XML.

* Handle Some Edge Cases

For example, Style file without a Fills section.

* Restore RGB/ARGB Interchangeability

Fix #2494. Apparently EPPlus outputs fill colors as `<fgColor rgb="BFBFBF">` while most output fill colors as `<fgColor rgb="FFBFBFBF">`. EPPlus actually makes more sense. Regardless, validating length of rgb/argb is a recent development for PhpSpreadsheet, under the assumption that an incorrect length is a user error. This development invalidates that assumption, so restore the previous behavior.

In addition, a comment in Colors.php says that the supplied color is "the ARGB value for the colour, or named colour". However, although named colors are accepted, nothing sensible is done with them - they are passed unchanged to the ARGB value, where Excel treats them as black. The routine should either reject the named color, or convert it to the appropriate ARGB value. This change implements the latter.
2022-02-11 06:42:04 -08:00
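The RGB/ARGB interchangeability described in the entry above (#2471) can be pictured with a small sketch. It is written in Python purely for illustration and is not PhpSpreadsheet's code; the function name `to_argb` and the choice of an opaque `FF` alpha prefix for 6-digit values are assumptions.

```python
import re


def to_argb(value: str) -> str:
    """Normalise a 6-digit RGB or 8-digit ARGB hex string to ARGB.

    Assumption: a bare RGB value is treated as fully opaque (alpha FF).
    """
    candidate = value.strip().upper()
    if re.fullmatch(r"[0-9A-F]{8}", candidate):
        return candidate  # already ARGB
    if re.fullmatch(r"[0-9A-F]{6}", candidate):
        return "FF" + candidate  # RGB: prepend an opaque alpha channel
    raise ValueError(f"not an RGB/ARGB hex colour: {value!r}")


print(to_argb("BFBFBF"))
print(to_argb("FFBFBFBF"))
```

Under these assumptions both spellings of the fill colour end up as the same stored value, which is the behaviour the commit says it restores.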
MarkBaker f577dde178 Fix for DOLLARDE() and DOLLARFR() with negative dollar values
Additional argument validations
2022-02-11 13:19:44 +01:00
Sebastian Nohn 454c01be51
Add support for one digit decimals (FORMAT_NUMBER_0, FORMAT_PERCENTAGE_0) (#2525)
* Add support for one digit decimals (FORMAT_NUMBER_0)

* Add support for one digit decimals (FORMAT_NUMBER_0, FORMAT_PERCENTAGE_0)

* adding tests for one digit numbers

* cleanup

* add failing test to block merge of this PR until #2555 has been merged

* fix code style

* fix test
2022-02-05 12:46:50 -08:00
Sebastian Nohn b5c03fc61f
Fix error in PercentageFormatter rounding (#2555)
* fix error in rounding percentages

* add tests for FORMAT_PERCENTAGE

* fix code style
2022-02-05 12:19:05 -08:00
Sebastian Nohn fe169dcd0a
Improve test coverage for NumberFormat (#2556)
* add tests for NumberFormat::FORMAT_NUMBER

* add tests for NumberFormat::FORMAT_NUMBER_00

* add tests for FORMAT_NUMBER_COMMA_SEPARATED1

* add tests for FORMAT_NUMBER_COMMA_SEPARATED2

* add tests for FORMAT_CURRENCY_USD_SIMPLE

* add tests for FORMAT_CURRENCY_USD

* add tests for FORMAT_CURRENCY_EUR, FORMAT_CURRENCY_EUR_SIMPLE

* add tests for FORMAT_ACCOUNTING_EUR, FORMAT_ACCOUNTING_USD
2022-02-05 12:05:26 -08:00
Mark Baker 4d82df2bc6
Add unit test for erroneous translations from Russian to English, and a quick/dirty fix (#2534)
* Add unit test for erroneous translations from Russian to English, and a quick/dirty fix
* Additional translation unit tests with accented characters from Spanish, Bulgarian, Czech and Turkish
* Update Change Log
2022-02-04 16:22:22 +01:00
Mark Baker 6b746dc05f
Extract some methods from the Calculation Engine into dedicated classes (#2537)
* Move binary comparisons out into a dedicated class
2022-02-04 16:02:29 +01:00
Mark Baker 26079174a0
Implementation of the SEQUENCE() Excel365 function (#2536)
* Implementation of the SEQUENCE() Excel365 function

Note that the Calculation Engine does not yet support the Spill operator, or spilling functions

* Handle the use-case of step = 0; and tests for exception handling for invalid arguments

* Update Change Log
2022-01-29 14:32:40 +01:00
mix5003 e7b0497237
Fix warning when opening xlsx file with thumbnail (#2517) 2022-01-24 14:17:53 -08:00
oleibman b6bd822b9c
Xlsx Reader Merge Range For Entire Column(s) or Row(s) (#2504)
* Xlsx Reader Merge Range For Entire Column(s) or Row(s)

Fix #2501. Merge range can be supplied as entire rows or columns, e.g. `1:1` or `A:C`. PhpSpreadsheet is expecting a row and a column to be specified for both parts of the range, and fails when the unexpected format shows up.

The code to clear cells within the merge range is very inefficient in terms of both memory and time, especially when the range is large (e.g. for an entire row or column). More efficient code is substituted. It is possible that we can get even more efficient by deleting the cleared cells rather than setting them to null. However, that needs more research, and there is no reason to delay this fix while I am researching.

When Xlsx Writer encounters a null cell, it writes it to the output file. For cell merges (especially involving whole rows or columns), this results in a lot of useless output. It is changed to skip the output of null cells when (a) the cell style matches its row's style, or (b) the row style is not specified and the cell style matches its column's style.

* Scrutinizer

See if these changes appease it.

* Improved CellIterators

Finally figured out how to improve efficiency here, meaning that there is no longer a reason to change Writer/Xlsx, so restore that.

* No Change for CellIterator

I had thought a change was needed for CellIterator, but it isn't.
2022-01-23 10:44:09 -08:00
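One way to picture the whole-row and whole-column merge ranges mentioned in the entry above (#2504) is to expand them into ordinary cell ranges before further processing. The sketch below is Python for illustration only and is not the fix that was merged; `expand_merge_range` is a hypothetical name, and the limits used are the Xlsx worksheet bounds (column `XFD`, row 1048576).

```python
import re

MAX_ROW = 1048576    # last row in an Xlsx worksheet
MAX_COLUMN = "XFD"   # last column in an Xlsx worksheet


def expand_merge_range(reference: str) -> str:
    """Turn 'A:C' or '1:1' into a full cell range; leave other ranges alone."""
    columns = re.fullmatch(r"([A-Z]+):([A-Z]+)", reference)
    if columns:
        return f"{columns.group(1)}1:{columns.group(2)}{MAX_ROW}"
    rows = re.fullmatch(r"(\d+):(\d+)", reference)
    if rows:
        return f"A{rows.group(1)}:{MAX_COLUMN}{rows.group(2)}"
    return reference


print(expand_merge_range("1:1"))
print(expand_merge_range("A:C"))
```

Ranges this large are also why the entry stresses efficiency: clearing every cell of a full row or column one at a time touches a very large number of coordinates.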
Mark Baker 4a04499bff
Read conditional styling for cell (#2491)
* Allow single-cell checks on conditional styles, even when the style is configured for a range of cells
* Work on the CellMatcher logic to evaluate Conditionals for a cell based on its value, and identify which conditional styles should be applied
* Refactor style merging and cell matching for conditional formatting into separate classes; this should make it easier to test, and easier to extend for other CF expressions subsequently
* Added support for containsErrors and notContainsErrors
* Initial work on a wizard to help simplify creating Conditional Formatting rules, to ensure that the correct expressions are set
* Further work on extending the Conditional Formatting rules to cover more of the options that are available in MS Excel
* Prevent phpcs-fixer from removing class @method annotations, used to identify the signature for magic methods used in Wizard classes
* Implement `fromConditional()` method to allow the creation of a CF Wizard from an existing Conditional
* Ensure that xlsx Reader picks up the timePeriod attribute for DatesOccurring CF Rules
* Allow Duplicates/Uniques CF Rules to be recognised in the Xlsx Reader
* Basic Xlsx reading of CF Rules/Styles from <extLst><ext><ConditionalFormattings> element, and not just the <ConditionalFormatting> element of the worksheet

* Add some validation for operands passed to the CF Wizards
 - remove any leading `=` from formulae, because they'll be embedded into other formulae
 - unwrap any string literals from quotes, because that's also handled internally

Handle cross-worksheet cell references in cellReferences and Formulae/Expressions

* re-baseline phpstan

* Update Change Log with details of the CF Improvements
2022-01-22 19:18:26 +01:00
Igor dbaafba6c6
Fix loading drawing size (#2492) 2022-01-16 21:59:31 -08:00
oleibman 06ea9ead2b
Xlsx Reader Cell DataType Numeric or Boolean Without Value (#2489)
Fix #2488. When Excel sees this situation, it leaves the value of the cell as null rather than casting to the specified DataType. It doesn't really make sense to change setValueExplicit to adopt this convention; it should be sufficient to recognize the situation in the Reader and act there. The same sort of situation might apply to strings, but I don't see any practical difference between null string and null even if so.
2022-01-16 21:19:09 -08:00
oleibman 95d9cc965d
Refinement for XIRR (#2487)
* Refinement for XIRR

Fix #2469. The algorithm used for XIRR is known not to converge in some cases, some of which are because the value is legitimately unsolvable; for others, using a different guess might help.

The algorithm uses continual guesses at a rate to hopefully converge on the solution. The code in Python package xirr (https://github.com/tarioch/xirr/) suggests a refinement when this rate falls below -1. Adopting this refinement solves the problem for the data in issue 2469 without any adverse effect on the existing tests. My thanks to @tarioch for that refinement.

The data from 2469 is, of course, added to the test cases. The user also mentions that an initial guess equal to the actual result doesn't converge either. A test is also added to confirm that that case now works.

The test cases are changed to run in the context of a spreadsheet rather than by direct calls to the XIRR calculation routine. This revealed some data validation errors which are also cleaned up with this PR. This suggests that other financial tests might benefit from the same change; I will look into that.

* More Unit Tests

From https://github.com/RayDeCampo/java-xirr/blob/master/src/test/java/org/decampo/xirr/XirrTest.java
https://github.com/tarioch/xirr/blob/master/tests/test_math.py

Note that there are some cases where the PHP tests do not converge, but the non-PHP tests do. I have confirmed in each of those cases that Excel does not converge, so the PhpSpreadsheet results are good, at least for now. The discrepancies are noted in comments in the test member.
2022-01-13 19:31:46 -08:00
oleibman 8ab834520d
Handle Explicit "Date" Type for Cell (#2485)
Fix #2373. Excel can handle DateTime/Date/Time as a string if the datatype of the cell is set to "d". The string is, apparently, supposed to follow the ISO8601 spec. Openpyxl can be configured to generate a file with such values, so I've added support and set up unit tests. Excel, naturally, converts such a string input into its numeric representation of the date/time stamp. So will PhpSpreadsheet, so a call to setValueExplicit specifying Date format will actually see the cell wind up with Numeric format - there is no way (and no reason) for the Date type to 'stick'.
2022-01-13 18:40:18 -08:00
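The conversion described in the entry above (#2485), from an ISO 8601 string to a numeric date/time stamp, can be sketched as follows. This is Python for illustration, not the library's implementation; `iso_to_excel_serial` is a hypothetical name, and the sketch assumes the 1900 date system, where day 0 is 1899-12-30 for dates after February 1900.

```python
from datetime import datetime

# Assumption: Windows-style 1900 date system (serial 0 falls on 1899-12-30).
EXCEL_EPOCH = datetime(1899, 12, 30)


def iso_to_excel_serial(value: str) -> float:
    """Convert an ISO 8601 date or date-time string to an Excel serial number."""
    moment = datetime.fromisoformat(value)
    delta = moment - EXCEL_EPOCH
    # Whole days form the integer part; the time of day is the fraction.
    return delta.days + delta.seconds / 86400


print(iso_to_excel_serial("2022-01-13"))
print(iso_to_excel_serial("2022-01-13T12:00:00"))
```

Because the result is a plain number, the cell naturally ends up with a numeric type, which matches the remark that the Date type cannot "stick".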
oleibman f24dcc7911
Another Undefined Index in Xls Reader (#2470)
Fix #2463. These continue to dribble in regularly.
2021-12-31 13:43:59 -08:00
oleibman 5d1ab39def
Replace Tests With Unneeded Mocking (#2465)
Replace mock tests with real ones when possible. The original tests are all still present; they just take place in a more representative scenario.

After this, there will be 4 remaining uses of mocking. Of these, 3 are needed for scenarios which are otherwise hard to test - WebServiceTest, CellsTest, and SampleCoverageTest. For the other one, AutoFilterTest, I just can't figure out what it's trying to accomplish, so have left it alone.

This change is almost entirely restricted to tests. There is a one-line change in src. When the first argument passed to OFFSET is null or nullstring, the returned value is currently 0. However, according to the documentation for Excel, it should be `#VALUE!`. The code is changed accordingly.
2021-12-31 13:24:43 -08:00
oleibman 3a6558625d
General Style Specified in Uppercase in Input Xlsx (#2451)
* General Style Specified in Uppercase in Input Xlsx

Fix #2450. Treat input style GENERAL as if it were expected upper/lowercase.

* Declare Method as Static

Surprised neither Phpstan nor Scrutinizer flagged this.

* Remove Duplicated Statement

Don't know why Scrutinizer didn't flag this the first time.
2021-12-18 09:25:08 -08:00
leo-bsv a7f687fe5c
Xlsx image background in comments #1547 (#2422)
* XLSX Image background in comments

* XLSX-Image-Background-In-Comments (#1547)

* Test fixes, conversion for comment sizes from px to pt, fix for setting image sizes from zip, set image type

* Merge remote-tracking branch 'origin/XLSX-Image-Background-In-Comments' into XLSX-Image-Background-In-Comments

* Tests to check reloaded document.

Co-authored-by: Burkov Sergey
2021-12-17 06:10:59 -08:00
oleibman ea74c96e98
Name Clashes Between Parsed and Unparsed Drawings (#2423)
* Name Clashes Between Parsed and Unparsed Drawings

This is at least a partial fix for #2396 and #1767 (which has been around for a long time). PhpSpreadsheet renames drawing XML files when it reads them from a spreadsheet. However, when it writes unparsed drawing files, it uses the original names, which can result in a clash with the renamed files. The solution in this PR is to write the unparsed files using the same renaming convention as the others.

This is an incredibly simple fix, basically a one-line change, for such a long-lived problem. It is conceivable that this PR breaks a more sophisticated file than I have come across, e.g. with multiple unparsed files associated with a single worksheet. However, this PR does fix at least part of the problem for both issues, and causes no regression issues. The changed code was covered in only 2 tests - Reader/XlsxTest testLoadSaveWithEmptyDrawings and Writer/Xlsx/UnparsedDataTest testLoadSaveXlsxWithUnparsedData.

2396 is covered by a new test Unparsed2396Test. I had trouble figuring out what to test for 1767. Since it is a problem that becomes evident only when the output file is opened in Excel, I added a new sample to cover it.

* Sloppy Errors

I neglected to run php-cs-fixer and phpstan, and it bit me.

* Scrutinizer

It's not as good as Phpstan at recognizing problems that can't happen due to previous assertions.

* Scrutinizer Again

It can be really stupid sometimes.
2021-12-09 23:37:15 -08:00
lucasnetau e01a81ec5e
Fixes #2430 (#2431)
* Handle a wildcard match that contains a forward slash in the pattern by adding / to the delimiter list of preg_quote
* Fix SUMIF doing a wildcard match on empty cells (NULL)
* Fix compare logic to return false when value is an empty string or NULL (Verified against LibreOffice SUMIF and MATCH handling of empty cells)
2021-12-04 08:07:18 -08:00
oleibman d5825a6682
Read Spreadsheet with # in Name (#2409)
Fix #2405. Treat last, rather than first, `#` as separator between zip file name and member name, by finding it with strrpos rather than strpos.
2021-11-30 07:39:50 -08:00
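The difference between splitting on the first and the last `#`, as described in the entry above (#2409), can be shown with a short sketch. It is Python for illustration only; `split_zip_reference` and the example path are hypothetical, and `str.rpartition` plays the role that `strrpos` plays in the PHP fix.

```python
from typing import Tuple


def split_zip_reference(reference: str) -> Tuple[str, str]:
    """Split 'archive#member' on the LAST '#', so archive names may contain '#'."""
    archive, separator, member = reference.rpartition("#")
    if not separator:
        raise ValueError(f"no '#' separator in {reference!r}")
    return archive, member


# A spreadsheet whose own file name contains '#':
print(split_zip_reference("zip://report#1.xlsx#xl/workbook.xml"))
```

Splitting on the first `#` instead would cut the archive name short at `zip://report` and treat the rest as the member name.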
oleibman 290c18e4db
Xlsx Reader Theme Support Broken After 17.1 (#2403)
Fix #2387. Fix #2075. There was substantial refactoring of Writer Xlsx styles in 18.0. An existing static property `$theme` was intended to be shared by both Writer Xlsx and the new Writer Xlsx Styles. However, the initialization of the property in the latter happened later than it should have. This PR makes that initialization happen as soon as the theme has been read. Also, declaring that property as static seems questionable; I have made it an instance member. This small re-factoring makes it possible to now support Themes in tab colors.

Since this PR changes Reader/Xlsx/Styles, add type-hinting throughout that module to eliminate Phpstan/Scrutinizer problems. I also removed method readStyle from Reader/Xlsx, since it was essentially duplicated in Reader/Xlsx/Styles. And I added a small number of tests to ensure that Styles is 100% covered. All of this is necessary in preparation for Namespacing phase 2.
2021-11-26 09:38:09 -08:00
oleibman 4ac0c47ac7
Support Data Validations in More Versions of Excel (#2377)
* Support Data Validations in More Versions of Excel

Attempt to deal with #2368, this time for good. Some deleted code was accidentally restored just before release 19, causing errors in spreadsheets with Data Validations. PR #2369 removed the duplicated code, and the fix was confirmed in current versions of Excel for Windows, Google Sheets, and other versions of Excel. However, problems were reported in earlier versions of Excel for Windows, and in some (but not all) versions of Excel for Mac, including a recent one. This change, which is simpler than the original fix for DataValidations (no need for extLst), is tested with Excel 2007 and Excel 2003 as well as more recent versions. I do not have a Mac on which to test.

* Multiple Identical Data Validation Lists

Using the same Data Validation List in multiple places on a worksheet caused them all to be merged into the same range. This was because sqref was not part of the hash code; it is now, avoiding this problem.

* Must Write Data Validations Before Hyperlinks

See discussion in #2389.
2021-11-14 10:06:46 -08:00
oleibman f831f48b71
ZipArchive and "Inconsistent" Zip File (#2376)
* ZipArchive and "Inconsistent" Zip File

Fix #2362. I added a test for zip file inconsistency when dealing with a particularly nasty PHP/libzip bug affecting zero-length files. However, we also now verify that the file starts with a valid zip signature, so the consistency test is not really needed, and, from what I've read on the web, isn't particularly useful. The file with a problem, for example, opens just fine with Excel and zip, despite Php reporting it as inconsistent (when asked to check consistency). So, remove the consistency check.

* Update Issue2362Test.php

Latest Phpstan does not allow cast from 'mixed' to 'string'.

* Update Issue2362Test.php
2021-11-12 01:18:57 -08:00
oleibman 2f1f3a19b8
Csv, Boolean, and StringValueBinder (#2374)
See the discussion in PR #2232 which came about 3 months after it was merged. It caused a problem in an unusual situation which did not come to light until the change was part of the new release version. The original PR changed PhpSpreadsheet's behavior to match Excel's for (not case sensitive) strings `TRUE` and `FALSE`. Excel treats the values as boolean, and now so does PhpSpreadsheet.

When StringValueBinder is used, this becomes a tricky situation. The user wants the original strings preserved, including the case of all the letters. This PR changes the behavior of CSV reader as follows:
- If StringValueBinder is not in effect, convert to boolean.
- If StringValueBinder (actually any binder with method getBooleanConversion) is in effect, and the result of getBooleanConversion is true (which is the default in StringValueBinder), leave the value coming out of Csv Reader as the unchanged string.
- Otherwise, convert to boolean.

This should mean that there are no regression problems with StringValueBinder, while allowing PhpSpreadsheet to continue to match Excel in the default situation. No new settings are required.
2021-11-12 00:04:08 -08:00
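The three-way rule in the entry above (#2374) can be summarised as a small decision function. This is a Python sketch under stated assumptions, not the Csv Reader's code: `StringBinderStub`, `read_csv_cell` and `get_boolean_conversion` are hypothetical stand-ins for a value binder and its `getBooleanConversion` method.

```python
from typing import Optional, Union


class StringBinderStub:
    """Hypothetical stand-in for a binder exposing a boolean-conversion flag."""

    def __init__(self, boolean_conversion: bool = True) -> None:
        self._boolean_conversion = boolean_conversion

    def get_boolean_conversion(self) -> bool:
        return self._boolean_conversion


def read_csv_cell(text: str, binder: Optional[object] = None) -> Union[str, bool]:
    """Apply the rule: keep the string only if the binder asks for it."""
    if text.upper() not in ("TRUE", "FALSE"):
        return text  # not a boolean-looking value; nothing to decide
    getter = getattr(binder, "get_boolean_conversion", None)
    if callable(getter) and getter():
        return text  # binder wants the original string, case preserved
    return text.upper() == "TRUE"  # default: match Excel and convert


print(read_csv_cell("True"))
print(read_csv_cell("True", StringBinderStub()))
print(read_csv_cell("false", StringBinderStub(boolean_conversion=False)))
```

Checking for the method rather than for a specific class mirrors the note that any binder offering the method is honoured, not only StringValueBinder.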