Fix#2488. When Excel sees this situation, it leaves the value of the cell as null rather than casting to the specified DataType. It doesn't really make sense to change setValueExplicit to adopt this convention; it should be sufficient to recognize the situation in the Reader and act there. The same sort of situation might apply to strings, but I don't see any practical difference between null string and null even if so.
* Refinement for XIRR
Fix#2469. The algorithm used for XIRR is known not to converge in some cases, some of which are because the value is legitimately unsolvable; for others, using a different guess might help.
The algorithm uses continual guesses at a rate to hopefully converge on the solution. The code in Python package xirr (https://github.com/tarioch/xirr/) suggests a refinement when this rate falls below -1. Adopting this refinement solves the problem for the data in issue 2469 without any adverse effect on the existing tests. My thanks to @tarioch for that refinement.
The data from 2469 is, of course, added to the test cases. The user also mentions that an initial guess equal to the actual result doesn't converge either. A test is also added to confirm that that case now works.
The test cases are changed to run in the context of a spreadsheet rather than by direct calls to XIRR calculation routine. This revealed some data validation errors which are also cleaned up with this PR. This suggests that other financial tests might benefit from the same change; I will look into that.
* More Unit Tests
From https://github.com/RayDeCampo/java-xirr/blob/master/src/test/java/org/decampo/xirr/XirrTest.javahttps://github.com/tarioch/xirr/blob/master/tests/test_math.py
Note that there are some cases where the PHP tests do not converge, but the non-PHP tests do. I have confirmed in each of those cases that Excel does not converge, so the PhpSpreadsheet results are good, at least for now. The discrepancies are noted in comments in the test member.
Fix#2373. Excel can handle DateTime/Date/Time as a string if the datatype of the cell is set to "d". The string is, apparently, supposed to follow the ISO8601 spec. Openpyxl can be configured to generate a file with such values, so I've added support and set up unit tests. Excel, naturally, converts such a string input into its numeric representation of the date/time stamp. So will PhpSpreadsheet, so a call to setValueExplicit specifying Date format will actually see the cell wind up with Numeric format - there is no way (and no reason) for the Date type to 'stick'.
Replace mock tests with real ones when possible. The original tests are all still present; they just take place in a more representative scenario.
After this, there will be 4 remaining uses of mocking. Of these, 3 are needed for scenarios which are otherwise hard to test - WebServiceTest, CellsTest, and SampleCoverageTest. For the other one, AutoFilterTest, I just can't figure out what it's trying to accomplish, so have left it alone.
This change is almost entirely restricted to tests. There is a one-line change in src. When the first argument passed to OFFSET is null or nullstring, the returned value is currently 0. However, according to the documentation for Excel, it should be `#VALUE!`. The code is changed accordingly.
* General Style Specified in Uppercase in Input Xlsx
Fix#2450. Treat input style GENERAL as if it were expected upper/lowercase.
* Declare Method as Static
Surprised neither Phpstan nor Scrutinizer flagged this.
* Remove Duplicated Statement
Don't know why Scrutinizer didn't flag this the first time.
* XLSX Image background in comments
* XLSX-Image-Background-In-Comments (#1547)
* Test fixes, convertion for comment sizes from px to pt, fix for setting image sizes from zip, set image type
* Merge remote-tracking branch 'origin/XLSX-Image-Background-In-Comments' into XLSX-Image-Background-In-Comments
* Tests to check reloaded document.
Co-authored-by: Burkov Sergey
* Name Clashes Between Parsed and Unparsed Drawings
This is at least a partial fix for #2396 and #1767 (which has been around for a long time). PhpSpreadsheet renames drawing XML files when it reads them from a spreadsheet. However, when it writes unparsed drawing files, it uses the original names, which can result in a clash with the renamed files. The solution in this PR is to write the unparsed files using the same renaming convention as the the others.
This is an incredibly simple fix, basically a one-line change, for such a long-lived problem. It is conceivable that this PR breaks a more sophisticated file than I have come across, e.g. with multiple unparsed files associated with a single worksheet. However, this PR does fix at least part of the problem for both issues, and causes no regression issues. The changed code was covered in only 2 tests - Reader/XlsxTest testLoadSaveWithEmptyDrawings and Writer/Xlsx/UnparsedDataTest testLoadSaveXlsxWithUnparsedData.
2396 is covered by a new test Unparsed2396Test. I had trouble figuring out what to test for 1767. Since it is a problem that becomes evident only when the output file is opened in Excel, I added a new sample to cover it.
* Sloppy Errors
I neglected to run php-cs-fixer and phpstan, and it bit me.
* Scrutinizer
It's not as good as Phpstan at recognizing problems that can't happen due to previous assertions.
* Scrutinizer Again
It can be really stupid sometimes.
* Handle a wildcard match that contains a forward slash in the pattern by adding / to the delimiter list of preg_quote
* Fix SUMIF doing a wildcard match on empty cells (NULL)
* Fix compare logic to return false when value is an empty string or NULL (Verified against LibreOffice SUMIF and MATCH handling of empty cells)
Fix#2387. Fix#2075. There was substantial refactoring of Writer Xlsx styles in 18.0. An existing static property `$theme` was intended to be shared by both Writer Xlsx and the new Writer Xlsx Styles. However, the initialization of the property in the latter happened later than it should have. This PR makes that initialization happen as soon as the theme has been read. Also, declaring that property as static seems questionable; I have made it an instance member. This small re-factoring makes it possible to now support Themes in tab colors.
Since this PR changes Reader/Xlsx/Styles, add type-hinting throughout that module to eliminate Phpstan/Scrutinizer problems. I also removed method readStyle from Reader/Xlsx, since it was essentially duplicated in Reader/Xlsx/Styles. And I added a small number of tests to ensure that Styles is 100% covered. All of this is necessary in preparation for Namespacing phase 2.
* Support Data Validations in More Versions of Excel
Attempt to deal with #2368, this time for good. Some deleted code was accidentally restored just before release 19, causing errors in spreadsheets with Data Validations. PR #2369 removed the duplicated code, and the fix was confirmed in current versions of Excel for Windows, Google sheets, and other versions of Excel. However, there were problems reported in earlier version of Excel for Windows, and some, versions of Excel for Mac, not all but including a recent one. This change, which is simpler than the original (no need for extLst) fix for DataValidations, is tested with Excel 2007 and Excel 2003 as well as more recent versions. I do not have a Mac on which to test.
* Multiple Identical Data Validation Lists
Using the same Data Validation List in multiple places on a worksheet caused them all to be merged into the same range. This was because sqref was not part of the hash code; it is now, avoiding this problem.
* Must Write Data Validations Before Hyperlinks
See discussion in #2389.
* ZipArchive and "Inconsistent" Zip File
Fix#2362. I added test for zip file inconsistency when dealing with a particularly nasty PHP/libzip bug affecting zero-length files. However, we also now verify that the file starts with a valid zip signature, so the consistency test is not really needed, and, from what I've read on the web, isn't particularly useful. The file with a problem, for example, opens just fine with Excel and zip, despite Php reporting it as inconsistent (when asked to check consistency). So, remove the consistency check.
* Update Issue2362Test.php
Latest Phpstan does not allow cast from 'mixed' to 'string'.
* Update Issue2362Test.php
See the discussion in PR #2232 which came about 3 months after it was merged. It caused a problem in an unusual situation which did not come to light until the change was part of the new release version. The original PR changed PhpSpreadsheet's behavior to match Excel's for (not case sensitive) strings `TRUE` and `FALSE`. Excel treats the values as boolean, and now so does PhpSpreadsheet.
When StringValueBinder is used, this becomes a tricky situation. The user wants the original strings preserved, including the case of all the letters. This PR changes the behavior of CSV reader as follows:
- If StringValueBinder is not in effect, convert to boolean.
- If StringValueBinder (actually any binder with method getBooleanConversion) is in effect, and the result of getBooleanConversion is true (which is the default in StringValueBinder), leave the value coming out of Csv Reader as the unchanged string.
- Otherwise, convert to boolean.
This should mean that there are no regression problems with StringValueBinder, while allowing PhpSpreadsheet to continue to match Excel in the default situation. No new settings are required.
Example: right shift shared formula: IF(A$1=0,0,A1/A$1)
Expected value: IF(B$1=0,0,B1/B$1)
Actual value: IF(B$1=0,0,A1/B$1)
Similar behavior is observed when copying formulas vertically.
This issue occurs because a fixed and a non-fixed cell hit the same element of the $newCellTokens array by index $cellIndex
PR #1844 fixes it, but changes were requested. It has been almost 3 months and those changes have not been made. This PR replaces that one; it should be suitable for all supported releases of PHP through 8.1, and includes a formal unit test.
Fixes#1685Closes#1844
See issue #2331. Timestamp is expected in format yyyy-mm-dd (plus other information), with the expectation that month and day are 2 digits zero-filled on the left if needed. The user's file instead used a space rather than zero as filler. Although I don't know how the unexpected timestamp was created, it was easy enough to alter the timestamp in an otherwise normal spreadsheet, and use that file as a test case.
See issue #2123. HLOOKUP needs to do some conversions between column numbers and letters which it had not been doing.
HLOOKUP tests were performed using direct calls to the function in question rather than in the context of a spreadsheet. This contributed to keeping this error obscured even though there were, in theory, sufficient test cases. The tests are changed to perform in spreadsheet context. For the most part, the test cases are unchanged. One of the expected results was wrong; it has been changed, and a new case added to cover the case it was supposed to be testing.
After getting the HLOOKUP tests in order, it turned out that a test using literal arrays which had been succeeding now failed. The array constructed by the literals are considerably different than those constructed using spreadsheet cells; additional code was added to handle this situation.
This change was suggested by issue #2316. There was a problem reading Xlsx comments which appeared with release 18.0 but which was already fixed in master. So no source change was needed to fix the issue, but I thought we should at least add the test case to our unit tests.
In developing that case, I discovered that, although comment text was read correctly, there was a problem with comment author. In fact, there were two problems. One was new, with the namespacing changes - as in several other cases, the namespaced attribute `authorId` needed some special handling. However, another problem was much older - the code was checking `!empty($comment['authorId'])`, eliminating consideration of authorId=0, and should instead have been checking `isset`. Both problems are now fixed, and tested.
* isFormula Referencing Sheet With Space in Title
See issue #2304. User sets a cell to `ISFORMULA(cell)`, where `cell` exists on a sheet whose title contains a space, and receives an error. Coordinates are not being passed correctly to Functions::isFormula; in particular, the sheet name is not enclosed in apostrophes, e.g. `Sheet Name!A1` rather than `'Sheet Name'!A1`. (Note that sheet name was not specified in Cell; PhpSpreadsheet adds it before calling isFormula.) Sheets without embedded spaces (or other non-word characters) are handled correctly with or without apostrophes, but spaces require surrounding apostrophes.
As part of this investigation, I determined that Excel handles defined names and cell ranges in ISFORMULA (subject to spills), and that PhpSpreadsheet does not. It is changed to handle them. In the absence of spill support, it will use only the first cell in the range.
Existing tests for ISFORMULA used mocking unneccesarily. They are moved to a separate test member, and mocking is no longer used.
* PhpUnit and Jpgraph
35_Char_render.php had previously been a problem only for PHP8+. It is now a problem for PHP7.4, and will therefore be skipped all the time.
* Xlsx Reader Better Namespace Handling Phase 1 Second Bugfix
See issue #2301. The main problem in that issue had been introduced with 18.0 and had already ben fixed in master. However there was a subsequent problem that had been introduced in master, an undotted i uncrossed t with namespace handling. When using namespaces, need to call attributes() to access the attributes before trying to access them directly. Failure to do so in parseRichText caused fonts declared in Rich Text elements to be ignored.
* Add An Assertion
Addresses problem in 2301 that had already been fixed.
Fixes issue #2266. Writer/Xlsx fails when there is no longer a sheet which corresponds to the definition of a local defined name. The code is changed to drop such an orphaned name. Writer/Xls does not fail under the same cicrcumstances, so no correction is needed there. Writer/Ods fails in a different manner, and is corrected to no longer do so.
See issues #1432 and #2149. Data validations on an Xlsx worksheet can be specified in two manners - one (henceforth "internal") if a list is specified from the same sheet, and a different one (henceforth "external") if a list is specified from a different sheet. Xlsx worksheet reader formerly processed only the internal format; PR #2150 fixed this so that both would be processed correctly on read. However, Xlsx worksheet writer outputs data validators only in the internal format, and that does not work for external data validations; it appears, however, that internal data validations can be specified in external format.
This PR changes Xlsx worksheet writer to use only the external format. Somewhat surprisingly, this must come after most of the other XML tags that constitute a worksheet. It shares this characteristic (and XML tag) with conditional formatting. The new test case DataValidator2Test includes a worksheet which has both internal and external data validation, as well as conditional formatting.
There is some additional namespacing work supporting Data Validations that needs to happen on Xlsx reader. Since that is substantially unchanged with this PR, that work will happen in a future namespacing phase, probably phase 2. However, there are some non-namespace-related changes to Xlsx reader in this PR:
- Cell DataValidation adds support for a new property sqref, which is initialized through Xlsx reader using a setSqref method. If not initialized at write time, the code will work as it did before the introduction of this property. In particular, before this change, data validation applied to an entire column (as in the sample spreadsheet) would be applied only through the last populated row. In addition, this also allows a user to extend a Data Validation over a range of cells rather than just a single cell; the new method is added to the documentation.
- The topLeft property had formerly been used only for worksheets which use "freeze panes". However, as luck would have it, the sample dataset provided to demonstrate the Data Validations problem uses topLeft without freeze panes, slightly affecting the view when the spreadsheet is initially opened; PhpSpreadsheet will now do so as well.
It is worth noting issue #2262, which documents a problem with the hasValidValue method involving the calculation engine. That problem existed before this PR, and I do not yet have a handle on how it might be fixed.
* Fraction Formatting
See issue #2253. User's analysis was correct - leading zeros in the decimal portion were being stripped out, so 0.0625 and 0.625 were being treated the same. As it turns out, integers also aren't handled well (`0 0/1` anyone?). The latter problem had been hidden because caller tested for integer first and skipped call if true; but FractionFormatter::format is public and should work correctly regardless. All Phpstan baseline entries for FractionFormatter and NumberFormatter are eliminated. New test data is added; no need for changes to test code.
* Scrutinizer
Ensure result is string.
See issue #2239. Problem is dealt with at the source, by making sure that Reader Xls checks for use of 'GENERAL' rather than 'General'. There doesn't seem to be a reason to test in other places, or to test for other casing variants.
* Initial adjustments to Xlsx Reader for two possible locations for AutoFiter information, either on the sheet itself for older files, or in the tables/tableX file for more recent files
* Refactor AutoFilter Reader logic into separate methods; preparatory work toward the eventual goal of moving it into its own dedicated AutoFilterTables class
* Basic unit tests to verify that the Xlsx Reader can read both the older and Office365 variants of the files used to store AutoFilter structure
* Xls Reader Handle MACCENTRALEUROPE With or Without Hyphen
Fixes issue #549 and https://github.com/Maatwebsite/Laravel-Excel/issues/989 (which is the source of the new test file). Some systems accept MACCENTRALEUROPE as the name for the appropriate encoding, and some accept MAC-CENTRALEUROPE. I fortunately have access to at least one of each type, and have run the tests on each.
CodePage.php has an array of translations from codepage number to string. I now allow the value to itself be an array; if so, the code will test each in turn to see if it can be used in iconv. I did not go fishing for other similar problems. If such show up, they can be dealt with in the same manner as this one. I don't really expect others, since this is a problem not merely for Xls, but, even then, it applies only to BIFF5 and earlier.
I also moved XlsTest from Reader to Reader/Xls.
* Cache Successful Result For Future Use
Per suggestion from @MarkBaker
Fix for issue #1897.
The existing hashing code seems to work correctly almost all the time, but there are exceptions. It is replaced by an exact implementation of the spec, including a link to the spec in the comments. Cases known to fail are added to the unit test suite.
The spec expects the string to be at most 255 bytes (yes, bytes not characters). The program had permitted any length; it will now throw an exception when the maximum length is exceeded.
Xls does not support any hashing algorithm except basic. The Xls writer had, nevertheless, accepted the results of any of the other possible algorithms. This leads to (a) a worksheet that can't be unprotected, and (b) deprecation notices during the write (because it is using hexdec, which expects only hex characters, and the other algorithms generate non-hex characters). I have changed Xls writer to ignore passwords generated by other algorithms. An alternative would be to have the password hasher generate both an algorithmic password (for use by Xlsx) and a basic password (for use by Xls); I think that is too complex a solution, but can look into it if you think it worthwhile.
I do not see any current support for Worksheet passwords in ODS Reader or Writer. I did not add support in this PR.
I added a new test to confirm the password for reading a spreadsheet is consistent with the one used for writing it. As you can see from the comments for the new test, it had an unusual problem with a somewhat unusual solution.
* Xlsx Reader Better Namespace Handling Phase 1 Try2
This is a replacement for #2088, which has run into merge conflicts. I will close that PR in the near future, however the comments in that PR may prove useful for this one. While that PR has been in draft status all along, I am marking this one as ready. I will gladly add additional tests (and, of course, make code changes) that anyone has to suggest, but, with my most recent test files which I will describe in a separate comment, I have no further ideas on useful additions.
As mentioned in the earlier ticket, this is a risky change. But, as has been demonstrated, delaying it comes with its own set of risks. It would be helpful to have a temporary moratorium on changes to Reader/Xlsx until this change is merged.
The original commit message follows.
There have been a number of issues concerning the handling of legitimate but unexpected namespace prefixes in Xlsx spreadsheets created by software other than Excel and PhpSpreadsheet/PhpExcel.I have studied them, but, till now, have not had a good idea on how to act on them. A recent comment https://github.com/PHPOffice/PhpSpreadsheet/issues/860#issuecomment-824926224 in issue #860 by @IMSoP has triggered an idea about how to proceed.
Gnumeric Reader was recently changed to handle namespaces better. Using that as a model, this PR begins the process of doing the same for Xlsx. Xlsx is much larger and more complicated than Gnumeric, hence the need to tackle it in multiple phases. I believe that this PR handles all of:
- listWorkSheetNames
- listWorkSheetInfo. Note that there was a bug in this function which would cause it to count only used columns rather than all columns. That bug is corrected.
- active sheet
- selected cell and top left cell
- cell content (formulas, numbers, text)
- hyperlinks
- comments (partial - see below)
This PR does not address:
- styles
- images and charts
- VBA and ribbons
- many other items, I'm sure
The issue for non-standard namespacing till now has been the use of unexpected prefixes. While I was working on this change, @Lambik introduced issue #2067 PR #2068 which introduced a completely different problem - the use of unexpected URLs. That PR and the issue associated with it were quite well documented, including the supplying of a test file and tests for it. I asked if I could take a look to see if it could be integrated with my change, and the result seems to be yes, so those changes are also part of this PR.
While adding a comment to my test file, I discovered that Microsoft had added "threaded comments" as a new feature. I believe these are not yet supported by PhpSpreadsheet, and I am not going to add it, at least not now. I believe that, among other things, this will make identifying the author of a comment more difficult.
Although there are a number of Phpstan baseline changes as part of this PR, I did not attempt to resolve all Phpstan reports for Reader/Xlsx. Nor did I do anything to increase coverage. This change is already large and complex enough without those efforts.
Per suggestion from @MarkBaker.
WildcardMatch did not handle double tilde correctly. It has been changed to do so and its logic simplified (and commented).
Existing AutoFilter test covered this situation, but I added a test for MATCH as well.
* Read data validations for drop down list in another sheet.
* Add function testLoadXlsxDataValidationOfAnotherSheet() in class tests/PhpSpreadsheetTests/Reader/XlsxTest.php for unit test.
* Add sample xlsx for unit tests.
* Modifiy call function isset() for warnings.
* Additional assertions to ensure that the worksheet has been read correctly for DataValidation that references a list on a different worksheet
* This should resolve the phpstan issues
Co-authored-by: Mark Baker <mark@lange.demon.co.uk>
* Improve Identification of Samples in Coverage Report
The Phpunit coverage report currently contains bullet items like `PhpOffice\PhpSpreadsheetTests\Helper\SampleTest\testSample with data set "49"`. This extremely simple change takes advantage of Phpunit's ability to accept an array with keys which are either strings or integers, by using the sample filenames as the array keys rather than sequential but otherwise meaningless integers (e.g. `49` in the earlier cited item). The bullet item will now read `PhpOffice\PhpSpreadsheetTests\Helper\SampleTest\testSample with data set "Basic/38_Clone_worksheet.php"`.
* Fix for Issue 2158 (AverageIf Calculation Problem)
Issue #2158 reports an error calculating AverageIf because a function returns null rather than a string. There turn out to be several components to this problem:
- The nominal fix to the problem is to add some null-to-nullstring coercion in DatabaseAbstract.
- This fixes the error, but does not necessarily lead to the correct result because buildQuery treats values of null and null-string identically, whereas Excel does not. So change that to treat null-string as any other string.
- But that doesn't lead to the correct result either. That's because Functions/ifCondition recognizes a null string, but then continues to (over-)process it until it returns the wrong result. Fix this problem in conjunction with the other two, and we finally get the correct result.
A new unit test is added for AVERAGEIF, and new test cases are added for SUMIF. In each case, there are complementary tests for conditions of null and null-string, and the results agree with Excel. There may or may not be value in adding new tests to other functions, and I will be glad to do so for any functions which you care to identify, but no existing tests broke as a result of these changes.
* PHP8.1 Deprecation Passing Null to String Function
For each of the files in this PR, one or more statements can pass a null to string functions like strlower. This is deprecated in PHP8.1, and, when deprecated messages are enabled, causes many tests to error out. In every case, use coercion to pass null string rather than null.
* TextData - Minor Changes, Test Coverage
Per agreement on a previous push, I looked into standardizing the initialization of the TextData functions (like Engineering and MathTrig), with particular regard for avoiding multiple later null coercions. This simplifies the code quite a bit. This PR also increases coverage to 100% for all TextData modules. All entries in Phpstan baseline for non-deprecated TEXTDATA functions are removed. There were some minor bugfixes.
Whereas Excel (and Gnumeric) treat booleans when supplied as strings as 'TRUE' or 'FALSE', ODS treats them as '1' or '0'. Unlike Excel, ODS generally does not allow bool for int arguments; it does, however, allow them for FIND and SEARCH. ODS allows boolean for into for SUBSTITUTE even though Excel doesn't. ODS allows bool for string for NUMBERVALUE and VALUE even though Excel doesn't. ODS accepts 0 as an argument for CHAR; Excel doesn't. Most of this seems like random decisions on the part of the developers; I've done my best to follow the products in each case. There is a new test member devoted to ODS tests.
Gnumeric has an anomaly vis-a-vis the others - if length is supplied to LEFT/MID/RIGHT as null, Gnumeric treats it as 0 rather than 1.
All tests now take place in the context of a spreadsheet ...
Except for RETURNSTRING, which is not the implementation of an Excel function, and is referred to in the rest of PhpSpreadsheet only in the unit tests for itself. It should probably be deprecated, but that is not part of this PR, just in case there is some reason for it that I couldn't discern.
I have tried to make the first line of each doc block identify the Excel function name rather than its name in PhpSpreadsheet. I think it makes things more comprehensible.
Some tests call Settings::setLocale, but there was no Settings::getLocale. At the end of the tests which do it, they invoke setLocale('EN-US'), which, in a practical sense, is sufficient. However, in theory it would be better for them to get the current locale before changing it, then changing it back to the original when the time came. I have added getLocale and made the appropriate testing change.
The CHAR function took an interesting turn. One can set the value of a cell to, say, CHAR(2), the ASCII/UTF-8 representation of a control character, which is not legal in certain contexts. The only Reader/Writer that could handle this without problems is Xls, which deals with binary data all the time. However, if you tried to write it to Xlsx, Excel would not be able to open the resulting file because of what it considers an illegal character. I changed the Xlsx writer to escape such characters when writing the value of a string function. I did not make any other changes to the Xlsx writer - it seems to me that setting a cell to CHAR(2) is legitimate, but setting it to say `"\x02"` seems less likely to be legitimate, so the latter will still fail (although `="\x02"` should work). The Xlsx reader already supports the escape mechanism that I added to the writer.
CHAR control character and Ods - not supported by either Reader or Writer. I did not attempt to add this now. There is lots still missing from ODS, and this item just can't be a high priority amongst all of those.
CHAR control character and Csv - it is supported by reader and writer if the file has a csv extension. However, trying to guess the mime type without an extension - the control character makes mime_get_type guess application/octet-stream, and PhpSpreadsheet therefore thinks that Csv can't read it.
CHAR control character and Html. Actual use of the control character in the file is subject to the same problems as Xml (i.e. Xlsx and Ods). It wasn't terribly difficult to get the Html Writer to change `"\x02"` to "``". I believe that this is technically legal; however, DOMDocument.loadHTML rejects it as an illegal entity, and I am not convinced that it is wrong to do so, so I haven't changed the Html writer.
* Scrutinizer
Correct 3 minor errors.
* Fix for the BIFF-8 Xls colour mappings in the Reader
* Unit test for reading colours, writing hen rereading and ensuring that the RGB values have not changed
* Initia work on differentiating between empty arguments and null arguments passed to Excel functions
Previously we always passed a null value for an empty argument (i.e. where there was an argument separator in the function call without an argument.... PHP doesn't support empty arguments, so we needed to provide some value but then it wasn't possible to differentiate between a genuine null argument (either a literal null, or a null cell value) and the null that we were passing to represent an empty argument value.
This change evaluates empty arguments within the calculation engine, and instead of passing a null, it reads the signature of the required Excel function, and passes the default value for that argument; so now a null argument really does mean a null value argument.
* If the Excel function implementation doesn't accept any arguments; or once we reach a variadic argument, or try to pass more arguments than the method supports in its signature, then there's no point in checking for defaults, and to do so will lead to PHP errors, so break out of the default replacement loop
See issue #2116. Code for handling end of month (method couponFirstPeriodDate) needed a fix. Fixed it, confirmed it covered the reported issue with no regression problems. Then added some extra similar tests to all the callers of couponFirstPeriodDate, and ...
One new test, in COUPDAYSNC, does not agree with Excel. It also does not agree with LibreOffice. It does, however, agree with Gnumeric, and with my (hardly guaranteed) hand calculation of what the result should be. So, I'm going with it (and have added an appropriate comment to the test data). I'm glad to discuss the matter with anyone more familiar than I with how this is supposed to work - those 360-day years are killers.
* Additional unit tests for VLOOKUP() and HLOOKUP()
* Additional unit tests for CHOOSE()
* Unit tests for HYPERLINK() function
* Fix CHOOSE() test for spillage
As issue #2042 documents, SUM behaves differently with invalid strings depending on whether they come from a cell or are used as literals in the formula. SUM is not alone in this regard; COUNTA is another function within this behavior, and the solution to this one is modeled on COUNTA. New tests are added for SUM, and the resulting tests are duplicated to confirm correct behavior for both cells and literals.
Samples 16 (CSV), 17 (Html), and 21 (PDF) were adversely affected by this problem. 17 and 21 were immediately fixed, but 16 had another problem - Excel was not interpreting the UTF8 currency symbols correctly, even though the file was saved with a BOM. After some experimenting, it appears that the `sep=;` line generated by setExcelCompatibility(true) causes Excel to mis-handle the file. This seems like a bug - there is apparently no way to save a UTF-8 CSV with non-ASCII characters which specifies a non-standard separator which Excel will open correctly. I don't know if this is a recent change or if it is just the case that nobody noticed this problem till now. So, I changed Sample 16 to use setUseBom rather than setExcelCompatibility, which solved its problem. I then added new tests for setExcelCompatibility, with documentation of this problem.
* Defined names/formulae in ODS are prefixed by $$ when used in a formula; so we need to strip this out to fully convert them to an Excel formula
* Test for ODS Writer for DefinedNames
* First steps in the implementation of AutoFilters for ODS Reader and Writer, starting with reading a basic AutoFilter range (ignoring row visibility, filter types and active filters for the moment).
And also some additional refactoring to extract the DefinedNames Reader into its own dedicated class as a part of overall code improvement... on the principle of "when working on a class, always try to leave the library codebase in a better state than you found it"
* Provide a basic Ods Writer implementation for AutoFilters
* AutoFilter Reader Test
* AutoFilter Writer Test
* Update Change Log
* Pattern Fill style should default to 'solid' if there is a pattern fill style for a conditional; though may need to check if there are defined fg/bg colours as well; and only set a fill style if there are defined colurs
* Improved Support for INDIRECT, ROW, and COLUMN Functions
This should address issues #1913 and #1993. INDIRECT had heretofore not supported an optional parameter intended to support addresses in R1C1 format which was introduced with Excel 2010. It also had not supported the use of defined names as an argument. This PR is a replacement for #1995, which is currently in draft status and which I will close in a day or two.
The ROW and COLUMN functions also should support defined names. I have added that, and test cases, with the latest push. ROWS and COLUMNS already supported it correctly, but there had been no test cases. Because ROW and COLUMN can return arrays, and PhpSpreadsheet does not support dynamic arrays, I left the existing direct-call tests unchanged to demonstrate those capabilities.
The unit tests for INDIRECT had used mocking, and were sorely lacking (tested only error conditions). They have been replaced with normal, and hopefully adequate, tests. This includes testing globally defined names, as well as locally defined names, both in and out of scope.
The test case in 1913 was too complicated for me to add as a unit test. The main impediments to it are now removed, and its complex situation will, I hope, be corrected with this fix.
INDIRECT can also support a reference of the form Sheetname!localName when localName on its own would be out of scope. That functionality is added. It is also added, in theory, for ROW and COLUMN, however such a construction is rejected by the Calculation engine before passing control to ROW or COLUMN. It might be possible to change the engine to allow this, and I may want to look into that later, but it seems much too risky, and not nearly useful enough, to attempt to address that as part of this change.
Several unusual test cases (leading equals sign, not-quite-as-expected name definition in file, complex indirection involving concatenation and a dropdown list) were suggested by @MarkBaker and are included in this request.
Openpyxl can generate the xml tag `<patternFill/>`, possibly even as a default style. Excel has no problem with this, treating it as "fill none", but PhpSpreadsheet has a glitch because it treats it as "fill solid white". So, when PhpSpreadsheet loads and saves such a file, the result at first appears as if gridlines are disabled; in fact, the gridlines are merely invisible behind the cells with their solid white fill. This PR makes PhpSpreadsheet behave the same as Excel in this circumstance.
Co-authored-by: Mark Baker <mark@lange.demon.co.uk>
* Use validation classes rather than traits for Statistical functions, and some verification of nullable arguments
* Eliminate more of the issues resolved in phpstan baseline
* Additional unit tests and rationalisation for Financial Functions
* Providing a series of sample files for Financial functions
* Refactor the last of the existing Financial functions
* Some more unit tests with default assignments from null arguments
Co-authored-by: Adrien Crivelli <adrien.crivelli@gmail.com>
* More Financial function extracts, this time looking at the Periodic Cashflow functions
* Initial extract of Constant Periodic Interest and Payment functions
* Continue MathTrig Breakup - Completion!
Continuing the process of breaking MathTrip.php up into smaller classes. This round takes care of everything that was left:
- ABS
- DEGREES
- EXP
- RADIANS
- SQRT
- SQRTPI
- SUMSQ, SUMX2MY2, SUMX2PY2, SUMXMY2
The only notable logic change was that the 3 SUMX* functions had accepted arrays of unlike length; in that condition, they now return N/A, as Excel does. There had been no tests for this condition.
All the functions in MathTrig.php are now deprecated. Except for COMBIN, the test suite executes them only from MathTrig MovedFunctionsTest. COMBIN is still directly called by some Statistics Binomial functions which have not yet had the opportunity to be re-coded for the new location.
Co-authored-by: Mark Baker <mark@lange.demon.co.uk>
* Start extracting CashFlow functions from Financial, beginning with the simple Single Rate flows
* Extracting Variable Periodic and NonPeriodic CashFlow functions from Financial
* Some more unit tests for exception cases
* Extract Normal and Standard Normal Distributions from the Statistical Class
* Extract ZTest from the Statistical Class, and move it to the Standard Normal Distribution class
Additional unit tests for NORMINV()
* Extract LogNormal distribution functions from Statistical
* Continue MathTrig Breakup - Penultimate?
Continuing the process of breaking MathTrip.php up into smaller classes. This round takes care of about half of what is left, so perhaps one round after this one will finish the job:
- ARABIC
- COMBIN; also implemented COMBINA
- FACTDOUBLE
- GCD (which accepts and ignores empty cells as arguments, but returns VALUE if all the arguments are that way; LCM does the same)
- LOG_BASE, LOG10, LN
- implemented MUNIT
- MOD
- POWER
- RAND, RANDBETWEEN (RANDARRAY is too complicated to implement with this ticket)
As you can see from the description, there are some functions which were combined in a single class. When not combined, I adopted PowerKiki's suggestion of using "execute" as the function name.
Co-authored-by: Mark Baker <mark@lange.demon.co.uk>
* Resolution for [#Issue 1972](https://github.com/PHPOffice/PhpSpreadsheet/issues/1972) where format masks with a leading and trailing quote were always treated as literal strings, even when they masks containing quoted characters.
Also resolves issue with colour name case-sensitivity
* Extract a few more Distribution functions from Statistical; this time EXPONDIST() and HYPGEOMDIST()
* Extract the F Distribution (although only F.DIST() is implemented so far
* Updae docblocks
* PHPCS
* Extract Binomial Distribution functions from Statistical
Replace the old MS algorithm for CRITBINOM() (which has now been replaced with te BINOM.INV() function) with a brute force approach - I'll look to refine it later. The MS algorithm is no longer documented, and the implementation produced erroneous results anyway
* Exract the NEGBINOMDIST() function as well; still need to add a cumulative flag to support the additional argument for the newer NEGBINOM.DIST() function
* Rationalise validation of probability arguments
* Extract Percentile-type functions from Statistics (e.g. PERCENTILE(), PERCENTRANK(), QUARTILE(), and RANK())
* Unit test for PERCENTILE() with an empty (of numbers) dataset
* Difference in variance calculations between Excel/Gnumeric and Open/LibreOffice
* Simplify STDEV() function logic by remembering that STDEV() is simply the square root of VAR(), so we can simply use the VAR() calculaion rather than duplicating the basic logic... and also allow for the differences between Excel/Gnumeric and Open/LibreOffice
* Start implementing Newton-Raphson for the inverse of Statistical Distributions, starting with the two-tailed Student-T
* Additional unit tests and validations
* Use the new Newton Raphson class for calculating the Inverse of ChiSquared
* Extract Weibull distribution, and provide unit tests
* Extract ACCRINT() and ACCRINTM() Financial functions into their own class
Implement additional validations, with additional unit tests
Add support for the new calculation method argument for ACCRINT()
* Additional tests for Amortization functions
Continuing the process of breaking MathTrip.php up into smaller classes. This round takes care of all functions which might be an impediment to installing due to either uncovered code or "complexity":
- BASE
- FACT
- LCM
- MDETERM, MINVERSE, MMULT
- MULTINOMIAL
- PRODUCT
- QUOTIENT
- SERIESSUM
- SUM
- SUMPRODUCT
MathTrig and the members in directory MathTrig are now 100% covered. Many tests have been added, and some edge-case bugs are corrected. Some cases where PhpSpreadsheet had rejected numeric values stored as strings have been changed to accept them whenever Excel does; there had been no tests for that condition.
Boolean arguments are now accepted as arguments wherever Excel accpets them. Taking a cue from what has been done in Engineering, the parameter validation now happens in a routine which issues Exceptions for invalid values; this simplifies the code in the functions themselves. Thank you for doing that; I did not foresee how useful that was when I first looked at it.
Consistent with earlier changes of this nature, the versions in the MathTrig class remain, with a doc block indicating deprecation, and a stub call to the new routines.
All tests except for MINVERSE and MMULT are now handled in the context of a spreadsheet rather than a direct call to the calculation function which implements it. PhpSpreadsheet would need to handle dynamic arrays in order to test MINVERSE and MMULT in a spreadsheet context. Implementing that looks like it might be *very* challenging. It is not something I plan to look at, at least not in the near future.
One parsing problem turned up in the test conversion. It is in one of the SUMIF tests. It takes me to an area in Calculation where the comment says "I don't even want to know what you did to get here". It did not show up in the previous incarnation because, by using a direct call, the previous test managed to bypass the parsing. I have confirmed that this problem shows up in earlier releases of PhpSpreadsheet, so the changes in this PR did not cause it - they merely exposed it. I have left the test intact, but marked it "incomplete" for documentation purposes. I have not been able to get a handle on what's going wrong yet. I will probably open an issue on it if I can't resolve it soon. However, the test in question isn't a "real world" issue, and the error wasn't caused by this change, so I see no reason to delay this pending a resolution of the problem.
SUM had an idiosyncratic moment of its own. It had been ignoring non-numeric values, but Excel returns VALUE in that situation. So I changed it and wrote some new tests, which worked, but ... SUMIF uses several levels of indirection to get to SUM, and SUMIF *does* ignore non-numeric values, so a SUMIF test broke. SUM is a really simple function; the most practical approach seemed to be to clone it, with the string-accepting version being used by the Legacy version (which is called by SUMIF), and the non-string-accepting version being used in the Calculation Function table. That seems far easier and more practical than, for instance, adding a boolean parameter to the variable parameter list. As a follow-up, I will change SUMIF to explicitly call the appropriate new version, but I did not want to add that to this already large change.
SUM again - although it was fully covered beforehand, there was not a specific test member for it. There is now.
FACT had been coded to fail Gnumeric requests where the numeric argument has a decimal portion. However, Gnumeric does accept such an argument, and, unlike Excel and ODS, does not truncate it, but returns the result of a Gamma function call instead. This has been corrected.
When LCM included arguments which contained both 0 and a negative number, it returned 0 or NUM, whichever it found first. It is changed to always return NUM in that circumstance, as Excel does.
QUOTIENT had been documented as taking a variadic list of arguments. In fact, it takes exactly 2 - numerator and denominator - and the docblock and signature is fixed, even in the deprecated version.
The SERIESSUM docbock and signature are more accurate, even in the deprecated version. It is changed to ignore nulls, as Excel does, rather than return VALUE, and is one of the routines which previously rejected numbers in string form.
SUBTOTAL tests had used mocking for some reason. These are replaced with normal tests. And SUBTOTAL had a big surprise in store. That part of it which deals with hidden cells cares only whether the row is hidden, and doesn't care about the column's visibility.
I struggled with whether it should be SubTotal or Subtotal. I think the latter is correct, so that's how I proceeded. I don't think there are likely to be any other capitalization controversies.
* First steps toward refactoring Statistical Distributions into smaller classes: BETA() and GAMMA() (and related functions) to start with... they all need a lot of tidying up, and more testing; but it's a start
* Add basic datatype validations to Beta and Gamma Excel function implementations
* Switch to using a trait with the validation methods to provide easier sharing between distribution classes
* Additional unit tests for Beta and Gamma functions, including unhappy path for validations
* Extract ChiSquared functions
* Additional argument validation checks with unit tests for Chi Squared functions
* Extract Fisher
* Move MEDIAN() and MODE() to the Averages class
* Extract filters for Median and Mode for common usage
* New Bessel Algorithm, providing a higher degree of precision (12 decimal places) and still matching/exceeding MS Excel's precision across the range of values