Commit Graph

263 Commits

Author SHA1 Message Date
oleibman d27b6a672a
Cleanup 3 LookupRef Tests (#3097)
Scrutinizer had previously suggested annotations for 3 LookupRef tests, but it no longer accepts its own annotation for those cases. This PR cleans them up. ColumnsTest and RowsTest are extremely straightforward.

IndexTest is a bit more complicated, but only because, unlike the other two, it had no test which executed in the context of a spreadsheet. And, when I added those, I discovered a couple of bugs. INDEX always requires at least 2 parameters (row# is always required), but its entry in the function table specified 1-4 parameters, now changed to 2-4. And, omitting col# is not handled the same way as specifying 0 for col#, though the code had treated them identically. (The same would have been true for row# but, because it is now required, ...)
2022-10-01 07:05:54 -07:00
oleibman 53e0828d49
Sync composer.lock (#3075)
When I cloned this morning, composer gave me a message that the lock file was not up to date with the latest changes in composer.json. I do not understand why, but it suggested to run `composer update`, which I did. This led to a handful of problems with php-cs-fixer, all fixed with changes to doc-blocks, and phpstan (only Writer/Xls/Worksheet required a change to code). We would presumably have had these problems at the start of next month when dependabot did its thing, so fix them now.
2022-09-20 08:37:00 -07:00
oleibman 90422bf1d2
R1C1 Format and Internationalization, plus Relative Offsets (#3052)
* R1C1 Format and Internationalization, plus Relative Offsets

Fix #1704, albeit imperfectly. Excel's implementation of this feature makes it impossible to fix perfectly. I don't know why it was necessary to internationalize R1C1 in the first place - the benefits are so minimal,and the result is worksheets that break when opened in different locales. Ugh. I can't even find complete documentation about the format in different languages; I am using https://answers.microsoft.com/en-us/officeinsider/forum/all/indirect-function-is-broken-at-least-for-excel-in/1fcbcf20-a103-4172-abf1-2c0dfe848e60 as my definitive reference.

This fix concentrates on the original report, using the INDIRECT function; there may be other areas similarly affected. As with ambiguous date formats, PhpSpreadsheet will do a little better than Excel itself when reading spreadsheets with internationalized R1C1 by trying all possibilities before giving up. When it does give up, it will now return `#REF!`, as Excel does, rather than throwing an exception, which is certainly friendlier. Although read now works better, when writing it will use whatever the user specified, so spreadsheets breaking in the wrong locale will still happen.

There were some bugs that turned up as I added test cases, all of them concerning relative addressing in R1C1 format, e.g. `R[+1]C[-1]`. The regexp for validating the format allowed for minus signs, but not plus signs. Also, the relevant functions did not allow for passing the current cell address, which made relative addressing impossible. The code now allows these, and suitable test cases are added.

* Use Locale for Formats, but Not for XML

Implementing a suggestion from @MarkBaker to use the system locale for determining R1C1 format rather than looping through a set of regexes and accepting any that work. This is closer to how Excel itself operates. The assumption we are making is to use the first character of the translated ROW and COLUMN functions. This will not work for Russian or Bulgarian, where each starts with the same letter, but it appears that Russian, at least, still uses R1C1. So our algorithm will not use non-ASCII characters, nor characters where ROW and COLUMN start with the same letter, falling back to R/C in those cases. Turkish falls into that category. Czech uses an accented character for one of the functions, and I'm guessing to use the unaccented character in that case. Polish COLUMN function is NR.KOLUMNY, and I'm guessing to use K in that case.

The function that converts R1C1 references is also used by the XML reader *where the format is always R1C1*, not locale-based (confirmed by successfully opening in Excel an XML spreadsheet when my language is set to French). The conversion code now handles that distinction through the use of an extra parameter. Xml Reader Load Test is duplicated to confirm that spreadsheet is loaded properly whether the locale is English or French. (No, I did not add an INDIRECT function to the Xml spreadsheet.)

Tests CsvIssue2232Test and TranslationTest both changed locale without resetting it when done. That omission was exposed by the new code, and both are now corrected.

* OpenOffice and Gnumeric

OpenOffice and Gnumeric make it much easier to test with other languages - they can be handled with an environment variable. Sensibly, they require R and C as the characters for R1C1 notation regardless of the language. Change code to recognize this difference from Excel.

* Handle Output of ADDRESS Function

One other function has to deal with R1C1 format as a string. Unlike INDIRECT, which receives the string on input, ADDRESS generates the string on output. Ensure that the ADDRESS output is consistent with the INDIRECT input.

ADDRESS expects its 4th arg to be bool, but it can also accept int, and many examples on the net supply it as an int. This had not been handled properly, but is now corrected.

* More Structured Test

I earlier introduced a new test for relative R1C1 addressing. Rewrite it to be clearer.

* Add Row for This to Locale Spreadsheet

It took a while for me to figure out how it all works. I have added a new row (with English value `*RC`) to Translations.xlsx, in the "Lookup and Reference" section of sheet "Excel Functions". By starting the "function name" with an asterisk, it will not be confused with a "real" function (confirmed by a new test). This approach also gives us the flexibility to do something similar if another surprise case occurs in future; in particular, I think this is more flexible than adding this as another option on the "Excel Localisation" sheet. It also means that any errors or omissions in the list below will be handled as with any other translation problem, by updating the spreadsheet without needing to touch any code.

The spreadsheet has the following entries in the *RC row:
- first letter of ROW/COLUMN functions for da, de, es, fi, fr, hu, nl, nb, pt, pt_br, sv
- no value for locales where ROW/COLUMN functions start with same letter - bg, ru, tr
- no value for locales with a multi-part name for ROW and/or COLUMN - it, pl (I had not previously noted Italian as an exception)
- no value for locales where ROW and/or COLUMN starts with a non-ASCII character - cs (this would also apply to bg and ru which are already included under "same letter")
- it does nothing for locales which are defined on the "Excel Localisation" sheet but have no entries yet on the "Excel Functions" sheet (e.g. eu)

Note that all but the first bullet item will continue to use R/C, which leaves them no worse off than they were before this change.
2022-09-16 08:25:26 -07:00
oleibman 6c1651e995
Floating-point Equality in Two Tests (#3064)
* Floating-point Equality in Two Tests

Merging a change today, Git reported failures that did not occur during "normal" unit testing. The merge still succeeded, but ... The problem was an error comparing float values for equal, and the inequality occurred beyond the 14th decimal digit. Change the tests in question, which incidentally were not part of the merged changed, to use assertEqualsWithDelta.

* Egad - 112 More Precision-related Problems

Spread across 9 test members.
2022-09-14 09:05:01 -07:00
oleibman 252474c1bd
Scrutinizer Clean Up Tests (#3061)
* Scrutinizer Clean Up Tests

No source code involved.

* Scrutinizer Whack-a-mole

Fixed 17, added 10. Trying again.

* Simplify Some Tests

Eliminate some null assertions.

* Dead Code

Remove 2 statements.
2022-09-14 07:11:20 -07:00
oleibman adbda63912
Phpstan and Xlsx Reader (#3043)
Eliminate most Phpstan messages in Xlsx Reader. In combination with similar changes to Xlsx Writer, baseline will shrink to just over 3,000 lines.
2022-09-04 10:43:31 -07:00
MarkBaker f7a3534928 Implementation of the `VALUETOTEXT()` Excel Function 2022-08-27 16:23:55 +02:00
oleibman d55978cf93
Correct Namespaces in 11 Tests (#3020)
The setup for unit testing in Github in the "Install dependencies" log reports 11 members as "does not comply with with psr-4 autoloading standard." In each case, it is because the test namespace does not match the directory; in most cases, it was caused by the member being moved from one directory to another without changing the namespace declaration. No harm results from these problems, but there's also no reason to not correct them.
2022-08-20 19:58:43 -07:00
MarkBaker 269e9ba14d Bugfix for Issue #3013, Named ranges not usable as anchors in OFFSET function 2022-08-19 11:28:51 +02:00
MarkBaker 71b2c5ae89 Expand [PR #2964](https://github.com/PHPOffice/PhpSpreadsheet/pull/2964) to cover all arithmetic operators, not just multiplication, and both left and right side values 2022-08-07 13:59:26 +02:00
oleibman eb76c3c0ff
Code Coverage >90% (#2973)
No source code changes, just additional tests. FormulaParser appears unused, replaced by newer code in Calculation. However, it's a public interface, so probably shouldn't be deleted without first deprecating it. I have no strong feelings about whether that should happen. However, as long as it's part of the package, we may as well have some formal unit tests for it.
2022-08-06 17:56:30 -07:00
Jonathan Goode 7f0ca404fc
Ensure multiplication is performed on a non-array value (#2964)
* Ensure multiplication is performed on a non-array value

* Simplify formula
Numbers should be numbers

* Provide test coverage for SUM combined with INDEX/MATCH

* PHPStan
2022-08-06 17:28:26 -07:00
MarkBaker 4724c8f7e9 Initial work on the ARRAYTOTEXT() Excel Function 2022-08-04 22:43:36 +02:00
Mark Baker fe3ec55341
Merge branch 'master' into TextFunctions-New-TextSplit 2022-08-02 19:36:47 +02:00
MarkBaker 07f4fbe396 Initial implementation of the `TEXTSPLIT()` Excel Function 2022-08-02 19:05:43 +02:00
MarkBaker 290d0731fe Allow multiple delimiters for `TEXTBEFORE()` and `TEXTAFTER()` functions 2022-07-30 10:27:31 +02:00
Mark Baker 345c0ebdfc
Merge branch 'master' into TextFunctions-New 2022-07-28 19:14:06 +02:00
MarkBaker 88bfa98291 Initial Implementation of the new Excel TEXTBEFORE() and TEXTAFTER() functions 2022-07-28 16:05:18 +02:00
Jonathan Goode e460c82606
Fully flatten an array (#2956)
* Fully flatten an array

* Provide test coverage for CONCAT combined with INDEX/MATCH
2022-07-27 19:29:02 -07:00
oleibman 4bf4278a39
VLOOKUP Breaks When Array Contains Null Cells (#2939)
Fix #2934. Null is passed to StringHelper::strtolower which expects string. Same problem appears to be applicable to HLOOKUP.

I noted the following problem in the code, but will document it here as well. Excel's results are not consistent when a non-numeric string is passed as the third parameter. For example, if cell Z1 contains `xyz`, Excel will return a REF error for function `VLOOKUP(whatever,whatever,Z1)`, but it returns a VALUE error for function `VLOOKUP(whatever,whatever,"xyz")`. I don't think PhpSpreadsheet can match both behaviors. For now, it will return VALUE for both, with similar results for other errors.

While studying the returned errors, I realized there is something that needs to be deprecated. `ExcelError::$errorCodes` is a public static array. This means that a user can change its value, which should not be allowed. It is replaced by a constant. Since the original is public, I think it needs to stay, but with a deprecation notice; users can reference and change it, but it will be unused in the rest of the code. I suppose this might be considered a break in functionality (that should not have been allowed in the first place).
2022-07-17 06:27:56 -07:00
oleibman faf6d819c6
Keep Calculated String Results Below 32K (#2921)
* Keep Calculated String Results Below 32K

This is the result of an investigation into issue #2884 (see also PR #2913). It is, unfortunately, not a fix for the original problem; see the discussion in that PR for why I don't think there is a practical fix for that specific problem at this time.

Excel limits strings to 32,767 characters. We already truncate strings to that length when added to the spreadsheet. However, we have been able to exceed that length as a result of the concatenation operator (Excel truncates); as a result of the CONCATENATE or TEXTJOIN functions (Excel returns #CALC!); or as a result of the REPLACE, REPT, SUBSTITUTE functions (Excel returns #VALUE!). This PR changes PhpSpreadsheet to return the same value as Excel in these cases. Note that Excel2003 truncates in all those cases; I don't think there is a way to differentiate that behavior in PhpSpreadsheet.

However, LibreOffice and Gnumeric do not have that limit; if they have a limit at all, it is much higher. It would be fairly easy to use existing settings to differentiate between Excel and LibreOffice/Gnumeric in this respect. I have not done so in this PR because I am not sure how useful that is, and I can easily see it leading to problems (read in a LibreOffice spreadsheet with a 33K cell and then output to an Excel spreadsheet). Perhaps it should be handled with an additional opt-in setting.

I changed the maximum size from a literal to a constant in the one place where it was already being enforced (Cell/DataType). I am not sure that is the best place for it to be defined; I am open to suggestions.

* Implement Some Suggestions

... from @MarkBaker.
2022-07-04 08:30:46 -07:00
oleibman a89572107a
Handling of #REF! Errors in Subtotal, and More (#2902)
* Handling of #REF! Errors in Subtotal, and More

This PR derives from, and supersedes, PR #2870, submitted by @ndench. The problem reported in the original is that SUBTOTAL does not handle #REF! errors in its arguments properly; however, my investigation has enlarged the scope.

The main problem is in Calculation, and it has a simple fix. When the calculation engine finds a reference to an uninitialized cell, it uses `null` as the value. This is appropriate when the cell belongs to a defined sheet; however, for an undefined sheet, #REF! is more appropriate.

With that fix in place, SUBTOTAL still needs a small fix of its own. It tries to parse its cell reference arguments into an array, but, if the reference does not match the expected format (as #REF! will not), this results in referencing undefined array indexes, with attendant messages. That assignment is changed to be more flexible, eliminating the problem and the messages.

Those 2 fixes are sufficient to ensure that the original problem is resolved. It also resolves a similar problem with some other functions (e.g. SUM). However, it does not resolve it for all functions. Or, to be more particular, many functions will return #VALUE! rather than #REF! if this arises, and the same is true for other errors in the function arguments, e.g. #DIV/0!. This PR does not attempt to address all functions; I need to think of a systematic way to pursue that. However, at least for most MathTrig functions, which validate their arguments using a common method, it is relatively easy to get the function to propagate the proper error result.

* Arrange Array The Way call_user_func_array Wants

Problem with Php8.0+ - array passed to call_user_func_array must have int keys before string keys, otherwise Php thinks we are passing positional parameters after keyword parameters.

7 other functions use flattenArrayIndexed, but Subtotal is the only one which uses that result to subsequently pass arguments to call_user_func_array. So the others should not require a change. A specific test is added for SUM to validate that conclusion.

* Change Needed for Hidden Row Filter

Same as change made to Formula Args filter.
2022-06-25 22:08:32 -07:00
Mark Baker 9f172733c7
Merge branch 'master' into Issue-2874_Calculation-Engine-Quoted-Worksheet-Regexp 2022-06-17 13:57:28 +02:00
MarkBaker 02c6e8cfa8 Escape double quotes in worksheet names for column range and row range references 2022-06-17 13:34:20 +02:00
MarkBaker 23e207ccb4 Adjust calc engine identification of quoted worksheet names to fix bug with multiple quoted worksheets in formula 2022-06-17 13:08:06 +02:00
MarkBaker 00dae1bbda Relax validation on merge cells to allow input of a single cell 2022-06-10 01:15:57 +02:00
MarkBaker adf4531c99 Adjust worksheet regexps for CellRef and Row/Column Ranges for improved handling of quoted worksheet names 2022-06-09 19:54:25 +02:00
MarkBaker 716964eeec Resolve Calculation Engine bug with row and column ranges being identified as named ranges, adding overhead with the additional validation to process that named range 2022-04-15 14:53:33 +02:00
MarkBaker 483ef53855 Basic unit tests for formula parsing, in preparation for work on fully supporting the Union Operator, and for providing support for Structured References 2022-04-14 22:38:19 +02:00
MarkBaker 40730c6023 Handle defined names with the range operator.
It gets awkward when the defined name is for an actual range rather than for an individual named cell; because we need to manipulate the stack when that happens.

The code is ugly, and this is a rather simplistic approach, but it works as long as the named range is a cell, a cell range, or even a "chained" range - it won't work if we have union or intersection operators in the defined range - but it does provide formula support that never existed before.
2022-04-13 17:55:51 +02:00
MarkBaker 8c84ce4399 Support for chained range operators in the Calculation Engine (e.g. `A3:B1:C2` which gives an effective combined range of `A1:C3` or `A5:C10:C20:F1` which gives an effective combined range of `A1:F20`).
Next step will be allowing Named Cells/Ranges to be chained in the same way.
2022-04-13 12:29:59 +02:00
MarkBaker 2ab582a707 Bugfix to returned column indexes for FILTER() by row 2022-03-24 22:12:08 +01:00
MarkBaker 9019523efc Additional unit testing
And a quick bugfix for cell ranges applied to both sort functions and to FILTER()
2022-03-24 17:51:06 +01:00
MarkBaker 7f00049fe8 Initial work on the SORT() and SORTBY() Lookup/Reference functions
The code could stil do with some cleaning up, and better optimisation for memory usage; but all tests are passing... that's for full multi-level sorting (including direction), and allowing for correct sorting of sting/numeric datatypes.
2022-03-24 12:21:27 +01:00
oleibman c112802023
Eliminate Most Scrutinizer Problems in Test Suite (#2699)
* Eliminate Most Scrutinizer Problems in Test Suite

Mostly minor code changes, with some annotations.

* Missed 2 php-cs-fixer Problems

They should be fixed now.
2022-03-21 13:58:42 -07:00
MarkBaker f3bb61f3e4 Initial work implementing the FILTER() Lookup/Reference function
Tighten up on vector formats; and provide a couple of helper methods for testing row/column vectors
2022-03-20 18:01:30 +01:00
MarkBaker 75e45057f3 Fix scrutinizer issue complaint 2022-03-18 15:07:13 +01:00
MarkBaker c8cf193301 Initial work implementing the new UNIQUE() Lookup/Reference array function 2022-03-18 14:18:37 +01:00
oleibman 68158c8120
Phpstan Differences from Php7 to Php8, Again (#2665)
These changes have already been implemented twice, and been regressed twice. I'll try once more (with a different approach), then give up ...

As configured, Phpstan running under Php7 reports no errors. However, running under Php8, it reports 100 (!) errors. The vast majority of these are due to two reasons:
- renaming parameters in Php builtin functions in preparation for named parameters.
- using the new class GdImage rather than type resource as the argument type for many image-based functions.

Regardless of the cause, this will be a problem sooner or later. This PR is an attempt to get ahead of that problem. For source members, it mostly adds annotations or updates doc-blocks. Only 2 members have changes to executable code, and these are very minor - BitWise and Writer/Xlsx. For test members, all baseline errors are deleted and the code is fixed. Php7 and Php8 both report no errors with this configuration.
2022-03-11 23:28:30 -08:00
MarkBaker 044836b426 Unit test for the Calculation Engine debug log 2022-03-06 09:47:17 +01:00
oleibman e87be5e16d
Failing Test Requiring Too Much Precision (#2647)
The new array tests for IMCSC fail on my system because of a rounding error in the 14th (!) decimal position. This is not a real failure. Change the test to use only the first 8 decimal positions.
2022-03-04 00:11:04 -08:00
Mark Baker 579145eb3c
Wrap actual file loading in protected methods, while just providing a stub `load()` method in the BaseReader abstract. This is to allow a wrapper in the file loader logic across all Readers, with a try/catch block that can be used to handle special loader settings. (#2620) 2022-02-23 21:31:57 +01:00
Mark Baker 16953e27d8
Adjust Cell Reference regexp in Calculation Engine to handle worksheet names containing quotes (#2617)
* Capture of worksheet name in Calculation Engine cell references modified to handle apostrophe and quote marks, and made non-greedy to avoid ensure that multiple quoted worksheet names in a formula aren't all captured in one go

* Split some of the unit tests into separate test classes
2022-02-22 15:55:59 +01:00
Mark Baker 3c57d9e291
Fix for issue #2614 (#2615)
* Fix for issue #2614

* Unit test (in ISREF() tests) that covers issue #2614
2022-02-22 09:06:10 +01:00
Mark Baker c9f948bd91
Initial work on implementing Array-enabled for the HLOOKUP() and VLOOKUP() functions (#2611)
* Initial work on implementing Array-enabled for the HLOOKUP() and VLOOKUP() functions

* In the MATCH() function, we should also use `evaluateArrayArgumentsIgnore()` because the lookupvalue and matchType arguments can be array arguments, but lookupArray is always a dataset matrix
2022-02-21 19:56:21 +01:00
Mark Baker db21d043fe
Implementation of the ISREF() information function (#2613) 2022-02-21 18:47:23 +01:00
Mark Baker 0ee4d96576
Array-enable the ISFORMULA() function (#2610)
Implement Array-enabled for  ERROR.TYPE() function
Extract ERROR.TYPE() function tests into separate test file
Extract error function tests into separate test files

And thus complete the implemented Information functions
2022-02-20 19:32:13 +01:00
Mark Baker 35b65bef8c
First steps toward array-enabling the information functions (#2608)
* First steps toward array-enabling the information functions

Also includes moving unit tests out from Functions and into a separate, dedicated Information folder

* Resolve issue with IF(), branch pruning and calculation cache (ensure that we don't convert the if condition to a bool before we've tested to see if it evaluates to an error)
More refactoring
2022-02-20 16:46:25 +01:00
oleibman 9893926ff9
Add Test For NULL=0 (#2607)
Fix #2523. This isn't actually a fix; the problem was reported and confirmed for 1.21, but had already been fixed in master (and remains fixed in 1.22). This PR just adds a unit test for the original problem.
2022-02-19 13:12:44 -08:00
Mark Baker 71927f5591
Issue 2551 array enable lookup ref functions (#2606)
* Start work on array-enabling the Lookup and Reference functions

Requires a new method (`evaluateArrayArgumentsSubsetFrom()`) in the `ArrayEnabled` Trait to handle functions where the arguments that need special array handling are trailing rather than leading arguments
2022-02-19 18:49:01 +01:00