Commit Graph

2975 Commits

Author SHA1 Message Date
oleibman cd84020693
Xlsx Reader Better Namespace Handling Phase 1 Try2 (#2173)
* Xlsx Reader Better Namespace Handling Phase 1 Try2

This is a replacement for #2088, which has run into merge conflicts. I will close that PR in the near future, however the comments in that PR may prove useful for this one. While that PR has been in draft status all along, I am marking this one as ready. I will gladly add additional tests (and, of course, make code changes) that anyone has to suggest, but, with my most recent test files which I will describe in a separate comment, I have no further ideas on useful additions.

As mentioned in the earlier ticket, this is a risky change. But, as has been demonstrated, delaying it comes with its own set of risks. It would be helpful to have a temporary moratorium on changes to Reader/Xlsx until this change is merged.

The original commit message follows.

There have been a number of issues concerning the handling of legitimate but unexpected namespace prefixes in Xlsx spreadsheets created by software other than Excel and PhpSpreadsheet/PhpExcel.I have studied them, but, till now, have not had a good idea on how to act on them. A recent comment https://github.com/PHPOffice/PhpSpreadsheet/issues/860#issuecomment-824926224 in issue #860 by @IMSoP has triggered an idea about how to proceed.

Gnumeric Reader was recently changed to handle namespaces better. Using that as a model, this PR begins the process of doing the same for Xlsx. Xlsx is much larger and more complicated than Gnumeric, hence the need to tackle it in multiple phases. I believe that this PR handles all of:
- listWorkSheetNames
- listWorkSheetInfo. Note that there was a bug in this function which would cause it to count only used columns rather than all columns. That bug is corrected.
- active sheet
- selected cell and top left cell
- cell content (formulas, numbers, text)
- hyperlinks
- comments (partial - see below)

This PR does not address:
- styles
- images and charts
- VBA and ribbons
- many other items, I'm sure

The issue for non-standard namespacing till now has been the use of unexpected prefixes. While I was working on this change, @Lambik introduced issue #2067 PR #2068 which introduced a completely different problem - the use of unexpected URLs. That PR and the issue associated with it were quite well documented, including the supplying of a test file and tests for it. I asked if I could take a look to see if it could be integrated with my change, and the result seems to be yes, so those changes are also part of this PR.

While adding a comment to my test file, I discovered that Microsoft had added "threaded comments" as a new feature. I believe these are not yet supported by PhpSpreadsheet, and I am not going to add it, at least not now. I believe that, among other things, this will make identifying the author of a comment more difficult.

Although there are a number of Phpstan baseline changes as part of this PR, I did not attempt to resolve all Phpstan reports for Reader/Xlsx. Nor did I do anything to increase coverage. This change is already large and complex enough without those efforts.
2021-06-25 09:05:49 +02:00
Owen Leibman 034ac5a7c7 Use Strict Checking for in_array
Per suggestion from Mark Baker.
2021-06-25 08:32:35 +02:00
Owen Leibman ab26cbcb6d Reader/Gnumeric vs. Scrutinizer
Just reviewing Scrutinizer's list of "bugs". There are 19 ascribed to me. For some, I will definitely take no action (e.g. use of bitwise operators in AND, OR, and XOR functions). However, where I can clean things up so that Scrutinizer is satisfied and the resulting code is not too contorted, I will make an attempt.

I believe this is the only one with which will involve more than 2 or 3 changes. It fixes 5 items ascribed to me, and 4 to others.
2021-06-25 08:32:35 +02:00
dependabot-preview[bot] 257204d8bc
Upgrade to GitHub-native Dependabot (#2044)
Co-authored-by: dependabot-preview[bot] <27856297+dependabot-preview[bot]@users.noreply.github.com>
Co-authored-by: Mark Baker <mark@lange.demon.co.uk>
2021-06-24 12:08:41 +02:00
jarrett jordaan 795992835f
When image source is a URL, store the URL for use during extraction. (#2072)
When image source is a link store the link.
Add url mutator.

Update section in documentation on image extraction.
2021-06-24 10:50:44 +02:00
Owen Leibman d0dd5b4594 Use WildcardMatch
Per suggestion from @MarkBaker.

WildcardMatch did not handle double tilde correctly. It has been changed to do so and its logic simplified (and commented).

Existing AutoFilter test covered this situation, but I added a test for MATCH as well.
2021-06-24 10:09:21 +02:00
Owen Leibman d88af46ab5 Scrutinizer
24 minor problems, almost all of them unused code in tests.
2021-06-24 10:09:21 +02:00
Owen Leibman a735afc088 Autofilter Part 2
Most of the remaining 32-bit-unsafe date handling that remains in PhpSpreadsheet is in AutoFilter. Cleaning this up demonstrated that there are a lot of problems with AutoFilter, and I will do it in two pieces. Part 1 was PR #2141 which I have just merged.

In this PR:
- Fix remaining 32-bit dates in filterTestInDateGroupSet.
- Also in some of the existing AutoFilter samples. Note that the comments in two of those said the filter was being set for the first day of each month, but the code specifies the last day - I have corrected the comments.
- Remove mocking in unit tests for AutoFilter in favor of 'real' tests.
- Code coverage is now 100% in all of AutoFilter, AutoFilter/Column, and AutoFilter/Common/Rule.
- No remaining AutoFilter(/Column(/Rule)) exceptions in Phpstan baseline.
- Documentation for escaping of asterisk, question mark, and tilde in text filters included spurious backslashes which are now removed.
- Text filter escaping of question mark did not work. There had been no unit tests for any text filtering.
- Likewise there had been no testing for TopTen.
- Above- and below- average filters were not working because they acquired their Calculation instance incorrectly. There had been no tests.
- Several unchanging private static arrays in Rule were changed to private const arrays.
- Clones are now tested.
- RuleTest is moved to same directory as other tests.
2021-06-24 10:09:21 +02:00
Mark Baker 5769885802
Changes to the default arguments for `htmlspecialchars()` and `html_entity_decode()` requires setting of the argument value explicitly to prevent changes in behaviour. (#2176)
Specifically, the default for these two functions has been changed from `ENT_COMPAT` to `ENT_QUOTES | ENT_SUBSTITUTE`

This PR configures the argument used for those functions in Settings, and then explicitly applies it everywhere they are used in the codebase.
2021-06-21 12:56:03 +02:00
oleibman d200c5363f
Locale Generator - Change to Use Unix Line Endings Even on Windows (#2174)
See issue #2172. The locale files are regenerated whenever the test suite is run.

The use of PHP_EOL in LocaleGenerator.php is awkward on Windows systems, since it causes git to think the file has changed. Change to use `"\n"` instead.
2021-06-19 10:20:16 +02:00
oleibman 33ef03d1fa
Merge pull request #2171 from oleibman/xlsxtest2
Move Reader Xlsx Tests from Reader to Reader/Xlsx Try 2
2021-06-18 07:42:47 -07:00
Owen Leibman 83c0f02c95 Move Reader Xlsx Tests from Reader to Reader/Xlsx Try 2
PR #2088 is having major merge problems. This is partly because it moves some tests from Reader to Reader/Xlsx. Making this move beforehand may help. Or it may make things worse, but they are already bad enough that I am contemplating redoing the PR. If I do that, having this done beforehand will make things easier.

This PR does nothing but move some tests. This will make it easier to test changes to Xlsx Reader without having to run each test individually, or without having to run tests for all the other readers at the same time.
2021-06-17 09:45:11 -07:00
MarkBaker 16d0075640 Change log updates 2021-06-17 16:35:17 +02:00
Mark Baker 6f88d1b54e
Only calculate column autosize for a cell if it contains data (#2167)
Only calculate column autosize for a cell if it contains data
2021-06-16 22:38:41 +02:00
Mark Baker d2076fefab
Additional unit tests for negative interest rates in the financial functions, and also tests using negative present/future value arguments (#2166) 2021-06-16 14:16:48 +02:00
Mark Baker ebdeb231eb
Allow negative interest rate in PPMT() Financial function (#2164) 2021-06-15 22:35:04 +02:00
Olivier TARGET 803737a893
Fix for #2149 / Read data validations for drop down list in another sheet. (#2150)
* Read data validations for drop down list in another sheet.

* Add function testLoadXlsxDataValidationOfAnotherSheet() in class tests/PhpSpreadsheetTests/Reader/XlsxTest.php for unit test.

* Add sample xlsx for unit tests.

* Modifiy call function isset() for warnings.

* Additional assertions to ensure that the worksheet has been read correctly for DataValidation that references a list on a different worksheet

* This should resolve the phpstan issues

Co-authored-by: Mark Baker <mark@lange.demon.co.uk>
2021-06-15 13:28:10 +02:00
oleibman 1e74282259
Fix for Issue 2158 (AverageIf Calculation Problem) (#2160)
* Improve Identification of Samples in Coverage Report

The Phpunit coverage report currently contains bullet items like `PhpOffice\PhpSpreadsheetTests\Helper\SampleTest\testSample with data set "49"`. This extremely simple change takes advantage of Phpunit's ability to accept an array with keys which are either strings or integers, by using the sample filenames as the array keys rather than sequential but otherwise meaningless integers (e.g. `49` in the earlier cited item). The bullet item will now read `PhpOffice\PhpSpreadsheetTests\Helper\SampleTest\testSample with data set "Basic/38_Clone_worksheet.php"`.

* Fix for Issue 2158 (AverageIf Calculation Problem)

Issue #2158 reports an error calculating AverageIf because a function returns null rather than a string. There turn out to be several components to this problem:
- The nominal fix to the problem is to add some null-to-nullstring coercion in DatabaseAbstract.
- This fixes the error, but does not necessarily lead to the correct result because buildQuery treats values of null and null-string identically, whereas Excel does not. So change that to treat null-string as any other string.
- But that doesn't lead to the correct result either. That's because Functions/ifCondition recognizes a null string, but then continues to (over-)process it until it returns the wrong result. Fix this problem in conjunction with the other two, and we finally get the correct result.

A new unit test is added for AVERAGEIF, and new test cases are added for SUMIF. In each case, there are complementary tests for conditions of null and null-string, and the results agree with Excel. There may or may not be value in adding new tests to other functions, and I will be glad to do so for any functions which you care to identify, but no existing tests broke as a result of these changes.
2021-06-15 09:54:57 +02:00
oleibman 4f06d84248
TextData - Minor Changes, Test Coverage (#2151)
* PHP8.1 Deprecation Passing Null to String Function

For each of the files in this PR, one or more statements can pass a null to string functions like strlower. This is deprecated in PHP8.1, and, when deprecated messages are enabled, causes many tests to error out. In every case, use coercion to pass null string rather than null.

* TextData - Minor Changes, Test Coverage

Per agreement on a previous push, I looked into standardizing the initialization of the TextData functions (like Engineering and MathTrig), with particular regard for avoiding multiple later null coercions. This simplifies the code quite a bit. This PR also increases coverage to 100% for all TextData modules. All entries in Phpstan baseline for non-deprecated TEXTDATA functions are removed. There were some minor bugfixes.

Whereas Excel (and Gnumeric) treat booleans when supplied as strings as 'TRUE' or 'FALSE', ODS treats them as '1' or '0'. Unlike Excel, ODS generally does not allow bool for int arguments; it does, however, allow them for FIND and SEARCH. ODS allows boolean for into for SUBSTITUTE even though Excel doesn't. ODS allows bool for string for NUMBERVALUE and VALUE even though Excel doesn't. ODS accepts 0 as an argument for CHAR; Excel doesn't. Most of this seems like random decisions on the part of the developers; I've done my best to follow the products in each case. There is a new test member devoted to ODS tests.

Gnumeric has an anomaly vis-a-vis the others - if length is supplied to LEFT/MID/RIGHT as null, Gnumeric treats it as 0 rather than 1.

All tests now take place in the context of a spreadsheet ...

Except for RETURNSTRING, which is not the implementation of an Excel function, and is referred to in the rest of PhpSpreadsheet only in the unit tests for itself. It should probably be deprecated, but that is not part of this PR, just in case there is some reason for it that I couldn't discern.

I have tried to make the first line of each doc block identify the Excel function name rather than its name in PhpSpreadsheet. I think it makes things more comprehensible.

Some tests call Settings::setLocale, but there was no Settings::getLocale. At the end of the tests which do it, they invoke setLocale('EN-US'), which, in a practical sense, is sufficient. However, in theory it would be better for them to get the current locale before changing it, then changing it back to the original when the time came. I have added getLocale and made the appropriate testing change.

The CHAR function took an interesting turn. One can set the value of a cell to, say, CHAR(2), the ASCII/UTF-8 representation of a control character, which is not legal in certain contexts. The only Reader/Writer that could handle this without problems is Xls, which deals with binary data all the time. However, if you tried to write it to Xlsx, Excel would not be able to open the resulting file because of what it considers an illegal character. I changed the Xlsx writer to escape such characters when writing the value of a string function. I did not make any other changes to the Xlsx writer - it seems to me that setting a cell to CHAR(2) is legitimate, but setting it to say `"\x02"` seems less likely to be legitimate, so the latter will still fail (although `="\x02"` should work). The Xlsx reader already supports the escape mechanism that I added to the writer.

CHAR control character and Ods - not supported by either Reader or Writer. I did not attempt to add this now. There is lots still missing from ODS, and this item just can't be a high priority amongst all of those.

CHAR control character and Csv - it is supported by reader and writer if the file has a csv extension. However, trying to guess the mime type without an extension - the control character makes mime_get_type guess application/octet-stream, and PhpSpreadsheet therefore thinks that Csv can't read it.

CHAR control character and Html. Actual use of the control character in the file is subject to the same problems as Xml (i.e. Xlsx and Ods). It wasn't terribly difficult to get the Html Writer to change `"\x02"` to "`&#2;`". I believe that this is technically legal; however, DOMDocument.loadHTML rejects it as an illegal entity, and I am not convinced that it is wrong to do so, so I haven't changed the Html writer.

* Scrutinizer

Correct 3 minor errors.
2021-06-15 08:37:17 +02:00
oleibman 55b95201ab
Merge pull request #2141 from oleibman/moredatefilter
Autofilter Part 1
2021-06-14 20:28:03 -07:00
oleibman 9b6e4f9bac
Merge branch 'master' into moredatefilter 2021-06-14 20:13:36 -07:00
MarkBaker 66cd68daea Update Change log 2021-06-13 22:41:10 +02:00
Mark Baker 74b02fb31c
Fix for the BIFF-8 Xls colour mappings in the Reader (#2156)
* Fix for the BIFF-8 Xls colour mappings in the Reader
* Unit test for reading colours, writing hen rereading and ensuring that the RGB values have not changed
2021-06-13 21:46:49 +02:00
oleibman b98b9c761c
Improve Identification of Samples in Coverage Report (#2153)
The Phpunit coverage report currently contains bullet items like `PhpOffice\PhpSpreadsheetTests\Helper\SampleTest\testSample with data set "49"`. This extremely simple change takes advantage of Phpunit's ability to accept an array with keys which are either strings or integers, by using the sample filenames as the array keys rather than sequential but otherwise meaningless integers (e.g. `49` in the earlier cited item). The bullet item will now read `PhpOffice\PhpSpreadsheetTests\Helper\SampleTest\testSample with data set "Basic/38_Clone_worksheet.php"`.
2021-06-11 22:29:44 +02:00
Mark Baker 9c2ce22505
This should fix png files with transparency in the Xls reader (#2155)
* This should fix png files with transparency in the Xls reader
2021-06-11 22:00:26 +02:00
Mark Baker 05466e99ce
Html import dimension conversions (#2152)
Allows basic column width conversion when importing from Html that includes UoM... while not overly-sophisticated in converting units to MS Excel's column width units, it should allow import without errors

Also provides a general conversion helper class, and allows column width getters/setters to specify a UoM for easier usage
2021-06-11 17:29:49 +02:00
Mark Baker a911e9bb7b
Calculation engine empty arguments (#2143)
* Initia work on differentiating between empty arguments and null arguments passed to Excel functions

Previously we always passed a null value for an empty argument (i.e. where there was an argument separator in the function call without an argument.... PHP doesn't support empty arguments, so we needed to provide some value but then it wasn't possible to differentiate between a genuine null argument (either a literal null, or a null cell value) and the null that we were passing to represent an empty argument value.

This change evaluates empty arguments within the calculation engine, and instead of passing a null, it reads the signature of the required Excel function, and passes the default value for that argument; so now a null argument really does mean a null value argument.

* If the Excel function implementation doesn't accept any arguments; or once we reach a variadic argument, or try to pass more arguments than the method supports in its signature, then there's no point in checking for defaults, and to do so will lead to PHP errors, so break out of the default replacement loop
2021-06-10 08:49:53 +02:00
oleibman a340240a3f
PHP8.1 Deprecation Passing Null to String Function (#2137)
For each of the files in this PR, one or more statements can pass a null to string functions like strlower. This is deprecated in PHP8.1, and, when deprecated messages are enabled, causes many tests to error out. In every case, use coercion to pass null string rather than null.
2021-06-05 15:14:23 +02:00
Mark Baker 19724e3217
Reader writer flags (#2136)
* Use of passing flags with Readers to identify whether speacial features such as loading charts should be enabled; no need to instantiate a reader and manually enable it before loading any more.

This is in preparation for supporting new "boolean" Reaer/Writer features, such as pivot tables

* Use of passing flags with Writers to identify whether speacial features such as loading charts should be enabled; no need to instantiate a writer and manually enable it before loading any more.

* Update documentation with details of changes to the StringValueBinder
2021-06-04 13:45:32 +02:00
MarkBaker 488701b748 Update documentation with details of changes to the StringValueBinder 2021-06-03 21:42:20 +02:00
MarkBaker 504ed9a87c Ok! Let's try again with phpstan now 2021-06-03 21:42:20 +02:00
MarkBaker af85f888be Now it's Scrutinizer's turn 2021-06-03 21:42:20 +02:00
MarkBaker da9fbd6c8d PHPCS appeasement again 2021-06-03 21:42:20 +02:00
MarkBaker 8cea3a94df Unit test for RichText object 2021-06-03 21:42:20 +02:00
MarkBaker 883115a079 phpstan appeasement 2021-06-03 21:42:20 +02:00
MarkBaker f135da0b11 phpstan appeasement 2021-06-03 21:42:20 +02:00
MarkBaker 8e41445fbd Allow more control over what non-string datatypes are converted to strings in the StringValueBinder 2021-06-03 21:42:20 +02:00
MarkBaker 642fc7dee7 PHPCS and PHStan appeasement 2021-06-03 21:42:20 +02:00
MarkBaker 3297f503c2 Addtional unit tests for vlue binders 2021-06-03 21:42:20 +02:00
MarkBaker 89eac0feed Further documentation update with reference to phone numbers with a leading sign 2021-06-03 21:42:20 +02:00
MarkBaker 85b9022f5b Updates to documentation concerning Value Binders
Fix a couple of checks on characters at positions within a string before checking the length of the string when the offset may not exist
2021-06-03 21:42:20 +02:00
Owen Leibman 4ab6439de1 Scrutinizer
Fix 5 minor errors.
2021-06-02 21:49:12 -07:00
oleibman 300ea99e93
Merge branch 'master' into moredatefilter 2021-06-02 20:50:05 -07:00
Owen Leibman b19fcef51f Autofilter Part 1
Most of the remaining 32-bit-unsafe date handling that remains in PhpSpreadsheet is in AutoFilter. Cleaning this up demonstrated that there are a lot of problems with AutoFilter, and I will do it in (probably two) pieces.

In this PR:
- Dynamic date processing was really wrong. There were no tests nor samples to exercise this code. (If you need details, you can try running the new sample against old code.) It is completely re-written.
- ThisYear/Month/Week/Quarter had been omitted.
- Rules such as AUTOFILTER_RULETYPE_DYNAMIC_MONTH_2 were almost correct, but showed some off-by-1 errors. I suspect these were timezone-related, and therefore more obvious to those of us far away from Greenwich.
- All Autofilter tests are moved to a single directory.
- The documentation suggested using null with the Dynamic Date setup, but Phpstan did not like that in my new tests/samples. Rather than change the doc block, I changed the documentation to suggest null string.
- I created a new sample to generate sheets using all the dynamic filters.
- I have added some new unit tests for each of the dynamic filters. I would love to be able to add some "time travel" tests because the dynamic nature of the filter makes most of the results change from day to day, which presents significant challenges in writing comprehensive unit tests (the same is true for code coverage). I was not able to find a good way to simulate time within PhpUnit, but the Linux 'faketime' package was extraordinarily easy and helpful in allowing me to confirm some edge cases. I had less satisfactory results with some Windows equivalents, but was still able to run some tests.
- Code coverage increases from below 60% to above 80%.

To be done:
- Some 32-bit unsafe dates remain in filterTestInDateGroupSet.
- Also in some of the existing AutoFilter samples.
- Study existing unit tests for AutoFilter which use mocking to see if they can/should be replaced with 'real' tests.
- Improve code coverage in AutoFilter, AutoFilter/Column, and AutoFilter/Common/Rule.
2021-06-02 20:46:14 -07:00
Adrien Crivelli e8ebf11707
Update phpDocumentor
This will include docs for `Spreadsheet` class which was incorrectly missing
2021-06-02 23:08:02 +09:00
MarkBaker 7f7ad28403 Prepare stub changelog for next release 2021-05-31 20:49:15 +02:00
MarkBaker 418cd304e8 Update changelog for release 2021-05-31 20:21:15 +02:00
MarkBaker 3168cbfb3e Select the correct TestCase 2021-05-31 13:59:51 +02:00
MarkBaker d51e4ec75a phpstan appeasement 2021-05-31 13:59:51 +02:00
MarkBaker 53362991a8 Additional unit tests to confirm behaviour when formulae reference cells within a merge range 2021-05-31 13:59:51 +02:00