Commit Graph

1230 Commits

Author SHA1 Message Date
oleibman 81dd743d3f
Mpdf With Very Many Styles (#2434)
Fix #2432. Probably for memory reasons, PhpSpreadsheet divides its data into chunks when writing to Mpdf. However, if the first chunk has so many styles that the `body` tag is not included in the chunk, Mpdf will not handle it correctly. Code is changed to ensure that the first chunk always contains the body tag.

Because this error becomes evident only when opening the PDF file itself, it is difficult to write a test case. I have instead added a new sample file which meets the conditions which would have led to the error, and which can be examined to show that it is created correctly.
2021-12-07 10:46:21 -08:00
oleibman 5174a4ab15
Special Characters in Image File Name (#2416)
* Special Characters in Image File Name

Fix #2415. Fix #1470. If path name of image contains anything other than ASCII, or if it contains # or space or probably other exceptions, PhpSpreadsheet creates a file that Excel cannot, for whatever reason, read (it is valid xml). When adding an image to a spreadsheet, Excel does not retain the original path name; PhpSpreadsheet does, but probably shouldn't. It is changed to save the image file in the zip as the MD5 hash of the original path name. This produces a file that Excel can read. In addition, it ensures that, if the image is used in multiple places, it is saved in the Excel file only once.

Because this error becomes evident only when opening the file in Excel, it is difficult to write a test case. I have instead duplicated sample Basic/05... using image files whose names match the reported error conditions.
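
The naming scheme described above can be sketched roughly as follows. This is only an illustration: the helper name `zipMemberNameForImage` is hypothetical, and keeping the file extension is an assumption, not something the commit message states.

```php
<?php
// Hypothetical helper (not PhpSpreadsheet's actual API): derive a zip-safe
// member name from the original image path by hashing it with MD5.
function zipMemberNameForImage(string $originalPath): string
{
    // Keeping the extension is an assumption made for this sketch.
    $extension = strtolower(pathinfo($originalPath, PATHINFO_EXTENSION));
    $name = md5($originalPath);

    return $extension === '' ? $name : $name . '.' . $extension;
}

// The same path always maps to the same name, so an image used in several
// places would be stored only once; special characters never reach the zip.
$first = zipMemberNameForImage('images/my #1 logo é.png');
$second = zipMemberNameForImage('images/my #1 logo é.png');
```

Because the member name consists only of hexadecimal digits, none of the problem characters (`#`, spaces, non-ASCII) can appear in it.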

* Scrutinizer Minor Error

Remove some newly dead code.
2021-12-06 06:50:09 -08:00
oleibman 9c8eeef96d
Xlsx Writer Unhides Explicitly Hidden Row in Filter Range - Minor Breaking Change (#2414)
Fix #1641. Excel allows explicit hiding of row after filter is applied, but PhpSpreadsheet automatically invokes showHideRows on all auto-filters, preventing users from doing the same. Change to invoke showHideRows only if it hasn't already been invoked, or if filter criteria have changed since it was last invoked. Autofilters read in from an existing spreadsheet are assumed to be already invoked.

This is potentially a breaking change, probably a minor one. The conditions to set up 1641 are probably uncommon, but users who meet those conditions and are happy with the current behavior will see a break. The new behavior is closer to how Excel itself behaves. A new method `reevaluateAutoFilters` is added to `Spreadsheet`; this can be used to restore the old behavior if desired. The new method is added to the documentation, along with a description of how the situation described in 1641 is handled in Excel and PhpSpreadsheet.

While examining Excel's behavior, it became evident that, although a filter is applied to an entire column, it is actually applied only to the rows that are populated when the filter is defined, as can be verified by examining the XML definition of the filter. When you re-apply the filter, rows that have been added since are considered. It would be useful to provide PhpSpreadsheet with a method to do the same. I have added, and documented, `setRangeToMaxRow` to `AutoFilter`.
2021-12-05 07:26:24 -08:00
Michael Fürnschuß aa91abc0d8
Respect DataType in insertNewBefore (#2433)
`ReferenceHelper::insertNewBefore` copies data from one cell to another when adding/removing rows or columns.
It now also respects the data type set for that cell and does not use value binder again.
2021-12-04 08:31:38 -08:00
lucasnetau e01a81ec5e
Fixes #2430 (#2431)
* Handle a wildcard match that contains a forward slash in the pattern by adding / to the delimiter list of preg_quote
* Fix SUMIF doing a wildcard match on empty cells (NULL)
* Fix compare logic to return false when value is an empty string or NULL (Verified against LibreOffice SUMIF and MATCH handling of empty cells)
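
A minimal sketch of the three points above, assuming a simplified wildcard syntax (`*` and `?` only, no `~` escapes). The function names are hypothetical and do not correspond to the calculation engine's real code.

```php
<?php
// Hypothetical helper: turn an Excel-style wildcard pattern into a regex.
// Simplified: it ignores Excel's "~" escape character.
function wildcardToRegex(string $wildcard): string
{
    // Passing '/' as the second argument makes preg_quote escape the
    // delimiter too, so a slash in the pattern cannot end the regex early.
    $quoted = preg_quote($wildcard, '/');
    // preg_quote has escaped the wildcards; map them to regex equivalents.
    $quoted = str_replace(['\\*', '\\?'], ['.*', '.'], $quoted);

    return '/^' . $quoted . '$/i';
}

// Empty strings and NULL never match, mirroring the compare fix.
function wildcardMatches(string $wildcard, $value): bool
{
    if ($value === null || $value === '') {
        return false;
    }

    return preg_match(wildcardToRegex($wildcard), (string) $value) === 1;
}
```

Without the second argument to `preg_quote`, a pattern such as `a/b*` would produce an invalid regular expression, because the unescaped slash would be read as the closing delimiter.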
2021-12-04 08:07:18 -08:00
oleibman 580c741c8e
Improve PDF Support for Page Size and Orientation (#2410)
Fix #1691. PhpSpreadsheet allows the setting of different page size and orientation on each worksheet. It also allows the setting of page size and orientation on the PDF writer. It isn't clear which is supposed to prevail when the two are in conflict. In the cited issue, the user expects the PDF writer setting to prevail, and I tend to agree. Code is changed to do this, and handling things in this manner is now explicitly documented.

PhpSpreadsheet uses a default paper size of Letter, and a default orientation of Default (which Excel treats as Portrait). New static routines are added to change the default for sheets created subsequent to such calls. This could allow users to configure these defaults better for their environments. The new functions are added to the documentation.
2021-12-01 23:00:02 -08:00
oleibman d5825a6682
Read Spreadsheet with # in Name (#2409)
Fix #2405. Treat last, rather than first, `#` as separator between zip file name and member name, by finding it with strrpos rather than strpos.
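
The difference between the two lookups can be illustrated with a small sketch. The helper `splitZipReference` is hypothetical; only `strrpos` and `substr` are real PHP functions.

```php
<?php
// Hypothetical helper: split "archive#member" at the LAST '#', so a '#'
// inside the spreadsheet's own file name is kept as part of that name.
function splitZipReference(string $reference): array
{
    $position = strrpos($reference, '#');
    if ($position === false) {
        return [$reference, ''];
    }

    return [substr($reference, 0, $position), substr($reference, $position + 1)];
}

[$zipFile, $member] = splitZipReference('zip://report #2.xlsx#xl/workbook.xml');
```

With `strpos`, the split would have happened at the `#` inside the file name, leaving a truncated archive name and a meaningless member name.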
2021-11-30 07:39:50 -08:00
oleibman 290c18e4db
Xlsx Reader Theme Support Broken After 17.1 (#2403)
Fix #2387. Fix #2075. There was substantial refactoring of Writer Xlsx styles in 18.0. An existing static property `$theme` was intended to be shared by both Writer Xlsx and the new Writer Xlsx Styles. However, the initialization of the property in the latter happened later than it should have. This PR makes that initialization happen as soon as the theme has been read. Also, declaring that property as static seems questionable; I have made it an instance member. This small re-factoring makes it possible to now support Themes in tab colors.

Since this PR changes Reader/Xlsx/Styles, add type-hinting throughout that module to eliminate Phpstan/Scrutinizer problems. I also removed method readStyle from Reader/Xlsx, since it was essentially duplicated in Reader/Xlsx/Styles. And I added a small number of tests to ensure that Styles is 100% covered. All of this is necessary in preparation for Namespacing phase 2.
2021-11-26 09:38:09 -08:00
oleibman 3257ae5c90
Rounding in NumberFormatter (#2399)
Fix #2385. NumberFormatter is using sprintf on a float, and is seeing inconsistent rounding as a result (it will also occasionally result in `-0`). Change to round the number before presenting it to sprintf.
2021-11-26 09:05:35 -08:00
Vladislav Lyshenko 9dcfd9a5c2
Use standard temporary file for internal use of HTMLPurifier (#2383) 2021-11-23 23:45:43 +09:00
Roman Devman a2be574f36 Restore imperfect array formula values in xlsx writer
oleibman said:

The results of uncommenting the statements will often not be successful.
In Excel, if you enter `=MINVERSE({2,0;0,1})` into cell A1, you will get a
`dynamic array` (which we do not yet support) - A1 will contain 0.5, A2
and B1 will contain 0, and B2 will contain 1. There are also `spill`
implications for such a formula. The XML for these cells will be:

```xml
<row r="1" spans="1:2" x14ac:dyDescent="0.3">
<c r="A1" cm="1">
<f t="array" ref="A1:B2">MINVERSE({2,0;0,1})</f>
<v>0.5</v>
</c>
<c r="B1">
<v>0</v>
</c>
</row>
<row r="2" spans="1:2" x14ac:dyDescent="0.3">
<c r="A2">
<v>0</v>
</c>
<c r="B2">
<v>1</v>
</c>
</row>
```

I believe that the PhpSpreadsheet equivalent of doing this (with the statements
uncommented) is:

```php
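use PhpOffice\PhpSpreadsheet\Calculation\Calculation;
use PhpOffice\PhpSpreadsheet\Spreadsheet;
use PhpOffice\PhpSpreadsheet\Writer\Xlsx;

$spreadsheet = new Spreadsheet();
// Ask the calculation engine to return array results as arrays.
$calculation = Calculation::getInstance($spreadsheet);
$calculation::setArrayReturnType(Calculation::RETURN_ARRAY_AS_ARRAY);
$sheet = $spreadsheet->getActiveSheet();
$sheet->getCell('A1')->setValue('=MINVERSE({2,0;0,1})');
// Write the workbook so the generated XML can be inspected.
$writer = new Xlsx($spreadsheet);
$oufil = 'issue.2343.xlsx';
$writer->save($oufil);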
```

But our output file only fills in A1:

```xml
<row r="1" spans="1:1">
<c r="A1">
<f>MINVERSE({2,0;0,1})</f>
<v>0.5</v>
</c>
</row>
```

And, even though A1 has its correct value, note that its `f` tag does not have
a `t` attribute. This is because we never set any formula attributes, except
in Xlsx Reader (see next paragraph), so we do not encounter the `'array'`
condition for a formula newly added to a spreadsheet.

We do slightly better when we read the first file (as opposed to creating a new
spreadsheet), but we succeed only by accident. Because B1, A2, and B2 are
assigned values in the XML, all 4 cells will have the expected values. But they
are now independent of each other, not part of a dynamic array. When we write
this out, it is almost correct:

```xml
<row r="1" spans="1:2">
<c r="A1">
<f>MINVERSE({2,0;0,1})</f>
<v>0.5</v>
</c>
<c r="B1">
<v>0</v>
</c>
</row>
<row r="2" spans="1:2">
<c r="A2">
<v>0</v>
</c>
<c r="B2">
<v>1</v>
</c>
</row>
```

Again, the `f` tag has no `t` attribute, and it doesn't seem to matter whether we set
RETURN_TYPE_ARRAY or not. I think this particular aspect of the problem might be
relatively easy to fix.
2021-11-23 23:42:15 +09:00
Tiago Malheiro 0c93bbaaa3 Don't corrupt file when using chart with fill color
When the fill color property of `DataSeries.plotLabel` using a
DataSeriesValues on a line chart is set, the XLSX file written
is corrupted, and MSExcel2016 removes the drawing1.xml if forced open.

This problem was already documented on issue #589 along with a possible
solution. So all credits go to @madrussa. I am only submitting the PR.

Fixes #589
Closes #1930
2021-11-23 23:34:48 +09:00
Adrien Crivelli f46e3a1916
Use native typing for objects that were already documented as such 2021-11-18 11:08:29 +09:00
oleibman 2a12587f05
AutoFilter Improvements (#2393)
* AutoFilter Improvements

Fix issue #2378. The following changes are made:
- NotEqual tests must be part of a custom filter. Documentation has been changed to indicate that.
- Method setAndOr was replaced by setJoin some time ago. Documentation now reflects that change.
- Documentation now indicates that string filters are not case-sensitive, same as in Excel.
- Filters testing against numeric value now include a numeric test (not numeric for not equal, numeric for all others).
- String filter had previously treated everything as a test for "equal". It now handles "not equal" and the variants of "greater/less" with or without "equal".
- Documentation correctly stated that no more than 2 rules are allowed in a custom filter. Code did not enforce this restriction. It now does, throwing an exception if an attempt is made to add a third rule.
- Deleted a lot of comments in Rule.php to make it easier to see what is not yet implemented (between, begins with, etc.). I may take these on in future.
- Added a number of tests for the new functionality.

* Not Sure Why Phpstan Results Differ Local vs Github

Let's see if this change suffices.

* Phpstan Still

Not sure how to convince it. Let's try this.

* Phpstan Solved

Figured out the problem on my local machine. Expect this to work.
2021-11-16 07:46:07 -08:00
oleibman 52585a9d7c
Hyperlinks and Namespacing (#2391)
Fix #2389. Hyperlinks referring to cells in the spreadsheet itself are not being handled properly. This is the first namespacing regression identified for release 19. Usual cause and fix - need to take greater care with attributes than was previously the case.
2021-11-14 10:27:59 -08:00
oleibman 4ac0c47ac7
Support Data Validations in More Versions of Excel (#2377)
* Support Data Validations in More Versions of Excel

Attempt to deal with #2368, this time for good. Some deleted code was accidentally restored just before release 19, causing errors in spreadsheets with Data Validations. PR #2369 removed the duplicated code, and the fix was confirmed in current versions of Excel for Windows, Google Sheets, and other versions of Excel. However, problems were reported in earlier versions of Excel for Windows and in some, though not all, versions of Excel for Mac, including a recent one. This change, which is simpler than the original fix for DataValidations (no need for extLst), is tested with Excel 2007 and Excel 2003 as well as more recent versions. I do not have a Mac on which to test.

* Multiple Identical Data Validation Lists

Using the same Data Validation List in multiple places on a worksheet caused them all to be merged into the same range. This was because sqref was not part of the hash code; it is now, avoiding this problem.

* Must Write Data Validations Before Hyperlinks

See discussion in #2389.
2021-11-14 10:06:46 -08:00
oleibman f831f48b71
ZipArchive and "Inconsistent" Zip File (#2376)
* ZipArchive and "Inconsistent" Zip File

Fix #2362. I added a test for zip file inconsistency when dealing with a particularly nasty PHP/libzip bug affecting zero-length files. However, we also now verify that the file starts with a valid zip signature, so the consistency test is not really needed, and, from what I've read on the web, isn't particularly useful. The file with a problem, for example, opens just fine with Excel and zip, despite PHP reporting it as inconsistent (when asked to check consistency). So, remove the consistency check.

* Update Issue2362Test.php

Latest Phpstan does not allow cast from 'mixed' to 'string'.

* Update Issue2362Test.php
2021-11-12 01:18:57 -08:00
oleibman 2f1f3a19b8
Csv, Boolean, and StringValueBinder (#2374)
See the discussion in PR #2232 which came about 3 months after it was merged. It caused a problem in an unusual situation which did not come to light until the change was part of the new release version. The original PR changed PhpSpreadsheet's behavior to match Excel's for (not case sensitive) strings `TRUE` and `FALSE`. Excel treats the values as boolean, and now so does PhpSpreadsheet.

When StringValueBinder is used, this becomes a tricky situation. The user wants the original strings preserved, including the case of all the letters. This PR changes the behavior of CSV reader as follows:
- If StringValueBinder is not in effect, convert to boolean.
- If StringValueBinder (actually any binder with method getBooleanConversion) is in effect, and the result of getBooleanConversion is true (which is the default in StringValueBinder), leave the value coming out of Csv Reader as the unchanged string.
- Otherwise, convert to boolean.

This should mean that there are no regression problems with StringValueBinder, while allowing PhpSpreadsheet to continue to match Excel in the default situation. No new settings are required.
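
The three rules above can be sketched as a single decision function. This is an illustration only: `readCsvValue` and `StubStringBinder` are hypothetical names, and the real Csv Reader is organized differently.

```php
<?php
// Stand-in for a binder exposing getBooleanConversion(); the real
// StringValueBinder has more to it than this stub.
class StubStringBinder
{
    public function getBooleanConversion(): bool
    {
        return true;
    }
}

// Hypothetical helper following the three rules listed above.
function readCsvValue(string $value, ?object $binder)
{
    $keepString = $binder !== null
        && method_exists($binder, 'getBooleanConversion')
        && $binder->getBooleanConversion() === true;
    if ($keepString) {
        return $value; // preserve the original text, including its case
    }
    $upper = strtoupper($value);
    if ($upper === 'TRUE') {
        return true;
    }
    if ($upper === 'FALSE') {
        return false;
    }

    return $value;
}
```

Checking for the method rather than for a specific class matches the note that any binder with `getBooleanConversion` is treated the same way.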
2021-11-12 00:04:08 -08:00
oleibman ffdae8efac
Update Some Doc Block Annotations (#2351)
* Update Some Doc Block Annotations

See PR #2010. That PR was never completed, and has gone stale. However, it was correct in identifying a situation where the doc block was not entirely accurate. It did not go far enough - several closely-related methods have similar problems. This PR attempts to fix the original problem and its close relations. Aside from the doc block changes, there are very minor changes to executable code. It also changes some of the unit tests targeted at the methods in question to eliminate mocking in favor of 'real' tests.

* Change Method to Static

Otherwise Scrutinizer will complain, even though Phpstan doesn't.

* Scrutinizer

Various clean-up activities.

* Scrutinizer

@#&$(*#&$ Got complexity down from 53 to (I think) 50. Don't really know what the target is.

* Code Changes Suggested By Review

Some improvements suggested in review by @PowerKiKi.

* Update Cells.php

* Merge Conflict

A change to a parameter name caused several problems when trying to fix it on Github. Fixing it locally should do the trick.

* Merge Conflicts in Phpstan Baseline

PR #2382 made a large number of changes to Phpstan Baseline, some of which conflicted with the Phpstan Baseline changes in this PR. This should resolve them all.
2021-11-11 23:38:05 -08:00
Adrien Crivelli 5a704158e1 Declare key of generic ArrayObject 2021-11-10 10:26:09 +09:00
Owen Leibman 3b5e65e611 Preparation For php-cs-fixer Upgrade
Dependabot opened PR #2365 to upgrade php-cs-fixer from 2.19.2 to 3.2.1. Changes are required before that can be merged successfully. I believe all the necessary changes are in this PR.

One of the changes is to replace .php_cs.dist with .php-cs-fixer.dist.php. I have made those two identical for this PR so that there will be a meaningful delta listing. After this change is merged, master can be merged into 2365, which will hopefully pass all tests and be mergeable at that point. We can delete the unneeded file after that merge.

Spacing is changed in a handful of source members because of extra stringency in 3.2.1.
2021-11-08 00:03:11 +09:00
Adrien Crivelli 045db43d50 Rename even more parameters 2021-11-07 23:57:14 +09:00
Adrien Crivelli 1b877abe54 Rename more parameters 2021-11-07 23:57:14 +09:00
Adrien Crivelli 9d701d48ed Rename $pCell parameters 2021-11-07 23:57:14 +09:00
Zaytcev Ivan 89edc5b267
Bug in shared formulas: non-fixed cells are not updated if the formula has a fixed cell (#2354)
Example: right shift shared formula: IF(A$1=0,0,A1/A$1)
Expected value: IF(B$1=0,0,B1/B$1)
Actual value: IF(B$1=0,0,A1/B$1)

Similar behavior is observed when copying formulas vertically.
This issue occurs because a fixed and a non-fixed cell hit the same element of the $newCellTokens array by index $cellIndex
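
The expected behavior in the example above can be sketched as follows. The functions are hypothetical and much simpler than `ReferenceHelper`; the point is only that each reference is rewritten independently, so a fixed and a non-fixed reference to the same cell cannot overwrite each other.

```php
<?php
// Hypothetical helpers, not ReferenceHelper's real code: shift relative
// column references in a formula and leave absolute ("$") parts alone.
function columnToIndex(string $column): int
{
    $index = 0;
    foreach (str_split($column) as $letter) {
        $index = $index * 26 + (ord($letter) - 64); // 'A' => 1
    }

    return $index;
}

function indexToColumn(int $index): string
{
    $column = '';
    while ($index > 0) {
        $remainder = ($index - 1) % 26;
        $column = chr(65 + $remainder) . $column;
        $index = intdiv($index - 1, 26);
    }

    return $column;
}

function shiftColumns(string $formula, int $by): string
{
    // Each reference is handled on its own, so A$1 and A1 cannot clobber
    // one another. Simplified: names such as LOG10 would also match.
    return preg_replace_callback(
        '/(\$?)([A-Z]{1,3})(\$?)(\d+)/',
        function (array $m) use ($by): string {
            if ($m[1] === '$') {
                return $m[0]; // absolute column: unchanged
            }

            return indexToColumn(columnToIndex($m[2]) + $by) . $m[3] . $m[4];
        },
        $formula
    );
}
```

Calling `shiftColumns('IF(A$1=0,0,A1/A$1)', 1)` yields the expected value quoted in the commit message.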
2021-11-04 08:39:58 -07:00
oleibman b1c9f0a1bc
Update Doc Blocks to Discourage Use of Unix Timestamps (#2350)
* Update Doc Blocks to Discourage Use of Unix Timestamps

This was suggested by issue #2347. Unix timestamps have clear disadvantages compared with the alternate methods of supplying date and time to PhpSpreadsheet - DateTime objects, Excel date time fields, and strings. In particular, Unix timestamp is not Y2038-safe on a 32-bit system, and it reflects the time in UTC, which may come as a surprise to the end-user (as it did in the cited issue). The alternate methods do not come with such baggage. This change updates some doc blocks to note that Unix timestamps are discouraged (N.B. - not deprecated). No executable code is changed.

* Document in Code As Well as Commit Message

Per suggestion from @PowerKiKi.

* Missed One DocBlock

Including it now.

Co-authored-by: Adrien Crivelli <adrien.crivelli@gmail.com>
2021-11-04 08:12:47 -07:00
oleibman da5c2b1c22
Fix Xlsx Writer Data Validation (#2369)
Fix issue #2368. PR #2265 moved the place where data validations were written to the worksheet. PR #1694 was installed afterwards, and accidentally restored the original location, so validations are now being written twice.
2021-11-03 08:05:16 -07:00
oleibman ca5bd9b1d3
Xls Reader Array Offset Null (#2338)
See issue #2315. It is nominally solved by PR #2312, but that PR is completely unsuitable for merging. This one-line change is a replacement for that PR.

As with many problems of this type, it is not clear how to create a spreadsheet with this sort of harmless corruption in the wild. An example was supplied with the issue, and I have tested manually against it. However, the file is huge and not suitable for a formal unit test. I do not understand BIFF well enough to try and craft a suitable example on my own.

Co-authored-by: Adrien Crivelli <adrien.crivelli@gmail.com>
2021-11-02 09:16:47 -07:00
oleibman 26c26ae8df
Xlsx Writer Support for WMF Files (#2339)
PR #1844 fixes it, but changes were requested. It has been almost 3 months and those changes have not been made. This PR replaces that one; it should be suitable for all supported releases of PHP through 8.1, and includes a formal unit test.

Fixes #1685
Closes #1844
2021-11-01 13:28:51 +09:00
Adrien Crivelli 858e073063 Drop PHP 7.2
This follows our formal, published policy of supporting a PHP version
for only 6 months after its end of life.

See https://phpspreadsheet.readthedocs.io/en/latest/#php-version-support
2021-11-01 12:01:54 +09:00
Roland Eigelsreiter 4c4ae2634f
fixed null conversion for strToUpper (#2292)
fixed in the same way as it already has been done for strToLower

Co-authored-by: Adrien Crivelli <adrien.crivelli@gmail.com>
2021-11-01 11:45:21 +09:00
Adrien Crivelli 59c706ebe7
Fix warnings in PHP 8.1 2021-10-31 23:12:33 +09:00
Adrien Crivelli 8cef8c0cfb
Help Scrutinizer 2021-10-31 22:35:14 +09:00
Adrien Crivelli c3c93c56d6
Fix CI 2021-10-31 22:17:07 +09:00
Adrien Crivelli 69f633420b
Merge branch 'master' into PHP8-Sane-Property-Names 2021-10-31 15:25:01 +09:00
Einar 7635b3f91a
Optimize applyFromArray by caching existing styles (#1785)
Prevent calling clone and getHashCode when not needed
because these calls are very expensive.

When applying styles to a range of cells, we can cache the
styles we encounter along the way so we don't need to look
them up with getHashCode later.
2021-10-31 00:55:00 +09:00
Richard van Velzen 5b4b12f77b
Skip inner loops for empty/default read filter (#2223)
The optimization from #773 was not copied along in #1033, so restore it

Co-authored-by: Adrien Crivelli <adrien.crivelli@gmail.com>
2021-10-31 00:09:02 +09:00
Adrien Crivelli e550528c02
Lock our deps with our minimum PHP 7.2, instead of PHP 7.3 2021-10-30 12:54:26 +09:00
oleibman 241e82b284
Unexpected Format in Timestamp (#2332)
See issue #2331. Timestamp is expected in format yyyy-mm-dd (plus other information), with the expectation that month and day are 2 digits zero-filled on the left if needed. The user's file instead used a space rather than zero as filler. Although I don't know how the unexpected timestamp was created, it was easy enough to alter the timestamp in an otherwise normal spreadsheet, and use that file as a test case.
2021-10-23 18:23:53 -07:00
oleibman 3db0b2a2de
Corrections for HLOOKUP Function (#2330)
See issue #2123. HLOOKUP needs to do some conversions between column numbers and letters which it had not been doing.

HLOOKUP tests were performed using direct calls to the function in question rather than in the context of a spreadsheet. This contributed to keeping this error obscured even though there were, in theory, sufficient test cases. The tests are changed to perform in spreadsheet context. For the most part, the test cases are unchanged. One of the expected results was wrong; it has been changed, and a new case added to cover the case it was supposed to be testing.

After getting the HLOOKUP tests in order, it turned out that a test using literal arrays which had been succeeding now failed. The arrays constructed from literals are considerably different from those constructed using spreadsheet cells; additional code was added to handle this situation.
2021-10-23 17:59:42 -07:00
oleibman 9512f54cca
Corrections for Xlsx Read Comments (#2329)
This change was suggested by issue #2316. There was a problem reading Xlsx comments which appeared with release 18.0 but which was already fixed in master. So no source change was needed to fix the issue, but I thought we should at least add the test case to our unit tests.

In developing that case, I discovered that, although comment text was read correctly, there was a problem with comment author. In fact, there were two problems. One was new, with the namespacing changes - as in several other cases, the namespaced attribute `authorId` needed some special handling. However, another problem was much older - the code was checking `!empty($comment['authorId'])`, eliminating consideration of authorId=0, and should instead have been checking `isset`. Both problems are now fixed, and tested.
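
The older of the two problems comes down to how `empty` and `isset` treat zero. A minimal sketch, using a plain array in place of the XML attributes the reader actually works with:

```php
<?php
// Attribute values as they might arrive from the XML; an authorId of 0
// is perfectly valid and refers to the first author.
$comment = ['authorId' => '0'];

// empty() treats '0' (and 0) as empty, so this check rejects author 0.
$foundWithEmpty = !empty($comment['authorId']);

// isset() only asks whether the key exists and is not null.
$foundWithIsset = isset($comment['authorId']);
```

Here `$foundWithEmpty` is `false` while `$foundWithIsset` is `true`, which is why the check needed to change.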
2021-10-23 17:32:44 -07:00
ayacoo 1f08f160ad
[FEATURE] Ability to stream to an Amazon S3 bucket (#2326)
Related #2249
2021-10-16 09:11:03 -07:00
ayacoo 86a8bbdd63
[TASK] Lowercase calibri fontnames (#2325)
Related #2273
2021-10-16 08:50:36 -07:00
oleibman 4001c89aaa
isFormula Referencing Sheet With Space in Title (#2306)
* isFormula Referencing Sheet With Space in Title

See issue #2304. User sets a cell to `ISFORMULA(cell)`, where `cell` exists on a sheet whose title contains a space, and receives an error. Coordinates are not being passed correctly to Functions::isFormula; in particular, the sheet name is not enclosed in apostrophes, e.g. `Sheet Name!A1` rather than `'Sheet Name'!A1`. (Note that sheet name was not specified in Cell; PhpSpreadsheet adds it before calling isFormula.) Sheets without embedded spaces (or other non-word characters) are handled correctly with or without apostrophes, but spaces require surrounding apostrophes.
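
The quoting rule can be sketched as below. `qualifiedReference` is a hypothetical helper, and doubling embedded apostrophes follows Excel's convention rather than anything stated in this commit message.

```php
<?php
// Hypothetical helper: build a qualified reference, wrapping the sheet
// title in apostrophes when it contains non-word characters.
function qualifiedReference(string $sheetTitle, string $cell): string
{
    if (preg_match('/\W/', $sheetTitle) === 1) {
        // Doubling embedded apostrophes follows Excel's convention.
        $sheetTitle = "'" . str_replace("'", "''", $sheetTitle) . "'";
    }

    return $sheetTitle . '!' . $cell;
}
```

A title such as `Sheet Name` becomes `'Sheet Name'!A1`, while `Sheet1` is left unquoted.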

As part of this investigation, I determined that Excel handles defined names and cell ranges in ISFORMULA (subject to spills), and that PhpSpreadsheet does not. It is changed to handle them. In the absence of spill support, it will use only the first cell in the range.

Existing tests for ISFORMULA used mocking unnecessarily. They are moved to a separate test member, and mocking is no longer used.

* PhpUnit and Jpgraph

35_Char_render.php had previously been a problem only for PHP8+. It is now a problem for PHP7.4, and will therefore be skipped all the time.
2021-10-03 10:04:48 -07:00
Georgiy fea0e34e90
fix #1114 issue (#2308)
* fix #1114 issues

* fixed code style

* update for all version

* add test for bug 1114

* remove comment

Co-authored-by: georgio <georgiokot@gmail.cim>
2021-10-03 09:29:23 -07:00
oleibman 90b9decb8e
Xlsx Reader Better Namespace Handling Phase 1 Second Bugfix (#2303)
* Xlsx Reader Better Namespace Handling Phase 1 Second Bugfix

See issue #2301. The main problem in that issue had been introduced with 18.0 and had already been fixed in master. However, there was a subsequent problem that had been introduced in master, an undotted i or uncrossed t in namespace handling. When using namespaces, it is necessary to call attributes() before trying to access attributes directly. Failure to do so in parseRichText caused fonts declared in Rich Text elements to be ignored.
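
A small SimpleXML sketch of the pattern described above. The XML fragment is a made-up, minimal rich text run, not a file from the issue, and the code is not the actual `parseRichText` implementation.

```php
<?php
$ns = 'http://schemas.openxmlformats.org/spreadsheetml/2006/main';
$xml = simplexml_load_string(
    '<si xmlns="' . $ns . '"><r><rPr><rFont val="Arial"/></rPr><t>Hi</t></r></si>'
);

// Elements are reached through their namespace...
$rFont = $xml->children($ns)->r->rPr->rFont;

// ...but the unprefixed attribute belongs to no namespace, so it is
// fetched through attributes() rather than read directly off the element.
$attributes = $rFont->attributes();
$fontName = isset($attributes['val']) ? (string) $attributes['val'] : null;
```

Going through `attributes()` makes the lookup independent of the namespace used to reach the element.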

* Add An Assertion

Addresses problem in 2301 that had already been fixed.
2021-09-27 16:59:45 -07:00
oleibman cc14a48604
Permit CSV Delimiter to be Set to Null (#2288)
* Permit CSV Delimiter to be Set to Null

See issue #2287. A 1-character change. The delimiter variable is defined as nullable, and getDelimiter can return null; setDelimiter should follow suit.

* Scrutinizer Inanity

Are you sure the test always returns null?????
Yes, I'm sure, that's why it's part of the test.
Let's see if we can recode it and miss this "problem".
2021-09-15 12:40:03 -07:00
oleibman 4dd5c06c7b
Deleting Sheet with Local Defined Name (#2284)
Fixes issue #2266. Writer/Xlsx fails when there is no longer a sheet which corresponds to the definition of a local defined name. The code is changed to drop such an orphaned name. Writer/Xls does not fail under the same circumstances, so no correction is needed there. Writer/Ods fails in a different manner, and is corrected to no longer do so.
2021-09-15 12:14:13 -07:00
oleibman e02eab29f1
Validate Input to SetSelectedCells (#2280)
* Validate Input to SetSelectedCells

See issue #2279. User requests an enhancement so that you can set a Style on a Named Range. The attempt is failing because setting the style causes a call to setSelectedCells, which does not account for Named Ranges. Although not related to the issue, it is worth noting that setSelectedCells does nothing to attempt to validate its input.

The request seems reasonable, even if it is probably more than Excel itself offers. I have added code to setSelectedCells to recognize Named Ranges (if and only if they are defined on the sheet in question). It will throw an exception if the string passed as coordinates cannot be parsed as a range of cells or an appropriate Named Range, e.g. a Named Range on a different sheet, a non-existent named range, named formulas, formulas, use of sheet name qualifiers (even for the same sheet). Tests are, of course, added for all of those and for the original issue. The code in setSelectedCells is tested in a very large number of cases in the test suite, none of which showed any problems after this change.

* Scrutinizer

2 minor (non-fatal) corrections, including 1 where Phpstan and Scrutinizer have a different idea about return values from preg_replace.
2021-09-11 06:55:00 -07:00
oleibman bc9234e5a5
Process Comments in Sylk File (#2277)
Fixes issue #2276.
2021-08-26 11:56:13 -07:00
oleibman de5f450856
Data Validations Referencing Another Sheet (#2265)
See issues #1432 and #2149. Data validations on an Xlsx worksheet can be specified in two manners - one (henceforth "internal") if a list is specified from the same sheet, and a different one (henceforth "external") if a list is specified from a different sheet. Xlsx worksheet reader formerly processed only the internal format; PR #2150 fixed this so that both would be processed correctly on read. However, Xlsx worksheet writer outputs data validators only in the internal format, and that does not work for external data validations; it appears, however, that internal data validations can be specified in external format.

This PR changes Xlsx worksheet writer to use only the external format. Somewhat surprisingly, this must come after most of the other XML tags that constitute a worksheet. It shares this characteristic (and XML tag) with conditional formatting. The new test case DataValidator2Test includes a worksheet which has both internal and external data validation, as well as conditional formatting.

There is some additional namespacing work supporting Data Validations that needs to happen on Xlsx reader. Since that is substantially unchanged with this PR, that work will happen in a future namespacing phase, probably phase 2. However, there are some non-namespace-related changes to Xlsx reader in this PR:
- Cell DataValidation adds support for a new property sqref, which is initialized through Xlsx reader using a setSqref method. If not initialized at write time, the code will work as it did before the introduction of this property. In particular, before this change, data validation applied to an entire column (as in the sample spreadsheet) would be applied only through the last populated row. In addition, this also allows a user to extend a Data Validation over a range of cells rather than just a single cell; the new method is added to the documentation.
- The topLeft property had formerly been used only for worksheets which use "freeze panes". However, as luck would have it, the sample dataset provided to demonstrate the Data Validations problem uses topLeft without freeze panes, slightly affecting the view when the spreadsheet is initially opened; PhpSpreadsheet will now do so as well.

It is worth noting issue #2262, which documents a problem with the hasValidValue method involving the calculation engine. That problem existed before this PR, and I do not yet have a handle on how it might be fixed.
2021-08-24 08:58:38 -07:00
oleibman 710f9f17a7
Fraction Formatting (#2254)
* Fraction Formatting

See issue #2253. User's analysis was correct - leading zeros in the decimal portion were being stripped out, so 0.0625 and 0.625 were being treated the same. As it turns out, integers also aren't handled well (`0 0/1` anyone?). The latter problem had been hidden because caller tested for integer first and skipped call if true; but FractionFormatter::format is public and should work correctly regardless. All Phpstan baseline entries for FractionFormatter and NumberFormatter are eliminated. New test data is added; no need for changes to test code.
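
Why the leading zeros matter can be shown with a short sketch. The helper names are hypothetical and this is not `FractionFormatter`'s real code; it only demonstrates that the denominator must be derived from the full digit count.

```php
<?php
function greatestCommonDivisor(int $a, int $b): int
{
    while ($b !== 0) {
        [$a, $b] = [$b, $a % $b];
    }

    return $a;
}

// Hypothetical helper, not FractionFormatter's real code. $decimalDigits is
// the text after the decimal point, e.g. '0625' for 0.0625.
function decimalDigitsToFraction(string $decimalDigits): string
{
    // The denominator depends on the digit count INCLUDING leading zeros;
    // stripping them first would make '0625' look like '625'.
    $denominator = 10 ** strlen($decimalDigits);
    $numerator = (int) $decimalDigits;
    $divisor = greatestCommonDivisor($numerator, $denominator);

    return intdiv($numerator, $divisor) . '/' . intdiv($denominator, $divisor);
}
```

With the digits kept intact, `'0625'` reduces to `1/16` and `'625'` to `5/8`, so the two values are no longer confused.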

* Scrutinizer

Ensure result is string.
2021-08-18 11:09:37 -07:00
Alayn Gortazar d0076343c4
Fix Reading XLSX files without styles.xml throws an exception. (#2247)
* Fix Reading XLSX files without styles.xml throws an exception.

* Bugfix, debugging code removed

* Fix Reading XLSX files without styles.xml throws an exception (rethinked)

* Fix Reading XLSX files without styles.xml throws an exception (rethinked)

* Style fixes

* Fix Spreadsheet loaded without styles cannot be written

* Replaced test files for empty styles.xml testing
2021-08-16 05:05:32 -07:00
oleibman d7ac7021c6
Apache OpenOffice Creates Xls Using Wrong Case for Number Format General (#2242)
See issue #2239. Problem is dealt with at the source, by making sure that Reader Xls checks for use of 'GENERAL' rather than 'General'. There doesn't seem to be a reason to test in other places, or to test for other casing variants.
2021-08-08 08:24:03 -07:00
oleibman de230fa899
Html Reader Comments (#2235)
* Html Reader Comments

See issue #2234. Html Reader processes Comment as comment, then processes it as part of cell contents. Change to only do the first. Comment Test checks that comment read by Html Reader is okay, but neglects to check the value of the cell to which the comment is attached. Added that check.

* Disconnect Worksheets

... at end of test.
2021-08-05 08:40:13 -07:00
oleibman 0cd20f3099
Csv Handling of Booleans (and an 8.1 Deprecation) (#2232)
* Csv Handling of Booleans (and an 8.1 Deprecation)

PhpSpreadsheet writes boolean values to a Csv as null-string/1, and treats input values of 'true' and 'false' as if they were strings. On the other hand, Excel writes boolean values to a Csv as TRUE/FALSE, and case-insensitively treats a matching string as boolean on read. This PR changes PhpSpreadsheet to match Excel.
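
A rough sketch of the Excel-compatible behaviour described above, not the actual Csv Reader/Writer code. `csvWriteValue` and `csvReadValue` are hypothetical names used only for illustration.

```php
<?php

// Hypothetical helper: render a cell value for Csv output the way Excel does.
function csvWriteValue($value): string
{
    if (is_bool($value)) {
        return $value ? 'TRUE' : 'FALSE';
    }

    return (string) $value;
}

// Hypothetical helper: treat 'true'/'false' in any casing as a boolean on read.
function csvReadValue(string $value)
{
    $upper = strtoupper($value);
    if ($upper === 'TRUE') {
        return true;
    }
    if ($upper === 'FALSE') {
        return false;
    }

    return $value; // anything else stays a string
}

var_dump(csvWriteValue(false)); // string(5) "FALSE"
var_dump(csvReadValue('True')); // bool(true)
```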

A side-effect of this change is that it fixes behavior incorrectly reported as a bug in PR #2048. That issue was closed, correctly, as user error. The user had altered Csv Writer, including adding ```declare(strict_types=1);```; that declaration was the cause of the error. The "offending" statements, calls to strpbrk and str_replace, will now work correctly whether or not strict_types is in use.

And, just as I was getting ready to push this, the dailies for PHP 8.1 introduced a change deprecating auto_detect_line_endings. Csv Reader uses that setting; it allows it to process a Csv with Mac line endings, which happens to be something that Excel can do. As they say in https://wiki.php.net/rfc/deprecations_php_8_1, where the proposal passed without a single dissenting vote, "These newlines were used by “Classic” Mac OS, a system which has been discontinued in 2001, nearly two decades ago. Interoperability with such systems is no longer relevant." I tend to agree, but I don't know that we're ready to pull the plug yet. I don't see an easy way to emulate that functionality. For now, I have silenced the deprecation notices with at signs. I have also added a test case which will fail when support for that setting is pulled; this will give time to consider alternatives.

* Scrutinizer: Handling ini_set

This could be interesting. It doesn't like not handling an error condition for ini_set. Let's see if this satisfies it.
2021-08-04 07:00:17 -07:00
oleibman b9f6c70b86
New Looming Problems with PHP8.1 (#2231)
* New Looming Problems with PHP8.1

More deprecations. The following corrections are made in this PR:
- Calculation.php has a call to ctype_upper and apparently one of the samples manages to pass it an int. That function treats int differently from numeric strings, and that treatment is on the deprecation list. Enclosing the argument in quotes cannot cause a problem unless the int represents the ASCII value of an uppercase letter, which I cannot believe is the case; anyhow, if it is, the code will wind up with a nonsense result, e.g. if column is C and row is 1, the cell will be resolved as C1, but if column is int 67 (ASCII for C) and row is 1, the cell will be resolved as 671, not C1.
- Several Worksheet iterators need one or more functions to explicitly declare their return types. Thankfully, this does not seem to break earlier PHP versions.
- LocaleFloatsTest - see issue #1863. This was supposed to fail in PHP 8.0, but var_dump continued to support the old way (for 64-bit PHP only, not for 32-bit). PHP 8.1 appears to correct that omission, and the test will now fail. It doesn't show up as a failure in Github because of an accident - the attempt to set the locale to France in Github fails, so it skips the test before attempting the var_dump. But it does fail locally on my system. I have changed the test to use sprintf rather than var_dump; I think users are far more likely to use sprintf rather than var_dump in their applications. (They are, of course, even more likely to just cast to string, but the result of doing that is already different in 8.0 than in 7.4.) I would be equally happy to delete the test altogether.
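
The ctype_upper item in the list above can be sketched as follows. `isUpperCaseText` is a hypothetical wrapper, not the code in Calculation.php; it only shows the effect of converting the argument to a string before the call.

```php
<?php

// Casting to string first means an int is judged by its digits,
// not by the ASCII character it might encode.
function isUpperCaseText($value): bool
{
    return ctype_upper((string) $value);
}

var_dump(isUpperCaseText('C')); // bool(true)
var_dump(isUpperCaseText(67));  // bool(false): "67" is digits, not "C"
```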

There remain PHP 8.1 problems with Mpdf which are, of course, out of scope here.

There is one additional problem that I do not address in this ticket. The auto_detect_line_endings setting is being deprecated. This has some implications for Csv. I have another PR ready for Csv, and will discuss that problem there.

* Minor Scrutinizer Error

Hopefully fixed now.
2021-08-03 21:37:53 -07:00
oleibman 188d026615
Fix 112 Scrutinizer Problems in 1 Module (#2220)
* Fix 112 Scrutinizer Problems in 1 Module

The module is Reader/Xls/Escher - reading pictures from an Xls workbook. The errors fall into precisely 2 categories.
- Assigning a value to a variable which is not subsequently used (35). Although the statements therefore don't accomplish anything, I think they have documentary value for understanding the file layout. So, I have commented out the statements in question rather than deleting them.
- Class property `$this->object` can belong to any of several classes (77). When you invoke a method on it, Scrutinizer and Phpstan flag the statement if not all the candidate classes support the method. Neither has enough information to recognize that the method always exists for any object which reaches the statement. Scrutinizer is noisier about it - it issues a separate message for each class that doesn't support the method, while Phpstan issues a single message. Adding a `method_exists` test is sufficient for Phpstan. We'll see what Scrutinizer thinks when I push the change. If it still doesn't like it, we've eliminated only 35 problems. Phpunit coverage confirms that `method_exists` is always true at the appropriate point.
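
A minimal sketch of the `method_exists` guard from the second item above. The class and method names are invented for the example and do not come from Reader/Xls/Escher.

```php
<?php

// Two hypothetical container classes; only one of them supports setFlag().
class ContainerWithFlag
{
    public $flag = false;

    public function setFlag(bool $value): void
    {
        $this->flag = $value;
    }
}

class ContainerWithoutFlag
{
}

// $object may belong to any of several classes, so guard the call.
function applyFlag(object $object): bool
{
    if (method_exists($object, 'setFlag')) {
        $object->setFlag(true);

        return true;
    }

    return false;
}

var_dump(applyFlag(new ContainerWithFlag()));    // bool(true)
var_dump(applyFlag(new ContainerWithoutFlag())); // bool(false)
```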

* Scrutinizer Can Be VERY Annoying

I wasn't looking to do a major rewrite. I was hoping 112 fixes would suffice. Oh well, let's see what happens now.
2021-07-26 20:13:26 -07:00
oleibman 51163713c7
Tweaks to Input File Validation (#2217)
* Tweaks to Input File Validation

This started as a response to issue #1718, for which it is a partial (not complete) solution. The following changes are made:
- canRead can currently throw an exception. This seems wrong. It should just return true/false.
- Breaking change of sorts. When AssertFile encounters a non-existent or unreadable file, it throws InvalidArgumentException. This does not make sense. I have changed it to throw PhpSpreadsheet/Reader/Exception.
- Since the previous bullet item required changing of most of the Reader files anyhow, this is a good time to add explicit typing for canRead in the function signature rather than the DocBlock. Since all the canRead functions inherit from an abstract version in IReader, they all have to be changed simultaneously. Except for Xlsx and Ods, most of the Reader files are otherwise unchanged.
- AssertFile is changed to add an optional "zip member" parameter. It will check for the existence of an appropriate member in what is supposed to be a zip file. It is used by Xlsx and Ods.
- Verifying that a given file is a valid zip ought to be a feature of ZipArchive. Thanks to a particularly nasty bug in php/libzip (see https://bugs.php.net/bug.php?id=81222), it is unsafe to attempt to open a zero-length file as a zip archive. There is a solution, but it does not apply to all the PHP releases which we support, and isn't even necessarily supported on all the point versions of the PHP versions which we do support. I have coded up a manual test for "valid zip", with a comment pointing to the spec.
- In theory, tests now cover 100% of the code in Shared/File. In practice ... One of the tests requires that chmod works properly, which is not quite true on Windows systems, so that test is skipped on Windows. Another test requires that php.ini uses a non-default value for upload_temp_dir (can't be overridden in application code), which is probably not the case when Github runs the unit tests, so that test is skipped when appropriate. I have run tests for both on systems where they are not skipped.
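
One way a manual "valid zip" check could look, as a sketch only. It is an assumption that searching for the end-of-central-directory signature is an adequate test; the check actually coded in Shared/File may differ. `looksLikeZip` is a hypothetical name. It returns true/false rather than throwing, and never hands a zero-length file to ZipArchive.

```php
<?php

// Hypothetical helper: cheap structural check for a zip archive.
function looksLikeZip(string $filename): bool
{
    clearstatcache(true, $filename);
    if (!is_file($filename) || !is_readable($filename)) {
        return false;
    }
    $size = filesize($filename);
    if ($size === false || $size < 22) {
        return false; // too small to hold an end-of-central-directory record
    }
    // The record is 22 bytes plus a comment of at most 65535 bytes,
    // so it must start within the last 65557 bytes of the file.
    $tailLength = min($size, 65557);
    $tail = file_get_contents($filename, false, null, $size - $tailLength);

    return $tail !== false && strpos($tail, "PK\x05\x06") !== false;
}
```

A caller such as canRead could use a check like this to reject bad input before any attempt to open the archive.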

* Update File.php

* Scrutinizer Timeout

It's not actually timing out, it's just waiting for something to finish that finished ages ago. Making a meaningless comment change in hopes that will clear the jam. Not particularly hopeful.
2021-07-24 20:44:04 -07:00
oleibman 3c5750bddc
Very Minor Simplification to Matrix Functions (#2222)
The external Matrix library has introduced some changes which permit the matrix functions to be slightly simplified.
2021-07-22 11:01:25 -07:00
James Lucas a818ce0c19 Revert showDropDown back to the previous inverted state. The showDropDown documentation is apparently incorrect; it works as hideDropDown if true. 2021-07-21 05:53:49 -07:00
James Lucas 0e54cf8b17 * Fix reading data validation flags allowBlank, showDropDown, showInputMessage, showErrorMessage in XLSX reader (was a loose comparison of SimpleXML object to integer), flag values may also be string true/false not just 0/1.
filter_var( $flag, FILTER_VALIDATE_BOOLEAN) chosen to handle both 0/1 and true/false values in the file being read.
* Fix writing data validation flag showDropDown (Inverted logic in reader was replicated to Writer)
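
A short sketch of why `filter_var` with `FILTER_VALIDATE_BOOLEAN` covers both spellings of the flags. `flagToBool` is a hypothetical wrapper; in the reader the input would come from a SimpleXML attribute cast to string.

```php
<?php

// Hypothetical wrapper around the call named in the commit message.
function flagToBool(string $flag): bool
{
    return filter_var($flag, FILTER_VALIDATE_BOOLEAN);
}

foreach (['1', '0', 'true', 'false'] as $flag) {
    var_dump(flagToBool($flag));
}
// bool(true), bool(false), bool(true), bool(false)
```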
2021-07-21 05:53:49 -07:00
oleibman e8966183d3
Merge branch 'master' into chartcaption 2021-07-16 06:11:26 -07:00
oleibman 5507b96d7a
Merge branch 'master' into sheetpasswd 2021-07-13 06:11:47 -07:00
Mark Baker 15170cf8cd
Issue 2216 resolve office365 auto filter structure move (#2218)
* Initial adjustments to Xlsx Reader for two possible locations for AutoFilter information, either on the sheet itself for older files, or in the tables/tableX file for more recent files
* Refactor AutoFilter Reader logic into separate methods; preparatory work toward the eventual goal of moving it into its own dedicated AutoFilterTables class
* Basic unit tests to verify that the Xlsx Reader can read both the older and Office365 variants of the files used to store AutoFilter structure
2021-07-12 03:19:40 +02:00
oleibman 8729a68338
Xls Reader Handle MACCENTRALEUROPE With or Without Hyphen (#2213)
* Xls Reader Handle MACCENTRALEUROPE With or Without Hyphen

Fixes issue #549 and https://github.com/Maatwebsite/Laravel-Excel/issues/989 (which is the source of the new test file). Some systems accept MACCENTRALEUROPE as the name for the appropriate encoding, and some accept MAC-CENTRALEUROPE. I fortunately have access to at least one of each type, and have run the tests on each.

CodePage.php has an array of translations from codepage number to string. I now allow the value to itself be an array; if so, the code will test each in turn to see if it can be used in iconv. I did not go fishing for other similar problems. If such show up, they can be dealt with in the same manner as this one. I don't really expect others, since this is a problem not merely for Xls, but, even then, it applies only to BIFF5 and earlier.
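
The idea of testing each candidate name in turn can be sketched like this. It is not the CodePage.php code; `firstUsableEncoding` is a hypothetical helper, and the probe conversion of a single space is an assumption about how usability might be tested.

```php
<?php

// Hypothetical helper: return the first candidate name that iconv accepts,
// or null if none can be used on this system.
function firstUsableEncoding(array $candidates): ?string
{
    foreach ($candidates as $name) {
        // @ silences the message iconv raises for an unknown encoding name.
        if (@iconv($name, 'UTF-8', ' ') !== false) {
            return $name;
        }
    }

    return null;
}

// Which of the two spellings is returned depends on the system.
$encoding = firstUsableEncoding(['MACCENTRALEUROPE', 'MAC-CENTRALEUROPE']);
```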

I also moved XlsTest from Reader to Reader/Xls.

* Cache Successful Result For Future Use

Per suggestion from @MarkBaker
2021-07-12 03:02:47 +02:00
oleibman 1ff2e50ed2
Merge branch 'master' into chartcaption 2021-07-02 14:35:29 -07:00
Owen Leibman ecb4a7fe27 DocBlock Changes for Chart/Title
This is a leftover Scrutinizer change, but it needed more attention than most others. Chart/Title DocBlocks define caption as `null|string`. However, in the wild, Excel usually presents the caption as an array, and not an array of strings but rather of RichText items. I am not sure why an array is needed since a RichText item can contain many text runs, but things are what they are.

Reader/Xlsx/ChartTitleTest reads a spreadsheet with the captions stored as a RichText array. Since it performs array operations on something the DocBlock says cannot be an array, Scrutinizer objects, although not seriously enough to fail the module. Phpstan also objects; its objection is silenced with an annotation. Aside from this test, there are other tests which do set the caption to a string, and Excel seems to handle that without a problem. So, I have changed the DocBlock to specify `array|RichText|String`. I have dropped null as a possibility; nullstring will do equally well.

Because getCaption can now return multiple datatypes, I think a new function which can return the text portion of the entire caption as a single string is needed. I have added it. This simplifies the test named above, and some code in Writer/Html. The latter is not part of unit testing because the version of JpGraph found in Composer is too antiquated. I verified the Html change manually by running samples/Chart/32_Chart_read_write_HTML.php using a recent version of JpGraph. It was as a result of this test that I uncovered issue #2203. I did not see anything about Charts in docs, so did not add a description of the new function there.

Phpstan is happy with the changes. We'll see how Scrutinizer feels when I push it.
2021-07-02 14:33:43 -07:00
oleibman 075cecd268
Xlsx Reader Better Namespace Handling Phase 1 First Bugfix (#2204)
See issue #2203. An undotted i, an uncrossed t. When using namespaces, need to call attributes() to access the attributes before trying to access them directly. Failure to do so in castToFormula caused a problem for shared formulas.
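
A small SimpleXML sketch of the pattern, using a made-up namespace URI and element names rather than a real Xlsx part. The element is reached through `children()` with a namespace, and its un-namespaced attributes are fetched through `attributes()` before being read.

```php
<?php

// A cell element in a prefixed namespace, with ordinary un-namespaced attributes.
// The namespace URI is invented for this example.
$xml = '<x:row xmlns:x="urn:example:main"><x:c r="A2" t="shared"/></x:row>';
$row = simplexml_load_string($xml);

foreach ($row->children('urn:example:main') as $cell) {
    // Fetch the un-namespaced attributes explicitly before reading them.
    $attributes = $cell->attributes();
    echo (string) $attributes['r'], ' ', (string) $attributes['t'], PHP_EOL;
}
```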

Surprisingly, this didn't show up in unit tests. Perhaps sharing the same formula between two cells isn't common. It did show up in Chart Samples. I've added a test.

I was really inclined to merge this right away. Not to worry - I can control myself. It should be moved fairly quickly nevertheless.
2021-07-02 12:36:54 +02:00
oleibman 5523fc935b
Merge branch 'master' into sheetpasswd 2021-07-01 06:52:17 -07:00
Owen Leibman 560e9a885c CashFlow/Variable/NonPeriodic vs. Scrutinizer/Phpstan
Just reviewing Scrutinizer's list of "bugs". There are 19 ascribed to me. For some, I will definitely take no action (e.g. use of bitwise operators in AND, OR, and XOR functions). However, where I can clean things up so that Scrutinizer is satisfied and the resulting code is not too contorted, I will make an attempt.

This is the last of this set of changes. It corrects 2 problems according to Scrutinizer, and about 20 per Phpstan.
2021-07-01 12:11:41 +02:00
Owen Leibman 435ac30b47 Reader/Html vs. Scrutinizer/Phpstan
Just reviewing Scrutinizer's list of "bugs". There are 19 ascribed to me. For some, I will definitely take no action (e.g. use of bitwise operators in AND, OR, and XOR functions). However, where I can clean things up so that Scrutinizer is satisfied and the resulting code is not too contorted, I will make an attempt.

This PR corrects 2 problems according to Scrutinizer, and about 30 per Phpstan.
2021-07-01 10:15:06 +02:00
oleibman 2ae948a319
Reader/Slk vs. Scrutinizer/Phpstan (#2192)
Just reviewing Scrutinizer's list of "bugs". There are 19 ascribed to me. For some, I will definitely take no action (e.g. use of bitwise operators in AND, OR, and XOR functions). However, where I can clean things up so that Scrutinizer is satisfied and the resulting code is not too contorted, I will make an attempt.

This PR corrects 3 problems (2 mine) according to Scrutinizer, and 7 per Phpstan. It also moves the Reader Slk tests under their own directory, as is the case for all the other Reader types.
2021-06-29 20:48:31 +02:00
oleibman 49e97f0914
Correct Some Problems Which Will Show Up for PHP8.1 (#2191)
* Reader/Gnumeric vs. Scrutinizer

Just reviewing Scrutinizer's list of "bugs". There are 19 ascribed to me. For some, I will definitely take no action (e.g. use of bitwise operators in AND, OR, and XOR functions). However, where I can clean things up so that Scrutinizer is satisfied and the resulting code is not too contorted, I will make an attempt.

I believe this is the only one which will involve more than 2 or 3 changes. It fixes 5 items ascribed to me, and 4 to others.

* Use Strict Checking for in_array

* Correct Some Problems Which Will Show Up for PHP8.1

PHP8.1 wants to issue a message when you use a float where it thinks you ought to be using an int (it wants its implicit casts made explicit). This is causing unit tests to fail. The following corrections are made in this PR:
- Calculation.php tests `isset(self::binaryOperators[$token])`, where token can be a float. No numeric values are members of that array, so we can test for numeric before isset.
- SharedOle.php packs a float, intending it as an int, in 2 places. I simplified the logic here, and added explicit casts to avoid the problem. This is used by Xls Reader and Writer; as added confirmation, I added some timestamps from before 1970 (i.e. negative values) to Document/EpochTest. Because of this, the test suite has been verified for 32-bit PHP as well as PHP 8.1.
- Writer/Xlsx/StringTable tests `isset($aFlippedStringTable[$cellValue])`. This is the same problem as in Calculation, but requires a different solution. The same if statement here also tests that the datatype is string, but it does so after the isset test. Changing the order of these tests avoids the problem.
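
The "test for numeric before isset" approach from the first item can be sketched as below. The `Tokens` class and its constant are hypothetical stand-ins for the array in Calculation.php.

```php
<?php

// Hypothetical stand-in for the operator table.
class Tokens
{
    private const BINARY_OPERATORS = ['+' => true, '-' => true, '*' => true, '/' => true];

    // No numeric value is a key of the array, so rule numerics out first;
    // a float then never reaches the array-offset lookup.
    public static function isBinaryOperator($token): bool
    {
        return !is_numeric($token) && isset(self::BINARY_OPERATORS[$token]);
    }
}

var_dump(Tokens::isBinaryOperator('+')); // bool(true)
var_dump(Tokens::isBinaryOperator(1.5)); // bool(false), without using the float as a key
```

The StringTable item relies on the same short-circuit idea: putting the cheaper type check first means the lookup is skipped for values that cannot be keys.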

* Update OLE.php
2021-06-29 19:54:08 +02:00
Owen Leibman 36b328a9fa Fix Worksheet Passwords
Fix for issue #1897.

The existing hashing code seems to work correctly almost all the time, but there are exceptions. It is replaced by an exact implementation of the spec, including a link to the spec in the comments. Cases known to fail are added to the unit test suite.

The spec expects the string to be at most 255 bytes (yes, bytes not characters). The program had permitted any length; it will now throw an exception when the maximum length is exceeded.
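
The bytes-versus-characters distinction matters for multibyte passwords. A minimal sketch, assuming UTF-8 input; `assertPasswordLength` and the exception type are hypothetical, since the message above does not say which exception is thrown.

```php
<?php

// Hypothetical guard: strlen() counts bytes, which is what the limit is expressed in.
function assertPasswordLength(string $password): void
{
    if (strlen($password) > 255) {
        throw new InvalidArgumentException('Password exceeds 255 bytes');
    }
}

// 128 two-byte characters: only 128 characters, but 256 bytes.
$password = str_repeat("\u{00E9}", 128);
try {
    assertPasswordLength($password);
} catch (InvalidArgumentException $e) {
    echo $e->getMessage(), PHP_EOL;
}
```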

Xls does not support any hashing algorithm except basic. The Xls writer had, nevertheless, accepted the results of any of the other possible algorithms. This leads to (a) a worksheet that can't be unprotected, and (b) deprecation notices during the write (because it is using hexdec, which expects only hex characters, and the other algorithms generate non-hex characters). I have changed Xls writer to ignore passwords generated by other algorithms. An alternative would be to have the password hasher generate both an algorithmic password (for use by Xlsx) and a basic password (for use by Xls); I think that is too complex a solution, but can look into it if you think it worthwhile.

I do not see any current support for Worksheet passwords in ODS Reader or Writer. I did not add support in this PR.

I added a new test to confirm the password for reading a spreadsheet is consistent with the one used for writing it. As you can see from the comments for the new test, it had an unusual problem with a somewhat unusual solution.
2021-06-29 09:11:51 -07:00
oleibman cd84020693
Xlsx Reader Better Namespace Handling Phase 1 Try2 (#2173)
* Xlsx Reader Better Namespace Handling Phase 1 Try2

This is a replacement for #2088, which has run into merge conflicts. I will close that PR in the near future, however the comments in that PR may prove useful for this one. While that PR has been in draft status all along, I am marking this one as ready. I will gladly add additional tests (and, of course, make code changes) that anyone has to suggest, but, with my most recent test files which I will describe in a separate comment, I have no further ideas on useful additions.

As mentioned in the earlier ticket, this is a risky change. But, as has been demonstrated, delaying it comes with its own set of risks. It would be helpful to have a temporary moratorium on changes to Reader/Xlsx until this change is merged.

The original commit message follows.

There have been a number of issues concerning the handling of legitimate but unexpected namespace prefixes in Xlsx spreadsheets created by software other than Excel and PhpSpreadsheet/PhpExcel. I have studied them, but, until now, have not had a good idea on how to act on them. A recent comment https://github.com/PHPOffice/PhpSpreadsheet/issues/860#issuecomment-824926224 in issue #860 by @IMSoP has triggered an idea about how to proceed.

Gnumeric Reader was recently changed to handle namespaces better. Using that as a model, this PR begins the process of doing the same for Xlsx. Xlsx is much larger and more complicated than Gnumeric, hence the need to tackle it in multiple phases. I believe that this PR handles all of:
- listWorkSheetNames
- listWorkSheetInfo. Note that there was a bug in this function which would cause it to count only used columns rather than all columns. That bug is corrected.
- active sheet
- selected cell and top left cell
- cell content (formulas, numbers, text)
- hyperlinks
- comments (partial - see below)

This PR does not address:
- styles
- images and charts
- VBA and ribbons
- many other items, I'm sure

The issue for non-standard namespacing till now has been the use of unexpected prefixes. While I was working on this change, @Lambik introduced issue #2067 PR #2068 which introduced a completely different problem - the use of unexpected URLs. That PR and the issue associated with it were quite well documented, including the supplying of a test file and tests for it. I asked if I could take a look to see if it could be integrated with my change, and the result seems to be yes, so those changes are also part of this PR.

While adding a comment to my test file, I discovered that Microsoft had added "threaded comments" as a new feature. I believe these are not yet supported by PhpSpreadsheet, and I am not going to add it, at least not now. I believe that, among other things, this will make identifying the author of a comment more difficult.

Although there are a number of Phpstan baseline changes as part of this PR, I did not attempt to resolve all Phpstan reports for Reader/Xlsx. Nor did I do anything to increase coverage. This change is already large and complex enough without those efforts.
2021-06-25 09:05:49 +02:00
Owen Leibman 034ac5a7c7 Use Strict Checking for in_array
Per suggestion from Mark Baker.
2021-06-25 08:32:35 +02:00
Owen Leibman ab26cbcb6d Reader/Gnumeric vs. Scrutinizer
Just reviewing Scrutinizer's list of "bugs". There are 19 ascribed to me. For some, I will definitely take no action (e.g. use of bitwise operators in AND, OR, and XOR functions). However, where I can clean things up so that Scrutinizer is satisfied and the resulting code is not too contorted, I will make an attempt.

I believe this is the only one which will involve more than 2 or 3 changes. It fixes 5 items ascribed to me, and 4 to others.
2021-06-25 08:32:35 +02:00
jarrett jordaan 795992835f
When image source is a URL, store the URL for use during extraction. (#2072)
When image source is a link store the link.
Add url mutator.

Update section in documentation on image extraction.
2021-06-24 10:50:44 +02:00
Owen Leibman d0dd5b4594 Use WildcardMatch
Per suggestion from @MarkBaker.

WildcardMatch did not handle double tilde correctly. It has been changed to do so and its logic simplified (and commented).

Existing AutoFilter test covered this situation, but I added a test for MATCH as well.
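
A sketch of wildcard handling that treats a double tilde as a literal tilde. This is not the WildcardMatch code; `wildcardToRegex` is a hypothetical converter written only to illustrate the escaping rules (`*` any run of characters, `?` one character, `~` escapes a following `*`, `?` or `~`).

```php
<?php

// Hypothetical converter from an Excel-style wildcard pattern to a regex.
function wildcardToRegex(string $pattern): string
{
    $regex = '';
    $length = strlen($pattern);
    for ($i = 0; $i < $length; ++$i) {
        $char = $pattern[$i];
        $next = $pattern[$i + 1] ?? '';
        if ($char === '~' && ($next === '*' || $next === '?' || $next === '~')) {
            $regex .= preg_quote($next, '/');
            ++$i; // the escaped character has been consumed
        } elseif ($char === '*') {
            $regex .= '.*';
        } elseif ($char === '?') {
            $regex .= '.';
        } else {
            $regex .= preg_quote($char, '/');
        }
    }

    return '/^' . $regex . '$/ui';
}

var_dump(preg_match(wildcardToRegex('a~~*'), 'a~xyz')); // int(1): literal tilde, then wildcard
var_dump(preg_match(wildcardToRegex('a~*b'), 'axb'));   // int(0): the asterisk is literal
```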
2021-06-24 10:09:21 +02:00
Owen Leibman d88af46ab5 Scrutinizer
24 minor problems, almost all of them unused code in tests.
2021-06-24 10:09:21 +02:00
Owen Leibman a735afc088 Autofilter Part 2
Most of the 32-bit-unsafe date handling that remains in PhpSpreadsheet is in AutoFilter. Cleaning this up demonstrated that there are a lot of problems with AutoFilter, and I will do it in two pieces. Part 1 was PR #2141 which I have just merged.

In this PR:
- Fix remaining 32-bit dates in filterTestInDateGroupSet.
- Also in some of the existing AutoFilter samples. Note that the comments in two of those said the filter was being set for the first day of each month, but the code specifies the last day - I have corrected the comments.
- Remove mocking in unit tests for AutoFilter in favor of 'real' tests.
- Code coverage is now 100% in all of AutoFilter, AutoFilter/Column, and AutoFilter/Common/Rule.
- No remaining AutoFilter(/Column(/Rule)) exceptions in Phpstan baseline.
- Documentation for escaping of asterisk, question mark, and tilde in text filters included spurious backslashes which are now removed.
- Text filter escaping of question mark did not work. There had been no unit tests for any text filtering.
- Likewise there had been no testing for TopTen.
- Above- and below- average filters were not working because they acquired their Calculation instance incorrectly. There had been no tests.
- Several unchanging private static arrays in Rule were changed to private const arrays.
- Clones are now tested.
- RuleTest is moved to same directory as other tests.
2021-06-24 10:09:21 +02:00
Mark Baker 5769885802
Changes to the default arguments for `htmlspecialchars()` and `html_entity_decode()` requires setting of the argument value explicitly to prevent changes in behaviour. (#2176)
Specifically, the default for these two functions has been changed from `ENT_COMPAT` to `ENT_QUOTES | ENT_SUBSTITUTE`

This PR configures the argument used for those functions in Settings, and then explicitly applies it everywhere they are used in the codebase.
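
The effect of passing the flags explicitly can be sketched as follows. `ESCAPE_FLAGS` and `escapeHtml` are hypothetical stand-ins for the value held in Settings and the call sites; which flag value Settings actually uses is not stated above, so `ENT_COMPAT` here is an assumption.

```php
<?php

// Assumed stand-in for a project-wide setting; passing it explicitly keeps
// the output identical regardless of the PHP version's default.
const ESCAPE_FLAGS = ENT_COMPAT;

function escapeHtml(string $text): string
{
    return htmlspecialchars($text, ESCAPE_FLAGS, 'UTF-8');
}

echo escapeHtml('"it\'s" <b>'), PHP_EOL;                             // &quot;it's&quot; &lt;b&gt;
echo htmlspecialchars("it's", ENT_QUOTES | ENT_SUBSTITUTE), PHP_EOL; // it&#039;s
```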
2021-06-21 12:56:03 +02:00
Mark Baker 6f88d1b54e
Only calculate column autosize for a cell if it contains data (#2167)
Only calculate column autosize for a cell if it contains data
2021-06-16 22:38:41 +02:00
Mark Baker d2076fefab
Additional unit tests for negative interest rates in the financial functions, and also tests using negative present/future value arguments (#2166) 2021-06-16 14:16:48 +02:00
Mark Baker ebdeb231eb
Allow negative interest rate in PPMT() Financial function (#2164) 2021-06-15 22:35:04 +02:00
Olivier TARGET 803737a893
Fix for #2149 / Read data validations for drop down list in another sheet. (#2150)
* Read data validations for drop down list in another sheet.

* Add function testLoadXlsxDataValidationOfAnotherSheet() in class tests/PhpSpreadsheetTests/Reader/XlsxTest.php for unit test.

* Add sample xlsx for unit tests.

* Modify call function isset() for warnings.

* Additional assertions to ensure that the worksheet has been read correctly for DataValidation that references a list on a different worksheet

* This should resolve the phpstan issues

Co-authored-by: Mark Baker <mark@lange.demon.co.uk>
2021-06-15 13:28:10 +02:00
oleibman 1e74282259
Fix for Issue 2158 (AverageIf Calculation Problem) (#2160)
* Improve Identification of Samples in Coverage Report

The Phpunit coverage report currently contains bullet items like `PhpOffice\PhpSpreadsheetTests\Helper\SampleTest\testSample with data set "49"`. This extremely simple change takes advantage of Phpunit's ability to accept an array with keys which are either strings or integers, by using the sample filenames as the array keys rather than sequential but otherwise meaningless integers (e.g. `49` in the earlier cited item). The bullet item will now read `PhpOffice\PhpSpreadsheetTests\Helper\SampleTest\testSample with data set "Basic/38_Clone_worksheet.php"`.
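
A sketch of a data provider keyed by filename. `providerSample` is a hypothetical function; the real provider in SampleTest gathers its file list differently.

```php
<?php

// Hypothetical provider: key each data set by its sample filename so the
// report names the file instead of a sequential integer.
function providerSample(array $files): array
{
    $result = [];
    foreach ($files as $file) {
        $result[$file] = [$file];
    }

    return $result;
}

print_r(array_keys(providerSample(['Basic/38_Clone_worksheet.php'])));
```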

* Fix for Issue 2158 (AverageIf Calculation Problem)

Issue #2158 reports an error calculating AverageIf because a function returns null rather than a string. There turn out to be several components to this problem:
- The nominal fix to the problem is to add some null-to-nullstring coercion in DatabaseAbstract.
- This fixes the error, but does not necessarily lead to the correct result because buildQuery treats values of null and null-string identically, whereas Excel does not. So change that to treat null-string as any other string.
- But that doesn't lead to the correct result either. That's because Functions/ifCondition recognizes a null string, but then continues to (over-)process it until it returns the wrong result. Fix this problem in conjunction with the other two, and we finally get the correct result.

A new unit test is added for AVERAGEIF, and new test cases are added for SUMIF. In each case, there are complementary tests for conditions of null and null-string, and the results agree with Excel. There may or may not be value in adding new tests to other functions, and I will be glad to do so for any functions which you care to identify, but no existing tests broke as a result of these changes.
2021-06-15 09:54:57 +02:00
oleibman 4f06d84248
TextData - Minor Changes, Test Coverage (#2151)
* PHP8.1 Deprecation Passing Null to String Function

For each of the files in this PR, one or more statements can pass a null to string functions like strtolower. This is deprecated in PHP8.1, and, when deprecated messages are enabled, causes many tests to error out. In every case, use coercion to pass null string rather than null.
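
The coercion amounts to something like the sketch below. `lowerCase` is a hypothetical function; the null-coalescing operator is one way to do the coercion, and a `(string)` cast is another.

```php
<?php

function lowerCase(?string $value): string
{
    // Coerce null to the empty string before the string function sees it.
    return strtolower($value ?? '');
}

var_dump(lowerCase(null));  // string(0) ""
var_dump(lowerCase('ABC')); // string(3) "abc"
```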

* TextData - Minor Changes, Test Coverage

Per agreement on a previous push, I looked into standardizing the initialization of the TextData functions (like Engineering and MathTrig), with particular regard for avoiding multiple later null coercions. This simplifies the code quite a bit. This PR also increases coverage to 100% for all TextData modules. All entries in Phpstan baseline for non-deprecated TEXTDATA functions are removed. There were some minor bugfixes.

Whereas Excel (and Gnumeric) treat booleans when supplied as strings as 'TRUE' or 'FALSE', ODS treats them as '1' or '0'. Unlike Excel, ODS generally does not allow bool for int arguments; it does, however, allow them for FIND and SEARCH. ODS allows bool for int for SUBSTITUTE even though Excel doesn't. ODS allows bool for string for NUMBERVALUE and VALUE even though Excel doesn't. ODS accepts 0 as an argument for CHAR; Excel doesn't. Most of this seems like random decisions on the part of the developers; I've done my best to follow the products in each case. There is a new test member devoted to ODS tests.

Gnumeric has an anomaly vis-a-vis the others - if length is supplied to LEFT/MID/RIGHT as null, Gnumeric treats it as 0 rather than 1.

All tests now take place in the context of a spreadsheet ...

Except for RETURNSTRING, which is not the implementation of an Excel function, and is referred to in the rest of PhpSpreadsheet only in the unit tests for itself. It should probably be deprecated, but that is not part of this PR, just in case there is some reason for it that I couldn't discern.

I have tried to make the first line of each doc block identify the Excel function name rather than its name in PhpSpreadsheet. I think it makes things more comprehensible.

Some tests call Settings::setLocale, but there was no Settings::getLocale. At the end of the tests which do it, they invoke setLocale('EN-US'), which, in a practical sense, is sufficient. However, in theory it would be better for them to get the current locale before changing it, then changing it back to the original when the time came. I have added getLocale and made the appropriate testing change.

The CHAR function took an interesting turn. One can set the value of a cell to, say, CHAR(2), the ASCII/UTF-8 representation of a control character, which is not legal in certain contexts. The only Reader/Writer that could handle this without problems is Xls, which deals with binary data all the time. However, if you tried to write it to Xlsx, Excel would not be able to open the resulting file because of what it considers an illegal character. I changed the Xlsx writer to escape such characters when writing the value of a string function. I did not make any other changes to the Xlsx writer - it seems to me that setting a cell to CHAR(2) is legitimate, but setting it to say `"\x02"` seems less likely to be legitimate, so the latter will still fail (although `="\x02"` should work). The Xlsx reader already supports the escape mechanism that I added to the writer.
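
A sketch of what escaping control characters for Xlsx could look like. It assumes the `_xHHHH_` notation is the escape mechanism meant above, which the message does not state explicitly; `escapeControlCharacters` is a hypothetical name, not the writer's method.

```php
<?php

// Assumption: control characters are written using the _xHHHH_ notation.
function escapeControlCharacters(string $text): string
{
    $escaped = preg_replace_callback(
        '/[\x00-\x08\x0B\x0C\x0E-\x1F]/',
        static function (array $match): string {
            return sprintf('_x%04X_', ord($match[0]));
        },
        $text
    );

    return $escaped ?? $text; // fall back to the input if the regex engine fails
}

var_dump(escapeControlCharacters("a\x02b")); // string(9) "a_x0002_b"
```

Tab, line feed, and carriage return are left alone in this sketch because they are legal in XML text.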

CHAR control character and Ods - not supported by either Reader or Writer. I did not attempt to add this now. There is lots still missing from ODS, and this item just can't be a high priority amongst all of those.

CHAR control character and Csv - it is supported by reader and writer if the file has a csv extension. However, when the file has no extension and the mime type has to be guessed, the control character makes mime_content_type guess application/octet-stream, and PhpSpreadsheet therefore thinks that Csv can't read it.

CHAR control character and Html. Actual use of the control character in the file is subject to the same problems as Xml (i.e. Xlsx and Ods). It wasn't terribly difficult to get the Html Writer to change `"\x02"` to "`&#2;`". I believe that this is technically legal; however, DOMDocument::loadHTML rejects it as an illegal entity, and I am not convinced that it is wrong to do so, so I haven't changed the Html writer.

* Scrutinizer

Correct 3 minor errors.
2021-06-15 08:37:17 +02:00
oleibman 9b6e4f9bac
Merge branch 'master' into moredatefilter 2021-06-14 20:13:36 -07:00
Mark Baker 74b02fb31c
Fix for the BIFF-8 Xls colour mappings in the Reader (#2156)
* Fix for the BIFF-8 Xls colour mappings in the Reader
* Unit test for reading colours, writing then rereading and ensuring that the RGB values have not changed
2021-06-13 21:46:49 +02:00
Mark Baker 9c2ce22505
This should fix png files with transparency in the Xls reader (#2155)
* This should fix png files with transparency in the Xls reader
2021-06-11 22:00:26 +02:00
Mark Baker 05466e99ce
Html import dimension conversions (#2152)
Allows basic column width conversion when importing from Html that includes UoM... while not overly-sophisticated in converting units to MS Excel's column width units, it should allow import without errors

Also provides a general conversion helper class, and allows column width getters/setters to specify a UoM for easier usage
2021-06-11 17:29:49 +02:00
Mark Baker a911e9bb7b
Calculation engine empty arguments (#2143)
* Initial work on differentiating between empty arguments and null arguments passed to Excel functions

Previously we always passed a null value for an empty argument (i.e. where there was an argument separator in the function call without an argument). PHP doesn't support empty arguments, so we needed to provide some value, but then it wasn't possible to differentiate between a genuine null argument (either a literal null, or a null cell value) and the null that we were passing to represent an empty argument value.

This change evaluates empty arguments within the calculation engine, and instead of passing a null, it reads the signature of the required Excel function, and passes the default value for that argument; so now a null argument really does mean a null value argument.

* If the Excel function implementation doesn't accept any arguments, or once we reach a variadic argument, or if we try to pass more arguments than the method supports in its signature, then there's no point in checking for defaults, and doing so will lead to PHP errors, so break out of the default replacement loop
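
The default-replacement idea described in this commit can be sketched as follows. This is a Python illustration of the logic only, not the calculation engine's PHP code: the `EMPTY` marker, `fill_empty_arguments`, and the `left` example function are all hypothetical names, and Python's `inspect` module stands in for PHP reflection.

```python
import inspect

# Hypothetical marker for "a separator appeared but no argument was given".
EMPTY = object()


def fill_empty_arguments(func, args):
    """Replace EMPTY markers with the default declared in func's signature."""
    params = list(inspect.signature(func).parameters.values())
    filled = []
    for index, value in enumerate(args):
        past_signature = index >= len(params)
        if past_signature or params[index].kind is inspect.Parameter.VAR_POSITIONAL:
            # No per-argument default can be looked up from here on, so stop
            # the replacement loop and pass the rest through (EMPTY becomes None).
            filled.extend(None if arg is EMPTY else arg for arg in args[index:])
            break
        if value is EMPTY:
            default = params[index].default
            value = None if default is inspect.Parameter.empty else default
        filled.append(value)
    return filled


def left(text, num_chars=1):
    """Example function with a defaulted second argument."""
    return text[:num_chars]
```

With this sketch, `fill_empty_arguments(left, ["abc", EMPTY])` yields the declared default for the second argument, while an explicit `None` is passed through unchanged, so the two cases stay distinguishable.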
2021-06-10 08:49:53 +02:00
oleibman a340240a3f
PHP8.1 Deprecation Passing Null to String Function (#2137)
For each of the files in this PR, one or more statements can pass a null to string functions like strtolower. This is deprecated in PHP8.1, and, when deprecated messages are enabled, causes many tests to error out. In every case, use coercion to pass an empty string rather than null.
2021-06-05 15:14:23 +02:00
Mark Baker 19724e3217
Reader writer flags (#2136)
* Use of passing flags with Readers to identify whether special features such as loading charts should be enabled; no need to instantiate a reader and manually enable it before loading any more.

This is in preparation for supporting new "boolean" Reader/Writer features, such as pivot tables

* Use of passing flags with Writers to identify whether special features such as loading charts should be enabled; no need to instantiate a writer and manually enable it before loading any more.

* Update documentation with details of changes to the StringValueBinder
2021-06-04 13:45:32 +02:00
MarkBaker 504ed9a87c Ok! Let's try again with phpstan now 2021-06-03 21:42:20 +02:00
MarkBaker af85f888be Now it's Scrutinizer's turn 2021-06-03 21:42:20 +02:00
MarkBaker da9fbd6c8d PHPCS appeasement again 2021-06-03 21:42:20 +02:00
MarkBaker 8cea3a94df Unit test for RichText object 2021-06-03 21:42:20 +02:00