Commit Graph

171 Commits

Author SHA1 Message Date
oleibman 52585a9d7c
Hyperlinks and Namespacing (#2391)
Fix #2389. Hyperlinks referring to cells in the spreadsheet itself are not being handled properly. This is the first namespacing regression identified for release 19. Usual cause and fix - need to take greater care with attributes than was previously the case.
2021-11-14 10:27:59 -08:00
oleibman f831f48b71
ZipArchive and "Inconsistent" Zip File (#2376)
* ZipArchive and "Inconsistent" Zip File

Fix #2362. I added test for zip file inconsistency when dealing with a particularly nasty PHP/libzip bug affecting zero-length files. However, we also now verify that the file starts with a valid zip signature, so the consistency test is not really needed, and, from what I've read on the web, isn't particularly useful. The file with a problem, for example, opens just fine with Excel and zip, despite Php reporting it as inconsistent (when asked to check consistency). So, remove the consistency check.

* Update Issue2362Test.php

Latest Phpstan does not allow cast from 'mixed' to 'string'.

* Update Issue2362Test.php
2021-11-12 01:18:57 -08:00
oleibman 2f1f3a19b8
Csv, Boolean, and StringValueBinder (#2374)
See the discussion in PR #2232 which came about 3 months after it was merged. It caused a problem in an unusual situation which did not come to light until the change was part of the new release version. The original PR changed PhpSpreadsheet's behavior to match Excel's for (not case sensitive) strings `TRUE` and `FALSE`. Excel treats the values as boolean, and now so does PhpSpreadsheet.

When StringValueBinder is used, this becomes a tricky situation. The user wants the original strings preserved, including the case of all the letters. This PR changes the behavior of CSV reader as follows:
- If StringValueBinder is not in effect, convert to boolean.
- If StringValueBinder (actually any binder with method getBooleanConversion) is in effect, and the result of getBooleanConversion is true (which is the default in StringValueBinder), leave the value coming out of Csv Reader as the unchanged string.
- Otherwise, convert to boolean.

This should mean that there are no regression problems with StringValueBinder, while allowing PhpSpreadsheet to continue to match Excel in the default situation. No new settings are required.
2021-11-12 00:04:08 -08:00
Adrien Crivelli 69f633420b
Merge branch 'master' into PHP8-Sane-Property-Names 2021-10-31 15:25:01 +09:00
oleibman 241e82b284
Unexpected Format in Timestamp (#2332)
See issue #2331. Timestamp is expected in format yyyy-mm-dd (plus other information), with the expectation that month and day are 2 digits zero-filled on the left if needed. The user's file instead used a space rather than zero as filler. Although I don't know how the unexpected timestamp was created, it was easy enough to alter the timestamp in an otherwise normal spreadsheet, and use that file as a test case.
2021-10-23 18:23:53 -07:00
oleibman 9512f54cca
Corrections for Xlsx Read Comments (#2329)
This change was suggested by issue #2316. There was a problem reading Xlsx comments which appeared with release 18.0 but which was already fixed in master. So no source change was needed to fix the issue, but I thought we should at least add the test case to our unit tests.

In developing that case, I discovered that, although comment text was read correctly, there was a problem with comment author. In fact, there were two problems. One was new, with the namespacing changes - as in several other cases, the namespaced attribute `authorId` needed some special handling. However, another problem was much older - the code was checking `!empty($comment['authorId'])`, eliminating consideration of authorId=0, and should instead have been checking `isset`. Both problems are now fixed, and tested.
2021-10-23 17:32:44 -07:00
Georgiy fea0e34e90
fix #1114 issue (#2308)
* fix #1114 issues

* fixed code style

* update for all version

* add test for bag 1114

* remove comment

Co-authored-by: georgio <georgiokot@gmail.cim>
2021-10-03 09:29:23 -07:00
oleibman 90b9decb8e
Xlsx Reader Better Namespace Handling Phase 1 Second Bugfix (#2303)
* Xlsx Reader Better Namespace Handling Phase 1 Second Bugfix

See issue #2301. The main problem in that issue had been introduced with 18.0 and had already ben fixed in master. However there was a subsequent problem that had been introduced in master, an undotted i uncrossed t with namespace handling. When using namespaces, need to call attributes() to access the attributes before trying to access them directly. Failure to do so in parseRichText caused fonts declared in Rich Text elements to be ignored.

* Add An Assertion

Addresses problem in 2301 that had already been fixed.
2021-09-27 16:59:45 -07:00
oleibman cc14a48604
Permit CSV Delimiter to be Set to Null (#2288)
* Permit CSV Delimiter to be Set to Null

See issue #2287. A 1-character change. The delimiter variable is defined as nullable, and getDelimiter can return null; setDelimiter should follow suit.

* Scrutinizer Inanity

Are you sure the test always returns null?????
Yes, I'm sure, that's why it's part of the test.
Let's see if we can recode it and miss this "problem".
2021-09-15 12:40:03 -07:00
oleibman bc9234e5a5
Process Comments in Sylk File (#2277)
Fixes issue #2276.
2021-08-26 11:56:13 -07:00
Alayn Gortazar d0076343c4
Fix Reading XLSX files without styles.xml throws an exception. (#2247)
* Fix Reading XLSX files without styles.xml throws an exception.

* Bugfix, debugging code removed

* Fix Reading XLSX files without styles.xml throws an exception (rethinked)

* Fix Reading XLSX files without styles.xml throws an exception (rethinked)

* Style fixes

* Fix Spreadsheet loaded without styles cannot be written

* Replaced test files for empty styles.xml testing
2021-08-16 05:05:32 -07:00
oleibman d7ac7021c6
Apache OpenOffice Creates Xls Using Wrong Case for Number Format General (#2242)
See issue #2239. Problem is dealt with at the source, by making sure that Reader Xls checks for use of 'GENERAL' rather than 'General'. There doesn't seem to be a reason to test in other places, or to test for other casing variants.
2021-08-08 08:24:03 -07:00
oleibman 0cd20f3099
Csv Handling of Booleans (and an 8.1 Deprecation) (#2232)
* Csv Handling of Booleans (and an 8.1 Deprecation)

PhpSpreadsheet writes boolean values to a Csv as null-string/1, and treats input values of 'true' and 'false' as if they were strings. On the other hand, Excel writes boolean values to a Csv as TRUE/FALSE, and case-insensitively treats a matching string as boolean on read. This PR changes PhpSpreadsheet to match Excel.

A side-effect of this change is that it fixes behavior incorrectly reported as a bug in PR #2048. That issue was closed, correctly, as user error. The user had altered Csv Writer, including adding ```declare(strict_types=1);```; that declaration was the cause of the error. The "offending" statements, calls to strpbrk and str_replace, will now work correctly whether or not strict_types is in use.

And, just as I was getting ready to push this, the dailies for PHP 8.1 introduced a change deprecating auto_detect_line_endings. Csv Reader uses that setting; it allows it to process a Csv with Mac line endings, which happens to be something that Excel can do. As they say in https://wiki.php.net/rfc/deprecations_php_8_1, where the proposal passed without a single dissenting vote, "These newlines were used by “Classic” Mac OS, a system which has been discontinued in 2001, nearly two decades ago. Interoperability with such systems is no longer relevant." I tend to agree, but I don't know that we're ready to pull the plug yet. I don't see an easy way to emulate that functionality. For now, I have silenced the deprecation notices with at signs. I have also added a test case which will fail when support for that setting is pulled; this will give time to consider alternatives.

* Scrutinizer: Handling ini_set

This could be interesting. It doesn't like not handling an error condition for ini_set. Let's see if this satisfies it.
2021-08-04 07:00:17 -07:00
oleibman 51163713c7
Tweaks to Input File Validation (#2217)
* Tweaks to Input File Validation

This started as a response to issue #1718, for which it is a partial (not complete) solution. The following changes are made:
- canRead can currently throw an exception. This seems wrong. It should just return true/false.
- Breaking change of sorts. When AssertFile encounters a non-existent or unreadable file, it throws InvalidArgumentException. This does not make sense. I have changed it to throw PhpSpreadsheet/Reader/Exception.
- Since the previous bullet item required changing of most of the Reader files anyhow, this is a good time to add explicit typing for canRead in the function signature rather than the DocBlock. Since all the canRead functions inherit from an abstract version in IReader, they all have to be changed simulatneously. Except for Xlsx and Ods, most of the Reader files are otherwise unchanged.
- AssertFile is changed to add an optional "zip member" parameter. It will check for the existence of an appropriate member in what is supposed to be a zip file. It is used by Xlsx and Ods.
- Verifying that a given file is a valid zip ought to be a feature of ZipArchive. Thanks to a particularly nasty bug in php/libzip (see https://bugs.php.net/bug.php?id=81222), it is unsafe to attempt to open a zero-length file as a zip archive. There is a solution, but it does not apply to all the PHP releases which we support, and isn't even necessarily supported on all the point versions of the PHP versions which we do support. I have coded up a manual test for "valid zip", with a comment pointing to the spec.
- In theory, tests now cover 100% of the code in Shared/File. In practice ... One of the tests require that chmod works properly, which is not quite true on Windows systems, so that test is skipped on Windows. Another test requires that php.ini uses a non-default value for upload_temp_dir (can't be overridden in application code), which is probably not the case when Github runs the unit tests, so that test is skipped when appropriate. I have run tests for both on systems where they are not skipped.

* Update File.php

* Scrutinizer Timeout

It's not actually timing out, it's just waiting for something to finish that finished ages ago. Making a meaningless comment change in hopes that will clear the jam. Not particularly hopeful.
2021-07-24 20:44:04 -07:00
James Lucas 11bf051c94 Fix test names per `composer check` 2021-07-21 05:53:49 -07:00
James Lucas a9533b77ec Add unit tests for files with true/false (LibreOffice) in DataValidation boolean values and those with 1/0 (Excel, GoogleSheets) 2021-07-21 05:53:49 -07:00
oleibman e8966183d3
Merge branch 'master' into chartcaption 2021-07-16 06:11:26 -07:00
oleibman 8729a68338
Xls Reader Handle MACCENTRALEUROPE With or Without Hyphen (#2213)
* Xls Reader Handle MACCENTRALEUROPE With or Without Hyphen

Fixes issue #549 and https://github.com/Maatwebsite/Laravel-Excel/issues/989 (which is the source of the new test file). Some systems accept MACCENTRALEUROPE as the name for the appropriate encoding, and some accept MAC-CENTRALEUROPE. I fortunately have access to at least one of each type, and have run the tests on each.

CodePage.php has an array of translations from codepage number to string. I now allow the value to itself be an array; if so, the code will test each in turn to see if it can be used in iconv. I did not go fishing for other similar problems. If such show up, they can be dealt with in the same manner as this one. I don't really expect others, since this is a problem not merely for Xls, but, even then, it applies only to BIFF5 and earlier.

I also moved XlsTest from Reader to Reader/Xls.

* Cache Successful Result For Future Use

Per suggestion from @MarkBaker
2021-07-12 03:02:47 +02:00
oleibman 1ff2e50ed2
Merge branch 'master' into chartcaption 2021-07-02 14:35:29 -07:00
Owen Leibman ecb4a7fe27 DocBlock Changes for Chart/Title
This is a leftover Scrutinizer change, but it needed more attention than most others. Chart/Title DocBlocks define caption as `null|string`. However, in the wild, Excel usually presents the caption as an array, and not an array of strings but rather of RichText items. I am not sure why an array is needed since a RichText item can contain many text runs, but things are what they are.

Reader/Xlsx/ChartTitleTest reads a spreadsheet with the captions stored as a RichText array. Since it performs array operations on something the DocBlock says cannot be an array, Scrutinizer objects, although not seriously enough to fail the module. Phpstan also objects; its objection is silenced with an annotation. Aside from this test, there are other tests which do set the caption to a string, and Excel seems to handle that without a problem. So, I have changed the DocBlock to specify `array|RichText|String`. I have dropped null as a possibility; nullstring will do equally well.

Because getCaption can now return multiple datatypes, I think a new function which can return the text portion of the entire caption as a single string is needed. I have added it. This simplifies the test named above, and some code in Writer/Html. The latter is not part of unit testing because the version of JpGraph found in Composer is too antiquated. I verified the Html change manually by running samples/Chart/32_Chart_read_write_HTML.php using a recent version of JpGraph. It was as a result of this test that I uncovered issue #2203. I did not see anything about Charts in docs, so did not add a description of the new function there.

Phpstan is happy with the changes. We'll see how Scrutinizer feels when I push it.
2021-07-02 14:33:43 -07:00
oleibman 075cecd268
Xlsx Reader Better Namespace Handling Phase 1 First Bugfix (#2204)
See issue #2203. An undotted i uncrossed t. When using namespaces, need to call attributes() to access the attributes before trying to access them directly. Failure to do so in castToFormula caused problem for shared formulas.

Surprisingly, this didn't show up in unit tests. Perhaps sharing the same formula between two cells isn't common. It did show up in Chart Samples. I've added a test.

I was really inclined to merge this right away. Not to worry - I can control myself. It should be moved fairly quickly nevertheless.
2021-07-02 12:36:54 +02:00
oleibman 2ae948a319
Reader/Slk vs. Scrutinizer/Phpstan (#2192)
Just reviewing Scrutinizer's list of "bugs". There are 19 ascribed to me. For some, I will definitely take no action (e.g. use of bitwise operators in AND, OR, and XOR functions). However, where I can clean things up so that Scrutinizer is satisfied and the resulting code is not too contorted, I will make an attempt.

This PR corrects 3 problems (2 mine) according to Scrutinizer, and 7 per Phpstan. It also moves the Reader Slk tests under their own directory, as is the case for all the other Reader types.
2021-06-29 20:48:31 +02:00
oleibman cd84020693
Xlsx Reader Better Namespace Handling Phase 1 Try2 (#2173)
* Xlsx Reader Better Namespace Handling Phase 1 Try2

This is a replacement for #2088, which has run into merge conflicts. I will close that PR in the near future, however the comments in that PR may prove useful for this one. While that PR has been in draft status all along, I am marking this one as ready. I will gladly add additional tests (and, of course, make code changes) that anyone has to suggest, but, with my most recent test files which I will describe in a separate comment, I have no further ideas on useful additions.

As mentioned in the earlier ticket, this is a risky change. But, as has been demonstrated, delaying it comes with its own set of risks. It would be helpful to have a temporary moratorium on changes to Reader/Xlsx until this change is merged.

The original commit message follows.

There have been a number of issues concerning the handling of legitimate but unexpected namespace prefixes in Xlsx spreadsheets created by software other than Excel and PhpSpreadsheet/PhpExcel.I have studied them, but, till now, have not had a good idea on how to act on them. A recent comment https://github.com/PHPOffice/PhpSpreadsheet/issues/860#issuecomment-824926224 in issue #860 by @IMSoP has triggered an idea about how to proceed.

Gnumeric Reader was recently changed to handle namespaces better. Using that as a model, this PR begins the process of doing the same for Xlsx. Xlsx is much larger and more complicated than Gnumeric, hence the need to tackle it in multiple phases. I believe that this PR handles all of:
- listWorkSheetNames
- listWorkSheetInfo. Note that there was a bug in this function which would cause it to count only used columns rather than all columns. That bug is corrected.
- active sheet
- selected cell and top left cell
- cell content (formulas, numbers, text)
- hyperlinks
- comments (partial - see below)

This PR does not address:
- styles
- images and charts
- VBA and ribbons
- many other items, I'm sure

The issue for non-standard namespacing till now has been the use of unexpected prefixes. While I was working on this change, @Lambik introduced issue #2067 PR #2068 which introduced a completely different problem - the use of unexpected URLs. That PR and the issue associated with it were quite well documented, including the supplying of a test file and tests for it. I asked if I could take a look to see if it could be integrated with my change, and the result seems to be yes, so those changes are also part of this PR.

While adding a comment to my test file, I discovered that Microsoft had added "threaded comments" as a new feature. I believe these are not yet supported by PhpSpreadsheet, and I am not going to add it, at least not now. I believe that, among other things, this will make identifying the author of a comment more difficult.

Although there are a number of Phpstan baseline changes as part of this PR, I did not attempt to resolve all Phpstan reports for Reader/Xlsx. Nor did I do anything to increase coverage. This change is already large and complex enough without those efforts.
2021-06-25 09:05:49 +02:00
jarrett jordaan 795992835f
When image source is a URL, store the URL for use during extraction. (#2072)
When image source is a link store the link.
Add url mutator.

Update section in documentation on image extraction.
2021-06-24 10:50:44 +02:00
Owen Leibman 83c0f02c95 Move Reader Xlsx Tests from Reader to Reader/Xlsx Try 2
PR #2088 is having major merge problems. This is partly because it moves some tests from Reader to Reader/Xlsx. Making this move beforehand may help. Or it may make things worse, but they are already bad enough that I am contemplating redoing the PR. If I do that, having this done beforehand will make things easier.

This PR does nothing but move some tests. This will make it easier to test changes to Xlsx Reader without having to run each test individually, or without having to run tests for all the other readers at the same time.
2021-06-17 09:45:11 -07:00
Olivier TARGET 803737a893
Fix for #2149 / Read data validations for drop down list in another sheet. (#2150)
* Read data validations for drop down list in another sheet.

* Add function testLoadXlsxDataValidationOfAnotherSheet() in class tests/PhpSpreadsheetTests/Reader/XlsxTest.php for unit test.

* Add sample xlsx for unit tests.

* Modifiy call function isset() for warnings.

* Additional assertions to ensure that the worksheet has been read correctly for DataValidation that references a list on a different worksheet

* This should resolve the phpstan issues

Co-authored-by: Mark Baker <mark@lange.demon.co.uk>
2021-06-15 13:28:10 +02:00
Mark Baker 74b02fb31c
Fix for the BIFF-8 Xls colour mappings in the Reader (#2156)
* Fix for the BIFF-8 Xls colour mappings in the Reader
* Unit test for reading colours, writing hen rereading and ensuring that the RGB values have not changed
2021-06-13 21:46:49 +02:00
Mark Baker 05466e99ce
Html import dimension conversions (#2152)
Allows basic column width conversion when importing from Html that includes UoM... while not overly-sophisticated in converting units to MS Excel's column width units, it should allow import without errors

Also provides a general conversion helper class, and allows column width getters/setters to specify a UoM for easier usage
2021-06-11 17:29:49 +02:00
Mark Baker 19724e3217
Reader writer flags (#2136)
* Use of passing flags with Readers to identify whether speacial features such as loading charts should be enabled; no need to instantiate a reader and manually enable it before loading any more.

This is in preparation for supporting new "boolean" Reaer/Writer features, such as pivot tables

* Use of passing flags with Writers to identify whether speacial features such as loading charts should be enabled; no need to instantiate a writer and manually enable it before loading any more.

* Update documentation with details of changes to the StringValueBinder
2021-06-04 13:45:32 +02:00
oleibman eccfecd529
Reader XML Properties - Eliminate strtotime (#2134)
* Document Properties - Coverage and 32-bit-safe Timestamps

While researching an issue, I noticed that coverage of Document/Properties was poor. Further, the use of int timestamps will eventually lead to problems for 32-bit PHP (see issue #1826).

Coverage Changes:
- Many property types with no special handling are enumerated but not tested. These are removed, but will continue to function as before.
- Existing code theoretically allows property to be set to an object, but there is no means to read or write such a property, and, even if there were, I don't believe Excel supports it. Setting a property to an object will now be changed to a no-op (can throw an exception if preferred).
- Since the Properties object now has no members which are themselves objects, there is no need for a deep clone. The untested __clone method is removed.
- Large switch statements are replaced with associative arrays. Scrutinizer will like that.
- Coverage is now 100%.

<!-- end of coverage changes list -->

Timestamp Changes:
- Timestamps will be stored as int if possible, or float if not. This is, or will soon be, needed for 32-bit systems. Tests have been added for beyond-epoch dates, and run successfully with 32-bit.
- LibreOffice doesn't quite get the Created/Modified properties correct. These are written to the file as a string which includes offset from UTC, but LibreOffice ignores the offset portion when displaying them. Code had been generating these in UTC, but now generates them in default timezone, which should meet user's expectations.

<!-- end of timestamp changes list -->

Other Changes:
- Custom properties added to ODS Writer.
- Samples had not been generating any ODS files. One is now generated.
- Ods uses a single 'keywords' property rather than multiple 'keyword' properties.
- Breaking change - default company is changed to null string from Microsoft Corporation.
- Breaking change of sorts - PropertiesTest incorrectly tested a custom date property against a string, Reader/XlsxTest correctly tested against a timestamp converted to a string. PropertiesTest was defective, and will no longer work as coded; anyone using it as a model will likewise have a problem.
- PHP8.1 has been complaining for weeks about a time zone conversion test. I have now downloaded a version, and changed the code so that it will work in 8.1 as well as prior releases. (It is still likely that the existing code should work in 8.1, but I haven't yet figured out how to file a bug report.) In the course of testing, 3 additional 8.1 problems were reported (all along the lines of "can't pass null to strpos"), and are fixed with null coercion.
- Two Calculation tests failed because of large results on 32-bit system. These are corrected by allowing the functions involved to return float|int rather than int. I suspect that there are other functions with this problem, and will investigate as a follow-up activity.
- See issue #2090. I believe that changes between 17.1 and master will merely cause the problematic spreadsheet to fail in a different way. I believe that enclosing in quotes some variables passed to Document/Properties by Reader/Xlsx will eliminate the problem, but, in the absence of an example file, cannot say for sure.
- Properties tests are now separated out from Reader/XlsxTest and Reader/OdsTest, and now test both Read and Write (via reload).

<!-- end of other changes list -->

Miscellaneous Notes:
- There remains no support for Custom Properties in Xls Reader or Writer.
- We now have default timezones for all of PHP itself, Shared/Date, and Shared/Timezone. That is least one too many. I was unable to disentangle the latter two for this change, but will look into deprecating one or the other in future.

* Phpstan

6 baseline deletions, 2 docblock changes

* Scrutinizer's Turn

3 minor errors that hadn't blocked the request.

* Reader XML Properties - Eliminate strtotime

Piggyback on top of prior changes to eliminate 32-bit-unsafe call.

Add explicit tests for created, modified, and custom date properties.
2021-05-31 11:04:07 +02:00
oleibman e1cb997ee6
Gnumeric Reader - Distinguish Created and Modified Timestamps (#2133)
* Document Properties - Coverage and 32-bit-safe Timestamps

While researching an issue, I noticed that coverage of Document/Properties was poor. Further, the use of int timestamps will eventually lead to problems for 32-bit PHP (see issue #1826).

Coverage Changes:
- Many property types with no special handling are enumerated but not tested. These are removed, but will continue to function as before.
- Existing code theoretically allows property to be set to an object, but there is no means to read or write such a property, and, even if there were, I don't believe Excel supports it. Setting a property to an object will now be changed to a no-op (can throw an exception if preferred).
- Since the Properties object now has no members which are themselves objects, there is no need for a deep clone. The untested __clone method is removed.
- Large switch statements are replaced with associative arrays. Scrutinizer will like that.
- Coverage is now 100%.

<!-- end of coverage changes list -->

Timestamp Changes:
- Timestamps will be stored as int if possible, or float if not. This is, or will soon be, needed for 32-bit systems. Tests have been added for beyond-epoch dates, and run successfully with 32-bit.
- LibreOffice doesn't quite get the Created/Modified properties correct. These are written to the file as a string which includes offset from UTC, but LibreOffice ignores the offset portion when displaying them. Code had been generating these in UTC, but now generates them in default timezone, which should meet user's expectations.

<!-- end of timestamp changes list -->

Other Changes:
- Custom properties added to ODS Writer.
- Samples had not been generating any ODS files. One is now generated.
- Ods uses a single 'keywords' property rather than multiple 'keyword' properties.
- Breaking change - default company is changed to null string from Microsoft Corporation.
- Breaking change of sorts - PropertiesTest incorrectly tested a custom date property against a string, Reader/XlsxTest correctly tested against a timestamp converted to a string. PropertiesTest was defective, and will no longer work as coded; anyone using it as a model will likewise have a problem.
- PHP8.1 has been complaining for weeks about a time zone conversion test. I have now downloaded a version, and changed the code so that it will work in 8.1 as well as prior releases. (It is still likely that the existing code should work in 8.1, but I haven't yet figured out how to file a bug report.) In the course of testing, 3 additional 8.1 problems were reported (all along the lines of "can't pass null to strpos"), and are fixed with null coercion.
- Two Calculation tests failed because of large results on 32-bit system. These are corrected by allowing the functions involved to return float|int rather than int. I suspect that there are other functions with this problem, and will investigate as a follow-up activity.
- See issue #2090. I believe that changes between 17.1 and master will merely cause the problematic spreadsheet to fail in a different way. I believe that enclosing in quotes some variables passed to Document/Properties by Reader/Xlsx will eliminate the problem, but, in the absence of an example file, cannot say for sure.
- Properties tests are now separated out from Reader/XlsxTest and Reader/OdsTest, and now test both Read and Write (via reload).

<!-- end of other changes list -->

Miscellaneous Notes:
- There remains no support for Custom Properties in Xls Reader or Writer.
- We now have default timezones for all of PHP itself, Shared/Date, and Shared/Timezone. That is least one too many. I was unable to disentangle the latter two for this change, but will look into deprecating one or the other in future.

* Phpstan

6 baseline deletions, 2 docblock changes

* Scrutinizer's Turn

3 minor errors that hadn't blocked the request.

* Gnumeric Reader - Distinguish Created and Modified Timestamps

Both are being used to set both fields; change to set the appropriate one in each case.

Also replace use of 32-bit-unsafe strtotime.
2021-05-31 10:24:37 +02:00
oleibman e53a2b2e0d
Document Properties - Coverage and 32-bit-safe Timestamps (#2113)
* Document Properties - Coverage and 32-bit-safe Timestamps

While researching an issue, I noticed that coverage of Document/Properties was poor. Further, the use of int timestamps will eventually lead to problems for 32-bit PHP (see issue #1826).

Coverage Changes:
- Many property types with no special handling are enumerated but not tested. These are removed, but will continue to function as before.
- Existing code theoretically allows property to be set to an object, but there is no means to read or write such a property, and, even if there were, I don't believe Excel supports it. Setting a property to an object will now be changed to a no-op (can throw an exception if preferred).
- Since the Properties object now has no members which are themselves objects, there is no need for a deep clone. The untested __clone method is removed.
- Large switch statements are replaced with associative arrays. Scrutinizer will like that.
- Coverage is now 100%.

<!-- end of coverage changes list -->

Timestamp Changes:
- Timestamps will be stored as int if possible, or float if not. This is, or will soon be, needed for 32-bit systems. Tests have been added for beyond-epoch dates, and run successfully with 32-bit.
- LibreOffice doesn't quite get the Created/Modified properties correct. These are written to the file as a string which includes offset from UTC, but LibreOffice ignores the offset portion when displaying them. Code had been generating these in UTC, but now generates them in default timezone, which should meet user's expectations.

<!-- end of timestamp changes list -->

Other Changes:
- Custom properties added to ODS Writer.
- Samples had not been generating any ODS files. One is now generated.
- Ods uses a single 'keywords' property rather than multiple 'keyword' properties.
- Breaking change - default company is changed to null string from Microsoft Corporation.
- Breaking change of sorts - PropertiesTest incorrectly tested a custom date property against a string, Reader/XlsxTest correctly tested against a timestamp converted to a string. PropertiesTest was defective, and will no longer work as coded; anyone using it as a model will likewise have a problem.
- PHP8.1 has been complaining for weeks about a time zone conversion test. I have now downloaded a version, and changed the code so that it will work in 8.1 as well as prior releases. (It is still likely that the existing code should work in 8.1, but I haven't yet figured out how to file a bug report.) In the course of testing, 3 additional 8.1 problems were reported (all along the lines of "can't pass null to strpos"), and are fixed with null coercion.
- Two Calculation tests failed because of large results on 32-bit system. These are corrected by allowing the functions involved to return float|int rather than int. I suspect that there are other functions with this problem, and will investigate as a follow-up activity.
- See issue #2090. I believe that changes between 17.1 and master will merely cause the problematic spreadsheet to fail in a different way. I believe that enclosing in quotes some variables passed to Document/Properties by Reader/Xlsx will eliminate the problem, but, in the absence of an example file, cannot say for sure.
- Properties tests are now separated out from Reader/XlsxTest and Reader/OdsTest, and now test both Read and Write (via reload).

<!-- end of other changes list -->

Miscellaneous Notes:
- There remains no support for Custom Properties in Xls Reader or Writer.
- We now have default timezones for all of PHP itself, Shared/Date, and Shared/Timezone. That is least one too many. I was unable to disentangle the latter two for this change, but will look into deprecating one or the other in future.

* Phpstan

6 baseline deletions, 2 docblock changes

* Scrutinizer's Turn

3 minor errors that hadn't blocked the request.
2021-05-30 13:55:58 +02:00
oleibman 294933c9e5
Update CsvContiguousTest.php 2021-05-16 11:48:12 -07:00
oleibman 251df6e61e
Merge branch 'master' into csvdflts 2021-05-16 06:13:40 -07:00
Owen Leibman ae80c12ef0 CSV Reader Enhancements
This PR came about as I pondered how feasible it was to change the default escape character from backslash to null string, since the latter emulates Excel's own actions. Also, surveying issues relating to CSV, it seems that people are often in a situation where the current defaults aren't optimal for them (e.g. they are in a region where semicolon rather than comma is a better default delimiter). My case and that case can both be handled by methods after a reader is constructed. However, the issues also show that many use `IOFactory::load` rather than `new Csv()`, and the methods to affect the defaults are not available in that case.

Adding a static callback that can be invoked by the constructor addresses all these problems. This can be set as part of the user application's normal initialization, and no special attention needs to be paid to CSV loads thereafter, no matter how they are invoked.

This also makes it feasible to use 'guess' as inputEncoding, by providing a new setFallbackEncoding (default CP1252) method to use if none of the heuristic tests pass. There was already the ability to guess the encoding before `$reader->load()`, but not before `IOFactory::load`.

Almost all typehints in Reader/Csv and Reader/Csv/Delimiter are now part of the function signature rather than in the DocBlock. The exceptions are one method in Delimiter which uses a `resource` parameter, and the `canRead` and `load` methods, which must match the signature in IOFactory. I will look into changing those later.

The Csv Reader tests are moved into their own directory. All Phpstan baseline entries involving Csv Reader are eliminated.
2021-05-16 06:05:02 -07:00
oleibman d5492ac8ed
Merge branch 'master' into #984 2021-05-11 14:44:33 -07:00
xandros15 bb11378fca #984 fix php-cs-fixer warnings 2021-05-11 12:44:40 +02:00
Mark Baker 72a36a5bb8
Resolve issue with conditional font size set to zero in PHP8 (#2073)
* Let's see if the tests now pass against PHP8; output file looks to be good
* Font can't be both superscript and subscript at the same time, so we use if/else rather than if/if
2021-05-07 12:53:59 +02:00
Mark Baker 5ee4fbf090
Implement basic autofilter ranges with Gnumeric Reader (#2057)
* Load basic autofilter ranges with Gnumeric Reader
* Handle null values passed to row height/column with/merged cells/autofilters
2021-05-04 22:32:12 +02:00
oleibman 4be9366722
Gnumeric Better Namespace Handling (#2022)
* Gnumeric Better Namespace Handling

There have been a number of issues concerning the handling of legitimate but unexpected namespace prefixes in Xlsx spreadsheets created by software other than Excel and PhpSpreadsheet/PhpExcel.I have studied them, but, till now, have not had a good idea on how to act on them. A recent comment https://github.com/PHPOffice/PhpSpreadsheet/issues/860#issuecomment-824926224 in issue #860 by @IMSoP has triggered an idea about how to proceed.

Although the issues exclusively concern Xlsx format, I am starting out by dealing with Gnumeric. It is simpler and smaller than Xlsx, and, more important, already has a test for an unexpected prefix, since, at some point, it changed its generic prefix from gmr to gnm. I added support and a test for that some time ago, but almost certainly not in the best possible manner. The code as changed for this PR seems simpler and less kludgey, both for that exceptional case as well as for normal handling.

My hope is that this change can be a template for similar Reader changes for Xml, Ods, and, especially, Xlsx.

All grandfathered Phpstan issues with Gnumeric are fixed and eliminated from baseline as part of this change.

* Namespace Handling using XMLReader

Adopt a suggestion from @IMSoP affecting listWorkSheetInfo, which uses XMLReader rather than SimpleXML for its processing.

* Update GnumericLoadTest.php

PR #2024 was pushed last night, causing a Phpstan problem with this member.

* Update Gnumeric.php

Suggestions from Mark Baker - strict equality test, more descriptive variable names.
2021-05-04 21:41:11 +02:00
Mark Baker fd14da1675
Ods defined names unit tests (#2054)
* Defined names/formulae in ODS are prefixed by $$ when used in a formula; so we need to strip this out to fully convert them to an Excel formula

* Test for ODS Writer for DefinedNames
2021-05-03 08:39:42 +02:00
xandros15 a757692992 #984 add support notContainsText for conditional styles in xlsx reader 2021-05-02 22:09:38 +02:00
Mark Baker 83e55cffcc
First steps in the implementation of AutoFilters for ODS Reader and Writer (#2053)
* First steps in the implementation of AutoFilters for ODS Reader and Writer, starting with reading a basic AutoFilter range (ignoring row visibility, filter types and active filters for the moment).

And also some additional refactoring to extract the DefinedNames Reader into its own dedicated class as a part of overall code improvement... on the principle of "when working on a class, always try to leave the library codebase in a better state than you found it"

* Provide a basic Ods Writer implementation for AutoFilters
* AutoFilter Reader Test
* AutoFilter Writer Test
* Update Change Log
2021-05-02 22:00:48 +02:00
Mark Baker d555b5d312
Pattern Fill style should default to 'solid' if there is a pattern fill with colour but no style (#2050)
* Pattern Fill style should default to 'solid' if there is a pattern fill style for a conditional; though may need to check if there are defined fg/bg colours as well; and only set a fill style if there are defined colurs
2021-04-30 20:05:45 +02:00
oleibman cc5c0205d5
Fix for Issue 2029 (Invalid Cell Coordinate A-1) (#2032)
* Fix for Issue 2029 (Invalid Cell Coordinate A-1)

Fix for #2021. When Html Reader encounters an embedded table, it tries to shift it up a row. It obviously should not attempt to shift it above row 1. @danmodini reported the problem, and suggests the correct solution. This PR implements that and adds a test case.

Performing some additional testing, I found that Html Reader cannot handle inline column width or row height set in points rather than pixels (and HTML writer with useInlineCss generates these values in points). It also doesn't handle border style when the border width (which it ignores) is omitted. Fixed and added tests.
2021-04-29 22:59:01 +02:00
oleibman aeccdb35e2
XLSX Reader and Empty Fill Tag (#2011)
Openpyxl can generate the xml tag `<patternFill/>`, possibly even as a default style. Excel has no problem with this, treating it as "fill none", but PhpSpreadsheet has a glitch because it treats it as "fill solid white". So, when PhpSpreadsheet loads and saves such a file, the result at first appears as if gridlines are disabled; in fact, the gridlines are merely invisible behind the cells with their solid white fill. This PR makes PhpSpreadsheet behave the same as Excel in this circumstance.

Co-authored-by: Mark Baker <mark@lange.demon.co.uk>
2021-04-20 17:20:59 +02:00
Mark Baker 6282035c96
Extract Property and Style readers from the XML Reader into separate classes (#2009)
* Extract Property and Style readers from the XML Reader into separate classes
2021-04-20 15:27:44 +02:00
Tiago Fernandes 11f8a02194
Merge branch 'master' into issue-907-absolute-path-in-Target 2021-04-19 14:32:20 +01:00
Tiago Fernandes 559c0761df Remove unnecessary changes. Added test 2021-04-19 11:25:48 +01:00
Adrien Crivelli 49f87de165
Reduce PHPStan error in tests 2021-04-12 11:10:23 +09:00