PhpSpreadsheet/tests/data/Reader/XLSX
oleibman cd84020693
Xlsx Reader Better Namespace Handling Phase 1 Try2 (#2173)
* Xlsx Reader Better Namespace Handling Phase 1 Try2

This is a replacement for #2088, which has run into merge conflicts. I will close that PR in the near future; however, the comments in that PR may prove useful for this one. While that PR has been in draft status all along, I am marking this one as ready. I will gladly add any additional tests (and, of course, make any code changes) that anyone suggests, but, with my most recent test files, which I will describe in a separate comment, I have no further ideas for useful additions.

As mentioned in the earlier ticket, this is a risky change. But, as has been demonstrated, delaying it comes with its own set of risks. It would be helpful to have a temporary moratorium on changes to Reader/Xlsx until this change is merged.

The original commit message follows.

There have been a number of issues concerning the handling of legitimate but unexpected namespace prefixes in Xlsx spreadsheets created by software other than Excel and PhpSpreadsheet/PhpExcel. I have studied them but, until now, have not had a good idea of how to act on them. A recent comment by @IMSoP in issue #860 (https://github.com/PHPOffice/PhpSpreadsheet/issues/860#issuecomment-824926224) has triggered an idea about how to proceed.
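
To make the prefix problem concrete, here is a minimal sketch in Python (not the PHP code in this PR). It assumes the standard SpreadsheetML main namespace URI; the sample documents and the `sheet_names` helper are hypothetical. The point it illustrates is that a prefix is only a local alias, so a lookup keyed on the namespace URI finds the same elements no matter which prefix the writing software chose.

```python
import xml.etree.ElementTree as ET

# Assumed: the main SpreadsheetML namespace URI (transitional OOXML).
MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"

# The same workbook fragment written three ways (hypothetical samples).
# 1. Default namespace, no prefix: what a reader typically expects.
DEFAULT_NS_DOC = (
    f'<workbook xmlns="{MAIN_NS}">'
    '<sheets><sheet name="Data"/></sheets></workbook>'
)
# 2. An explicit "x" prefix on every element.
PREFIXED_DOC = (
    f'<x:workbook xmlns:x="{MAIN_NS}">'
    '<x:sheets><x:sheet name="Data"/></x:sheets></x:workbook>'
)
# 3. An arbitrary prefix; equally legitimate XML.
ODD_PREFIX_DOC = (
    f'<zz9:workbook xmlns:zz9="{MAIN_NS}">'
    '<zz9:sheets><zz9:sheet name="Data"/></zz9:sheets></zz9:workbook>'
)


def sheet_names(xml_text):
    """Return sheet names, matching elements by namespace URI, not prefix."""
    root = ET.fromstring(xml_text)
    # ElementTree expands every tag to "{uri}localname", so the prefix
    # used in the source text plays no part in the match.
    return [el.get("name") for el in root.iter(f"{{{MAIN_NS}}}sheet")]


if __name__ == "__main__":
    for doc in (DEFAULT_NS_DOC, PREFIXED_DOC, ODD_PREFIX_DOC):
        print(sheet_names(doc))
```

A reader that instead searches for a literal tag name or a hard-coded prefix would see only the first form, which is the kind of failure the linked issues describe.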

Gnumeric Reader was recently changed to handle namespaces better. Using that as a model, this PR begins the process of doing the same for Xlsx. Xlsx is much larger and more complicated than Gnumeric, hence the need to tackle it in multiple phases. I believe that this PR handles all of:
- listWorkSheetNames
- listWorkSheetInfo. Note that there was a bug in this function which would cause it to count only used columns rather than all columns. That bug is corrected.
- active sheet
- selected cell and top left cell
- cell content (formulas, numbers, text)
- hyperlinks
- comments (partial - see below)

This PR does not address:
- styles
- images and charts
- VBA and ribbons
- many other items, I'm sure

Until now, the issue with non-standard namespacing has been the use of unexpected prefixes. While I was working on this change, @Lambik opened issue #2067 and PR #2068, which raised a completely different problem: the use of unexpected URLs. That PR and its associated issue were well documented, and included a test file and tests for it. I asked if I could take a look to see whether it could be integrated with my change, and the answer seems to be yes, so those changes are also part of this PR.
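
The URL variant of the problem can be sketched the same way, again in Python rather than the PHP of this PR. The source does not name the URLs involved; this sketch assumes the purl.oclc.org namespace used by strict OOXML files (which the test file name namespacepurl.xlsx hints at) alongside the usual schemas.openxmlformats.org one. The helper names and sample documents are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URIs: transitional vs. strict OOXML SpreadsheetML.
TRANSITIONAL_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
STRICT_NS = "http://purl.oclc.org/ooxml/spreadsheetml/main"
KNOWN_MAIN_NAMESPACES = (TRANSITIONAL_NS, STRICT_NS)

# Hypothetical sample documents, identical apart from the namespace URL.
TRANSITIONAL_DOC = (
    f'<workbook xmlns="{TRANSITIONAL_NS}">'
    '<sheets><sheet name="Data"/></sheets></workbook>'
)
STRICT_DOC = (
    f'<workbook xmlns="{STRICT_NS}">'
    '<sheets><sheet name="Data"/></sheets></workbook>'
)


def main_namespace(root):
    """Extract the root element's namespace URI and check it is a known one."""
    # ElementTree tags look like "{uri}localname" when namespaced.
    uri = root.tag[1:].split("}", 1)[0] if root.tag.startswith("{") else ""
    if uri not in KNOWN_MAIN_NAMESPACES:
        raise ValueError(f"unrecognised main namespace: {uri!r}")
    return uri


def sheet_names(xml_text):
    """Return sheet names using whichever known namespace the file declares."""
    root = ET.fromstring(xml_text)
    ns = main_namespace(root)
    return [el.get("name") for el in root.iter(f"{{{ns}}}sheet")]


if __name__ == "__main__":
    print(sheet_names(TRANSITIONAL_DOC))
    print(sheet_names(STRICT_DOC))
```

Detecting the namespace once from the root element and reusing it for later lookups keeps the rest of the reading logic identical for both URL families.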

While adding a comment to my test file, I discovered that Microsoft had added "threaded comments" as a new feature. I believe these are not yet supported by PhpSpreadsheet, and I am not going to add support for them, at least not now. I believe that, among other things, threaded comments will make identifying the author of a comment more difficult.

Although there are a number of Phpstan baseline changes as part of this PR, I did not attempt to resolve all Phpstan reports for Reader/Xlsx. Nor did I do anything to increase coverage. This change is already large and complex enough without those efforts.
2021-06-25 09:05:49 +02:00
PageSetup.xlsx Forgot to check in the test files for the unit tests 2020-07-05 16:28:46 +02:00
autofilterTest.xlsx Refactoring xlsx reader (#1033) 2019-06-30 23:42:25 +02:00
bug1686b.xlsx Fix for 3 Issues Involving ReadXlsx and NamedRange (#1742) 2020-12-10 18:08:10 +01:00
condfmtnum.xlsx Handle ConditionalStyle NumberFormat When Reading Xlsx File (#1296) 2020-01-04 00:10:41 +01:00
conditionalFormatting2Test.xlsx Conditionals - Extend Support for (NOT)CONTAINSBLANKS (#1278) 2020-01-04 18:50:04 +01:00
conditionalFormatting3Test.xlsx #984 add support notContainsText for conditional styles in xlsx reader 2021-05-02 22:09:38 +02:00
conditionalFormattingDataBarTest.xlsx Support DataBar of conditional formatting rule (#1754) 2021-01-29 16:57:40 +01:00
conditionalFormattingTest.xlsx Refactoring xlsx reader (#1033) 2019-06-30 23:42:25 +02:00
dataValidation2Test.xlsx Fix for #2149 / Read data validations for drop down list in another sheet. (#2150) 2021-06-15 13:28:10 +02:00
dataValidationTest.xlsx Basic unit test and fix for loading data validations from xlsx file (#1063) 2019-07-08 19:55:14 +02:00
double_attr_drawing.xlsx Fix failure when parsing xlsx with drawing having double (redefined) … (#945) 2019-05-30 11:42:00 +02:00
empty_drawing.xlsx Fix #853 when loading and saving XLSX file with empty drawing cause c… (#882) 2019-05-30 10:38:03 +02:00
excelChartsTest.xlsx Fix/chart axis titles (#1760) 2021-01-31 19:13:50 +01:00
issue2109b.xlsx Xlsx Reader Better Namespace Handling Phase 1 Try2 (#2173) 2021-06-25 09:05:49 +02:00
namespacenonstd.xlsx Xlsx Reader Better Namespace Handling Phase 1 Try2 (#2173) 2021-06-25 09:05:49 +02:00
namespacepurl.xlsx Xlsx Reader Better Namespace Handling Phase 1 Try2 (#2173) 2021-06-25 09:05:49 +02:00
namespaces.openpyxl35.xlsx Xlsx Reader Better Namespace Handling Phase 1 Try2 (#2173) 2021-06-25 09:05:49 +02:00
namespaces.xlsx Xlsx Reader Better Namespace Handling Phase 1 Try2 (#2173) 2021-06-25 09:05:49 +02:00
namespacestd.xlsx Xlsx Reader Better Namespace Handling Phase 1 Try2 (#2173) 2021-06-25 09:05:49 +02:00
pageSetupTest.xlsx Refactoring xlsx reader (#1033) 2019-06-30 23:42:25 +02:00
pr1769e.xlsx Remove unnecessary changes. Added test 2021-04-19 11:25:48 +01:00
pr1769g.py.xlsx XLSX Reader and Empty Fill Tag (#2011) 2021-04-20 17:20:59 +02:00
pr2050cf-fill.xlsx Pattern Fill style should default to 'solid' if there is a pattern fill with colour but no style (#2050) 2021-04-30 20:05:45 +02:00
propertyTest.xlsx Refactoring xlsx reader (#1033) 2019-06-30 23:42:25 +02:00
rowColumnAttributeTest.xlsx Refactoring xlsx reader (#1033) 2019-06-30 23:42:25 +02:00
sheetsChartsTest.xlsx Fix/sheets xlsx chart (#1761) 2021-01-31 18:53:54 +01:00
stylesTest.xlsx Refactoring xlsx reader (#1033) 2019-06-30 23:42:25 +02:00
urlImage.xlsx When image source is a URL, store the URL for use during extraction. (#2072) 2021-06-24 10:50:44 +02:00
without_cell_reference.xlsx Support missing attribute `r` in `c` node when reading xlsx 2017-09-22 14:49:38 +09:00