A pure PHP library for reading and writing spreadsheet files
oleibman cd84020693
Xlsx Reader Better Namespace Handling Phase 1 Try2 (#2173)

This is a replacement for #2088, which has run into merge conflicts. I will close that PR in the near future, however the comments in that PR may prove useful for this one. While that PR has been in draft status all along, I am marking this one as ready. I will gladly add additional tests (and, of course, make code changes) that anyone has to suggest, but, with my most recent test files which I will describe in a separate comment, I have no further ideas on useful additions.

As mentioned in the earlier ticket, this is a risky change. But, as has been demonstrated, delaying it comes with its own set of risks. It would be helpful to have a temporary moratorium on changes to Reader/Xlsx until this change is merged.

The original commit message follows.

There have been a number of issues concerning the handling of legitimate but unexpected namespace prefixes in Xlsx spreadsheets created by software other than Excel and PhpSpreadsheet/PhpExcel. I have studied them but, until now, have not had a good idea of how to act on them. A recent comment https://github.com/PHPOffice/PhpSpreadsheet/issues/860#issuecomment-824926224 in issue #860 by @IMSoP has triggered an idea about how to proceed.
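
To illustrate the kind of problem described above, here is a minimal sketch using plain SimpleXML. It is not the code from this PR. It only shows the general technique of selecting elements by namespace URI instead of by prefix. The two XML fragments and the helper name firstCellValue are made up for the illustration.

```php
<?php
// Namespace URI of the main SpreadsheetML vocabulary.
const MAIN_NS = 'http://schemas.openxmlformats.org/spreadsheetml/2006/main';

// Two hypothetical fragments with the same meaning: one uses the default
// namespace, the other binds the same URI to the unexpected prefix "x".
$defaultPrefix = '<worksheet xmlns="' . MAIN_NS . '">'
    . '<sheetData><row><c><v>42</v></c></row></sheetData></worksheet>';
$customPrefix = '<x:worksheet xmlns:x="' . MAIN_NS . '">'
    . '<x:sheetData><x:row><x:c><x:v>42</x:v></x:c></x:row></x:sheetData></x:worksheet>';

// Hypothetical helper: returns the value of the first cell in the sheet.
function firstCellValue(string $xml): string
{
    $root = simplexml_load_string($xml);
    // children($uri) selects by namespace URI, so the prefix chosen by the
    // producing software does not matter.
    $sheetData = $root->children(MAIN_NS)->sheetData;
    $row = $sheetData->children(MAIN_NS)->row;
    $cell = $row->children(MAIN_NS)->c;

    return (string) $cell->children(MAIN_NS)->v;
}

echo firstCellValue($defaultPrefix), PHP_EOL;
echo firstCellValue($customPrefix), PHP_EOL;
```

Both calls go through the same code path. Code that looks elements up by a hard-coded prefix would handle at most one of the two fragments.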

Gnumeric Reader was recently changed to handle namespaces better. Using that as a model, this PR begins the process of doing the same for Xlsx. Xlsx is much larger and more complicated than Gnumeric, hence the need to tackle it in multiple phases. I believe that this PR handles all of:
- listWorkSheetNames
- listWorkSheetInfo. Note that there was a bug in this function which would cause it to count only used columns rather than all columns. That bug is corrected.
- active sheet
- selected cell and top left cell
- cell content (formulas, numbers, text)
- hyperlinks
- comments (partial - see below)

This PR does not address:
- styles
- images and charts
- VBA and ribbons
- many other items, I'm sure

The issue with non-standard namespacing until now has been the use of unexpected prefixes. While I was working on this change, @Lambik opened issue #2067 and PR #2068, which raised a completely different problem: the use of unexpected URLs. That PR and the issue associated with it were quite well documented, including the supplying of a test file and tests for it. I asked if I could take a look to see if it could be integrated with my change, and the result seems to be yes, so those changes are also part of this PR.

While adding a comment to my test file, I discovered that Microsoft had added "threaded comments" as a new feature. I believe these are not yet supported by PhpSpreadsheet, and I am not going to add it, at least not now. I believe that, among other things, this will make identifying the author of a comment more difficult.

Although there are a number of Phpstan baseline changes as part of this PR, I did not attempt to resolve all Phpstan reports for Reader/Xlsx. Nor did I do anything to increase coverage. This change is already large and complex enough without those efforts.
2021-06-25 09:05:49 +02:00
.github Upgrade to GitHub-native Dependabot (#2044) 2021-06-24 12:08:41 +02:00
bin Move documentation builder to infra so that it isn't included in non `--dev` composer downloads 2021-05-28 22:35:37 +02:00
docs When image source is a URL, store the URL for use during extraction. (#2072) 2021-06-24 10:50:44 +02:00
infra Locale Generator - Change to Use Unix Line Endings Even on Windows (#2174) 2021-06-19 10:20:16 +02:00
samples Autofilter Part 2 2021-06-24 10:09:21 +02:00
src/PhpSpreadsheet Xlsx Reader Better Namespace Handling Phase 1 Try2 (#2173) 2021-06-25 09:05:49 +02:00
tests Xlsx Reader Better Namespace Handling Phase 1 Try2 (#2173) 2021-06-25 09:05:49 +02:00
.gitattributes Additional language data, and improved automated build of translation files for Calculation Engine locale 2021-05-20 20:41:09 +02:00
.gitignore Make Documentation Updates Easier and More Accurate (#1573) 2021-02-15 20:50:20 +01:00
.php_cs.dist PHPStan Level 2 2021-04-04 22:06:00 +09:00
.phpcs.xml.dist Move phpcs config to file 2020-07-26 14:48:06 +09:00
.scrutinizer.yml Scrutinizer wait shorter for coverage 2020-05-25 11:20:08 +09:00
CHANGELOG.PHPExcel.md Prefer https:// URLs when available in docs & comments 2018-10-28 13:55:00 +11:00
CHANGELOG.md When image source is a URL, store the URL for use during extraction. (#2072) 2021-06-24 10:50:44 +02:00
CONTRIBUTING.md Document release process 2021-03-30 10:11:46 +09:00
LICENSE Change license from LGPL 2.1 to MIT 2019-11-17 18:08:34 +01:00
README.md Drop Travis 2020-11-26 11:10:52 +09:00
composer.json Additional language data, and improved automated build of translation files for Calculation Engine locale 2021-05-20 20:41:09 +02:00
composer.lock Resolve default values when a null argument is passed for HLOOKUP(), VLOOKUP() and ADDRESS() functions 2021-05-27 12:02:38 +02:00
mkdocs.yml Make Documentation Updates Easier and More Accurate (#1573) 2021-02-15 20:50:20 +01:00
phpstan-baseline.neon Xlsx Reader Better Namespace Handling Phase 1 Try2 (#2173) 2021-06-25 09:05:49 +02:00
phpstan.neon.dist Avoid memory leak by releasing image resources 2021-05-16 12:39:09 +09:00
phpunit.xml.dist Use current PHPUnit configuration xsd 2020-05-17 18:38:49 +09:00

README.md

PhpSpreadsheet


PhpSpreadsheet is a library written in pure PHP that offers a set of classes for reading and writing various spreadsheet file formats, such as Excel and LibreOffice Calc.
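
As a rough illustration of reading and writing, the following sketch creates a workbook, saves it as Xlsx, and loads it back. It assumes the library has been installed with Composer so that vendor/autoload.php exists. The file name hello.xlsx is only an example.

```php
<?php
// Assumes a Composer install of phpoffice/phpspreadsheet.
require 'vendor/autoload.php';

use PhpOffice\PhpSpreadsheet\IOFactory;
use PhpOffice\PhpSpreadsheet\Spreadsheet;
use PhpOffice\PhpSpreadsheet\Writer\Xlsx;

// Write: create a workbook and put a value into cell A1.
$spreadsheet = new Spreadsheet();
$spreadsheet->getActiveSheet()->setCellValue('A1', 'Hello World');
(new Xlsx($spreadsheet))->save('hello.xlsx');

// Read: IOFactory picks a reader that matches the file.
$loaded = IOFactory::load('hello.xlsx');
$value = $loaded->getActiveSheet()->getCell('A1')->getValue();

echo $value, PHP_EOL;
```

IOFactory::load() is convenient when the file type is not known in advance. When it is known, a specific reader class can be instantiated directly instead.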

Documentation

Read more about it, including install instructions, in the official documentation. Or check out the API documentation.

Please ask your support questions on StackOverflow, or have a quick chat on Gitter.

PHPExcel vs PhpSpreadsheet?

PhpSpreadsheet is the next version of PHPExcel. It breaks compatibility to dramatically improve the code base quality (namespaces, PSR compliance, use of latest PHP language features, etc.).

Because all efforts have shifted to PhpSpreadsheet, PHPExcel will no longer be maintained. All contributions for PHPExcel, whether patches or new features, should target the PhpSpreadsheet master branch.

Do you need to migrate? There is an automated tool for that.

License

PhpSpreadsheet is licensed under MIT.