* Tweaks to Input File Validation
This started as a response to issue #1718, for which it is a partial (not complete) solution. The following changes are made:
- canRead can currently throw an exception. This seems wrong. It should just return true/false.
- Breaking change of sorts. When AssertFile encounters a non-existent or unreadable file, it throws InvalidArgumentException. This does not make sense. I have changed it to throw PhpSpreadsheet/Reader/Exception.
- Since the previous bullet item required changing of most of the Reader files anyhow, this is a good time to add explicit typing for canRead in the function signature rather than the DocBlock. Since all the canRead functions inherit from an abstract version in IReader, they all have to be changed simulatneously. Except for Xlsx and Ods, most of the Reader files are otherwise unchanged.
- AssertFile is changed to add an optional "zip member" parameter. It will check for the existence of an appropriate member in what is supposed to be a zip file. It is used by Xlsx and Ods.
- Verifying that a given file is a valid zip ought to be a feature of ZipArchive. Thanks to a particularly nasty bug in php/libzip (see https://bugs.php.net/bug.php?id=81222), it is unsafe to attempt to open a zero-length file as a zip archive. There is a solution, but it does not apply to all the PHP releases which we support, and isn't even necessarily supported on all the point versions of the PHP versions which we do support. I have coded up a manual test for "valid zip", with a comment pointing to the spec.
- In theory, tests now cover 100% of the code in Shared/File. In practice ... One of the tests require that chmod works properly, which is not quite true on Windows systems, so that test is skipped on Windows. Another test requires that php.ini uses a non-default value for upload_temp_dir (can't be overridden in application code), which is probably not the case when Github runs the unit tests, so that test is skipped when appropriate. I have run tests for both on systems where they are not skipped.
* Update File.php
* Scrutinizer Timeout
It's not actually timing out, it's just waiting for something to finish that finished ages ago. Making a meaningless comment change in hopes that will clear the jam. Not particularly hopeful.
Most of the remaining 32-bit-unsafe date handling that remains in PhpSpreadsheet is in AutoFilter. Cleaning this up demonstrated that there are a lot of problems with AutoFilter, and I will do it in two pieces. Part 1 was PR #2141 which I have just merged.
In this PR:
- Fix remaining 32-bit dates in filterTestInDateGroupSet.
- Also in some of the existing AutoFilter samples. Note that the comments in two of those said the filter was being set for the first day of each month, but the code specifies the last day - I have corrected the comments.
- Remove mocking in unit tests for AutoFilter in favor of 'real' tests.
- Code coverage is now 100% in all of AutoFilter, AutoFilter/Column, and AutoFilter/Common/Rule.
- No remaining AutoFilter(/Column(/Rule)) exceptions in Phpstan baseline.
- Documentation for escaping of asterisk, question mark, and tilde in text filters included spurious backslashes which are now removed.
- Text filter escaping of question mark did not work. There had been no unit tests for any text filtering.
- Likewise there had been no testing for TopTen.
- Above- and below- average filters were not working because they acquired their Calculation instance incorrectly. There had been no tests.
- Several unchanging private static arrays in Rule were changed to private const arrays.
- Clones are now tested.
- RuleTest is moved to same directory as other tests.
Specifically, the default for these two functions has been changed from `ENT_COMPAT` to `ENT_QUOTES | ENT_SUBSTITUTE`
This PR configures the argument used for those functions in Settings, and then explicitly applies it everywhere they are used in the codebase.
Most of the remaining 32-bit-unsafe date handling that remains in PhpSpreadsheet is in AutoFilter. Cleaning this up demonstrated that there are a lot of problems with AutoFilter, and I will do it in (probably two) pieces.
In this PR:
- Dynamic date processing was really wrong. There were no tests nor samples to exercise this code. (If you need details, you can try running the new sample against old code.) It is completely re-written.
- ThisYear/Month/Week/Quarter had been omitted.
- Rules such as AUTOFILTER_RULETYPE_DYNAMIC_MONTH_2 were almost correct, but showed some off-by-1 errors. I suspect these were timezone-related, and therefore more obvious to those of us far away from Greenwich.
- All Autofilter tests are moved to a single directory.
- The documentation suggested using null with the Dynamic Date setup, but Phpstan did not like that in my new tests/samples. Rather than change the doc block, I changed the documentation to suggest null string.
- I created a new sample to generate sheets using all the dynamic filters.
- I have added some new unit tests for each of the dynamic filters. I would love to be able to add some "time travel" tests because the dynamic nature of the filter makes most of the results change from day to day, which presents significant challenges in writing comprehensive unit tests (the same is true for code coverage). I was not able to find a good way to simulate time within PhpUnit, but the Linux 'faketime' package was extraordinarily easy and helpful in allowing me to confirm some edge cases. I had less satisfactory results with some Windows equivalents, but was still able to run some tests.
- Code coverage increases from below 60% to above 80%.
To be done:
- Some 32-bit unsafe dates remain in filterTestInDateGroupSet.
- Also in some of the existing AutoFilter samples.
- Study existing unit tests for AutoFilter which use mocking to see if they can/should be replaced with 'real' tests.
- Improve code coverage in AutoFilter, AutoFilter/Column, and AutoFilter/Common/Rule.
* Document Properties - Coverage and 32-bit-safe Timestamps
While researching an issue, I noticed that coverage of Document/Properties was poor. Further, the use of int timestamps will eventually lead to problems for 32-bit PHP (see issue #1826).
Coverage Changes:
- Many property types with no special handling are enumerated but not tested. These are removed, but will continue to function as before.
- Existing code theoretically allows property to be set to an object, but there is no means to read or write such a property, and, even if there were, I don't believe Excel supports it. Setting a property to an object will now be changed to a no-op (can throw an exception if preferred).
- Since the Properties object now has no members which are themselves objects, there is no need for a deep clone. The untested __clone method is removed.
- Large switch statements are replaced with associative arrays. Scrutinizer will like that.
- Coverage is now 100%.
<!-- end of coverage changes list -->
Timestamp Changes:
- Timestamps will be stored as int if possible, or float if not. This is, or will soon be, needed for 32-bit systems. Tests have been added for beyond-epoch dates, and run successfully with 32-bit.
- LibreOffice doesn't quite get the Created/Modified properties correct. These are written to the file as a string which includes offset from UTC, but LibreOffice ignores the offset portion when displaying them. Code had been generating these in UTC, but now generates them in default timezone, which should meet user's expectations.
<!-- end of timestamp changes list -->
Other Changes:
- Custom properties added to ODS Writer.
- Samples had not been generating any ODS files. One is now generated.
- Ods uses a single 'keywords' property rather than multiple 'keyword' properties.
- Breaking change - default company is changed to null string from Microsoft Corporation.
- Breaking change of sorts - PropertiesTest incorrectly tested a custom date property against a string, Reader/XlsxTest correctly tested against a timestamp converted to a string. PropertiesTest was defective, and will no longer work as coded; anyone using it as a model will likewise have a problem.
- PHP8.1 has been complaining for weeks about a time zone conversion test. I have now downloaded a version, and changed the code so that it will work in 8.1 as well as prior releases. (It is still likely that the existing code should work in 8.1, but I haven't yet figured out how to file a bug report.) In the course of testing, 3 additional 8.1 problems were reported (all along the lines of "can't pass null to strpos"), and are fixed with null coercion.
- Two Calculation tests failed because of large results on 32-bit system. These are corrected by allowing the functions involved to return float|int rather than int. I suspect that there are other functions with this problem, and will investigate as a follow-up activity.
- See issue #2090. I believe that changes between 17.1 and master will merely cause the problematic spreadsheet to fail in a different way. I believe that enclosing in quotes some variables passed to Document/Properties by Reader/Xlsx will eliminate the problem, but, in the absence of an example file, cannot say for sure.
- Properties tests are now separated out from Reader/XlsxTest and Reader/OdsTest, and now test both Read and Write (via reload).
<!-- end of other changes list -->
Miscellaneous Notes:
- There remains no support for Custom Properties in Xls Reader or Writer.
- We now have default timezones for all of PHP itself, Shared/Date, and Shared/Timezone. That is least one too many. I was unable to disentangle the latter two for this change, but will look into deprecating one or the other in future.
* Phpstan
6 baseline deletions, 2 docblock changes
* Scrutinizer's Turn
3 minor errors that hadn't blocked the request.
* Reader XML Properties - Eliminate strtotime
Piggyback on top of prior changes to eliminate 32-bit-unsafe call.
Add explicit tests for created, modified, and custom date properties.
* Document Properties - Coverage and 32-bit-safe Timestamps
While researching an issue, I noticed that coverage of Document/Properties was poor. Further, the use of int timestamps will eventually lead to problems for 32-bit PHP (see issue #1826).
Coverage Changes:
- Many property types with no special handling are enumerated but not tested. These are removed, but will continue to function as before.
- Existing code theoretically allows property to be set to an object, but there is no means to read or write such a property, and, even if there were, I don't believe Excel supports it. Setting a property to an object will now be changed to a no-op (can throw an exception if preferred).
- Since the Properties object now has no members which are themselves objects, there is no need for a deep clone. The untested __clone method is removed.
- Large switch statements are replaced with associative arrays. Scrutinizer will like that.
- Coverage is now 100%.
<!-- end of coverage changes list -->
Timestamp Changes:
- Timestamps will be stored as int if possible, or float if not. This is, or will soon be, needed for 32-bit systems. Tests have been added for beyond-epoch dates, and run successfully with 32-bit.
- LibreOffice doesn't quite get the Created/Modified properties correct. These are written to the file as a string which includes offset from UTC, but LibreOffice ignores the offset portion when displaying them. Code had been generating these in UTC, but now generates them in default timezone, which should meet user's expectations.
<!-- end of timestamp changes list -->
Other Changes:
- Custom properties added to ODS Writer.
- Samples had not been generating any ODS files. One is now generated.
- Ods uses a single 'keywords' property rather than multiple 'keyword' properties.
- Breaking change - default company is changed to null string from Microsoft Corporation.
- Breaking change of sorts - PropertiesTest incorrectly tested a custom date property against a string, Reader/XlsxTest correctly tested against a timestamp converted to a string. PropertiesTest was defective, and will no longer work as coded; anyone using it as a model will likewise have a problem.
- PHP8.1 has been complaining for weeks about a time zone conversion test. I have now downloaded a version, and changed the code so that it will work in 8.1 as well as prior releases. (It is still likely that the existing code should work in 8.1, but I haven't yet figured out how to file a bug report.) In the course of testing, 3 additional 8.1 problems were reported (all along the lines of "can't pass null to strpos"), and are fixed with null coercion.
- Two Calculation tests failed because of large results on 32-bit system. These are corrected by allowing the functions involved to return float|int rather than int. I suspect that there are other functions with this problem, and will investigate as a follow-up activity.
- See issue #2090. I believe that changes between 17.1 and master will merely cause the problematic spreadsheet to fail in a different way. I believe that enclosing in quotes some variables passed to Document/Properties by Reader/Xlsx will eliminate the problem, but, in the absence of an example file, cannot say for sure.
- Properties tests are now separated out from Reader/XlsxTest and Reader/OdsTest, and now test both Read and Write (via reload).
<!-- end of other changes list -->
Miscellaneous Notes:
- There remains no support for Custom Properties in Xls Reader or Writer.
- We now have default timezones for all of PHP itself, Shared/Date, and Shared/Timezone. That is least one too many. I was unable to disentangle the latter two for this change, but will look into deprecating one or the other in future.
* Phpstan
6 baseline deletions, 2 docblock changes
* Scrutinizer's Turn
3 minor errors that hadn't blocked the request.
While researching an issue, I noticed that coverage of Document/Properties was poor. Further, the use of int timestamps will eventually lead to problems for 32-bit PHP (see issue #1826).
Coverage Changes:
- Many property types with no special handling are enumerated but not tested. These are removed, but will continue to function as before.
- Existing code theoretically allows property to be set to an object, but there is no means to read or write such a property, and, even if there were, I don't believe Excel supports it. Setting a property to an object will now be changed to a no-op (can throw an exception if preferred).
- Since the Properties object now has no members which are themselves objects, there is no need for a deep clone. The untested __clone method is removed.
- Large switch statements are replaced with associative arrays. Scrutinizer will like that.
- Coverage is now 100%.
<!-- end of coverage changes list -->
Timestamp Changes:
- Timestamps will be stored as int if possible, or float if not. This is, or will soon be, needed for 32-bit systems. Tests have been added for beyond-epoch dates, and run successfully with 32-bit.
- LibreOffice doesn't quite get the Created/Modified properties correct. These are written to the file as a string which includes offset from UTC, but LibreOffice ignores the offset portion when displaying them. Code had been generating these in UTC, but now generates them in default timezone, which should meet user's expectations.
<!-- end of timestamp changes list -->
Other Changes:
- Custom properties added to ODS Writer.
- Samples had not been generating any ODS files. One is now generated.
- Ods uses a single 'keywords' property rather than multiple 'keyword' properties.
- Breaking change - default company is changed to null string from Microsoft Corporation.
- Breaking change of sorts - PropertiesTest incorrectly tested a custom date property against a string, Reader/XlsxTest correctly tested against a timestamp converted to a string. PropertiesTest was defective, and will no longer work as coded; anyone using it as a model will likewise have a problem.
- PHP8.1 has been complaining for weeks about a time zone conversion test. I have now downloaded a version, and changed the code so that it will work in 8.1 as well as prior releases. (It is still likely that the existing code should work in 8.1, but I haven't yet figured out how to file a bug report.) In the course of testing, 3 additional 8.1 problems were reported (all along the lines of "can't pass null to strpos"), and are fixed with null coercion.
- Two Calculation tests failed because of large results on 32-bit system. These are corrected by allowing the functions involved to return float|int rather than int. I suspect that there are other functions with this problem, and will investigate as a follow-up activity.
- See issue #2090. I believe that changes between 17.1 and master will merely cause the problematic spreadsheet to fail in a different way. I believe that enclosing in quotes some variables passed to Document/Properties by Reader/Xlsx will eliminate the problem, but, in the absence of an example file, cannot say for sure.
- Properties tests are now separated out from Reader/XlsxTest and Reader/OdsTest, and now test both Read and Write (via reload).
<!-- end of other changes list -->
Miscellaneous Notes:
- There remains no support for Custom Properties in Xls Reader or Writer.
- We now have default timezones for all of PHP itself, Shared/Date, and Shared/Timezone. That is least one too many. I was unable to disentangle the latter two for this change, but will look into deprecating one or the other in future.
19_NamedRange.php was not changed to use absolute addressing when that was introduced to Named Ranges. Consequently, the output from this sample has been wrong ever since, for both Xls and Xlsx.
There was an additional problem with Xls. It appears that the Xls Writer Parser does not parse multiple concatenations using the ampersand operator correctly. So, `=B1+" "+B2` was parsed as `=B1+" "`. I believe that this is due to ampersand being treated as a condition rather than an operator; `A1>A2>A3` isn't valid, but `A1&A2&A3` is. My original PR (#1992, which I will now close) only partially resolved this, but I think moving ampersand handling from `condition` to `expression` is fully successful.
There are already more than ample tests for Named Ranges, so I did not add a new one for that purpose. However, I did add a new test for the Xls parser problem.
As issue #2042 documents, SUM behaves differently with invalid strings depending on whether they come from a cell or are used as literals in the formula. SUM is not alone in this regard; COUNTA is another function within this behavior, and the solution to this one is modeled on COUNTA. New tests are added for SUM, and the resulting tests are duplicated to confirm correct behavior for both cells and literals.
Samples 16 (CSV), 17 (Html), and 21 (PDF) were adversely affected by this problem. 17 and 21 were immediately fixed, but 16 had another problem - Excel was not interpreting the UTF8 currency symbols correctly, even though the file was saved with a BOM. After some experimenting, it appears that the `sep=;` line generated by setExcelCompatibility(true) causes Excel to mis-handle the file. This seems like a bug - there is apparently no way to save a UTF-8 CSV with non-ASCII characters which specifies a non-standard separator which Excel will open correctly. I don't know if this is a recent change or if it is just the case that nobody noticed this problem till now. So, I changed Sample 16 to use setUseBom rather than setExcelCompatibility, which solved its problem. I then added new tests for setExcelCompatibility, with documentation of this problem.
* Additional unit tests and rationalisation for Financial Functions
* Providing a series of sample files for Financial functions
* Refactor the last of the existing Financial functions
* Some more unit tests with default assignments from null arguments
Co-authored-by: Adrien Crivelli <adrien.crivelli@gmail.com>
* jpgraph seems to be finally dying with PHP. Until we have a valid alternative, disabling this run for PHP because it errors
https://github.com/HuasoFoundries/jpgraph looks like a natural successor, but it isn't BC so it will require some work to integrate
* First step extracting INDIRECT() and OFFSET() to their own classes
* Start building unit tests for OFFSET() and INDEX()
* Named ranges should be handled by the Calculation Engine, not by the implementation of the Excel INDIRECT() function
* When calling the calculation engine to get the range of cells to return, INDIRECT() and OFFSET() should use the instance of the calculation engine for the current workbook to benefit from cached results in that range
There's a couple of minor bugfixes in here; but it's basically just refactoring of the INDIRECT() and OFFSET() Excel functions into their own classes - still needs a lot of work on unit testing; and there's a lot more that could be improved in the code itself (including handling of the a1 flag for R1C1 format in INDIRECT()
* Extract LookupRef\INDEX() into index() method of LookupRef\Matrix class
Additional tests
* Bugfix for returning a column using INDEX()
* Some improvements to ROW() and COLUMN()
* Simplify some of the INDEX() logic, eliminating redundant code
* Stacked Alignment - Use Class Constant Rather than Literal
PR #1580 defined constants for "stacked" alignment in cells.
Using those constants outside of Style/Alignment was beyond the
scope of the original PR, but I said I would get to it.
This PR replaces all uses of literal -165, and appropriate uses of
literal 255, with the named constants, and adds tests to make sure
that the changed code is covered in the test suite.
There are no changes to code. Additional tests are added,
so that the following 6 items now have 100% test coverage:
- Comment
- DefinedName
- DocumentGenerator
- IOFactory
- NamedFormula
- NamedRange
All other Samples write to temporary directory. DefinedNames samples
write to main directory, which (a) means they aren't stored with others,
and (b) they aren't ignored by git so look like changed files.
The tests are also simplified by requiring Header rather than Bootstrap,
making use of Helper.
* Improving Coverage for Excel2003 XML Reader
Reader/Xml is now 100% covered.
File templates/Excel2003XMLTest.xml, used in some tests, is *not*
readable by a current version of Excel. I have substituted a new file
excel2003.xml to be used in its place. I have not deleted the original
in case someone in future (possibly me) wants to see what it needs to
make it usable.
There are minimal code changes.
- Unused protected functions pixel2WidthUnits and widthUnits2Pixel
are deleted.
- One regex looking to convert hex characters is changed from a-z to a-f,
and made case insensitive.
- No calculation performed for "error" cell (previously calculation
was attempted and threw exception).
- Empty relative row/cell is now handled correctly.
- Style applied to empty cell when appropriate.
- Support added for textRotation.
- Support added for border styles.
- Support added for diagonal borders.
- Support added for superscript and subscript.
- Support added for fill patterns.
In theory, encodings other than UTF-8 were supported.
In fact, I was unable to get SecurityScanner to pass *any* xml which is
not UTF-8. Eliminating the assumption that strings might not be UTF-8
allowed much of the code to be greatly simplified.
After that, I added some code that would permit the use of
some ASCII-compatible encodings (there is a test of ISO-8859-1).
It would be more difficult to handle other encodings (such as UTF-16).
I am not convinced that even the ISO-8859 effort is worth it,
but am willing to investigate either expanding or eliminating
non-UTF8 support.
I added a number of tests, creating an Xml directory, and moving
XmlTest to that directory.
Pull Request had problems reading old invalid sample in the code
coverage phase, not in any of the other test phases, and not in
the code coverage phase on my local machine.
As it turns out, aside from being invalid, the sample
is much larger than any of the other samples. Tests have been
adjusted accordingly.
* Smaller Test File
Should eliminate need to avoid test during xml coverage.
* Break Up Style Test into Multiple Tests
Per suggestion from Mark Baker.
* Integrate AddressHelper Change
The introduction of AddressHelper introduced a conflict which needed to
be resolved. I wanted to test it locally before resolving. This required
me to add (unchanged) AddressHelper to my local copy. I hope this is
an okay manner of resolving the conflict.
* Weird Travis Error
XmlOddTest works just fine on my local machine, but Travis failed it.
Even worse, the lines which Travis flags don't even make any sense
(one was the empty line between two methods!).
This test is not essential to the rest of the change. I am removing
it from the package, and will attempt to re-add it when I have a chance
to sync up my fork with the main project.
* Initial work modifying the way named ranges are stored, and handled by the calculation engine
This should provide better support for:
- both union and intersection operators in composite named range values
- MS Excel implementation of the union operator duplicating values
- named formulae
- named ranges and formulae that reference other named ranges and formulae
- ranges and formulae that reference multiple ranges across multiple worksheets
* Initial work on handling defined names (named ranges and named formulae) correctly
- UTF-8 names (already extracted as a separate PR and merged)
- distinction between named ranges and named formulae
- correct handling of union and intersection operators in named ranges
- correct evaluation of named range operators in calculations
- calculation support for named formulae
- support for nested ranges and formulae (named ranges and formulae that reference other named ranges/formulae) in calculations
* Minor tweaks before resolving merge conflicts
* Fix extractSheetTitle() method to work on the last ! in a cell reference rather than the first
* Throw exception if a the reference to a defined name in a formula doesn't exist as a defined name
* Properly assess scope for defined names in calculation engine
* Elimination of some redundant code
* Minor tweaks to simplify entries o the stack where we need to check type
* Ensure correct scoping rules are applied when evaluating named ranges and formulae
* Adjustments to Gnumeric Reader for new defined names structure
* Initial work modifying the Ods Reader to handle named ranges, they weren't actually supported previously... this is still ongoing work
* Handle Ranges formatted as 3-d ranges, as long as the references are both to the same worksheet
* Additional testing for Named Ranges formatted as 3-d ranges, as long as the references are both to the same worksheet
* Skip composite named range tests for the moment
* Clean handling for `undefined name` exception when thrown in the calculation engine. Catch and replace with `#NAME?`
* Adjust method we use to determine whether a defined name is a range or a formula
* PHPCS Recommendations
* PHP doesn't support `mixed` yet, at least not at the minium version that we're working with
* More phpcs fixes
* More phpcs appeasements
* Final phpcs fixes for the moment
Still have a lot of echo and var_dump() statements in the code that scrutinizer will hate, but they stay for the moment while this is still WIP
* Please let this be the last of the phpcs fixes
* Unit tests to determine whether a defined name value is a range value or a formula
* phpcs appeasement
* Named tests from provider
* Initial steps for named ranges and formulae in the Ods Reader
* Reading pseudo-3d range addresses in Ods; treat second sheet reference as being identical to the first, which is the majority of cases where this will occur
* Initial work on Gnumeric reader for named ranges and formulae
* Suppress debug logging again
* Remove more debugging displays
* Last minor tweaks before phase two
* Minor refinements
* And all for the want of a space
* A little tidying up
* More tidying up
* phpcs fix
* Modify defined names in rebindParent()
* Renaming variables
* Resolve an issue with locally scoped defined names that don't contain any worksheet reference
* Keep phpcs happy
* Fix quote handling in regexp
* Fix a couple of scrutinizer issues
* Fix a couple of scrutinizer issues
* Update Xlsx Writer to work with the new defined name internal definition
Additional validation checks
* When adding new defined names through the readers, worksheet may not exist if we're only loading selected sheets rather than the full spreadsheet
* If the only thing that phpcs can pickup on is strings in double quotes instead of single quotes, then I know I'm getting close to ready
* Refactor Defined Names logic for Xlsx Writer into its own class
* phpcs keeping me on my toes
* Restore a couple of files that I managed to change without intending to
* Initial work on Ods Write to provide support for saving named ranges and formulae
* Resolve commas to semi-colons s argument separator when writing named formulae for Ods
* Extract Named Expression Writer for Ods into its own class
* Keep phpcs happy
* Refactoring of formula conversion when reading SpreadsheetML; preparation for reading named ranges because they will also need to use the same conversion method
* First pass at reading Named Ranges/Formulae from SpreadsheetML format xml files
* Remove unused namespace reference
* Defined names being written correctly for Xls; but not yet writing cell formulae that reference those defined names... that's the next big step
And I anticipate that defined names that reference other defined names will also be a problem
* Just to keep phpcs happy
... and yes, I know that there are still diagnostic echo statements in the code
* I had to miss some of the phpcs issues didn't I
* Work on the Xls Writer's Parser Tree to identify named range tokens in a formula, and to distinguish them from function tokens
* Still working on packing that d*** defined name reference in the writer
* Throw an exception in the Parser for saving Xls output if we encounter a defined name in a formula... writer will simply write the calculated cell value, and not the formula as at present
Strip out diagnostic output
* Some phpcs appeasement
* Fix a couple of Scrutinizer issues
* Additional verifications to differentiate a formula from a range value
Add explicit getters/setters for named ranges, named formulae and defined names
Additional unit tests
* Styling for closures
* Remove redundant docblocks
* Spaces
* Gah! Namespace use complaints
* Consistency of making calls to DefinedName rather than NamedRange; NamedRange should now be used only for Named Ranges, and should exclude Named Formulae
* Styling
* spurious newline
* No need to test for variable === null when we're typing it in the function argument definition
* Additional unit tests for local/global scoped named ranges and formulae; and a fix to getNamedFormula()
* Fix silly typo that led to breaking test
* Void return signature for unit tests
* Why weren't these picked up in the last pass?
* Refactoring of getNamedRange()/getNamedFormula()
* Eliminate unused constants, and defaults for private method parameters when always called with a value
* Use strict comparisons when comparing object hash codes
* Initial update to documentation for working with named formulae
* Fix for calculation of relative cell references in named ranges/formulae
* Fix current named range tests, because we should be using absolute references; tests for relative named ranges to be added later
* Fix for calculation of relative cell references in named ranges/formulae
* Updates to changelog and documentation for handling of absolute/relative references in named ranges
* Fix last remaining unit test with a named range reference
* Refactor formula conversion for Ods into a separate class; I hadn't realised that it previously wrote formulae as the MS Excel syntax without any conversion to Ods format
* Fix Ods Writer test xml to reflect Ods-native format for formula
* Docblocks
* Drop dollar prefix from Ods formulae and ranges unless it's necessary
* Set the formula convertor in the content writer constructor
* Documentation update
* Minor updates
* Remove var_dumps from file
* Fix the spurious single quote that was breaking named expressions in the Ods Writer... big sigh of relief that I finally spotted it
* Starting work on documentation for Defined Names, and some examples of using Named Ranges and Formulae
* Starting work on documentation for Defined Names, and some examples of using Named Ranges and Formulae
* Example of a relative named range for the documentation
* Mustn't have phpcs problems in sample code either
* More updates to the documentation
* That should conclude the documentation for Named Ranges, now time to move on to documenting Named Formulae
* That should conclude the documentation for Named Ranges, now time to move on to documenting Named Formulae
* PHPCS appeasement in sample code
* Initial documentation on Named Formulae
* PHPCS appeasements
* Additional comments in the documentation, and modify the named range name validation to support a \ as the first character in a name
* Fix breaking build
* Make defined names case-insensitive
* Fix case-insensitivity
* Improved documentation, and additional unit tests
* Additional unit tests, and a fix for removing a globally scoped defined name even if a worksheet is specified in the method call
* Fix unit test for removing named formulae
* Use assertCount instead of assertSame
* Forgotten voids
* Fix arguments for assertCount
* Unit tests for removing defined names, and a fix for removing locally scoped names
* Unit tests for absolute and relative named ranges in calculation engine, and fix an issue with worksheet name in the offset adjustments for relative references
* PHPCS Appeasement
* Additional unit tests, more documentation, and a fix to the calculation engine when no worksheet reference is provided with a named formula
* PHPCS appeasements
* Additional documentation and examples of using Named Formulae
* Additional examples to go with documentation
* A few minor phpcs appeasements
* Minor refactor of updateFormulaReferencesAnyWorksheet() method
* Discard an unused method argument
* Additional unit tests
* Additional unit tests
* Remove unused argument
* Stricter typing
* Fix return typehinting from remove named range/formula; should return the Spreadsheet object
* Use return typehint of self rather than explicit object type
* Redundant code just to keep scrutinizer happy
* Minor change to handle merge conflict
* phpcs fixes after merge
* Namespace usage ordering
* Please let this be the last phpcs fix needed
Co-authored-by: Adrien Crivelli <adrien.crivelli@gmail.com>
Reader/Html is now covered except for 1 statement.
There is some coverage of RichText when you know in advance that the
html will expand into a single cell.
It is a tougher nut, one that I have not yet cracked,
to try to handle rich text while converting unkown html to multiple cells.
The original author left this as a TODO, and so for now must I.
It made sense to restructure some of the code. There are some changes.
- Issue #1532 is fixed (links are now saved when using rowspan).
- Colors can now be specified as html color name. To accomplish this,
Helper/Html function colourNameLookup was changed from protected
to public, and changed to static.
- Superfluous empty lines were eliminated in a number of places, e.g.
<ul><li>A</li><li>B</li><li>C</li></ul>
had formerly caused a wrapped cell to be created with 2 empty lines
followed by A, B, and C on separate lines; it will now just have the
3 A/B/C lines, which seems like a more sensible interpretation.
- Img alt tag, which had been cast to float, is now used as a string.
Private member "encoding" is not used. Functions getEncoding and setEncoding
have therefore been marked deprecated. In fact, I was unable to get
SecurityScanner to pass *any* html which is not UTF-8. There are
possibly ways of getting around this (in Reader/Html - I have no
intention of messing with Security Scanner), as can be seen in my
companion pull request for Excel2003 Xml Reader. Doing this would be
easier for ASCII-compatible character sets (like ISO-8859-1),
than for non-compatible charsets (like UTF-16). I am not
convinced that the effort is worth it, but am willing to investigate
further.
I added a number of tests, creating an Html directory, and moving
HtmlTest to that directory.
* Xls Writer - Correct Timestamp Bug, Improve Coverage
I believe that Xls Writer is 100% covered now.
The Xls Writer sets its timestamp incorrectly. The problem is actually
in Shared/Ole::localDateToOLE, which converts its timestamp using
gmmktime; mktime is correct. If I save a file at 3:00 p.m. in San Francisco,
this bug means the time is actually recorded as 3:00 p.m. UTC.
A consequence of this is that if you use Phpspreadsheet to read the
file and save it as a new Xls, the creation timestamp goes further
and further back in time with each generation (or further forward
if east of Greenwich). One of the tests added confirms that
the creation timestamp is consistent with the start and end times
of the test.
The major change in coverage is adding tests to save GIF and BMP
images, which aren't supported in Xls, but are converted to PNG
in the PhpSpreadsheet code.
* Improve Coverage for Sylk
I believe that both BaseReader and Sylk Reader are now 100% covered.
Documentation available for this format is sparse.
It was always incomplete, and in some cases inaccurate.
My goal was to use PhpSpreadsheet to load the test file,
save it as Xlsx, and visually compare the two, then add a test
loaded with assertions. Cell values and calculated values,
and border styles were generally handled pretty well without changes.
Other types of styling were not handled so well. I added a few cells
to exercise some previously uncovered code.
Sylk files must be ASCII. I have deprecated the use of the
setEncoding and getEncoding functions, which had no test cases.
* Improve Coverage for Gnumeric
I believe that both BaseReader and Gnumeric Reader are now 100% covered.
My goal was to use PhpSpreadsheet to load the test file,
save it as Xlsx, and visually compare the two, then add a test
loaded with assertions. Results were generally pretty good,
but there were no tests with assertions. I added a few cells
to exercise some previously uncovered code. Code was extensively
refactored; logic changes are noted below.
Code allowed for specifying document properties in an old format.
I considered removing that, but I found the original spec at
http://www.jfree.org/jworkbook/download/gnumeric-xml.pdf
This allowed me to create an old file, which was not handled
correctly because of namespace differences. The code was corrected
to allow for this difference.
Added support for textRotation.
Mapping of fill types was not correct.
* PHP7.2 Error
One assertion failed under PHP7.2. Apparently there was some change in
the handling of SimpleXMLElement between 7.2 and 7.3. Casting to string
before use eliminates the problem.
* Scrutinizer Recommendations
All minor, solved (hopefully) mostly by casts.
* One Last Scrutinizer Fix
... I hope.
Replace default gridlines with different style. Usable in PDF
as well as HTML.
Documentation mentioned use of setUseBOM with Html, but that method
does not exist, and there is no real reason to support it.
Removed it from documentation.
We give users the ability to edit Html/Pdf, but it's a little cumbersome
to use the edited Html for an Html file, and difficult to use it
for a Pdf. I believe we could make it fairly painless in both cases
by allowing the user to set a callback to edit the generated Html.
This can be accomplished with fewer than a dozen lines of very simple code.
I think this would be easier than grabbing the Html in pieces,
editing it, and reassembling it. I think it would also be simpler
than an alternative I considered, namely the addition of a new method
(e.g. saveEditedHtml) to each of the Html and Pdf writers.
One edit that users might like to make when editing html is to add
fallback fonts, something that is not currently available in
PhpSpreadsheet, and might be difficult to add. A natural extension to
that idea would be the use of webfonts, something which is guaranteed
difficult to add. See samples/Basic/17b_Html for an example of this.
None of the PDF writers support webfonts yet. That doesn't mean they
won't do so in future, but, for now, samples/Pdf/21a_Pdf is a prosaic
example of something you could do with this callback. In fact, this
opens the door to letting the user replace the entire body with data
of their choosing, effectively allowing PhpSpreadsheet (where you can
set things like paper size and orientation) to be used as a front-end to
the Pdf processor without the user having to be be overly familiar with
the vagaries of the PDF processor. I think this is actually a pretty
nice idea. YMMV. See samples/Basic/21b_Pdf for an example.
No code changes. The tests in all of these scripts write to at least
one temporary file, which is then read and not used again. The file
should be deleted to avoid filling up the disk system.
There are a number of situations where HTML write was producing
HTML which could not be validated. These include:
- inconsistent use of backslash terminating META, IMG, and COL tags
- @page style tags in body rather than header. Aside from being
non-standard, HTML Reader treats those as spreadsheet data.
- <div style="page-break-before:always" />, a construct which is
usually better handled through css anyhow.
- no alt tag for images (drawings and charts)
Other problems:
- Windows file names not handled correctly for images
- Memory drawings not handled in extendRowsForChartsAndImages
- No handling of different values for showing gridlines
for screen and print
- Mpdf and Dompdf do not require the use of inline css.
Tcpdf remains a holdout in the use of this inferior approach.
- no need to chunk base64 encoding of embedded images
- support for colors in number format was buggy (html tags
run through htmlspecialchars)
Code has been refactored when practical to reduce the number of
very large functions.
Coverage is now 100% for the entire HTML Writer module,
from 75% lines and 39% methods beforehand.
All functions dealing only with charts
are bypassed for coverage because the version of Jpgraph available in
Composer is not suitable for PHP7. The code will, nevertheless,
run successfully, but with warning messages. I have confirmed that
the code is entirely covered, without warnings, when the current
version of Jpgraph is used in lieu of the one available in Composer.
I will be glad to revisit this when the Jpgraph problem is resolved.
Directory PhpSpreadsheetTests/Writer/Html was created to house
the new tests. It seemed logical to move HtmlCommentsTest to
the new directory from PhpSpreadsheetTests/Functional.
A function to generate all the HTML is useful, especially for testing,
but also in lieu of the multiple other generate* functions. I have
added and documented generateHTMLAll.
The documentation for the generate* functions (a) produces invalid html,
(b) produces html which cannot be handled correctly by HTML reader,
and (c) even if those were correct, does not actually affect
the display of the spreadsheet. The documentation has been replaced
by a valid, and more instructive, example.
The (undocumented) useEmbeddedCss property, and the functions
to test and set it are no longer needed. Rather than breaking
existing code by deleting them, I marked the functions deprecated.
This change borrows a change to LocaleFloatsTest from
pull request 1456, submitted a little over a week before this one.
## Improve NumberFormat Support
First phase of this change included correcting NumberFormat handling
in HTML Writer. Certain complex formats could not be handled without
changes to Style/NumberFormat, and I did not wish to combine those changes.
Once the original change had been pushed, I took this part of it back up.
HTML Writer can now handle conditions in formats like:
[Blue][>=3000.5]$#,##0.00;[Red][<0]$#,##0.00;$#,##0.00
In testing, I discovered several errors and omissions
in handling of some other formats.
These are now corrected, and tests added.
All chart examples passed the displayBlanksAs parameter as 0 instead of 'gap'.
I added a constants EMPTY_AS_GAP, EMPTY_AS_ZERO and EMPTY_AS_SPAN to the
DataSeries and then change all chart samples to use this new constant.
Fixes#1337Closes#1448
I believe that both CSV Reader and Writer are 100% covered now.
There were some errors uncovered during development.
The reader specifically permits encodings other than UTF-8 to be used.
However, fgetcsv will not properly handle other encodings.
I tried replacing it with fgets/iconv/strgetcsv, but that could not
handle line breaks within a cell, even for UTF-8.
This is, I'm sure, a very rare use case.
I eventually handled it by using php://memory to hold the translated
file contents for non-UTF8. There were no tests for this situation,
and now there are (probably too many).
"Contiguous" read was not handle correctly. There is a file
in samples which uses it. It was designed to read a large sheet,
and split it into three. The first sheet was corrrect, but the
second and third were almost entirely empty. This has been corrected,
and the sample code was adapted into a formal test with assertions
to confirm that it works as designed.
I made a minor documentation change. Unlike HTML, where you never
need a BOM because you can declare the encoding in the file,
a CSV with non-ASCII characters must explicitly include a BOM
for Excel to handle it correctly. This was explained in the Reading CSV
section, but was glossed over in the Writing CSV section, which I
have updated.
* Document calculation caching; and how to disable it and how to flush the cache
* Quoted text for string values beginning with `=`, so that they are still treated as strings and not as formulae
* Warning about assigning cells to variables
* Further warning about assigning cells to variables
* getCell() with a second argument
* Added String Value Binder, and a Reader example demonstrating how to use it
* Ensure value is a string before binding
* Sample file for String Value Binder
* PHPCS moaning about order or use statements
* Order of annotations, that PHPStorm determined, isn't what phpcs says it should be
Column indexes are always based on 1 everywhere in PhpSpreadsheet.
This is consistent with rows starting at 1, as well as Excel
function `COLUMN()`. It should also make it easier to reason about
columns and rows and remove any doubts whether a specific method is
expecting 0 based or 1 based indexes.
Fixes#273
Fixes https://github.com/PHPOffice/PHPExcel/issues/307
Fixes https://github.com/PHPOffice/PHPExcel/issues/476
We used to have some kind of wrapper that didn't do much except
forward methods to the real instance. That unnecessary complexity
made it harder to work with the real writer instance.
Array keys used for styling have been standardized for a more coherent experience.
It now uses the same wording and casing as the getter and setter methods.
Closes#189
This drop a lot of non-core features code and delegate their maintainance
to third parties. Also it open the door to any missing implementation
out of the box, such as Redis for the moment.
Finally this consistently enforce a constraint where there can be one and
only one active cell at any point in time in code. This used to be true for
non-default implementation of cache, but it was not true for default
implementation where all cells were kept in-memory and thus were never
detached from their worksheet and thus were all kept functionnal at any
point in time.
This inconsistency of behavior between in-memory and off-memory could
lead to bugs when changing cache system if the end-user code was badly
written. Now end-user will never be able to write buggy code in the first
place, avoiding future headache when introducing caching.
Closes#3
Over the years PclZip became a maintenance problem. It's code base
is old and the official project seems to have died out. ZipArchive
is now more commonly available and offer an easier and more complete
API. So PclZip was dropped in order to focus our efforts on what matters.
Closes#18
Third party PDF libraries must now be installed via composer and naturally
via composer autoloading mechanism. Because of that it is not necessary
to specify their path on disk. The usage is simplified and it allows us
to include them in our unit tests.
This also means that from now on PhpSpreadsheet must use composer autoloader
mechanism. The internal autoloading implementation was dropped.