Commit Graph

88 Commits

Author SHA1 Message Date
MarkBaker f3d5028518 Work on setting up locale-aware formatted number conversion for the Csv Reader
Unit tests for locale-aware boolean conversion for Csv Reader
2022-03-02 08:53:29 +01:00
MarkBaker 444d0fd77c Unit tests for merge ranges for Ods Reader/Writer 2022-02-26 22:51:15 +01:00
oleibman 9cf526a920
Reading Xlsx With Supplied Palette (#2595)
Fix #2499, which see for details of an obscure problem affecting both PhpSpreadsheet and Excel. Add support for palette contained in workbook styles. This seems to be a very rare occurrence, so allow it only when the palette contains exactly 64 entries. If there are other possibilities, we'll presumably have a new workbook to guide us how to handle them. Also add some tests for specification of indexed color without palette, another rarity (no in-range examples amongst our current files). Also change one private static array, initialized once at run-time and never changed, to a constant.
2022-02-23 22:09:22 -08:00
Thorsten Ho 0cb60a5098
Fix XLSX broken vertical align font style (#2619)
* Fix XLSX broken vertical align font style

* Add fix information to changelog

* Fix phpcs issues
2022-02-23 20:23:59 -08:00
oleibman 5bf0656e92
Xlsx Reader Warning When No sz Tag for RichText (#2550)
Fix #2542. Xlsx Reader is expecting a `sz` tag when reading RichText, but it is not required, and PhpSpreadsheet issues a warning message when it is missing.
2022-02-12 06:43:29 -08:00
oleibman ad5532e2f4
Namespacing Phase 2 - Styles (#2471)
* WIP Namespacing Phase 2 - Styles

This is part 2 of a several-phase process to permit PhpSpreadsheet to handle input Xlsx files which use unexpected namespacing. The first phase, introduced as part of release 1.19.0, essentially handled the reading of data. This phase handles the reading of styles. More phases are planned.

It is my intention to leave this in draft status for at least a month. This will give time for additional testing, by me and, I hope, others who might be interested.

This fixes the same problem addressed by PR #2458, if it reaches mergeable status before I am ready to take this out of draft status. I do not anticipate any difficult merge conflicts if the other change is merged first.

This change is more difficult than I'd hoped. I can't get xpath to work properly with the namespaced style file, even though I don't have difficulties with others. Normally we expect:
```xml
<stylesheet xmlns="http://whatever" ...
```
In the namespaced files, we typically see:
```xml
<x:stylesheet xmlns:x="http://whatever" ...
```

Simplexml_load_file specifying a namespace handles the two situations the same, as expected. But, for some reason that I cannot figure out, there are significant differences when xpath processes the result. However, I can manipulate the xml if necessary; I'm not proud of doing that, and will gladly accept any suggestions. In the meantime, it seems to work.

My major non-standard unit test file had disabled any style-related tests when phase 1 was installed. These are now all enabled.

* Scrutinizer

Its analysis is wrong, but the "errors" it pointed out are easy to fix.

* Eliminate XML Source Manipulation

Original solution required XML manipulation to overcome what appears to be an xpath problem. This version replaces xpath with iteration, eliminating the need to manipulate the XML.

* Handle Some Edge Cases

For example, Style file without a Fills section.

* Restore RGB/ARGB Interchangeability

Fix #2494. Apparently EPPlus outputs fill colors as `<fgColor rgb="BFBFBF">` while most output fill colors as `<fgColor rgb="FFBFBFBF">`. EPPlus actually makes more sense. Regardless, validating length of rgb/argb is a recent development for PhpSpreadsheet, under the assumption that an incorrect length is a user error. This development invalidates that assumption, so restore the previous behavior.

In addition, a comment in Colors.php says that the supplied color is "the ARGB value for the colour, or named colour". However, although named colors are accepted, nothing sensible is done with them - they are passed unchanged to the ARGB value, where Excel treats them as black. The routine should either reject the named color, or convert it to the appropriate ARGB value. This change implements the latter.
2022-02-11 06:42:04 -08:00
mix5003 e7b0497237
fix warning when open xlsx file with thumbnail (#2517) 2022-01-24 14:17:53 -08:00
oleibman b6bd822b9c
Xlsx Reader Merge Range For Entire Column(s) or Row(s) (#2504)
* Xlsx Reader Merge Range For Entire Column(s) or Row(s)

Fix #2501. Merge range can be supplied as entire rows or columns, e.g. `1:1` or `A:C`. PhpSpreadsheet is expecting a row and a column to be specified for both parts of the range, and fails when the unexpected format shows up.

The code to clear cells within the merge range is very inefficient in terms of both memory and time, especially when the range is large (e.g. for an entire row or column). More efficient code is substituted. It is possible that we can get even more efficient by deleting the cleared cells rather than setting them to null. However, that needs more research, and there is no reason to delay this fix while I am researching.

When Xlsx Writer encounters a null cell, it writes it to the output file. For cell merges (especially involving whole rows or columns), this results in a lot of useless output. It is changed to skip the output of null cells when (a) the cell style matches its row's style, or (b) the row style is not specified and the cell style matches its column's style.

* Scrutinizer

See if these changes appease it.

* Improved CellIterators

Finally figured out how to improve efficiency here, meaning that there is no longer a reason to change Writer/Xlsx, so restore that.

* No Change for CellIterator

I had thought a change was needed for CellIterator, but it isn't.
2022-01-23 10:44:09 -08:00
Igor dbaafba6c6
Fix loading drawing size (#2492) 2022-01-16 21:59:31 -08:00
oleibman 06ea9ead2b
Xlsx Reader Cell DataType Numeric or Boolean Without Value (#2489)
Fix #2488. When Excel sees this situation, it leaves the value of the cell as null rather than casting to the specified DataType. It doesn't really make sense to change setValueExplicit to adopt this convention; it should be sufficient to recognize the situation in the Reader and act there. The same sort of situation might apply to strings, but I don't see any practical difference between null string and null even if so.
2022-01-16 21:19:09 -08:00
oleibman 8ab834520d
Handle Explicit "Date" Type for Cell (#2485)
Fix #2373. Excel can handle DateTime/Date/Time as a string if the datatype of the cell is set to "d". The string is, apparently, supposed to follow the ISO8601 spec. Openpyxl can be configured to generate a file with such values, so I've added support and set up unit tests. Excel, naturally, converts such a string input into its numeric representation of the date/time stamp. So will PhpSpreadsheet, so a call to setValueExplicit specifying Date format will actually see the cell wind up with Numeric format - there is no way (and no reason) for the Date type to 'stick'.
2022-01-13 18:40:18 -08:00
oleibman f24dcc7911
Another Undefined Index in Xls Reader (#2470)
Fix #2463. These continue to dribble in regularly.
2021-12-31 13:43:59 -08:00
oleibman 3a6558625d
General Style Specified in Uppercase in Input Xlsx (#2451)
* General Style Specified in Uppercase in Input Xlsx

Fix #2450. Treat input style GENERAL as if it were expected upper/lowercase.

* Declare Method as Static

Surprised neither Phpstan nor Scrutinizer flagged this.

* Remove Duplicated Statement

Don't know why Scrutinizer didn't flag this the first time.
2021-12-18 09:25:08 -08:00
oleibman d5825a6682
Read Spreadsheet with # in Name (#2409)
Fix #2405. Treat last, rather than first, `#` as separator between zip file name and member name, by finding it with strrpos rather than strpos.
2021-11-30 07:39:50 -08:00
oleibman 290c18e4db
Xlsx Reader Theme Support Broken After 17.1 (#2403)
Fix #2387. Fix #2075. There was substantial refactoring of Writer Xlsx styles in 18.0. An existing static property `$theme` was intended to be shared by both Writer Xlsx and the new Writer Xlsx Styles. However, the initialization of the property in the latter happened later than it should have. This PR makes that initialization happen as soon as the theme has been read. Also, declaring that property as static seems questionable; I have made it an instance member. This small re-factoring makes it possible to now support Themes in tab colors.

Since this PR changes Reader/Xlsx/Styles, add type-hinting throughout that module to eliminate Phpstan/Scrutinizer problems. I also removed method readStyle from Reader/Xlsx, since it was essentially duplicated in Reader/Xlsx/Styles. And I added a small number of tests to ensure that Styles is 100% covered. All of this is necessary in preparation for Namespacing phase 2.
2021-11-26 09:38:09 -08:00
oleibman f831f48b71
ZipArchive and "Inconsistent" Zip File (#2376)
* ZipArchive and "Inconsistent" Zip File

Fix #2362. I added test for zip file inconsistency when dealing with a particularly nasty PHP/libzip bug affecting zero-length files. However, we also now verify that the file starts with a valid zip signature, so the consistency test is not really needed, and, from what I've read on the web, isn't particularly useful. The file with a problem, for example, opens just fine with Excel and zip, despite Php reporting it as inconsistent (when asked to check consistency). So, remove the consistency check.

* Update Issue2362Test.php

Latest Phpstan does not allow cast from 'mixed' to 'string'.

* Update Issue2362Test.php
2021-11-12 01:18:57 -08:00
oleibman 2f1f3a19b8
Csv, Boolean, and StringValueBinder (#2374)
See the discussion in PR #2232 which came about 3 months after it was merged. It caused a problem in an unusual situation which did not come to light until the change was part of the new release version. The original PR changed PhpSpreadsheet's behavior to match Excel's for (not case sensitive) strings `TRUE` and `FALSE`. Excel treats the values as boolean, and now so does PhpSpreadsheet.

When StringValueBinder is used, this becomes a tricky situation. The user wants the original strings preserved, including the case of all the letters. This PR changes the behavior of CSV reader as follows:
- If StringValueBinder is not in effect, convert to boolean.
- If StringValueBinder (actually any binder with method getBooleanConversion) is in effect, and the result of getBooleanConversion is true (which is the default in StringValueBinder), leave the value coming out of Csv Reader as the unchanged string.
- Otherwise, convert to boolean.

This should mean that there are no regression problems with StringValueBinder, while allowing PhpSpreadsheet to continue to match Excel in the default situation. No new settings are required.
2021-11-12 00:04:08 -08:00
oleibman 241e82b284
Unexpected Format in Timestamp (#2332)
See issue #2331. Timestamp is expected in format yyyy-mm-dd (plus other information), with the expectation that month and day are 2 digits zero-filled on the left if needed. The user's file instead used a space rather than zero as filler. Although I don't know how the unexpected timestamp was created, it was easy enough to alter the timestamp in an otherwise normal spreadsheet, and use that file as a test case.
2021-10-23 18:23:53 -07:00
oleibman 9512f54cca
Corrections for Xlsx Read Comments (#2329)
This change was suggested by issue #2316. There was a problem reading Xlsx comments which appeared with release 18.0 but which was already fixed in master. So no source change was needed to fix the issue, but I thought we should at least add the test case to our unit tests.

In developing that case, I discovered that, although comment text was read correctly, there was a problem with comment author. In fact, there were two problems. One was new, with the namespacing changes - as in several other cases, the namespaced attribute `authorId` needed some special handling. However, another problem was much older - the code was checking `!empty($comment['authorId'])`, eliminating consideration of authorId=0, and should instead have been checking `isset`. Both problems are now fixed, and tested.
2021-10-23 17:32:44 -07:00
Georgiy fea0e34e90
fix #1114 issue (#2308)
* fix #1114 issues

* fixed code style

* update for all version

* add test for bag 1114

* remove comment

Co-authored-by: georgio <georgiokot@gmail.cim>
2021-10-03 09:29:23 -07:00
oleibman 90b9decb8e
Xlsx Reader Better Namespace Handling Phase 1 Second Bugfix (#2303)
* Xlsx Reader Better Namespace Handling Phase 1 Second Bugfix

See issue #2301. The main problem in that issue had been introduced with 18.0 and had already ben fixed in master. However there was a subsequent problem that had been introduced in master, an undotted i uncrossed t with namespace handling. When using namespaces, need to call attributes() to access the attributes before trying to access them directly. Failure to do so in parseRichText caused fonts declared in Rich Text elements to be ignored.

* Add An Assertion

Addresses problem in 2301 that had already been fixed.
2021-09-27 16:59:45 -07:00
oleibman bc9234e5a5
Process Comments in Sylk File (#2277)
Fixes issue #2276.
2021-08-26 11:56:13 -07:00
oleibman de5f450856
Data Validations Referencing Another Sheet (#2265)
See issues #1432 and #2149. Data validations on an Xlsx worksheet can be specified in two manners - one (henceforth "internal") if a list is specified from the same sheet, and a different one (henceforth "external") if a list is specified from a different sheet. Xlsx worksheet reader formerly processed only the internal format; PR #2150 fixed this so that both would be processed correctly on read. However, Xlsx worksheet writer outputs data validators only in the internal format, and that does not work for external data validations; it appears, however, that internal data validations can be specified in external format.

This PR changes Xlsx worksheet writer to use only the external format. Somewhat surprisingly, this must come after most of the other XML tags that constitute a worksheet. It shares this characteristic (and XML tag) with conditional formatting. The new test case DataValidator2Test includes a worksheet which has both internal and external data validation, as well as conditional formatting.

There is some additional namespacing work supporting Data Validations that needs to happen on Xlsx reader. Since that is substantially unchanged with this PR, that work will happen in a future namespacing phase, probably phase 2. However, there are some non-namespace-related changes to Xlsx reader in this PR:
- Cell DataValidation adds support for a new property sqref, which is initialized through Xlsx reader using a setSqref method. If not initialized at write time, the code will work as it did before the introduction of this property. In particular, before this change, data validation applied to an entire column (as in the sample spreadsheet) would be applied only through the last populated row. In addition, this also allows a user to extend a Data Validation over a range of cells rather than just a single cell; the new method is added to the documentation.
- The topLeft property had formerly been used only for worksheets which use "freeze panes". However, as luck would have it, the sample dataset provided to demonstrate the Data Validations problem uses topLeft without freeze panes, slightly affecting the view when the spreadsheet is initially opened; PhpSpreadsheet will now do so as well.

It is worth noting issue #2262, which documents a problem with the hasValidValue method involving the calculation engine. That problem existed before this PR, and I do not yet have a handle on how it might be fixed.
2021-08-24 08:58:38 -07:00
Alayn Gortazar d0076343c4
Fix Reading XLSX files without styles.xml throws an exception. (#2247)
* Fix Reading XLSX files without styles.xml throws an exception.

* Bugfix, debugging code removed

* Fix Reading XLSX files without styles.xml throws an exception (rethinked)

* Fix Reading XLSX files without styles.xml throws an exception (rethinked)

* Style fixes

* Fix Spreadsheet loaded without styles cannot be written

* Replaced test files for empty styles.xml testing
2021-08-16 05:05:32 -07:00
oleibman d7ac7021c6
Apache OpenOffice Creates Xls Using Wrong Case for Number Format General (#2242)
See issue #2239. Problem is dealt with at the source, by making sure that Reader Xls checks for use of 'GENERAL' rather than 'General'. There doesn't seem to be a reason to test in other places, or to test for other casing variants.
2021-08-08 08:24:03 -07:00
James Lucas a9533b77ec Add unit tests for files with true/false (LibreOffice) in DataValidation boolean values and those with 1/0 (Excel, GoogleSheets) 2021-07-21 05:53:49 -07:00
oleibman 8729a68338
Xls Reader Handle MACCENTRALEUROPE With or Without Hyphen (#2213)
* Xls Reader Handle MACCENTRALEUROPE With or Without Hyphen

Fixes issue #549 and https://github.com/Maatwebsite/Laravel-Excel/issues/989 (which is the source of the new test file). Some systems accept MACCENTRALEUROPE as the name for the appropriate encoding, and some accept MAC-CENTRALEUROPE. I fortunately have access to at least one of each type, and have run the tests on each.

CodePage.php has an array of translations from codepage number to string. I now allow the value to itself be an array; if so, the code will test each in turn to see if it can be used in iconv. I did not go fishing for other similar problems. If such show up, they can be dealt with in the same manner as this one. I don't really expect others, since this is a problem not merely for Xls, but, even then, it applies only to BIFF5 and earlier.

I also moved XlsTest from Reader to Reader/Xls.

* Cache Successful Result For Future Use

Per suggestion from @MarkBaker
2021-07-12 03:02:47 +02:00
oleibman cd84020693
Xlsx Reader Better Namespace Handling Phase 1 Try2 (#2173)
* Xlsx Reader Better Namespace Handling Phase 1 Try2

This is a replacement for #2088, which has run into merge conflicts. I will close that PR in the near future, however the comments in that PR may prove useful for this one. While that PR has been in draft status all along, I am marking this one as ready. I will gladly add additional tests (and, of course, make code changes) that anyone has to suggest, but, with my most recent test files which I will describe in a separate comment, I have no further ideas on useful additions.

As mentioned in the earlier ticket, this is a risky change. But, as has been demonstrated, delaying it comes with its own set of risks. It would be helpful to have a temporary moratorium on changes to Reader/Xlsx until this change is merged.

The original commit message follows.

There have been a number of issues concerning the handling of legitimate but unexpected namespace prefixes in Xlsx spreadsheets created by software other than Excel and PhpSpreadsheet/PhpExcel.I have studied them, but, till now, have not had a good idea on how to act on them. A recent comment https://github.com/PHPOffice/PhpSpreadsheet/issues/860#issuecomment-824926224 in issue #860 by @IMSoP has triggered an idea about how to proceed.

Gnumeric Reader was recently changed to handle namespaces better. Using that as a model, this PR begins the process of doing the same for Xlsx. Xlsx is much larger and more complicated than Gnumeric, hence the need to tackle it in multiple phases. I believe that this PR handles all of:
- listWorkSheetNames
- listWorkSheetInfo. Note that there was a bug in this function which would cause it to count only used columns rather than all columns. That bug is corrected.
- active sheet
- selected cell and top left cell
- cell content (formulas, numbers, text)
- hyperlinks
- comments (partial - see below)

This PR does not address:
- styles
- images and charts
- VBA and ribbons
- many other items, I'm sure

The issue for non-standard namespacing till now has been the use of unexpected prefixes. While I was working on this change, @Lambik introduced issue #2067 PR #2068 which introduced a completely different problem - the use of unexpected URLs. That PR and the issue associated with it were quite well documented, including the supplying of a test file and tests for it. I asked if I could take a look to see if it could be integrated with my change, and the result seems to be yes, so those changes are also part of this PR.

While adding a comment to my test file, I discovered that Microsoft had added "threaded comments" as a new feature. I believe these are not yet supported by PhpSpreadsheet, and I am not going to add it, at least not now. I believe that, among other things, this will make identifying the author of a comment more difficult.

Although there are a number of Phpstan baseline changes as part of this PR, I did not attempt to resolve all Phpstan reports for Reader/Xlsx. Nor did I do anything to increase coverage. This change is already large and complex enough without those efforts.
2021-06-25 09:05:49 +02:00
jarrett jordaan 795992835f
When image source is a URL, store the URL for use during extraction. (#2072)
When image source is a link store the link.
Add url mutator.

Update section in documentation on image extraction.
2021-06-24 10:50:44 +02:00
Olivier TARGET 803737a893
Fix for #2149 / Read data validations for drop down list in another sheet. (#2150)
* Read data validations for drop down list in another sheet.

* Add function testLoadXlsxDataValidationOfAnotherSheet() in class tests/PhpSpreadsheetTests/Reader/XlsxTest.php for unit test.

* Add sample xlsx for unit tests.

* Modifiy call function isset() for warnings.

* Additional assertions to ensure that the worksheet has been read correctly for DataValidation that references a list on a different worksheet

* This should resolve the phpstan issues

Co-authored-by: Mark Baker <mark@lange.demon.co.uk>
2021-06-15 13:28:10 +02:00
Mark Baker 74b02fb31c
Fix for the BIFF-8 Xls colour mappings in the Reader (#2156)
* Fix for the BIFF-8 Xls colour mappings in the Reader
* Unit test for reading colours, writing hen rereading and ensuring that the RGB values have not changed
2021-06-13 21:46:49 +02:00
oleibman d5492ac8ed
Merge branch 'master' into #984 2021-05-11 14:44:33 -07:00
Mark Baker 5ee4fbf090
Implement basic autofilter ranges with Gnumeric Reader (#2057)
* Load basic autofilter ranges with Gnumeric Reader
* Handle null values passed to row height/column with/merged cells/autofilters
2021-05-04 22:32:12 +02:00
Mark Baker fd14da1675
Ods defined names unit tests (#2054)
* Defined names/formulae in ODS are prefixed by $$ when used in a formula; so we need to strip this out to fully convert them to an Excel formula

* Test for ODS Writer for DefinedNames
2021-05-03 08:39:42 +02:00
xandros15 a757692992 #984 add support notContainsText for conditional styles in xlsx reader 2021-05-02 22:09:38 +02:00
Mark Baker 83e55cffcc
First steps in the implementation of AutoFilters for ODS Reader and Writer (#2053)
* First steps in the implementation of AutoFilters for ODS Reader and Writer, starting with reading a basic AutoFilter range (ignoring row visibility, filter types and active filters for the moment).

And also some additional refactoring to extract the DefinedNames Reader into its own dedicated class as a part of overall code improvement... on the principle of "when working on a class, always try to leave the library codebase in a better state than you found it"

* Provide a basic Ods Writer implementation for AutoFilters
* AutoFilter Reader Test
* AutoFilter Writer Test
* Update Change Log
2021-05-02 22:00:48 +02:00
Mark Baker d555b5d312
Pattern Fill style should default to 'solid' if there is a pattern fill with colour but no style (#2050)
* Pattern Fill style should default to 'solid' if there is a pattern fill style for a conditional; though may need to check if there are defined fg/bg colours as well; and only set a fill style if there are defined colurs
2021-04-30 20:05:45 +02:00
oleibman aeccdb35e2
XLSX Reader and Empty Fill Tag (#2011)
Openpyxl can generate the xml tag `<patternFill/>`, possibly even as a default style. Excel has no problem with this, treating it as "fill none", but PhpSpreadsheet has a glitch because it treats it as "fill solid white". So, when PhpSpreadsheet loads and saves such a file, the result at first appears as if gridlines are disabled; in fact, the gridlines are merely invisible behind the cells with their solid white fill. This PR makes PhpSpreadsheet behave the same as Excel in this circumstance.

Co-authored-by: Mark Baker <mark@lange.demon.co.uk>
2021-04-20 17:20:59 +02:00
Mark Baker 6282035c96
Extract Property and Style readers from the XML Reader into separate classes (#2009)
* Extract Property and Style readers from the XML Reader into separate classes
2021-04-20 15:27:44 +02:00
Tiago Fernandes 11f8a02194
Merge branch 'master' into issue-907-absolute-path-in-Target 2021-04-19 14:32:20 +01:00
Tiago Fernandes 559c0761df Remove unnecessary changes. Added test 2021-04-19 11:25:48 +01:00
Darren Maczka c82ff2526c
Fix/chart axis titles (#1760)
* use axPos value to determine whether an axis title is mapped to the XaxisLabel or YaxisLabel

* update changelog

* Fix php-cs-fixer violations

Co-authored-by: Darren Maczka <dkm@utk.edu>
Co-authored-by: Mark Baker <mark@lange.demon.co.uk>
2021-01-31 19:13:50 +01:00
Darren Maczka 44248cd04e
Fix/sheets xlsx chart (#1761)
* Add support for Google Sheets Exported XLSX Charts

Google Sheets XLSX charts use oneCellAnchor positioning and the data series
do not have the *Cache elements with cached values.

* update CHANGELOG

* Add support for Google Sheets Exported XLSX Charts

Google Sheets XLSX charts use oneCellAnchor positioning and the data series
do not have the *Cache elements with cached values. Because the reader had been
assuming *Cache elements existed as children of strRef and numRef, errors about
the node being deleted were thrown when reading Xlsx exported from Google Sheets.

Co-authored-by: Darren Maczka <dkm@utk.edu>
2021-01-31 18:53:54 +01:00
もりもと たかひろ 8d2d78334f
Support DataBar of conditional formatting rule (#1754)
Implemented the databar of Conditional Type for XLSX Files.
- DataBar can be read, written, and added for basic use.
- Supports reading, writing and adding using "extLst".

About "extLst"
- https://docs.microsoft.com/en-us/openspecs/office_standards/ms-xlsx/07d607af-5618-4ca2-b683-6a78dc0d9627

The following setting items on the Excel setting screen can be read, written, and added.
- (minimum, maximum)type: Automatic, LowestValue, Number, Percent, Formula, Percentile
- Direction: context, leftToRight, rightToLeft (show data bar only)
- Fills Solid, Gradient
- FillColor: PositiveValues, NegativeValues
- Borders: Solid, None
- BorderColor: PositiveValues, NegativeValues
- Axis position: Automatic, Midpoint, None
- Axis color
2021-01-29 16:57:40 +01:00
oleibman a66233b72f
Fix For #1772 Null Exception on ODS Read (#1776)
Fix for #1772.
Header and Footer Properties may be omitted in Page Setting Style Set.
Code changed to allow for this possibility, and tests added.
2021-01-28 12:42:41 +01:00
oleibman e768cb0f19
CSV - Guess Encoding, Handle Null-string Escape (#1717)
* CSV - Guess Encoding, Handle Null-string Escape

This is in response to issue #1647 (detect CSV character encoding).
First, my tests with mb_detect_encoding indicate that it doesn't work
well enough; regardless, users can always do that on their own
if they deem it useful.
Rolling my own is also troublesome, but I can at least:
a. Check for BOM (UTF-8, UTF-16BE, UTF-16LE, UTF-32BE, UTF-32LE).
b. Do some heuristic tests for each of the above encodings.
c. Fallback to a user-specified encoding (default CP1252)
  if a and b don't yield result.
I think this is probably useful enough to include, and relatively
easy to expand if other potential encodings should be considered.

Starting with PHP7.4, fgetcsv allows specification of null string as
escape character in fgetcsv. This is a much better choice than the PHP
(and PhpSpreadsheet) default of backslash in that it handles the file
in the same manner as Excel does. There is one statement in Reader/CSV
which would be adversely affected if the caller so specified (building
a regular expression under the assumption that escape character is
a single character). Fix that statement appropriately and add tests.
2020-12-25 17:47:29 +01:00
Gianluca Giovinazzo 51cb21297d
Fix for bug #1592 (UPDATED) (#1623)
* Fix for Xls when BIFF8 SST (FCh) has bad Shared string length
2020-12-17 19:41:07 +01:00
oleibman 497a934374
Fix for 3 Issues Involving ReadXlsx and NamedRange (#1742)
* Fix for 3 Issues Involving ReadXlsx and NamedRange

Issues #1686 and #1723, which provide sample spreadsheets, are probably
solved by this ticket. Issue #1730 is also probably solved, but I have
no way to verify.

There are two problems with how PhpSpreadsheet is handling things now.
Although the first problem is much less severe, and isn't really a factor
in the issues named above, it is helpful to get it out of the way first.
If you define a named range in Excel, and then delete the sheet where
the range exists, Excel saves the range as #REF!. If there is a cell which
references the range, it will similarly have the value #REF! when you open
the Excel file.
Currently, PhpSpreadsheet discards the #REF! definition, so a cell which
references the range will appear as #NAME? rather than #REF!.
This PR changes the behavior so that PhpSpreadsheet retains the #REF!
definition, and cells which reference it will appear as #REF!.

The second problem is the more severe, and is, I believe, responsible
for the 3 issues identified above.
If you define a named range and the sheet on which the range is defined
does not exist at the time, Excel will save the range as something like:

'[1]Unknown Sheet'!$A$1

If a cell references such a range, Excel will again display #REF!.
PhpSpreadsheet currently throws an Exception when it encounters
such a definition while reading the file. This PR changes
the behavior so that PhpSpreadsheet saves the definition as #REF!,
and cells which reference it will behave similarly.

For the record, I will note that Excel does not magically recalculate when a
missing sheet is subsequently added, despite the fact that the reference
might now become resolvable. PhpSpreadsheet behaves likewise.

* Remove Dead Code in Test

Identified it after push but before merge.
2020-12-10 18:08:10 +01:00
oleibman 1741766a9c
Improving Coverage for Excel2003 XML Reader (#1557)
* Improving Coverage for Excel2003 XML Reader

Reader/Xml is now 100% covered.

File templates/Excel2003XMLTest.xml, used in some tests, is *not*
readable by a current version of Excel. I have substituted a new file
excel2003.xml to be used in its place. I have not deleted the original
in case someone in future (possibly me) wants to see what it needs to
make it usable.

There are minimal code changes.
- Unused protected functions pixel2WidthUnits and widthUnits2Pixel
  are deleted.
- One regex looking to convert hex characters is changed from a-z to a-f,
  and made case insensitive.
- No calculation performed for "error" cell (previously calculation
  was attempted and threw exception).
- Empty relative row/cell is now handled correctly.
- Style applied to empty cell when appropriate.
- Support added for textRotation.
- Support added for border styles.
- Support added for diagonal borders.
- Support added for superscript and subscript.
- Support added for fill patterns.

In theory, encodings other than UTF-8 were supported.
In fact, I was unable to get SecurityScanner to pass *any* xml which is
not UTF-8. Eliminating the assumption that strings might not be UTF-8
allowed much of the code to be greatly simplified.
After that, I added some code that would permit the use of
some ASCII-compatible encodings (there is a test of ISO-8859-1).
It would be more difficult to handle other encodings (such as UTF-16).
I am not convinced that even the ISO-8859 effort is worth it,
but am willing to investigate either expanding or eliminating
non-UTF8 support.

I added a number of tests, creating an Xml directory, and moving
XmlTest to that directory.

Pull Request had problems reading old invalid sample in the code
coverage phase, not in any of the other test phases, and not in
the code coverage phase on my local machine.
As it turns out, aside from being invalid, the sample
is much larger than any of the other samples. Tests have been
adjusted accordingly.

* Smaller Test File

Should eliminate need to avoid test during xml coverage.

* Break Up Style Test into Multiple Tests

Per suggestion from Mark Baker.

* Integrate AddressHelper Change

The introduction of AddressHelper introduced a conflict which needed to
be resolved. I wanted to test it locally before resolving. This required
me to add (unchanged) AddressHelper to my local copy. I hope this is
an okay manner of resolving the conflict.

* Weird Travis Error

XmlOddTest works just fine on my local machine, but Travis failed it.
Even worse, the lines which Travis flags don't even make any sense
(one was the empty line between two methods!).
This test is not essential to the rest of the change. I am removing
it from the package, and will attempt to re-add it when I have a chance
to sync up my fork with the main project.
2020-10-11 13:26:56 +02:00
Adrien Crivelli 4739f8b2e7
Merge branch 'readhtml' 2020-07-26 13:11:15 +09:00