Commit Graph

231 Commits

Author SHA1 Message Date
Mark Baker fd14da1675
Ods defined names unit tests (#2054)
* Defined names/formulae in ODS are prefixed by $$ when used in a formula; so we need to strip this out to fully convert them to an Excel formula

* Test for ODS Writer for DefinedNames
2021-05-03 08:39:42 +02:00
xandros15 a757692992 #984 add support notContainsText for conditional styles in xlsx reader 2021-05-02 22:09:38 +02:00
Mark Baker 83e55cffcc
First steps in the implementation of AutoFilters for ODS Reader and Writer (#2053)
* First steps in the implementation of AutoFilters for ODS Reader and Writer, starting with reading a basic AutoFilter range (ignoring row visibility, filter types and active filters for the moment).

And also some additional refactoring to extract the DefinedNames Reader into its own dedicated class as a part of overall code improvement... on the principle of "when working on a class, always try to leave the library codebase in a better state than you found it"

* Provide a basic Ods Writer implementation for AutoFilters
* AutoFilter Reader Test
* AutoFilter Writer Test
* Update Change Log
2021-05-02 22:00:48 +02:00
Mark Baker d555b5d312
Pattern Fill style should default to 'solid' if there is a pattern fill with colour but no style (#2050)
* Pattern Fill style should default to 'solid' if there is a pattern fill style for a conditional; though may need to check if there are defined fg/bg colours as well; and only set a fill style if there are defined colurs
2021-04-30 20:05:45 +02:00
oleibman cc5c0205d5
Fix for Issue 2029 (Invalid Cell Coordinate A-1) (#2032)
* Fix for Issue 2029 (Invalid Cell Coordinate A-1)

Fix for #2021. When Html Reader encounters an embedded table, it tries to shift it up a row. It obviously should not attempt to shift it above row 1. @danmodini reported the problem, and suggests the correct solution. This PR implements that and adds a test case.

Performing some additional testing, I found that Html Reader cannot handle inline column width or row height set in points rather than pixels (and HTML writer with useInlineCss generates these values in points). It also doesn't handle border style when the border width (which it ignores) is omitted. Fixed and added tests.
2021-04-29 22:59:01 +02:00
oleibman aeccdb35e2
XLSX Reader and Empty Fill Tag (#2011)
Openpyxl can generate the xml tag `<patternFill/>`, possibly even as a default style. Excel has no problem with this, treating it as "fill none", but PhpSpreadsheet has a glitch because it treats it as "fill solid white". So, when PhpSpreadsheet loads and saves such a file, the result at first appears as if gridlines are disabled; in fact, the gridlines are merely invisible behind the cells with their solid white fill. This PR makes PhpSpreadsheet behave the same as Excel in this circumstance.

Co-authored-by: Mark Baker <mark@lange.demon.co.uk>
2021-04-20 17:20:59 +02:00
Mark Baker 6282035c96
Extract Property and Style readers from the XML Reader into separate classes (#2009)
* Extract Property and Style readers from the XML Reader into separate classes
2021-04-20 15:27:44 +02:00
Tiago Fernandes 11f8a02194
Merge branch 'master' into issue-907-absolute-path-in-Target 2021-04-19 14:32:20 +01:00
Tiago Fernandes 559c0761df Remove unnecessary changes. Added test 2021-04-19 11:25:48 +01:00
Adrien Crivelli 49f87de165
Reduce PHPStan error in tests 2021-04-12 11:10:23 +09:00
Adrien Crivelli d02352845c
PHPStan Level 2 2021-04-04 22:06:00 +09:00
Adrien Crivelli a189d933f2
Introduce PHPStan
To improve the feedback loop on code quality with a process
that can be run locally by the developers, instead of only
on Scrutinizer.
2021-04-03 16:13:21 +09:00
oleibman 13b62becdd
Fix for Issue #1887 - Lose Track of Selected Cells After Save (#1908)
* Fix for Issue #1887 - Lose Track of Selected Cells After Save

Issue #1887 reports that selected cells are lost after saving Xlsx. Testing indicates that this applies to the object in memory, though not to the saved spreadsheet.

Xlsx writer tries to save calculated values for cells which contain formulas. Calculation::_calculateFormulaValue issues a getStyle call merely to retrieve the quotePrefix property, which, if set, indicates that the cell does not contain a formula even though it looks like one. A side-effect of calls to getStyle is that selectedCell is updated. That is clearly accidental, and highly undesirable, in this case. Code is changed to save selectedCell before getStyle call and restore it afterwards.

The problem was reported only for Xlsx save. To be on the safe side, test is made for output formats of Xlsx, Xls, Ods, Html (which basically includes Pdf), and Csv. For all of those, the object in memory is tested after the save. For Xlsx and Xls, the saved file is also tested. It does not make sense to test the saved file for Csv and Html. It does make sense to test it for Ods, but the necessary support is not yet present in either the Ods Reader or Ods Writer - a project for another day.

* Move Logic Out of Calculation, Add Support for Ods ActiveSheet and SelectedCells

Mark Baker thought logic belonged in Worksheet, not Calculation.
I couldn't get it to work in Worksheet, but doing it in Cell works,
and that has already been used to preserve ActiveSheet over call to
getCalculatedValue, so this just extends that idea to SelectedCells.

Original tests could not completely support Ods because of a lack of support
for ActiveSheet and SelectedCells in Ods Reader and Writer.
There's a lot missing in Ods support, but a journey of 1000 miles ...
Those two particular concepts are now supported for Ods.
2021-03-10 21:23:08 +01:00
Patrick Brouwers 000e6088c9
Reverted Scrutinzer fix in Xslx Reader listWorksheetInfo (#1895) 2021-03-03 21:34:45 +01:00
oleibman 04e7c30758
Fix Two 32-bit Timestamp Problems, and Minor getFormattedValue Bug (#1891)
I ran the test suite using 32-bit PHP. There were 2 places where changes
were needed due to 32-bit timestamps.

Reader\\Xml.php was using strtotime as an intermediate step in converting
a string timestamp to an Excel timestamp. The XML file type stores pure timestamps
(i.e. no date portion) as, e.g., 1899-12-31T02:30:00.000, and that value
causes an error using strtotime on a 32-bit system. However, it is sufficient
to use that value in a DateTime constructor, and that will work for 32- and 64-bit.

There was no test for that particular cell, so I added one to the XML read test.
And that's when I discovered the getFormattedValue bug. The cell's format
is `hh":"mm":"ss`. The quotes around the colons are disrupting the formatting.
PhpSpreadsheet formats the cell by converting the Excel format
to a Php Date format, in this case `H\:m\:s`.
That's a problem,
since Excel thinks 'm' means *minutes*, but PHP thinks it means *months*.
This is not a problem when the colon is not quoted; there are ample tests for that.
I added my best guess as to how to recognize this situation,
changing `\:m` to `:i`. The XML read test
now succeeds, and no other tests were broken by this change.

Test Shared\\DateTest had one test where the expected result of converting to a
Unix timestamp exceeds 2**32. Since a Unix timestamp is strictly an int,
that test fails on a 32-bit system. In the discussion regarding recently merged
PR #1870, it was felt that the user base might still be using the functions
that convert to and from a timestamp. So, we should not drop this test, but,
since it cannot succeed on a 32-bit system, I changed it to be skipped
whenever the expected result exceeded PHP_INT_MAX. There are 3 "toTimestamp"
functions within that test. Only one of these had been affected, but I thought
it was a good idea to add additional tests to the others to demonstrate this
condition.

In the course of testing, I also discovered some 32-bit problems with
bitwise and base-conversion functions. I am preparing separate PRs to
deal with those.
2021-03-03 10:52:11 +01:00
oleibman 80a20fc991
100% Coverage for Calculation/DateTime (#1870)
* 100% Coverage for Calculation/DateTime

The code in DateTime is now completely covered.
Along the way, some errors were discovered and corrected.
- The tests which have had to be changed at the start of every year are
replaced by more robust equivalents which do not require annual changes.
- Several places in the code where Gnumeric and OpenOffice were thought to differ
from Excel do not appear to have had any justification.
I have left a comment where such code has been removed.
- Use DateTime when possible rather than date, time, or strftime functions to avoid
potential Y2038 problems.
- Some impossible code has been removed, replaced by an explanatory comment.
- NETWORKDAYS had a bug when the start date was Sunday. There had been no tests
of this condition.
- Some functions allow boolean and null arguments where a number is expected.
This is more complicated than the equivalent situations in MathTrig because
the initial date for these calculations can be Day 1 rather than Day 0.
- More testing for dates from 1900-01-01 through the fictitious
everywhere-but-Excel 1900-01-29.
    - This showed that there is an additional Excel bug - Excel evaluates
WEEKNUM(emptycell) as 0, which is not a valid result for
WEEKNUM without a second argument.
PhpSpreadsheet now duplicates this bug.
    - There is a similar and even worse bug for 1904-01-01 in 1904 calculations.
Weeknum returns 0 for this,
but returns the correct value for arguments of 0 or null.
    - DATEVALUE should accept 1900-02-29 (sigh) and relatives.
PhpSpreadsheet now duplicates this bug.
- Testing bootstrap sets default timezone. This appears to be a relic from
the releases of PHP where the unwise decision, subsequenly reversed,
was made to issue messages for
"no default timezone is set" rather than just use a sensible default.
This was a disruptive setting for some of the tests I added.
There is only one test in the entire suite which is default-timezone-dependent.
Setting and resetting of default timezone is moved to that test
(Reader/ODS/ODSTest), and out of bootstrap.
- There had been no testing of NOW() function.
- DATEVALUE test had no tests for 1904 calendar and needs some.
- DATE test changed 1900/1904 calendar in use without restoring it.
- WEEKDAY test had no tests for 1904 calendar and needs some.
    - Which revealed a bug in Shared/Date (excelToDateTimeObject was not
recognizing 1904-01-01 as valid when 1904 calendar is in use).
    - And an additional bug in that legal 1904-calendar values in the 0.0-1.0
range yielded the same "wrong" answers as 1900-calendar (see "One note" below).
Also the comment for one of the calendar-1904 tests was wrong in attempting
to identify what time of day the fraction represented.

I had wanted to break this up into a set of smaller modules, a process already
started for Engineering and MathTrig.
However the number of source code changes was sufficient that I wanted
a clean delta for this request.
If it is merged, I will work on breaking it up afterwards.

One note - Shared/Date/excelToDateTimeObject, when calendar-1900 is in use,
returns an unexpected result if its argument is between 0 and 1,
which is nominally invalid for that calendar.
It uses a base-1970 calendar in that instance. That check is not justifiable
for calendar-1904, where values in that range are legal,
so I made the check specific to calendar-1900,
and adjusted 3 1904 unit test results accordingly. However, I have to admit that
I don't understand why that check should be made even for calendar-1900.
It certainly doesn't match anything that Excel does.
I would recommend scrapping that code altogether.
If agreed, I would do this as part of the break-up into smaller modules.

Another note -
more controversially, it is clear that PhpSpreadsheet needs to support
the Excel and PHP date formats. Although it requires further study,
I am not convinced that it needs to support Unix timestamp format.
Since that is a potential source of Y2038 problems on 32-bit systems,
I would like to open a PR to deprecate the use of that format.
Please let me know if you are aware of a valid reason to continue to support it.
2021-02-27 20:43:22 +01:00
oleibman cb23cca3ec
Avoid Duplicate Titles When Reading Multiple HTML Files (#1829)
This issue arose while researching issue #1823. The issue was not a bug;
it just required clarification to the author of how to use the software.
But, while researching, I discovered that loading html into 2
sheets of a spreadsheet has a problem if the html title tag is the same
for the 2 sheets. PhpSpreadsheet would be able to save the resulting file,
but Excel would not be able to read it properly because of the duplicate title.
The worksheet setTitle method allows for disambiguation is such a circumstance.
The html reader passed a parameter indicating "don't disambiguate", but I can't
see any harm in changing that to "disambiguate". An extremely simple fix,
with tests to back it up.
2021-02-27 15:10:04 +01:00
Darren Maczka c82ff2526c
Fix/chart axis titles (#1760)
* use axPos value to determine whether an axis title is mapped to the XaxisLabel or YaxisLabel

* update changelog

* Fix php-cs-fixer violations

Co-authored-by: Darren Maczka <dkm@utk.edu>
Co-authored-by: Mark Baker <mark@lange.demon.co.uk>
2021-01-31 19:13:50 +01:00
Darren Maczka 44248cd04e
Fix/sheets xlsx chart (#1761)
* Add support for Google Sheets Exported XLSX Charts

Google Sheets XLSX charts use oneCellAnchor positioning and the data series
do not have the *Cache elements with cached values.

* update CHANGELOG

* Add support for Google Sheets Exported XLSX Charts

Google Sheets XLSX charts use oneCellAnchor positioning and the data series
do not have the *Cache elements with cached values. Because the reader had been
assuming *Cache elements existed as children of strRef and numRef, errors about
the node being deleted were thrown when reading Xlsx exported from Google Sheets.

Co-authored-by: Darren Maczka <dkm@utk.edu>
2021-01-31 18:53:54 +01:00
もりもと たかひろ 8d2d78334f
Support DataBar of conditional formatting rule (#1754)
Implemented the databar of Conditional Type for XLSX Files.
- DataBar can be read, written, and added for basic use.
- Supports reading, writing and adding using "extLst".

About "extLst"
- https://docs.microsoft.com/en-us/openspecs/office_standards/ms-xlsx/07d607af-5618-4ca2-b683-6a78dc0d9627

The following setting items on the Excel setting screen can be read, written, and added.
- (minimum, maximum)type: Automatic, LowestValue, Number, Percent, Formula, Percentile
- Direction: context, leftToRight, rightToLeft (show data bar only)
- Fills Solid, Gradient
- FillColor: PositiveValues, NegativeValues
- Borders: Solid, None
- BorderColor: PositiveValues, NegativeValues
- Axis position: Automatic, Midpoint, None
- Axis color
2021-01-29 16:57:40 +01:00
oleibman a66233b72f
Fix For #1772 Null Exception on ODS Read (#1776)
Fix for #1772.
Header and Footer Properties may be omitted in Page Setting Style Set.
Code changed to allow for this possibility, and tests added.
2021-01-28 12:42:41 +01:00
MarkBaker 11522afee0 Merge branch 'master' into PHP8-Sane-Property-Names
# Conflicts:
#	CHANGELOG.md
#	src/PhpSpreadsheet/Shared/Drawing.php
#	src/PhpSpreadsheet/Spreadsheet.php
#	src/PhpSpreadsheet/Style/Conditional.php
2020-12-27 15:07:50 +01:00
oleibman e768cb0f19
CSV - Guess Encoding, Handle Null-string Escape (#1717)
* CSV - Guess Encoding, Handle Null-string Escape

This is in response to issue #1647 (detect CSV character encoding).
First, my tests with mb_detect_encoding indicate that it doesn't work
well enough; regardless, users can always do that on their own
if they deem it useful.
Rolling my own is also troublesome, but I can at least:
a. Check for BOM (UTF-8, UTF-16BE, UTF-16LE, UTF-32BE, UTF-32LE).
b. Do some heuristic tests for each of the above encodings.
c. Fallback to a user-specified encoding (default CP1252)
  if a and b don't yield result.
I think this is probably useful enough to include, and relatively
easy to expand if other potential encodings should be considered.

Starting with PHP7.4, fgetcsv allows specification of null string as
escape character in fgetcsv. This is a much better choice than the PHP
(and PhpSpreadsheet) default of backslash in that it handles the file
in the same manner as Excel does. There is one statement in Reader/CSV
which would be adversely affected if the caller so specified (building
a regular expression under the assumption that escape character is
a single character). Fix that statement appropriately and add tests.
2020-12-25 17:47:29 +01:00
Gianluca Giovinazzo 51cb21297d
Fix for bug #1592 (UPDATED) (#1623)
* Fix for Xls when BIFF8 SST (FCh) has bad Shared string length
2020-12-17 19:41:07 +01:00
oleibman 3025824a48
Merge pull request #1698
* Merge pull request #4 from PHPOffice/master

* Restore Omitted Read XML Test
2020-12-17 17:00:19 +01:00
oleibman e0feeca555
Fix for #1612 - SLK Long File Name (#1706)
Issue has been marked stale, but ...
Sylk read sets worksheet title to filename (minus .slk).
If that is >31 characters, PhpSpreadsheet throws Exception.
This change truncates sheet title, as Excel does, to 31 characters.
2020-12-10 22:02:36 +01:00
oleibman 497a934374
Fix for 3 Issues Involving ReadXlsx and NamedRange (#1742)
* Fix for 3 Issues Involving ReadXlsx and NamedRange

Issues #1686 and #1723, which provide sample spreadsheets, are probably
solved by this ticket. Issue #1730 is also probably solved, but I have
no way to verify.

There are two problems with how PhpSpreadsheet is handling things now.
Although the first problem is much less severe, and isn't really a factor
in the issues named above, it is helpful to get it out of the way first.
If you define a named range in Excel, and then delete the sheet where
the range exists, Excel saves the range as #REF!. If there is a cell which
references the range, it will similarly have the value #REF! when you open
the Excel file.
Currently, PhpSpreadsheet discards the #REF! definition, so a cell which
references the range will appear as #NAME? rather than #REF!.
This PR changes the behavior so that PhpSpreadsheet retains the #REF!
definition, and cells which reference it will appear as #REF!.

The second problem is the more severe, and is, I believe, responsible
for the 3 issues identified above.
If you define a named range and the sheet on which the range is defined
does not exist at the time, Excel will save the range as something like:

'[1]Unknown Sheet'!$A$1

If a cell references such a range, Excel will again display #REF!.
PhpSpreadsheet currently throws an Exception when it encounters
such a definition while reading the file. This PR changes
the behavior so that PhpSpreadsheet saves the definition as #REF!,
and cells which reference it will behave similarly.

For the record, I will note that Excel does not magically recalculate when a
missing sheet is subsequently added, despite the fact that the reference
might now become resolvable. PhpSpreadsheet behaves likewise.

* Remove Dead Code in Test

Identified it after push but before merge.
2020-12-10 18:08:10 +01:00
Adrien Crivelli bd05c590e3
Drop Travis 2020-11-26 11:10:52 +09:00
MarkBaker bd0462bcfc Work on renaming method arguments for the Readers and Writers 2020-11-19 16:41:52 +01:00
oleibman 1741766a9c
Improving Coverage for Excel2003 XML Reader (#1557)
* Improving Coverage for Excel2003 XML Reader

Reader/Xml is now 100% covered.

File templates/Excel2003XMLTest.xml, used in some tests, is *not*
readable by a current version of Excel. I have substituted a new file
excel2003.xml to be used in its place. I have not deleted the original
in case someone in future (possibly me) wants to see what it needs to
make it usable.

There are minimal code changes.
- Unused protected functions pixel2WidthUnits and widthUnits2Pixel
  are deleted.
- One regex looking to convert hex characters is changed from a-z to a-f,
  and made case insensitive.
- No calculation performed for "error" cell (previously calculation
  was attempted and threw exception).
- Empty relative row/cell is now handled correctly.
- Style applied to empty cell when appropriate.
- Support added for textRotation.
- Support added for border styles.
- Support added for diagonal borders.
- Support added for superscript and subscript.
- Support added for fill patterns.

In theory, encodings other than UTF-8 were supported.
In fact, I was unable to get SecurityScanner to pass *any* xml which is
not UTF-8. Eliminating the assumption that strings might not be UTF-8
allowed much of the code to be greatly simplified.
After that, I added some code that would permit the use of
some ASCII-compatible encodings (there is a test of ISO-8859-1).
It would be more difficult to handle other encodings (such as UTF-16).
I am not convinced that even the ISO-8859 effort is worth it,
but am willing to investigate either expanding or eliminating
non-UTF8 support.

I added a number of tests, creating an Xml directory, and moving
XmlTest to that directory.

Pull Request had problems reading old invalid sample in the code
coverage phase, not in any of the other test phases, and not in
the code coverage phase on my local machine.
As it turns out, aside from being invalid, the sample
is much larger than any of the other samples. Tests have been
adjusted accordingly.

* Smaller Test File

Should eliminate need to avoid test during xml coverage.

* Break Up Style Test into Multiple Tests

Per suggestion from Mark Baker.

* Integrate AddressHelper Change

The introduction of AddressHelper introduced a conflict which needed to
be resolved. I wanted to test it locally before resolving. This required
me to add (unchanged) AddressHelper to my local copy. I hope this is
an okay manner of resolving the conflict.

* Weird Travis Error

XmlOddTest works just fine on my local machine, but Travis failed it.
Even worse, the lines which Travis flags don't even make any sense
(one was the empty line between two methods!).
This test is not essential to the rest of the change. I am removing
it from the package, and will attempt to re-add it when I have a chance
to sync up my fork with the main project.
2020-10-11 13:26:56 +02:00
Roland Eigelsreiter ab4d7413b0
fixed php8 deprecation warning for libxml_disable_entity_loader() (#1625)
* fixed php8 deprecation warning for libxml_disable_entity_loader()
2020-10-08 15:02:14 +02:00
Adrien Crivelli 6a41381c1d
PSR12 code style 2020-07-26 14:13:11 +09:00
Adrien Crivelli 4739f8b2e7
Merge branch 'readhtml' 2020-07-26 13:11:15 +09:00
oleibman 735103c120
Improve Coverage for ODS Reader (#1545)
* Improve Coverage for ODS Reader

Reader/ODS/Properties is now 100% covered.
Reader/ODS is covered except for 1 statement. As the original author
put it, "table-header-rows TODO: figure this out ... I'm not sure that
PhpExcel has an API for this". I'm still thinking about it, but, so far,
I agree with the author.

There are minimal code changes.
- Several places test !zip->open() to see whether the test failed.
  However, zip->open() returns true or a string, so the test never
  detects failure. Change to zip->open() !== true. No previous tests.
- Suppress warning messages from simplexml_load_string (there had
  been no tests for invalid xml).
- One document property was misnamed, and one non-existent property
  was tested for.

I added a number of tests, creating an ODS directory, and moving
OdsTest to that directory.

* Scrutinizer Recommendation

Unused variable in one test.

* Update CHANGELOG

Co-authored-by: Adrien Crivelli <adrien.crivelli@gmail.com>
2020-07-26 12:40:49 +09:00
Mark Baker 5233e9caaf
Merge branch 'master' into Page-Setup-Page-Order 2020-07-19 12:57:48 +02:00
Dhaval Purohit 7e12575d86
Borders are complete even on rowspanned columns using HTML reader
Fixed #1455
Closes #1473
2020-07-19 14:04:53 +09:00
Adrien Crivelli 395b750030
Stricter visibility 2020-07-19 12:30:31 +09:00
Adrien Crivelli c3fa31de13
Missing typing 2020-07-19 12:21:40 +09:00
MarkBaker 6cbb622a9e Minor refactoring 2020-07-05 18:25:39 +02:00
MarkBaker cf6769eab1 Hopefully a final phpcs fix before I start looking at how to run a pre-commit hook on Windows 10 2020-07-05 17:17:56 +02:00
MarkBaker e89196f65b phpcs fixes, although I thought I'd successfully added the pre-commit hook to pick those up before the commit, guess the problem is running from Windoze, so I'll have to address that 2020-07-05 17:05:44 +02:00
MarkBaker 8629337101 Retrieving print/page setup for the Xml Reader 2020-07-05 16:22:35 +02:00
Owen Leibman 752a0a5a6c Scrutinizer Recommendations
Two unneeded assignments in tests, one unused parameter in source code.
2020-06-25 23:11:30 -07:00
Owen Leibman 6080c4561d Improve Coverage for HTML Reader
Reader/Html is now covered except for 1 statement.
There is some coverage of RichText when you know in advance that the
html will expand into a single cell.
It is a tougher nut, one that I have not yet cracked,
to try to handle rich text while converting unkown html to multiple cells.
The original author left this as a TODO, and so for now must I.

It made sense to restructure some of the code. There are some changes.
- Issue #1532 is fixed (links are now saved when using rowspan).
- Colors can now be specified as html color name. To accomplish this,
  Helper/Html function colourNameLookup was changed from protected
  to public, and changed to static.
- Superfluous empty lines were eliminated in a number of places, e.g.
  <ul><li>A</li><li>B</li><li>C</li></ul>
  had formerly caused a wrapped cell to be created with 2 empty lines
  followed by A, B, and C on separate lines; it will now just have the
  3 A/B/C lines, which seems like a more sensible interpretation.
- Img alt tag, which had been cast to float, is now used as a string.

Private member "encoding" is not used. Functions getEncoding and setEncoding
have therefore been marked deprecated. In fact, I was unable to get
SecurityScanner to pass *any* html which is not UTF-8. There are
possibly ways of getting around this (in Reader/Html - I have no
intention of messing with Security Scanner), as can be seen in my
companion pull request for Excel2003 Xml Reader. Doing this would be
easier for ASCII-compatible character sets (like ISO-8859-1),
than for non-compatible charsets (like UTF-16). I am not
convinced that the effort is worth it, but am willing to investigate
further.

I added a number of tests, creating an Html directory, and moving
HtmlTest to that directory.
2020-06-25 22:42:38 -07:00
oleibman 38fab4e632
Fix for #1505 (#1525)
This problem is the same as #1238, which was resolved by #1239.
For that issue, the fix was to check in one place whether
$this->mapCellXfIndex[$xfIndex] was set before using it.
The sample spreadsheet supplied as a description for this
problem had exactly the same problem in 2 other places in the code.
In addition, there were 7 other places in the code where that
particular item was used unchecked. This fix corrects all 9 locations.
The spreadsheet supplied with the problem is used as the basis
for some new tests, which particularly test column dimensions
and styles, the problems involved in this case.
2020-06-19 21:01:18 +02:00
oleibman 262896086a
Improve Coverage for Sylk (#1514)
* Improve Coverage for Sylk

I believe that both BaseReader and Sylk Reader are now 100% covered.

Documentation available for this format is sparse.
It was always incomplete, and in some cases inaccurate.
My goal was to use PhpSpreadsheet to load the test file,
save it as Xlsx, and visually compare the two, then add a test
loaded with assertions. Cell values and calculated values,
and border styles were generally handled pretty well without changes.
Other types of styling were not handled so well. I added a few cells
to exercise some previously uncovered code.

Sylk files must be ASCII. I have deprecated the use of the
setEncoding and getEncoding functions, which had no test cases.
2020-06-19 20:35:44 +02:00
oleibman 73379cdfb1
Improve Coverage for Gnumeric (#1517)
* Improve Coverage for Gnumeric

I believe that both BaseReader and Gnumeric Reader are now 100% covered.

My goal was to use PhpSpreadsheet to load the test file,
save it as Xlsx, and visually compare the two, then add a test
loaded with assertions. Results were generally pretty good,
but there were no tests with assertions. I added a few cells
to exercise some previously uncovered code. Code was extensively
refactored; logic changes are noted below.

Code allowed for specifying document properties in an old format.
I considered removing that, but I found the original spec at
http://www.jfree.org/jworkbook/download/gnumeric-xml.pdf
This allowed me to create an old file, which was not handled
correctly because of namespace differences. The code was corrected
to allow for this difference.

Added support for textRotation.

Mapping of fill types was not correct.

* PHP7.2 Error

One assertion failed under PHP7.2. Apparently there was some change in
the handling of SimpleXMLElement between 7.2 and 7.3. Casting to string
before use eliminates the problem.

* Scrutinizer Recommendations

All minor, solved (hopefully) mostly by casts.

* One Last Scrutinizer Fix

... I hope.
2020-06-19 20:34:02 +02:00
oleibman 585409a949
Testing - Delete Temp Files When No Longer Needed (#1488)
No code changes. The tests in all of these scripts write to at least
one temporary file, which is then read and not used again. The file
should be deleted to avoid filling up the disk system.
2020-05-24 20:03:07 +09:00
oleibman 41b95c1542
CSV Sample File Was Miscoded (#1489)
File author erroneously assumed that backslash was used to escape
quotes in CSV; in fact, doubling the quote is used for escape.
The test still worked, but mainly because the content of the cell
with the escape wasn't tested. The file is now fixed, and
a new test added.
2020-05-24 19:57:39 +09:00
Adrien Crivelli 137268d61a
Remove undesired annotations 2020-05-18 15:49:29 +09:00
Adrien Crivelli fcd9f10663
Update PHP-CS-Fixer rules 2020-05-18 13:49:57 +09:00
Adrien Crivelli e868e58d20
Allow to run an entire folder of tests
We now can do something like:

```sh
./vendor/bin/phpunit tests/PhpSpreadsheetTests/Reader/
```
2020-05-17 18:35:55 +09:00
oleibman 7517cdd008
Improve Coverage for CSV (#1475)
I believe that both CSV Reader and Writer are 100% covered now.

There were some errors uncovered during development.

The reader specifically permits encodings other than UTF-8 to be used.
However, fgetcsv will not properly handle other encodings.
I tried replacing it with fgets/iconv/strgetcsv, but that could not
handle line breaks within a cell, even for UTF-8.
This is, I'm sure, a very rare use case.
I eventually handled it by using php://memory to hold the translated
file contents for non-UTF8. There were no tests for this situation,
and now there are (probably too many).

"Contiguous" read was not handle correctly. There is a file
in samples which uses it. It was designed to read a large sheet,
and split it into three. The first sheet was corrrect, but the
second and third were almost entirely empty. This has been corrected,
and the sample code was adapted into a formal test with assertions
to confirm that it works as designed.

I made a minor documentation change. Unlike HTML, where you never
need a BOM because you can declare the encoding in the file,
a CSV with non-ASCII characters must explicitly include a BOM
for Excel to handle it correctly. This was explained in the Reading CSV
section, but was glossed over in the Writing CSV section, which I
have updated.
2020-05-17 18:15:18 +09:00
Adrien Crivelli f1a019e492
Upgrad PHP deps 2020-04-27 19:29:45 +09:00
bbinotto e2f87e8b7a
Load with styles should not default to black fill color
Fixes #1353
Closes #1361
2020-04-26 22:33:30 +09:00
Matthijs Alles 87f71e1930 Support whitespaces in CSS style in Xlsx
Indentation in the xml leaves spaces in style string even after
replacing newlines. Replacing the spaces ensures no spaces in keys
of the resulting style-array

Fixes #1347
2020-04-05 19:50:57 +09:00
Stronati Andrea 9f5a472426 Fix XLSX file loading with autofilter containing '$'
The `setRange` method of the `Xlsx/AutoFilter` class expects a filter
range format like "A1:E10". The returned value from
`$this->worksheetXml->autoFilter['ref']` could contain "$" and returning
a value like "$A$1:$E$10".

Fixes #687
Fixes #1325
Closes #1326
2020-03-02 18:43:27 +07:00
oleibman 082266aacd Conditionals - Extend Support for (NOT)CONTAINSBLANKS (#1278)
Support for the CONTAINSBLANKS conditional style was added a while ago.
However, that support was on write only; any cells which used
CONTAINSBLANKS on a file being read would drop that style.

I am also adding support for NOTCONTAINSBLANKS, on read and write.
2020-01-04 18:50:04 +01:00
oleibman afd070a756 Handle ConditionalStyle NumberFormat When Reading Xlsx File (#1296)
* Handle ConditionalStyle NumberFormat When Reading Xlsx File

ReadStyle in Reader/Xlsx/Styles.php expects numberFormat to be a string.
However, when reading conditional style in Xlsx file, NumberFormat
   is actually a SimpleXMLElement, so is not handled correctly.
While testing this change, it turned out that reader always expects
   that there is a "SharedString" portion of the XML, which is not
   true for spreadsheets with no string data, which causes a
   run-time message.
Likewise, when conditional number format is not one of the built-in
   formats, a run-time message is issued because 'isset' is used
   to determine existence rather than 'array_key_exists'.
The new workbook added to the testing data demonstrates both those
   problems (prior to the code changes).

* Move Comment to Resolve Conflict

Github reports conflict involving placement of one comment statement.

* Respond to Scrutinizer Style Suggestion

Change detection for empty SimpleXMLElement.
2020-01-04 00:10:41 +01:00
coolhub 86fa5424a6
Correct column style even when using rowspan
Closes #1249
2019-11-30 15:40:42 +01:00
Nathanael d. Noblet 22bf54ca11 Allow Html Reader to write into existing spreadsheet
Sometimes you may want to read html into multiple worksheets within one
spreadsheet. Allowing the passing of a spreadsheet in makes this possible.
2019-11-17 21:17:56 +01:00
Nathanael Noblet 95c8bb9918
Allow HTML Reader to load from string
We often want to export a table as an excel sheet. The system renders the
html and it seems like a waste of time to write it to the file system to
use the reader. This allows us to render the html and then just pass it to
a reader

Closes #1136
2019-08-17 12:54:22 -07:00
Mahmoud Abdo 785705b712
Best effort to support invalid colspan values in HTML reader
Closes #878
2019-07-27 23:31:23 -07:00
Adrien Crivelli fa54ca79a3
Migrate away from deprecated PHPUnit asserts 2019-07-25 10:15:53 -07:00
Mark Baker bf59cf0cbc
Html cellwrapping (#1075)
* When <br> appears in a table cell, set the cell to wrap.

If the cell is not set to wrap, it appears as a single line when first
displayed in Excel, although editing the cell will cause Excel to wrap
it.

* fix whitespace

Upstream has a coding standard that includes whitespace

* Add Unit tests for cell wrapping

* Update changelog
2019-07-12 07:52:03 +02:00
Mark Baker d8047b071b
Basic unit test and fix for loading data validations from xlsx file (#1063) 2019-07-08 19:55:14 +02:00
rtek 6ab969e9cc Allow XmlScanner to correctly restore libxml entity_loader setting (#1050)
XmlScanner was not restoring libxml_disable_entity_loader since
destruct was not being called until script shutdown. This is because
the shutdown handler required an XmlScanner instance.

Also fix an unrelated bug where the UTF-8 encoding test was
case sensitive.
2019-07-03 09:53:43 +02:00
Mark Baker 0e6238c69e
CVE-2019-12331 (#1041)
* Detect doubly-encoded xml to hide XXE attacks
Correct use of LibXml_Disable_Entity_Loader

* New test for double-encoded xml in security scanner
2019-07-01 00:55:25 +02:00
Mark Baker 1e711541f1
Refactoring xlsx reader (#1033)
Start work on breaking up monolithic Reader and Writer classes into dedicated subclasses to make maintenance work easier
2019-06-30 23:42:25 +02:00
Mark Baker 6c25b6f422
Refactor Xlsx Properties Reader code into a separate class (#1001)
* Unit tests for refactoring Spreadsheet properties
* Refactor Xlsx Properties Reader code into a separate class
2019-06-10 16:44:55 +02:00
MarkBaker d6018a273e Codestyle fixes in tests.... spawn of the devil 2019-05-30 12:23:25 +02:00
MarkBaker 9ba96efc97 Still test against 5.6, but with allowed failures, and skip tests explicitly for features that require PHP >= 7.0.0 2019-05-30 12:11:49 +02:00
kraser 906bdc613c Fix failure when parsing xlsx with drawing having double (redefined) … (#945)
* Fix failure when parsing xlsx with drawing having double (redefined) attributes

* Fix failure when parsing xlsx with drawing having double (redefined) attributes
2019-05-30 11:42:00 +02:00
AlexPravdin ebc0b56959 Fix #853 when loading and saving XLSX file with empty drawing cause c… (#882)
* Fix #853 when loading and saving XLSX file with empty drawing cause corrupted output file. Store empty drawing as unparsed entity and save it as is when saving the file.

* Fix code style
2019-05-30 10:38:03 +02:00
Mark Baker 9b004b1e6a
Ignore escaped enclosures within an enclosure when inferring csv separator (#906) 2019-02-25 23:20:50 +01:00
Patrick Brouwers 1c99f4999c [Feature] Html reader improvements (#884)
* Extract character set, so we can convert to UTF-8 if required

* Set column width and row height when defined on tr/td

* Parse align and valign on td

* Specify number format of cell via html attribute

* Formatting of b, strong, i and em tags

* Inserting image in cell when using img tag in html

* Add applying inline styles: border, fonts, alignment, dimensions

* Add tests for applying inline styles
2019-02-16 23:11:16 +01:00
Adrien Crivelli d0dea580ad
Fix a few Scrutinizer issues 2019-01-02 15:38:13 +11:00
Mahmoud Abdo 86c635b3f5
Fix color from CSS when reading from HTML
In case we generate Spreadsheet from html file and the code
in file have text color in css "color:#FF00FF" it will showing
as black color because it will render like rgb content with } "FF00FF}"
So, we fix it by adding missing bracket "{".

Closes #831
2019-01-02 11:57:30 +11:00
Philipp Kolesnikov 8918888e7c
libxml_disable_entity_loader() changes global state so it should be used as local as possible
Fixes #801
Closes #802
Closes #803
2019-01-01 17:25:24 +11:00
Dennis Birkholz e56fbe2745
Fix column names if read filter calls in XLSX reader skip columns
Fixes #777
Closes #778
2018-12-10 20:00:26 +11:00
MarkBaker 3abb7ccb35 CS Complaining about not uisng $this->assertInternalType('object', $scanner); 2018-11-25 14:41:11 +01:00
MarkBaker 14159d985c Coding standards 2018-11-25 14:33:01 +01:00
MarkBaker 41bcf9a21c Support for additional callback in XML Security Scanner 2018-11-25 14:00:35 +01:00
MarkBaker c708411529 Refactor scanner into base reader class 2018-11-25 12:14:54 +01:00
MarkBaker abad49d426 Use factory for XMLcanner 2018-11-23 23:05:17 +01:00
MarkBaker 5854ce3738 phpcs cleanup 2018-11-20 08:18:35 +01:00
MarkBaker 7a06d71e1c Add UTF-7 XXE Unit test data 2018-11-19 23:22:59 +01:00
MarkBaker a4d97ba896 Clean handle charset in XXE scanner 2018-11-19 22:47:34 +01:00
Laurent 79d86ef5cc
Csv reader avoid notice when the file is empty
Fixes #337
2018-10-28 14:16:53 +11:00
Jon Dufresne 5b3870c508
Prefer https:// URLs when available in docs & comments
Fixes #737
2018-10-28 13:55:00 +11:00
Paul Barton 813855b2b2
Fix CSV delimiter detection on line breaks
The CSV Reader can now correctly ignore line breaks inside
enclosures which allows it to determine the delimiter
correctly.

Fixes #716
Fixes #717
2018-10-21 18:23:55 +11:00
bayzhanov 08b4456641
Xls file threw exception during open by Xls reader
Ignore some exception in property, if stream is empty

Fixes #402
Fixes #659
2018-10-07 18:49:01 +11:00
Adrien Crivelli 9fdcaabe3c
Could not open CSV file containing HTML fragment
We now always trust the file extension to avoid false positive of mime
detection for most simple cases. But we still try to guess the mime type
if the file extension does not match or is missing.

Fixes #564
2018-06-25 11:12:27 +09:00
Robin D'Arcy c723833d6f Allow CSV escape character to be set
Fixes #492
Closes #510
2018-05-23 10:31:41 +09:00
Adrien Crivelli e31878ceb1
Check for MIME type to know if CSV reader can read a file
CSV reader used to accept any file without any kind of check. That made
users incorrectly believe that things were ok, even though there is no
way for CSV reader to read anything else that plain text files.

Fixes #167
2018-02-05 21:33:23 +09:00
Adrien Crivelli c96e2dae02
Update to PHP-CS-Fixer 2.10 2018-01-28 15:59:38 +09:00
Adrien Crivelli 481fc4a7c6
Support XML file without styles
Closes #331
Closes https://github.com/PHPOffice/PHPExcel/pull/559
Fixes https://github.com/PHPOffice/PHPExcel/issues/558
2018-01-14 17:08:50 +09:00
Adrien Crivelli 4dd486fb94
Clean up very obsolete links 2017-12-30 19:07:22 +09:00
Adrien Crivelli 139d85d874
Better auto-detection of CSV separators
Closes #305
2017-12-28 12:25:37 +09:00
Adrien Crivelli 32a55a3f13
Introduce identical functional tests across several formats 2017-12-17 16:35:20 +09:00
Adrien Cohen 11b055b29f
Able to set the `topLeftCell` in freeze panes
Fixes #260
Closes #261
2017-12-17 13:32:16 +09:00
Adrien Crivelli 962367c95f
Can read very small HTML files
Fixes #194
2017-12-11 11:09:25 +09:00
Gabriel Caruso aed27a0bed Use PHPUnit\Framework\TestCase instead of PHPUnit_Framework_TestCase (#271)
Use the `PHPUnit\Framework\TestCase` notation instead of `PHPUnit_Framework_TestCase` while extending our TestCases. This will help us migrate to PHPUnit 6, that [no longer support snake case class names](https://github.com/sebastianbergmann/phpunit/blob/master/ChangeLog-6.0.md#changed-1).
2017-11-09 00:48:01 +09:00
Adrien Crivelli 40efcd2fdd
Rename tests according to the class the class they are testing 2017-11-03 12:47:19 +09:00
Adrien Crivelli 557e80dc03
Rename classes to keep them in their related namespaces 2017-10-29 17:39:42 +09:00
Adrien Crivelli 4fd8e742e7
Upgrade to PHP-CS-Fixer 2.7 2017-10-01 20:07:04 +09:00
GreatHumorist 2abe56b946 Support missing attribute `r` in `c` node when reading xlsx
When describing a cell, the cell reference (r="A1") is optional.
When not present, we should just increment the index of the last processed row.

Fixes #201 
Closes #225
2017-09-22 14:49:38 +09:00
GreatHumorist 7aa6233185
Added xml reader hyperlink support
Closes #223
2017-09-22 14:40:47 +09:00
Adrien Crivelli aef4d711f5
Use `self::assert*()` instead of `$this->assert*()`
Because even if it doesn't make a difference in practice, it is
technically more correct to call static methods statically. It
also better advertise that those methods can be used from any context.
2017-09-22 14:22:44 +09:00
GreatHumorist 0477e6fcfe In Xml reader throw exception in case of invalid XML (#222)
When the xml file is not a standard xml file, the `simplexml_load_string` will return false, this will cause an error on "$xml->getNamespaces(true);" . So instead of showing the error, we throw an exception.
2017-09-20 14:20:12 +09:00
Adrien Crivelli 31daed0048
Fix class name case 2017-08-02 23:13:08 +02:00
Zharikov Viktor 07455d24f6
Make global usage of `use` instead of FQCN
Closes #78
Closes #147
2017-05-18 00:10:16 +02:00
Markus Lanthaler 3ee9cc5ce6
Infer CSV delimiter if it hasn't been set explicitly
Closes #141
2017-04-20 17:02:03 +09:00
Adrien Crivelli fd9c925a7b
Refactor CachedObjectStorage to PSR-16
This drop a lot of non-core features code and delegate their maintainance
to third parties. Also it open the door to any missing implementation
out of the box, such as Redis for the moment.

Finally this consistently enforce a constraint where there can be one and
only one active cell at any point in time in code. This used to be true for
non-default implementation of cache, but it was not true for default
implementation where all cells were kept in-memory and thus were never
detached from their worksheet and thus were all kept functionnal at any
point in time.

This inconsistency of behavior between in-memory and off-memory could
lead to bugs when changing cache system if the end-user code was badly
written. Now end-user will never be able to write buggy code in the first
place, avoiding future headache when introducing caching.

Closes #3
2017-04-14 16:56:27 +09:00
Kurounin b01671213a
Removed double un-escaping when reading CSV
Removed "unescape enclosure functionality", since the unescaping is already handled by fgetcsv,
and performing the unescaping again would actually result int the text from the cell being read wrong.

As an example try parsing the folowing CSV:

```
"<img alt="""" src=""http://example.com/image.jpg"" />"
```

With the additional unescaping it would have ended up as:

```
<img alt=" src="http://example.com/image.jpg" />
```

instead of the correct:
```
<img alt="" src="http://example.com/image.jpg" />
```

Fixes https://github.com/PHPOffice/PHPExcel/pull/1171
2017-04-03 11:57:10 +09:00
Adrien Crivelli 93e2204774
Document ODS supported features
This should be completed in the future.
2017-03-06 14:40:27 +09:00
Paolo Agostinetto 9785f926c1 php-cs run: fixed code style for new/changed files 2017-02-20 21:05:25 +01:00
Paolo Agostinetto a0321fd6fd Removed PhpStorm comment on top of file 2017-02-20 21:02:49 +01:00
Paolo Agostinetto c954eddf57 Ods reader: fix sheet count and added a test for sheet names 2017-02-20 21:02:04 +01:00
Paolo Agostinetto 6d6353c0f1 Ods reader: fix reading of cells with hyperlinks 2017-02-18 20:59:14 +01:00
Paolo Agostinetto 1dba2d1766 Ods reader: tests for repeated spaces and rich text 2017-02-18 20:49:48 +01:00
Paolo Agostinetto bcd1bc364c Ods reader: test loading of Worksheets 2017-02-18 13:55:22 +01:00
Paolo Agostinetto 3c7b2e23da Added unit tests for Ods reader 2017-02-18 13:36:08 +01:00
Adrien Crivelli 031af1e9d2
Standardize writers and readers name to be the format most common extension in CamelCase 2017-01-22 17:39:23 +09:00
Adrien Crivelli 8c66afe39a
Upgrade to PHP-CS-Fixer 2.0 2016-12-22 23:46:26 +09:00
Alexander Kurilo 408da0c17a Make HTML checks more strict 2016-11-16 22:21:30 +09:00
Alexander Kurilo 928b592c14 Add readable labels for data provider samples 2016-11-16 22:21:30 +09:00
Alexander Kurilo 2809cce298 Remove unused data from test provider data set 2016-11-16 22:21:30 +09:00
Alexander Kurilo edb3974a0d Move XEEE test data to add data for other readers 2016-11-16 22:21:30 +09:00
Adrien Crivelli 47cde0dadc
Introduce vendor prefix `PhpOffice` to namespace 2016-09-01 02:20:47 +09:00
Adrien Crivelli 29bdbd4e0b
Respect PSR-0 with matching folder name and namespace `PhpSpreadsheetTests` 2016-08-25 13:53:15 +09:00