Commit Graph

17 Commits

Author SHA1 Message Date
oleibman 0cd20f3099
Csv Handling of Booleans (and an 8.1 Deprecation) (#2232)
* Csv Handling of Booleans (and an 8.1 Deprecation)

PhpSpreadsheet writes boolean values to a Csv as null-string/1, and treats input values of 'true' and 'false' as if they were strings. On the other hand, Excel writes boolean values to a Csv as TRUE/FALSE, and case-insensitively treats a matching string as boolean on read. This PR changes PhpSpreadsheet to match Excel.

A side-effect of this change is that it fixes behavior incorrectly reported as a bug in PR #2048. That issue was closed, correctly, as user error. The user had altered Csv Writer, including adding ```declare(strict_types=1);```; that declaration was the cause of the error. The "offending" statements, calls to strpbrk and str_replace, will now work correctly whether or not strict_types is in use.

And, just as I was getting ready to push this, the dailies for PHP 8.1 introduced a change deprecating auto_detect_line_endings. Csv Reader uses that setting; it allows it to process a Csv with Mac line endings, which happens to be something that Excel can do. As they say in https://wiki.php.net/rfc/deprecations_php_8_1, where the proposal passed without a single dissenting vote, "These newlines were used by “Classic” Mac OS, a system which has been discontinued in 2001, nearly two decades ago. Interoperability with such systems is no longer relevant." I tend to agree, but I don't know that we're ready to pull the plug yet. I don't see an easy way to emulate that functionality. For now, I have silenced the deprecation notices with at signs. I have also added a test case which will fail when support for that setting is pulled; this will give time to consider alternatives.

* Scrutinizer: Handling ini_set

This could be interesting. It doesn't like not handling an error condition for ini_set. Let's see if this satisfies it.
2021-08-04 07:00:17 -07:00
oleibman 346bad1b1d
Fix for Issue 2042 (SUM Partially Broken) (#2045)
As issue #2042 documents, SUM behaves differently with invalid strings depending on whether they come from a cell or are used as literals in the formula. SUM is not alone in this regard; COUNTA is another function within this behavior, and the solution to this one is modeled on COUNTA. New tests are added for SUM, and the resulting tests are duplicated to confirm correct behavior for both cells and literals.

Samples 16 (CSV), 17 (Html), and 21 (PDF) were adversely affected by this problem. 17 and 21 were immediately fixed, but 16 had another problem - Excel was not interpreting the UTF8 currency symbols correctly, even though the file was saved with a BOM. After some experimenting, it appears that the `sep=;` line generated by setExcelCompatibility(true) causes Excel to mis-handle the file. This seems like a bug - there is apparently no way to save a UTF-8 CSV with non-ASCII characters which specifies a non-standard separator which Excel will open correctly. I don't know if this is a recent change or if it is just the case that nobody noticed this problem till now. So, I changed Sample 16 to use setUseBom rather than setExcelCompatibility, which solved its problem. I then added new tests for setExcelCompatibility, with documentation of this problem.
2021-05-03 18:31:01 +02:00
Adrien Crivelli 49f87de165
Reduce PHPStan error in tests 2021-04-12 11:10:23 +09:00
Akira Taniguchi 551bbdf30f
Update CsvOutputEncodingTest.php 2021-04-06 18:53:21 +09:00
Akira Taniguchi e48ffdbe77
Update CsvOutputEncodingTest.php 2021-04-06 04:44:24 +09:00
Akira Taniguchi 991616d1d9
Update CsvOutputEncodingTest.php 2021-04-06 04:39:26 +09:00
Akira Taniguchi 86f3bdd598
Update CsvOutputEncodingTest.php 2021-04-06 04:27:14 +09:00
Akira Taniguchi bfd4a659a4
Update CsvOutputEncodingTest.php 2021-04-06 04:11:46 +09:00
Akira Taniguchi 2adc44262c
Update CsvOutputEncodingTest.php 2021-04-06 03:48:08 +09:00
Akira Taniguchi 7681a1093f
Update CsvOutputEncodingTest.php 2021-04-06 03:46:10 +09:00
Akira Taniguchi 2df8e1a93f
Update CsvOutputEncodingTest.php 2021-04-06 03:07:10 +09:00
Akira Taniguchi f6bfbd0655
Create CsvOutputEncodingTest.php 2021-04-06 02:56:54 +09:00
Akira Taniguchi 2a16ce1432
Delete CsvOutputEncoding.php 2021-04-06 02:47:46 +09:00
Akira Taniguchi eef90b62df
Create CsvOutputEncoding.php 2021-04-06 02:05:08 +09:00
oleibman ce6ac1f040
Fix For #1509 (#1518)
* Fix For #1509

User expected no CSV enclosures after $writer->setEnclosure(''),
which had been changed to be consistent with $reader->setEnclosure('').
Writer will now omit enclosures after code above; no change to Reader.
Tests have been added for this condition.

* Add Option to Write CSV Enclosure Only When Required

Allowing the user to specify no enclosure when writing a CSV can lead to
a situation where PhpSpreadsheet (likewise Excel) will not read the
resulting file as intended, e.g. if any cell contains a delimiter character.
This is demonstrated in new test TestBadReread.
No existing setting will rectify this situation.

A better choice would be to add an option to write the enclosure
only when it is needed, which is what Excel does. The RFC4180 spec at
https://tools.ietf.org/html/rfc4180
states when it is needed - when the cell contains the delimiter,
or the enclosure, or a newline.
New test TestGoodReread demonstrates that the file is read as intended.

The documentation has been updated to describe the new function,
and to change the write example where the enclosure is set to null.

* Scrutinizer Suggestions

3 minor changes, all in tests.
2020-06-19 20:28:57 +02:00
Adrien Crivelli fcd9f10663
Update PHP-CS-Fixer rules 2020-05-18 13:49:57 +09:00
oleibman 7517cdd008
Improve Coverage for CSV (#1475)
I believe that both CSV Reader and Writer are 100% covered now.

There were some errors uncovered during development.

The reader specifically permits encodings other than UTF-8 to be used.
However, fgetcsv will not properly handle other encodings.
I tried replacing it with fgets/iconv/strgetcsv, but that could not
handle line breaks within a cell, even for UTF-8.
This is, I'm sure, a very rare use case.
I eventually handled it by using php://memory to hold the translated
file contents for non-UTF8. There were no tests for this situation,
and now there are (probably too many).

"Contiguous" read was not handle correctly. There is a file
in samples which uses it. It was designed to read a large sheet,
and split it into three. The first sheet was corrrect, but the
second and third were almost entirely empty. This has been corrected,
and the sample code was adapted into a formal test with assertions
to confirm that it works as designed.

I made a minor documentation change. Unlike HTML, where you never
need a BOM because you can declare the encoding in the file,
a CSV with non-ASCII characters must explicitly include a BOM
for Excel to handle it correctly. This was explained in the Reading CSV
section, but was glossed over in the Writing CSV section, which I
have updated.
2020-05-17 18:15:18 +09:00