PhpSpreadsheet

Commit Graph

Author	SHA1	Message	Date
oleibman	d27b6a672a	Cleanup 3 LookupRef Tests (#3097) Scrutinizer had previously suggested annotations for 3 LookupRef tests, but it no longer accepts its own annotation for those cases. This PR cleans them up. ColumnsTest and RowsTest are extremely straightforward. IndexTest is a bit more complicated, but only because, unlike the other two, it had no test which executed in the context of a spreadsheet. And, when I added those, I discovered a couple of bugs. INDEX always requires at least 2 parameters (row# is always required), but its entry in the function table specified 1-4 parameters, now changed to 2-4. And, omitting col# is not handled the same way as specifying 0 for col#, though the code had treated them identically. (The same would have been true for row# but, because it is now required, ...)	2022-10-01 07:05:54 -07:00
oleibman	050a42db8e	Xlsx Reader External Data Validations Flag Missing (#3078) * Xlsx Reader External Data Validations Flag Missing Fix #2677. This PR supersedes #2679, written by @technghiath, which lacks tests, and probably doesn't solve the problem entirely. The code causing the problem appears to be the last remnant in Xlsx Reader which calls `children` using a namespace prefix rather than a namespace. That is changed, and tests are added where the tag is unexpectedly missing, and also where it uses a non-standard namespace prefix. * Scrutinizer Reports 1 "new" error. It isn't, but fix it anyhow. * Fix One Existing Scrutinizer Problem Only remaining problem in Reader/Xlsx.	2022-09-28 00:14:37 -07:00
oleibman	90422bf1d2	R1C1 Format and Internationalization, plus Relative Offsets (#3052) * R1C1 Format and Internationalization, plus Relative Offsets Fix #1704, albeit imperfectly. Excel's implementation of this feature makes it impossible to fix perfectly. I don't know why it was necessary to internationalize R1C1 in the first place - the benefits are so minimal, and the result is worksheets that break when opened in different locales. Ugh. I can't even find complete documentation about the format in different languages; I am using https://answers.microsoft.com/en-us/officeinsider/forum/all/indirect-function-is-broken-at-least-for-excel-in/1fcbcf20-a103-4172-abf1-2c0dfe848e60 as my definitive reference. This fix concentrates on the original report, using the INDIRECT function; there may be other areas similarly affected. As with ambiguous date formats, PhpSpreadsheet will do a little better than Excel itself when reading spreadsheets with internationalized R1C1 by trying all possibilities before giving up. When it does give up, it will now return `#REF!`, as Excel does, rather than throwing an exception, which is certainly friendlier. Although read now works better, when writing it will use whatever the user specified, so spreadsheets breaking in the wrong locale will still happen. There were some bugs that turned up as I added test cases, all of them concerning relative addressing in R1C1 format, e.g. `R[+1]C[-1]`. The regexp for validating the format allowed for minus signs, but not plus signs. Also, the relevant functions did not allow for passing the current cell address, which made relative addressing impossible. The code now allows these, and suitable test cases are added. * Use Locale for Formats, but Not for XML Implementing a suggestion from @MarkBaker to use the system locale for determining R1C1 format rather than looping through a set of regexes and accepting any that work. This is closer to how Excel itself operates.
The assumption we are making is to use the first character of the translated ROW and COLUMN functions. This will not work for Russian or Bulgarian, where each starts with the same letter, but it appears that Russian, at least, still uses R1C1. So our algorithm will not use non-ASCII characters, nor characters where ROW and COLUMN start with the same letter, falling back to R/C in those cases. Turkish falls into that category. Czech uses an accented character for one of the functions, and I'm guessing to use the unaccented character in that case. Polish COLUMN function is NR.KOLUMNY, and I'm guessing to use K in that case. The function that converts R1C1 references is also used by the XML reader where the format is always R1C1, not locale-based (confirmed by successfully opening in Excel an XML spreadsheet when my language is set to French). The conversion code now handles that distinction through the use of an extra parameter. Xml Reader Load Test is duplicated to confirm that spreadsheet is loaded properly whether the locale is English or French. (No, I did not add an INDIRECT function to the Xml spreadsheet.) Tests CsvIssue2232Test and TranslationTest both changed locale without resetting it when done. That omission was exposed by the new code, and both are now corrected. * OpenOffice and Gnumeric OpenOffice and Gnumeric make it much easier to test with other languages - they can be handled with an environment variable. Sensibly, they require R and C as the characters for R1C1 notation regardless of the language. Change code to recognize this difference from Excel. * Handle Output of ADDRESS Function One other function has to deal with R1C1 format as a string. Unlike INDIRECT, which receives the string on input, ADDRESS generates the string on output. Ensure that the ADDRESS output is consistent with the INDIRECT input. ADDRESS expects its 4th arg to be bool, but it can also accept int, and many examples on the net supply it as an int. 
This had not been handled properly, but is now corrected. * More Structured Test I earlier introduced a new test for relative R1C1 addressing. Rewrite it to be clearer. * Add Row for This to Locale Spreadsheet It took a while for me to figure out how it all works. I have added a new row (with English value `RC`) to Translations.xlsx, in the "Lookup and Reference" section of sheet "Excel Functions". By starting the "function name" with an asterisk, it will not be confused with a "real" function (confirmed by a new test). This approach also gives us the flexibility to do something similar if another surprise case occurs in future; in particular, I think this is more flexible than adding this as another option on the "Excel Localisation" sheet. It also means that any errors or omissions in the list below will be handled as with any other translation problem, by updating the spreadsheet without needing to touch any code. The spreadsheet has the following entries in the RC row: - first letter of ROW/COLUMN functions for da, de, es, fi, fr, hu, nl, nb, pt, pt_br, sv - no value for locales where ROW/COLUMN functions start with same letter - bg, ru, tr - no value for locales with a multi-part name for ROW and/or COLUMN - it, pl (I had not previously noted Italian as an exception) - no value for locales where ROW and/or COLUMN starts with a non-ASCII character - cs (this would also apply to bg and ru which are already included under "same letter") - it does nothing for locales which are defined on the "Excel Localisation" sheet but have no entries yet on the "Excel Functions" sheet (e.g. eu) Note that all but the first bullet item will continue to use R/C, which leaves them no worse off than they were before this change.	2022-09-16 08:25:26 -07:00
MarkBaker	84d6d98348	Unit tests for correctly handling hidden merged cells in Readers	2022-09-16 12:27:57 +02:00
Mark Baker	47067494f5	Merge branch 'master' into Calculation-Examples	2022-09-14 20:13:38 +02:00
oleibman	252474c1bd	Scrutinizer Clean Up Tests (#3061) * Scrutinizer Clean Up Tests No source code involved. * Scrutinizer Whack-a-mole Fixed 17, added 10. Trying again. * Simplify Some Tests Eliminate some null assertions. * Dead Code Remove 2 statements.	2022-09-14 07:11:20 -07:00
MarkBaker	441ae741d7	Update Excel function samples for Date/Time and Engineering functions	2022-09-14 14:16:32 +02:00
oleibman	3e8d50547c	Minor Fix for Percentage Formatting (#3053) Fix #1929. This was already substantially fixed, but there was a lingering problem with an unexpected leading space. It turns out there was also a problem with leading zeros, also fixed. There are also problems involving commas; fixing those seems too complicated to delay these changes, but I will add it to my to-do list.	2022-09-12 08:45:13 -07:00
MarkBaker	f7a3534928	Implementation of the `VALUETOTEXT()` Excel Function	2022-08-27 16:23:55 +02:00
MarkBaker	cd66298901	Adjust `extractAllCellReferencesInRange()` method to allow a worksheet in the reference	2022-08-14 23:43:52 +02:00
MarkBaker	71b2c5ae89	Expand [PR #2964](https://github.com/PHPOffice/PhpSpreadsheet/pull/2964) to cover all arithmetic operators, not just multiplication, and both left and right side values	2022-08-07 13:59:26 +02:00
oleibman	b661d31887	Limited Support for Chart Titles as Formulas (#2971) This is a start in addressing issue #2965 (and earlier issue #749). Chart Titles are usually entered as strings or Rich Text strings, and PhpSpreadsheet supports that. They can also be entered as formulas (typically a pointer to a cell with the title text), and, not only did PhpSpreadsheet not support that, it threw an exception when reading a spreadsheet that did so. This change does: - eliminate the exception - set a static chart title when it can determine it from the Xml This change does not: - fully support dynamic titles (e.g. if you change the contents of the source cell, or delete or insert cells or rows or columns) - permit the user to set the title to a formula - allow the use of formulas when writing a chart title to a spreadsheet - provide styling for titles when it has read them as a formula	2022-08-06 17:39:18 -07:00
Mark Baker	a8b6214574	Merge branch 'master' into TextFunctions-ArrayToText	2022-08-04 23:02:20 +02:00
MarkBaker	4724c8f7e9	Initial work on the ARRAYTOTEXT() Excel Function	2022-08-04 22:43:36 +02:00
MarkBaker	f331bca470	cellExists() and getCell() methods should support UTF-8 named cells	2022-08-04 14:38:35 +02:00
Mark Baker	fe3ec55341	Merge branch 'master' into TextFunctions-New-TextSplit	2022-08-02 19:36:47 +02:00
MarkBaker	07f4fbe396	Initial implementation of the `TEXTSPLIT()` Excel Function	2022-08-02 19:05:43 +02:00
MarkBaker	290d0731fe	Allow multiple delimiters for `TEXTBEFORE()` and `TEXTAFTER()` functions	2022-07-30 10:27:31 +02:00
Mark Baker	345c0ebdfc	Merge branch 'master' into TextFunctions-New	2022-07-28 19:14:06 +02:00
MarkBaker	88bfa98291	Initial Implementation of the new Excel TEXTBEFORE() and TEXTAFTER() functions	2022-07-28 16:05:18 +02:00
oleibman	4bf4278a39	VLOOKUP Breaks When Array Contains Null Cells (#2939) Fix #2934. Null is passed to StringHelper::strtolower which expects string. Same problem appears to be applicable to HLOOKUP. I noted the following problem in the code, but will document it here as well. Excel's results are not consistent when a non-numeric string is passed as the third parameter. For example, if cell Z1 contains `xyz`, Excel will return a REF error for function `VLOOKUP(whatever,whatever,Z1)`, but it returns a VALUE error for function `VLOOKUP(whatever,whatever,"xyz")`. I don't think PhpSpreadsheet can match both behaviors. For now, it will return VALUE for both, with similar results for other errors. While studying the returned errors, I realized there is something that needs to be deprecated. `ExcelError::$errorCodes` is a public static array. This means that a user can change its value, which should not be allowed. It is replaced by a constant. Since the original is public, I think it needs to stay, but with a deprecation notice; users can reference and change it, but it will be unused in the rest of the code. I suppose this might be considered a break in functionality (that should not have been allowed in the first place).	2022-07-17 06:27:56 -07:00
oleibman	5de82981d8	Html Reader Not Handling non-ASCII Data Correctly (#2943) * Html Reader Not Handling non-ASCII Data Correctly Fix #2942. Code was changed by #2894 because PHP8.2 will deprecate how it was being done. See linked issue for more details. Dom loadhtml assumes ISO-8859-1 in the absence of a charset attribute or equivalent, and there is no way to override that assumption. Sigh. The suggested replacements are unsuitable in one way or another. I think this will work with minimal disruption (replace ampersand, less than, and greater than with entities representing illegal characters, then use htmlentities, then restore ampersand, less than, and greater than). * Better Implementation Use regexp to escape non-ASCII. Less kludgey, less reliant on the vagaries of the PHP maintainers. * Additional Tests Test non-ASCII outside of cell contents: sheet title, image alt attribute. * Apply Same Change in Second Location Forgot to change loadFromString. * Additional Test Confirm escaped ampersand is handled correctly.	2022-07-16 22:08:44 -07:00
oleibman	f0059bb4bc	Xlsx Chart Reader and Writer Mishandle Explosion Value (#2928) Fix #2506. Reader only tests if Explosion is set without capturing its value. Writer hard-codes value when it is set.	2022-07-14 07:48:53 -07:00
MarkBaker	ad15232fc7	Modify rangeBoundaries(), rangeDimension() and getRangeBoundaries() methods to work with row/column ranges as well as with cell ranges and cells	2022-07-07 17:30:27 +02:00
oleibman	faf6d819c6	Keep Calculated String Results Below 32K (#2921) * Keep Calculated String Results Below 32K This is the result of an investigation into issue #2884 (see also PR #2913). It is, unfortunately, not a fix for the original problem; see the discussion in that PR for why I don't think there is a practical fix for that specific problem at this time. Excel limits strings to 32,767 characters. We already truncate strings to that length when added to the spreadsheet. However, we have been able to exceed that length as a result of the concatenation operator (Excel truncates); as a result of the CONCATENATE or TEXTJOIN functions (Excel returns #CALC!); or as a result of the REPLACE, REPT, SUBSTITUTE functions (Excel returns #VALUE!). This PR changes PhpSpreadsheet to return the same value as Excel in these cases. Note that Excel2003 truncates in all those cases; I don't think there is a way to differentiate that behavior in PhpSpreadsheet. However, LibreOffice and Gnumeric do not have that limit; if they have a limit at all, it is much higher. It would be fairly easy to use existing settings to differentiate between Excel and LibreOffice/Gnumeric in this respect. I have not done so in this PR because I am not sure how useful that is, and I can easily see it leading to problems (read in a LibreOffice spreadsheet with a 33K cell and then output to an Excel spreadsheet). Perhaps it should be handled with an additional opt-in setting. I changed the maximum size from a literal to a constant in the one place where it was already being enforced (Cell/DataType). I am not sure that is the best place for it to be defined; I am open to suggestions. * Implement Some Suggestions ... from @MarkBaker.	2022-07-04 08:30:46 -07:00
oleibman	c3f53854b6	Php/iconv Should Not Treat FFFE/FFFF as Valid (#2910) Fix #2897. We have been relying on iconv/mb_convert_encoding to detect invalid UTF-8, but all techniques designed to validate UTF-8 seem to accept FFFE and FFFF. This PR explicitly converts those characters to FFFD (Unicode substitution character) before validating the rest of the string. It also substitutes one or more FFFD when it detects invalid UTF-8 character sequences. A comment in the code being changed stated that it doesn't handle surrogates. It is right not to do so. The only case where we should see surrogates is reading UTF-16. Additional tests are added to an existing test reading a UTF-16 Csv to demonstrate that surrogates are handled correctly, and that FFFE/FFFF are handled reasonably.	2022-07-02 08:53:39 -07:00
oleibman	b5b83abc0e	Adjust Both Coordinates for Two-Cell Anchors (#2909) Fix #2908. When support for two-cell anchors was added for drawings, we neglected to adjust the second cell address when rows or columns are added or deleted. It also appears that "twoCell" and "oneCell" were introduced as lower-case literals when support for the editAs attribute was subsequently introduced.	2022-06-29 09:20:33 -07:00
FlameStorm	1829dea91e	Ignore square-$-brackets prefix in format string (#2886) * Ignore square-$-brackets prefix in format string * Test for square-$-brackets prefix in format string issue fixed * Fix for phpstan compliance * Additional assert for checking number format of tested source cell	2022-06-15 14:55:17 -07:00
oleibman	90bdc7c12e	Test For Excel File Saved With Ribbon Data (#2883) File from https://www.rondebruin.nl/win/s2/win003.htm. I have been in conversation with the author, who has no objection to its use. I have not actually opened the file in Excel (at least not with macros enabled); I am using it merely to demonstrate that the ribbon data is read and written correctly. Test added; no source code changed. This should slightly increase coverage for Reader/Xlsx (moderate), Writer/Xlsx (slight), and Spreadsheet (substantial). Note that this file has no Ribbon Bin objects, so some coverage is still lacking.	2022-06-14 08:33:36 -07:00
oleibman	c936f1d9f8	Coverage Improvements (#2859) Mostly new tests, some code annotations, some minor code changes: - RichText clone logic is wrong - TextElement doesn't have object properties, doesn't need clone	2022-06-01 08:29:56 -07:00
MarkBaker	bb7e083745	Add Hidden Worksheet unit tests for other Readers	2022-05-24 14:40:45 +02:00
MarkBaker	23ce21901d	Add support for reading Worksheet Visibility for Gnumeric	2022-05-24 14:08:41 +02:00
Mark Baker	2de58d4164	Merge branch 'master' into Ods-Writer-Worksheet-Visibility	2022-05-23 17:06:19 +02:00
MarkBaker	3fae29d613	Add support for reading Worksheet Visibility for Ods	2022-05-23 15:30:45 +02:00
MarkBaker	2e5ebea110	Add support for writing Worksheet Visibility for Ods	2022-05-23 13:38:08 +02:00
oleibman	7c1c896959	Change and Re-enable URLImageTest (#2844) Disabled it earlier because its reliance on an external site not under our control was causing problems. URL in spreadsheet is now changed to point to an image in phpspreadsheet.readthedocs.io, which should be more reliable. Test is re-enabled.	2022-05-21 08:07:59 -07:00
MarkBaker	db4dac3de9	Filter null values in PRODUCT() function, as they don't affect the calculation in any way, and actually require additional code handling if they're present within the calculation itself	2022-05-17 16:40:14 +02:00
Mark Baker	cb81f44091	Merge branch 'master' into ChartSheet	2022-05-17 15:26:09 +02:00
MarkBaker	79f5cf99f6	Resolve Issue #2833, NULL value handling in PRODUCT() Excel function	2022-05-17 12:56:21 +02:00
MarkBaker	198878b347	Add unit tests for ChartSheet reading	2022-05-16 13:30:30 +02:00
MarkBaker	fd44f47f62	Fix issue with Xls Reader converting all named ranges from absolute to relative	2022-05-12 12:47:18 +02:00
oleibman	070bc68514	Html Reader Converting Cell Containing 0 to Null String (#2813) Fix #2810. Repairing some Phpstan diagnostics, used `?:` rather than `??` in a few places. 2 different Html modules are affected. Also, Ods Reader, but its problem is with sheet title rather than cell contents. And, as it turns out, Ods Reader was already not handling sheets with a title of `0` correctly - it made a truthy test before setting sheet title. That is now changed to truthy or numeric. Other readers are not susceptible to this problem. Tests are added.	2022-05-10 07:33:45 -07:00
MarkBaker	4f7da29332	Minor performance tweaks	2022-05-01 21:55:37 +02:00
oleibman	766252ccb0	Real Errors Identified in Calculation by Scrutinizer (#2774) * Real Errors Identified in Calculation by Scrutinizer Before Scrutinizer broke, I took a look at the remaining 43 errors which it categorized as 'major'. Most of these were false positives, but, in the case of Calculation and Reader/Xlsx/Chart, I was able to determine that its analysis of some of the problems was correct. There is little point addressing the false positives until it starts working again, but we should fix the real errors. This PR addresses the real errors in Calculation. - A test for `$pCellParent` should have been a test for `$pCellWorksheet`. - A test for `$operand1Data['reference']` should have been a test for `$operand1Data['value']`. - A test for `$operand2Data['reference']` should have been a test for `$operand2Data['value']`. * Fix Attempt to Erroneously Call trim on Array Fix #2780. Warning message is being issued when getting calculated value of cell with value `=INDIRECT(ADDRESS(ROW(),COLUMN()-2))/$C$4`. This appears to be the case for all recent (and probably not so recent) releases; it is not a result of changes to the code base. Fix added to this PR because the erring section of code was proximate to code already changed in the PR. Test added. * Minor Code Changes Apply some suggestions from @MarkBaker	2022-04-30 19:13:17 -07:00
MarkBaker	0171709e7f	The `sortCellReferenceArray()` in `Coordinate` should have returned cells ordered by row, then by column... but instead sorted by column, then row. Fixed that bug, using a slightly faster algorithm for the sort index than the simple fix would have used, and modified the tests that didn't have the correct expected result :-(	2022-04-28 17:43:01 +02:00
Mark Baker	da76f0d977	Merge branch 'master' into issue2768	2022-04-24 11:26:52 +02:00
oleibman	7fe5ee84ea	Time Interval Formatting Fix #2768. DateFormatter handles only one of six special formats for time intervals `[h] [hh] [m] [mm] [s] [ss]`. This PR extends support to the rest. There should be no more than one of these in any format string. Although it certainly could make sense to treat `[d] [dd]` in the same manner, Excel does not seem to support those. Interesting observations - hours and minutes are truncated (presumably because they may be followed by minutes and seconds), but seconds are rounded. Also, there are some floating point issues, which fortunately showed up for the example in the original issue. There, the time interval was 1.15, which should evaluate to a minutes value of 1656 (as it does in Excel). However, on my system it evaluated to 1655 because of a rounding error in the 13th decimal place. To overcome this, values are rounded to 10 decimal places before truncating.	2022-04-23 11:12:46 -07:00
andres1gb	4cd1d7039d	Fix reading of files in the root of a zip (#2731) * Fix reading of files in the root of a zip Xlsx.php relies on dirname($filename) for path generation. When path is a bare filename (i.e. files in the root of the zip file), dirname($filename) returns a relative path to the current directory ("."). This is ok for filesystems, but not when accessing contents in a zip file. Xlsx documents with files in the root of the zip container are not common, but legit. I've found it to happen in files generated by Google Campaign Manager 360. * Update Xlsx.php * Update Xlsx.php * Update CHANGELOG.md * Add files via upload * Create XlsxRootZipFilesTest.php * Update XlsxRootZipFilesTest.php * Add files via upload * Delete rootZipFiles.xlsx * Update XlsxRootZipFilesTest.php * Update Xlsx.php	2022-04-18 06:54:41 -07:00
redforks	d593617287	Fix font index problem (#2642) * Fix font index problem * Update RichTextSizeTest.php Eliminate Phpstan failure. * Update RichTextSizeTest.php Eliminate now-unused import.	2022-04-17 08:27:28 -07:00
MarkBaker	7a2f5c4ccc	Ods Writer support for setting column width/row height (including Autosizing)	2022-04-17 11:34:39 +02:00
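
Note on commit 7fe5ee84ea (Time Interval Formatting): the message reports that an interval of 1.15 days evaluated to 1655 minutes rather than 1656 because of a rounding error around the 13th decimal place, and that values are now rounded to 10 decimal places before truncating. The Python sketch below only reproduces that arithmetic. It is not PhpSpreadsheet's code, the function names are hypothetical, and it assumes the minutes value is computed as days * 1440 in IEEE 754 double precision.

```python
import math

MINUTES_PER_DAY = 24 * 60  # 1440


def interval_minutes_naive(days: float) -> int:
    """Truncate the raw floating-point product (the behavior the commit reports as wrong)."""
    return math.trunc(days * MINUTES_PER_DAY)


def interval_minutes_rounded(days: float) -> int:
    """Round to 10 decimal places first, then truncate (the remedy the commit describes)."""
    return math.trunc(round(days * MINUTES_PER_DAY, 10))


if __name__ == "__main__":
    print(interval_minutes_naive(1.15))    # 1655: the product is just below 1656
    print(interval_minutes_rounded(1.15))  # 1656
```

Rounding before truncation absorbs representation error far below the precision a time format can display, while still truncating genuine fractional minutes as the commit says Excel does.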


424 Commits