Bug Report: Excel Package Skips Duplicate SharedString Entries
Package Information
Summary
The _SharedStringsMaintainer.add() method uses putIfAbsent() which skips adding duplicate string values to the _list. This causes index mismatches when cells reference sharedString indices that were skipped, resulting in "Unexpected null value" errors.
Root Cause
File: lib/src/sharedStrings/shared_strings.dart
Method: _SharedStringsMaintainer.add()
Current Implementation (Buggy)
void add(SharedString val, String key) {
_map[val]?.increaseCount();
_map.putIfAbsent(val, () { // ❌ Problem: This closure only executes if key doesn't exist
_mapString[key] = val;
_list.add(val); // ❌ This is never called for duplicate values!
return _IndexingHolder(_index++);
});
}
The Problem
- When parsing
sharedStrings.xml, the parser iterates through all <si> elements (e.g., 416 elements)
- For each element, it calls
_SharedStringsMaintainer.add()
- If a duplicate string value is encountered (e.g., "Black" appears at index 143 and again at index 348),
putIfAbsent() does not execute the closure because the key already exists
- This means
_list.add(val) is never called for the duplicate
- Result:
_list has only 415 entries (indices 0-414) instead of 416 (indices 0-415)
- When a cell references index 415,
_list[415] throws "Unexpected null value"
Steps to Reproduce
1. Create a corrupted Excel file
Windows Excel sometimes creates duplicate sharedString entries when using "Enable Editing and Save". Example XML:
<sst xmlns="..." count="2723" uniqueCount="416">
<si><t>Black</t></si> <!-- Index 143 -->
<!-- ... other entries ... -->
<si><t>Black</t></si> <!-- Index 348: DUPLICATE! -->
<!-- ... other entries ... -->
<si><t>Final Value</t></si> <!-- Index 415 -->
</sst>
2. Create a cell referencing the last index
<worksheet>
<sheetData>
<row r="87">
<c r="B87" t="s">
<v>415</v> <!-- References index 415 -->
</c>
</row>
</sheetData>
</worksheet>
3. Try to parse the file
final excel = Excel.decodeBytes(fileBytes);
// Error: Unexpected null value
4. Debug Output
📊 XML contains 416 <si> elements
⚠️ DUPLICATE SKIPPED at XML index 348: String value = "Black" (already exists at parser index 143)
📊 Parser loaded 415 sharedStrings (indices 0-414)
❌ Cell B87: SharedString at index 415 is NULL! Total sharedStrings: 415
Expected Behavior
All 416 <si> elements from the XML should be loaded into _list, even if some have duplicate text values. The _list should maintain the exact same indexing as the XML to ensure cells can reference any index 0-415.
Actual Behavior
Only 415 unique entries are loaded into _list. Any cell referencing index 415 (or any other skipped duplicate index) crashes with "Unexpected null value".
Proposed Fix
Option 1: Always Add to List (Recommended)
Modify _SharedStringsMaintainer.add() to always add elements to _list:
void add(SharedString val, String key) {
// Always add to _list to preserve XML index positions
_list.add(val);
// Track in map for lookups (update if exists, add if new)
if (_map.containsKey(val)) {
_map[val]?.increaseCount();
} else {
_map[val] = _IndexingHolder(_index);
_mapString[key] = val;
}
_index++;
}
Advantages:
- Maintains 1:1 correspondence between XML indices and
_list indices
- Fixes the crash for corrupted files
- No breaking changes to existing API
Considerations:
_list may contain duplicate entries
- Memory usage slightly higher for files with duplicate sharedStrings
- But this accurately reflects the actual Excel file structure
Option 2: Track Original XML Index
Create a separate index mapping to preserve XML positions while deduplicating:
final List<SharedString> _list = <SharedString>[];
final Map<int, int> _xmlIndexToListIndex = <int, int>{}; // New
int _xmlIndex = 0; // New: Track position in XML
void add(SharedString val, String key) {
_map[val]?.increaseCount();
final listIndex = _map.putIfAbsent(val, () {
_mapString[key] = val;
final idx = _list.length;
_list.add(val);
return _IndexingHolder(idx);
}).index;
_xmlIndexToListIndex[_xmlIndex] = listIndex;
_xmlIndex++;
}
SharedString? value(int xmlIndex) {
final listIndex = _xmlIndexToListIndex[xmlIndex];
if (listIndex != null && listIndex < _list.length) {
return _list[listIndex];
}
return null;
}
Impact
High Severity: Files created by Windows Excel's "Enable Editing and Save" feature may contain duplicate sharedString entries, causing the parser to crash with "Unexpected null value".
Workaround
Pre-process the Excel file to make duplicate entries unique before parsing:
// Add invisible zero-width space to duplicates
final uniqueText = originalText + '\u200B' + index.toString();
Test Case
test('should load all sharedString entries including duplicates', () {
// Create Excel file with duplicate sharedStrings
final archive = Archive();
final sharedStringsXml = '''
<sst xmlns="..." count="3" uniqueCount="3">
<si><t>Value1</t></si>
<si><t>Value2</t></si>
<si><t>Value1</t></si> <!-- Duplicate of index 0 -->
</sst>
''';
// ... add to archive and encode ...
final excel = Excel.decodeBytes(bytes);
// Should load all 3 entries
expect(excel._sharedStrings._list.length, 3);
expect(excel._sharedStrings.value(0)?.stringValue, 'Value1');
expect(excel._sharedStrings.value(1)?.stringValue, 'Value2');
expect(excel._sharedStrings.value(2)?.stringValue, 'Value1');
});
Additional Context
- Excel files use sharedStrings to avoid storing duplicate text values multiple times
- However, Windows Excel sometimes creates malformed files with actual duplicate entries in the XML
- The parser should handle these gracefully rather than crashing
- The
count attribute in <sst> represents total string usage across all cells, not unique strings
- The
uniqueCount attribute should represent unique values, but Windows Excel sometimes sets this incorrectly
Environment
- Dart SDK: 3.x
- Flutter: 3.x
- Platform: Web, Android, iOS, Windows, macOS, Linux
- File Source: Excel files created/edited by Microsoft Excel on Windows
Related Files
lib/src/sharedStrings/shared_strings.dart (main issue)
lib/src/parser/parse.dart (calls add() method)
lib/src/excel.dart (uses _sharedStrings)
References
Bug Report: Excel Package Skips Duplicate SharedString Entries
Package Information
excel(https://pub.dev/packages/excel)Summary
The
_SharedStringsMaintainer.add()method usesputIfAbsent()which skips adding duplicate string values to the_list. This causes index mismatches when cells reference sharedString indices that were skipped, resulting in "Unexpected null value" errors.Root Cause
File:
lib/src/sharedStrings/shared_strings.dartMethod:
_SharedStringsMaintainer.add()Current Implementation (Buggy)
The Problem
sharedStrings.xml, the parser iterates through all<si>elements (e.g., 416 elements)_SharedStringsMaintainer.add()putIfAbsent()does not execute the closure because the key already exists_list.add(val)is never called for the duplicate_listhas only 415 entries (indices 0-414) instead of 416 (indices 0-415)_list[415]throws "Unexpected null value"Steps to Reproduce
1. Create a corrupted Excel file
Windows Excel sometimes creates duplicate sharedString entries when using "Enable Editing and Save". Example XML:
2. Create a cell referencing the last index
3. Try to parse the file
4. Debug Output
Expected Behavior
All 416
<si>elements from the XML should be loaded into_list, even if some have duplicate text values. The_listshould maintain the exact same indexing as the XML to ensure cells can reference any index 0-415.Actual Behavior
Only 415 unique entries are loaded into
_list. Any cell referencing index 415 (or any other skipped duplicate index) crashes with "Unexpected null value".Proposed Fix
Option 1: Always Add to List (Recommended)
Modify
_SharedStringsMaintainer.add()to always add elements to_list:Advantages:
_listindicesConsiderations:
_listmay contain duplicate entriesOption 2: Track Original XML Index
Create a separate index mapping to preserve XML positions while deduplicating:
Impact
High Severity: Files created by Windows Excel's "Enable Editing and Save" feature may contain duplicate sharedString entries, causing the parser to crash with "Unexpected null value".
Workaround
Pre-process the Excel file to make duplicate entries unique before parsing:
Test Case
Additional Context
countattribute in<sst>represents total string usage across all cells, not unique stringsuniqueCountattribute should represent unique values, but Windows Excel sometimes sets this incorrectlyEnvironment
Related Files
lib/src/sharedStrings/shared_strings.dart(main issue)lib/src/parser/parse.dart(callsadd()method)lib/src/excel.dart(uses_sharedStrings)References