Three optimizations for EPUBs with many chapters (e.g. 2768 chapters):
1. OPF idref→href lookup: Build sorted hash index during manifest parsing,
use binary search during spine resolution. Reduces ~4min to ~30-60s.
2. TOC href→spineIndex lookup: Build sorted hash index in beginTocPass(),
use binary search in createTocEntry(). Reduces ~4min to ~30-60s.
3. ZIP central-dir cursor: Resume scanning from last position instead of
restarting from beginning. Reduces ~8min to ~1-3min.
All optimizations only activate for large EPUBs (≥400 spine items).
Small books use unchanged code paths.
Memory impact: ~33KB + ~39KB temporary during indexing, freed after.
Expected total: ~17min → ~3-5min for Shadow Slave (2768 chapters).
Also adds phase timing logs for performance measurement.
## Summary
* Adds (optional) Hyphenation for English, French, German, Russian
languages
## Additional Context
* Included hyphenation dictionaries add approximately 280kb to the flash
usage (German alone takes 200kb)
* Trie encoded dictionaries are adopted from hypher project
(https://github.com/typst/hypher)
* Soft hyphens (and other explicit hyphens) take precedence over
dict-based hyphenation. Overall, the hyphenation rules are quite
aggressive, as I believe it makes more sense on our smaller screen.
---------
Co-authored-by: Dave Allie <dave@daveallie.com>
## Summary
* **What is the goal of this PR?** Add EPUB 3 support by implementing
native navigation document (nav.xhtml) parsing with NCX fallback,
addressing issue Fixes: #143.
* **What changes are included?**
- New `TocNavParser` for parsing EPUB 3 HTML5 navigation documents
(`<nav epub:type="toc">`)
- Detection of nav documents via `properties="nav"` attribute in OPF
manifest
- Fallback logic: try EPUB 3 nav first, fall back to NCX (EPUB 2) if
unavailable
- Graceful degradation: books without any TOC now load with a warning
instead of failing
## Additional Context
* The implementation follows the existing streaming XML parser pattern
using Expat to minimize RAM usage on the ESP32-C3
* EPUB 3 books that include both nav.xhtml and toc.ncx will prefer the
nav document (per EPUB 3 spec recommendation)
* No breaking changes - existing EPUB 2 books continue to work as before
* Tested on examples from
https://idpf.github.io/epub3-samples/30/samples.html
This parses the guide section in the content.opf for text/start
references and jumps to this on first open of the book.
Currently, this behavior will be repeated in case the reader manually
jumps to Chapter 0 and then re-opens the book. IMO, this is an
acceptable edge case (for which I couldn't see a good fix other than to
drag a "first open" boolean around).
---------
Co-authored-by: Sam Davis <sam@sjd.co>
Co-authored-by: Dave Allie <dave@daveallie.com>
## Summary
* Swap to updated SDCardManager which uses SdFat
* Add exFAT support
* Swap to using FsFile everywhere
* Use newly exposed `SdMan` macro to get to static instance of
SDCardManager
* Move a bunch of FsHelpers up to SDCardManager
## Summary
* Use single unified cache file for book spine, table of contents, and
core metadata (title, author, cover image)
* Use new temp item store file in OPF parsing to store items to be
rescaned when parsing spine
* This avoids us holding these items in memory
* Use new toc.bin.tmp and spine.bin.tmp to build out partial toc / spine
data as part of parsing content.opf and the NCX file
* These files are re-read multiple times to ultimately build book.bin
## Additional Context
* Spec for file format included below as an image
* This should help with:
* #10
* #60
* #99