## Summary Optimizes EPUB metadata indexing for large books (2000+ chapters) from ~30 minutes to ~50 seconds by replacing O(n²) algorithms with O(n log n) hash-indexed lookups. Fixes #134 ## Problem Three phases had O(n²) complexity due to nested loops: | Phase | Operation | Before (2768 chapters) | |-------|-----------|------------------------| | OPF Pass | For each spine ref, scan all manifest items | ~25 min | | TOC Pass | For each TOC entry, scan all spine items | ~5 min | | buildBookBin | For each spine item, scan ZIP central directory | ~8.4 min | Total: **~30+ minutes** for first-time indexing of large EPUBs. ## Solution Replace linear scans with sorted hash indexes + binary search: - **OPF Pass**: Build `{hash(id), len, offset}` index from manifest, binary search for each spine ref - **TOC Pass**: Build `{hash(href), len, spineIndex}` index from spine, binary search for each TOC entry - **buildBookBin**: New `ZipFile::fillUncompressedSizes()` API - single ZIP central directory scan with batch hash matching All indexes use FNV-1a hashing with length as secondary key to minimize collisions. Indexes are freed immediately after each phase. ## Results **Shadow Slave EPUB (2768 chapters):** | Phase | Before | After | Speedup | |-------|--------|-------|---------| | OPF pass | ~25 min | 10.8 sec | ~140x | | TOC pass | ~5 min | 4.7 sec | ~60x | | buildBookBin | 506 sec | 34.6 sec | ~15x | | **Total** | **~30+ min** | **~50 sec** | **~36x** | **Normal EPUB (87 chapters):** 1.7 sec - no regression. ## Memory Peak temporary memory during indexing: - OPF index: ~33KB (2770 items × 12 bytes) - TOC index: ~33KB (2768 items × 12 bytes) - ZIP batch: ~44KB (targets + sizes arrays) All indexes cleared immediately after each phase. No OOM risk on ESP32-C3. ## Note on Threshold All optimizations are gated by `LARGE_SPINE_THRESHOLD = 400` to preserve existing behavior for small books. However, the algorithms work correctly for any book size and are faster even for small books: | Book Size | Old O(n²) | New O(n log n) | Improvement | |-----------|-----------|----------------|-------------| | 10 ch | 100 ops | 50 ops | 2x | | 100 ch | 10K ops | 800 ops | 12x | | 400 ch | 160K ops | 4K ops | 40x | If preferred, the threshold could be removed to use the optimized path universally. ## Testing - [x] Shadow Slave (2768 chapters): 50s first-time indexing, loads and navigates correctly - [x] Normal book (87 chapters): 1.7s indexing, no regression - [x] Build passes - [x] clang-format passes ## Files Changed - `lib/Epub/Epub/parsers/ContentOpfParser.h/.cpp` - OPF manifest index - `lib/Epub/Epub/BookMetadataCache.h/.cpp` - TOC index + batch size lookup - `lib/ZipFile/ZipFile.h/.cpp` - New `fillUncompressedSizes()` API - `lib/Epub/Epub.cpp` - Timing logs <details> <summary><b>Algorithm Details</b> (click to expand)</summary> ### Phase 1: OPF Pass - Manifest to Spine Lookup **Problem**: Each `<itemref idref="ch001">` in spine must find matching `<item id="ch001" href="...">` in manifest. ``` OLD: For each of 2768 spine refs, scan all 2770 manifest items = 7.6M string comparisons NEW: While parsing manifest, build index: { hash("ch001"), len=5, file_offset=120 } Sort index, then binary search for each spine ref: 2768 × log₂(2770) ≈ 2768 × 11 = 30K comparisons ``` ### Phase 2: TOC Pass - TOC Entry to Spine Index Lookup **Problem**: Each TOC entry with `href="chapter0001.xhtml"` must find its spine index. ``` OLD: For each of 2768 TOC entries, scan all 2768 spine entries = 7.6M string comparisons NEW: At beginTocPass(), read spine once and build index: { hash("OEBPS/chapter0001.xhtml"), len=25, spineIndex=0 } Sort index, binary search for each TOC entry: 2768 × log₂(2768) ≈ 30K comparisons Clear index at endTocPass() to free memory. ``` ### Phase 3: buildBookBin - ZIP Size Lookup **Problem**: Need uncompressed file size for each spine item (for reading progress). Sizes are in ZIP central directory. ``` OLD: For each of 2768 spine items, scan ZIP central directory (2773 entries) = 7.6M filename reads + string comparisons Time: 506 seconds NEW: Step 1: Build targets from spine { hash("OEBPS/chapter0001.xhtml"), len=25, index=0 } Sort by (hash, len) Step 2: Single pass through ZIP central directory For each entry: - Compute hash ON THE FLY (no string allocation) - Binary search targets - If match: sizes[target.index] = uncompressedSize Step 3: Use sizes array directly (O(1) per spine item) Total: 2773 entries × log₂(2768) ≈ 33K comparisons Time: 35 seconds ``` ### Why Hash + Length? Using 64-bit FNV-1a hash + string length as a composite key: - Collision probability: ~1 in 2⁶⁴ × typical_path_lengths - No string storage needed in index (just 12-16 bytes per entry) - Integer comparisons are faster than string comparisons - Verification on match handles the rare collision case </details> --- _AI-assisted development. All changes tested on hardware._ |
||
|---|---|---|
| .github | ||
| bin | ||
| docs | ||
| include | ||
| lib | ||
| open-x4-sdk@bd4e670750 | ||
| scripts | ||
| src | ||
| test | ||
| .clang-format | ||
| .clangd | ||
| .gitignore | ||
| .gitmodules | ||
| LICENSE | ||
| partitions.csv | ||
| platformio.ini | ||
| README.md | ||
| USER_GUIDE.md | ||
CrossPoint Reader
Firmware for the Xteink X4 e-paper display reader (unaffiliated with Xteink). Built using PlatformIO and targeting the ESP32-C3 microcontroller.
CrossPoint Reader is a purpose-built firmware designed to be a drop-in, fully open-source replacement for the official Xteink firmware. It aims to match or improve upon the standard EPUB reading experience.
Motivation
E-paper devices are fantastic for reading, but most commercially available readers are closed systems with limited customisation. The Xteink X4 is an affordable, e-paper device, however the official firmware remains closed. CrossPoint exists partly as a fun side-project and partly to open up the ecosystem and truely unlock the device's potential.
CrossPoint Reader aims to:
- Provide a fully open-source alternative to the official firmware.
- Offer a document reader capable of handling EPUB content on constrained hardware.
- Support customisable font, layout, and display options.
- Run purely on the Xteink X4 hardware.
This project is not affiliated with Xteink; it's built as a community project.
Features & Usage
- EPUB parsing and rendering (EPUB 2 and EPUB 3)
- Image support within EPUB
- Saved reading position
- File explorer with file picker
- Basic EPUB picker from root directory
- Support nested folders
- EPUB picker with cover art
- Custom sleep screen
- Cover sleep screen
- Wifi book upload
- Wifi OTA updates
- Configurable font, layout, and display options
- User provided fonts
- Full UTF support
- Screen rotation
Multi-language support: Read EPUBs in various languages, including English, Spanish, French, German, Italian, Portuguese, Russian, Ukrainian, Polish, Swedish, Norwegian, and more.
See the user guide for instructions on operating CrossPoint.
Installing
Web (latest firmware)
- Connect your Xteink X4 to your computer via USB-C
- Go to https://xteink.dve.al/ and click "Flash CrossPoint firmware"
To revert back to the official firmware, you can flash the latest official firmware from https://xteink.dve.al/, or swap back to the other partition using the "Swap boot partition" button here https://xteink.dve.al/debug.
Web (specific firmware version)
- Connect your Xteink X4 to your computer via USB-C
- Download the
firmware.binfile from the release of your choice via the releases page - Go to https://xteink.dve.al/ and flash the firmware file using the "OTA fast flash controls" section
To revert back to the official firmware, you can flash the latest official firmware from https://xteink.dve.al/, or swap back to the other partition using the "Swap boot partition" button here https://xteink.dve.al/debug.
Manual
See Development below.
Development
Prerequisites
- PlatformIO Core (
pio) or VS Code + PlatformIO IDE - Python 3.8+
- USB-C cable for flashing the ESP32-C3
- Xteink X4
Checking out the code
CrossPoint uses PlatformIO for building and flashing the firmware. To get started, clone the repository:
git clone --recursive https://github.com/daveallie/crosspoint-reader
# Or, if you've already cloned without --recursive:
git submodule update --init --recursive
Flashing your device
Connect your Xteink X4 to your computer via USB-C and run the following command.
pio run --target upload
Internals
CrossPoint Reader is pretty aggressive about caching data down to the SD card to minimise RAM usage. The ESP32-C3 only has ~380KB of usable RAM, so we have to be careful. A lot of the decisions made in the design of the firmware were based on this constraint.
Data caching
The first time chapters of a book are loaded, they are cached to the SD card. Subsequent loads are served from the
cache. This cache directory exists at .crosspoint on the SD card. The structure is as follows:
.crosspoint/
├── epub_12471232/ # Each EPUB is cached to a subdirectory named `epub_<hash>`
│ ├── progress.bin # Stores reading progress (chapter, page, etc.)
│ ├── cover.bmp # Book cover image (once generated)
│ ├── book.bin # Book metadata (title, author, spine, table of contents, etc.)
│ └── sections/ # All chapter data is stored in the sections subdirectory
│ ├── 0.bin # Chapter data (screen count, all text layout info, etc.)
│ ├── 1.bin # files are named by their index in the spine
│ └── ...
│
└── epub_189013891/
Deleting the .crosspoint directory will clear the entire cache.
Due the way it's currently implemented, the cache is not automatically cleared when a book is deleted and moving a book file will use a new cache directory, resetting the reading progress.
For more details on the internal file structures, see the file formats document.
Contributing
Contributions are very welcome!
If you're looking for a way to help out, take a look at the ideas discussion board. If there's something there you'd like to work on, leave a comment so that we can avoid duplicated effort.
To submit a contribution:
- Fork the repo
- Create a branch (
feature/dithering-improvement) - Make changes
- Submit a PR
CrossPoint Reader is not affiliated with Xteink or any manufacturer of the X4 hardware.
Huge shoutout to diy-esp32-epub-reader by atomic14, which was a project I took a lot of inspiration from as I was making CrossPoint.
