- Renamed `getIndentWidth` to `getTextAdvanceX`
- Collapsed `Style` and `BlockStyle` into a single struct, and switched to a bitflag setup for determining font style in `EpdFontFamily::Style`, including underlined text
- Added caching for parsed CSS rules
- Reverted changes for fixing spurious spaces
- Skipped loading CSS on Sleep and HomeScreen activities, since we only need BookMetadata and the cover image
- Reverted changes to BookMetadataCache, since we don't need to cache the individual CSS files and can instead use the parsed CSS rules (and the new cache file for those)
- Switched intermediary values to direct assignment in `CssParser.cpp`
- Added functions in `BlockStyle.h` to convert a `CssStyle` directly into a `BlockStyle`, and to combine multiple `BlockStyle`s for nested elements so that a child inherits the parent's style wherever its own is unspecified (see the sketch after this list)
- Updated names of variables in `CssStyle` to match those of the CSS they represent (e.g. alignment -> textAlign, indent -> textIndent)
- General cleanup and simplification of the code
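For the bitflag style and the parent/child style merge mentioned above, here is a minimal sketch of the intended shape; the type names, fields, and the `combine` helper are illustrative assumptions, not the exact code in `BlockStyle.h`.

```cpp
// Illustrative sketch only -- names and fields are assumptions, not the real API.
#include <cstdint>
#include <optional>

namespace EpdFontFamily {
// Bitflags so BOLD, ITALIC, and UNDERLINE can be combined freely.
enum Style : uint8_t {
  REGULAR   = 0,
  BOLD      = 1 << 0,
  ITALIC    = 1 << 1,
  UNDERLINE = 1 << 2,
};
}  // namespace EpdFontFamily

struct BlockStyle {
  std::optional<uint8_t> fontStyle;  // unset -> inherit from the parent element
  std::optional<int> textIndent;

  // The child wins wherever it specifies a value; otherwise the parent's value is kept.
  static BlockStyle combine(const BlockStyle& parent, const BlockStyle& child) {
    BlockStyle out;
    out.fontStyle = child.fontStyle.has_value() ? child.fontStyle : parent.fontStyle;
    out.textIndent = child.textIndent.has_value() ? child.textIndent : parent.textIndent;
    return out;
  }
};
```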
## Summary
Optimizes EPUB metadata indexing for large books (2000+ chapters) from
~30 minutes to ~50 seconds by replacing O(n²) algorithms with O(n log n)
hash-indexed lookups.
Fixes #134
## Problem
Three phases had O(n²) complexity due to nested loops:
| Phase | Operation | Before (2768 chapters) |
|-------|-----------|------------------------|
| OPF Pass | For each spine ref, scan all manifest items | ~25 min |
| TOC Pass | For each TOC entry, scan all spine items | ~5 min |
| buildBookBin | For each spine item, scan ZIP central directory | ~8.4 min |
Total: **~30+ minutes** for first-time indexing of large EPUBs.
## Solution
Replace linear scans with sorted hash indexes + binary search:
- **OPF Pass**: Build `{hash(id), len, offset}` index from manifest,
binary search for each spine ref
- **TOC Pass**: Build `{hash(href), len, spineIndex}` index from spine,
binary search for each TOC entry
- **buildBookBin**: New `ZipFile::fillUncompressedSizes()` API - single
ZIP central directory scan with batch hash matching
All indexes use FNV-1a hashing with the string length as a secondary key to minimize
collisions. Indexes are freed immediately after each phase.
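A minimal sketch of that index-plus-binary-search pattern follows; the struct layout and helper names are illustrative assumptions, not the exact code.

```cpp
// Sketch of the sorted hash index; names and layout are illustrative only.
#include <algorithm>
#include <cstdint>
#include <vector>

struct IndexEntry {
  uint64_t hash;   // FNV-1a of the manifest id / spine href
  uint16_t len;    // string length as a secondary key
  uint32_t value;  // file offset or spine index, depending on the phase
};

static bool entryLess(const IndexEntry& a, const IndexEntry& b) {
  return a.hash != b.hash ? a.hash < b.hash : a.len < b.len;
}

// Build the index while parsing, sort it once, then binary search per lookup.
const IndexEntry* findEntry(const std::vector<IndexEntry>& index, uint64_t hash, uint16_t len) {
  const IndexEntry key{hash, len, 0};
  auto it = std::lower_bound(index.begin(), index.end(), key, entryLess);
  if (it != index.end() && it->hash == hash && it->len == len) return &*it;
  return nullptr;  // caller treats this as a miss
}
```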
## Results
**Shadow Slave EPUB (2768 chapters):**
| Phase | Before | After | Speedup |
|-------|--------|-------|---------|
| OPF pass | ~25 min | 10.8 sec | ~140x |
| TOC pass | ~5 min | 4.7 sec | ~60x |
| buildBookBin | 506 sec | 34.6 sec | ~15x |
| **Total** | **~30+ min** | **~50 sec** | **~36x** |
**Normal EPUB (87 chapters):** 1.7 sec - no regression.
## Memory
Peak temporary memory during indexing:
- OPF index: ~33KB (2770 items × 12 bytes)
- TOC index: ~33KB (2768 items × 12 bytes)
- ZIP batch: ~44KB (targets + sizes arrays)
All indexes cleared immediately after each phase. No OOM risk on
ESP32-C3.
## Note on Threshold
All optimizations are gated by `LARGE_SPINE_THRESHOLD = 400` to preserve
existing behavior for small books. However, the algorithms work
correctly for any book size and are faster even for small books:
| Book Size | Old O(n²) | New O(n log n) | Improvement |
|-----------|-----------|----------------|-------------|
| 10 ch | 100 ops | 50 ops | 2x |
| 100 ch | 10K ops | 800 ops | 12x |
| 400 ch | 160K ops | 4K ops | 40x |
If preferred, the threshold could be removed to use the optimized path
universally.
## Testing
- [x] Shadow Slave (2768 chapters): 50s first-time indexing, loads and
navigates correctly
- [x] Normal book (87 chapters): 1.7s indexing, no regression
- [x] Build passes
- [x] clang-format passes
## Files Changed
- `lib/Epub/Epub/parsers/ContentOpfParser.h/.cpp` - OPF manifest index
- `lib/Epub/Epub/BookMetadataCache.h/.cpp` - TOC index + batch size
lookup
- `lib/ZipFile/ZipFile.h/.cpp` - New `fillUncompressedSizes()` API
- `lib/Epub/Epub.cpp` - Timing logs
<details>
<summary><b>Algorithm Details</b> (click to expand)</summary>
### Phase 1: OPF Pass - Manifest to Spine Lookup
**Problem**: Each `<itemref idref="ch001">` in spine must find matching
`<item id="ch001" href="...">` in manifest.
```
OLD: For each of 2768 spine refs, scan all 2770 manifest items
= 7.6M string comparisons
NEW: While parsing manifest, build index:
{ hash("ch001"), len=5, file_offset=120 }
Sort index, then binary search for each spine ref:
2768 × log₂(2770) ≈ 2768 × 11 = 30K comparisons
```
### Phase 2: TOC Pass - TOC Entry to Spine Index Lookup
**Problem**: Each TOC entry with `href="chapter0001.xhtml"` must find
its spine index.
```
OLD: For each of 2768 TOC entries, scan all 2768 spine entries
= 7.6M string comparisons
NEW: At beginTocPass(), read spine once and build index:
{ hash("OEBPS/chapter0001.xhtml"), len=25, spineIndex=0 }
Sort index, binary search for each TOC entry:
2768 × log₂(2768) ≈ 30K comparisons
Clear index at endTocPass() to free memory.
```
### Phase 3: buildBookBin - ZIP Size Lookup
**Problem**: Need uncompressed file size for each spine item (for
reading progress). Sizes are in ZIP central directory.
```
OLD: For each of 2768 spine items, scan ZIP central directory (2773 entries)
= 7.6M filename reads + string comparisons
Time: 506 seconds
NEW:
Step 1: Build targets from spine
{ hash("OEBPS/chapter0001.xhtml"), len=25, index=0 }
Sort by (hash, len)
Step 2: Single pass through ZIP central directory
For each entry:
- Compute hash ON THE FLY (no string allocation)
- Binary search targets
- If match: sizes[target.index] = uncompressedSize
Step 3: Use sizes array directly (O(1) per spine item)
Total: 2773 entries × log₂(2768) ≈ 33K comparisons
Time: 35 seconds
```
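The PR does not show the `fillUncompressedSizes()` signature, so the following is only a hypothetical sketch of the single-pass batch lookup described above; the names and types are assumptions.

```cpp
// Hypothetical sketch -- the real ZipFile::fillUncompressedSizes() differs in detail.
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

struct SizeTarget {
  uint64_t hash;   // FNV-1a of the spine item's path inside the ZIP
  uint16_t len;
  uint32_t index;  // slot in the sizes array (spine position)
};

struct CentralDirEntry {  // stand-in for one central directory record
  std::string name;
  uint32_t uncompressedSize;
};

// targets must be sorted by (hash, len); sizes must have one slot per spine item.
void fillUncompressedSizes(const std::vector<CentralDirEntry>& centralDir,
                           const std::vector<SizeTarget>& targets,
                           std::vector<uint32_t>& sizes) {
  for (const auto& entry : centralDir) {
    // Hash the entry name on the fly (64-bit FNV-1a, see the next section) -- no string copies.
    uint64_t h = 14695981039346656037ULL;
    for (unsigned char c : entry.name) { h ^= c; h *= 1099511628211ULL; }

    const SizeTarget key{h, static_cast<uint16_t>(entry.name.size()), 0};
    auto it = std::lower_bound(targets.begin(), targets.end(), key,
                               [](const SizeTarget& a, const SizeTarget& b) {
                                 return a.hash != b.hash ? a.hash < b.hash : a.len < b.len;
                               });
    if (it != targets.end() && it->hash == key.hash && it->len == key.len) {
      sizes[it->index] = entry.uncompressedSize;
    }
  }
}
```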
### Why Hash + Length?
Using 64-bit FNV-1a hash + string length as a composite key:
- Collision probability is on the order of 1 in 2⁶⁴, further reduced by the length check
- No string storage needed in index (just 12-16 bytes per entry)
- Integer comparisons are faster than string comparisons
- Verification on match handles the rare collision case
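For reference, a standard 64-bit FNV-1a fits in a few lines; the PR does not show its exact implementation, but the constants below are the published FNV-1a parameters.

```cpp
#include <cstddef>
#include <cstdint>

// Standard 64-bit FNV-1a (offset basis and prime are the published constants).
uint64_t fnv1a(const char* data, size_t len) {
  uint64_t hash = 14695981039346656037ULL;  // FNV offset basis
  for (size_t i = 0; i < len; ++i) {
    hash ^= static_cast<uint8_t>(data[i]);
    hash *= 1099511628211ULL;               // FNV prime
  }
  return hash;
}
```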
</details>
---
_AI-assisted development. All changes tested on hardware._
* origin:
fix: truncate chapter names that are too long (#422)
feat: dict based Hyphenation (#305)
fix: render U+FFFD replacement character instead of ? (#366)
fix: Invert colors on home screen cover overlay when recent book is selected (#390)
Adds KOReader Sync support (#232)
feat: Change keyboard "caps" to "shift" & Wrap Keyboard (#377)
fix: XTC 1-bit thumb BMP polarity inversion (#373)
## Summary
* Adds (optional) hyphenation for English, French, German, and Russian
## Additional Context
* Included hyphenation dictionaries add approximately 280 KB to the flash usage (German alone takes 200 KB)
* Trie-encoded dictionaries are adopted from the hypher project (https://github.com/typst/hypher)
* Soft hyphens (and other explicit hyphens) take precedence over dict-based hyphenation (see the sketch below). Overall, the hyphenation rules are quite aggressive, as I believe that makes more sense on our smaller screen.
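A minimal sketch of that precedence rule; the function names are assumptions, not the actual hyphenation code.

```cpp
// Illustrative only -- names are assumptions, not the real hyphenation API.
#include <string>
#include <vector>

constexpr const char* kSoftHyphenUtf8 = "\xC2\xAD";  // U+00AD SOFT HYPHEN

std::vector<size_t> dictionaryBreakpoints(const std::string& word) {
  return {};  // placeholder for the trie-based pattern lookup
}

std::vector<size_t> hyphenationPoints(const std::string& word) {
  // Explicit soft hyphens win: if the author marked break points, use only those.
  std::vector<size_t> explicitBreaks;
  for (size_t pos = word.find(kSoftHyphenUtf8); pos != std::string::npos;
       pos = word.find(kSoftHyphenUtf8, pos + 2)) {
    explicitBreaks.push_back(pos);
  }
  if (!explicitBreaks.empty()) return explicitBreaks;
  return dictionaryBreakpoints(word);  // otherwise fall back to the dictionary
}
```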
---------
Co-authored-by: Dave Allie <dave@daveallie.com>
## Summary
- Adds KOReader progress sync integration, allowing CrossPoint to sync
reading positions with other
KOReader-compatible devices
- Obfuscates stored credentials with simple XOR (see the sketch after this list)
- Uses KOReader's partial MD5 document hashing for cross-device book
matching
- Syncs position via percentage with estimated XPath for compatibility
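The XOR obfuscation is presumably along these lines; this is a sketch under assumed names, the actual key handling isn't shown in the PR, and XOR only hides credentials from casual inspection rather than encrypting them.

```cpp
// Illustrative sketch only; applying the same function twice restores the input.
#include <string>

std::string xorObfuscate(const std::string& input, const std::string& key) {
  if (key.empty()) return input;
  std::string out = input;
  for (size_t i = 0; i < out.size(); ++i) {
    out[i] ^= key[i % key.size()];
  }
  return out;
}
```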
## Features
- Settings: KOReader Username, Password, and Authenticate options
- Sync from chapters menu: "Sync Progress" option appears when
credentials are configured
- Bidirectional sync: Can apply remote progress or upload local progress
---------
Co-authored-by: Dave Allie <dave@daveallie.com>
## Summary
* **What is the goal of this PR?** Display the book cover image in the **"Continue Reading"** card on the home screen, with fast navigation using framebuffer caching.
* **What changes are included?**
- Display book cover image in the "Continue Reading" card on home screen
- Load cover from cached BMP (same as sleep screen cover)
- Add framebuffer store/restore functions (`copyStoredBwBuffer`,
`freeStoredBwBuffer`) for fast navigation after initial render
- Fix `drawBitmap` scaling bug: apply scale to offset only, not to base
coordinates
- Add white text boxes behind title/author/continue reading label for
readability on cover
- Support both EPUB and XTC file cover images
- Increase HomeActivity task stack size from 2048 to 4096 for cover
image rendering
## Additional Context
- Performance: First render loads cover from SD card (~800ms),
subsequent navigation uses cached framebuffer (~instant)
- Memory: Framebuffer cache uses ~48KB (6 chunks × 8KB) while on home
screen, freed on exit
- Fallback: If cover image is not available, falls back to standard
text-only display
- The `drawBitmap` fix corrects a bug where `screenY = (y + offset) * scale` was incorrectly scaling the base coordinates. It now correctly uses `screenY = y + (offset * scale)`.
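As a concrete illustration of that coordinate fix; the variable and function names here are assumptions, not the actual `drawBitmap` internals.

```cpp
// Illustrative only -- not the real drawBitmap() internals.
int mapY(int y, int offsetY, int scale) {
  // Bug: return (y + offsetY) * scale;  // scaled the base coordinate too
  return y + (offsetY * scale);          // fix: scale only the offset
}
```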
## Summary
- The nav file in an EPUB 3 is an HTML file with relative hrefs
- If this file exists anywhere other than alongside the content.opf file, navigating in the book will fail
- Bump the book cache version to rebuild potentially broken books
## Additional Context
- Fixes https://github.com/daveallie/crosspoint-reader/issues/264
---
### AI Usage
While CrossPoint doesn't have restrictions on AI tools in contributing,
please be transparent about their usage as it
helps set the right context for reviewers.
Did you use AI tools to help write this code?
- [ ] Yes
- [ ] Partially
- [x] No
## Summary
* **What is the goal of this PR?** Add EPUB 3 support by implementing
native navigation document (nav.xhtml) parsing with NCX fallback. Fixes #143.
* **What changes are included?**
- New `TocNavParser` for parsing EPUB 3 HTML5 navigation documents
(`<nav epub:type="toc">`)
- Detection of nav documents via the `properties="nav"` attribute in the OPF manifest
- Fallback logic: try the EPUB 3 nav first, fall back to NCX (EPUB 2) if unavailable (see the sketch after this list)
- Graceful degradation: books without any TOC now load with a warning
instead of failing
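A minimal Expat-style sketch of that nav-document detection; the handler and struct names are illustrative, and the real parser code differs in structure.

```cpp
// Illustrative sketch of spotting <item properties="nav" .../> in the OPF manifest.
#include <cstring>
#include <string>
#include <expat.h>

struct OpfState {
  std::string navHref;  // href of the EPUB 3 nav document, empty if none found
};

static void XMLCALL onStartElement(void* userData, const XML_Char* name, const XML_Char** atts) {
  auto* state = static_cast<OpfState*>(userData);
  if (std::strcmp(name, "item") != 0) return;
  const char* href = nullptr;
  bool isNav = false;
  for (int i = 0; atts[i]; i += 2) {
    if (std::strcmp(atts[i], "href") == 0) href = atts[i + 1];
    // properties is a space-separated list, e.g. properties="nav scripted"
    if (std::strcmp(atts[i], "properties") == 0 && std::strstr(atts[i + 1], "nav")) isNav = true;
  }
  if (isNav && href) state->navHref = href;
}
// If navHref is still empty after the OPF pass, fall back to the NCX (EPUB 2) TOC.
```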
## Additional Context
* The implementation follows the existing streaming XML parser pattern
using Expat to minimize RAM usage on the ESP32-C3
* EPUB 3 books that include both nav.xhtml and toc.ncx will prefer the
nav document (per EPUB 3 spec recommendation)
* No breaking changes - existing EPUB 2 books continue to work as before
* Tested on examples from
https://idpf.github.io/epub3-samples/30/samples.html
## Summary
* Redesigned home screen with a big option to continue reading and slightly nicer options for navigating to core sections
* Attempt to use the cached EPUB details (title, author) if they exist,
otherwise fall back to file name
* Adjusted button hints on home screen, removed Back option and changed
left/right to up/down
## Additional Context
* Core of this work comes from @ChandhokTannay in
1d36a86ef1
This parses the guide section in the content.opf for text/start
references and jumps to this on first open of the book.
Currently, this behavior will be repeated in case the reader manually
jumps to Chapter 0 and then re-opens the book. IMO, this is an
acceptable edge case (for which I couldn't see a good fix other than to
drag a "first open" boolean around).
---------
Co-authored-by: Sam Davis <sam@sjd.co>
Co-authored-by: Dave Allie <dave@daveallie.com>
## Summary
* Swap to updated SDCardManager which uses SdFat
* Add exFAT support
* Swap to using FsFile everywhere
* Use newly exposed `SdMan` macro to get to static instance of
SDCardManager
* Move a bunch of FsHelpers up to SDCardManager
## Summary
* Use single unified cache file for book spine, table of contents, and
core metadata (title, author, cover image)
* Use a new temp item store file in OPF parsing to store items to be rescanned when parsing the spine
* This avoids holding these items in memory
* Use new toc.bin.tmp and spine.bin.tmp to build out partial toc / spine
data as part of parsing content.opf and the NCX file
* These files are re-read multiple times to ultimately build book.bin
## Additional Context
* Spec for file format included below as an image
* This should help with:
* #10
* #60
* #99
## Summary
* Extract EPUB TOC into temp file before parsing
* Streaming ZIP -> XML parser uses up a lot of memory as we're
allocating inflation buffers while also holding a few copies of the
buffer in different forms
* Instead, by streaming the inflated file down to the SD card (like we do for HTML parsing), we can lower memory usage
## Additional Context
* This should help with
https://github.com/daveallie/crosspoint-reader/issues/60 and
https://github.com/daveallie/crosspoint-reader/issues/10. It won't
remove that class of issues completely, but it will allow many more
books to be opened.
Still a bit raw, but gets the time required to determine the size of
each chapter (for reading progress) down from ~25ms to 0-1ms.
This is done by keeping the zipArchive open (so simple ;)).
Probably we don't need to cache the spine sizes anymore then...
---------
Co-authored-by: Dave Allie <dave@daveallie.com>
## Problem
Three Epub getter functions can throw exceptions:
- `getCumulativeSpineItemSize()`: No bounds check before
`.at(spineIndex)`
- `getSpineItem()`: If spine is empty and index invalid, `.at(0)` throws
- `getTocItem()`: If toc is empty and index invalid, `.at(0)` throws
## Fix
- Add bounds check to `getCumulativeSpineItemSize()`, return 0 on error
- Add empty container checks to `getSpineItem()` and `getTocItem()`
- Use static fallback objects for safe reference returns on empty
containers
Changed `lib/Epub/Epub.cpp`.
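A minimal sketch of the safe-reference-return pattern described above; the real `getSpineItem()` is a member of `Epub` and differs in detail.

```cpp
// Illustrative only -- shows the static-fallback pattern, not the actual Epub member.
#include <cstddef>
#include <vector>

struct SpineItem { /* href, size, ... */ };

const SpineItem& getSpineItem(const std::vector<SpineItem>& spine, size_t index) {
  static const SpineItem kEmptyItem{};              // safe object to reference when spine is empty
  if (spine.empty()) return kEmptyItem;
  if (index >= spine.size()) return spine.front();  // clamp instead of letting .at() throw
  return spine[index];
}
```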
## Test
- Defensive additions - follows existing bounds check patterns
- No logic changes for valid inputs
- Manual device testing appreciated
## Problem
- `getBookSize()` calls `getCumulativeSpineItemSize(getSpineItemsCount()
- 1)` which passes -1 when spine is empty
- `calculateProgress()` then divides by zero when book size is 0
## Fix
- Return 0 from `getBookSize()` if spine is empty
- Return 0 from `calculateProgress()` if book size is 0
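A minimal sketch of those two guards; the member names and types are simplified assumptions, not the real `Epub` class.

```cpp
// Illustrative only -- simplified stand-in for the real Epub members.
#include <cstddef>
#include <vector>

struct EpubSketch {
  std::vector<size_t> cumulativeSizes;  // cumulative byte size per spine item

  size_t getBookSize() const {
    if (cumulativeSizes.empty()) return 0;  // empty spine: never index with -1
    return cumulativeSizes.back();
  }

  float calculateProgress(size_t bytesRead) const {
    const size_t bookSize = getBookSize();
    if (bookSize == 0) return 0.0f;         // guard the division
    return static_cast<float>(bytesRead) / static_cast<float>(bookSize);
  }
};
```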
## Testing
- Builds successfully with `pio run`
- Affects: `lib/Epub/Epub.cpp`
## Problem
`getSpineIndexForTocIndex()` and `getTocIndexForSpineIndex()` access
`toc[tocIndex]` and `spine[spineIndex]` without validating indices are
within bounds. Malformed EPUBs or edge cases could trigger out-of-bounds
access.
## Fix
Added bounds validation at the start of both functions before accessing
the arrays.
## Testing
- Builds successfully with `pio run`
- Affects: `lib/Epub/Epub.cpp`
## Summary
* Rely on `media-type="application/x-dtbncx+xml"` to find the TOC instead of hardcoded values
## Additional Context
* Most of my EPUBs don't have `id="ncx"` for the TOC file location. I think this media-type is the EPUB standard
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>