diff --git a/README.md b/README.md
index f015f718..d59df835 100644
--- a/README.md
+++ b/README.md
@@ -25,7 +25,7 @@ This project is **not affiliated with Xteink**; it's built as a community projec
 
 ## Features & Usage
 
-- [x] EPUB parsing and rendering
+- [x] EPUB parsing and rendering (EPUB 2 and EPUB 3)
 - [ ] Image support within EPUB
 - [x] Saved reading position
 - [x] File explorer with file picker
diff --git a/USER_GUIDE.md b/USER_GUIDE.md
index 26ff1075..1af7e129 100644
--- a/USER_GUIDE.md
+++ b/USER_GUIDE.md
@@ -1,17 +1,16 @@
 # CrossPoint User Guide
 
-Welcome to the **CrossPoint** firmware. This guide outlines the hardware controls, navigation, and reading features of
-the device.
+Welcome to the **CrossPoint** firmware. This guide outlines the hardware controls, navigation, and reading features of the device.
 
 ## 1. Hardware Overview
 
 The device utilises the standard buttons on the Xtink X4 (in the same layout as the manufacturer firmware, by default):
 
 ### Button Layout
-| Location | Buttons |
-|-----------------|--------------------------------------------|
-| **Bottom Edge** | **Back**, **Confirm**, **Left**, **Right** |
-| **Right Side** | **Power**, **Volume Up**, **Volume Down** |
+| Location | Buttons |
+| --------------- | ---------------------------------------------------- |
+| **Bottom Edge** | **Back**, **Confirm**, **Left**, **Right** |
+| **Right Side** | **Power**, **Volume Up**, **Volume Down**, **Reset** |
 
 Button layout can be customized in **[Settings](#35-settings)**.
 
@@ -21,8 +20,9 @@ Button layout can be customized in **[Settings](#35-settings)**.
 
 ### Power On / Off
 
-To turn the device on or off, **press and hold the Power button for half a second**. In **[Settings](#35-settings)** you can configure
-the power button to trigger on a short press instead of a long one.
+To turn the device on or off, **press and hold the Power button for half a second**. In **[Settings](#35-settings)** you can configure the power button to trigger on a short press instead of a long one.
+
+To reboot the device (for example if it's frozen, or after a firmware update), press and release the Reset button, and then hold the Power button for a few seconds.
 
 ### First Launch
 
@@ -37,15 +37,13 @@ Upon turning the device on for the first time, you will be placed on the **[Home
 
 ### 3.1 Home Screen
 
-The Home Screen is the main entry point to the firmware. From here you can navigate to **[Reading Mode](#4-reading-mode)** with the most recently read book, **[Book Selection](#32-book-selection)**,
-**[Settings](#35-settings)**, or the **[File Upload](#34-file-upload-screen)** screen.
+The Home Screen is the main entry point to the firmware. From here you can navigate to **[Reading Mode](#4-reading-mode)** with the most recently read book, **[Book Selection](#32-book-selection)**, **[Settings](#35-settings)**, or the **[File Upload](#34-file-upload-screen)** screen.
 
 ### 3.2 Book Selection
 
 The Book Selection acts as a folder and file browser.
 
-* **Navigate List:** Use **Left** (or **Volume Up**), or **Right** (or **Volume Down**) to move the selection cursor up
-  and down through folders and books.
+* **Navigate List:** Use **Left** (or **Volume Up**), or **Right** (or **Volume Down**) to move the selection cursor up and down through folders and books. You can also long-press these buttons to scroll a full page up or down.
 * **Open Selection:** Press **Confirm** to open a folder or read a selected book.
 
 ### 3.3 Reading Mode
@@ -54,42 +52,46 @@ See [Reading Mode](#4-reading-mode) below for more information.
 
 ### 3.4 File Upload Screen
 
-The File Upload screen allows you to upload new e-books to the device. When you enter the screen, you'll be prompted with
-a WiFi selection dialog and then your X4 will start hosting a web server.
+The File Upload screen allows you to upload new e-books to the device. When you enter the screen, you'll be prompted with a WiFi selection dialog and then your X4 will start hosting a web server.
 
 See the [webserver docs](./docs/webserver.md) for more information on how to connect to the web server and upload files.
 
+> [!TIP]
+> Advanced users can also manage files programmatically or via the command line using `curl`. See the [webserver docs](./docs/webserver.md) for details.
+
 ### 3.5 Settings
 
 The Settings screen allows you to configure the device's behavior. There are a few settings you can adjust:
-- **Sleep Screen**: Which sleep screen to display when the device sleeps, options are:
+- **Sleep Screen**: Which sleep screen to display when the device sleeps:
   - "Dark" (default) - The default dark sleep screen
   - "Light" - The same default sleep screen, on a white background
   - "Custom" - Custom images from the SD card, see [Sleep Screen](#36-sleep-screen) below for more information
   - "Cover" - The book cover image (Note: this is experimental and may not work as expected)
-- **Status Bar**: Configure the status bar displayed while reading, options are:
+- **Status Bar**: Configure the status bar displayed while reading:
   - "None" - No status bar
   - "No Progress" - Show status bar without reading progress
   - "Full" - Show status bar with reading progress
-- **Extra Paragraph Spacing**: If enabled, vertical space will be added between paragraphs in the book, if disabled,
-  paragraphs will not have vertical space between them, but will have first word indentation.
+- **Extra Paragraph Spacing**: If enabled, vertical space will be added between paragraphs in the book. If disabled, paragraphs will not have vertical space between them, but will have first-line indentation.
 - **Short Power Button Click**: Whether to trigger the power button on a short press or a long press.
-- **Reading Orientation**: Set the screen orientation for reading, options are:
+- **Reading Orientation**: Set the screen orientation for reading:
   - "Portrait" (default) - Standard portrait orientation
   - "Landscape CW" - Landscape, rotated clockwise
   - "Inverted" - Portrait, upside down
   - "Landscape CCW" - Landscape, rotated counter-clockwise
-- **Front Button Layout**: Configure the order of the bottom edge buttons, options are:
-  - "Bck, Cnfrm, Lft, Rght" (default) - Back, Confirm, Left, Right
-  - "Lft, Rght, Bck, Cnfrm" - Left, Right, Back, Confirm
-  - "Lft, Bck, Cnfrm, Rght" - Left, Back, Confirm, Right
-- **Side Button Layout**: Swap the order of the volume buttons from Previous/Next to Next/Previous. This change is only in effect when reading.
-- **Reader Font Family**: Choose the font used for reading, options are:
+- **Front Button Layout**: Configure the order of the bottom edge buttons:
+  - Back, Confirm, Left, Right (default)
+  - Left, Right, Back, Confirm
+  - Left, Back, Confirm, Right
+- **Side Button Layout**: Swap the order of the up and down volume buttons from Previous/Next to Next/Previous. This change is only in effect when reading.
+- **Reader Font Family**: Choose the font used for reading:
   - "Bookerly" (default) - Amazon's reading font
   - "Noto Sans" - Google's sans-serif font
   - "Open Dyslexic" - Font designed for readers with dyslexia
-- **Reader Font Size**: Adjust the text size for reading, options are "Small", "Medium", "Large", or "X Large".
-- **Reader Line Spacing**: Adjust the spacing between lines, options are "Tight", "Normal", or "Wide".
+- **Reader Font Size**: Adjust the text size for reading; options are "Small", "Medium", "Large", or "X Large".
+- **Reader Line Spacing**: Adjust the spacing between lines; options are "Tight", "Normal", or "Wide".
+- **Reader Paragraph Alignment**: Set the alignment of paragraphs; options are "Justified" (default), "Left", "Center", or "Right".
+- **Time to Sleep**: Set the duration of inactivity before the device automatically goes to sleep.
+- **Refresh Frequency**: Set how often the screen does a full refresh while reading to reduce ghosting.
 - **Check for updates**: Check for firmware updates over WiFi.
 
 ### 3.6 Sleep Screen
@@ -97,9 +99,7 @@ The Settings screen allows you to configure the device's behavior. There are a f
 You can customize the sleep screen by placing custom images in specific locations on the SD card:
 
 - **Single Image:** Place a file named `sleep.bmp` in the root directory.
-- **Multiple Images:** Create a `sleep` directory in the root of the SD card and place any number of `.bmp` images
-  inside. If images are found in this directory, they will take priority over the `sleep.bmp` file, and one will be
-  randomly selected each time the device sleeps.
+- **Multiple Images:** Create a `sleep` directory in the root of the SD card and place any number of `.bmp` images inside. If images are found in this directory, they will take priority over the `sleep.bmp` file, and one will be randomly selected each time the device sleeps.
 
 > [!NOTE]
 > You'll need to set the **Sleep Screen** setting to **Custom** in order to use these images.
@@ -117,17 +117,19 @@ Once you have opened a book, the button layout changes to facilitate reading.
 
 ### Page Turning
 | Action | Buttons |
-|-------------------|--------------------------------------|
+| ----------------- | ------------------------------------ |
 | **Previous Page** | Press **Left** _or_ **Volume Up** |
 | **Next Page** | Press **Right** _or_ **Volume Down** |
 
+The role of the volume (side) buttons can be swapped in **[Settings](#35-settings)**.
+
 ### Chapter Navigation
 * **Next Chapter:** Press and **hold** the **Right** (or **Volume Down**) button briefly, then release.
 * **Previous Chapter:** Press and **hold** the **Left** (or **Volume Up**) button briefly, then release.
 
 ### System Navigation
 * **Return to Book Selection:** Press **Back** to close the book and return to the **[Book Selection](#32-book-selection)** screen.
-* **Return to Home:** Press and hold **Back** to close the book and return to the **[Home](#31-home-screen)** screen.
+* **Return to Home:** Press and **hold** the **Back** button to close the book and return to the **[Home](#31-home-screen)** screen.
 * **Chapter Menu:** Press **Confirm** to open the **[Table of Contents/Chapter Selection](#5-chapter-selection-screen)**.
 
 ---
@@ -144,7 +146,6 @@ Accessible by pressing **Confirm** while inside a book.
 
 ## 6. Current Limitations & Roadmap
 
-Please note that this firmware is currently in active development. The following features are **not yet supported** but
-are planned for future updates:
+Please note that this firmware is currently in active development. The following features are **not yet supported** but are planned for future updates:
 
 * **Images:** Embedded images in e-books will not render.
diff --git a/docs/webserver.md b/docs/webserver.md
index 2c96b8eb..2285a927 100644
--- a/docs/webserver.md
+++ b/docs/webserver.md
@@ -170,6 +170,40 @@ This is useful for organizing your ebooks by genre, author, or series.
 
 ---
 
+## Command Line File Management
+
+For power users, you can manage files directly from your terminal using `curl` while the device is in File Upload mode.
+
+### Uploading a File
+To upload a file to the root directory, use the following command:
+```bash
+curl -F "file=@book.epub" "http://crosspoint.local/upload?path=/"
+```
+
+* **`-F "file=@filename"`**: Points to the local file on your computer.
+* **`path=/`**: The destination folder on the device SD card.
+
+### Deleting a File
+
+To delete a specific file, provide the full path on the SD card:
+
+```bash
+curl -F "path=/folder/file.epub" "http://crosspoint.local/delete"
+```
+
+### Advanced Flags
+
+For more reliable transfers of large EPUB files, consider adding these flags:
+
+* `-#`: Shows a simple progress bar.
+* `--connect-timeout 30`: Limits how long curl waits to establish a connection (in seconds).
+* `--max-time 300`: Sets a maximum duration for the entire transfer (5 minutes).
+
+> [!NOTE]
+> These examples use `crosspoint.local`. If your network does not support mDNS or the address does not resolve, replace it with the specific **IP Address** displayed on your device screen (e.g., `http://192.168.1.102/`).
+
+---
+
 ## Troubleshooting
 
 ### Cannot See the Device on the Network
diff --git a/lib/Epub/Epub.cpp b/lib/Epub/Epub.cpp
index aa6bb490..37e09a60 100644
--- a/lib/Epub/Epub.cpp
+++ b/lib/Epub/Epub.cpp
@@ -8,6 +8,7 @@
 
 #include "Epub/parsers/ContainerParser.h"
 #include "Epub/parsers/ContentOpfParser.h"
+#include "Epub/parsers/TocNavParser.h"
 #include "Epub/parsers/TocNcxParser.h"
 
 bool Epub::findContentOpfFile(std::string* contentOpfFile) const {
@@ -80,6 +81,10 @@ bool Epub::parseContentOpf(BookMetadataCache::BookMetadata& bookMetadata) {
     tocNcxItem = opfParser.tocNcxPath;
   }
 
+  if (!opfParser.tocNavPath.empty()) {
+    tocNavItem = opfParser.tocNavPath;
+  }
+
   Serial.printf("[%lu] [EBP] Successfully parsed content.opf\n", millis());
   return true;
 }
@@ -141,6 +146,60 @@ bool Epub::parseTocNcxFile() const {
   return true;
 }
 
+bool Epub::parseTocNavFile() const {
+  // the nav file should have been specified in the content.opf file (EPUB 3)
+  if (tocNavItem.empty()) {
+    Serial.printf("[%lu] [EBP] No nav file specified\n", millis());
+    return false;
+  }
+
+  Serial.printf("[%lu] [EBP] Parsing toc nav file: %s\n", millis(), tocNavItem.c_str());
+
+  const auto tmpNavPath = getCachePath() + "/toc.nav";
+  FsFile tempNavFile;
+  if (!SdMan.openFileForWrite("EBP", tmpNavPath, tempNavFile)) {
+    return false;
+  }
+  readItemContentsToStream(tocNavItem, tempNavFile, 1024);
+  tempNavFile.close();
+  if (!SdMan.openFileForRead("EBP", tmpNavPath, tempNavFile)) {
+    return false;
+  }
+  const auto navSize = tempNavFile.size();
+
+  TocNavParser navParser(contentBasePath, navSize, bookMetadataCache.get());
+
+  if (!navParser.setup()) {
+    Serial.printf("[%lu] [EBP] Could not setup toc nav parser\n", millis());
+    return false;
+  }
+
+  const auto navBuffer = static_cast<uint8_t*>(malloc(1024));
+  if (!navBuffer) {
+    Serial.printf("[%lu] [EBP] Could not allocate memory for toc nav parser\n", millis());
+    return false;
+  }
+
+  while (tempNavFile.available()) {
+    const auto readSize = tempNavFile.read(navBuffer, 1024);
+    const auto processedSize = navParser.write(navBuffer, readSize);
+
+    if (processedSize != readSize) {
+      Serial.printf("[%lu] [EBP] Could not process all toc nav data\n", millis());
+      free(navBuffer);
+      tempNavFile.close();
+      return false;
+    }
+  }
+
+  free(navBuffer);
+  tempNavFile.close();
+  SdMan.remove(tmpNavPath.c_str());
+
+  Serial.printf("[%lu] [EBP] Parsed TOC nav items\n", millis());
+  return true;
+}
+
 // load in the meta data for the epub file
 bool Epub::load(const bool buildIfMissing) {
   Serial.printf("[%lu] [EBP] Loading ePub: %s\n", millis(), filepath.c_str());
@@ -184,15 +243,31 @@ bool Epub::load(const bool buildIfMissing) {
     return false;
   }
 
-  // TOC Pass
+  // TOC Pass - try EPUB 3 nav first, fall back to NCX
   if (!bookMetadataCache->beginTocPass()) {
     Serial.printf("[%lu] [EBP] Could not begin writing toc pass\n", millis());
     return false;
   }
-  if (!parseTocNcxFile()) {
-    Serial.printf("[%lu] [EBP] Could not parse toc\n", millis());
-    return false;
+
+  bool tocParsed = false;
+
+  // Try EPUB 3 nav document first (preferred)
+  if (!tocNavItem.empty()) {
+    Serial.printf("[%lu] [EBP] Attempting to parse EPUB 3 nav document\n", millis());
+    tocParsed = parseTocNavFile();
   }
+
+  // Fall back to NCX if nav parsing failed or wasn't available
+  if (!tocParsed && !tocNcxItem.empty()) {
+    Serial.printf("[%lu] [EBP] Falling back to NCX TOC\n", millis());
+    tocParsed = parseTocNcxFile();
+  }
+
+  if (!tocParsed) {
+    Serial.printf("[%lu] [EBP] Warning: Could not parse any TOC format\n", millis());
+    // Continue anyway - book will work without TOC
+  }
+
   if (!bookMetadataCache->endTocPass()) {
     Serial.printf("[%lu] [EBP] Could not end writing toc pass\n", millis());
     return false;
diff --git a/lib/Epub/Epub.h b/lib/Epub/Epub.h
index 7b5eb7c4..2be7418c 100644
--- a/lib/Epub/Epub.h
+++ b/lib/Epub/Epub.h
@@ -12,8 +12,10 @@
 class ZipFile;
 
 class Epub {
-  // the ncx file
+  // the ncx file (EPUB 2)
   std::string tocNcxItem;
+  // the nav file (EPUB 3)
+  std::string tocNavItem;
   // where is the EPUBfile?
   std::string filepath;
   // the base path for items in the EPUB file
@@ -26,6 +28,7 @@ class Epub {
   bool findContentOpfFile(std::string* contentOpfFile) const;
   bool parseContentOpf(BookMetadataCache::BookMetadata& bookMetadata);
   bool parseTocNcxFile() const;
+  bool parseTocNavFile() const;
 
  public:
   explicit Epub(std::string filepath, const std::string& cacheDir) : filepath(std::move(filepath)) {
diff --git a/lib/Epub/Epub/Section.cpp b/lib/Epub/Epub/Section.cpp
index 1f99f018..18b81aae 100644
--- a/lib/Epub/Epub/Section.cpp
+++ b/lib/Epub/Epub/Section.cpp
@@ -7,9 +7,9 @@
 #include "parsers/ChapterHtmlSlimParser.h"
 
 namespace {
-constexpr uint8_t SECTION_FILE_VERSION = 8;
-constexpr uint32_t HEADER_SIZE = sizeof(uint8_t) + sizeof(int) + sizeof(float) + sizeof(bool) + sizeof(uint16_t) +
-                                 sizeof(uint16_t) + sizeof(uint16_t) + sizeof(uint32_t);
+constexpr uint8_t SECTION_FILE_VERSION = 9;
+constexpr uint32_t HEADER_SIZE = sizeof(uint8_t) + sizeof(int) + sizeof(float) + sizeof(bool) + sizeof(uint8_t) +
+                                 sizeof(uint16_t) + sizeof(uint16_t) + sizeof(uint16_t) + sizeof(uint32_t);
 }  // namespace
 
 uint32_t Section::onPageComplete(std::unique_ptr<Page> page) {
@@ -30,19 +30,21 @@ uint32_t Section::onPageComplete(std::unique_ptr<Page> page) {
 }
 
 void Section::writeSectionFileHeader(const int fontId, const float lineCompression, const bool extraParagraphSpacing,
-                                     const uint16_t viewportWidth, const uint16_t viewportHeight) {
+                                     const uint8_t paragraphAlignment, const uint16_t viewportWidth,
+                                     const uint16_t viewportHeight) {
   if (!file) {
     Serial.printf("[%lu] [SCT] File not open for writing header\n", millis());
     return;
   }
   static_assert(HEADER_SIZE == sizeof(SECTION_FILE_VERSION) + sizeof(fontId) + sizeof(lineCompression) +
-                                   sizeof(extraParagraphSpacing) + sizeof(viewportWidth) + sizeof(viewportHeight) +
-                                   sizeof(pageCount) + sizeof(uint32_t),
+                                   sizeof(extraParagraphSpacing) + sizeof(paragraphAlignment) + sizeof(viewportWidth) +
+                                   sizeof(viewportHeight) + sizeof(pageCount) + sizeof(uint32_t),
                 "Header size mismatch");
   serialization::writePod(file, SECTION_FILE_VERSION);
   serialization::writePod(file, fontId);
   serialization::writePod(file, lineCompression);
   serialization::writePod(file, extraParagraphSpacing);
+  serialization::writePod(file, paragraphAlignment);
   serialization::writePod(file, viewportWidth);
   serialization::writePod(file, viewportHeight);
   serialization::writePod(file, pageCount);  // Placeholder for page count (will be initially 0 when written)
@@ -50,7 +52,8 @@ void Section::writeSectionFileHeader(const int fontId, const float lineCompressi
 }
 
 bool Section::loadSectionFile(const int fontId, const float lineCompression, const bool extraParagraphSpacing,
-                              const uint16_t viewportWidth, const uint16_t viewportHeight) {
+                              const uint8_t paragraphAlignment, const uint16_t viewportWidth,
+                              const uint16_t viewportHeight) {
   if (!SdMan.openFileForRead("SCT", filePath, file)) {
     return false;
   }
@@ -70,15 +73,17 @@ bool Section::loadSectionFile(const int fontId, const float lineCompression, con
   uint16_t fileViewportWidth, fileViewportHeight;
   float fileLineCompression;
   bool fileExtraParagraphSpacing;
+  uint8_t fileParagraphAlignment;
   serialization::readPod(file, fileFontId);
   serialization::readPod(file, fileLineCompression);
   serialization::readPod(file, fileExtraParagraphSpacing);
+  serialization::readPod(file, fileParagraphAlignment);
   serialization::readPod(file, fileViewportWidth);
   serialization::readPod(file, fileViewportHeight);
 
   if (fontId != fileFontId || lineCompression != fileLineCompression ||
-      extraParagraphSpacing != fileExtraParagraphSpacing || viewportWidth != fileViewportWidth ||
-      viewportHeight != fileViewportHeight) {
+      extraParagraphSpacing != fileExtraParagraphSpacing || paragraphAlignment != fileParagraphAlignment ||
+      viewportWidth != fileViewportWidth || viewportHeight != fileViewportHeight) {
     file.close();
     Serial.printf("[%lu] [SCT] Deserialization failed: Parameters do not match\n", millis());
     clearCache();
@@ -109,8 +114,8 @@ bool Section::clearCache() const {
 }
 
 bool Section::createSectionFile(const int fontId, const float lineCompression, const bool extraParagraphSpacing,
-                                const uint16_t viewportWidth, const uint16_t viewportHeight,
-                                const std::function& progressSetupFn,
+                                const uint8_t paragraphAlignment, const uint16_t viewportWidth,
+                                const uint16_t viewportHeight, const std::function& progressSetupFn,
                                 const std::function& progressFn) {
   constexpr uint32_t MIN_SIZE_FOR_PROGRESS = 50 * 1024;  // 50KB
   const auto localPath = epub->getSpineItem(spineIndex).href;
@@ -166,11 +171,13 @@ bool Section::createSectionFile(const int fontId, const float lineCompression, c
   if (!SdMan.openFileForWrite("SCT", filePath, file)) {
     return false;
   }
-  writeSectionFileHeader(fontId, lineCompression, extraParagraphSpacing, viewportWidth, viewportHeight);
+  writeSectionFileHeader(fontId, lineCompression, extraParagraphSpacing, paragraphAlignment, viewportWidth,
+                         viewportHeight);
   std::vector<uint32_t> lut = {};
 
   ChapterHtmlSlimParser visitor(
-      tmpHtmlPath, renderer, fontId, lineCompression, extraParagraphSpacing, viewportWidth, viewportHeight,
+      tmpHtmlPath, renderer, fontId, lineCompression, extraParagraphSpacing, paragraphAlignment, viewportWidth,
+      viewportHeight,
       [this, &lut](std::unique_ptr<Page> page) { lut.emplace_back(this->onPageComplete(std::move(page))); },
       progressFn);
   success = visitor.parseAndBuildPages();
diff --git a/lib/Epub/Epub/Section.h b/lib/Epub/Epub/Section.h
index 55244d0e..bac95efd 100644
--- a/lib/Epub/Epub/Section.h
+++ b/lib/Epub/Epub/Section.h
@@ -14,8 +14,8 @@ class Section {
   std::string filePath;
   FsFile file;
 
-  void writeSectionFileHeader(int fontId, float lineCompression, bool extraParagraphSpacing, uint16_t viewportWidth,
-                              uint16_t viewportHeight);
+  void writeSectionFileHeader(int fontId, float lineCompression, bool extraParagraphSpacing, uint8_t paragraphAlignment,
+                              uint16_t viewportWidth, uint16_t viewportHeight);
   uint32_t onPageComplete(std::unique_ptr<Page> page);
 
  public:
@@ -28,11 +28,12 @@ class Section {
         renderer(renderer),
         filePath(epub->getCachePath() + "/sections/" + std::to_string(spineIndex) + ".bin") {}
   ~Section() = default;
-  bool loadSectionFile(int fontId, float lineCompression, bool extraParagraphSpacing, uint16_t viewportWidth,
-                       uint16_t viewportHeight);
+  bool loadSectionFile(int fontId, float lineCompression, bool extraParagraphSpacing, uint8_t paragraphAlignment,
+                       uint16_t viewportWidth, uint16_t viewportHeight);
   bool clearCache() const;
-  bool createSectionFile(int fontId, float lineCompression, bool extraParagraphSpacing, uint16_t viewportWidth,
-                         uint16_t viewportHeight, const std::function& progressSetupFn = nullptr,
+  bool createSectionFile(int fontId, float lineCompression, bool extraParagraphSpacing, uint8_t paragraphAlignment,
+                         uint16_t viewportWidth, uint16_t viewportHeight,
+                         const std::function& progressSetupFn = nullptr,
                          const std::function& progressFn = nullptr);
   std::unique_ptr<Page> loadPageFromSectionFile();
 };
diff --git a/lib/Epub/Epub/htmlEntities.cpp b/lib/Epub/Epub/htmlEntities.cpp
deleted file mode 100644
index f44a1584..00000000
--- a/lib/Epub/Epub/htmlEntities.cpp
+++ /dev/null
@@ -1,163 +0,0 @@
-// from
-//
https://github.com/atomic14/diy-esp32-epub-reader/blob/2c2f57fdd7e2a788d14a0bcb26b9e845a47aac42/lib/Epub/RubbishHtmlParser/htmlEntities.cpp - -#include "htmlEntities.h" - -#include -#include - -const int MAX_ENTITY_LENGTH = 10; - -// Use book: entities_ww2.epub to test this (Page 7: Entities parser test) -// Note the supported keys are only in lowercase -// Store the mappings in a unordered hash map -static std::unordered_map entity_lookup( - {{""", "\""}, {"⁄", "⁄"}, {"&", "&"}, {"<", "<"}, {">", ">"}, - {"À", "À"}, {"Á", "Á"}, {"Â", "Â"}, {"Ã", "Ã"}, {"Ä", "Ä"}, - {"Å", "Å"}, {"Æ", "Æ"}, {"Ç", "Ç"}, {"È", "È"}, {"É", "É"}, - {"Ê", "Ê"}, {"Ë", "Ë"}, {"Ì", "Ì"}, {"Í", "Í"}, {"Î", "Î"}, - {"Ï", "Ï"}, {"Ð", "Ð"}, {"Ñ", "Ñ"}, {"Ò", "Ò"}, {"Ó", "Ó"}, - {"Ô", "Ô"}, {"Õ", "Õ"}, {"Ö", "Ö"}, {"Ø", "Ø"}, {"Ù", "Ù"}, - {"Ú", "Ú"}, {"Û", "Û"}, {"Ü", "Ü"}, {"Ý", "Ý"}, {"Þ", "Þ"}, - {"ß", "ß"}, {"à", "à"}, {"á", "á"}, {"â", "â"}, {"ã", "ã"}, - {"ä", "ä"}, {"å", "å"}, {"æ", "æ"}, {"ç", "ç"}, {"è", "è"}, - {"é", "é"}, {"ê", "ê"}, {"ë", "ë"}, {"ì", "ì"}, {"í", "í"}, - {"î", "î"}, {"ï", "ï"}, {"ð", "ð"}, {"ñ", "ñ"}, {"ò", "ò"}, - {"ó", "ó"}, {"ô", "ô"}, {"õ", "õ"}, {"ö", "ö"}, {"ø", "ø"}, - {"ù", "ù"}, {"ú", "ú"}, {"û", "û"}, {"ü", "ü"}, {"ý", "ý"}, - {"þ", "þ"}, {"ÿ", "ÿ"}, {" ", " "}, {"¡", "¡"}, {"¢", "¢"}, - {"£", "£"}, {"¤", "¤"}, {"¥", "¥"}, {"¦", "¦"}, {"§", "§"}, - {"¨", "¨"}, {"©", "©"}, {"ª", "ª"}, {"«", "«"}, {"¬", "¬"}, - {"­", "­"}, {"®", "®"}, {"¯", "¯"}, {"°", "°"}, {"±", "±"}, - {"²", "²"}, {"³", "³"}, {"´", "´"}, {"µ", "µ"}, {"¶", "¶"}, - {"¸", "¸"}, {"¹", "¹"}, {"º", "º"}, {"»", "»"}, {"¼", "¼"}, - {"½", "½"}, {"¾", "¾"}, {"¿", "¿"}, {"×", "×"}, {"÷", "÷"}, - {"∀", "∀"}, {"∂", "∂"}, {"∃", "∃"}, {"∅", "∅"}, {"∇", "∇"}, - {"∈", "∈"}, {"∉", "∉"}, {"∋", "∋"}, {"∏", "∏"}, {"∑", "∑"}, - {"−", "−"}, {"∗", "∗"}, {"√", "√"}, {"∝", "∝"}, {"∞", "∞"}, - {"∠", "∠"}, {"∧", "∧"}, {"∨", "∨"}, {"∩", "∩"}, {"∪", "∪"}, - {"∫", "∫"}, {"∴", "∴"}, {"∼", "∼"}, {"≅", "≅"}, {"≈", "≈"}, - 
{"≠", "≠"}, {"≡", "≡"}, {"≤", "≤"}, {"≥", "≥"}, {"⊂", "⊂"}, - {"⊃", "⊃"}, {"⊄", "⊄"}, {"⊆", "⊆"}, {"⊇", "⊇"}, {"⊕", "⊕"}, - {"⊗", "⊗"}, {"⊥", "⊥"}, {"⋅", "⋅"}, {"Α", "Α"}, {"Β", "Β"}, - {"Γ", "Γ"}, {"Δ", "Δ"}, {"Ε", "Ε"}, {"Ζ", "Ζ"}, {"Η", "Η"}, - {"Θ", "Θ"}, {"Ι", "Ι"}, {"Κ", "Κ"}, {"Λ", "Λ"}, {"Μ", "Μ"}, - {"Ν", "Ν"}, {"Ξ", "Ξ"}, {"Ο", "Ο"}, {"Π", "Π"}, {"Ρ", "Ρ"}, - {"Σ", "Σ"}, {"Τ", "Τ"}, {"Υ", "Υ"}, {"Φ", "Φ"}, {"Χ", "Χ"}, - {"Ψ", "Ψ"}, {"Ω", "Ω"}, {"α", "α"}, {"β", "β"}, {"γ", "γ"}, - {"δ", "δ"}, {"ε", "ε"}, {"ζ", "ζ"}, {"η", "η"}, {"θ", "θ"}, - {"ι", "ι"}, {"κ", "κ"}, {"λ", "λ"}, {"μ", "μ"}, {"ν", "ν"}, - {"ξ", "ξ"}, {"ο", "ο"}, {"π", "π"}, {"ρ", "ρ"}, {"ς", "ς"}, - {"σ", "σ"}, {"τ", "τ"}, {"υ", "υ"}, {"φ", "φ"}, {"χ", "χ"}, - {"ψ", "ψ"}, {"ω", "ω"}, {"ϑ", "ϑ"}, {"ϒ", "ϒ"}, {"ϖ", "ϖ"}, - {"Œ", "Œ"}, {"œ", "œ"}, {"Š", "Š"}, {"š", "š"}, {"Ÿ", "Ÿ"}, - {"ƒ", "ƒ"}, {"ˆ", "ˆ"}, {"˜", "˜"}, {" ", ""}, {" ", ""}, - {" ", ""}, {"‌", "‌"}, {"‍", "‍"}, {"‎", "‎"}, {"‏", "‏"}, - {"–", "–"}, {"—", "—"}, {"‘", "‘"}, {"’", "’"}, {"‚", "‚"}, - {"“", "“"}, {"”", "”"}, {"„", "„"}, {"†", "†"}, {"‡", "‡"}, - {"•", "•"}, {"…", "…"}, {"‰", "‰"}, {"′", "′"}, {"″", "″"}, - {"‹", "‹"}, {"›", "›"}, {"‾", "‾"}, {"€", "€"}, {"™", "™"}, - {"←", "←"}, {"↑", "↑"}, {"→", "→"}, {"↓", "↓"}, {"↔", "↔"}, - {"↵", "↵"}, {"⌈", "⌈"}, {"⌉", "⌉"}, {"⌊", "⌊"}, {"⌋", "⌋"}, - {"◊", "◊"}, {"♠", "♠"}, {"♣", "♣"}, {"♥", "♥"}, {"♦", "♦"}}); - -// converts from a unicode code point to the utf8 equivalent -void convert_to_utf8(const int code, std::string& res) { - // convert to a utf8 sequence - if (code < 0x80) { - res += static_cast(code); - } else if (code < 0x800) { - res += static_cast(0xc0 | (code >> 6)); - res += static_cast(0x80 | (code & 0x3f)); - } else if (code < 0x10000) { - res += static_cast(0xe0 | (code >> 12)); - res += static_cast(0x80 | ((code >> 6) & 0x3f)); - res += static_cast(0x80 | (code & 0x3f)); - } else if (code < 0x200000) { - res += static_cast(0xf0 | (code >> 18)); - res += 
static_cast<char>(0x80 | ((code >> 12) & 0x3f));
-    res += static_cast<char>(0x80 | ((code >> 6) & 0x3f));
-    res += static_cast<char>(0x80 | (code & 0x3f));
-  } else if (code < 0x4000000) {
-    res += static_cast<char>(0xf8 | (code >> 24));
-    res += static_cast<char>(0x80 | ((code >> 18) & 0x3f));
-    res += static_cast<char>(0x80 | ((code >> 12) & 0x3f));
-    res += static_cast<char>(0x80 | ((code >> 6) & 0x3f));
-    res += static_cast<char>(0x80 | (code & 0x3f));
-  } else if (code < 0x80000000) {
-    res += static_cast<char>(0xfc | (code >> 30));
-    res += static_cast<char>(0x80 | ((code >> 24) & 0x3f));
-    res += static_cast<char>(0x80 | ((code >> 18) & 0x3f));
-    res += static_cast<char>(0x80 | ((code >> 12) & 0x3f));
-    res += static_cast<char>(0x80 | ((code >> 6) & 0x3f));
-  }
-}
-
-// handles numeric entities - e.g. &#1234; or &#x1234;
-bool process_numeric_entity(const std::string& entity, std::string& res) {
-  int code = 0;
-  // is it hex?
-  if (entity[2] == 'x' || entity[2] == 'X') {
-    // parse the hex code
-    code = strtol(entity.substr(3, entity.size() - 3).c_str(), nullptr, 16);
-  } else {
-    code = strtol(entity.substr(2, entity.size() - 3).c_str(), nullptr, 10);
-  }
-  if (code != 0) {
-    // special handling for nbsp
-    if (code == 0xA0) {
-      res += " ";
-    } else {
-      convert_to_utf8(code, res);
-    }
-    return true;
-  }
-  return false;
-}
-
-// handles named entities - e.g. &amp;
-bool process_string_entity(const std::string& entity, std::string& res) {
-  // it's a named entity - find it in the lookup table
-  // find it in the map
-  const auto it = entity_lookup.find(entity);
-  if (it != entity_lookup.end()) {
-    res += it->second;
-    return true;
-  }
-  return false;
-}
-
-// replace all the entities in the string
-std::string replaceHtmlEntities(const char* text) {
-  std::string res;
-  res.reserve(strlen(text));
-  for (int i = 0; i < strlen(text); ++i) {
-    bool flag = false;
-    // do we have a potential entity?
-    if (text[i] == '&') {
-      // find the end of the entity
-      int j = i + 1;
-      while (j < strlen(text) && text[j] != ';' && j - i < MAX_ENTITY_LENGTH) {
-        j++;
-      }
-      if (j - i > 2) {
-        char entity[j - i + 1];
-        strncpy(entity, text + i, j - i);
-        // is it a numeric code?
-        if (entity[1] == '#') {
-          flag = process_numeric_entity(entity, res);
-        } else {
-          flag = process_string_entity(entity, res);
-        }
-        // skip past the entity if we successfully decoded it
-        if (flag) {
-          i = j;
-        }
-      }
-    }
-    if (!flag) {
-      res += text[i];
-    }
-  }
-  return res;
-}
diff --git a/lib/Epub/Epub/htmlEntities.h b/lib/Epub/Epub/htmlEntities.h
deleted file mode 100644
index 109f717a..00000000
--- a/lib/Epub/Epub/htmlEntities.h
+++ /dev/null
@@ -1,7 +0,0 @@
-// from
-// https://github.com/atomic14/diy-esp32-epub-reader/blob/2c2f57fdd7e2a788d14a0bcb26b9e845a47aac42/lib/Epub/RubbishHtmlParser/htmlEntities.cpp
-
-#pragma once
-#include <string>
-
-std::string replaceHtmlEntities(const char* text);
diff --git a/lib/Epub/Epub/parsers/ChapterHtmlSlimParser.cpp b/lib/Epub/Epub/parsers/ChapterHtmlSlimParser.cpp
index 5cd53293..b96d28f8 100644
--- a/lib/Epub/Epub/parsers/ChapterHtmlSlimParser.cpp
+++ b/lib/Epub/Epub/parsers/ChapterHtmlSlimParser.cpp
@@ -6,7 +6,6 @@
 #include
 
 #include "../Page.h"
-#include "../htmlEntities.h"
 
 const char* HEADER_TAGS[] = {"h1", "h2", "h3", "h4", "h5", "h6"};
 constexpr int NUM_HEADER_TAGS = sizeof(HEADER_TAGS) / sizeof(HEADER_TAGS[0]);
@@ -97,7 +96,7 @@ void XMLCALL ChapterHtmlSlimParser::startElement(void* userData, const XML_Char*
     if (strcmp(name, "br") == 0) {
       self->startNewTextBlock(self->currentTextBlock->getStyle());
     } else {
-      self->startNewTextBlock(TextBlock::JUSTIFIED);
+      self->startNewTextBlock((TextBlock::Style)self->paragraphAlignment);
     }
   } else if (matches(name, BOLD_TAGS, NUM_BOLD_TAGS)) {
     self->boldUntilDepth = std::min(self->boldUntilDepth, self->depth);
@@ -130,17 +129,32 @@ void XMLCALL ChapterHtmlSlimParser::characterData(void* userData, const XML_Char
       // Currently looking at whitespace, if there's anything in the partWordBuffer, flush it
       if (self->partWordBufferIndex > 0) {
         self->partWordBuffer[self->partWordBufferIndex] = '\0';
-        self->currentTextBlock->addWord(std::move(replaceHtmlEntities(self->partWordBuffer)), fontStyle);
+        self->currentTextBlock->addWord(self->partWordBuffer, fontStyle);
         self->partWordBufferIndex = 0;
       }
       // Skip the whitespace char
       continue;
     }
 
+    // Skip soft-hyphen with UTF-8 representation (U+00AD) = 0xC2 0xAD
+    const XML_Char SHY_BYTE_1 = static_cast<XML_Char>(0xC2);
+    const XML_Char SHY_BYTE_2 = static_cast<XML_Char>(0xAD);
+    // 1. Check for the start of the 2-byte Soft Hyphen sequence
+    if (s[i] == SHY_BYTE_1) {
+      // 2. Check if the next byte exists AND if it completes the sequence
+      // We must check i + 1 < len to prevent reading past the end of the buffer.
+      if ((i + 1 < len) && (s[i + 1] == SHY_BYTE_2)) {
+        // Sequence 0xC2 0xAD found!
+        // Skip the current byte (0xC2) and the next byte (0xAD)
+        i++;  // Increment 'i' one more time to skip the 0xAD byte
+        continue;  // Skip the rest of the loop and move to the next iteration
+      }
+    }
+
     // If we're about to run out of space, then cut the word off and start a new one
     if (self->partWordBufferIndex >= MAX_WORD_SIZE) {
       self->partWordBuffer[self->partWordBufferIndex] = '\0';
-      self->currentTextBlock->addWord(std::move(replaceHtmlEntities(self->partWordBuffer)), fontStyle);
+      self->currentTextBlock->addWord(self->partWordBuffer, fontStyle);
       self->partWordBufferIndex = 0;
     }
 
@@ -182,7 +196,7 @@ void XMLCALL ChapterHtmlSlimParser::endElement(void* userData, const XML_Char* n
       }
 
       self->partWordBuffer[self->partWordBufferIndex] = '\0';
-      self->currentTextBlock->addWord(std::move(replaceHtmlEntities(self->partWordBuffer)), fontStyle);
+      self->currentTextBlock->addWord(self->partWordBuffer, fontStyle);
       self->partWordBufferIndex = 0;
     }
   }
@@ -206,7 +220,7 @@ void XMLCALL ChapterHtmlSlimParser::endElement(void* userData, const XML_Char* n
 }
 
 bool ChapterHtmlSlimParser::parseAndBuildPages() {
-  startNewTextBlock(TextBlock::JUSTIFIED);
+  startNewTextBlock((TextBlock::Style)this->paragraphAlignment);
 
   const XML_Parser parser = XML_ParserCreate(nullptr);
   int done;
diff --git a/lib/Epub/Epub/parsers/ChapterHtmlSlimParser.h b/lib/Epub/Epub/parsers/ChapterHtmlSlimParser.h
index 795c2c33..c559e157 100644
--- a/lib/Epub/Epub/parsers/ChapterHtmlSlimParser.h
+++ b/lib/Epub/Epub/parsers/ChapterHtmlSlimParser.h
@@ -33,6 +33,7 @@ class ChapterHtmlSlimParser {
   int fontId;
   float lineCompression;
   bool extraParagraphSpacing;
+  uint8_t paragraphAlignment;
   uint16_t viewportWidth;
   uint16_t viewportHeight;
 
@@ -46,7 +47,8 @@ class ChapterHtmlSlimParser {
  public:
   explicit ChapterHtmlSlimParser(const std::string& filepath, GfxRenderer& renderer, const int fontId,
                                  const float lineCompression, const bool extraParagraphSpacing,
-                                 const uint16_t viewportWidth, const uint16_t viewportHeight,
+                                 const uint8_t paragraphAlignment, const uint16_t viewportWidth,
+                                 const uint16_t viewportHeight,
                                  const std::function<void(std::unique_ptr<Page>)>& completePageFn,
                                  const std::function& progressFn = nullptr)
       : filepath(filepath),
@@ -54,6 +56,7 @@ class ChapterHtmlSlimParser {
         fontId(fontId),
         lineCompression(lineCompression),
         extraParagraphSpacing(extraParagraphSpacing),
+        paragraphAlignment(paragraphAlignment),
         viewportWidth(viewportWidth),
         viewportHeight(viewportHeight),
         completePageFn(completePageFn),
diff --git a/lib/Epub/Epub/parsers/ContentOpfParser.cpp b/lib/Epub/Epub/parsers/ContentOpfParser.cpp
index c9398778..2c90d01d 100644
--- a/lib/Epub/Epub/parsers/ContentOpfParser.cpp
+++ b/lib/Epub/Epub/parsers/ContentOpfParser.cpp
@@ -161,6 +161,7 @@ void XMLCALL ContentOpfParser::startElement(void* userData, const XML_Char* name
       std::string itemId;
       std::string href;
       std::string mediaType;
+      std::string properties;
 
       for (int i = 0; atts[i]; i += 2) {
         if (strcmp(atts[i], "id") == 0) {
@@ -169,6 +170,8 @@ void XMLCALL ContentOpfParser::startElement(void* userData, const XML_Char*
name href = self->baseContentPath + atts[i + 1]; } else if (strcmp(atts[i], "media-type") == 0) { mediaType = atts[i + 1]; + } else if (strcmp(atts[i], "properties") == 0) { + properties = atts[i + 1]; } } @@ -188,6 +191,15 @@ void XMLCALL ContentOpfParser::startElement(void* userData, const XML_Char* name href.c_str()); } } + + // EPUB 3: Check for nav document (properties contains "nav") + if (!properties.empty() && self->tocNavPath.empty()) { + // Properties is space-separated, check if "nav" is present as a word + if (properties == "nav" || properties.find("nav ") == 0 || properties.find(" nav") != std::string::npos) { + self->tocNavPath = href; + Serial.printf("[%lu] [COF] Found EPUB 3 nav document: %s\n", millis(), href.c_str()); + } + } return; } diff --git a/lib/Epub/Epub/parsers/ContentOpfParser.h b/lib/Epub/Epub/parsers/ContentOpfParser.h index 245fca3b..1940aaaf 100644 --- a/lib/Epub/Epub/parsers/ContentOpfParser.h +++ b/lib/Epub/Epub/parsers/ContentOpfParser.h @@ -35,6 +35,7 @@ class ContentOpfParser final : public Print { std::string title; std::string author; std::string tocNcxPath; + std::string tocNavPath; // EPUB 3 nav document path std::string coverItemHref; std::string textReferenceHref; diff --git a/lib/Epub/Epub/parsers/TocNavParser.cpp b/lib/Epub/Epub/parsers/TocNavParser.cpp new file mode 100644 index 00000000..b8a4e7fb --- /dev/null +++ b/lib/Epub/Epub/parsers/TocNavParser.cpp @@ -0,0 +1,184 @@ +#include "TocNavParser.h" + +#include + +#include "../BookMetadataCache.h" + +bool TocNavParser::setup() { + parser = XML_ParserCreate(nullptr); + if (!parser) { + Serial.printf("[%lu] [NAV] Couldn't allocate memory for parser\n", millis()); + return false; + } + + XML_SetUserData(parser, this); + XML_SetElementHandler(parser, startElement, endElement); + XML_SetCharacterDataHandler(parser, characterData); + return true; +} + +TocNavParser::~TocNavParser() { + if (parser) { + XML_StopParser(parser, XML_FALSE); + XML_SetElementHandler(parser, nullptr, 
nullptr); + XML_SetCharacterDataHandler(parser, nullptr); + XML_ParserFree(parser); + parser = nullptr; + } +} + +size_t TocNavParser::write(const uint8_t data) { return write(&data, 1); } + +size_t TocNavParser::write(const uint8_t* buffer, const size_t size) { + if (!parser) return 0; + + const uint8_t* currentBufferPos = buffer; + auto remainingInBuffer = size; + + while (remainingInBuffer > 0) { + void* const buf = XML_GetBuffer(parser, 1024); + if (!buf) { + Serial.printf("[%lu] [NAV] Couldn't allocate memory for buffer\n", millis()); + XML_StopParser(parser, XML_FALSE); + XML_SetElementHandler(parser, nullptr, nullptr); + XML_SetCharacterDataHandler(parser, nullptr); + XML_ParserFree(parser); + parser = nullptr; + return 0; + } + + const auto toRead = remainingInBuffer < 1024 ? remainingInBuffer : 1024; + memcpy(buf, currentBufferPos, toRead); + + if (XML_ParseBuffer(parser, static_cast(toRead), remainingSize == toRead) == XML_STATUS_ERROR) { + Serial.printf("[%lu] [NAV] Parse error at line %lu: %s\n", millis(), XML_GetCurrentLineNumber(parser), + XML_ErrorString(XML_GetErrorCode(parser))); + XML_StopParser(parser, XML_FALSE); + XML_SetElementHandler(parser, nullptr, nullptr); + XML_SetCharacterDataHandler(parser, nullptr); + XML_ParserFree(parser); + parser = nullptr; + return 0; + } + + currentBufferPos += toRead; + remainingInBuffer -= toRead; + remainingSize -= toRead; + } + return size; +} + +void XMLCALL TocNavParser::startElement(void* userData, const XML_Char* name, const XML_Char** atts) { + auto* self = static_cast(userData); + + // Track HTML structure loosely - we mainly care about finding