rw-deepseek-ocr

Author	SHA1	Message	Date
Aaron Roberts	da7957d7d5	Fix commit job and OCR text editing - OCR text is now shown in an editable textarea (plain_ocr mode) so users can correct it before committing - editedOcrText state tracks edits; commit job sends the edited value instead of the original result.text - Remove silent early-return guard that blocked commit when text was empty - Copy and download also use the edited text Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-09 17:11:49 +01:00
Aaron Roberts	fd747e6c23	Add job tracking with PostgreSQL, image storage, and review workflow - Add PostgreSQL service to docker-compose with health check and postgres_data volume - Mount ./ocr_images as bind volume for persistent image storage - Add backend/database.py with schema init and get_db() context manager - Add 5 new API endpoints: POST /api/jobs, GET /api/jobs (search), GET /api/jobs/{id}, GET /api/jobs/{id}/image, PUT /api/jobs/{id}/review - Jobs are saved with author/book/chapter/page metadata, auto UUID, and submitted_at timestamp - Jobs start as 'unreviewed'; review captures edited text, reviewer name, and reviewed_at - Add MetadataForm.jsx (author/book/chapter/page inputs) to the New Job panel - Add JobsPanel.jsx with search/filter, paginated list, and detail pane with review form - Add "Commit Job" button to ResultPanel (plain_ocr mode only) with success/error feedback - Add "New Job" / "Browse Jobs" navigation to the app header Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-09 16:48:12 +01:00
Ray Dumasia	e24f064042	Add CTRL-V support as suggested by @p-xiexin	2025-11-15 23:32:33 +00:00
Claude	e578276d3e	Add PDF processing and multi-format document conversion Features added: - PDF to image conversion with configurable DPI - Multi-page PDF processing with OCR - Export to Markdown, HTML, DOCX, and JSON formats - Automatic image extraction from PDFs - Formula and formatting preservation - Real-time progress tracking for multi-page documents Backend changes: - New /api/process-pdf endpoint for PDF processing - pdf_utils.py: PDF conversion and image extraction utilities - format_converter.py: Document format conversion (MD, HTML, DOCX) - Updated dependencies: PyMuPDF, img2pdf, python-docx, markdown Frontend changes: - File type toggle (Image OCR / PDF Processing) - PDFProcessor component with format selection - Updated ImageUpload to support both images and PDFs - Progress bars for multi-page processing - Download options for converted documents Documentation: - Updated README with PDF processing features - Added API documentation for /api/process-pdf endpoint - Added format conversion examples	2025-11-15 14:25:09 +00:00
Dennis Paul	23bbd1fc8d	show advanced settings toggle	2025-10-23 00:05:24 +02:00
Ray Dumasia	3efc4da7ff	Add in .env.example for setting ports, fix upload limit, fix bounding box, can now dismiss previous image, change markdown expectation to HTML, not MD. Updated README with NVIDIA driver/container instructions	2025-10-21 21:35:17 +01:00
Ray Dumasia	aec04f6eb4	Initial commit	2025-10-21 01:32:09 +01:00

7 Commits