2025-11-14 14:47:19 +00:00

# Auto-Processing Code Archive
This directory contains the archived auto-processing system, which previously processed documents automatically after each file upload.
## Archived Components
### Core Processing Files
- `files_with_auto_processing.py` - Original files.py router with automatic processing
- `pipeline_controller.py` - Complex multi-phase pipeline orchestration
- `task_processors.py` - Document processing task handlers
### Advanced Queue Management (Created but not deployed)
- `memory_aware_queue.py` - Memory-based intelligent queue management
- `enhanced_upload_handler.py` - Advanced upload handler with queuing
- `enhanced_upload.py` - API endpoints for advanced upload system
## What This System Did
### Automatic Processing Pipeline
1. **File Upload** → Immediate processing trigger
2. **PDF Conversion** (synchronous, blocking)
3. **Phase 1**: Structure discovery (Tika, Page Images, Document Analysis, Split Map)
4. **Phase 2**: Docling processing (NO_OCR → OCR → VLM pipelines)
5. **Complex Dependencies**: Phase coordination, task sequencing
6. **Redis Queue Management**: Service limits, rate limits, dependency tracking
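
The Phase 2 escalation can be sketched roughly as follows. This is a minimal illustration, not the archived implementation: it assumes the `NO_OCR → OCR → VLM` arrows mean "try the cheaper pipeline first and escalate on failure", and the function names and stand-in stages are hypothetical.

```python
from typing import Callable, Optional

# Hypothetical stage signature: takes a file path, returns extracted text,
# or None when the pipeline could not produce usable output.
Stage = Callable[[str], Optional[str]]


def run_with_fallback(path: str, stages: list[tuple[str, Stage]]) -> tuple[str, str]:
    """Try each stage in order; return (stage_name, result) for the first success."""
    for name, stage in stages:
        result = stage(path)
        if result:  # None or empty output means "escalate to the next pipeline"
            return name, result
    raise RuntimeError(f"all pipelines failed for {path}")


# Stand-in stages for illustration only; the real pipelines called Docling.
def no_ocr(path: str) -> Optional[str]:
    return None  # pretend the document has no embedded text layer


def ocr(path: str) -> Optional[str]:
    return f"text from {path}"


def vlm(path: str) -> Optional[str]:
    return f"vlm text from {path}"


PIPELINES = [("NO_OCR", no_ocr), ("OCR", ocr), ("VLM", vlm)]
```

Calling `run_with_fallback("a.pdf", PIPELINES)` with these stand-ins skips `NO_OCR` and stops at `OCR`, so the `VLM` stage never runs.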
### Features
- Multi-phase processing pipelines
- Complex task dependency management
- Memory-aware queue limits
- Multi-user capacity management
- Real-time processing status
- WebSocket status updates
- Service-specific resource limits
- Task recovery on restart
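
To illustrate what "memory-aware queue limits" can mean in practice, here is a small sketch of admission control against a memory budget. It is an assumption-laden toy, not the contents of `memory_aware_queue.py`: the `Task` and `MemoryAwareQueue` names, the per-task `est_mb` estimate, and the strict FIFO policy are all hypothetical, and the archived system tracked its state in Redis rather than in process memory.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    est_mb: int  # hypothetical per-task memory estimate, in MiB


@dataclass
class MemoryAwareQueue:
    budget_mb: int                      # total memory the workers may use
    in_use_mb: int = 0                  # memory reserved by running tasks
    pending: deque = field(default_factory=deque)

    def submit(self, task: Task) -> None:
        self.pending.append(task)

    def admit(self) -> list[Task]:
        """Start pending tasks in FIFO order while they fit in the budget."""
        started = []
        while self.pending and self.in_use_mb + self.pending[0].est_mb <= self.budget_mb:
            task = self.pending.popleft()
            self.in_use_mb += task.est_mb
            started.append(task)
        return started

    def release(self, task: Task) -> None:
        """Return a finished task's reservation to the budget."""
        self.in_use_mb -= task.est_mb
```

Because admission is strictly FIFO, a large task at the head of the queue blocks smaller ones behind it until memory is released; a real scheduler might choose to let small tasks jump ahead.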
## Why Archived
The system was more complex than current needs justify:
- **Complexity**: Multi-phase pipelines with complex dependencies
- **Blocking Operations**: Synchronous PDF conversion causing timeouts
- **Resource Management**: Over-engineered for single-user scenarios
- **User Experience**: Users had to wait for processing to complete
## New Simplified Approach
The new system focuses on:
- **Simple Upload**: Just store files and create database records
- **No Auto-Processing**: Users manually trigger processing when needed
- **Directory Support**: Upload entire folders with manifest tracking
- **Immediate Response**: Users get instant confirmation without waiting
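
The simplified flow can be sketched as a single function that stores files and writes a manifest, with no processing step. This is only an illustration under stated assumptions: `store_upload`, the `manifest.json` filename, and the manifest fields are hypothetical, and the database record creation mentioned above is left out.

```python
import hashlib
import json
from pathlib import Path, PurePosixPath


def store_upload(dest_root: Path, files: dict[str, bytes]) -> dict:
    """Write each file under dest_root and return a manifest of what was stored.

    `files` maps a relative POSIX-style path to the file's bytes.
    """
    entries = []
    for rel_path, data in sorted(files.items()):
        rel = PurePosixPath(rel_path)
        # Reject paths that would escape the upload directory.
        if rel.is_absolute() or ".." in rel.parts:
            raise ValueError(f"unsafe upload path: {rel_path}")
        target = dest_root.joinpath(*rel.parts)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(data)
        entries.append({
            "path": rel_path,
            "size": len(data),
            "sha256": hashlib.sha256(data).hexdigest(),
            "status": "uploaded",  # processing is triggered manually, later
        })
    manifest = {"file_count": len(entries), "files": entries}
    (dest_root / "manifest.json").write_text(json.dumps(manifest, indent=2))
    return manifest
```

The function returns as soon as the bytes are on disk, which is what lets the upload endpoint respond immediately.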
## If You Need to Restore
To restore the auto-processing functionality:
1. Copy `files_with_auto_processing.py` back to `routers/database/files/files.py`
2. Ensure `pipeline_controller.py` and `task_processors.py` are in `modules/`
3. Update imports and dependencies
4. Re-enable background processing in upload handlers
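
Steps 1 and 2 could be scripted along these lines. Treat it as a sketch: `restore_auto_processing`, its `archive_dir` and `app_root` parameters, and the `.bak` backup of the current router are assumptions for illustration. Steps 3 and 4 (imports and re-enabling background processing) still need to be done by hand.

```python
import shutil
from pathlib import Path

ROUTER_TARGET = Path("routers/database/files/files.py")
REQUIRED_MODULES = ["pipeline_controller.py", "task_processors.py"]


def restore_auto_processing(archive_dir: Path, app_root: Path) -> list[str]:
    """Perform restore steps 1 and 2; return the module files that were copied."""
    target = app_root / ROUTER_TARGET
    target.parent.mkdir(parents=True, exist_ok=True)
    if target.exists():
        # Keep the simplified router so the change can be reverted.
        shutil.copy2(target, target.with_name(target.name + ".bak"))
    shutil.copy2(archive_dir / "files_with_auto_processing.py", target)

    modules_dir = app_root / "modules"
    modules_dir.mkdir(exist_ok=True)
    copied = []
    for name in REQUIRED_MODULES:
        # Only copy modules that are missing; never overwrite existing ones.
        if not (modules_dir / name).exists():
            shutil.copy2(archive_dir / name, modules_dir / name)
            copied.append(name)
    return copied
```

Running it a second time copies no modules and leaves a `files.py.bak` next to the restored router.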
## Migration Notes
The database schema and Redis structure remain compatible. The new simplified system can coexist with the archived processing logic if needed.

- **Date Archived**: `$(date)` (unexpanded shell placeholder; the actual date was not recorded)
- **Reason**: Simplification for directory upload implementation