Multi-Database Management - Technical Roadmap
Current Implementation Status
✅ Phase 1: Basic Multi-Database Support (COMPLETED)
Date Completed: October 2, 2025 Issue: #7
Features Implemented:
- ✅ Database configuration storage (instance/database_config.json)
- ✅ Database creation with pre-populated default data (users, roles, permissions, etc.)
- ✅ Database validation and naming conventions
- ✅ Database listing and selection UI
- ✅ Database error recovery page
- ✅ Complete audit logging for all database operations
- ✅ RBAC protection (Super Admin only)
Current Limitation: ⚠️ Manual server restart required after database switch due to Flask-SQLAlchemy architecture limitations.
User Workflow: 1. User switches database via dropdown → Configuration updated 2. System shows alert with restart instructions in Italian 3. User manually stops server (Ctrl+C) and restarts 4. User logs in → Connected to new database
🚀 Future Enhancements Roadmap
Phase 2: Production Deployment with Graceful Reload (RECOMMENDED - Short Term)
Timeline: Next 3-6 months Priority: HIGH Complexity: MEDIUM
Objective: Enable automatic database switch without manual restart in production environment.
Technical Approach: Gunicorn/uWSGI + SIGHUP Signal
Architecture:
User → Switch DB → Backend Updates Config → Send SIGHUP to Gunicorn
↓
Workers gracefully reload with new database URI
↓
Frontend auto-detects reload complete
↓
User auto-logged back in
Implementation Steps:
- Production Server Setup: ```bash # Install Gunicorn pip install gunicorn
# Run with workers gunicorn --workers 4 --bind 0.0.0.0:5000 app:app ```
-
Backend Reload Endpoint (
features/admin/api/routes.py): ```python @admin_bp.route('/database/reload', methods=['POST']) @login_required @rbac_required('admin.*') def reload_server(): """Send graceful reload signal to Gunicorn after database switch""" import os import signal# Get parent process ID (Gunicorn master) parent_pid = os.getppid()
# Send SIGHUP for graceful reload os.kill(parent_pid, signal.SIGHUP)
return jsonify({ 'success': True, 'message': 'Server reload initiated. Please wait...' }) ```
-
Frontend Polling (
templates/admin/database/index.html): ```javascript // After switch, call reload endpoint await fetch('/admin/database/reload', { method: 'POST' });
// Poll for server readiness (max 30s) await waitForServerReload();
// Redirect to login with return URL window.location.href = '/auth/login?next=' + encodeURIComponent(window.location.pathname); ```
- Nginx Configuration (reverse proxy): ```nginx upstream racemes_backend { server 127.0.0.1:5000; keepalive 32; }
server { listen 80; server_name racemes.example.com;
location / {
proxy_pass http://racemes_backend;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
}
} ```
Benefits: - ✅ Zero downtime during database switch - ✅ Workers reload one at a time (rolling restart) - ✅ Existing connections complete gracefully - ✅ New connections automatically use new database - ✅ Automatic user re-authentication flow
Estimated Effort: 2-3 days - 1 day: Production setup (Gunicorn + Nginx) - 1 day: Reload endpoint + frontend polling - 0.5 days: Testing and documentation
Phase 3: Development Hot-Reload with Watchdog (Optional - Short Term)
Timeline: 3-6 months Priority: MEDIUM Complexity: MEDIUM
Objective: Enable automatic reload in development environment (Flask development server).
Technical Approach: File Sentinel + External Watchdog Process
Architecture:
Backend → Create sentinel file (.db_switch_pending)
↓
Watchdog process monitors file
↓
Detects change → Restart Flask
↓
Remove sentinel → Reload complete
Implementation:
- Watchdog Service (
services/dev_reload_manager.py): ```python import os import time import subprocess from pathlib import Path
SENTINEL_FILE = Path('instance/.db_switch_pending') FLASK_PID_FILE = Path('instance/.flask.pid')
def watch_and_reload(): """Monitor sentinel file and restart Flask when needed""" while True: if SENTINEL_FILE.exists(): # Read target database from sentinel target_db = SENTINEL_FILE.read_text().strip()
# Gracefully terminate Flask
if FLASK_PID_FILE.exists():
pid = int(FLASK_PID_FILE.read_text())
os.kill(pid, signal.SIGTERM)
time.sleep(2)
# Restart Flask
subprocess.Popen(['python', 'app.py'])
# Remove sentinel
SENTINEL_FILE.unlink()
time.sleep(1)
```
- Backend Sentinel Creation (
features/admin/api/routes.py):python # After config update sentinel = Path('instance/.db_switch_pending') sentinel.write_text(db_name)
Benefits: - ✅ Works in development environment - ✅ No manual restart needed - ✅ Separate process handles lifecycle
Limitations: - ⚠️ Requires running watchdog process alongside Flask - ⚠️ Brief downtime during restart (~2-3 seconds)
Estimated Effort: 2 days
Phase 4: Advanced Multi-Database Architecture (Long Term)
Timeline: 6-12 months Priority: LOW (Future Enhancement) Complexity: HIGH
Objective: Support simultaneous access to multiple databases without restart.
Technical Approach: Custom Database Connection Manager
Use Cases: - Multi-site aggregation (view data across all client databases) - Cross-database analytics and reporting - Database migration tools (copy data between databases) - Multi-tenant SaaS architecture
Architecture:
class MultiDatabaseManager:
"""
Custom database manager supporting dynamic database switching
without application restart.
"""
def __init__(self):
self.engines = {} # db_name -> SQLAlchemy engine
self.sessions = {} # db_name -> scoped_session
self.current_db = None
self.lock = threading.Lock()
def switch_database(self, db_name: str):
"""Switch active database without restart"""
with self.lock:
if db_name not in self.engines:
# Create new engine for this database
db_path = os.path.abspath(os.path.join('instance', db_name))
db_uri = f"sqlite:///{db_path}"
self.engines[db_name] = create_engine(db_uri, pool_pre_ping=True)
self.sessions[db_name] = scoped_session(sessionmaker(bind=self.engines[db_name]))
self.current_db = db_name
def get_session(self):
"""Get session for currently active database"""
if self.current_db is None:
raise ValueError("No database selected")
return self.sessions[self.current_db]
def query_all_databases(self, query_func):
"""Execute query across all databases (aggregation)"""
results = {}
for db_name in self.engines.keys():
session = self.sessions[db_name]
results[db_name] = query_func(session)
return results
def close_all(self):
"""Cleanup all connections"""
for session in self.sessions.values():
session.remove()
for engine in self.engines.values():
engine.dispose()
Integration Points:
- Refactor all db.session calls to use db_manager.get_session()
- Add middleware to set current database from user context
- Update models to work with dynamic sessions
Benefits: - ✅ Zero restart required - ✅ Switch databases in milliseconds - ✅ Support multi-database queries - ✅ Future-proof for SaaS multi-tenancy
Challenges: - ⚠️ Requires refactoring entire codebase - ⚠️ Complex session management - ⚠️ Potential connection pool exhaustion (many databases) - ⚠️ Memory overhead (multiple engines loaded)
Estimated Effort: 2-3 weeks - 1 week: MultiDatabaseManager implementation - 1 week: Refactor all database access code - 3-5 days: Testing, optimization, documentation
Technical Constraints & Considerations
Flask-SQLAlchemy Limitations
Current Architecture:
# Flask-SQLAlchemy singleton pattern
db = SQLAlchemy(app)
# Engine is cached and bound to app at init time
db.engine # Read-only property, cannot be replaced
Why Hot-Swap Fails:
1. SQLAlchemy engine is created once at db.init_app(app)
2. Engine is cached in db._engines dictionary (keyed by app)
3. Cannot re-call init_app() → raises "instance already registered"
4. Cannot replace db.engine → property has no setter
5. Clearing db._engines → engine still referenced internally
Attempted Solutions (all failed):
- ❌ db.engine = new_engine → Property has no setter
- ❌ db._engines.clear() → Internal references remain
- ❌ db.init_app(app) → Already registered error
- ❌ db.engine.dispose() + config update → Old engine still used
- ❌ Logout/login flow → Session cleared but engine unchanged
Root Cause: Flask-SQLAlchemy is designed for single-database applications with static configuration loaded at startup.
Database File Locking (SQLite)
Important: SQLite databases are file-based and can be locked by active connections.
Best Practices:
- Always close sessions before switching: db.session.close()
- Dispose engine to release file locks: db.engine.dispose()
- Use pool_pre_ping=True to detect stale connections
- Consider WAL mode for better concurrency: PRAGMA journal_mode=WAL
Recommended Implementation Path
Immediate (Current State - ✅ DONE)
- Manual restart with clear Italian instructions
- Configuration persistence in JSON
- Full audit logging
Short Term (Next 3-6 months - 🎯 PRIORITY)
- Production deployment: Gunicorn + SIGHUP graceful reload
- Frontend: Polling for reload completion
- User experience: Automatic re-login after switch
Medium Term (6-12 months - Optional)
- Development: Watchdog-based auto-restart
- Monitoring: Health check endpoints for reload status
- Metrics: Track switch frequency and performance
Long Term (12+ months - Advanced)
- Architecture: Custom MultiDatabaseManager
- Features: Multi-database aggregation
- SaaS: True multi-tenancy support
Testing Strategy
Phase 2 Testing (Gunicorn Reload)
# Test graceful reload
1. Start Gunicorn with 4 workers
2. Create test load (concurrent requests)
3. Trigger database switch
4. Verify:
- Zero 5xx errors during reload
- Workers reload one at a time
- New requests use new database
- Active requests complete successfully
Phase 4 Testing (Multi-Database Manager)
# Test concurrent database access
def test_multi_db_switching():
# Switch rapidly between databases
for i in range(100):
db_manager.switch_database(f'test_{i % 3}.db')
user = db_manager.get_session().query(User).first()
assert user is not None
# Verify no connection leaks
assert len(db_manager.engines) == 3
Security Considerations
Phase 2 (Gunicorn Reload)
- ✅ Reload endpoint protected by
@rbac_required('admin.*') - ✅ SIGHUP signal only affects Gunicorn process (isolated)
- ✅ Audit log records who triggered reload
Phase 4 (Multi-Database Manager)
- ⚠️ Each database must enforce its own RBAC
- ⚠️ User session tied to specific database
- ⚠️ Cross-database queries need authorization checks
- ⚠️ Connection pool limits per database
Performance Impact
Phase 2 (Gunicorn Reload)
- Switch Time: 2-5 seconds (rolling restart of workers)
- Downtime: Zero (rolling restart ensures availability)
- Memory: Minimal overhead (same as normal operation)
Phase 4 (Multi-Database Manager)
- Switch Time: <100ms (in-memory pointer change)
- Downtime: Zero (no process restart)
- Memory: High (N databases × connection pool size)
- Connection Limits: Must configure pool size carefully
Example:
# 10 databases × 5 connections each = 50 total connections
db_manager = MultiDatabaseManager(pool_size=5, max_overflow=10)
Documentation References
Related Files
features/admin/api/routes.py- Database switch endpointfeatures/admin/lib/database_management_service.py- Core servicetemplates/admin/database/index.html- UI implementationinstance/database_config.json- Configuration storage
Issue Tracking
- GitHub Issue: #7 (Multi-Database Management)
- Status: Phase 1 Complete (Manual Restart)
Related Documentation
CLAUDE.md- Project architecture overviewdocs/architecture/- System architecture documentationdocs/deployment/- Production deployment guide (to be created)
Change Log
| Date | Version | Changes | Author |
|---|---|---|---|
| 2025-10-02 | 1.0 | Initial roadmap created | Claude Code |
| 2025-10-02 | 1.0 | Phase 1 completed (manual restart) | Claude Code |
Notes
Current Decision Rationale: After attempting multiple technical approaches for hot database switching (engine replacement, cache clearing, session management), we determined that Flask-SQLAlchemy's singleton architecture fundamentally prevents runtime database URI changes without process restart.
The manual restart solution is: - ✅ Honest and transparent to users - ✅ Simple and maintainable - ✅ Zero risk of data corruption or stale connections - ✅ Suitable for current development/testing phase
For production deployment, Phase 2 (Gunicorn + graceful reload) is the recommended path forward, providing automated restart with zero downtime.
Last Updated: October 2, 2025 Status: Phase 1 Complete - Phase 2 Planned