# Database Migrations Guide
This guide covers advanced database migration workflows for developers working on schema changes.
## Overview
Our project uses Alembic for database migrations. All schema changes must go through the migration system to ensure:
- Reproducible deployments
- Team synchronization
- Production safety
- Rollback capability
## Migration Commands Reference
### Creating Migrations
```bash
# Auto-generate migration from model changes
make migrate-create message="add_user_profile_table"
# Create empty migration template for manual changes
make migrate-create-manual message="add_custom_indexes"
```
### Applying Migrations
```bash
# Apply all pending migrations
make migrate-up
# Rollback last migration
make migrate-down
# Rollback to specific revision
make migrate-down-to revision="abc123"
```
### Migration Status
```bash
# Show current migration status
make migrate-status
# Show detailed migration history
alembic history --verbose
# Show specific migration details
make migrate-show revision="abc123"
```
### Backup and Safety
```bash
# Create database backup before major changes
make backup-db
# Verify database setup
make verify-setup
```
## Development Workflows
### Adding New Database Fields
1. **Modify your SQLAlchemy model**:
```python
# In models/database/user.py
class User(Base):
    # ... existing fields
    profile_image = Column(String, nullable=True)  # NEW FIELD
```
2. **Generate migration**:
```bash
make migrate-create message="add_profile_image_to_users"
```
3. **Review generated migration**:
```python
# Check alembic/versions/xxx_add_profile_image_to_users.py
def upgrade() -> None:
    op.add_column('users', sa.Column('profile_image', sa.String(), nullable=True))

def downgrade() -> None:
    op.drop_column('users', 'profile_image')
```
4. **Apply migration**:
```bash
make migrate-up
```
### Adding Database Indexes
1. **Create manual migration**:
```bash
make migrate-create-manual message="add_performance_indexes"
```
2. **Edit the migration file**:
```python
def upgrade() -> None:
    # Add indexes for better query performance
    op.create_index('idx_products_marketplace_shop', 'products', ['marketplace', 'shop_name'])
    op.create_index('idx_users_email_active', 'users', ['email', 'is_active'])

def downgrade() -> None:
    op.drop_index('idx_users_email_active', table_name='users')
    op.drop_index('idx_products_marketplace_shop', table_name='products')
```
3. **Apply migration**:
```bash
make migrate-up
```
### Complex Schema Changes
For complex changes that require data transformation:
1. **Create migration with data handling**:
```python
from sqlalchemy import text

def upgrade() -> None:
    # Create the new column as nullable first, so existing rows remain valid
    op.add_column('products', sa.Column('normalized_price', sa.Numeric(10, 2)))
    # Migrate data
    connection = op.get_bind()
    connection.execute(
        text("UPDATE products SET normalized_price = CAST(price AS NUMERIC) WHERE price ~ '^[0-9.]+$'")
    )
    # Make the column non-nullable once every row is backfilled.
    # Rows whose price did not match the pattern stay NULL and must be
    # handled first, or this ALTER will fail.
    op.alter_column('products', 'normalized_price', nullable=False)

def downgrade() -> None:
    op.drop_column('products', 'normalized_price')
```
## Production Deployment
### Pre-Deployment Checklist
- [ ] All migrations tested locally
- [ ] Database backup created
- [ ] Migration rollback plan prepared
- [ ] Team notified of schema changes
### Deployment Process
```bash
# 1. Pre-deployment checks
make pre-deploy-check
# 2. Backup production database
make backup-db
# 3. Deploy with migrations
make deploy-prod # This includes migrate-up
```
### Rollback Process
```bash
# If deployment fails, rollback
make rollback-prod # This includes migrate-down
```
## Best Practices
### Migration Naming
Use clear, descriptive names:
```bash
# Good examples
make migrate-create message="add_user_profile_table"
make migrate-create message="remove_deprecated_product_fields"
make migrate-create message="add_indexes_for_search_performance"
# Avoid vague names
make migrate-create message="update_database" # Too vague
make migrate-create message="fix_stuff" # Not descriptive
```
### Safe Schema Changes
**Always Safe**:
- Adding nullable columns
- Adding indexes
- Adding new tables
- Increasing column size (varchar(50) → varchar(100))
**Potentially Unsafe** (require careful planning):
- Dropping columns
- Changing column types
- Adding non-nullable columns without defaults
- Renaming tables or columns
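To see why the last category bites, the sketch below uses an in-memory SQLite database as a stand-in for the real one (table and column names are made up): adding a nullable column to a populated table succeeds, while adding a NOT NULL column without a default is rejected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.execute("INSERT INTO users (email) VALUES ('a@example.com')")

# Safe: a nullable column can be added even with existing rows
conn.execute("ALTER TABLE users ADD COLUMN profile_image TEXT")

# Unsafe: a NOT NULL column without a default is rejected,
# because existing rows would have no valid value for it
try:
    conn.execute("ALTER TABLE users ADD COLUMN plan TEXT NOT NULL")
    outcome = "added"
except sqlite3.OperationalError as exc:
    outcome = f"rejected: {exc}"

print(outcome)
```

The same constraint exists on most engines, which is why the multi-step process below adds the column as nullable first.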
**Multi-Step Process for Unsafe Changes**:
```python
# Step 1: Add new column
def upgrade() -> None:
    op.add_column('users', sa.Column('email_new', sa.String(255)))

# Step 2: Migrate data (separate migration)
def upgrade() -> None:
    connection = op.get_bind()
    connection.execute(text("UPDATE users SET email_new = email"))

# Step 3: Switch columns (separate migration)
def upgrade() -> None:
    op.drop_column('users', 'email')
    op.alter_column('users', 'email_new', new_column_name='email')
```
### Testing Migrations
1. **Test on a copy of production data**: restore a recent production backup to a test database, run the pending migrations against it, and verify data integrity afterwards.
2. **Test rollback process**:
```bash
make migrate-up # Apply migration
# Test application functionality
make migrate-down # Test rollback
# Verify rollback worked correctly
```
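The round-trip test can be sketched without any Alembic machinery. In the minimal sketch below, SQLite stands in for the real database and the `upgrade`/`downgrade` functions stand in for `alembic upgrade head` / `alembic downgrade -1`; snapshotting the schema before and after proves the rollback truly restores the starting state.

```python
import sqlite3

def schema_snapshot(conn):
    # Column layout of every table, as a comparable dict
    tables = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    return {t: conn.execute(f"PRAGMA table_info({t})").fetchall() for t in tables}

def upgrade(conn):
    # Stand-in for `alembic upgrade head`
    conn.execute("ALTER TABLE users ADD COLUMN profile_image TEXT")

def downgrade(conn):
    # Stand-in for `alembic downgrade -1`; rebuilds the table, which is
    # also how you drop a column on engines without DROP COLUMN support
    conn.execute("CREATE TABLE users_tmp (id INTEGER PRIMARY KEY, email TEXT)")
    conn.execute("INSERT INTO users_tmp (id, email) SELECT id, email FROM users")
    conn.execute("DROP TABLE users")
    conn.execute("ALTER TABLE users_tmp RENAME TO users")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")

before = schema_snapshot(conn)
upgrade(conn)
assert schema_snapshot(conn) != before  # the migration changed something
downgrade(conn)
assert schema_snapshot(conn) == before  # the rollback restored it exactly
```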
## Advanced Features
### Environment-Specific Migrations
Use migration context to handle different environments:
```python
from alembic import context
def upgrade() -> None:
    # Only add sample data in development
    if context.get_x_argument(as_dictionary=True).get('dev_data', False):
        # Add development sample data
        pass
    # Always apply schema changes
    op.create_table(...)
```
Run with environment flag:
```bash
alembic upgrade head -x dev_data=true
```
### Data Migrations
For large data transformations, use batch processing:
```python
from sqlalchemy import text

def upgrade() -> None:
    connection = op.get_bind()
    # Process in batches to avoid loading the whole table into memory
    batch_size = 1000
    offset = 0
    while True:
        # ORDER BY makes the OFFSET pagination deterministic
        result = connection.execute(
            text("SELECT id, old_field FROM products ORDER BY id LIMIT :limit OFFSET :offset"),
            {"limit": batch_size, "offset": offset},
        )
        rows = result.fetchall()
        if not rows:
            break
        for row in rows:
            # Transform data
            new_value = transform_function(row.old_field)
            connection.execute(
                text("UPDATE products SET new_field = :new_val WHERE id = :id"),
                {"new_val": new_value, "id": row.id},
            )
        offset += batch_size
```
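The same batching pattern runs end to end below against an in-memory SQLite database; the table, the 2,500 sample rows, and `transform_function` are all fabricated for the demonstration.

```python
import sqlite3

def transform_function(value):
    # Hypothetical transformation: strip whitespace and uppercase
    return value.strip().upper()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, old_field TEXT, new_field TEXT)")
conn.executemany(
    "INSERT INTO products (old_field) VALUES (?)",
    [(f" sku-{i} ",) for i in range(2500)],
)

batch_size = 1000
offset = 0
while True:
    rows = conn.execute(
        "SELECT id, old_field FROM products ORDER BY id LIMIT ? OFFSET ?",
        (batch_size, offset),
    ).fetchall()
    if not rows:
        break
    conn.executemany(
        "UPDATE products SET new_field = ? WHERE id = ?",
        [(transform_function(old), id_) for id_, old in rows],
    )
    offset += batch_size

remaining = conn.execute("SELECT COUNT(*) FROM products WHERE new_field IS NULL").fetchone()[0]
print(remaining)  # 0 -> every row was transformed
```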
## Troubleshooting
### Common Issues
**Migration conflicts**:
```bash
# When multiple developers create migrations simultaneously
# Resolve by creating a merge migration
alembic merge -m "merge migrations" head1 head2
```
**Failed migration**:
```bash
# Check current state
make migrate-status
# Manually fix the database if needed, then mark the migration as applied
# (only after verifying the schema actually matches the migration's end state)
alembic stamp head
```
**Out-of-sync database**:
```bash
# Reset to a known good state
# (destructive: downgrading to base removes all managed schema objects)
make backup-db
alembic downgrade base
make migrate-up
```
### Recovery Procedures
1. **Database corruption**: Restore from backup, replay migrations
2. **Failed deployment**: Use rollback process, investigate issue
3. **Development issues**: Reset local database, pull latest migrations
## Integration with CI/CD
Our deployment pipeline automatically:
1. Runs migration checks in CI
2. Creates database backups before deployment
3. Applies migrations during deployment
4. Provides rollback capability
Migration failures will halt deployment to prevent data corruption.
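One migration check CI can run is refusing to deploy when the history has more than one head, which happens when two branches each add a migration. `alembic heads` reports this directly; the dependency-free sketch below shows the underlying idea by scanning version files for `revision`/`down_revision` pairs (the three inline migration stubs are fabricated for the demo).

```python
import re
import tempfile
from pathlib import Path

def find_heads(versions_dir):
    """Revisions that no other migration lists as its down_revision are heads."""
    revisions, parents = set(), set()
    for path in Path(versions_dir).glob("*.py"):
        source = path.read_text()
        rev = re.search(r"^revision\s*=\s*['\"](\w+)['\"]", source, re.M)
        down = re.search(r"^down_revision\s*=\s*['\"](\w+)['\"]", source, re.M)
        if rev:
            revisions.add(rev.group(1))
        if down:
            parents.add(down.group(1))
    return revisions - parents

# Demo: a linear chain plus one stray branch off the same parent
with tempfile.TemporaryDirectory() as d:
    Path(d, "a1.py").write_text("revision = 'a1'\ndown_revision = None\n")
    Path(d, "b2.py").write_text("revision = 'b2'\ndown_revision = 'a1'\n")
    Path(d, "c3.py").write_text("revision = 'c3'\ndown_revision = 'a1'\n")
    heads = find_heads(d)

print(sorted(heads))  # ['b2', 'c3'] -> two heads, so CI should fail and ask for a merge migration
```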
## Further Reading
- [Alembic Official Documentation](https://alembic.sqlalchemy.org/)
- [Database Schema Documentation](database-schema.md)
- [Deployment Guide](../deployment/production.md)