# Product Migration - Database Changes Plan

## Overview

This document details the database schema changes required for Phase 1 of the Multi-Marketplace Product Architecture. It serves as the implementation guide for the Alembic migrations.

**Related Documents:**
- [Multi-Marketplace Product Architecture](./multi-marketplace-product-architecture.md) - Full architecture design
- [Marketplace Integration Architecture](../../architecture/marketplace-integration.md) - System-wide integration plan

---

## Current State Analysis

### Existing Tables

#### `marketplace_products` (Current)

| Column | Type | Constraints | Notes |
|--------|------|-------------|-------|
| id | Integer | PK, Index | |
| marketplace_product_id | String | Unique, Index, NOT NULL | Feed product ID |
| title | String | NOT NULL | Single language only |
| description | String | | Single language only |
| link | String | | |
| image_link | String | | |
| availability | String | Index | |
| price | String | | Raw price string "19.99 EUR" |
| sale_price | String | | Raw price string |
| brand | String | Index | |
| gtin | String | Index | |
| mpn | String | | |
| condition | String | | |
| adult | String | | |
| multipack | Integer | | |
| is_bundle | String | | |
| age_group | String | | |
| color | String | | |
| gender | String | | |
| material | String | | |
| pattern | String | | |
| size | String | | |
| size_type | String | | |
| size_system | String | | |
| item_group_id | String | | |
| google_product_category | String | Index | |
| product_type | String | | Raw feed value |
| custom_label_0-4 | String | | |
| additional_image_link | String | | Single string, not array |
| unit_pricing_measure | String | | |
| unit_pricing_base_measure | String | | |
| identifier_exists | String | | |
| shipping | String | | |
| currency | String | | |
| marketplace | String | Index, Default='Letzshop' | |
| vendor_name | String | Index | |
| created_at | DateTime | | TimestampMixin |
| updated_at | DateTime | | TimestampMixin |

**Indexes:**
- `idx_marketplace_vendor` (marketplace, vendor_name)
- `idx_marketplace_brand` (marketplace, brand)

#### `products` (Current)

| Column | Type | Constraints | Notes |
|--------|------|-------------|-------|
| id | Integer | PK, Index | |
| vendor_id | Integer | FK → vendors.id, NOT NULL | |
| marketplace_product_id | Integer | FK → marketplace_products.id, NOT NULL | |
| product_id | String | | Vendor's internal SKU |
| price | Float | | Override |
| sale_price | Float | | Override |
| currency | String | | Override |
| availability | String | | Override |
| condition | String | | Override |
| is_featured | Boolean | Default=False | |
| is_active | Boolean | Default=True | |
| display_order | Integer | Default=0 | |
| min_quantity | Integer | Default=1 | |
| max_quantity | Integer | | |
| created_at | DateTime | | TimestampMixin |
| updated_at | DateTime | | TimestampMixin |

**Constraints:**
- `uq_product` UNIQUE (vendor_id, marketplace_product_id)

**Indexes:**
- `idx_product_active` (vendor_id, is_active)
- `idx_product_featured` (vendor_id, is_featured)

### Issues with Current Schema

| Issue | Impact | Solution |
|-------|--------|----------|
| No translation support | Cannot support multi-language feeds | Add translation tables |
| No product type distinction | Cannot differentiate physical/digital | Add product_type enum |
| No digital product fields | Cannot support game keys, downloads | Add digital-specific columns |
| Price as String | Harder to filter/sort by price | Add parsed numeric price |
| Single additional_image_link | Can't store multiple images properly | Add JSON array column |
| No override pattern properties | No `effective_*` helpers | Add to model layer |
| One-to-one relationship | Same product can't exist for multiple vendors | Fix to one-to-many |

---

## Target Schema

### Visual Diagram

```
┌─────────────────────────────────┐
│ marketplace_products            │
├─────────────────────────────────┤
│ id (PK)                         │
│ marketplace_product_id (UNIQUE) │
│ marketplace                     │
│ vendor_name                     │
│                                 │
│ # Product Type (NEW)            │
│ product_type (ENUM)             │
│ is_digital                      │
│ digital_delivery_method (ENUM)  │
│ platform                        │
│ region_restrictions (JSON)      │
│ license_type                    │
│                                 │
│ # Pricing (ENHANCED)            │
│ price (String) [raw]            │
│ price_numeric (Float) [NEW]     │
│ sale_price (String) [raw]       │
│ sale_price_numeric (Float) [NEW]│
│ currency                        │
│                                 │
│ # Media (ENHANCED)              │
│ image_link                      │
│ additional_image_link [legacy]  │
│ additional_images (JSON) [NEW]  │
│                                 │
│ # Attributes (NEW)              │
│ attributes (JSON)               │
│                                 │
│ # Status (NEW)                  │
│ is_active                       │
│                                 │
│ # Renamed                       │
│ product_type_raw                │
│   [was product_type]            │
│                                 │
│ # Preserved Google Shopping     │
│ brand, gtin, mpn, condition...  │
│ google_product_category...      │
│ custom_label_0-4...             │
└─────────────────────────────────┘
                │
                │ 1:N
                ▼
┌─────────────────────────────────┐
│ marketplace_product_translations│
├─────────────────────────────────┤
│ id (PK)                         │
│ marketplace_product_id (FK)     │
│ language ('en','fr','de','lb')  │
│                                 │
│ # Localized Content             │
│ title (NOT NULL)                │
│ description                     │
│ short_description               │
│                                 │
│ # SEO                           │
│ meta_title                      │
│ meta_description                │
│ url_slug                        │
│                                 │
│ # Source Tracking               │
│ source_import_id                │
│ source_file                     │
│                                 │
│ created_at, updated_at          │
├─────────────────────────────────┤
│ UNIQUE(marketplace_product_id,  │
│        language)                │
└─────────────────────────────────┘

┌─────────────────────────────────┐
│ products                        │
├─────────────────────────────────┤
│ id (PK)                         │
│ vendor_id (FK)                  │
│ marketplace_product_id (FK)     │
│                                 │
│ # Renamed                       │
│ vendor_sku [was product_id]     │
│                                 │
│ # Existing Overrides            │
│ price                           │
│ sale_price                      │
│ currency                        │
│ availability                    │
│ condition                       │
│                                 │
│ # New Overrides                 │
│ brand (NEW)                     │
│ primary_image_url (NEW)         │
│ additional_images (JSON) (NEW)  │
│ download_url (NEW)              │
│ license_type (NEW)              │
│ fulfillment_email_template (NEW)│
│                                 │
│ # Vendor-Specific               │
│ is_featured                     │
│ is_active                       │
│ display_order                   │
│ min_quantity                    │
│ max_quantity                    │
│                                 │
│ created_at, updated_at          │
├─────────────────────────────────┤
│ UNIQUE(vendor_id,               │
│        marketplace_product_id)  │
└─────────────────────────────────┘
                │
                │ 1:N
                ▼
┌─────────────────────────────────┐
│ product_translations            │
├─────────────────────────────────┤
│ id (PK)                         │
│ product_id (FK)                 │
│ language                        │
│                                 │
│ # Overridable (NULL = inherit)  │
│ title                           │
│ description                     │
│ short_description               │
│ meta_title                      │
│ meta_description                │
│ url_slug                        │
│                                 │
│ created_at, updated_at          │
├─────────────────────────────────┤
│ UNIQUE(product_id, language)    │
└─────────────────────────────────┘
```

---

## Migration Plan

### Migration 1: Add Product Type and Digital Fields

**File:** `alembic/versions/xxxx_add_product_type_digital_fields.py`

**Changes:**

```sql
-- Create ENUMs
CREATE TYPE product_type_enum AS ENUM ('physical', 'digital', 'service', 'subscription');
CREATE TYPE digital_delivery_enum AS ENUM ('download', 'email', 'in_app', 'streaming', 'license_key');

-- Rename product_type to product_type_raw (keep original feed value).
-- This must run before the new product_type column is added,
-- otherwise ADD COLUMN fails because the name is already taken.
ALTER TABLE marketplace_products RENAME COLUMN product_type TO product_type_raw;

-- Add columns to marketplace_products
ALTER TABLE marketplace_products ADD COLUMN product_type product_type_enum NOT NULL DEFAULT 'physical';
ALTER TABLE marketplace_products ADD COLUMN is_digital BOOLEAN NOT NULL DEFAULT false;
ALTER TABLE marketplace_products ADD COLUMN digital_delivery_method digital_delivery_enum;
ALTER TABLE marketplace_products ADD COLUMN platform VARCHAR;
ALTER TABLE marketplace_products ADD COLUMN region_restrictions JSON;
ALTER TABLE marketplace_products ADD COLUMN license_type VARCHAR;
ALTER TABLE marketplace_products ADD COLUMN source_url VARCHAR;
ALTER TABLE marketplace_products ADD COLUMN attributes JSON;
ALTER TABLE marketplace_products ADD COLUMN additional_images JSON;
ALTER TABLE marketplace_products ADD COLUMN is_active BOOLEAN NOT NULL DEFAULT true;
ALTER TABLE marketplace_products ADD COLUMN price_numeric FLOAT;
ALTER TABLE marketplace_products ADD COLUMN sale_price_numeric FLOAT;

-- Add index
CREATE INDEX idx_mp_product_type ON marketplace_products (product_type, is_digital);
```

**Rollback:**

```sql
DROP INDEX idx_mp_product_type;
ALTER TABLE marketplace_products DROP COLUMN sale_price_numeric;
ALTER TABLE marketplace_products DROP COLUMN price_numeric;
ALTER TABLE marketplace_products DROP COLUMN is_active;
ALTER TABLE marketplace_products DROP COLUMN additional_images;
ALTER TABLE marketplace_products DROP COLUMN attributes;
ALTER TABLE marketplace_products DROP COLUMN source_url;
ALTER TABLE marketplace_products DROP COLUMN license_type;
ALTER TABLE marketplace_products DROP COLUMN region_restrictions;
ALTER TABLE marketplace_products DROP COLUMN platform;
ALTER TABLE marketplace_products DROP COLUMN digital_delivery_method;
ALTER TABLE marketplace_products DROP COLUMN is_digital;

-- Drop the enum column first, then restore the original column name
ALTER TABLE marketplace_products DROP COLUMN product_type;
ALTER TABLE marketplace_products RENAME COLUMN product_type_raw TO product_type;

DROP TYPE digital_delivery_enum;
DROP TYPE product_type_enum;
```

---

### Migration 2: Create Translation Tables

**File:** `alembic/versions/xxxx_create_translation_tables.py`

**Changes:**

```sql
-- Create marketplace_product_translations
CREATE TABLE marketplace_product_translations (
    id SERIAL PRIMARY KEY,
    marketplace_product_id INTEGER NOT NULL REFERENCES marketplace_products(id) ON DELETE CASCADE,
    language VARCHAR(5) NOT NULL,
    title VARCHAR NOT NULL,
    description TEXT,
    short_description VARCHAR(500),
    meta_title VARCHAR(70),
    meta_description VARCHAR(160),
    url_slug VARCHAR(255),
    source_import_id INTEGER,
    source_file VARCHAR,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    CONSTRAINT uq_marketplace_product_translation UNIQUE (marketplace_product_id, language)
);

CREATE INDEX idx_mpt_language ON marketplace_product_translations (language);
CREATE INDEX idx_mpt_mp_id ON marketplace_product_translations (marketplace_product_id);

-- Create product_translations
CREATE TABLE product_translations (
    id SERIAL PRIMARY KEY,
    product_id INTEGER NOT NULL REFERENCES products(id) ON DELETE CASCADE,
    language VARCHAR(5) NOT NULL,
    title VARCHAR,
    description TEXT,
    short_description VARCHAR(500),
    meta_title VARCHAR(70),
    meta_description VARCHAR(160),
    url_slug VARCHAR(255),
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    CONSTRAINT uq_product_translation UNIQUE (product_id, language)
);

CREATE INDEX idx_pt_product_language ON product_translations (product_id, language);
```

**Rollback:**

```sql
DROP TABLE product_translations;
DROP TABLE marketplace_product_translations;
```

---

### Migration 3: Add Override Fields to Products

**File:** `alembic/versions/xxxx_add_product_override_fields.py`

**Changes:**

```sql
-- Rename product_id to vendor_sku
ALTER TABLE products RENAME COLUMN product_id TO vendor_sku;

-- Add new override columns
ALTER TABLE products ADD COLUMN brand VARCHAR;
ALTER TABLE products ADD COLUMN primary_image_url VARCHAR;
ALTER TABLE products ADD COLUMN additional_images JSON;
ALTER TABLE products ADD COLUMN download_url VARCHAR;
ALTER TABLE products ADD COLUMN license_type VARCHAR;
ALTER TABLE products ADD COLUMN fulfillment_email_template VARCHAR;

-- Add index for vendor_sku
CREATE INDEX idx_product_vendor_sku ON products (vendor_id, vendor_sku);
```

**Rollback:**

```sql
DROP INDEX idx_product_vendor_sku;
ALTER TABLE products DROP COLUMN fulfillment_email_template;
ALTER TABLE products DROP COLUMN license_type;
ALTER TABLE products DROP COLUMN download_url;
ALTER TABLE products DROP COLUMN additional_images;
ALTER TABLE products DROP COLUMN primary_image_url;
ALTER TABLE products DROP COLUMN brand;
ALTER TABLE products RENAME COLUMN vendor_sku TO product_id;
```

---

### Migration 4: Data Migration

**File:** `alembic/versions/xxxx_migrate_product_data.py`

**Changes:**

```sql
-- Migrate existing title/description to translations table (default language: 'en')
INSERT INTO marketplace_product_translations (
    marketplace_product_id, language, title, description, created_at, updated_at
)
SELECT id, 'en', title, description, created_at, updated_at
FROM marketplace_products
WHERE title IS NOT NULL
ON CONFLICT (marketplace_product_id, language) DO NOTHING;

-- Parse prices to numeric (handled in Python for complex parsing)
-- This will be done via a Python data migration function
```

**Python Migration Function:**

```python
import re

from sqlalchemy import text


def parse_and_update_prices(connection):
    """Parse price strings to numeric values."""
    # Fetch all rows up front so the result set is not still being read
    # while UPDATEs run on the same connection
    rows = connection.execute(
        text("SELECT id, price, sale_price FROM marketplace_products")
    ).fetchall()

    for row in rows:
        price_numeric = parse_price(row.price)
        sale_price_numeric = parse_price(row.sale_price)

        connection.execute(
            text("""
                UPDATE marketplace_products
                SET price_numeric = :price, sale_price_numeric = :sale_price
                WHERE id = :id
            """),
            {"id": row.id, "price": price_numeric, "sale_price": sale_price_numeric},
        )


def parse_price(price_str: str | None) -> float | None:
    """Parse price string like '19.99 EUR' to float."""
    if not price_str:
        return None

    # Extract the first run of digits and separators
    numbers = re.findall(r'[\d.,]+', str(price_str))
    if numbers:
        # A comma is treated as a decimal separator, so values that use
        # thousands separators ('1,299.99') fail to parse and yield None
        num_str = numbers[0].replace(',', '.')
        try:
            return float(num_str)
        except ValueError:
            pass
    return None
```

**Rollback:**

```sql
-- Data migration is one-way, original columns preserved
-- No rollback needed for data
```

---

## Migration Execution Order

| Order | Migration | Risk Level | Notes |
|-------|-----------|------------|-------|
| 1 | Add product type & digital fields | Low | New columns are nullable or have defaults; renames `product_type` to `product_type_raw` |
| 2 | Create translation tables | Low | New tables, no existing data affected |
| 3 | Add product override fields | Low | New columns are nullable; renames `product_id` to `vendor_sku` |
| 4 | Data migration | Medium | Copies data, original preserved |

---

## Model Layer Updates Required

### MarketplaceProduct Model

```python
# Add to models/database/marketplace_product.py
from enum import Enum

from sqlalchemy import JSON, Boolean, Column, Float, String
from sqlalchemy import Enum as SQLEnum
from sqlalchemy.orm import relationship


class ProductType(str, Enum):
    PHYSICAL = "physical"
    DIGITAL = "digital"
    SERVICE = "service"
    SUBSCRIPTION = "subscription"


class DigitalDeliveryMethod(str, Enum):
    DOWNLOAD = "download"
    EMAIL = "email"
    IN_APP = "in_app"
    STREAMING = "streaming"
    LICENSE_KEY = "license_key"


def _enum_values(enum_cls):
    """Persist enum values ('physical'), not member names ('PHYSICAL').

    SQLAlchemy stores member names by default, which would not match
    the lowercase labels created in Migration 1.
    """
    return [member.value for member in enum_cls]


class MarketplaceProduct(Base, TimestampMixin):
    # ... existing fields ...

    # NEW FIELDS
    # Type names match the ENUMs created in Migration 1
    product_type = Column(
        SQLEnum(ProductType, name="product_type_enum", values_callable=_enum_values),
        default=ProductType.PHYSICAL,
        nullable=False,
    )
    is_digital = Column(Boolean, default=False, nullable=False)
    digital_delivery_method = Column(
        SQLEnum(
            DigitalDeliveryMethod,
            name="digital_delivery_enum",
            values_callable=_enum_values,
        ),
        nullable=True,
    )
    platform = Column(String, nullable=True)
    region_restrictions = Column(JSON, nullable=True)
    license_type = Column(String, nullable=True)
    source_url = Column(String, nullable=True)
    attributes = Column(JSON, nullable=True)
    additional_images = Column(JSON, nullable=True)
    is_active = Column(Boolean, default=True, nullable=False)
    price_numeric = Column(Float, nullable=True)
    sale_price_numeric = Column(Float, nullable=True)
    product_type_raw = Column(String)  # Renamed from product_type

    # NEW RELATIONSHIP
    translations = relationship(
        "MarketplaceProductTranslation",
        back_populates="marketplace_product",
        cascade="all, delete-orphan",
    )

    # Change to one-to-many
    vendor_products = relationship("Product", back_populates="marketplace_product")
```

### MarketplaceProductTranslation Model (NEW)

```python
# models/database/marketplace_product_translation.py
from sqlalchemy import (
    Column, ForeignKey, Index, Integer, String, Text, UniqueConstraint,
)
from sqlalchemy.orm import relationship


class MarketplaceProductTranslation(Base, TimestampMixin):
    __tablename__ = "marketplace_product_translations"

    id = Column(Integer, primary_key=True)
    marketplace_product_id = Column(
        Integer,
        ForeignKey("marketplace_products.id", ondelete="CASCADE"),
        nullable=False,
    )
    language = Column(String(5), nullable=False)
    title = Column(String, nullable=False)
    description = Column(Text, nullable=True)
    short_description = Column(String(500), nullable=True)
    meta_title = Column(String(70), nullable=True)
    meta_description = Column(String(160), nullable=True)
    url_slug = Column(String(255), nullable=True)
    source_import_id = Column(Integer, nullable=True)
    source_file = Column(String, nullable=True)

    marketplace_product = relationship(
        "MarketplaceProduct", back_populates="translations"
    )

    __table_args__ = (
        UniqueConstraint(
            "marketplace_product_id", "language",
            name="uq_marketplace_product_translation",
        ),
        Index("idx_mpt_language", "language"),
        # Mirrors the second index created in Migration 2
        Index("idx_mpt_mp_id", "marketplace_product_id"),
    )
```

### Product Model Updates

```python
# Update models/database/product.py
from sqlalchemy import JSON, Column, String
from sqlalchemy.orm import relationship


class Product(Base, TimestampMixin):
    # ... existing fields ...

    # RENAMED
    vendor_sku = Column(String)  # Was: product_id

    # NEW OVERRIDE FIELDS
    brand = Column(String, nullable=True)
    primary_image_url = Column(String, nullable=True)
    additional_images = Column(JSON, nullable=True)
    download_url = Column(String, nullable=True)
    license_type = Column(String, nullable=True)
    fulfillment_email_template = Column(String, nullable=True)

    # NEW RELATIONSHIP
    translations = relationship(
        "ProductTranslation",
        back_populates="product",
        cascade="all, delete-orphan",
    )

    # OVERRIDABLE FIELDS LIST
    OVERRIDABLE_FIELDS = [
        "price", "sale_price", "currency", "brand", "condition",
        "availability", "primary_image_url", "additional_images",
        "download_url", "license_type",
    ]

    # EFFECTIVE PROPERTIES
    # A NULL override means "inherit from the marketplace product"
    @property
    def effective_price(self) -> float | None:
        if self.price is not None:
            return self.price
        return self.marketplace_product.price_numeric if self.marketplace_product else None

    @property
    def effective_brand(self) -> str | None:
        if self.brand is not None:
            return self.brand
        return self.marketplace_product.brand if self.marketplace_product else None

    # ... other effective_* properties ...

    def get_override_info(self) -> dict:
        """Get all fields with inheritance flags."""
        mp = self.marketplace_product
        return {
            "price": self.effective_price,
            "price_overridden": self.price is not None,
            "price_source": mp.price_numeric if mp else None,
            # ... other fields ...
        }

    def reset_field_to_source(self, field_name: str) -> bool:
        """Reset a field to inherit from marketplace product."""
        if field_name in self.OVERRIDABLE_FIELDS:
            setattr(self, field_name, None)
            return True
        return False
```

### ProductTranslation Model (NEW)

```python
# models/database/product_translation.py
from sqlalchemy import (
    Column, ForeignKey, Index, Integer, String, Text, UniqueConstraint,
)
from sqlalchemy.orm import relationship


class ProductTranslation(Base, TimestampMixin):
    __tablename__ = "product_translations"

    id = Column(Integer, primary_key=True)
    product_id = Column(
        Integer,
        ForeignKey("products.id", ondelete="CASCADE"),
        nullable=False,
    )
    language = Column(String(5), nullable=False)
    title = Column(String, nullable=True)  # NULL = inherit
    description = Column(Text, nullable=True)
    short_description = Column(String(500), nullable=True)
    meta_title = Column(String(70), nullable=True)
    meta_description = Column(String(160), nullable=True)
    url_slug = Column(String(255), nullable=True)

    product = relationship("Product", back_populates="translations")

    OVERRIDABLE_FIELDS = [
        "title", "description", "short_description",
        "meta_title", "meta_description", "url_slug",
    ]

    def get_effective_title(self) -> str | None:
        if self.title is not None:
            return self.title
        return self._get_marketplace_translation_field("title")

    def _get_marketplace_translation_field(self, field: str) -> str | None:
        # Fall back to the marketplace translation in the same language
        mp = self.product.marketplace_product
        if mp:
            for t in mp.translations:
                if t.language == self.language:
                    return getattr(t, field, None)
        return None

    __table_args__ = (
        UniqueConstraint("product_id", "language", name="uq_product_translation"),
        Index("idx_pt_product_language", "product_id", "language"),
    )
```

---

## Code Updates Required

### Files to Update

| File | Changes |
|------|---------|
| `models/database/__init__.py` | Export new models |
| `models/database/marketplace_product.py` | Add new fields, enums, relationship |
| `models/database/product.py` | Rename column, add override fields/properties |
| `models/schema/product.py` | Update Pydantic schemas |
| `app/services/product_service.py` | Add reset logic, translation support |
| `app/utils/csv_processor.py` | Update to use translations |
| `tests/` | Update all product-related tests |

### Import Service Updates

The CSV processor needs to:
1. Accept a language parameter
2. Create/update translations instead of writing title/description directly
3. Parse prices to numeric
4. Detect digital products

---

## Open Questions (Requires Decision)

### 1. Keep Original Title/Description Columns?

**Option A: Keep as cache/fallback**
- Pros: Simpler migration, backward compatible
- Cons: Data duplication

**Option B: Remove after migration**
- Pros: Cleaner schema
- Cons: Breaking change, requires code updates

**Recommendation:** Option A for Phase 1, deprecate later

---

### 2. Default Language for Existing Data?

Current data appears to be English. Confirm before migration.

**Default:** `'en'`

---

### 3. Price Parsing Strategy?

**Option A: Parse during migration**
- Run Python migration to parse all existing prices

**Option B: Parse on next import only**
- Leave existing data, parse going forward

**Recommendation:** Option A - parse during migration for consistency

---

## Testing Checklist

Before running migrations in production:

- [ ] Run migrations on local database
- [ ] Run migrations on staging/test database with production data copy
- [ ] Verify data integrity after migration
- [ ] Run full test suite
- [ ] Test import with multi-language CSV
- [ ] Test override pattern (set override, reset to source)
- [ ] Test translation inheritance
- [ ] Performance test with large dataset

---

## Rollback Plan

If issues occur:

1. Migrations are designed to be reversible
2. Each migration has an explicit downgrade function
3. Original data is preserved (title/description columns kept)
4. Run `alembic downgrade -1` once per migration to revert them in reverse order

---

## Next Steps

1. [ ] Confirm open questions above
2. [ ] Create Alembic migration files
3. [ ] Update SQLAlchemy models
4. [ ] Update Pydantic schemas
5. [ ] Update services
6. [ ] Update tests
7. [ ] Run on staging environment
8. [ ] Deploy to production
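
---

## Appendix: Price Parsing Sketch

The price parsing step in Migration 4 (Open Question 3) may meet feed values that mix decimal and thousands separators, such as `1,299.99 EUR` or `1.299,99 EUR`. The following is a minimal sketch of one way to handle both, not the project's implementation. The function name `parse_price_decimal` and the separator heuristics are assumptions made for illustration, and it returns `Decimal` rather than the `Float` used by `price_numeric`, so the result would need `float()` applied if the column type stays as planned.

```python
import re
from decimal import Decimal, InvalidOperation

# First run of digits and separators that starts with a digit
_NUMBER_RE = re.compile(r"\d[\d.,]*")


def parse_price_decimal(price_str):
    """Parse a raw feed price such as '19.99 EUR' into a Decimal, or None.

    Hypothetical helper: the heuristics below are assumptions about feed
    formats, not rules taken from the existing importer.
    """
    if not price_str:
        return None

    match = _NUMBER_RE.search(str(price_str))
    if match is None:
        return None

    # Drop trailing punctuation, e.g. the full stop in "Price: 19.99."
    num = match.group(0).rstrip(".,")
    last_dot = num.rfind(".")
    last_comma = num.rfind(",")

    if last_dot != -1 and last_comma != -1:
        # Both separators present: the rightmost one is the decimal mark
        if last_comma > last_dot:
            num = num.replace(".", "").replace(",", ".")
        else:
            num = num.replace(",", "")
    elif last_comma != -1:
        # Only commas: a single comma followed by 1-2 digits is read as a
        # decimal mark ('19,99'); anything else as thousands separators
        head, _, tail = num.rpartition(",")
        if num.count(",") == 1 and len(tail) in (1, 2):
            num = f"{head}.{tail}"
        else:
            num = num.replace(",", "")
    elif num.count(".") > 1:
        # Several dots ('1.299.000') can only be thousands separators
        num = num.replace(".", "")

    try:
        return Decimal(num)
    except InvalidOperation:
        return None
```

Under these heuristics `"19.99 EUR"` and `"19,99"` both resolve to `Decimal("19.99")`, while `"1,299.99 EUR"` and `"1.299,99 EUR"` both resolve to `Decimal("1299.99")`. A value such as `"1.299"` stays ambiguous and is read as a decimal number, so any rows in that shape are worth reviewing by hand on the staging copy.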