# Ecommerce Backend API v2.0

A robust, production-ready FastAPI backend for ecommerce product catalog and inventory management with advanced CSV import capabilities.
## Key Improvements from v1

### Architecture Improvements
- Modular Design: Separated concerns into utility modules, middleware, and models
- Database Optimization: Added proper indexing strategy and foreign key relationships
- Connection Pooling: PostgreSQL support with connection pooling for production scalability
- Background Processing: Asynchronous CSV import with job tracking
### Security Enhancements
- JWT Authentication: Token-based authentication with role-based access control
- Rate Limiting: Sliding window rate limiter to prevent API abuse
- Input Validation: Enhanced Pydantic models with comprehensive validation
### Performance Optimizations
- Batch Processing: CSV imports processed in configurable batches
- Database Indexes: Strategic indexing for common query patterns
- Streaming Export: Memory-efficient CSV export for large datasets
- Caching Ready: Architecture supports Redis integration
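The streaming export mentioned above can be sketched as a plain Python generator that emits one CSV chunk at a time; in the app itself such a generator would be wrapped in FastAPI's `StreamingResponse` with `media_type="text/csv"`. The function and field names here are illustrative, not the actual implementation:

```python
import csv
import io
from typing import Iterable, Iterator


def _drain(buffer: io.StringIO) -> str:
    """Return the buffer's contents and reset it for the next chunk."""
    value = buffer.getvalue()
    buffer.seek(0)
    buffer.truncate(0)
    return value


def stream_products_csv(products: Iterable[dict]) -> Iterator[str]:
    """Yield a CSV export chunk by chunk, so a large export never sits in memory."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["product_id", "title", "gtin", "price"],  # illustrative columns
        extrasaction="ignore",
    )
    writer.writeheader()
    yield _drain(buffer)
    for product in products:
        writer.writerow(product)
        yield _drain(buffer)
```

Because each row is drained as soon as it is written, memory use stays constant regardless of catalog size.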
### Data Processing
- Robust GTIN Handling: Centralized GTIN normalization and validation
- Multi-currency Support: Advanced price parsing with currency extraction
- International Content: Multi-encoding CSV support for global data
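The centralized GTIN handling lives in `utils/data_processing.py`; as a rough sketch of the idea (the real rules may differ), normalization strips non-digits, zero-pads valid lengths to GTIN-14, and verifies the GS1 check digit:

```python
import re
from typing import Optional


def normalize_gtin(raw: str) -> Optional[str]:
    """Normalize a GTIN to 14 digits and validate its GS1 check digit.

    Returns the zero-padded GTIN-14, or None if the input is invalid.
    """
    digits = re.sub(r"\D", "", raw or "")
    if len(digits) not in (8, 12, 13, 14):
        return None
    gtin = digits.zfill(14)
    # GS1 check digit: weight digits 3,1,3,1,... from the right (excluding the
    # final check digit), then round the sum up to the next multiple of 10.
    body, check = gtin[:-1], int(gtin[-1])
    total = sum(
        int(d) * (3 if i % 2 == 0 else 1)
        for i, d in enumerate(reversed(body))
    )
    expected = (10 - total % 10) % 10
    return gtin if check == expected else None
```

Zero-padding does not disturb the check digit, since the added leading zeros contribute nothing to the weighted sum.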
## Project Structure

```text
ecommerce_api/
├── main.py                    # FastAPI application entry point
├── models/
│   ├── database_models.py     # SQLAlchemy ORM models
│   └── api_models.py          # Pydantic API models
├── utils/
│   ├── data_processing.py     # GTIN and price processing utilities
│   ├── csv_processor.py       # CSV import/export handling
│   └── database.py            # Database configuration
├── middleware/
│   ├── auth.py                # JWT authentication
│   ├── rate_limiter.py        # Rate limiting implementation
│   └── logging_middleware.py  # Request/response logging
├── config/
│   └── settings.py            # Application configuration
├── tests/
│   └── test_utils.py          # Unit tests
├── alembic/                   # Database migrations
├── docker-compose.yml         # Docker deployment
├── Dockerfile                 # Container definition
├── requirements.txt           # Python dependencies
└── README.md                  # This file
```
## Quick Start

### 1. Development Setup

```bash
# Clone the repository
git clone <repository-url>
cd ecommerce-api

# Set up a virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Set up environment variables
cp .env.example .env
# Edit .env with your database configuration
```
### 2. Database Setup

**For SQLite (development):**

```bash
# Your .env file should contain:
# DATABASE_URL=sqlite:///./ecommerce.db

# Initialize Alembic (only needed once)
alembic init alembic
# Update alembic/env.py with the provided configuration (see below)

# Create the initial migration
alembic revision --autogenerate -m "Initial migration"

# Apply migrations
alembic upgrade head
```

**For PostgreSQL (production):**

```bash
# 1. Create the PostgreSQL database
createdb ecommerce_db

# 2. Update the .env file:
# DATABASE_URL=postgresql://username:password@localhost:5432/ecommerce_db

# 3. Initialize and run migrations
alembic init alembic
# Update alembic/env.py and alembic.ini (see configuration section)
alembic revision --autogenerate -m "Initial migration"
alembic upgrade head
```
**Important Alembic configuration:** after running `alembic init alembic`, you must update two files.

**1. Update `alembic/env.py`:**
```python
from logging.config import fileConfig

from sqlalchemy import engine_from_config, pool
from alembic import context

import os
import sys

# Add the project root to the Python path so app modules are importable
sys.path.append(os.path.dirname(os.path.dirname(__file__)))

from models.database_models import Base
from config.settings import settings

# Alembic Config object
config = context.config

# Override sqlalchemy.url with our application settings
config.set_main_option("sqlalchemy.url", settings.database_url)

if config.config_file_name is not None:
    fileConfig(config.config_file_name)

target_metadata = Base.metadata


def run_migrations_offline() -> None:
    """Run migrations in 'offline' mode."""
    url = config.get_main_option("sqlalchemy.url")
    context.configure(
        url=url,
        target_metadata=target_metadata,
        literal_binds=True,
        dialect_opts={"paramstyle": "named"},
    )

    with context.begin_transaction():
        context.run_migrations()


def run_migrations_online() -> None:
    """Run migrations in 'online' mode."""
    connectable = engine_from_config(
        config.get_section(config.config_ini_section, {}),
        prefix="sqlalchemy.",
        poolclass=pool.NullPool,
    )

    with connectable.connect() as connection:
        context.configure(
            connection=connection, target_metadata=target_metadata
        )

        with context.begin_transaction():
            context.run_migrations()


if context.is_offline_mode():
    run_migrations_offline()
else:
    run_migrations_online()
```
**2. Update the `sqlalchemy.url` line in `alembic.ini`:**

```ini
# For SQLite:
sqlalchemy.url = sqlite:///./ecommerce.db

# For PostgreSQL:
sqlalchemy.url = postgresql://username:password@localhost:5432/ecommerce_db
```
### 3. Configuration

Edit the `.env` file with your settings:

```env
DATABASE_URL=postgresql://user:password@localhost:5432/ecommerce_db
JWT_SECRET_KEY=your-super-secret-key-change-in-production
API_HOST=0.0.0.0
API_PORT=8000
DEBUG=False
```
### 4. Run the Development Server

```bash
# Using make
make dev

# Or directly
uvicorn main:app --reload --host 0.0.0.0 --port 8000
```

### 5. Docker Deployment

```bash
# Start all services
docker-compose up -d

# View logs
docker-compose logs -f api

# Stop services
docker-compose down
```
## API Endpoints

### Authentication

- `POST /auth/login` - Get a JWT token
- `POST /auth/refresh` - Refresh a token

### Products

- `GET /products` - List products with filtering and search
- `POST /products` - Create a new product
- `GET /products/{product_id}` - Get a product with stock info
- `PUT /products/{product_id}` - Update a product
- `DELETE /products/{product_id}` - Delete a product and its associated stock

### CSV Operations

- `POST /import-csv` - Start a background CSV import
- `GET /import-status/{job_id}` - Check import job status
- `GET /export-csv` - Export products as CSV (streaming)

### Stock Management

- `POST /stock` - Set exact stock quantity
- `POST /stock/add` - Add to existing stock
- `POST /stock/remove` - Remove from stock
- `GET /stock/{gtin}` - Get a stock summary by GTIN
- `GET /stock/{gtin}/total` - Get total stock for a GTIN
- `GET /stock` - List all stock entries with filtering
- `PUT /stock/{stock_id}` - Update a stock entry
- `DELETE /stock/{stock_id}` - Delete a stock entry

### System

- `GET /` - API information
- `GET /health` - Health check
- `GET /stats` - System statistics
## Advanced Features

### Background CSV Import

The API supports background processing of large CSV files:
```python
import requests

BASE_URL = "http://localhost:8000"  # adjust to your deployment

# Start the import
response = requests.post(f"{BASE_URL}/import-csv", json={
    "url": "https://example.com/products.csv",
    "batch_size": 1000,
})
job_id = response.json()["job_id"]

# Check the status
status = requests.get(f"{BASE_URL}/import-status/{job_id}").json()
```
### Rate Limiting

Built-in rate limiting protects against API abuse:

- Default: 100 requests per hour per client
- CSV imports: 10 per hour
- Configurable per endpoint
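The sliding-window limiter in `middleware/rate_limiter.py` can be pictured along these lines (a simplified, in-memory sketch; the class and parameter names are illustrative, and a production deployment would back this with Redis):

```python
import time
from collections import defaultdict, deque
from typing import Deque, Dict, Optional


class SlidingWindowRateLimiter:
    """Allow at most `limit` requests per client within a rolling time window."""

    def __init__(self, limit: int = 100, window_seconds: float = 3600.0):
        self.limit = limit
        self.window = window_seconds
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client_id: str, now: Optional[float] = None) -> bool:
        """Record a request if the client is under its limit; return the verdict."""
        now = time.monotonic() if now is None else now
        hits = self._hits[client_id]
        # Drop timestamps that have slid out of the window
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

Unlike a fixed-window counter, this never allows a burst of `2 * limit` requests straddling a window boundary, because each request is checked against exactly the last `window_seconds` of history.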
### Search and Filtering

Advanced product search capabilities:

```http
# Search in title and description
GET /products?search=laptop

# Filter by brand and category
GET /products?brand=Apple&category=Electronics

# Combine filters (URL-encode values containing spaces)
GET /products?brand=Samsung&availability=in%20stock&search=phone
```
### Data Validation

Comprehensive validation for all inputs:

- GTIN format validation and normalization
- Price parsing with currency extraction
- Required field validation
- Type conversion and sanitization
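Price parsing with currency extraction might look roughly like the following (a simplified sketch: the currency table is a hypothetical subset, and the decimal-separator heuristic here deliberately glosses over ambiguous cases like a bare `1,299`):

```python
import re
from decimal import Decimal, InvalidOperation
from typing import Optional, Tuple

# Hypothetical subset of currency markers; the real mapping would be larger.
_CURRENCIES = {
    "€": "EUR", "$": "USD", "£": "GBP",
    "EUR": "EUR", "USD": "USD", "GBP": "GBP",
}


def parse_price(raw: str) -> Optional[Tuple[Decimal, Optional[str]]]:
    """Extract (amount, currency) from strings like '1.299,00 EUR' or '$19.99'."""
    if not raw:
        return None
    currency = next((code for sym, code in _CURRENCIES.items() if sym in raw), None)
    digits = re.sub(r"[^\d.,]", "", raw)
    if "," in digits and "." in digits:
        # The last separator is the decimal point; the other is grouping
        if digits.rfind(",") > digits.rfind("."):
            digits = digits.replace(".", "").replace(",", ".")
        else:
            digits = digits.replace(",", "")
    else:
        digits = digits.replace(",", ".")
    try:
        return Decimal(digits), currency
    except InvalidOperation:
        return None
```

Using `Decimal` rather than `float` avoids binary-rounding surprises when the amounts are later summed or compared.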
## Database Schema

### Products Table

- Full product catalog with Google Shopping compatibility
- Indexed fields: `gtin`, `brand`, `google_product_category`, `availability`
- Timestamps for creation and updates
### Stock Table
- Location-based inventory tracking
- GTIN-based product linking
- Unique constraint on GTIN+location combinations
- Composite indexes for efficient queries
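In SQLAlchemy terms, the stock table's constraints could be declared along these lines (column names and index choices here are illustrative, not necessarily those in `models/database_models.py`):

```python
from sqlalchemy import (
    Column, DateTime, Index, Integer, String, UniqueConstraint, func,
)
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class Stock(Base):
    """Location-based inventory keyed by GTIN."""

    __tablename__ = "stock"

    id = Column(Integer, primary_key=True)
    gtin = Column(String(14), nullable=False, index=True)
    location = Column(String(100), nullable=False)
    quantity = Column(Integer, nullable=False, default=0)
    updated_at = Column(DateTime, server_default=func.now(), onupdate=func.now())

    __table_args__ = (
        # One row per product per location
        UniqueConstraint("gtin", "location", name="uq_stock_gtin_location"),
        # Composite index for "all stock at location X" style queries
        Index("ix_stock_location_gtin", "location", "gtin"),
    )
```

The unique constraint means stock updates can rely on the database to reject duplicate GTIN+location rows instead of re-checking in application code.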
### Import Jobs Table
- Track background import operations
- Status monitoring and error handling
- Performance metrics
## Development

### Running Tests

```bash
# All tests
make test

# A specific test file
pytest tests/test_utils.py -v

# With coverage
pytest --cov=. tests/
```
### Code Quality

```bash
# Format code
make format

# Lint code
make lint

# Type checking
mypy .
```
### Database Migrations

**Creating migrations:**

```bash
# After changing models/database_models.py, create a new migration
alembic revision --autogenerate -m "Description of changes"

# Review the generated migration file in alembic/versions/,
# edit if needed, then apply:
alembic upgrade head
```
**Common migration commands:**

```bash
# Check current migration status
alembic current

# View migration history
alembic history

# Upgrade to a specific revision
alembic upgrade <revision_id>

# Downgrade one step
alembic downgrade -1

# Downgrade to a specific revision
alembic downgrade <revision_id>

# Reset the database (WARNING: destroys all data)
alembic downgrade base
alembic upgrade head
```
**Troubleshooting Alembic:**

```bash
# If you get template errors, reinitialize Alembic:
rm -rf alembic/
alembic init alembic
# Then update alembic/env.py and alembic.ini as shown above

# If migrations conflict, you may need to merge:
alembic merge -m "Merge migrations" <rev1> <rev2>
```
## Production Deployment

### Environment Setup
- Database: PostgreSQL 13+ recommended
- Cache: Redis for session storage and rate limiting
- Reverse Proxy: Nginx for SSL termination and load balancing
- Monitoring: Consider adding Prometheus metrics
### Security Checklist
- Change default JWT secret key
- Set up HTTPS/TLS
- Configure CORS appropriately
- Set up database connection limits
- Enable request logging
- Configure rate limiting per your needs
- Set up monitoring and alerting
### Docker Production

```yaml
# docker-compose.prod.yml
version: '3.8'

services:
  api:
    build: .
    environment:
      - DEBUG=False
      - DATABASE_URL=${DATABASE_URL}
      - JWT_SECRET_KEY=${JWT_SECRET_KEY}
    restart: unless-stopped
    # Add your production configuration
```
## Performance Considerations

### Database Optimization

- Use PostgreSQL for production workloads
- Monitor query performance with `EXPLAIN ANALYZE`
- Consider read replicas for read-heavy workloads
- Run regular `VACUUM` and `ANALYZE` operations
### CSV Import Performance
- Batch size affects memory usage vs. speed
- Larger batches = faster import but more memory
- Monitor import job status for optimization
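The memory/speed trade-off comes from how rows are grouped before hitting the database. A minimal sketch of the batching idea (the helper name is illustrative):

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(rows: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Group an iterable into lists of at most `batch_size` items.

    A CSV import can insert each list in one round-trip: larger batches mean
    fewer round-trips but more rows held in memory at once.
    """
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch
```

Because `islice` pulls lazily from the source iterator, only one batch of rows is ever materialized, no matter how large the CSV file is.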
### API Response Times
- Database indexes are crucial for filtering
- Use pagination for large result sets
- Consider caching frequently accessed data
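One way pagination parameters can be translated into query limits, sketched as a small helper (names and defaults are illustrative, not the API's actual parameters):

```python
from typing import Tuple


def pagination_params(
    page: int, page_size: int, max_page_size: int = 100
) -> Tuple[int, int]:
    """Translate 1-based page/page_size query params into (limit, offset).

    page_size is clamped so a single request cannot demand an unbounded
    result set, which keeps response times predictable.
    """
    page = max(page, 1)
    page_size = min(max(page_size, 1), max_page_size)
    return page_size, (page - 1) * page_size
```

The returned pair plugs directly into a SQLAlchemy query as `.limit(limit).offset(offset)`.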
## Troubleshooting

### Common Issues

**CSV import failures**

- Check the encoding and try different separators
- Validate required columns: `product_id`, `title`
- Monitor import job status for specific errors

**Database connection issues**

- Verify the `DATABASE_URL` format
- Check connection limits
- Ensure the database server is accessible

**Authentication problems**

- Verify `JWT_SECRET_KEY` is set
- Check token expiration settings
- Validate the token format
### Logging

Logs are structured and include:
- Request/response times
- Error details with stack traces
- Import job progress
- Rate limiting events
```bash
# View live logs
tail -f logs/app.log

# Docker logs
docker-compose logs -f api
```
## Contributing

1. Fork the repository
2. Create a feature branch
3. Make changes with tests
4. Run quality checks: `make lint test`
5. Submit a pull request
## License

This project is licensed under the MIT License - see the LICENSE file for details.
## Support

For issues and questions:
- Check the troubleshooting section
- Review existing GitHub issues
- Create a new issue with detailed information
- For security issues, contact maintainers directly