# Ecommerce Backend API v2.0

A robust, production-ready FastAPI backend for ecommerce product catalog and inventory management with advanced CSV import capabilities.

## Key Improvements from v1

### Architecture Improvements

- **Modular Design**: Separated concerns into utility modules, middleware, and models
- **Database Optimization**: Added a proper indexing strategy and foreign key relationships
- **Connection Pooling**: PostgreSQL support with connection pooling for production scalability
- **Background Processing**: Asynchronous CSV import with job tracking

### Security Enhancements

- **JWT Authentication**: Token-based authentication with role-based access control
- **Rate Limiting**: Sliding-window rate limiter to prevent API abuse
- **Input Validation**: Enhanced Pydantic models with comprehensive validation

### Performance Optimizations

- **Batch Processing**: CSV imports processed in configurable batches
- **Database Indexes**: Strategic indexing for common query patterns
- **Streaming Export**: Memory-efficient CSV export for large datasets
- **Caching Ready**: Architecture supports Redis integration

### Data Processing

- **Robust GTIN Handling**: Centralized GTIN normalization and validation
- **Multi-currency Support**: Advanced price parsing with currency extraction
- **International Content**: Multi-encoding CSV support for global data

## Project Structure

```
ecommerce_api/
├── main.py                    # FastAPI application entry point
├── models/
│   ├── database_models.py     # SQLAlchemy ORM models
│   └── api_models.py          # Pydantic API models
├── utils/
│   ├── data_processing.py     # GTIN and price processing utilities
│   ├── csv_processor.py       # CSV import/export handling
│   └── database.py            # Database configuration
├── middleware/
│   ├── auth.py                # JWT authentication
│   ├── rate_limiter.py        # Rate limiting implementation
│   └── logging_middleware.py  # Request/response logging
├── config/
│   └── settings.py            # Application configuration
├── tests/
│   └── test_utils.py          # Unit tests
├── alembic/                   # Database migrations
├── docker-compose.yml         # Docker deployment
├── Dockerfile                 # Container definition
├── requirements.txt           # Python dependencies
└── README.md                  # This file
```

## Quick Start

### 1. Development Setup

```bash
# Clone the repository
git clone <repository-url>
cd ecommerce-api

# Set up a virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Set up environment variables
cp .env.example .env
# Edit .env with your database configuration
```

### 2. Database Setup

**For SQLite (development):**

```bash
# Your .env file should have:
# DATABASE_URL=sqlite:///./ecommerce.db

# Initialize Alembic (only needed once)
alembic init alembic
# Update alembic/env.py with the provided configuration (see below)

# Create the initial migration
alembic revision --autogenerate -m "Initial migration"

# Apply migrations
alembic upgrade head
```

**For PostgreSQL (production):**

```bash
# 1. Create the PostgreSQL database
createdb ecommerce_db

# 2. Update the .env file:
# DATABASE_URL=postgresql://username:password@localhost:5432/ecommerce_db

# 3. Initialize and run migrations
alembic init alembic
# Update alembic/env.py and alembic.ini (see configuration section)
alembic revision --autogenerate -m "Initial migration"
alembic upgrade head
```

**Important Alembic Configuration:**

After running `alembic init alembic`, you must update two files:
**1. Update `alembic/env.py`:**

```python
from logging.config import fileConfig

from sqlalchemy import engine_from_config, pool

from alembic import context

import os
import sys

# Add the project directory to the Python path
sys.path.append(os.path.dirname(os.path.dirname(__file__)))

from models.database_models import Base
from config.settings import settings

# Alembic Config object
config = context.config

# Override sqlalchemy.url with our settings
config.set_main_option("sqlalchemy.url", settings.database_url)

if config.config_file_name is not None:
    fileConfig(config.config_file_name)

target_metadata = Base.metadata


def run_migrations_offline() -> None:
    """Run migrations in 'offline' mode."""
    url = config.get_main_option("sqlalchemy.url")
    context.configure(
        url=url,
        target_metadata=target_metadata,
        literal_binds=True,
        dialect_opts={"paramstyle": "named"},
    )
    with context.begin_transaction():
        context.run_migrations()


def run_migrations_online() -> None:
    """Run migrations in 'online' mode."""
    connectable = engine_from_config(
        config.get_section(config.config_ini_section, {}),
        prefix="sqlalchemy.",
        poolclass=pool.NullPool,
    )
    with connectable.connect() as connection:
        context.configure(connection=connection, target_metadata=target_metadata)
        with context.begin_transaction():
            context.run_migrations()


if context.is_offline_mode():
    run_migrations_offline()
else:
    run_migrations_online()
```

**2. Update the `sqlalchemy.url` line in `alembic.ini`:**

```ini
# For SQLite:
sqlalchemy.url = sqlite:///./ecommerce.db

# For PostgreSQL:
sqlalchemy.url = postgresql://username:password@localhost:5432/ecommerce_db
```

### 3. Configuration

Edit the `.env` file with your settings:

```env
DATABASE_URL=postgresql://user:password@localhost:5432/ecommerce_db
JWT_SECRET_KEY=your-super-secret-key-change-in-production
API_HOST=0.0.0.0
API_PORT=8000
DEBUG=False
```

### 4. Run the Development Server

```bash
# Using make
make dev

# Or directly
uvicorn main:app --reload --host 0.0.0.0 --port 8000
```
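The project's `config/settings.py` is not reproduced in this README. As an illustration only, a minimal stdlib sketch of how the `.env`-style variables above might be surfaced to the app could look like the following (the real project may use `pydantic` settings instead; the field names mirror the variables shown above):

```python
import os
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Settings:
    """Hypothetical sketch: application settings read from environment
    variables, with safe development defaults."""

    database_url: str = field(
        default_factory=lambda: os.environ.get("DATABASE_URL", "sqlite:///./ecommerce.db")
    )
    jwt_secret_key: str = field(
        default_factory=lambda: os.environ.get("JWT_SECRET_KEY", "change-me-in-production")
    )
    api_host: str = field(default_factory=lambda: os.environ.get("API_HOST", "0.0.0.0"))
    api_port: int = field(default_factory=lambda: int(os.environ.get("API_PORT", "8000")))
    debug: bool = field(
        default_factory=lambda: os.environ.get("DEBUG", "False").lower() in ("1", "true", "yes")
    )


# Imported as `from config.settings import settings` in alembic/env.py above
settings = Settings()
```

Reading the environment lazily via `default_factory` means a fresh `Settings()` instance always reflects the current environment, which is convenient in tests.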
### 5. Docker Deployment

```bash
# Start all services
docker-compose up -d

# View logs
docker-compose logs -f api

# Stop services
docker-compose down
```

## API Endpoints

### Authentication

- `POST /auth/login` - Get a JWT token
- `POST /auth/refresh` - Refresh a token

### Products

- `GET /products` - List products with filtering and search
- `POST /products` - Create a new product
- `GET /products/{product_id}` - Get a product with stock info
- `PUT /products/{product_id}` - Update a product
- `DELETE /products/{product_id}` - Delete a product and its associated stock

### CSV Operations

- `POST /import-csv` - Start a background CSV import
- `GET /import-status/{job_id}` - Check import job status
- `GET /export-csv` - Export products as CSV (streaming)

### Stock Management

- `POST /stock` - Set an exact stock quantity
- `POST /stock/add` - Add to existing stock
- `POST /stock/remove` - Remove from stock
- `GET /stock/{gtin}` - Get a stock summary by GTIN
- `GET /stock/{gtin}/total` - Get total stock for a GTIN
- `GET /stock` - List all stock entries with filtering
- `PUT /stock/{stock_id}` - Update a stock entry
- `DELETE /stock/{stock_id}` - Delete a stock entry

### System

- `GET /` - API information
- `GET /health` - Health check
- `GET /stats` - System statistics

## Advanced Features

### Background CSV Import

The API supports background processing of large CSV files:

```python
# Start an import
response = requests.post('/import-csv', json={
    'url': 'https://example.com/products.csv',
    'batch_size': 1000
})
job_id = response.json()['job_id']

# Check status
status = requests.get(f'/import-status/{job_id}')
```

### Rate Limiting

Built-in rate limiting protects against API abuse:

- Default: 100 requests per hour per client
- CSV imports: 10 per hour
- Configurable per endpoint

### Search and Filtering

Advanced product search capabilities:

```bash
# Search in title and description
GET /products?search=laptop

# Filter by brand and category
GET /products?brand=Apple&category=Electronics

# Combine filters
GET /products?brand=Samsung&availability=in+stock&search=phone
```
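The sliding-window limiter described under Rate Limiting lives in `middleware/rate_limiter.py`, which is not reproduced here. As an illustration only, a minimal in-memory sliding window could be sketched like this (class and method names are hypothetical):

```python
import time
from collections import defaultdict, deque
from typing import Deque, Dict, Optional


class SlidingWindowRateLimiter:
    """Hypothetical sketch of a sliding-window rate limiter.

    Keeps a per-client deque of request timestamps and allows a request
    only if fewer than `max_requests` fall inside the trailing window.
    """

    def __init__(self, max_requests: int = 100, window_seconds: int = 3600) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client_id: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        hits = self._hits[client_id]
        # Evict timestamps that have slid out of the window
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the limit; caller should return HTTP 429
        hits.append(now)
        return True
```

A production deployment would typically back this with Redis so the window is shared across workers, which is what the "Caching Ready" note above anticipates.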
### Data Validation

Comprehensive validation for all inputs:

- GTIN format validation and normalization
- Price parsing with currency extraction
- Required field validation
- Type conversion and sanitization

## Database Schema

### Products Table

- Full product catalog with Google Shopping compatibility
- Indexed fields: `gtin`, `brand`, `google_product_category`, `availability`
- Timestamps for creation and updates

### Stock Table

- Location-based inventory tracking
- GTIN-based product linking
- Unique constraint on GTIN+location combinations
- Composite indexes for efficient queries

### Import Jobs Table

- Tracks background import operations
- Status monitoring and error handling
- Performance metrics

## Development

### Running Tests

```bash
# All tests
make test

# Specific test file
pytest tests/test_utils.py -v

# With coverage
pytest --cov=. tests/
```

### Code Quality

```bash
# Format code
make format

# Lint code
make lint

# Type checking
mypy .
```
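The GTIN normalization mentioned under Data Validation is centralized in `utils/data_processing.py`, which is not shown in this README. Assuming the standard GS1 check-digit rule, a sketch of the usual approach (zero-pad to 14 digits, then validate) might look like this; the function name and behavior are illustrative, not the project's actual implementation:

```python
def normalize_gtin(raw: str) -> str:
    """Hypothetical sketch: zero-pad a GTIN-8/12/13/14 to 14 digits and
    validate its GS1 check digit, raising ValueError on bad input."""
    digits = "".join(ch for ch in raw if ch.isdigit())
    if len(digits) not in (8, 12, 13, 14):
        raise ValueError(f"invalid GTIN length: {raw!r}")
    # Leading zeros do not change the check digit, so pad first
    gtin = digits.zfill(14)
    # GS1 rule for a 14-digit code: weight digits 3,1,3,1,... from the left,
    # excluding the final (check) digit
    total = sum(int(d) * (3 if i % 2 == 0 else 1) for i, d in enumerate(gtin[:13]))
    if (10 - total % 10) % 10 != int(gtin[13]):
        raise ValueError(f"GTIN check digit mismatch: {raw!r}")
    return gtin
```

Centralizing this in one helper keeps CSV import, the stock endpoints, and validation all agreeing on a single canonical 14-digit form.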
### Database Migrations

**Creating migrations:**

```bash
# After changing models/database_models.py, create a new migration
alembic revision --autogenerate -m "Description of changes"

# Review the generated migration file in alembic/versions/
# Edit it if needed, then apply:
alembic upgrade head
```

**Common migration commands:**

```bash
# Check current migration status
alembic current

# View migration history
alembic history

# Upgrade to a specific revision
alembic upgrade <revision_id>

# Downgrade one step
alembic downgrade -1

# Downgrade to a specific revision
alembic downgrade <revision_id>

# Reset the database (WARNING: destroys all data)
alembic downgrade base
alembic upgrade head
```

**Troubleshooting Alembic:**

```bash
# If you get template errors, reinitialize Alembic:
rm -rf alembic/
alembic init alembic
# Then update alembic/env.py and alembic.ini as shown above

# If migration branches conflict, you may need to merge them:
alembic merge -m "Merge migrations" heads
```

## Production Deployment

### Environment Setup

1. **Database**: PostgreSQL 13+ recommended
2. **Cache**: Redis for session storage and rate limiting
3. **Reverse Proxy**: Nginx for SSL termination and load balancing
4. **Monitoring**: Consider adding Prometheus metrics

### Security Checklist

- [ ] Change the default JWT secret key
- [ ] Set up HTTPS/TLS
- [ ] Configure CORS appropriately
- [ ] Set up database connection limits
- [ ] Enable request logging
- [ ] Configure rate limiting for your needs
- [ ] Set up monitoring and alerting

### Docker Production

```yaml
# docker-compose.prod.yml
version: '3.8'
services:
  api:
    build: .
    environment:
      - DEBUG=False
      - DATABASE_URL=${DATABASE_URL}
      - JWT_SECRET_KEY=${JWT_SECRET_KEY}
    restart: unless-stopped
# Add your production configuration
```

## Performance Considerations

### Database Optimization

- Use PostgreSQL for production workloads
- Monitor query performance with `EXPLAIN ANALYZE`
- Consider read replicas for read-heavy workloads
- Run regular `VACUUM` and `ANALYZE` operations

### CSV Import Performance

- Batch size trades memory against speed: larger batches import faster but use more memory
- Monitor import job status to tune the batch size

### API Response Times

- Database indexes are crucial for filtering
- Use pagination for large result sets
- Consider caching frequently accessed data

## Troubleshooting

### Common Issues

1. **CSV import failures**
   - Check the file encoding and try different separators
   - Validate that the required columns are present: `product_id`, `title`
   - Monitor the import job status for specific errors

2. **Database connection issues**
   - Verify the DATABASE_URL format
   - Check connection limits
   - Ensure the database server is accessible

3. **Authentication problems**
   - Verify JWT_SECRET_KEY is set
   - Check token expiration settings
   - Validate the token format

### Logging

Logs are structured and include:

- Request/response times
- Error details with stack traces
- Import job progress
- Rate-limiting events

```bash
# View live logs
tail -f logs/app.log

# Docker logs
docker-compose logs -f api
```

## Contributing

1. Fork the repository
2. Create a feature branch
3. Make changes with tests
4. Run quality checks: `make lint test`
5. Submit a pull request

## License

This project is licensed under the MIT License - see the LICENSE file for details.

## Support

For issues and questions:

1. Check the troubleshooting section
2. Review existing GitHub issues
3. Create a new issue with detailed information
4. For security issues, contact the maintainers directly