# Ecommerce Backend API v2.0

A robust, production-ready FastAPI backend for ecommerce product catalog and inventory management with advanced CSV import capabilities.

## Key Improvements from v1

### Architecture Improvements

- **Modular Design**: Separated concerns into utility modules, middleware, and models
- **Database Optimization**: Added a proper indexing strategy and foreign key relationships
- **Connection Pooling**: PostgreSQL support with connection pooling for production scalability
- **Background Processing**: Asynchronous CSV import with job tracking

### Security Enhancements

- **JWT Authentication**: Token-based authentication with role-based access control
- **Rate Limiting**: Sliding-window rate limiter to prevent API abuse
- **Input Validation**: Enhanced Pydantic models with comprehensive validation

### Performance Optimizations

- **Batch Processing**: CSV imports processed in configurable batches
- **Database Indexes**: Strategic indexing for common query patterns
- **Streaming Export**: Memory-efficient CSV export for large datasets
- **Caching Ready**: Architecture supports Redis integration

### Data Processing

- **Robust GTIN Handling**: Centralized GTIN normalization and validation
- **Multi-currency Support**: Advanced price parsing with currency extraction
- **International Content**: Multi-encoding CSV support for global data

## Project Structure

```
ecommerce_api/
├── main.py                    # FastAPI application entry point
├── models/
│   ├── database_models.py     # SQLAlchemy ORM models
│   └── api_models.py          # Pydantic API models
├── utils/
│   ├── data_processing.py     # GTIN and price processing utilities
│   ├── csv_processor.py       # CSV import/export handling
│   └── database.py            # Database configuration
├── middleware/
│   ├── auth.py                # JWT authentication
│   ├── rate_limiter.py        # Rate limiting implementation
│   └── logging_middleware.py  # Request/response logging
├── config/
│   └── settings.py            # Application configuration
├── tests/
│   └── test_utils.py          # Unit tests
├── alembic/                   # Database migrations
├── docker-compose.yml         # Docker deployment
├── Dockerfile                 # Container definition
├── requirements.txt           # Python dependencies
└── README.md                  # This file
```

## Quick Start

### 1. Development Setup

```bash
# Clone the repository
git clone <repository-url>
cd ecommerce-api

# Set up a virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Set up environment variables
cp .env.example .env
# Edit .env with your database configuration
```

### 2. Database Setup

**For SQLite (development):**

```bash
# Your .env file should have:
# DATABASE_URL=sqlite:///./ecommerce.db

# Initialize Alembic (only needed once)
alembic init alembic
# Update alembic/env.py with the provided configuration (see below)

# Create the initial migration
alembic revision --autogenerate -m "Initial migration"

# Apply migrations
alembic upgrade head
```

**For PostgreSQL (production):**

```bash
# 1. Create the PostgreSQL database
createdb ecommerce_db

# 2. Update the .env file:
# DATABASE_URL=postgresql://username:password@localhost:5432/ecommerce_db

# 3. Initialize and run migrations
alembic init alembic
# Update alembic/env.py and alembic.ini (see configuration section)
alembic revision --autogenerate -m "Initial migration"
alembic upgrade head
```

**Important Alembic Configuration:**

After running `alembic init alembic`, you must update two files:
**1. Update `alembic/env.py`:**

```python
from logging.config import fileConfig

from sqlalchemy import engine_from_config, pool

from alembic import context

import os
import sys

# Add the project directory to the Python path
sys.path.append(os.path.dirname(os.path.dirname(__file__)))

from models.database_models import Base
from config.settings import settings

# Alembic Config object
config = context.config

# Override sqlalchemy.url with our settings
config.set_main_option("sqlalchemy.url", settings.database_url)

if config.config_file_name is not None:
    fileConfig(config.config_file_name)

target_metadata = Base.metadata


def run_migrations_offline() -> None:
    """Run migrations in 'offline' mode."""
    url = config.get_main_option("sqlalchemy.url")
    context.configure(
        url=url,
        target_metadata=target_metadata,
        literal_binds=True,
        dialect_opts={"paramstyle": "named"},
    )
    with context.begin_transaction():
        context.run_migrations()


def run_migrations_online() -> None:
    """Run migrations in 'online' mode."""
    connectable = engine_from_config(
        config.get_section(config.config_ini_section, {}),
        prefix="sqlalchemy.",
        poolclass=pool.NullPool,
    )
    with connectable.connect() as connection:
        context.configure(connection=connection, target_metadata=target_metadata)
        with context.begin_transaction():
            context.run_migrations()


if context.is_offline_mode():
    run_migrations_offline()
else:
    run_migrations_online()
```

**2. Update the `sqlalchemy.url` line in `alembic.ini`:**

```ini
# For SQLite:
sqlalchemy.url = sqlite:///./ecommerce.db

# For PostgreSQL:
sqlalchemy.url = postgresql://username:password@localhost:5432/ecommerce_db
```

### 3. Configuration

Edit the `.env` file with your settings:

```env
DATABASE_URL=postgresql://user:password@localhost:5432/ecommerce_db
JWT_SECRET_KEY=your-super-secret-key-change-in-production
API_HOST=0.0.0.0
API_PORT=8000
DEBUG=False
```

### 4. Run the Development Server

```bash
# Using make
make dev

# Or directly
uvicorn main:app --reload --host 0.0.0.0 --port 8000
```
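The project's `config/settings.py` is not reproduced in this README. As an illustration only, a minimal stdlib sketch of how the `.env`-style variables above might be surfaced to the app could look like the following (the real project may use `pydantic` settings instead; the field names mirror the variables shown above):

```python
import os
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Settings:
    """Hypothetical sketch: application settings read from environment
    variables, with safe development defaults."""

    database_url: str = field(
        default_factory=lambda: os.environ.get("DATABASE_URL", "sqlite:///./ecommerce.db")
    )
    jwt_secret_key: str = field(
        default_factory=lambda: os.environ.get("JWT_SECRET_KEY", "change-me-in-production")
    )
    api_host: str = field(default_factory=lambda: os.environ.get("API_HOST", "0.0.0.0"))
    api_port: int = field(default_factory=lambda: int(os.environ.get("API_PORT", "8000")))
    debug: bool = field(
        default_factory=lambda: os.environ.get("DEBUG", "False").lower() in ("1", "true", "yes")
    )


# Imported as `from config.settings import settings` in alembic/env.py above
settings = Settings()
```

Reading the environment lazily via `default_factory` means a fresh `Settings()` instance always reflects the current environment, which is convenient in tests.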
### 5. Docker Deployment

```bash
# Start all services
docker-compose up -d

# View logs
docker-compose logs -f api

# Stop services
docker-compose down
```

## API Endpoints

### Authentication

- `POST /auth/login` - Get a JWT token
- `POST /auth/refresh` - Refresh a token

### Products

- `GET /products` - List products with filtering and search
- `POST /products` - Create a new product
- `GET /products/{product_id}` - Get a product with stock info
- `PUT /products/{product_id}` - Update a product
- `DELETE /products/{product_id}` - Delete a product and its associated stock

### CSV Operations

- `POST /import-csv` - Start a background CSV import
- `GET /import-status/{job_id}` - Check import job status
- `GET /export-csv` - Export products as CSV (streaming)

### Stock Management

- `POST /stock` - Set an exact stock quantity
- `POST /stock/add` - Add to existing stock
- `POST /stock/remove` - Remove from stock
- `GET /stock/{gtin}` - Get a stock summary by GTIN
- `GET /stock/{gtin}/total` - Get total stock for a GTIN
- `GET /stock` - List all stock entries with filtering
- `PUT /stock/{stock_id}` - Update a stock entry
- `DELETE /stock/{stock_id}` - Delete a stock entry

### System

- `GET /` - API information
- `GET /health` - Health check
- `GET /stats` - System statistics

## Advanced Features

### Background CSV Import

The API supports background processing of large CSV files:

```python
# Start an import
response = requests.post('/import-csv', json={
    'url': 'https://example.com/products.csv',
    'batch_size': 1000
})
job_id = response.json()['job_id']

# Check status
status = requests.get(f'/import-status/{job_id}')
```

### Rate Limiting

Built-in rate limiting protects against API abuse:

- Default: 100 requests per hour per client
- CSV imports: 10 per hour
- Configurable per endpoint

### Search and Filtering

Advanced product search capabilities:

```bash
# Search in title and description
GET /products?search=laptop

# Filter by brand and category
GET /products?brand=Apple&category=Electronics

# Combine filters
GET /products?brand=Samsung&availability=in+stock&search=phone
```
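The sliding-window limiter described under Rate Limiting lives in `middleware/rate_limiter.py`, which is not reproduced here. As an illustration only, a minimal in-memory sliding window could be sketched like this (class and method names are hypothetical):

```python
import time
from collections import defaultdict, deque
from typing import Deque, Dict, Optional


class SlidingWindowRateLimiter:
    """Hypothetical sketch of a sliding-window rate limiter.

    Keeps a per-client deque of request timestamps and allows a request
    only if fewer than `max_requests` fall inside the trailing window.
    """

    def __init__(self, max_requests: int = 100, window_seconds: int = 3600) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client_id: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        hits = self._hits[client_id]
        # Evict timestamps that have slid out of the window
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the limit; caller should return HTTP 429
        hits.append(now)
        return True
```

A production deployment would typically back this with Redis so the window is shared across workers, which is what the "Caching Ready" note above anticipates.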
### Data Validation

Comprehensive validation for all inputs:

- GTIN format validation and normalization
- Price parsing with currency extraction
- Required field validation
- Type conversion and sanitization

## Database Schema

### Products Table

- Full product catalog with Google Shopping compatibility
- Indexed fields: `gtin`, `brand`, `google_product_category`, `availability`
- Timestamps for creation and updates

### Stock Table

- Location-based inventory tracking
- GTIN-based product linking
- Unique constraint on GTIN+location combinations
- Composite indexes for efficient queries

### Import Jobs Table

- Tracks background import operations
- Status monitoring and error handling
- Performance metrics

## Development

### Running Tests

```bash
# All tests
make test

# Specific test file
pytest tests/test_utils.py -v

# With coverage
pytest --cov=. tests/
```

### Code Quality

```bash
# Format code
make format

# Lint code
make lint

# Type checking
mypy .
```
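The GTIN normalization mentioned under Data Validation is centralized in `utils/data_processing.py`, which is not shown in this README. Assuming the standard GS1 check-digit rule, a sketch of the usual approach (zero-pad to 14 digits, then validate) might look like this; the function name and behavior are illustrative, not the project's actual implementation:

```python
def normalize_gtin(raw: str) -> str:
    """Hypothetical sketch: zero-pad a GTIN-8/12/13/14 to 14 digits and
    validate its GS1 check digit, raising ValueError on bad input."""
    digits = "".join(ch for ch in raw if ch.isdigit())
    if len(digits) not in (8, 12, 13, 14):
        raise ValueError(f"invalid GTIN length: {raw!r}")
    # Leading zeros do not change the check digit, so pad first
    gtin = digits.zfill(14)
    # GS1 rule for a 14-digit code: weight digits 3,1,3,1,... from the left,
    # excluding the final (check) digit
    total = sum(int(d) * (3 if i % 2 == 0 else 1) for i, d in enumerate(gtin[:13]))
    if (10 - total % 10) % 10 != int(gtin[13]):
        raise ValueError(f"GTIN check digit mismatch: {raw!r}")
    return gtin
```

Centralizing this in one helper keeps CSV import, the stock endpoints, and validation all agreeing on a single canonical 14-digit form.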
### Database Migrations

**Creating migrations:**

```bash
# After changing models/database_models.py, create a new migration
alembic revision --autogenerate -m "Description of changes"

# Review the generated migration file in alembic/versions/
# Edit it if needed, then apply:
alembic upgrade head
```

**Common migration commands:**

```bash
# Check current migration status
alembic current

# View migration history
alembic history

# Upgrade to a specific revision
alembic upgrade <revision_id>

# Downgrade one step
alembic downgrade -1

# Downgrade to a specific revision
alembic downgrade <revision_id>

# Reset the database (WARNING: destroys all data)
alembic downgrade base
alembic upgrade head
```

**Troubleshooting Alembic:**

```bash
# If you get template errors, reinitialize Alembic:
rm -rf alembic/
alembic init alembic
# Then update alembic/env.py and alembic.ini as shown above

# If migration branches conflict, you may need to merge them:
alembic merge -m "Merge migrations" heads
```

## Production Deployment

### Environment Setup

1. **Database**: PostgreSQL 13+ recommended
2. **Cache**: Redis for session storage and rate limiting
3. **Reverse Proxy**: Nginx for SSL termination and load balancing
4. **Monitoring**: Consider adding Prometheus metrics

### Security Checklist

- [ ] Change the default JWT secret key
- [ ] Set up HTTPS/TLS
- [ ] Configure CORS appropriately
- [ ] Set up database connection limits
- [ ] Enable request logging
- [ ] Configure rate limiting for your needs
- [ ] Set up monitoring and alerting

### Docker Production

```yaml
# docker-compose.prod.yml
version: '3.8'
services:
  api:
    build: .
    environment:
      - DEBUG=False
      - DATABASE_URL=${DATABASE_URL}
      - JWT_SECRET_KEY=${JWT_SECRET_KEY}
    restart: unless-stopped
# Add your production configuration
```

## Performance Considerations

### Database Optimization

- Use PostgreSQL for production workloads
- Monitor query performance with `EXPLAIN ANALYZE`
- Consider read replicas for read-heavy workloads
- Run regular `VACUUM` and `ANALYZE` operations

### CSV Import Performance

- Batch size trades memory against speed: larger batches import faster but use more memory
- Monitor import job status to tune the batch size

### API Response Times

- Database indexes are crucial for filtering
- Use pagination for large result sets
- Consider caching frequently accessed data

## Troubleshooting

### Common Issues

1. **CSV import failures**
   - Check the file encoding and try different separators
   - Validate that the required columns are present: `product_id`, `title`
   - Monitor the import job status for specific errors

2. **Database connection issues**
   - Verify the DATABASE_URL format
   - Check connection limits
   - Ensure the database server is accessible

3. **Authentication problems**
   - Verify JWT_SECRET_KEY is set
   - Check token expiration settings
   - Validate the token format

### Logging

Logs are structured and include:

- Request/response times
- Error details with stack traces
- Import job progress
- Rate-limiting events

```bash
# View live logs
tail -f logs/app.log

# Docker logs
docker-compose logs -f api
```

## Contributing

1. Fork the repository
2. Create a feature branch
3. Make changes with tests
4. Run quality checks: `make lint test`
5. Submit a pull request

## License

This project is licensed under the MIT License - see the LICENSE file for details.

## Support

For issues and questions:

1. Check the troubleshooting section
2. Review existing GitHub issues
3. Create a new issue with detailed information
4. For security issues, contact the maintainers directly