# Ecommerce Backend API v2.0
A robust, production-ready FastAPI backend for ecommerce product catalog and inventory management with advanced CSV import capabilities.
## Key Improvements from v1
### Architecture Improvements
- **Modular Design**: Separated concerns into utility modules, middleware, and models
- **Database Optimization**: Added proper indexing strategy and foreign key relationships
- **Connection Pooling**: PostgreSQL support with connection pooling for production scalability
- **Background Processing**: Asynchronous CSV import with job tracking
### Security Enhancements
- **JWT Authentication**: Token-based authentication with role-based access control
- **Rate Limiting**: Sliding window rate limiter to prevent API abuse
- **Input Validation**: Enhanced Pydantic models with comprehensive validation
### Performance Optimizations
- **Batch Processing**: CSV imports processed in configurable batches
- **Database Indexes**: Strategic indexing for common query patterns
- **Streaming Export**: Memory-efficient CSV export for large datasets
- **Caching Ready**: Architecture supports Redis integration
### Data Processing
- **Robust GTIN Handling**: Centralized GTIN normalization and validation
- **Multi-currency Support**: Advanced price parsing with currency extraction
- **International Content**: Multi-encoding CSV support for global data
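The multi-currency price parsing described above can be sketched as follows. This is a hypothetical helper for illustration only; the real logic lives in `utils/data_processing.py` and may differ (for example, it handles only ISO 4217 codes here, not currency symbols):

```python
import re
from decimal import Decimal

# Matches "19.99 USD", "USD 19.99", "12,50" etc. (ISO currency codes only).
_PRICE_RE = re.compile(
    r"(?P<cur_pre>[A-Z]{3})?\s*(?P<amount>\d+(?:[.,]\d{1,2})?)\s*(?P<cur_post>[A-Z]{3})?"
)

def parse_price(raw: str, default_currency: str = "EUR"):
    """Extract (amount, currency) from a raw price string."""
    match = _PRICE_RE.search(raw.strip())
    if not match or not match.group("amount"):
        raise ValueError(f"Unparseable price: {raw!r}")
    # Normalize a decimal comma to a decimal point before converting
    amount = Decimal(match.group("amount").replace(",", "."))
    currency = match.group("cur_pre") or match.group("cur_post") or default_currency
    return amount, currency
```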
## Project Structure
```
ecommerce_api/
├── main.py                   # FastAPI application entry point
├── models/
│   ├── database_models.py    # SQLAlchemy ORM models
│   └── api_models.py         # Pydantic API models
├── utils/
│   ├── data_processing.py    # GTIN and price processing utilities
│   ├── csv_processor.py      # CSV import/export handling
│   └── database.py           # Database configuration
├── middleware/
│   ├── auth.py               # JWT authentication
│   ├── rate_limiter.py       # Rate limiting implementation
│   └── logging_middleware.py # Request/response logging
├── config/
│   └── settings.py           # Application configuration
├── tests/
│   └── test_utils.py         # Unit tests
├── alembic/                  # Database migrations
├── docker-compose.yml        # Docker deployment
├── Dockerfile                # Container definition
├── requirements.txt          # Python dependencies
└── README.md                 # This file
```
## Quick Start
### 1. Development Setup
```bash
# Clone the repository
git clone <repository-url>
cd ecommerce-api
# Set up virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Set up environment variables
cp .env.example .env
# Edit .env with your database configuration
```
### 2. Database Setup
**For SQLite (Development):**
```bash
# Your .env file should have:
# DATABASE_URL=sqlite:///./ecommerce.db
# Initialize Alembic (only needed once)
alembic init alembic
# Update alembic/env.py with the provided configuration (see below)
# Create initial migration
alembic revision --autogenerate -m "Initial migration"
# Apply migrations
alembic upgrade head
```
**For PostgreSQL (Production):**
```bash
# 1. Create PostgreSQL database
createdb ecommerce_db
# 2. Update .env file:
# DATABASE_URL=postgresql://username:password@localhost:5432/ecommerce_db
# 3. Initialize and run migrations
alembic init alembic
# Update alembic/env.py and alembic.ini (see configuration section)
alembic revision --autogenerate -m "Initial migration"
alembic upgrade head
```
**Important Alembic Configuration:**
After running `alembic init alembic`, you must update two files:
**1. Update `alembic/env.py`:**
```python
from logging.config import fileConfig

from sqlalchemy import engine_from_config, pool
from alembic import context

import os
import sys

# Add your project directory to the Python path
sys.path.append(os.path.dirname(os.path.dirname(__file__)))

from models.database_models import Base
from config.settings import settings

# Alembic Config object
config = context.config

# Override sqlalchemy.url with our settings
config.set_main_option("sqlalchemy.url", settings.database_url)

if config.config_file_name is not None:
    fileConfig(config.config_file_name)

target_metadata = Base.metadata


def run_migrations_offline() -> None:
    """Run migrations in 'offline' mode."""
    url = config.get_main_option("sqlalchemy.url")
    context.configure(
        url=url,
        target_metadata=target_metadata,
        literal_binds=True,
        dialect_opts={"paramstyle": "named"},
    )
    with context.begin_transaction():
        context.run_migrations()


def run_migrations_online() -> None:
    """Run migrations in 'online' mode."""
    connectable = engine_from_config(
        config.get_section(config.config_ini_section, {}),
        prefix="sqlalchemy.",
        poolclass=pool.NullPool,
    )
    with connectable.connect() as connection:
        context.configure(
            connection=connection, target_metadata=target_metadata
        )
        with context.begin_transaction():
            context.run_migrations()


if context.is_offline_mode():
    run_migrations_offline()
else:
    run_migrations_online()
```
**2. Update `alembic.ini` sqlalchemy.url line:**
```ini
# For SQLite:
sqlalchemy.url = sqlite:///./ecommerce.db
# For PostgreSQL:
sqlalchemy.url = postgresql://username:password@localhost:5432/ecommerce_db
```
### 3. Configuration
Edit `.env` file with your settings:
```env
DATABASE_URL=postgresql://user:password@localhost:5432/ecommerce_db
JWT_SECRET_KEY=your-super-secret-key-change-in-production
API_HOST=0.0.0.0
API_PORT=8000
DEBUG=False
```
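These variables are loaded by `config/settings.py`. A minimal stdlib sketch of how that might look (the field names mirror the `.env` keys above, but the actual project may use pydantic's settings support instead):

```python
import os
from dataclasses import dataclass

@dataclass(frozen=True)
class Settings:
    """Hypothetical sketch of config/settings.py; the real class may differ."""
    database_url: str = os.getenv("DATABASE_URL", "sqlite:///./ecommerce.db")
    jwt_secret_key: str = os.getenv("JWT_SECRET_KEY", "change-me")
    api_host: str = os.getenv("API_HOST", "0.0.0.0")
    api_port: int = int(os.getenv("API_PORT", "8000"))
    debug: bool = os.getenv("DEBUG", "False").lower() == "true"

settings = Settings()
```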
### 4. Run Development Server
```bash
# Using make
make dev
# Or directly
uvicorn main:app --reload --host 0.0.0.0 --port 8000
```
### 5. Docker Deployment
```bash
# Start all services
docker-compose up -d
# View logs
docker-compose logs -f api
# Stop services
docker-compose down
```
## API Endpoints
### Authentication
- `POST /auth/login` - Get JWT token
- `POST /auth/refresh` - Refresh token
### Products
- `GET /products` - List products with filtering and search
- `POST /products` - Create new product
- `GET /products/{product_id}` - Get product with stock info
- `PUT /products/{product_id}` - Update product
- `DELETE /products/{product_id}` - Delete product and associated stock
### CSV Operations
- `POST /import-csv` - Start background CSV import
- `GET /import-status/{job_id}` - Check import job status
- `GET /export-csv` - Export products as CSV (streaming)
### Stock Management
- `POST /stock` - Set exact stock quantity
- `POST /stock/add` - Add to existing stock
- `POST /stock/remove` - Remove from stock
- `GET /stock/{gtin}` - Get stock summary by GTIN
- `GET /stock/{gtin}/total` - Get total stock for GTIN
- `GET /stock` - List all stock entries with filtering
- `PUT /stock/{stock_id}` - Update stock entry
- `DELETE /stock/{stock_id}` - Delete stock entry
### System
- `GET /` - API information
- `GET /health` - Health check
- `GET /stats` - System statistics
## Advanced Features
### Background CSV Import
The API supports background processing of large CSV files:
```python
import requests

BASE = "http://localhost:8000"

# Start import
response = requests.post(f"{BASE}/import-csv", json={
    "url": "https://example.com/products.csv",
    "batch_size": 1000
})
job_id = response.json()["job_id"]

# Check status
status = requests.get(f"{BASE}/import-status/{job_id}")
```
### Rate Limiting
Built-in rate limiting protects against API abuse:
- Default: 100 requests per hour per client
- CSV imports: 10 per hour
- Configurable per endpoint
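The sliding-window approach can be sketched as follows. This is a simplified, in-memory illustration; the actual `middleware/rate_limiter.py` implementation may differ (for example, by backing the window with Redis):

```python
import time
from collections import deque
from typing import Dict, Optional

class SlidingWindowLimiter:
    """Allow at most max_requests per client within a rolling time window."""

    def __init__(self, max_requests: int, window_seconds: float):
        self.max_requests = max_requests
        self.window = window_seconds
        self.hits: Dict[str, deque] = {}

    def allow(self, client_id: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        q = self.hits.setdefault(client_id, deque())
        # Evict timestamps that have fallen out of the window
        while q and now - q[0] >= self.window:
            q.popleft()
        if len(q) >= self.max_requests:
            return False
        q.append(now)
        return True
```

Unlike a fixed-window counter, the deque of timestamps prevents a burst at a window boundary from doubling the effective limit.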
### Search and Filtering
Advanced product search capabilities:
```bash
# Search in title and description
GET /products?search=laptop
# Filter by brand and category
GET /products?brand=Apple&category=Electronics
# Combine filters
GET /products?brand=Samsung&availability=in%20stock&search=phone
```
### Data Validation
Comprehensive validation for all inputs:
- GTIN format validation and normalization
- Price parsing with currency extraction
- Required field validation
- Type conversion and sanitization
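GTIN normalization and validation can be sketched as below: strip separators, left-pad to 14 digits, and verify the GS1 check digit. This is a hypothetical version of the helper; the real one in `utils/data_processing.py` may differ:

```python
def normalize_gtin(raw: str) -> str:
    """Normalize a GTIN-8/12/13/14 to 14 digits and verify its check digit."""
    digits = "".join(ch for ch in raw if ch.isdigit())
    if len(digits) not in (8, 12, 13, 14):
        raise ValueError(f"Invalid GTIN length: {raw!r}")
    gtin = digits.zfill(14)
    # GS1 check digit: weights alternate 3,1 from the left of the 14-digit form
    checksum = sum(int(d) * (3 if i % 2 == 0 else 1)
                   for i, d in enumerate(gtin[:13]))
    if (10 - checksum % 10) % 10 != int(gtin[13]):
        raise ValueError(f"GTIN check digit mismatch: {raw!r}")
    return gtin
```

Storing the zero-padded 14-digit form makes GTIN-8, UPC, and EAN codes for the same product compare equal.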
## Database Schema
### Products Table
- Full product catalog with Google Shopping compatibility
- Indexed fields: `gtin`, `brand`, `google_product_category`, `availability`
- Timestamps for creation and updates
### Stock Table
- Location-based inventory tracking
- GTIN-based product linking
- Unique constraint on GTIN+location combinations
- Composite indexes for efficient queries
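The GTIN+location uniqueness rule can be expressed in SQLAlchemy roughly as follows. Column names here are assumptions for illustration; the real model lives in `models/database_models.py`:

```python
from sqlalchemy import Column, Integer, String, UniqueConstraint, create_engine
from sqlalchemy.orm import declarative_base, Session

Base = declarative_base()

class Stock(Base):
    """Hypothetical sketch of the stock table described above."""
    __tablename__ = "stock"

    id = Column(Integer, primary_key=True)
    gtin = Column(String(14), nullable=False, index=True)
    location = Column(String(64), nullable=False)
    quantity = Column(Integer, nullable=False, default=0)

    # One row per GTIN+location; the constraint also backs prefix lookups
    __table_args__ = (
        UniqueConstraint("gtin", "location", name="uq_stock_gtin_location"),
    )
```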
### Import Jobs Table
- Track background import operations
- Status monitoring and error handling
- Performance metrics
## Development
### Running Tests
```bash
# All tests
make test
# Specific test file
pytest tests/test_utils.py -v
# With coverage
pytest --cov=. tests/
```
### Code Quality
```bash
# Format code
make format
# Lint code
make lint
# Type checking
mypy .
```
### Database Migrations
**Creating Migrations:**
```bash
# After making changes to models/database_models.py, create a new migration
alembic revision --autogenerate -m "Description of changes"
# Review the generated migration file in alembic/versions/
# Edit if needed, then apply:
alembic upgrade head
```
**Common Migration Commands:**
```bash
# Check current migration status
alembic current
# View migration history
alembic history
# Upgrade to specific revision
alembic upgrade <revision_id>
# Downgrade one step
alembic downgrade -1
# Downgrade to specific revision
alembic downgrade <revision_id>
# Reset database (WARNING: destroys all data)
alembic downgrade base
alembic upgrade head
```
**Troubleshooting Alembic:**
```bash
# If you get template errors, reinitialize Alembic:
rm -rf alembic/
alembic init alembic
# Then update alembic/env.py and alembic.ini as shown above
# If migrations conflict, you may need to merge:
alembic merge -m "Merge migrations" <rev1> <rev2>
```
## Production Deployment
### Environment Setup
1. **Database**: PostgreSQL 13+ recommended
2. **Cache**: Redis for session storage and rate limiting
3. **Reverse Proxy**: Nginx for SSL termination and load balancing
4. **Monitoring**: Consider adding Prometheus metrics
### Security Checklist
- [ ] Change default JWT secret key
- [ ] Set up HTTPS/TLS
- [ ] Configure CORS appropriately
- [ ] Set up database connection limits
- [ ] Enable request logging
- [ ] Configure rate limiting per your needs
- [ ] Set up monitoring and alerting
### Docker Production
```yaml
# docker-compose.prod.yml
version: '3.8'

services:
  api:
    build: .
    environment:
      - DEBUG=False
      - DATABASE_URL=${DATABASE_URL}
      - JWT_SECRET_KEY=${JWT_SECRET_KEY}
    restart: unless-stopped
    # Add your production configuration
```
## Performance Considerations
### Database Optimization
- Use PostgreSQL for production workloads
- Monitor query performance with `EXPLAIN ANALYZE`
- Consider read replicas for read-heavy workloads
- Regular `VACUUM` and `ANALYZE` operations
### CSV Import Performance
- Batch size affects memory usage vs. speed
- Larger batches = faster import but more memory
- Monitor import job status for optimization
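The memory/speed trade-off above comes down to chunked iteration: only one batch of rows is held in memory at a time. A minimal sketch (hypothetical; the real `utils/csv_processor.py` may differ):

```python
import csv
import io
from itertools import islice
from typing import Iterable, Iterator, List

def batched(rows: Iterable[dict], batch_size: int) -> Iterator[List[dict]]:
    """Yield rows in fixed-size batches without materializing the whole file."""
    it = iter(rows)
    while batch := list(islice(it, batch_size)):
        yield batch

def import_csv(text: str, batch_size: int = 1000) -> int:
    """Parse CSV text batch by batch; returns the number of rows handled."""
    reader = csv.DictReader(io.StringIO(text))
    total = 0
    for batch in batched(reader, batch_size):
        # In the real importer, each batch would be bulk-inserted here
        total += len(batch)
    return total
```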
### API Response Times
- Database indexes are crucial for filtering
- Use pagination for large result sets
- Consider caching frequently accessed data
## Troubleshooting
### Common Issues
1. **CSV Import Failures**
   - Check the file encoding and try different separators
   - Validate required columns: `product_id`, `title`
   - Monitor import job status for specific errors
2. **Database Connection Issues**
   - Verify the `DATABASE_URL` format
   - Check connection limits
   - Ensure the database server is accessible
3. **Authentication Problems**
   - Verify `JWT_SECRET_KEY` is set
   - Check token expiration settings
   - Validate token format
### Logging
Logs are structured and include:
- Request/response times
- Error details with stack traces
- Import job progress
- Rate limiting events
```bash
# View live logs
tail -f logs/app.log
# Docker logs
docker-compose logs -f api
```
## Contributing
1. Fork the repository
2. Create a feature branch
3. Make changes with tests
4. Run quality checks: `make lint test`
5. Submit a pull request
## License
This project is licensed under the MIT License - see the LICENSE file for details.
## Support
For issues and questions:
1. Check the troubleshooting section
2. Review existing GitHub issues
3. Create a new issue with detailed information
4. For security issues, contact maintainers directly