orion

Author	SHA1	Message	Date
Samir Boulahtit	1828ac85eb	feat(prospecting): add content scraping for POC builder (Workstream 3A) - New scrape_content() method in enrichment_service: extracts meta description, H1/H2 headings, paragraphs, images (filtered for size), social links, service items, and detected languages using BeautifulSoup - Scans 6 pages per prospect: /, /about, /a-propos, /services, /nos-services, /contact - Results stored as JSON in prospect.scraped_content_json - New endpoints: POST /content-scrape/{id} and /content-scrape/batch - Added to full_enrichment pipeline (Step 5, before security audit) - CONTENT_SCRAPE job type for scan-jobs tracking - "Content Scrape" batch button on scan-jobs page - Add beautifulsoup4 to requirements.txt Tested on batirenovation-strasbourg.fr: extracted 30 headings, 21 paragraphs, 13 images. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-01 22:26:56 +02:00
Samir Boulahtit	50a4fc38a7	feat(prospecting): add batch delay + fix Celery error_message field - Add PROSPECTING_BATCH_DELAY_SECONDS config (default 1.0s) — polite delay between prospects in batch scans to avoid rate limiting - Apply delay to all 5 batch API endpoints and all Celery tasks - Fix Celery tasks: error_message → error_log (matches model field) - Add batch-scanning.md docs with rate limiting guide, scaling estimates for 70k+ URL imports, and pipeline order recommendations Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-01 21:55:24 +02:00
Samir Boulahtit	30f3dae5a3	feat(prospecting): add security audit report generation (Workstream 2B) - SecurityReportService generates standalone branded HTML reports from stored audit data (grade badge, simulated hacked site, detailed findings, business impact, call-to-action with contact info) - GET /security-audit/report/{prospect_id} returns HTMLResponse - "Generate Report" button on prospect detail security tab opens report in new browser tab (printable to PDF) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-01 21:41:40 +02:00
Samir Boulahtit	4c750f0268	feat(prospecting): implement security audit pipeline (Workstream 2A) Complete security audit integration into the enrichment pipeline: Backend: - SecurityAuditService with 7 passive checks: HTTPS, SSL cert, security headers, exposed files, cookies, server info, technology detection - Constants file with SECURITY_HEADERS, EXPOSED_PATHS, SEVERITY_SCORES - SecurityAuditResponse schema with JSON field validators + aliases - Endpoints: POST /security-audit/{id}, POST /security-audit/batch - Added to full_enrichment pipeline (Step 5, before scoring) - get_pending_security_audit() query in prospect_service Frontend: - Security tab on prospect detail page with grade badge (A+ to F), score/100, severity counts, HTTPS/SSL status, missing headers, exposed files, technologies, and full findings list - "Run Security Audit" button with loading state - "Security Audit" batch button on scan-jobs page Tested on batirenovation-strasbourg.fr: Grade D (50/100), 11 issues found (missing headers, exposed wp-login, server version disclosure). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-01 20:58:11 +02:00
Samir Boulahtit	f310363f7c	fix(prospecting): fix scan-jobs batch endpoints and add job tracking - Reorder routes: batch endpoints before /{prospect_id} to fix FastAPI route matching (was parsing "batch" as prospect_id → 422) - Add scan job tracking via stats_service.create_job/complete_job so the scan-jobs table gets populated after each batch run - Add contact scrape batch endpoint (POST /contacts/batch) with get_pending_contact_scrape query - Fix scan-jobs.js: explicit route map instead of naive replace - Normalize domain_name on create/update (strip protocol, www, slash) - Add domain_name to ProspectUpdate schema - Add proposal for contact scraper enum + regex fixes Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-29 23:31:33 +02:00
Samir Boulahtit	22ae63b414	refactor(prospecting): migrate SVC-006 transaction control to endpoint level Move db.commit() from services to API endpoints and Celery tasks. Services now use db.flush() only; endpoints own the transaction. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-28 16:40:09 +01:00
Samir Boulahtit	6d6eba75bf	feat(prospecting): add complete prospecting module for lead discovery and scoring Migrates scanning pipeline from marketing-.lu-domains app into Orion module. Supports digital (domain scan) and offline (manual capture) lead channels with enrichment, scoring, campaign management, and interaction tracking. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-28 00:59:47 +01:00
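
Commit 50a4fc38a7 describes a polite delay between prospects in batch scans, controlled by PROSPECTING_BATCH_DELAY_SECONDS (default 1.0s). The sketch below shows one way such a delay could be applied. It is not the module's actual code: run_batch, process, and the injectable sleep parameter are hypothetical names introduced here for illustration, and only the config name and its default come from the commit message.

```python
import time
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")

# Config name and default value taken from the commit message.
PROSPECTING_BATCH_DELAY_SECONDS = 1.0


def run_batch(
    prospect_ids: Iterable[T],
    process: Callable[[T], R],
    delay_seconds: float = PROSPECTING_BATCH_DELAY_SECONDS,
    sleep: Callable[[float], None] = time.sleep,
) -> List[R]:
    """Hypothetical helper: process prospects one by one, pausing between them.

    The pause happens only between items, so a batch of N prospects
    sleeps N - 1 times. `sleep` is injectable so tests need not wait.
    """
    results: List[R] = []
    for index, prospect_id in enumerate(prospect_ids):
        if index > 0 and delay_seconds > 0:
            sleep(delay_seconds)
        results.append(process(prospect_id))
    return results
```

Passing the sleep function as a parameter keeps the loop testable without real waiting. Under these assumptions, a batch of three prospects with the default delay would add about two seconds of idle time.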

7 Commits
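
Commit f310363f7c mentions normalizing domain_name on create/update by stripping the protocol, www, and slash. A minimal sketch of that kind of normalization follows. The function name normalize_domain_name is hypothetical, and the lowercasing step is an assumption added here, not something the commit message states.

```python
import re

# Matches a leading URL scheme such as "http://" or "https://".
_SCHEME_RE = re.compile(r"^[a-z][a-z0-9+.-]*://")


def normalize_domain_name(raw: str) -> str:
    """Hypothetical helper: strip protocol, leading "www.", and trailing slash."""
    # Assumption: trimming whitespace and lowercasing are not in the commit message.
    value = raw.strip().lower()
    value = _SCHEME_RE.sub("", value)
    if value.startswith("www."):
        value = value[len("www."):]
    return value.rstrip("/")
```

With this sketch, an input such as https://www.batirenovation-strasbourg.fr/ reduces to batirenovation-strasbourg.fr, so the same prospect is stored under one key however the URL was entered.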