orion

sboulahtit/orion

Commit Graph

Author	SHA1	Message	Date

Samir Boulahtit

1828ac85eb

feat(prospecting): add content scraping for POC builder (Workstream 3A)

- New scrape_content() method in enrichment_service: extracts meta
  description, H1/H2 headings, paragraphs, images (filtered for size),
  social links, service items, and detected languages using BeautifulSoup
- Scans 6 pages per prospect: /, /about, /a-propos, /services,
  /nos-services, /contact
- Results stored as JSON in prospect.scraped_content_json
- New endpoints: POST /content-scrape/{id} and /content-scrape/batch
- Added to full_enrichment pipeline (Step 5, before security audit)
- CONTENT_SCRAPE job type for scan-jobs tracking
- "Content Scrape" batch button on scan-jobs page
- Add beautifulsoup4 to requirements.txt
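
The extraction step described in the list above can be sketched roughly as follows. This is only an illustration, not the code from the commit: the commit uses BeautifulSoup, while this sketch swaps in the standard-library `html.parser` so it runs without extra dependencies. The names `ContentExtractor` and `scrape_html` are hypothetical, and only the meta description, H1/H2 headings, and paragraphs are covered (images, social links, service items, and language detection are left out).

```python
import json
from html.parser import HTMLParser

# Paths taken from the commit message; how they are fetched is not shown here.
PAGES = ["/", "/about", "/a-propos", "/services", "/nos-services", "/contact"]


class ContentExtractor(HTMLParser):
    """Hypothetical helper: collects meta description, H1/H2 headings, paragraphs."""

    def __init__(self):
        super().__init__()
        self.meta_description = None
        self.headings = []
        self.paragraphs = []
        self._tag = None  # tag whose text is currently being captured
        self._buf = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.meta_description = attrs.get("content")
        elif tag in ("h1", "h2", "p") and self._tag is None:
            self._tag = tag
            self._buf = []

    def handle_data(self, data):
        if self._tag is not None:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == self._tag:
            # Collapse runs of whitespace and skip empty elements.
            text = " ".join("".join(self._buf).split())
            if text:
                target = self.paragraphs if tag == "p" else self.headings
                target.append(text)
            self._tag = None


def scrape_html(html: str) -> dict:
    """Return the extracted content for one page as a JSON-serialisable dict."""
    parser = ContentExtractor()
    parser.feed(html)
    parser.close()
    return {
        "meta_description": parser.meta_description,
        "headings": parser.headings,
        "paragraphs": parser.paragraphs,
    }


if __name__ == "__main__":
    page = "<h1>Welcome</h1><p>We <b>build</b> things.</p>"
    # The commit stores results as JSON; serialising the dict mimics that step.
    print(json.dumps(scrape_html(page)))
```

In a pipeline like the one described, a function of this kind would presumably be called once per path in `PAGES` and the per-page results merged before being stored as JSON.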

Tested on batirenovation-strasbourg.fr: extracted 30 headings,
21 paragraphs, 13 images.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-04-01 22:26:56 +02:00

Samir Boulahtit

972ee1e5d0

feat(prospecting): add ProspectSecurityAudit model (Phase 1 foundation)

- New model: ProspectSecurityAudit with score, grade, findings_json,
  severity counts, has_https, has_valid_ssl, missing_headers, exposed
  files, technologies, scan_error
- Add last_security_audit_at timestamp to Prospect model
- Add security_audit 1:1 relationship on Prospect
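
To make the shape of the record easier to picture, here is a minimal sketch of the fields listed above as a plain dataclass. It is an assumption-heavy illustration: the commit does not show the ORM, column types, or defaults, so every type here is a guess, the class name `ProspectSecurityAuditSketch` is hypothetical, and the `recount_severities` helper (which assumes each finding carries a `"severity"` key) is not from the source.

```python
import json
from collections import Counter
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ProspectSecurityAuditSketch:
    """Field names follow the commit message; all types and defaults are assumed."""

    score: int = 0
    grade: str = ""
    findings_json: str = "[]"
    severity_counts: dict = field(default_factory=dict)
    has_https: bool = False
    has_valid_ssl: bool = False
    missing_headers: list = field(default_factory=list)
    exposed_files: list = field(default_factory=list)
    technologies: list = field(default_factory=list)
    scan_error: Optional[str] = None

    def recount_severities(self) -> None:
        # Hypothetical helper: assumes findings_json is a JSON list of
        # objects that each have a "severity" key.
        findings = json.loads(self.findings_json)
        self.severity_counts = dict(Counter(f["severity"] for f in findings))


if __name__ == "__main__":
    audit = ProspectSecurityAuditSketch(
        findings_json=json.dumps([{"severity": "high"}, {"severity": "low"}])
    )
    audit.recount_severities()
    print(audit.severity_counts)
```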

Part of Phase 1: Security Audit in Enrichment Pipeline. Service,
constants, migration, endpoints, and frontend to follow.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-03-30 22:23:38 +02:00

Samir Boulahtit

6d6eba75bf

feat(prospecting): add complete prospecting module for lead discovery and scoring

Some checks failed:

- CI / pytest (push): Failing after 48m31s
- CI / docs (push): Has been skipped
- CI / deploy (push): Has been skipped
- CI / ruff (push): Successful in 11s
- CI / validate (push): Successful in 23s
- CI / dependency-scanning (push): Successful in 28s

Migrates scanning pipeline from marketing-.lu-domains app into Orion module.
Supports digital (domain scan) and offline (manual capture) lead channels
with enrichment, scoring, campaign management, and interaction tracking.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-02-28 00:59:47 +01:00

3 Commits