fix(prospecting): fix contact scraper and add address extraction

- Fix contact_type column: Enum(ContactType) → String(20) to match the
  migration (fixes "type contacttype does not exist" on insert)
- Rewrite scrape_contacts with a structured-first approach:
  Phase 1: tel:/mailto: href extraction (high confidence)
  Phase 2: regex fallback with SVG/script stripping, international phone
           pattern (requires + prefix, min 10 digits)
  Phase 3: address extraction from Schema.org JSON-LD, <address> tags,
           and European street address regex (FR/DE/EN street keywords)
- URL-decode email values, strip tags to plain text for cross-element
  address matching
- Add /mentions-legales to scanned paths
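
For illustration only, a minimal stdlib sketch of the structured-first idea
described above; it is not the code in this commit. The helper names
(HrefCollector, regex_fallback, jsonld_addresses, scrape_contacts_sketch),
the exact regexes, and the rule "only fall back to regex for contact types
that phase 1 missed" are all assumptions. The <address> tag and street-keyword
steps of phase 3 are left out.

```python
import json
import re
from html.parser import HTMLParser
from urllib.parse import unquote

# Hypothetical patterns, not the ones used by the real scraper.
PHONE_RE = re.compile(r"\+\d[\d\s().-]{8,}\d")  # must start with "+"
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
STRIP_RE = re.compile(r"<(script|style|svg)\b.*?</\1\s*>", re.DOTALL | re.IGNORECASE)
TAG_RE = re.compile(r"<[^>]+>")
LD_RE = re.compile(
    r'<script[^>]*type=["\']application/ld\+json["\'][^>]*>(.*?)</script>',
    re.DOTALL | re.IGNORECASE,
)


class HrefCollector(HTMLParser):
    """Phase 1: collect tel:/mailto: targets from <a href> (high confidence)."""

    def __init__(self):
        super().__init__()
        self.found = []  # list of (contact_type, value) tuples

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        scheme, _, rest = href.partition(":")
        if scheme.lower() == "mailto":
            # Drop "?subject=..." and URL-decode (%40 -> @).
            self.found.append(("email", unquote(rest.split("?", 1)[0]).strip()))
        elif scheme.lower() == "tel":
            self.found.append(("phone", unquote(rest).strip()))


def regex_fallback(html):
    """Phase 2: regex over plain text, after dropping script/style/svg blocks."""
    text = TAG_RE.sub(" ", STRIP_RE.sub(" ", html))
    found = [("email", m) for m in EMAIL_RE.findall(text)]
    for m in PHONE_RE.findall(text):
        if sum(c.isdigit() for c in m) >= 10:  # minimum 10 digits
            found.append(("phone", m))
    return found


def jsonld_addresses(html):
    """Phase 3 (partial): read 'address' from Schema.org JSON-LD blocks."""
    out = []
    for block in LD_RE.findall(html):
        try:
            data = json.loads(block)
        except ValueError:
            continue  # ignore malformed JSON-LD
        for item in data if isinstance(data, list) else [data]:
            addr = item.get("address") if isinstance(item, dict) else None
            if isinstance(addr, dict):
                keys = ("streetAddress", "postalCode", "addressLocality")
                parts = [addr.get(k) for k in keys]
                out.append(("address", ", ".join(p for p in parts if p)))
            elif isinstance(addr, str):
                out.append(("address", addr))
    return out


def scrape_contacts_sketch(html):
    collector = HrefCollector()
    collector.feed(html)
    found = list(collector.found)
    seen_types = {kind for kind, _ in found}
    # Assumption: regex results are only used for types phase 1 did not find.
    found += [c for c in regex_fallback(html) if c[0] not in seen_types]
    found += jsonld_addresses(html)
    return found
```

Restricting the regex pass to missing contact types is one way to keep
decorative numbers (SVG coordinates, inline scripts) from drowning out the
href-based results; the real implementation may merge the phases differently.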

Tested on batirenovation-strasbourg.fr: now finds 3 contacts (email, phone,
address); the previous version produced 120+ false positives and crashed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 21:18:43 +02:00
parent 1decb4572c
commit 754bfca87d
2 changed files with 62 additions and 3 deletions

@@ -472,8 +472,15 @@
 <p class="text-sm font-medium text-gray-800 dark:text-gray-200" x-text="store.store_name"></p>
 <p class="text-xs text-gray-400 font-mono" x-text="store.store_code"></p>
 </div>
-<span class="px-2 py-1 text-xs rounded-full bg-purple-100 dark:bg-purple-900 text-purple-700 dark:text-purple-300"
-x-text="store.role_name || 'Owner'"></span>
+<div class="flex items-center gap-2">
+<span class="px-2 py-0.5 text-xs rounded-full"
+:class="store.is_pending
+? 'bg-orange-100 text-orange-700 dark:bg-orange-900 dark:text-orange-200'
+: 'bg-green-100 text-green-700 dark:bg-green-900 dark:text-green-200'"
+x-text="store.is_pending ? '{{ _('common.pending') }}' : '{{ _('common.active') }}'"></span>
+<span class="px-2 py-0.5 text-xs rounded-full bg-purple-100 dark:bg-purple-900 text-purple-700 dark:text-purple-300"
+x-text="store.role_name || 'Owner'"></span>
+</div>
 </div>
 </template>
 </div>