fix(prospecting): fix contact scraper and add address extraction
Some checks failed
- Fix contact_type column: Enum(ContactType) → String(20) to match the
  migration (fixes "type contacttype does not exist" on insert)
- Rewrite scrape_contacts with a structured-first approach:
    Phase 1: tel:/mailto: href extraction (high confidence)
    Phase 2: regex fallback with SVG/script stripping, international phone
             pattern (requires + prefix, min 10 digits)
    Phase 3: address extraction from Schema.org JSON-LD, <address> tags,
             and European street address regex (FR/DE/EN street keywords)
- URL-decode email values, strip tags to plain text for cross-element
  address matching
- Add /mentions-legales to scanned paths
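
Phases 1 and 2 could be sketched roughly as follows. This is a minimal, standard-library-only illustration, not the actual scrape_contacts code: the names HrefCollector, extract_contacts, PHONE_RE, EMAIL_RE, and NOISE_RE are hypothetical, and the exact patterns (including counting the 10-digit minimum after the + prefix) are assumptions.

```python
import re
from html.parser import HTMLParser
from urllib.parse import unquote

# Assumed patterns; the real scraper's regexes are not shown in this commit.
PHONE_RE = re.compile(r"\+\d[\d ().-]{8,}\d")  # must start with "+"
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# <script>, <style>, and <svg> blocks carry numeric noise (e.g. path data).
NOISE_RE = re.compile(r"<(script|style|svg)\b.*?</\1\s*>", re.S | re.I)
TAG_RE = re.compile(r"<[^>]+>")


class HrefCollector(HTMLParser):
    """Phase 1: collect tel: and mailto: targets from <a href=...>."""

    def __init__(self):
        super().__init__()
        self.found = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        scheme, _, value = href.partition(":")
        scheme = scheme.lower()
        if scheme == "mailto":
            # Drop ?subject=... and URL-decode (e.g. %40 becomes @).
            email = unquote(value.split("?")[0]).strip().lower()
            self.found.add(("email", email))
        elif scheme == "tel":
            # Keep only digits and the leading "+".
            self.found.add(("phone", re.sub(r"[^\d+]", "", unquote(value))))


def extract_contacts(html):
    """Return a set of (kind, value) pairs found in an HTML string."""
    collector = HrefCollector()
    collector.feed(html)
    found = set(collector.found)

    # Phase 2: regex fallback on plain text, after removing noisy blocks.
    text = TAG_RE.sub(" ", NOISE_RE.sub(" ", html))
    for match in EMAIL_RE.findall(text):
        found.add(("email", match.lower()))
    for match in PHONE_RE.findall(text):
        normalized = re.sub(r"[^\d+]", "", match)
        if len(normalized) - 1 >= 10:  # at least 10 digits after the "+"
            found.add(("phone", normalized))
    return found
```

Requiring the + prefix in the fallback pattern trades recall for precision: locally formatted numbers are only picked up when they appear in a tel: link.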
Tested on batirenovation-strasbourg.fr: the scraper now finds 3 contacts
(email, phone, address); before this change it produced 120+ false
positives and crashed.
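
For the address phase, a sketch of the Schema.org JSON-LD and <address> tag sources might look like this. It is an illustration under assumptions, not the committed code: extract_addresses and the helper names are hypothetical, the street-keyword regex is left out, and trying JSON-LD before <address> is an assumed priority order.

```python
import json
import re

JSONLD_RE = re.compile(
    r'<script[^>]*type=["\']application/ld\+json["\'][^>]*>(.*?)</script>',
    re.S | re.I,
)
ADDRESS_TAG_RE = re.compile(r"<address\b[^>]*>(.*?)</address>", re.S | re.I)
TAG_RE = re.compile(r"<[^>]+>")
# Schema.org PostalAddress properties used to build one address line.
FIELDS = ("streetAddress", "postalCode", "addressLocality", "addressCountry")


def _walk(node):
    """Yield every dict nested anywhere inside a decoded JSON value."""
    if isinstance(node, dict):
        yield node
        for value in node.values():
            yield from _walk(value)
    elif isinstance(node, list):
        for item in node:
            yield from _walk(item)


def extract_addresses(html):
    """Return address strings, preferring JSON-LD over <address> tags."""
    addresses = []
    for block in JSONLD_RE.findall(html):
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            continue  # skip malformed JSON-LD instead of crashing
        for node in _walk(data):
            if node.get("@type") != "PostalAddress":
                continue
            # Keep plain string fields only (addressCountry may be an object).
            parts = [node.get(key) for key in FIELDS]
            line = ", ".join(p.strip() for p in parts if isinstance(p, str) and p.strip())
            if line and line not in addresses:
                addresses.append(line)
    if not addresses:
        # Fallback: flatten <address> contents to whitespace-normalized text.
        for inner in ADDRESS_TAG_RE.findall(html):
            text = " ".join(TAG_RE.sub(" ", inner).split())
            if text and text not in addresses:
                addresses.append(text)
    return addresses
```

Flattening tags to single spaces is what lets an address split across elements (for example by <br>) be matched as one string.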
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@@ -472,8 +472,15 @@
 <p class="text-sm font-medium text-gray-800 dark:text-gray-200" x-text="store.store_name"></p>
 <p class="text-xs text-gray-400 font-mono" x-text="store.store_code"></p>
 </div>
-<span class="px-2 py-1 text-xs rounded-full bg-purple-100 dark:bg-purple-900 text-purple-700 dark:text-purple-300"
-      x-text="store.role_name || 'Owner'"></span>
+<div class="flex items-center gap-2">
+    <span class="px-2 py-0.5 text-xs rounded-full"
+          :class="store.is_pending
+              ? 'bg-orange-100 text-orange-700 dark:bg-orange-900 dark:text-orange-200'
+              : 'bg-green-100 text-green-700 dark:bg-green-900 dark:text-green-200'"
+          x-text="store.is_pending ? '{{ _('common.pending') }}' : '{{ _('common.active') }}'"></span>
+    <span class="px-2 py-0.5 text-xs rounded-full bg-purple-100 dark:bg-purple-900 text-purple-700 dark:text-purple-300"
+          x-text="store.role_name || 'Owner'"></span>
+</div>
 </div>
 </template>
 </div>