Add /admin/portal-seo to run SEO audits on nordabiznes.pl
using the same SEOAuditor used for company websites.
Tracks results over time for before/after comparison.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Show company logos with website and social media links
to unauthenticated visitors below the existing landing
page content, improving local content indexability.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace the native title attribute with a styled tooltip that appears after
a half-second hover. The tooltip is larger and darker, with an arrow pointer,
for better readability.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Show social media cards, SEO PageSpeed scores, and GBP stats
directly in admin view. Add "Profil publiczny" link to header.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add WWW discovery, Social Media audit, and Logo fetch buttons.
Replace spinner with progress bar showing step descriptions.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Bulk discovery skips companies with any candidate (including rejected)
- Single discovery skips URLs from previously rejected domains
- Dashboard shows list of companies rejected by admin with note
  that they won't be re-searched in bulk mode
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The in-memory _bulk_jobs dict was per-worker under gunicorn (4 workers),
so poll requests handled by a different worker missed the job state.
Job state is now stored in JSON files under /tmp, visible to all workers.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Previous regex only matched 3-3-2-2 format. New universal pattern
catches any 10-digit NIP with dashes/spaces in any position.
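One way such a universal pattern could look, as a sketch only: `NIP_PATTERN` and `extract_nips` are hypothetical names, and the actual regex in the codebase may differ. The pattern accepts ten digits with at most one dash or whitespace character between any two of them; it does not verify the NIP checksum.

```python
import re

# Ten digits, each pair optionally separated by one dash or whitespace
# character. The lookarounds stop longer digit runs from matching.
NIP_PATTERN = re.compile(r"(?<!\d)\d(?:[\s-]?\d){9}(?!\d)")


def extract_nips(text):
    """Return NIP candidates found in text, normalized to bare digits."""
    return [re.sub(r"[\s-]", "", m.group()) for m in NIP_PATTERN.finditer(text)]
```

Because any 10-digit sequence matches, callers would still need to compare the result against the known NIP of the company.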
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Increase candidate pool from 3 to 5. Stop evaluating once a
candidate matches NIP/REGON/KRS (100% certainty).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Root page often lacks NIP/REGON. Now scrapes /kontakt/, /contact,
/o-nas, /o-firmie to find strong verification signals. Stops early
when NIP/REGON/KRS found.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Strip paths from candidate URLs (e.g. /kontakt/, /about/) to always
save root domain. Deduplicates results pointing to same domain.
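A sketch of the normalization using only the standard library. The helper names are hypothetical, and treating hosts case-insensitively is an assumption:

```python
from urllib.parse import urlsplit


def root_url(url):
    """Drop path, query and fragment, keeping scheme and host."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}/"


def dedupe_by_domain(urls):
    """Return root URLs, keeping the first occurrence of each host."""
    seen = set()
    result = []
    for url in urls:
        host = urlsplit(url).netloc.lower()
        if host not in seen:
            seen.add(host)
            result.append(root_url(url))
    return result
```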
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Users saw 0 candidates because the page didn't refresh after bulk
discovery completed and the modal was closed.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
"Jubiler Agat" now matches "agat-jubiler.pl" by checking individual
words in any order, not just concatenated substring.
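A sketch of order-independent matching. The function name, the 3-character minimum word length, the requirement that all words match, and the ASCII-only tokenization are assumptions made for illustration:

```python
import re


def domain_matches_name(domain, company_name):
    """True if every significant word of the name occurs in the domain label."""
    host = domain.lower()
    if host.startswith("www."):
        host = host[4:]
    label = host.split(".")[0]  # e.g. "agat-jubiler"
    # Assumption: words shorter than 3 characters are not significant.
    words = [w for w in re.findall(r"[a-z0-9]+", company_name.lower()) if len(w) >= 3]
    return bool(words) and all(w in label for w in words)
```

With this check, "Jubiler Agat" matches "agat-jubiler.pl" even though the concatenated form "jubileragat" is not a substring of the domain.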
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Evaluate top 3 Brave results instead of just taking the first one.
Add domain name matching signal (+2 pts when domain contains company name).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Show Brave search description under company name and scraped page
text snippet (first 500 chars) as expandable row below each
candidate, helping admin verify if the URL matches the company.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Added norda-biznes.info, bizraport.pl, aplikuj.pl, lexspace.pl,
  drewnianeabc.pl, f-trust.pl, itspace.llc to directory blacklist
- Delay first poll by 3s so thread has time to populate total
- Better completion messages (show count, handle 0 remaining)
- Increase poll interval to 3s
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Bulk query now excludes companies that already have pending/accepted
candidates, so only truly new companies are processed via Brave API.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Added imsig.pl, monitorfirm.pb.pl, zwiazekpracodawcow.pl,
transfermarkt.pl, mapcarta.com and other directories/portals
that returned false positives in first production run.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The Brave free tier rate-limits aggressively (HTTP 429 at roughly 1 req/s).
Added retry logic (3 attempts with 3s, 6s, 9s waits) and increased the
inter-company delay from 2s to 5s. Error candidates are now
cleaned up before retry to allow re-discovery.
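The retry schedule could be sketched as follows. `RateLimited` and `fetch_with_retry` are hypothetical names, and the reading that each failed attempt is followed by a wait of 3s times the attempt number is an assumption based on the description above:

```python
import time


class RateLimited(Exception):
    """Hypothetical error raised by the caller's fetch function on HTTP 429."""


def fetch_with_retry(fetch, attempts=3, base_wait=3, sleep=time.sleep):
    """Call fetch() up to `attempts` times, waiting 3s, 6s, 9s after each 429."""
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except RateLimited:
            sleep(base_wait * attempt)
    return None  # all attempts were rate limited
```

Passing `sleep` as a parameter keeps the helper testable without real delays.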
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace single latest_result field with cumulative log array and
offset-based polling to prevent missed entries and race conditions.
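A minimal sketch of the idea, with hypothetical function and key names. The server only appends to the log, and each client remembers how many entries it has already seen:

```python
def append_log(job, entry):
    """Append one entry to the job's cumulative log."""
    job.setdefault("log", []).append(entry)


def poll(job, offset=0):
    """Return entries the client has not seen yet, plus the next offset."""
    log = job.get("log", [])
    return {"entries": log[offset:], "next_offset": len(log)}
```

A client that polls slowly still receives every entry, because nothing is overwritten between polls.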
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Automated discovery using Brave Search API to find company websites,
scrape verification data (NIP/REGON/KRS/email/phone), and present
candidates with match badges in the data quality dashboard.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Treat & as word connector (P&P, S&K stay as single tokens)
- Add prefix matching with legal suffix stripping (Sp. z o.o., S.A.)
- Add reverse prefix for brand vs legal name (Pixlab Softwarehouse ↔ Pixlab Sp. z o.o.)
- Compound names like TERMO-BUD still correctly rejected (no space separator)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
TERMO vs TERMO-BUD was incorrectly accepted (score 1.0) because
denominator only counted company words. Now uses max(company, google)
so extra words in either name lower the score.
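The scoring change can be illustrated with a small sketch. The function name is hypothetical, and the inputs are assumed to be already tokenized, with TERMO-BUD split into two tokens for this example:

```python
def name_match_score(company_words, google_words):
    """Share of common words, relative to the longer of the two names."""
    company, google = set(company_words), set(google_words)
    if not company or not google:
        return 0.0
    # Dividing by the larger set means extra words on either side
    # lower the score.
    return len(company & google) / max(len(company), len(google))


# Dividing by len(company) alone would give 1/1 = 1.0 here.
print(name_match_score(["termo"], ["termo", "bud"]))  # 0.5
```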
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Replace substring matching with word-boundary tokenized matching
- Short names (1-2 words): require ALL significant words to match
- Longer names (3+): require at least 50% word overlap
- Pick best-scoring result instead of first match
- Add company_name validation to competitor_monitoring_service
- Show Google profile name in dashboard hints for admin verification
- Display mismatch warning when Google name differs from company name
Prevents cases like "IT Space" matching "Body Space" (score 0.50 < 1.00 threshold).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add clickable field coverage bars to filter companies missing specific data
- Add quick-action buttons (Registry/SEO/GBP) per company in dashboard table
- Add stale data detection (>6 months) with yellow badges
- Implement weighted priority score (contacts 34%, audits 17%)
- Add data hints in admin company detail showing where to find missing data
- Add "Available data" section showing Google Business data ready to apply
- Add POST /api/company/<id>/apply-hint endpoint for one-click data fill
- Extend website content updater with phone/email extraction (AI + regex)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Extract 12-field completeness scoring to utils/data_quality.py service
- Auto-update data_quality_score and data_quality label on company data changes
- Add /admin/data-quality dashboard with field coverage stats, quality distribution, and sortable company table
- Add bulk enrichment with background processing, step selection, and progress tracking
- Flow GBP phone/website to Company record when company fields are empty
- Display Google opening hours on public company profile
- Add BulkEnrichmentJob model and migration 075
- Refactor arm_company.py to support selective steps and progress callbacks
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Previously search_place() blindly returned the first result, which
could be a completely unrelated business. Now validates that at least
one significant word from the company name appears in the Google
result before accepting it. Prevents wrong GBP profiles being linked
to companies (e.g. Rozsadni Bracia getting Zielony Zolwik's profile).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Change section margins from --spacing-xl (2rem) to --spacing-md (1rem)
and reduce inner padding for a more compact layout.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove the dark/light background toggle from logo selection modal,
company detail pages (admin and public), and the toggle-logo-bg API
endpoint. The feature didn't meet UX requirements.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add explicit white background to gallery cards so only the inner
image container changes to dark, keeping card borders, text and
badges on their original light backgrounds.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When choosing a logo, admins can now switch between light and dark
backgrounds to see which works better for transparent logos. The
selected background preference is automatically saved when confirming.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The toggle was only in the admin panel; it is now also shown next to
"Pobierz logo" on the public company detail page.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds a per-company setting to display logos on dark background,
useful for logos with white text or light-colored elements.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
img.get('class') can return an empty list [], causing IndexError
when accessing [0]. Added `or ['']` fallback.
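The fallback pattern in isolation, using a plain dict as a stand-in for the tag's attributes. The helper name `first_class` is hypothetical:

```python
def first_class(attrs):
    """Return the first CSS class, or '' when the list is missing or empty."""
    # `or ['']` covers both None (attribute absent) and [] (empty list),
    # so indexing [0] can no longer raise IndexError.
    return (attrs.get("class") or [""])[0]
```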
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The script was calling /firma?nip=X (wrong endpoint) instead of using
fetch_ceidg_by_nip() which does two-phase /firmy?nip=X then /firma/{id}.
Now uses the same service and field mapping as the admin panel button.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The script was calling auditor.audit_company() but not auditor.save_audit_result(),
causing profiles to be found but never persisted to the database.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- GBP: access .completeness_score attribute + call save_audit()
- Social: count saved DB records instead of parsing audit result dict
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- KRS: search_by_nip() returns dict, not just KRS number
- SEO: SEOAuditor(database_url) + audit_company(company_dict)
- Social: SocialMediaAuditor() + audit_company(company_dict)
- GBP: GBPAuditService(db) + audit_company(company_id)
- Support multiple company IDs in one invocation
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Allows running the same enrichment workflow as the "Uzbrój firmę" button
directly from the command line, without needing browser/admin login.
Usage: python3 scripts/arm_company.py <company_id> [--force]
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Two buttons side by side: "Uzbrój firmę" runs only missing steps,
"Zaktualizuj dane" forces re-run of all steps regardless of status.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The logo path was hardcoded to .webp even when the actual file was .svg,
causing broken image display for SVG logos like Orlex Design.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Logo check only looked for .webp files, missing SVG logos like
Orlex Design. Now checks both extensions.
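A sketch of a check covering both extensions. The function name, the `<slug><ext>` file layout, and the preference for .webp over .svg are assumptions:

```python
from pathlib import Path


def find_logo(logo_dir, slug, extensions=(".webp", ".svg")):
    """Return the first existing logo file for the slug, or None."""
    for ext in extensions:
        candidate = Path(logo_dir) / f"{slug}{ext}"
        if candidate.is_file():
            return candidate
    return None
```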
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>