- Bulk discovery skips companies with any candidate (including rejected)
- Single discovery skips URLs from previously rejected domains
- Dashboard lists companies rejected by an admin, with a note
  that they won't be re-searched in bulk mode
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Previous regex only matched 3-3-2-2 format. New universal pattern
catches any 10-digit NIP with dashes/spaces in any position.
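
A minimal sketch of what such a universal pattern could look like. The
names NIP_PATTERN and find_nips are hypothetical, and the exact regex used
in the code base is not shown in this message, so treat this as one
possible reading: ten digits, each optionally followed by a single space
or dash, not embedded in a longer digit run.

```python
import re

# Assumed pattern (not taken from the code base): 10 digits with an
# optional single space or dash after any of the first nine digits.
NIP_PATTERN = re.compile(r"(?<!\d)(?:\d[ -]?){9}\d(?!\d)")


def find_nips(text: str) -> list[str]:
    """Return NIP-like matches as bare 10-digit strings."""
    return [re.sub(r"[ -]", "", m.group(0)) for m in NIP_PATTERN.finditer(text)]
```

This sketch only checks the shape of the number; it does not validate the
NIP checksum.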
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Increase candidate pool from 3 to 5. Stop evaluating once a
candidate matches NIP/REGON/KRS (100% certainty).
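
The early-stop behaviour can be sketched roughly as below. The function
pick_candidate, the evaluate callable, and the "score" / "id_match" keys
are all hypothetical names chosen for illustration; only the pool size of
5 and the stop-on-identifier-match rule come from this message.

```python
def pick_candidate(urls, evaluate, pool_size=5):
    """Evaluate up to pool_size URLs, stopping at the first identifier match.

    `evaluate` is a hypothetical callable returning a dict with at least
    "score" (number) and "id_match" (True when NIP/REGON/KRS matched).
    """
    best = None
    for url in urls[:pool_size]:
        result = evaluate(url)
        if result["id_match"]:
            # An identifier match is treated as certain, so skip the rest.
            return result
        if best is None or result["score"] > best["score"]:
            best = result
    return best
```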
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The root page often lacks NIP/REGON. Now scrapes /kontakt/, /contact,
/o-nas, /o-firmie to find strong verification signals. Stops early
once NIP/REGON/KRS is found.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Strip paths from candidate URLs (e.g. /kontakt/, /about/) to always
save root domain. Deduplicates results pointing to same domain.
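
A rough sketch of the normalization and deduplication described above,
using only the standard library. The helper names root_url and
dedupe_by_domain are hypothetical, and dropping a leading "www." is an
assumption made here so that both spellings count as the same domain.

```python
from urllib.parse import urlsplit


def _split(url: str):
    """Return (scheme, host) for a URL, defaulting to https when missing."""
    parts = urlsplit(url if "://" in url else "https://" + url)
    host = (parts.hostname or "").lower()
    if host.startswith("www."):  # assumption: www and bare host are equal
        host = host[4:]
    return parts.scheme, host


def root_url(url: str) -> str:
    """Strip path, query and fragment, keeping only scheme and host."""
    scheme, host = _split(url)
    return f"{scheme}://{host}"


def dedupe_by_domain(urls):
    """Keep the first root URL seen for each host, preserving order."""
    seen = set()
    roots = []
    for url in urls:
        _, host = _split(url)
        if host and host not in seen:
            seen.add(host)
            roots.append(root_url(url))
    return roots
```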
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
"Jubiler Agat" now matches "agat-jubiler.pl" by checking individual
words in any order, not just concatenated substring.
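
One way the word-by-word check could work is sketched below. The function
names, the diacritic folding, and the minimum word length of 3 (which
happens to drop fragments like "sp", "z", "o") are assumptions for
illustration, not the actual implementation.

```python
import re
import unicodedata


def name_words(company_name: str, min_len: int = 3) -> list[str]:
    """Split a company name into lowercase ASCII words of useful length."""
    text = company_name.lower().replace("ł", "l")  # ł has no NFKD decomposition
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    return [w for w in re.findall(r"[a-z0-9]+", text) if len(w) >= min_len]


def domain_matches_name(domain: str, company_name: str) -> bool:
    """True when every name word appears in the domain, in any order."""
    host = domain.lower()
    if host.startswith("www."):
        host = host[4:]
    label = host.rsplit(".", 1)[0]  # drop the final TLD segment
    words = name_words(company_name)
    return bool(words) and all(word in label for word in words)
```

With this sketch, domain_matches_name("agat-jubiler.pl", "Jubiler Agat")
is True, while a domain containing only one of the two words is not a
match.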
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Evaluate top 3 Brave results instead of just taking the first one.
Add domain name matching signal (+2 pts when domain contains company name).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Added norda-biznes.info, bizraport.pl, aplikuj.pl, lexspace.pl,
drewnianeabc.pl, f-trust.pl, itspace.llc to directory blacklist
- Delay first poll by 3s so the thread has time to populate the total
- Better completion messages (show count, handle 0 remaining)
- Increase poll interval to 3s
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Added imsig.pl, monitorfirm.pb.pl, zwiazekpracodawcow.pl,
transfermarkt.pl, mapcarta.com and other directories/portals
that returned false positives in first production run.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The Brave free tier rate-limits aggressively (HTTP 429 above ~1 req/s).
Added retry logic (3 attempts: 3s, 6s, 9s waits) and increased
inter-company delay from 2s to 5s. Error candidates are now
cleaned up before retry to allow re-discovery.
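
The retry schedule above might look roughly like this. RateLimited and
fetch_with_retry are hypothetical names, and the actual Brave request is
left to a caller-supplied function rather than guessed at. It is also an
assumption that the wait follows each failed attempt (3s, 6s, 9s) and that
the error is re-raised once all three attempts have failed.

```python
import time


class RateLimited(Exception):
    """Hypothetical error raised by the caller's fetch on HTTP 429."""


def fetch_with_retry(fetch, attempts=3, base_wait=3, sleep=time.sleep):
    """Call fetch() up to `attempts` times, waiting 3s, 6s, 9s after failures."""
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except RateLimited:
            # Linear backoff: base_wait * attempt seconds.
            sleep(base_wait * attempt)
    raise RateLimited(f"still rate limited after {attempts} attempts")
```

Passing sleep as a parameter keeps the sketch testable without real
delays.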
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Automated discovery using Brave Search API to find company websites,
scrape verification data (NIP/REGON/KRS/email/phone), and present
candidates with match badges in the data quality dashboard.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>