Direct slug check (e.g. linkedin.com/company/waterm) can match
a different company with the same name. LinkedIn public metadata
is too minimal to verify location/industry without API access.
Rely on Brave Search with title/description validation instead.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add direct URL check for linkedin.com/company/{slug} before Brave Search
- Prioritize /company/ over /in/ in search result ranking
- Use targeted query "company_name linkedin.com/company" first
- Fall back to personal profile search only if company page not found
- Verify page title matches company name to avoid false positives
Fixes: WATERM showed employee's personal profile instead of existing
company page at linkedin.com/company/waterm
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Replace placeholder _search_brave() with real Brave API integration
- Fix LinkedIn URL construction: /in/ profiles were incorrectly built as /company/
- Add word-boundary matching to validate search results against company name
- Track source (website_scrape vs brave_search) per platform in audit results
- Increase search results from 5 to 10 for better coverage
Fixes: WATERM LinkedIn profile not detected (website has no LinkedIn link,
but Brave Search finds the personal /in/ profile)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
DATABASE_URL and PAGESPEED_API_KEY are read at module level (import
time), so load_dotenv must run before third-party imports that
reference these variables.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add load_dotenv() to seo_audit.py so it reads GOOGLE_PAGESPEED_API_KEY
and DATABASE_URL from .env without requiring manual env var passing.
Fixes PageSpeed 429 errors when running audit via SSH.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
seo_audit.py was missing SSL columns (has_ssl, ssl_expires_at,
ssl_issuer) in its INSERT/UPDATE query, causing all SEO-audited
companies to show has_ssl=false regardless of actual certificate status.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The scraper was matching facebook.com/tr (Meta Pixel tracking endpoint)
as a valid Facebook profile handle. Added 'tr', 'privacy', 'policies',
'ads', 'business', 'legal', 'flx' to the Facebook exclusion list.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Display up to 3 next events with RSVP status instead of just one
- Add import script for WhatsApp Norda group data (Feb 2026):
events, company updates, Alter Energy, Croatia announcement
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix ~190 hardcoded Polish strings missing diacritical characters
across seo_audit.html, gbp_audit.html, social_audit.html
- Fix encoding issue in SEO scraper: requests defaults to ISO-8859-1
when server omits charset, causing mojibake for UTF-8 pages.
Now uses apparent_encoding detection.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Company settings page with 4 OAuth cards (GBP, Search Console, Facebook, Instagram)
- 3 API service clients: GBP Management, Search Console, Facebook Graph
- OAuth enrichment in GBP audit (owner responses, posts), social media (FB/IG Graph API),
and SEO prompt (Search Console data)
- Fix OAuth callback redirects to point to company settings page
- All integrations have graceful fallback when no OAuth credentials configured
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Google replaced First Input Delay (FID) with Interaction to Next Paint
(INP) as a Core Web Vital in March 2024. This renames the DB column
from first_input_delay_ms to interaction_to_next_paint_ms, updates the
PageSpeed client to prefer the INP audit key, and fixes all references
across routes, services, scripts, and report generators. Updated INP
thresholds: good ≤200ms, needs improvement ≤500ms.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
GBP audit:
- Fix review_response_rate bug: check ownerResponse instead of authorAttribution.displayName
- Mark has_posts/has_products/has_qa as OAuth-dependent in AI prompt
- Add review_keywords and description_keywords to AI prompt
SEO audit:
- Replace deprecated FID with INP (Core Web Vital since March 2024)
- Pass 10 additional metrics to AI prompt: FCP, TTFB, TBT, Speed Index,
meta title/desc length, html lang, Schema.org field details
- Update templates with INP thresholds (200ms/500ms)
Social media audit:
- Calculate engagement_rate from industry base rates × activity multiplier
- Calculate posting_frequency_score (0-10 based on posts_count_30d)
- Enrich AI prompt with page_name, freq_score, engagement, last_post_date
- Add avg engagement rate and brand name consistency check to prompt
Completeness: 52% → ~68% (estimated)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- google_places_service.py: Google Places API integration
- competitor_monitoring_service.py: Competitor tracking service
- scripts/competitor_monitor_cron.py, scripts/generate_audit_report.py
- blueprints/admin/routes_competitors.py, templates/admin/competitor_dashboard.html
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add regex pattern for Facebook /p/PageName-ID/ multi-segment URLs
- Add 'p' to Facebook exclusion list (bare /p is always truncated)
- Add minimum length validation for extracted social handles
- Strip Instagram tracking params (?igsh=, &utm_source=) from handles
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The regex was capturing 'profile.php' as a username instead of extracting
the numeric ID from profile.php?id=XXX links. Added dedicated pattern for
profile.php URLs and added profile.php to exclusion list.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace ~20 remaining is_admin references across backend, templates and scripts
with proper SystemRole checks. Column is_admin stays as deprecated (synced by
set_role()) until DB migration removes it.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fixes email mismatches that would have created duplicate accounts:
- Artur Wiertel: norda-biznes.info → waterm.pl (existing prod ID=3)
- Andrzej Gorczycki: zukwejherowo.pl → ekofabrykawejherowo.pl (prod ID=41)
Adds secretary (Magdalena Klóska) and deactivates corrupted duplicate (ID=40).
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- pytest framework with fixtures for auth (auth_client, admin_client)
- Unit tests for SearchService
- Integration tests for auth flow
- Security tests (OWASP Top 10: SQL injection, XSS, CSRF)
- Smoke tests for production health and backup monitoring
- E2E tests with Playwright (basic structure)
- DR tests for backup/restore procedures
- GitHub Actions CI/CD workflow (.github/workflows/test.yml)
- Coverage configuration (.coveragerc) with 80% minimum
- DR documentation and restore script
Staging environment: VM 248, staging.nordabiznes.pl
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Removed limits on services (was 10) and keywords (was 8)
- Added new extraction categories:
- products: physical/digital products
- brands: partners, certifications (VMware, Veeam, etc.)
- specializations: specific competencies
- target_customers: customer types (SMB, enterprise, etc.)
- regions: geographic coverage
- Merged all data into services_extracted and main_keywords
- Increased content limit to 20000 chars for AI
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- New script: scripts/website_content_updater.py
- Uses Gemini 3 Flash (free tier) for AI extraction
- Extracts: services_extracted, main_keywords, content_summary
- Supports: single company, batch, stale-days filtering, dry-run
- Rate limiting: 2s between API calls
- Documented cron setup in CLAUDE.md
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add krs_raw_data, krs_fetched_at, krs_registration_date,
krs_representation, krs_activities columns to Company model
- Save complete KRS API response for full data access
- Display in company profile:
- Board members (zarząd) with functions and avatars
- Shareholders (wspólnicy) with share amounts
- Representation method (sposób reprezentacji)
- Business activities (PKD codes)
- Registration date with years active
- KRS address with region info
- OPP (public benefit) status
- Metadata (stan_z_dnia, data_odpisu)
- Add migration 037_krs_extended_data.sql
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add enrich_companies_from_registries() that tries KRS first for companies
with KRS number, then falls back to CEIDG
- Add update_company_from_krs() to save KRS data to Company model
- Fix CEIDG search to use 'nip' parameter directly
- Keep enrich_companies_from_ceidg() as alias for compatibility
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add new Company fields: ceidg_id, ceidg_status, pkd_codes (JSONB),
correspondence address, owner_citizenships, ceidg_raw_data
- Add enrich_companies_from_ceidg() to fetch full CEIDG details
- Add fetch_full_ceidg_details() for detailed API calls
- Add update_company_from_ceidg() to save all CEIDG fields
- Add --enrich and --apply flags for batch enrichment
- Add migration 036_ceidg_extended_data.sql
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add search_ceidg_by_name() for API v3 name-based queries
- Add search_missing_nip_companies() to find NIP for companies without NIP
- Add --missing-nip flag to search for all companies missing NIP
- Add --apply-nip flag to save found NIPs to database
- Fix API endpoint: /api/ceidg/v3/firmy (not /firma)
- Correctly extract NIP from wlasciciel object in response
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Changes:
- Remove position: sticky from konto sidebar (dane, prywatnosc, bezpieczenstwo, blokady)
- Add "Firmy" link to admin dropdown menu (before "Użytkownicy")
- Add scan_websites_for_nip.py script for data quality
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Zmiana nazwy: "Norda Biznes Hub" → "Norda Biznes Partner"
- Aktualizacja modelu AI: Gemini 2.0 Flash → Gemini 3 Flash
- Zachowano historyczne odniesienia w timeline i dokumentacji
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- New admin page /admin/model-comparison for comparing AI responses
- Side-by-side comparison: old model (2.5 Flash-Lite) vs new (3 Flash)
- Questions from real conversations (Artur Wiertel, Maciej Pienczyn)
- Run simulation button to generate new responses
- Added link in admin menu under "Porównanie modeli"
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Dodanie parent_id do tabeli categories
- Model Category z relacją parent/subcategories
- 4 główne grupy: Usługi, Budownictwo, Handel, Produkcja
- Skrypt assign_category_parents.py do przypisania podkategorii
- Migracja 030_add_category_hierarchy.sql
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Landing page: przycisk "Norda Partner" + kontakt Izby (email, WhatsApp)
- Landing page: link "Strefa Gościa" → norda-biznes.info
- Menu "Więcej": dodano "Strefa Gościa (Izba)" dla zalogowanych
- Forum: ukryto filtry kategorii/statusów (uproszczenie UX)
- README: zmiana "AI Assistant" → "NordaGPT"
- Skrypt import firmy testowej "Kaszubia 2030"
- .gitignore: wykluczenie notatek ze spotkań (MEETING_*.md)
Zmiany na podstawie spotkania 2026-01-28 i uwag Artura Wiertla.
Wzór nawigacji: Vaillant.pl (Klienci indywidualni / Profesjonaliści)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Skrypt importu miał błędną godzinę 18:00. Faktyczne spotkania
"Chwila dla Biznesu" odbywają się o 19:00.
Zaktualizowano:
- Komentarz w linii 50
- Opis wydarzenia (description)
- time_start: time(19, 0)
Istniejące wydarzenia w bazie zostały zaktualizowane ręcznie (UPDATE SQL).
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Dodano skrypt cron do automatycznej ekstrakcji wiedzy (scripts/cron_extract_knowledge.py)
- Dodano panel deduplikacji faktów (/admin/zopk/knowledge/fact-duplicates)
- Dodano API i funkcje auto-weryfikacji encji i faktów
- Dodano panel Timeline ZOPK (/admin/zopk/timeline) z CRUD
- Rozszerzono dashboard bazy wiedzy o statystyki weryfikacji i przyciski auto-weryfikacji
- Dodano migrację 016_zopk_milestones.sql dla tabeli kamieni milowych
- Naprawiono duplikat modelu ZOPKMilestone w database.py
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Zmieniono 'processed' -> 'success' i 'generated' -> 'success' aby
pasowały do wartości zwracanych przez batch_extract() i
generate_chunk_embeddings().
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Uruchamia po kolei: scraping treści, ekstrakcję AI, generowanie embeddingów.
Do użycia w cron job co godzinę.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Problem: Newsy z Google News RSS miały source_domain='news.google.com'
i favicon Google zamiast prawdziwego źródła.
Rozwiązanie: Nowy skrypt fix_google_news_sources.py który:
- Wyciąga nazwę źródła z tytułu (po " - ")
- Mapuje 59 źródeł na ich prawdziwe domeny
- Aktualizuje source_domain i image_url (favicon)
Wynik: 143/143 newsów zaktualizowanych z poprawnymi źródłami.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Zamieniono requests.Session() na bezpośredni requests.get()
- Dodano max_depth=3 jako zabezpieczenie przed nieskończoną rekurencją
- Jawne zamykanie response.close() po każdym request
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Dodano context manager (with) dla sesji requests
- Jawne zamykanie odpowiedzi HTTP (response.close())
- Dodano flush=True do print dla natychmiastowego outputu
- Rozwiązuje problem 725+ otwartych połączeń
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Strategia pobierania obrazków:
1. Rozwiń URL Google News do oryginalnego źródła
2. Pobierz og:image z meta tagów strony
3. Fallback: logo domeny (Clearbit API)
4. Fallback: favicon (Google Favicon API)
Użycie: python scripts/fetch_news_images.py [--dry-run] [--limit N]
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Nowa kolumna member_since w tabeli companies
- Karta "Członek Izby NORDA od" na profilu firmy (niebieski kolor #3b82f6)
- Wyświetlanie liczby lat w Izbie
- Import 57 dat przystąpienia z pliku Excel od Artura
- Skrypt import_member_since.py do importu dat
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>