Commit Graph

22 Commits

Author SHA1 Message Date
39cd257f4e Fix YouTube detection overwriting valid matches
- Add 'channel', 'c', 'user', '@' etc. to YouTube exclusion list
- Add 'bold_themes', 'boldthemes' to Twitter/Facebook exclusions (theme creators)
- Fix pattern matching loop to stop after first valid match per platform
- Prevents fallback pattern from overwriting correct channel ID with 'channel'

Fixes issue where youtube.com/channel/ID was being overwritten with
youtube.com/channel/channel by the second fallback pattern.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-09 05:36:06 +01:00
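
The first-match fix described in 39cd257f4e can be sketched as follows. This is a minimal illustration, not the repository's code: the pattern list, the exclusion set, and the name `find_youtube_handle` are assumptions made for the example.

```python
import re
from typing import Optional

# Hypothetical patterns and exclusions; the real lists in the repository may differ.
YOUTUBE_PATTERNS = [
    re.compile(r"youtube\.com/channel/([\w-]+)"),  # specific pattern first
    re.compile(r"youtube\.com/([\w@-]+)"),         # generic fallback
]
YOUTUBE_EXCLUDES = {"channel", "c", "user", "@", "watch", "embed"}


def find_youtube_handle(html: str) -> Optional[str]:
    """Return the first capture that is not an excluded path segment."""
    for pattern in YOUTUBE_PATTERNS:
        for match in pattern.findall(html):
            if match.lower() not in YOUTUBE_EXCLUDES:
                # Stop here so the fallback pattern cannot overwrite this value.
                return match
    return None


channel_id = find_youtube_handle('<a href="https://www.youtube.com/channel/UCabc_123">')
```

Returning from inside the loop is what keeps the fallback pattern from replacing a channel ID with the literal segment 'channel'.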
c319777d58 Social Media audit: progress bar improvements
- Add detailed logging to SocialMediaAuditor (website scan, Brave search, results)
- Slow down progress bar animation (400ms instead of 200ms) for better readability
- Bold "ZNALEZIONO" text for found platforms
- Display Google rating and review count in progress
- Increase wait time before modal close (4 seconds)
- Add console.log for debugging audit response

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-09 05:29:17 +01:00
8fed190303 fix(social-audit): Convert opening_hours dict to JSON for JSONB column
Fixes: psycopg2.ProgrammingError: can't adapt type 'dict'

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-09 05:14:01 +01:00
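
A minimal sketch of the conversion described in 8fed190303, assuming the dict is serialized with the standard `json` module before it is bound to the query. The helper name `to_jsonb_param` and the sample data are hypothetical.

```python
import json


def to_jsonb_param(value):
    """Serialize dicts and lists to a JSON string; pass other values through."""
    if isinstance(value, (dict, list)):
        return json.dumps(value, ensure_ascii=False)
    return value


# Sample data shaped loosely like an opening-hours payload (illustrative only).
opening_hours = {"open_now": True, "weekday_text": ["Monday: 08:00-16:00"]}
params = {"google_opening_hours": to_jsonb_param(opening_hours)}
```

psycopg2 also provides the `psycopg2.extras.Json` wrapper for the same purpose; which approach the commit uses is not stated here.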
5ed97ac1dd auto-claude: subtask-5-1 - Fix opening_hours and photos data passing in audit_company
Fixed a bug where google_opening_hours and google_photos_count were being
fetched from the Google Places API but not passed through to the result
dictionary correctly:

- Changed 'opening_hours' key to 'google_opening_hours' to match what
  save_audit_result() expects
- Added 'google_photos_count' to the result dictionary

Verified with dry-run: the INPI company now shows its opening hours schedule
and a photos count of 10 from Google Business Profile.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-08 23:08:19 +01:00
aacf2cf54b auto-claude: subtask-2-3 - Update save_audit_result() to store google_opening_hours and google_photos_count
- Added google_opening_hours and google_photos_count to INSERT column list
- Added corresponding placeholders to VALUES list
- Added to ON CONFLICT UPDATE SET clause
- Added to parameter dictionary reading from google_reviews result

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-08 23:00:22 +01:00
5f2cfa06fd auto-claude: subtask-2-2 - Update get_place_details() to return photos count
- Add google_photos_count to result dictionary initialization
- Extract photos count from API response using len(place['photos'])
- Update logging to include photos count in output

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-08 22:59:04 +01:00
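
The photos count extraction in 5f2cfa06fd can be sketched like this. The commit message names `len(place['photos'])`; the guard for a missing or empty `photos` key and the helper name `extract_photos_count` are assumptions added for the example.

```python
def extract_photos_count(place: dict) -> int:
    """Count photo entries in a place-details result, treating a missing key as zero."""
    photos = place.get("photos") or []
    return len(photos)


# Illustrative payloads, not real API responses.
with_photos = extract_photos_count({"photos": [{"photo_reference": "a"}, {"photo_reference": "b"}]})
without_photos = extract_photos_count({"name": "Example"})
```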
5fa80f9efa auto-claude: subtask-2-1 - Add 'photos' to fields list in GooglePlacesSearcher
Added 'photos' field to the fields list in get_place_details() method
to enable fetching business photos from Google Places API.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-08 22:58:04 +01:00
06c22539d7 auto-claude: subtask-4-2 - Add --company-slug support and dotenv loading
- Add --company-slug argument to social_media_audit.py for easier testing
- Add get_company_id_by_slug() method to SocialMediaAuditor class
- Add python-dotenv support to load .env file from project root
- Create verify_google_places.py script for direct API testing

Note: Full verification blocked - current API key (PageSpeed) doesn't have
Places API enabled. Requires enabling Places API in Google Cloud Console
for project NORDABIZNES (gen-lang-client-0540794446).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-08 20:49:59 +01:00
3d69d53550 auto-claude: subtask-3-4 - Run all tests and verify they pass
Fixed bug in social media exclusion logic that was too aggressive.
The substring check `any(ex in match.lower() for ex in excludes)`
was incorrectly excluding valid usernames containing exclusion
strings (e.g., 'testcompany' was excluded because it contained 'p').

Changed to exact match only to properly handle Instagram post URLs
(`instagram.com/p/...`) without false positives on valid usernames.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-08 20:41:54 +01:00
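
The difference between the two checks described in 3d69d53550 can be shown side by side. The substring expression is quoted from the commit message; the exclusion set and the function names are hypothetical.

```python
EXCLUDES = {"p", "explore", "reel"}  # hypothetical exclusion list


def is_excluded_substring(match: str, excludes=EXCLUDES) -> bool:
    # Old behaviour: true if any exclusion string appears anywhere in the match.
    return any(ex in match.lower() for ex in excludes)


def is_excluded_exact(match: str, excludes=EXCLUDES) -> bool:
    # New behaviour: true only if the whole match equals an exclusion string.
    return match.lower() in excludes


old_result = is_excluded_substring("testcompany")  # 'p' occurs inside the name
new_result = is_excluded_exact("testcompany")
```

With the exact check, the path segment `p` from `instagram.com/p/...` is still rejected, while a username that merely contains the letter is kept.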
3bdbde1621 auto-claude: subtask-2-3 - Update SocialMediaAuditor to use GooglePlacesSearcher
- Add google_places_searcher attribute to SocialMediaAuditor
- Initialize GooglePlacesSearcher if GOOGLE_PLACES_API_KEY env var is set
- Update audit_company() to use Places API directly when available
- Fallback to Brave Search when API key not configured
- Log which data source is being used for reviews

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-08 20:31:22 +01:00
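
The source selection described in 3bdbde1621 might look roughly like the sketch below. Only the `GOOGLE_PLACES_API_KEY` variable name comes from the commit message; the function name and the returned labels are assumptions.

```python
import os


def choose_reviews_source(env=None) -> str:
    """Pick the reviews data source based on whether an API key is configured."""
    env = os.environ if env is None else env
    if env.get("GOOGLE_PLACES_API_KEY"):
        return "google_places"
    return "brave_search"  # fallback when no key is set


with_key = choose_reviews_source({"GOOGLE_PLACES_API_KEY": "example-key"})
without_key = choose_reviews_source({})
```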
b389287697 auto-claude: subtask-2-2 - Replace placeholder search_google_reviews() method
Implemented actual Google reviews data collection in BraveSearcher class:
- Uses GooglePlacesSearcher to find company and get place details
- Returns google_rating, google_reviews_count, opening_hours, business_status
- Falls back to Brave Search API parsing when Google API key not available
- Added _search_brave_for_reviews() helper for fallback implementation
- Proper error handling and logging throughout

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-08 20:29:50 +01:00
4110ef63b5 auto-claude: subtask-2-1 - Add GooglePlacesSearcher class to social_media_audit.py
Implements GooglePlacesSearcher class with:
- find_place() method: searches for business by name and city
  using Google Places findplacefromtext API
- get_place_details() method: retrieves rating, review count,
  opening hours, business status, phone, and website

Features:
- Uses GOOGLE_PLACES_API_KEY environment variable
- Comprehensive error handling (timeout, request errors)
- Polish language locale support
- Follows existing BraveSearcher class pattern

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-08 20:27:49 +01:00
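
As an illustration of the `find_place()` lookup in 4110ef63b5, the sketch below only builds a request URL for the Places `findplacefromtext` endpoint and sends nothing. The function name, the chosen `fields` value, and the way name and city are combined are assumptions.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

FIND_PLACE_URL = "https://maps.googleapis.com/maps/api/place/findplacefromtext/json"


def build_find_place_url(name: str, city: str, api_key: str, language: str = "pl") -> str:
    """Build a Find Place request URL for a business name and city."""
    params = {
        "input": f"{name} {city}",      # free-text query (assumed format)
        "inputtype": "textquery",
        "fields": "place_id,name,formatted_address",  # assumed field selection
        "language": language,           # Polish locale, as the commit mentions
        "key": api_key,
    }
    return f"{FIND_PLACE_URL}?{urlencode(params)}"


url = build_find_place_url("Example Company", "Example City", api_key="YOUR_KEY")
query = parse_qs(urlsplit(url).query)
```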
af003798a7 Fix database connection in SEO scripts (localhost instead of external IP) 2026-01-08 15:57:39 +01:00
feaf5d5a49 auto-claude: 8.2 - Fix SQL ANY() to IN() for SQLite compatibility
- Changed PostgreSQL-specific ANY(:ids) to use IN clause with
  dynamic placeholders for SQLite/PostgreSQL compatibility
- Verified SEO audit dry-run extracts all metrics correctly:
  - HTTP status, load time, final URL
  - Meta title, H1 count, image analysis
  - Structured data detection
  - robots.txt, sitemap.xml, indexability
  - Overall SEO score calculation (95 for pixlab.pl)

Note: Company ID 26 has no website configured, tested with ID 1 instead.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-08 09:21:12 +01:00
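
The `ANY(:ids)` to `IN (...)` change in feaf5d5a49 can be sketched with dynamically generated named placeholders. The helper name `build_in_clause` and the sample table are hypothetical; the example runs against an in-memory SQLite database.

```python
import sqlite3


def build_in_clause(ids, prefix="id"):
    """Return an 'IN (...)' fragment with named placeholders plus its parameters."""
    if not ids:
        raise ValueError("ids must not be empty")  # 'IN ()' is not valid in PostgreSQL
    names = [f"{prefix}_{i}" for i in range(len(ids))]
    clause = "IN (" + ", ".join(f":{name}" for name in names) + ")"
    return clause, dict(zip(names, ids))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE companies (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO companies VALUES (?, ?)", [(1, "a"), (2, "b"), (3, "c")])

clause, params = build_in_clause([1, 3])
# Only placeholder names are interpolated into the SQL; values stay bound parameters.
rows = conn.execute(
    f"SELECT id FROM companies WHERE id {clause} ORDER BY id", params
).fetchall()
```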
15ddbba8b5 auto-claude: 7.1 - Create scripts/seo_report_generator.py that generates HTML reports and JSON exports
Features:
- Single company HTML reports with full SEO audit data
- Batch HTML summary reports for multiple companies
- JSON exports for integration with other tools
- SEO recommendations based on audit findings
- CLI interface with --company-id, --batch, --all selection
- Output format options: --html, --json
- Score visualization with color-coded badges
- Core Web Vitals section with threshold indicators
- Issues and recommendations sections
- Statistics calculation for batch reports
- Polish language support in reports

Usage examples:
- python seo_report_generator.py --company-id 26 --html
- python seo_report_generator.py --all --html --output ./reports
- python seo_report_generator.py --batch 1-10 --json

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-08 09:13:18 +01:00
c24c545cfe auto-claude: 4.3 - Add save_audit_result method with ON CONFLICT DO UPDATE
- Enhanced save_audit_result method with complete column coverage
- Added missing columns to idempotent upsert query:
  - broken_links_count (for future link checking)
  - viewport_configured (derived from meta viewport tag)
  - is_mobile_friendly (derived from viewport content)
  - has_hreflang (for international SEO detection)
- All 45+ SEO columns now properly mapped for database upserts
- ON CONFLICT (company_id) DO UPDATE ensures idempotent operations

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-08 08:00:12 +01:00
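
The idempotent upsert from c24c545cfe can be illustrated with a reduced table. The table and column names come from the commit messages, but the three-column schema is a simplification, and SQLite (3.24 or newer) stands in for PostgreSQL here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE company_website_analysis (
        company_id INTEGER PRIMARY KEY,
        broken_links_count INTEGER,
        has_hreflang INTEGER
    )
""")

UPSERT = """
    INSERT INTO company_website_analysis (company_id, broken_links_count, has_hreflang)
    VALUES (:company_id, :broken_links_count, :has_hreflang)
    ON CONFLICT (company_id) DO UPDATE SET
        broken_links_count = excluded.broken_links_count,
        has_hreflang = excluded.has_hreflang
"""


def save_audit_result(conn, result):
    """Insert a row, or update the existing row for the same company_id."""
    conn.execute(UPSERT, result)
    conn.commit()


# Saving twice for the same company leaves one row holding the latest values.
save_audit_result(conn, {"company_id": 1, "broken_links_count": 5, "has_hreflang": 0})
save_audit_result(conn, {"company_id": 1, "broken_links_count": 2, "has_hreflang": 1})
rows = conn.execute("SELECT * FROM company_website_analysis").fetchall()
```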
c8eb0829d9 auto-claude: 4.2 - Add CLI argument parsing, progress logging, and error handling
Enhanced scripts/seo_audit.py with comprehensive CLI improvements:

CLI Arguments:
- --company-id: Audit single company by ID
- --company-ids: Audit multiple companies (comma-separated)
- --batch: Audit range of companies (e.g., 1-10)
- --all: Audit all companies
- --dry-run: Print results without database writes
- --verbose/-v: Debug output
- --quiet/-q: Suppress progress output
- --json: JSON output for scripting
- --database-url: Override DATABASE_URL env var

Progress Logging:
- ETA calculation based on average time per company
- Progress counter [X/Y] for each company
- Status indicators (SUCCESS/SKIPPED/FAILED/TIMEOUT)

Summary Reporting:
- Detailed breakdown by result category
- Edge case counts (no_website, unavailable, timeout, ssl_errors)
- PageSpeed API quota tracking (start/used/remaining)
- Visual score distribution with bar charts
- Failed audits listing with error messages

Error Handling:
- Proper exit codes (0-5) for different scenarios
- Categorization of errors (timeout, connection, SSL, unavailable)
- Database connection error handling
- Quota exceeded handling
- Batch argument validation with helpful error messages

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-08 03:11:22 +01:00
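
A reduced sketch of the argument parsing and batch validation listed in c8eb0829d9. The flag names come from the commit message; the mutually exclusive grouping, the error wording, and the function names are assumptions.

```python
import argparse


def parse_batch(value: str):
    """Parse 'START-END' into an inclusive (start, end) tuple."""
    try:
        start_text, end_text = value.split("-", 1)
        start, end = int(start_text), int(end_text)
    except ValueError:
        raise argparse.ArgumentTypeError(
            f"invalid batch {value!r}, expected START-END such as 1-10"
        )
    if start < 1 or end < start:
        raise argparse.ArgumentTypeError(
            f"invalid batch {value!r}, need 1 <= START <= END"
        )
    return start, end


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="SEO audit (sketch)")
    # Assumption: exactly one selection mode must be given.
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument("--company-id", type=int)
    group.add_argument("--batch", type=parse_batch)
    group.add_argument("--all", action="store_true")
    parser.add_argument("--dry-run", action="store_true")
    return parser


args = build_parser().parse_args(["--batch", "3-7", "--dry-run"])
```

Using a `type=` callable lets argparse report a malformed range as a normal usage error instead of a traceback.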
2bebb46f02 auto-claude: 4.1 - Create scripts/seo_audit.py with SEOAuditor class
Implements SEOAuditor class following social_media_audit.py pattern:
- __init__: Initialize database connection and analysis components
- get_companies: Fetch companies by ID, batch, or all
- audit_company: Full SEO audit (PageSpeed, on-page, technical)
- save_audit_result: Upsert to company_website_analysis table
- run_audit: Orchestration with progress logging and summary

Features:
- Integrates GooglePageSpeedClient for Lighthouse scores
- Uses OnPageSEOAnalyzer for meta tags, headings, images, links
- Uses TechnicalSEOChecker for robots.txt, sitemap, canonical
- Calculates overall SEO score from weighted components
- CLI support: --company-id, --batch, --all, --dry-run, --json

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-08 02:16:36 +01:00
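
The weighted overall score mentioned in 2bebb46f02 could be computed along these lines. The weights, the component names, and the handling of missing components are all assumptions; the commit message does not specify them.

```python
# Hypothetical weights; the real values are not given in the commit message.
WEIGHTS = {"pagespeed": 40, "on_page": 35, "technical": 25}


def overall_score(components: dict, weights: dict = WEIGHTS):
    """Weighted average of the available component scores (0-100), or None."""
    available = {
        name: score
        for name, score in components.items()
        if name in weights and score is not None
    }
    total_weight = sum(weights[name] for name in available)
    if total_weight == 0:
        return None
    weighted_sum = sum(score * weights[name] for name, score in available.items())
    # Renormalise by the weights actually used so a missing component
    # does not drag the score toward zero.
    return round(weighted_sum / total_weight)


full = overall_score({"pagespeed": 100, "on_page": 100, "technical": 100})
partial = overall_score({"pagespeed": None, "on_page": 80, "technical": 60})
```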
81fc27dfa9 auto-claude: 3.2 - Add TechnicalSEOChecker class to scripts/seo_analyzer.py
Adds TechnicalSEOChecker class that performs technical SEO audits:
- robots.txt: checks existence, parses directives (Disallow, Allow, Sitemap),
  detects whether it blocks Googlebot or all bots
- sitemap.xml: checks existence, validates XML, counts URLs, detects sitemap index
- Canonical URLs: detects canonical tag, checks if self-referencing or cross-domain
- Noindex tags: checks meta robots and X-Robots-Tag HTTP header
- Redirect chains: follows up to 10 redirects, detects loops, HTTPS upgrades,
  www redirects, and mixed content issues

Includes:
- 8 dataclasses for structured results (RobotsTxtResult, SitemapResult, etc.)
- TechnicalSEOResult container for complete analysis
- check_technical_seo() convenience function
- CLI support: --technical/-t flag for technical-only analysis
- --all/-a flag for combined on-page and technical analysis
- --json/-j flag for JSON output

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-08 02:12:47 +01:00
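
The robots.txt blocking check from 81fc27dfa9 can be approximated with the standard library. This sketch uses `urllib.robotparser` rather than whatever parsing TechnicalSEOChecker actually performs, and the helper name `blocks_agent` is hypothetical.

```python
from urllib.robotparser import RobotFileParser


def blocks_agent(robots_txt: str, user_agent: str, path: str = "/") -> bool:
    """Return True if the given robots.txt text disallows `path` for `user_agent`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, path)


# Illustrative file: Googlebot is blocked, every other bot is allowed.
ROBOTS = """\
User-agent: Googlebot
Disallow: /

User-agent: *
Allow: /
"""

googlebot_blocked = blocks_agent(ROBOTS, "Googlebot")
others_blocked = blocks_agent(ROBOTS, "ExampleBot")
```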
0c257f5e48 auto-claude: 3.1 - Create scripts/seo_analyzer.py with OnPageSEOAnalyzer
Add comprehensive on-page SEO analyzer that extracts:
- Meta tags (title, description, keywords, robots, viewport, canonical)
- Open Graph metadata (og:title, og:description, og:image, etc.)
- Twitter Card metadata (card type, site, creator, etc.)
- Heading structure (h1-h6 counts, hierarchy validation)
- Image alt text analysis (missing, empty, quality issues)
- Link analysis (internal/external/nofollow/broken)
- Structured data detection (JSON-LD, Microdata, RDFa)
- Word count and document attributes (DOCTYPE, lang)

Uses dataclasses for structured results following pagespeed_client.py pattern.
Includes CLI interface for testing individual URLs.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-08 02:07:10 +01:00
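
The heading-structure counting in 0c257f5e48 can be sketched with the standard `html.parser` module. The real OnPageSEOAnalyzer may use a different HTML library; the class and function names below are hypothetical.

```python
from collections import Counter
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingCounter(HTMLParser):
    """Count opening h1-h6 tags while parsing a document."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names before calling this hook.
        if tag in HEADING_TAGS:
            self.counts[tag] += 1


def count_headings(html: str) -> dict:
    parser = HeadingCounter()
    parser.feed(html)
    return dict(parser.counts)


counts = count_headings("<H1>Title</H1><h2>A</h2><p>text</p><h2>B</h2>")
```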
9f58e3f8e1 auto-claude: 2.1 - Create scripts/pagespeed_client.py with GooglePageSpeedClient
Implements Google PageSpeed Insights API client with:
- GooglePageSpeedClient class for making API calls
- Exponential backoff retry logic (3 retries, 1-60s backoff)
- RateLimiter class with daily quota tracking (25k req/day)
- Quota persistence to .pagespeed_quota.json
- Support for mobile/desktop strategies
- Core Web Vitals extraction (LCP, FCP, CLS, TTFB)
- Lighthouse audit scores (performance, accessibility, SEO, best-practices)
- Structured dataclasses for results (PageSpeedResult, PageSpeedScore, CoreWebVitals)
- Custom exceptions (QuotaExceededError, RateLimitError, PageSpeedAPIError)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-08 02:00:37 +01:00
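
The retry behaviour listed in 9f58e3f8e1 (3 retries, 1-60s backoff) can be sketched as below. The doubling schedule and the function names are assumptions; the commit message gives only the retry count and the delay range.

```python
import time


def backoff_delays(retries=3, base=1.0, cap=60.0):
    """Delay before each retry: base * 2**attempt, capped at `cap` seconds."""
    return [min(cap, base * 2 ** attempt) for attempt in range(retries)]


def call_with_retry(func, retries=3, base=1.0, cap=60.0,
                    retry_on=(Exception,), sleep=time.sleep):
    """Call func(); on a retryable error wait and try again, up to `retries` times."""
    delays = backoff_delays(retries, base, cap)
    for attempt in range(retries + 1):
        try:
            return func()
        except retry_on:
            if attempt == retries:
                raise  # retries exhausted: re-raise the last error
            sleep(delays[attempt])


# Demo with a recording fake sleep so nothing actually waits.
calls = {"n": 0}


def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


slept = []
result = call_with_retry(flaky, sleep=slept.append)
```

Passing `sleep` as a parameter keeps the retry loop testable without real delays.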
02fc67bf40 Initial commit 2026-01-01 14:01:49 +01:00