# SEO Audit Flow

**Document Version:** 1.0
**Last Updated:** 2026-01-10
**Status:** Production LIVE
**Flow Type:** Admin-Triggered Website SEO Analysis

---

## Overview

This document describes the **complete SEO audit flow** for the Norda Biznes Partner application, covering:

- **Admin Dashboard** (`/admin/seo` route)
- **Single Company Audit** (admin-triggered via UI/API)
- **Batch Audit** (script-based for all companies)
- **PageSpeed Insights API Integration** for performance metrics
- **On-Page SEO Analysis** (meta tags, headings, images, links)
- **Technical SEO Checks** (robots.txt, sitemap, canonical URLs)
- **Database Storage** in `company_website_analysis` table
- **Results Display** on admin dashboard and company profiles

**Key Technology:**
- **PageSpeed API:** Google PageSpeed Insights (Lighthouse)
- **Analysis Engine:** SEOAuditor (scripts/seo_audit.py)
- **On-Page Analyzer:** OnPageSEOAnalyzer (scripts/seo_analyzer.py)
- **Technical Checker:** TechnicalSEOChecker (scripts/seo_analyzer.py)
- **Database:** PostgreSQL (company_website_analysis table)

**Key Features:**
- Full website analysis (PageSpeed + On-Page + Technical SEO)
- Admin dashboard with sortable table and score distribution
- Color-coded score badges (green 90-100, yellow 50-89, red 0-49)
- Filtering by category, score range, and company name
- Single company audit trigger from admin UI
- Batch audit script for all companies (`scripts/seo_audit.py`)
- API quota tracking (25,000 requests/day free tier)

**API Costs & Performance:**
- **API:** Google PageSpeed Insights (Free tier: 25,000 queries/day)
- **Pricing:** Free for up to 25,000 requests/day, $5/1000 queries after
- **Typical Audit Time:** 5-15 seconds per company
- **Actual Cost:** $0.00 (free tier, 80 companies = 80 audits << 25,000 limit)

---

## 1. High-Level SEO Audit Flow

### 1.1 Complete SEO Audit Flow Diagram

```mermaid
flowchart TD
    Admin[Admin User] -->|1. Navigate to /admin/seo| Browser[Browser]
    Browser -->|2. GET /admin/seo| Flask[Flask App<br/>app.py]
    Flask -->|3. Check permissions| AuthCheck{Is Admin?}

    AuthCheck -->|No| Deny[Redirect to dashboard]
    AuthCheck -->|Yes| Dashboard[Admin SEO Dashboard<br/>admin_seo_dashboard.html]

    Dashboard -->|4. Render dashboard| Browser
    Browser -->|5. Display stats & table| AdminUI[Admin UI]

    AdminUI -->|6. Click 'Uruchom audyt'<br/>for single company| TriggerSingle[Trigger Single Audit]
    TriggerSingle -->|7. POST /api/seo/audit| Flask

    AdminUI -->|8. Click 'Uruchom audyt'<br/>for batch| TriggerBatch[Trigger Batch Audit]
    TriggerBatch -->|9. Run script| Script[scripts/seo_audit.py]

    Flask -->|10. Verify admin| PermCheck{Is Admin?}
    PermCheck -->|No| Error403[403 Error]
    PermCheck -->|Yes| CreateAuditor[Create SEOAuditor]

    CreateAuditor -->|11. Initialize| Auditor[SEOAuditor<br/>seo_audit.py]
    Auditor -->|12. Fetch page| Website[Company Website]

    Website -->|13. HTML + HTTP status| Auditor
    Auditor -->|14. Analyze HTML| OnPageAnalyzer[OnPageSEOAnalyzer]

    OnPageAnalyzer -->|15. Extract meta tags<br/>headings, images| OnPageResult[On-Page Results]

    Auditor -->|16. Technical checks| TechnicalChecker[TechnicalSEOChecker]
    TechnicalChecker -->|17. Check robots.txt<br/>sitemap, canonical| TechResult[Technical Results]

    Auditor -->|18. Check quota| QuotaCheck{Quota > 0?}
    QuotaCheck -->|No| SkipPageSpeed[Skip PageSpeed]
    QuotaCheck -->|Yes| PageSpeedClient[GooglePageSpeedClient]

    PageSpeedClient -->|19. API call| PageSpeedAPI[Google PageSpeed Insights<br/>Lighthouse]
    PageSpeedAPI -->|20. Scores + CWV| PageSpeedResult[PageSpeed Results]

    PageSpeedResult -->|21. Combine results| Auditor
    OnPageResult -->|22. Combine results| Auditor
    TechResult -->|23. Combine results| Auditor
    SkipPageSpeed -->|24. Combine results| Auditor

    Auditor -->|25. Calculate overall score| ScoreCalc[Score Calculator]
    ScoreCalc -->|26. Overall SEO score| AuditResult[Complete Audit Result]

    AuditResult -->|27. Save to DB| DB[(company_website_analysis)]
    DB -->|28. Saved| Auditor

    Auditor -->|29. Return results| Flask
    Flask -->|30. JSON response| Browser
    Browser -->|31. Reload dashboard| AdminUI

    Script -->|32. Batch process| Auditor
    Script -->|33. For each company| Auditor

    style Auditor fill:#4CAF50
    style PageSpeedClient fill:#2196F3
    style OnPageAnalyzer fill:#FF9800
    style TechnicalChecker fill:#9C27B0
    style DB fill:#E91E63
```

### 1.2 Admin Dashboard View Flow

```mermaid
sequenceDiagram
    participant Admin as Admin User
    participant Browser as Browser
    participant Flask as Flask App
    participant DB as PostgreSQL

    Admin->>Browser: Navigate to /admin/seo
    Browser->>Flask: GET /admin/seo
    Flask->>Flask: Check is_admin permission

    alt Not Admin
        Flask-->>Browser: Redirect to dashboard
    else Is Admin
        Flask->>DB: Query companies + SEO analysis
        DB-->>Flask: Companies with scores
        Flask->>Flask: Calculate stats (avg, distribution)
        Flask-->>Browser: Render admin_seo_dashboard.html
        Browser-->>Admin: Display dashboard with stats & table
    end

    Admin->>Browser: Click filter/sort
    Browser->>Browser: Client-side filtering (JavaScript)
    Browser-->>Admin: Updated table view

    Admin->>Browser: Click "Uruchom audyt" for company
    Browser->>Browser: Show confirmation modal
    Admin->>Browser: Confirm audit
    Browser->>Flask: POST /api/seo/audit {slug: "company-slug"}

    Flask->>Flask: Verify admin + rate limit (10/hour)
    Flask->>DB: Find company by slug
    DB-->>Flask: Company record

    Note over Flask,DB: SEO audit process (see next diagram)

    Flask-->>Browser: JSON {success: true, scores: {...}}
    Browser->>Browser: Show success modal
    Browser->>Browser: Reload page after 1.5s
    Browser->>Flask: GET /admin/seo (refresh)
    Flask->>DB: Query companies + updated scores
    DB-->>Flask: Companies with new scores
    Flask-->>Browser: Updated dashboard
    Browser-->>Admin: Display updated scores
```

---

## 2. SEO Audit Process Details

### 2.1 Single Company Audit Flow

```mermaid
sequenceDiagram
    participant Flask as Flask App
    participant Auditor as SEOAuditor
    participant Web as Company Website
    participant OnPage as OnPageSEOAnalyzer
    participant Tech as TechnicalSEOChecker
    participant PageSpeed as GooglePageSpeedClient
    participant API as PageSpeed Insights API
    participant DB as PostgreSQL

    Flask->>Auditor: audit_company(company_dict)

    Note over Auditor: 1. FETCH PAGE
    Auditor->>Web: HTTP GET website_url
    Web-->>Auditor: HTML content + status (200/404/500)

    Note over Auditor,OnPage: 2. ON-PAGE ANALYSIS
    Auditor->>OnPage: analyze_html(html, base_url)
    OnPage->>OnPage: Extract meta tags (title, description, keywords)
    OnPage->>OnPage: Count headings (H1, H2, H3)
    OnPage->>OnPage: Analyze images (total, alt text)
    OnPage->>OnPage: Count links (internal, external)
    OnPage->>OnPage: Detect structured data (JSON-LD, Schema.org)
    OnPage->>OnPage: Extract Open Graph tags
    OnPage->>OnPage: Extract Twitter Card tags
    OnPage->>OnPage: Count words on homepage
    OnPage-->>Auditor: OnPageSEOResult

    Note over Auditor,Tech: 3. TECHNICAL CHECKS
    Auditor->>Tech: check_url(final_url)
    Tech->>Web: GET /robots.txt
    Web-->>Tech: robots.txt content or 404
    Tech->>Tech: Parse robots.txt (exists, blocks Googlebot)

    Tech->>Web: GET /sitemap.xml
    Web-->>Tech: sitemap.xml content or 404
    Tech->>Tech: Validate XML sitemap

    Tech->>Tech: Check meta robots tags
    Tech->>Tech: Check canonical URL
    Tech->>Tech: Detect redirect chains
    Tech-->>Auditor: TechnicalSEOResult

    Note over Auditor,API: 4. PAGESPEED INSIGHTS
    Auditor->>PageSpeed: Check remaining quota
    PageSpeed-->>Auditor: quota_remaining (e.g., 24,950/25,000)

    alt Quota Available
        Auditor->>PageSpeed: analyze_url(url, strategy=MOBILE)
        PageSpeed->>API: GET runPagespeed?url=...&strategy=mobile
        API->>API: Run Lighthouse audit (5-15 seconds)
        API-->>PageSpeed: Lighthouse results JSON
        PageSpeed->>PageSpeed: Extract scores (0-100)
        PageSpeed->>PageSpeed: Extract Core Web Vitals (LCP, FID, CLS)
        PageSpeed->>PageSpeed: Extract audits (failed checks)
        PageSpeed-->>Auditor: PageSpeedResult
    else No Quota
        Auditor->>Auditor: Skip PageSpeed (save quota)
    end

    Note over Auditor: 5. CALCULATE SCORES
    Auditor->>Auditor: _calculate_onpage_score(onpage)
    Auditor->>Auditor: _calculate_technical_score(technical)
    Auditor->>Auditor: _calculate_overall_score(all_results)

    Note over Auditor: Score weights:
    Note over Auditor: PageSpeed SEO: 3x
    Note over Auditor: PageSpeed Perf: 2x
    Note over Auditor: On-Page: 2x
    Note over Auditor: Technical: 2x

    Note over Auditor,DB: 6. SAVE TO DATABASE
    Auditor->>DB: UPSERT company_website_analysis
    Note over DB: ON CONFLICT (company_id) DO UPDATE
    DB-->>Auditor: Saved successfully

    Auditor-->>Flask: Complete audit result dict
```

### 2.2 Batch Audit Script Flow

```mermaid
flowchart TD
    Start[Start: python seo_audit.py --all] --> Init[Initialize SEOAuditor]
    Init --> GetCompanies[Get companies from DB<br/>ORDER BY id]

    GetCompanies --> Loop{For each company}
    Loop -->|Next company| CheckWebsite{Has website?}

    CheckWebsite -->|No| Skip[Skip: No website]
    Skip --> Loop

    CheckWebsite -->|Yes| CheckQuota{Quota > 0?}
    CheckQuota -->|No| QuotaWarn[Warn: Quota exceeded<br/>Skip PageSpeed]
    QuotaWarn --> AuditPartial[Audit without PageSpeed]

    CheckQuota -->|Yes| AuditFull[Full Audit<br/>PageSpeed + OnPage + Technical]

    AuditPartial --> SaveResult[Save to database]
    AuditFull --> SaveResult

    SaveResult --> UpdateStats[Update summary stats]
    UpdateStats --> Sleep[Sleep 1s<br/>Rate limiting]
    Sleep --> Loop

    Loop -->|Done| PrintSummary[Print Summary Report]

    PrintSummary --> ShowStats[Show score distribution<br/>Failed audits<br/>Quota usage]
    ShowStats --> End[Exit with code]

    style AuditFull fill:#4CAF50
    style AuditPartial fill:#FF9800
    style QuotaWarn fill:#F44336
```

---

## 3. Score Calculation

### 3.1 Overall SEO Score Formula

The overall SEO score is a **weighted average** of four components:

```
Overall Score = (
    (PageSpeed SEO × 3) +
    (PageSpeed Performance × 2) +
    (On-Page Score × 2) +
    (Technical Score × 2)
) / Total Weight
```

Total Weight is the sum of the weights: 3 + 2 + 2 + 2 = 9 when all four components are available.

**Weights:**
- PageSpeed SEO: **3x** (most important for search rankings)
- PageSpeed Performance: **2x** (user experience)
- On-Page Score: **2x** (content optimization)
- Technical Score: **2x** (crawlability and indexability)

**Score Ranges:**
- **90-100 (Green):** Excellent SEO
- **50-89 (Yellow):** Needs improvement
- **0-49 (Red):** Poor SEO

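As a rough illustration of the formula and ranges above, the sketch below computes the weighted average and maps the result to a badge color. The function and key names are hypothetical, and renormalizing over only the components that have a score (for example when PageSpeed is skipped) is an assumption, not confirmed behavior of `_calculate_overall_score`.

```python
from typing import Optional

# Weights as documented above; the dictionary keys are illustrative names.
WEIGHTS = {
    "pagespeed_seo": 3,
    "pagespeed_performance": 2,
    "on_page": 2,
    "technical": 2,
}


def overall_score(scores: dict) -> Optional[int]:
    """Weighted average over the components that actually have a score."""
    present = {k: v for k, v in scores.items() if k in WEIGHTS and v is not None}
    if not present:
        return None
    total_weight = sum(WEIGHTS[k] for k in present)
    weighted_sum = sum(WEIGHTS[k] * v for k, v in present.items())
    return round(weighted_sum / total_weight)


def badge_color(score: int) -> str:
    """Map a 0-100 score to the documented badge colors."""
    if score >= 90:
        return "green"
    if score >= 50:
        return "yellow"
    return "red"
```

With all four components present the divisor is 9; with only the on-page and technical scores it drops to 4 under the renormalization assumption.
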
### 3.2 On-Page Score Calculation

**Starting Score:** 100 (perfect)

**Deductions:**

| Issue | Deduction | Check |
|-------|-----------|-------|
| Missing meta title | -15 | `meta_tags['title']` is empty |
| Title too short/long | -5 | Length < 30 or > 70 characters |
| Missing meta description | -10 | `meta_tags['description']` is empty |
| Description too short/long | -5 | Length < 120 or > 160 characters |
| No canonical URL | -5 | `meta_tags['canonical_url']` is empty |
| No H1 heading | -10 | `headings['h1_count']` == 0 |
| Multiple H1 headings | -5 | `headings['h1_count']` > 1 |
| Improper heading hierarchy | -5 | H3 without H2, etc. |
| >50% images missing alt | -10 | `images_without_alt / total_images` > 0.5 |
| >20% images missing alt | -5 | `images_without_alt / total_images` > 0.2 |
| No structured data | -5 | No JSON-LD or Schema.org |
| No Open Graph tags | -3 | No `og:title` |

**Example:**
```python
# Start from a perfect score
score = 100
# Missing meta description (-10)
# 1 image without alt out of 10 (-0, < 20%)
# No structured data (-5)
final_score = 100 - 10 - 5  # = 85 (Good)
```

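The deduction table above can be sketched as a single function. This is a minimal illustration, not the project's `_calculate_onpage_score`: the signature, the `hierarchy_ok` flag, the choice to apply length penalties only when a title or description exists, and treating the two image thresholds as mutually exclusive are all assumptions.

```python
def onpage_score(meta_tags: dict, headings: dict, images: dict,
                 has_structured_data: bool, has_og_title: bool) -> int:
    """Apply the documented on-page deductions to a starting score of 100."""
    score = 100

    title = meta_tags.get("title") or ""
    if not title:
        score -= 15
    elif len(title) < 30 or len(title) > 70:
        score -= 5

    description = meta_tags.get("description") or ""
    if not description:
        score -= 10
    elif len(description) < 120 or len(description) > 160:
        score -= 5

    if not meta_tags.get("canonical_url"):
        score -= 5

    h1_count = headings.get("h1_count", 0)
    if h1_count == 0:
        score -= 10
    elif h1_count > 1:
        score -= 5
    if not headings.get("hierarchy_ok", True):  # hypothetical flag
        score -= 5

    total = images.get("total_images", 0)
    if total:
        missing_ratio = images.get("images_without_alt", 0) / total
        if missing_ratio > 0.5:
            score -= 10
        elif missing_ratio > 0.2:
            score -= 5

    if not has_structured_data:
        score -= 5
    if not has_og_title:
        score -= 3
    return max(0, score)
```

Feeding in the example above (no meta description, 1 of 10 images without alt text, no structured data) gives 85.
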
### 3.3 Technical Score Calculation

**Starting Score:** 100 (perfect)

**Deductions:**

| Issue | Deduction | Check |
|-------|-----------|-------|
| No robots.txt | -10 | `robots_txt['exists']` == False |
| Robots blocks Googlebot | -20 | `robots_txt['blocks_googlebot']` == True |
| No sitemap.xml | -10 | `sitemap['exists']` == False |
| Invalid sitemap XML | -5 | `sitemap['is_valid_xml']` == False |
| >3 redirects in chain | -10 | `redirect_chain['chain_length']` > 3 |
| >1 redirect | -5 | `redirect_chain['chain_length']` > 1 |
| Redirect loop detected | -20 | `redirect_chain['has_redirect_loop']` == True |
| Not indexable | -15 | `indexability['is_indexable']` == False |
| Canonical to different domain | -10 | Points to external site |

**Example:**
```python
# Typical site
score = 100
# No robots.txt (-10)
# Has sitemap.xml (+0)
# 2 redirects in chain (-5, chain_length > 1)
# Indexable (+0)
final_score = 100 - 10 - 5  # = 85 (Good)
```

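The technical deductions can be sketched the same way. The function below is an illustration only: the signature and the `canonical_is_external` parameter are hypothetical, and it assumes the invalid-XML penalty applies only when a sitemap exists and that the two redirect thresholds are mutually exclusive.

```python
def technical_score(robots_txt: dict, sitemap: dict, redirect_chain: dict,
                    indexability: dict, canonical_is_external: bool = False) -> int:
    """Apply the documented technical deductions to a starting score of 100."""
    score = 100

    if not robots_txt.get("exists", False):
        score -= 10
    elif robots_txt.get("blocks_googlebot", False):
        score -= 20

    if not sitemap.get("exists", False):
        score -= 10
    elif not sitemap.get("is_valid_xml", True):
        score -= 5

    chain_length = redirect_chain.get("chain_length", 0)
    if chain_length > 3:
        score -= 10
    elif chain_length > 1:
        score -= 5
    if redirect_chain.get("has_redirect_loop", False):
        score -= 20

    if not indexability.get("is_indexable", True):
        score -= 15
    if canonical_is_external:
        score -= 10
    return max(0, score)
```
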
---

## 4. Database Schema

### 4.1 CompanyWebsiteAnalysis Table

The `company_website_analysis` table stores comprehensive SEO audit results.

**Location:** `database.py` (lines ~429-520)

**Key Fields:**

```sql
CREATE TABLE company_website_analysis (
    -- Identity
    id SERIAL PRIMARY KEY,
    company_id INTEGER REFERENCES companies(id) UNIQUE,
    analyzed_at TIMESTAMP DEFAULT NOW(),

    -- Basic Info
    website_url VARCHAR(500),
    final_url VARCHAR(500),                 -- After redirects
    http_status_code INTEGER,
    load_time_ms INTEGER,

    -- PageSpeed Scores (0-100)
    pagespeed_seo_score INTEGER,
    pagespeed_performance_score INTEGER,
    pagespeed_accessibility_score INTEGER,
    pagespeed_best_practices_score INTEGER,
    pagespeed_audits JSONB,                 -- Failed Lighthouse audits

    -- On-Page SEO
    meta_title VARCHAR(500),
    meta_description TEXT,
    meta_keywords TEXT,
    h1_count INTEGER,
    h2_count INTEGER,
    h3_count INTEGER,
    h1_text VARCHAR(500),
    total_images INTEGER,
    images_without_alt INTEGER,
    images_with_alt INTEGER,
    internal_links_count INTEGER,
    external_links_count INTEGER,
    broken_links_count INTEGER,
    has_structured_data BOOLEAN,
    structured_data_types TEXT[],           -- ['Organization', 'LocalBusiness']
    structured_data_json JSONB,

    -- Technical SEO
    has_canonical BOOLEAN,
    canonical_url VARCHAR(500),
    is_indexable BOOLEAN,
    noindex_reason VARCHAR(100),
    has_sitemap BOOLEAN,
    has_robots_txt BOOLEAN,
    viewport_configured BOOLEAN,
    is_mobile_friendly BOOLEAN,

    -- Core Web Vitals
    largest_contentful_paint_ms INTEGER,    -- LCP (Good: <2500ms)
    first_input_delay_ms INTEGER,           -- FID (Good: <100ms)
    cumulative_layout_shift NUMERIC(4,2),   -- CLS (Good: <0.1)

    -- Open Graph
    has_og_tags BOOLEAN,
    og_title VARCHAR(500),
    og_description TEXT,
    og_image VARCHAR(500),
    has_twitter_cards BOOLEAN,

    -- Language & International
    html_lang VARCHAR(10),
    has_hreflang BOOLEAN,

    -- Word Count
    word_count_homepage INTEGER,

    -- Audit Metadata
    seo_audit_version VARCHAR(20),
    seo_audited_at TIMESTAMP,
    seo_audit_errors TEXT[],
    seo_overall_score INTEGER,
    seo_health_score INTEGER,
    seo_issues JSONB
);

-- Indexes
CREATE INDEX idx_cwa_company_id ON company_website_analysis(company_id);
CREATE INDEX idx_cwa_analyzed_at ON company_website_analysis(analyzed_at);
CREATE INDEX idx_cwa_seo_audited_at ON company_website_analysis(seo_audited_at);
```

### 4.2 Upsert Pattern

The audit uses **ON CONFLICT DO UPDATE** for idempotent saves:

```sql
INSERT INTO company_website_analysis (
    company_id, analyzed_at, website_url, ...
) VALUES (
    :company_id, :analyzed_at, :website_url, ...
)
ON CONFLICT (company_id) DO UPDATE SET
    analyzed_at = EXCLUDED.analyzed_at,
    website_url = EXCLUDED.website_url,
    pagespeed_seo_score = EXCLUDED.pagespeed_seo_score,
    -- ... all fields updated
    seo_audited_at = EXCLUDED.seo_audited_at;
```

**Benefits:**
- Safe to run multiple times (idempotent)
- Always keeps latest audit results
- No duplicate records
- Atomic operation (transaction-safe)

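To show the idempotency in action, here is a small self-contained sketch. It swaps PostgreSQL for Python's built-in `sqlite3` (whose `ON CONFLICT ... DO UPDATE` clause follows the same idea, available in SQLite 3.24 and later) and uses a cut-down, three-column version of the table; `save_audit` is a hypothetical helper, not the project's code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE company_website_analysis (
        id INTEGER PRIMARY KEY,
        company_id INTEGER UNIQUE,
        seo_overall_score INTEGER,
        seo_audited_at TEXT
    )
""")

UPSERT = """
    INSERT INTO company_website_analysis (company_id, seo_overall_score, seo_audited_at)
    VALUES (:company_id, :seo_overall_score, :seo_audited_at)
    ON CONFLICT (company_id) DO UPDATE SET
        seo_overall_score = excluded.seo_overall_score,
        seo_audited_at = excluded.seo_audited_at
"""


def save_audit(conn, company_id, score, audited_at):
    """Insert the audit row, or overwrite the existing row for this company."""
    with conn:  # commits on success, rolls back on error
        conn.execute(UPSERT, {
            "company_id": company_id,
            "seo_overall_score": score,
            "seo_audited_at": audited_at,
        })


# Two audits of the same company leave exactly one row with the latest score.
save_audit(conn, 26, 80, "2026-01-10T10:30:00")
save_audit(conn, 26, 88, "2026-01-10T10:35:00")
rows = conn.execute(
    "SELECT company_id, seo_overall_score FROM company_website_analysis"
).fetchall()
print(rows)  # [(26, 88)]
```
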
---

## 5. API Endpoints

### 5.1 Admin SEO Dashboard

**Route:** `GET /admin/seo`
**Authentication:** Required (Admin only)
**Location:** `app.py` lines 4093-4192

**Purpose:** Display SEO metrics dashboard for all companies

**Query Parameters:**
- `company` (optional): Company slug to highlight/filter

**Response:** HTML (admin_seo_dashboard.html template)

**Dashboard Features:**
- Summary stats (score distribution, average, not audited count)
- Sortable table by name, category, scores, date
- Filters by category, score range, company name
- Color-coded score badges
- Last audit date with staleness indicator
- Actions: view profile, trigger single audit

**Access Control:**
```python
if not current_user.is_admin:
    # Flash message: "You do not have permission to access this page."
    flash('Brak uprawnień do tej strony.', 'error')
    return redirect(url_for('dashboard'))
```

### 5.2 Get SEO Audit Results (Read)

**Route:** `GET /api/seo/audit`
**Authentication:** Not required (public API)
**Location:** `app.py` lines 3870-3914

**Purpose:** Retrieve existing SEO audit results for a company

**Query Parameters:**
- `company_id` (integer): Company ID
- `slug` (string): Company slug

**Response:**
```json
{
  "company_id": 26,
  "company_name": "PIXLAB Sp. z o.o.",
  "company_slug": "pixlab-sp-z-o-o",
  "website": "https://pixlab.pl",
  "pagespeed": {
    "seo_score": 92,
    "performance_score": 78,
    "accessibility_score": 95,
    "best_practices_score": 88,
    "audits": {...}
  },
  "on_page": {
    "meta_title": "PIXLAB - Oprogramowanie na miarę",
    "meta_description": "Tworzymy dedykowane oprogramowanie...",
    "h1_count": 1,
    "total_images": 12,
    "images_without_alt": 0,
    "has_structured_data": true
  },
  "technical": {
    "has_robots_txt": true,
    "has_sitemap": true,
    "is_indexable": true,
    "is_mobile_friendly": true
  },
  "overall_score": 88,
  "audited_at": "2026-01-10T10:30:00"
}
```

### 5.3 Trigger SEO Audit (Write)

**Route:** `POST /api/seo/audit`
**Authentication:** Required (Admin only)
**Rate Limit:** 10 requests per hour per user
**Location:** `app.py` lines 3943-4086

**Purpose:** Trigger a new SEO audit for a company

**Request Body** (`company_id` or `slug`; at least one is required):
```json
{
  "company_id": 26,
  "slug": "pixlab-sp-z-o-o"
}
```

**Response (Success):**
```json
{
  "success": true,
  "message": "Audyt SEO dla firmy \"PIXLAB Sp. z o.o.\" został zakończony pomyślnie.",
  "audit_version": "1.0.0",
  "triggered_by": "admin@nordabiznes.pl",
  "triggered_at": "2026-01-10T10:35:00",
  "company_id": 26,
  "company_name": "PIXLAB Sp. z o.o.",
  "pagespeed": {...},
  "on_page": {...},
  "technical": {...},
  "overall_score": 88
}
```

**Response (Error - No Website):**
```json
{
  "success": false,
  "error": "Firma \"PIXLAB Sp. z o.o.\" nie ma zdefiniowanej strony internetowej.",
  "company_id": 26,
  "company_name": "PIXLAB Sp. z o.o."
}
```

**Response (Error - Quota Exceeded):**
```json
{
  "success": false,
  "error": "PageSpeed API quota exceeded. Try again tomorrow.",
  "company_id": 26
}
```

**Access Control:**
```python
if not current_user.is_admin:
    return jsonify({
        'success': False,
        # "No permission. Only an administrator can run SEO audits."
        'error': 'Brak uprawnień. Tylko administrator może uruchamiać audyty SEO.'
    }), 403
```

**Rate Limiting:**
```python
@limiter.limit("10 per hour")
```

---

## 6. PageSpeed Insights API Integration

### 6.1 API Configuration

**Service File:** `scripts/pagespeed_client.py`

**Endpoint:** `https://www.googleapis.com/pagespeedonline/v5/runPagespeed`

**Authentication:** API Key (GOOGLE_PAGESPEED_API_KEY)

**Free Tier:**
- 25,000 queries per day
- $5 per 1,000 queries after free tier

**API Key:**
- **Name in Google Cloud:** "Page SPEED SEO Audit v2"
- **Project:** NORDABIZNES (gen-lang-client-0540794446)
- **Storage:** `.env` file (GOOGLE_PAGESPEED_API_KEY)

### 6.2 API Request

```python
import os

import requests

# Assumes the value from the .env file has been loaded into the environment
GOOGLE_PAGESPEED_API_KEY = os.environ['GOOGLE_PAGESPEED_API_KEY']

params = {
    'url': 'https://example.com',
    'key': GOOGLE_PAGESPEED_API_KEY,
    'strategy': 'mobile',  # or 'desktop'
    # A list value is sent as repeated category=... query parameters
    'category': ['performance', 'accessibility', 'best-practices', 'seo']
}

response = requests.get(
    'https://www.googleapis.com/pagespeedonline/v5/runPagespeed',
    params=params,
    timeout=30
)
```

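For readers who want to see the query string this request produces, the standard-library sketch below builds the same URL without sending anything. `build_pagespeed_url` and the `YOUR_API_KEY` placeholder are illustrative names, not part of `scripts/pagespeed_client.py`.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

PAGESPEED_ENDPOINT = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed"


def build_pagespeed_url(target_url: str, api_key: str, strategy: str = "mobile") -> str:
    """Return the full request URL for one PageSpeed Insights call."""
    params = {
        "url": target_url,
        "key": api_key,
        "strategy": strategy,
        "category": ["performance", "accessibility", "best-practices", "seo"],
    }
    # doseq=True expands the list into repeated category=... pairs
    return f"{PAGESPEED_ENDPOINT}?{urlencode(params, doseq=True)}"


url = build_pagespeed_url("https://example.com", "YOUR_API_KEY")
query = parse_qs(urlsplit(url).query)
print(query["category"])
```

Each requested category appears as its own `category=` parameter, so the parsed query holds all four values.
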
### 6.3 API Response Structure

```json
{
  "lighthouseResult": {
    "categories": {
      "performance": {"score": 0.78},
      "accessibility": {"score": 0.95},
      "best-practices": {"score": 0.88},
      "seo": {"score": 0.92}
    },
    "audits": {
      "largest-contentful-paint": {"numericValue": 2300},
      "first-input-delay": {"numericValue": 85},
      "cumulative-layout-shift": {"numericValue": 0.05},
      "meta-description": {"score": 1.0},
      "robots-txt": {"score": 1.0},
      "is-crawlable": {"score": 1.0}
    }
  },
  "loadingExperience": {
    "metrics": {
      "LARGEST_CONTENTFUL_PAINT_MS": {"category": "FAST"},
      "FIRST_INPUT_DELAY_MS": {"category": "FAST"},
      "CUMULATIVE_LAYOUT_SHIFT_SCORE": {"category": "FAST"}
    }
  }
}
```

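A minimal sketch of how a response shaped like the sample above could be reduced to 0-100 scores and Core Web Vitals values. `parse_pagespeed` and the output keys are hypothetical and only mirror the sample; the real `GooglePageSpeedClient` may extract more fields or name them differently.

```python
def parse_pagespeed(data: dict) -> dict:
    """Extract category scores (0-100) and selected metrics from the response."""
    lighthouse = data.get("lighthouseResult", {})
    categories = lighthouse.get("categories", {})
    audits = lighthouse.get("audits", {})

    def category_score(name):
        # Lighthouse reports category scores as 0.0-1.0 fractions
        raw = categories.get(name, {}).get("score")
        return None if raw is None else round(raw * 100)

    def audit_value(name):
        return audits.get(name, {}).get("numericValue")

    return {
        "seo_score": category_score("seo"),
        "performance_score": category_score("performance"),
        "accessibility_score": category_score("accessibility"),
        "best_practices_score": category_score("best-practices"),
        "lcp_ms": audit_value("largest-contentful-paint"),
        "cls": audit_value("cumulative-layout-shift"),
    }


sample = {
    "lighthouseResult": {
        "categories": {
            "performance": {"score": 0.78},
            "seo": {"score": 0.92},
        },
        "audits": {
            "largest-contentful-paint": {"numericValue": 2300},
            "cumulative-layout-shift": {"numericValue": 0.05},
        },
    }
}
result = parse_pagespeed(sample)
```

Missing categories or audits come back as `None` rather than raising, which fits the partial-result handling described for quota and fetch failures.
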
### 6.4 Quota Management

**Quota Tracking:**
```python
class GooglePageSpeedClient:
    def __init__(self):
        self.daily_quota = 25000
        self.used_today = 0  # Reset daily at midnight

    def get_remaining_quota(self) -> int:
        """Returns remaining API quota for today."""
        return max(0, self.daily_quota - self.used_today)

    def analyze_url(self, url: str) -> PageSpeedResult:
        if self.get_remaining_quota() <= 0:
            raise QuotaExceededError("Daily quota exceeded")

        # Make API call
        response = self._call_api(url)
        self.used_today += 1

        return self._parse_response(response)
```

**Quota Exceeded Handling:**
1. Check quota before audit: `if quota > 0`
2. If exceeded, skip PageSpeed but continue on-page/technical
3. Log warning: "PageSpeed quota exceeded, skipping"
4. Return partial audit result (no PageSpeed scores)

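The excerpt above notes a daily reset but does not show it. One way this could work is sketched below, under the assumption that the counter lives in memory and rolls over when the calendar date changes. `DailyQuota` and its injectable `today` callable are hypothetical, and the sketch does not persist usage across process restarts.

```python
from datetime import date


class DailyQuota:
    """In-memory request counter that resets when the date changes."""

    def __init__(self, daily_limit: int = 25000, today=date.today):
        self.daily_limit = daily_limit
        self._today = today  # injectable clock, useful for testing
        self._day = today()
        self.used_today = 0

    def _roll_over(self) -> None:
        current = self._today()
        if current != self._day:
            self._day = current
            self.used_today = 0

    def remaining(self) -> int:
        self._roll_over()
        return max(0, self.daily_limit - self.used_today)

    def consume(self) -> bool:
        """Reserve one request; return False when the quota is exhausted."""
        if self.remaining() <= 0:
            return False
        self.used_today += 1
        return True
```
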
---

## 7. SEO Audit Script Usage

### 7.1 Command Line Interface

**Script Location:** `scripts/seo_audit.py`

**Basic Usage:**
```bash
# Audit single company by ID
python seo_audit.py --company-id 26

# Audit single company by slug
python seo_audit.py --company-slug pixlab-sp-z-o-o

# Audit batch of companies (rows 1-10)
python seo_audit.py --batch 1-10

# Audit all companies
python seo_audit.py --all

# Dry run (no database writes)
python seo_audit.py --company-id 26 --dry-run

# Export results to JSON
python seo_audit.py --all --json > seo_report.json
```

**Options:**
- `--company-id ID`: Audit single company by ID
- `--company-slug SLUG`: Audit single company by slug
- `--company-ids IDS`: Audit multiple companies (comma-separated: 1,5,10)
- `--batch RANGE`: Audit batch by row offset (e.g., 1-10)
- `--all`: Audit all companies
- `--dry-run`: Print results without saving to database
- `--verbose, -v`: Enable verbose/debug output
- `--quiet, -q`: Suppress progress output (only summary)
- `--json`: Output results as JSON
- `--database-url URL`: Override DATABASE_URL env var

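To make the option semantics concrete, here is a hedged `argparse` sketch covering a subset of the options listed above. It is not the parser from `scripts/seo_audit.py`: `batch_range` and `build_parser` are hypothetical names, and treating the target options as mutually exclusive is an assumption.

```python
import argparse


def batch_range(value: str) -> tuple:
    """Parse a RANGE such as '1-10' into a (start, end) tuple of row offsets."""
    start_text, sep, end_text = value.partition("-")
    if not sep:
        raise argparse.ArgumentTypeError("expected START-END, e.g. 1-10")
    try:
        start, end = int(start_text), int(end_text)
    except ValueError:
        raise argparse.ArgumentTypeError("range bounds must be integers")
    if start < 1 or end < start:
        raise argparse.ArgumentTypeError("range must satisfy 1 <= START <= END")
    return start, end


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="seo_audit.py")
    # Assumption: exactly one target selector per run
    target = parser.add_mutually_exclusive_group(required=True)
    target.add_argument("--company-id", type=int)
    target.add_argument("--company-slug")
    target.add_argument("--batch", type=batch_range)
    target.add_argument("--all", action="store_true")
    parser.add_argument("--dry-run", action="store_true")
    parser.add_argument("--json", action="store_true")
    return parser


args = build_parser().parse_args(["--batch", "1-10", "--dry-run"])
print(args.batch, args.dry_run)
```

Validating the range inside the `type=` callable lets `argparse` report a malformed `--batch` value as a usage error before any audit starts.
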
### 7.2 Exit Codes

| Code | Meaning |
|------|---------|
| 0 | All audits completed successfully |
| 1 | Argument error or invalid input |
| 2 | Partial failures (some audits failed) |
| 3 | All audits failed |
| 4 | Database connection error |
| 5 | API quota exceeded |

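The table can be mirrored in code as an enum, with the success/failure codes derived from the batch counters. This is a sketch under assumptions: `ExitCode` and `summary_exit_code` are hypothetical names, and codes 1, 4, and 5 are presumed to be raised elsewhere (argument parsing, database connection, quota checks), so they are not derived here.

```python
from enum import IntEnum


class ExitCode(IntEnum):
    SUCCESS = 0
    ARGUMENT_ERROR = 1
    PARTIAL_FAILURE = 2
    ALL_FAILED = 3
    DATABASE_ERROR = 4
    QUOTA_EXCEEDED = 5


def summary_exit_code(successful: int, failed: int) -> ExitCode:
    """Derive the exit code from audit counters at the end of a run."""
    if failed == 0:
        return ExitCode.SUCCESS
    if successful == 0:
        return ExitCode.ALL_FAILED
    return ExitCode.PARTIAL_FAILURE
```

Under this mapping, a run with 72 successful and 5 failed audits would exit with code 2.
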
### 7.3 Batch Audit Output

```
============================================================
SEO AUDIT STARTING
============================================================
Companies to audit: 80
Mode: LIVE
PageSpeed API quota remaining: 24,950
============================================================

[1/80] PIXLAB Sp. z o.o. (ID: 26) - ETA: calculating...
  Fetching page: https://pixlab.pl
  Page fetched successfully (850ms)
  Running on-page SEO analysis...
  On-page analysis complete
  Running technical SEO checks...
  Technical checks complete
  Running PageSpeed Insights (quota: 24,949)...
  PageSpeed complete - SEO: 92, Perf: 78
  Saved SEO audit for company 26
  → SUCCESS: Overall SEO score: 88

[2/80] Hotel SPA Wieniawa (ID: 15) - ETA: 00:15:30
  Fetching page: https://wieniawa.pl
  ...

======================================================================
SEO AUDIT COMPLETE
======================================================================

Mode: LIVE
Duration: 00:18:45

----------------------------------------------------------------------
RESULTS BREAKDOWN
----------------------------------------------------------------------
Total companies: 80
✓ Successful: 72
✗ Failed: 5
○ Skipped: 3

  - No website: 3
  - Unavailable: 2
  - Timeout: 2
  - SSL errors: 1

----------------------------------------------------------------------
PAGESPEED API QUOTA
----------------------------------------------------------------------
Quota at start: 24,950
Quota used: 72
Quota remaining: 24,878

----------------------------------------------------------------------
SEO SCORE DISTRIBUTION
----------------------------------------------------------------------
Companies with scores: 72
Average SEO score: 76.3
Highest score: 95
Lowest score: 42

Excellent (90-100): 18 ██████████████░░░░░░░░░░░░░░░░░░
Good (70-89): 38 ████████████████████████████████
Fair (50-69): 12 ████████░░░░░░░░░░░░░░░░░░░░░░░░
Poor (<50): 4 ██░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░

----------------------------------------------------------------------
FAILED AUDITS
----------------------------------------------------------------------
🔴 Firma ABC - HTTP 404
⏱ Firma XYZ - Timeout after 30s
🔌 Firma DEF - Connection refused

======================================================================
```

### 7.4 Production Deployment

**On NORDABIZ-01 Server:**

```bash
# Connect to server
ssh maciejpi@57.128.200.27

# Navigate to application directory
cd /var/www/nordabiznes

# Activate virtual environment
source venv/bin/activate

# Run audit for all companies (production database)
cd scripts
python seo_audit.py --all

# Run audit for specific company
python seo_audit.py --company-id 26

# Dry run to test without saving
python seo_audit.py --all --dry-run

# Export results to JSON
python seo_audit.py --all --json > ~/seo_audit_$(date +%Y%m%d).json
```

**IMPORTANT - Database Connection:**
Scripts in `scripts/` must use **localhost (127.0.0.1)** for PostgreSQL:

```python
# CORRECT:
DATABASE_URL = 'postgresql://nordabiz_app:NordaBiz2025Secure@127.0.0.1:5432/nordabiz'

# WRONG (PostgreSQL doesn't accept external connections):
DATABASE_URL = 'postgresql://nordabiz_app:NordaBiz2025Secure@57.128.200.27:5432/nordabiz'
```

### 7.5 Cron Job (Automated Audits)

**Schedule weekly audit:**
```bash
# Edit crontab
crontab -e

# Add weekly audit (Sundays at 2 AM)
0 2 * * 0 cd /var/www/nordabiznes && /var/www/nordabiznes/venv/bin/python3 scripts/seo_audit.py --all >> /var/log/nordabiznes/seo_audit.log 2>&1
```

**Benefits:**
- Automatic SEO monitoring
- Detect score degradation
- Track improvements over time
- Email alerts on failures (future)

---

## 8. Security & Performance

### 8.1 Security Features

**1. Admin-Only Access:**
```python
if not current_user.is_admin:
    # Error message: "No permission"
    return jsonify({'error': 'Brak uprawnień'}), 403
```

**2. Rate Limiting:**
```python
@limiter.limit("10 per hour")
```
- Prevents API abuse
- Protects PageSpeed quota
- Per-user rate limit

**3. CSRF Protection:**
```javascript
fetch('/api/seo/audit', {
    headers: {
        'X-CSRFToken': csrfToken
    }
})
```

**4. Input Validation:**
```python
if not company_id and not slug:
    # Error message: "Provide company_id or slug"
    return jsonify({'error': 'Podaj company_id lub slug'}), 400
```

**5. Database Permissions:**
```sql
GRANT ALL ON TABLE company_website_analysis TO nordabiz_app;
GRANT USAGE, SELECT ON SEQUENCE company_website_analysis_id_seq TO nordabiz_app;
```

### 8.2 Performance Optimizations
|
||
|
||
**1. Upsert Instead of Insert:**
|
||
- ON CONFLICT DO UPDATE (idempotent)
|
||
- No duplicate records
|
||
- Safe to re-run audits
|
||
|
||
**2. Database Indexing:**
```sql
CREATE INDEX idx_cwa_company_id ON company_website_analysis(company_id);
CREATE INDEX idx_cwa_seo_audited_at ON company_website_analysis(seo_audited_at);
```

**3. Batch Processing:**
- Process companies sequentially
- Sleep 1s between audits (rate limiting)
- Skip companies without websites

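
The batch behaviour described above might look roughly like the following sketch. The function `run_batch`, the dictionary shape of a company, and the injectable `sleep` parameter are assumptions for illustration; the real loop lives in `scripts/seo_audit.py` and may differ.

```python
import time


def run_batch(companies, audit_one, delay_seconds=1.0, sleep=time.sleep):
    """Audit companies one at a time, pausing between audits.

    Companies without a website are skipped rather than audited.
    Returns (audited_ids, skipped_ids).
    """
    pending = [c for c in companies if c.get("website")]
    skipped = [c["id"] for c in companies if not c.get("website")]

    audited = []
    for index, company in enumerate(pending):
        if index > 0:
            sleep(delay_seconds)  # simple rate limiting between audits
        audit_one(company)
        audited.append(company["id"])
    return audited, skipped


# Example run with a no-op audit and a recorded (not real) sleep.
pauses = []
companies = [
    {"id": 1, "website": "https://a.example"},
    {"id": 2, "website": None},
    {"id": 3, "website": "https://c.example"},
]
audited, skipped = run_batch(companies, lambda company: None, sleep=pauses.append)
```
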
**4. API Quota Management:**
- Check quota before calling PageSpeed
- Skip PageSpeed if quota low
- Continue with on-page/technical only

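
One way to express this decision is to plan the audit steps up front. This is a sketch under assumptions: the step names, the `plan_audit_steps` helper, and the `min_quota` threshold are hypothetical, since the document does not state what counts as "low" quota.

```python
def plan_audit_steps(quota_remaining: int, min_quota: int = 1) -> list:
    """Return the audit steps to run given the remaining PageSpeed quota."""
    # On-page and technical checks do not consume PageSpeed quota.
    steps = ["on_page", "technical"]
    if quota_remaining >= min_quota:
        steps.insert(0, "pagespeed")
    return steps


full_plan = plan_audit_steps(quota_remaining=100)
reduced_plan = plan_audit_steps(quota_remaining=0)
```
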
**5. Timeout Handling:**
```python
response = requests.get(url, timeout=30)
```
- Prevents hanging requests
- Falls back gracefully

**6. Caching (Future):**
- Cache PageSpeed results for 7 days
- Skip re-audit if recent (<7 days old)
- Force refresh option for admins

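
Since caching is still planned, the following is only a sketch of how the freshness check could work. The helper `needs_audit`, its parameters, and the use of timezone-aware UTC timestamps are assumptions, not existing code.

```python
from datetime import datetime, timedelta, timezone


def needs_audit(last_audited_at, now=None, max_age_days=7, force=False):
    """Decide whether a company should be audited again.

    - force=True models the admin "force refresh" option.
    - A company that was never audited (None) is always audited.
    - Otherwise the audit is re-run once the result is max_age_days old.
    """
    if force or last_audited_at is None:
        return True
    now = now or datetime.now(timezone.utc)
    return now - last_audited_at >= timedelta(days=max_age_days)


now = datetime(2026, 1, 10, tzinfo=timezone.utc)
recent = needs_audit(now - timedelta(days=3), now=now)   # still fresh
stale = needs_audit(now - timedelta(days=8), now=now)    # older than 7 days
forced = needs_audit(now - timedelta(days=3), now=now, force=True)
```
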
---

## 9. Error Handling

### 9.1 Common Errors

**1. No Website URL:**
```json
{
  "success": false,
  "error": "Firma \"ABC\" nie ma zdefiniowanej strony internetowej.",
  "company_id": 15
}
```

**2. Website Unreachable:**
```json
{
  "success": false,
  "error": "Audyt nie powiódł się: HTTP 404, Timeout after 30s",
  "company_id": 26
}
```

**3. SSL Certificate Error:**
```
⚠ SSL error for https://example.com
Trying HTTP fallback: http://example.com
✓ Fallback successful
```

**4. PageSpeed API Quota Exceeded:**
```json
{
  "success": false,
  "error": "PageSpeed API quota exceeded. Try again tomorrow."
}
```

**5. Database Connection Error:**
```
❌ Error: Database connection failed: connection refused
Exit code: 4
```

### 9.2 Error Recovery

**1. SSL Errors → HTTP Fallback:**
```python
try:
    response = requests.get(https_url)
except requests.exceptions.SSLError:
    http_url = https_url.replace('https://', 'http://')
    response = requests.get(http_url)
```

**2. Timeout → Skip Company:**
```python
try:
    response = requests.get(url, timeout=30)
except requests.exceptions.Timeout:
    result['errors'].append('Timeout after 30s')
    # Continue to next company
```

**3. Quota Exceeded → Skip PageSpeed:**
```python
if quota_remaining > 0:
    run_pagespeed_audit()
else:
    logger.warning("Quota exceeded, skipping PageSpeed")
    # Continue with on-page/technical only
```

**4. Database Error → Rollback:**
```python
try:
    db.execute(query)
    db.commit()
except SQLAlchemyError as e:
    db.rollback()
    logger.error(f"Database error: {e}")
```

---

## 10. Monitoring & Maintenance

### 10.1 Health Checks

**Check SEO Audit Status:**
```bash
# Check latest audit dates
psql -U nordabiz_app -d nordabiz -c "
SELECT
    c.name,
    cwa.seo_audited_at,
    cwa.pagespeed_seo_score,
    cwa.seo_overall_score
FROM companies c
LEFT JOIN company_website_analysis cwa ON c.id = cwa.company_id
WHERE c.status = 'active'
ORDER BY cwa.seo_audited_at DESC NULLS LAST
LIMIT 10;
"
```

**Check Quota Usage:**
```bash
# Check how many audits today
psql -U nordabiz_app -d nordabiz -c "
SELECT COUNT(*) AS audits_today
FROM company_website_analysis
WHERE seo_audited_at >= CURRENT_DATE;
"
```

**Check Failed Audits:**
```bash
# Companies with no SEO data
psql -U nordabiz_app -d nordabiz -c "
SELECT c.id, c.name, c.website
FROM companies c
LEFT JOIN company_website_analysis cwa ON c.id = cwa.company_id
WHERE c.status = 'active'
  AND c.website IS NOT NULL
  AND cwa.id IS NULL;
"
```

### 10.2 Maintenance Tasks

**1. Re-audit Stale Data (>30 days):**
```bash
python seo_audit.py --all --filter-stale 30
```

**2. Audit New Companies:**
```bash
# Companies added in last 7 days
python seo_audit.py --filter-new 7
```

**3. Fix Failed Audits:**
```bash
# Re-audit companies with errors
python seo_audit.py --retry-failed
```

**4. Clean Old Data:**
```sql
-- Delete audit results older than 90 days (keep latest)
DELETE FROM company_website_analysis
WHERE analyzed_at < NOW() - INTERVAL '90 days'
  AND id NOT IN (
    SELECT DISTINCT ON (company_id) id
    FROM company_website_analysis
    ORDER BY company_id, analyzed_at DESC
  );
```

### 10.3 Monitoring Queries

**Score Distribution:**
```sql
SELECT
    CASE
        WHEN pagespeed_seo_score >= 90 THEN 'Excellent (90-100)'
        WHEN pagespeed_seo_score >= 50 THEN 'Good (50-89)'
        WHEN pagespeed_seo_score >= 0 THEN 'Poor (0-49)'
        ELSE 'Not Audited'
    END AS score_range,
    COUNT(*) AS companies
FROM companies c
LEFT JOIN company_website_analysis cwa ON c.id = cwa.company_id
WHERE c.status = 'active'
GROUP BY score_range
ORDER BY score_range;
```

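
The same bucketing can be mirrored in application code, for example when labelling scores on the dashboard. This is a sketch that follows the `CASE` expression above; the function name `score_range` is hypothetical, and `None` stands in for the SQL `NULL` produced by the `LEFT JOIN` when a company has no audit row.

```python
def score_range(pagespeed_seo_score):
    """Map a PageSpeed SEO score to the labels used in the SQL query."""
    if pagespeed_seo_score is None or pagespeed_seo_score < 0:
        return "Not Audited"  # matches the ELSE branch of the CASE
    if pagespeed_seo_score >= 90:
        return "Excellent (90-100)"
    if pagespeed_seo_score >= 50:
        return "Good (50-89)"
    return "Poor (0-49)"


labels = [score_range(score) for score in (95, 50, 49, None)]
```
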
**Top/Bottom Performers:**
```sql
-- Top 10 SEO scores
SELECT c.name, cwa.pagespeed_seo_score, cwa.seo_overall_score
FROM companies c
JOIN company_website_analysis cwa ON c.id = cwa.company_id
WHERE c.status = 'active'
ORDER BY cwa.seo_overall_score DESC
LIMIT 10;

-- Bottom 10 SEO scores
SELECT c.name, cwa.pagespeed_seo_score, cwa.seo_overall_score
FROM companies c
JOIN company_website_analysis cwa ON c.id = cwa.company_id
WHERE c.status = 'active' AND cwa.seo_overall_score IS NOT NULL
ORDER BY cwa.seo_overall_score ASC
LIMIT 10;
```

**Audit Coverage:**
```sql
SELECT
    COUNT(*) AS total_companies,
    COUNT(cwa.id) AS audited_companies,
    ROUND(COUNT(cwa.id)::NUMERIC / COUNT(*)::NUMERIC * 100, 1) AS coverage_percent
FROM companies c
LEFT JOIN company_website_analysis cwa ON c.id = cwa.company_id
WHERE c.status = 'active' AND c.website IS NOT NULL;
```

---

## 11. Future Enhancements

### 11.1 Planned Features

**1. Automated Re-Audit Scheduling:**
- Weekly cron job for all companies
- Priority queue for low-scoring sites
- Email alerts for score drops

**2. Historical Trend Tracking:**
- Store audit history (not just latest)
- Chart score changes over time
- Identify improving/declining sites

**3. Competitor Benchmarking:**
- Compare scores within categories
- Identify SEO leaders
- Best practice recommendations

**4. SEO Report Generation:**
- PDF reports for company owners
- Actionable recommendations
- Step-by-step fix guides

**5. Integration with Company Profiles:**
- Display SEO badge on company page
- Show top SEO issues
- Link to audit details

**6. Mobile vs Desktop Audits:**
- Separate scores for mobile/desktop
- Mobile-first optimization tracking
- Device-specific recommendations

### 11.2 Technical Improvements

**1. Async Batch Processing:**
- Celery background tasks
- Parallel audits (5 concurrent)
- Real-time progress updates

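
The plan names Celery, which needs a broker and worker processes. As a lighter illustration of the same idea (bounded concurrency with progress reporting), the sketch below swaps in the standard library's `concurrent.futures` thread pool. The function `audit_in_parallel` and the `on_progress` callback are hypothetical names, not part of the planned design.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def audit_in_parallel(urls, audit_one, max_workers=5, on_progress=None):
    """Run audit_one over urls with at most max_workers running at once.

    Returns a dict mapping each URL to its result, or to the exception
    raised while auditing it, so one failure does not stop the batch.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(audit_one, url): url for url in urls}
        for done, future in enumerate(as_completed(futures), start=1):
            url = futures[future]
            try:
                results[url] = future.result()
            except Exception as exc:  # keep going, record the failure
                results[url] = exc
            if on_progress:
                on_progress(done, len(futures))
    return results


# Example with a stand-in audit function and a recorded progress callback.
progress = []
results = audit_in_parallel(
    ["a.example", "b.example", "c.example"],
    audit_one=lambda url: url.upper(),
    on_progress=lambda done, total: progress.append((done, total)),
)
```
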
**2. API Webhook Notifications:**
- Notify company owners of audit results
- Integration with Slack/Discord
- Email summaries

**3. Advanced Caching:**
- Cache PageSpeed results for 7 days
- Skip re-audit if recent
- Force refresh button for admins

**4. Audit Scheduling:**
- Per-company audit frequency
- High-priority companies daily
- Low-priority weekly

---

## 12. Troubleshooting

### 12.1 Common Issues

**Issue:** "PageSpeed API quota exceeded"
**Solution:** Wait 24 hours for quota reset or upgrade to paid tier

**Issue:** "Database connection failed"
**Solution:** Check PostgreSQL is running: `systemctl status postgresql`

**Issue:** "SSL certificate verify failed"
**Solution:** Script automatically tries HTTP fallback

**Issue:** "Company has no website URL"
**Solution:** Add website in company edit form or skip

**Issue:** "Timeout after 30s"
**Solution:** Website is slow/down, skip or retry later

### 12.2 Debugging

**Enable Verbose Logging:**
```bash
python seo_audit.py --all --verbose
```

**Check API Key:**
```bash
echo $GOOGLE_PAGESPEED_API_KEY
# Should print API key, not empty
```

**Test Single Company:**
```bash
python seo_audit.py --company-id 26 --dry-run
# See full audit output without saving
```

**Check Database Connection:**
```bash
psql -U nordabiz_app -d nordabiz -h 127.0.0.1 -c "SELECT COUNT(*) FROM companies;"
```

**Test PageSpeed API:**
```bash
curl "https://www.googleapis.com/pagespeedonline/v5/runPagespeed?url=https://pixlab.pl&key=YOUR_API_KEY&strategy=mobile"
```

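
When testing from Python rather than `curl`, the query string should be URL-encoded so that a page URL with its own query parameters survives intact. The sketch below only builds the request URL for the endpoint shown above and performs no network call; the helper name `build_pagespeed_url` is an assumption.

```python
from urllib.parse import urlencode

PAGESPEED_ENDPOINT = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed"


def build_pagespeed_url(page_url: str, api_key: str, strategy: str = "mobile") -> str:
    """Build a PageSpeed Insights request URL with an encoded query string."""
    query = urlencode({"url": page_url, "key": api_key, "strategy": strategy})
    return f"{PAGESPEED_ENDPOINT}?{query}"


# YOUR_API_KEY is a placeholder, as in the curl example.
request_url = build_pagespeed_url("https://pixlab.pl", "YOUR_API_KEY")
```
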
---

## 13. Related Documentation

- **Google PageSpeed API:** [docs/architecture/06-external-integrations.md#3-google-pagespeed-insights-api](../06-external-integrations.md#3-google-pagespeed-insights-api)
- **Database Schema:** [docs/architecture/05-database-schema.md](../05-database-schema.md)
- **Flask Components:** [docs/architecture/04-flask-components.md](../04-flask-components.md)
- **Admin Panel:** [CLAUDE.md#audyt-seo-panel-adminseo](../../CLAUDE.md#audyt-seo-panel-adminseo)

---

## 14. Glossary

| Term | Definition |
|------|------------|
| **SEO** | Search Engine Optimization - improving website visibility in search results |
| **PageSpeed Insights** | Google tool for measuring website performance and SEO quality |
| **Lighthouse** | Automated audit tool by Google (powers PageSpeed Insights) |
| **Core Web Vitals** | Google's UX metrics: LCP (Largest Contentful Paint), INP (Interaction to Next Paint, which replaced FID, First Input Delay, in March 2024), CLS (Cumulative Layout Shift) |
| **On-Page SEO** | SEO factors on the page itself (meta tags, headings, content) |
| **Technical SEO** | SEO factors related to crawlability (robots.txt, sitemap, indexability) |
| **Meta Tags** | HTML tags providing metadata about the page (title, description, keywords) |
| **Structured Data** | Machine-readable format (JSON-LD, Schema.org) for search engines |
| **Canonical URL** | Preferred version of a page (prevents duplicate content issues) |
| **Robots.txt** | File telling search engines which pages to crawl/not crawl |
| **Sitemap.xml** | XML file listing all pages on a website for search engines |
| **Open Graph** | Meta tags for social media sharing (og:title, og:image, etc.) |
| **Twitter Card** | Meta tags for Twitter sharing |
| **Upsert** | Database operation: INSERT or UPDATE if exists |
| **Quota** | API usage limit (25,000 requests/day for PageSpeed) |

---

**Document End**