# 11. Troubleshooting Guide
**Document Type:** Operations Guide
**Last Updated:** 2026-04-04
**Maintainer:** DevOps Team
---
## Table of Contents
1. [Quick Reference](#1-quick-reference)
2. [Infrastructure & Network Issues](#2-infrastructure--network-issues)
3. [Application & Service Issues](#3-application--service-issues)
4. [Database Issues](#4-database-issues)
5. [API Integration Issues](#5-api-integration-issues)
6. [Authentication & Security Issues](#6-authentication--security-issues)
7. [Performance Issues](#7-performance-issues)
8. [Monitoring & Diagnostics](#8-monitoring--diagnostics)
9. [Emergency Procedures](#9-emergency-procedures)
10. [Diagnostic Commands Reference](#10-diagnostic-commands-reference)
---
## 1. Quick Reference
### 1.1 Emergency Contacts
| Role | Contact | Availability |
|------|---------|--------------|
| System Administrator | maciejpi@inpi.local | Business hours |
| Database Administrator | maciejpi@inpi.local | Business hours |
| On-Call Support | See CLAUDE.md | 24/7 |
### 1.2 Critical Services Status Check
```bash
# Quick health check - run this first!
curl -I https://nordabiznes.pl/health
# Expected: HTTP/2 200
# If failed, proceed to relevant section below
```
### 1.3 Issue Decision Tree
```mermaid
graph TD
A[Issue Detected] --> B{Can access site?}
B -->|No| C{From where?}
B -->|Yes, but slow| D[Check Performance Issues]
C -->|Nowhere| E[Section 2.1: ERR_TOO_MANY_REDIRECTS]
C -->|Only internal| E
C -->|500 Error| F[Section 3.1: Application Crash]
B -->|Yes, specific feature broken| G{Which feature?}
G -->|Login/Auth| H[Section 6: Authentication Issues]
G -->|Search| I[Section 3.3: Search Issues]
G -->|AI Chat| J[Section 5.2: Gemini API Issues]
G -->|Database| K[Section 4: Database Issues]
```
### 1.4 Severity Levels
| Level | Description | Response Time | Example |
|-------|-------------|---------------|---------|
| **CRITICAL** | Complete service outage | Immediate | ERR_TOO_MANY_REDIRECTS |
| **HIGH** | Major feature broken | < 1 hour | Database connection lost |
| **MEDIUM** | Minor feature degraded | < 4 hours | Search slow |
| **LOW** | Cosmetic or minor bug | Next business day | UI glitch |
---
## 2. Infrastructure & Network Issues
### 2.1 ERR_TOO_MANY_REDIRECTS
**Severity:** CRITICAL
**Incident History:** 2026-01-02 (30 min outage)
#### Symptoms
- Browser error: `ERR_TOO_MANY_REDIRECTS`
- Portal completely inaccessible via https://nordabiznes.pl
- Internal access works fine (http://57.128.200.27:5000)
- Affects 100% of external users
#### Root Cause
Nginx Proxy Manager (NPM) configured to forward to **port 80** instead of **port 5000**.
**Why this causes a redirect loop:**
1. NPM terminates HTTPS and forwards the request as plain HTTP to backend port 80
2. Nginx on port 80 sees HTTP and redirects to HTTPS
3. Request goes back to NPM, creating infinite loop
4. Browser aborts after ~20 redirects
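
The loop can be reproduced offline with a small model. This is an illustrative sketch only: the `follow` helper, the `hops` tables, and the `backend` hostname are hypothetical stand-ins for the NPM and Nginx behaviour described in the steps, and the cap of 20 mirrors the approximate browser limit.

```python
def follow(start, hops, max_redirects=20):
    """Follow hops until a final URL is reached or the cap is hit.

    `hops` maps a URL to the URL the request is sent to next; a URL that
    is absent from the table is treated as a final 200 response.
    """
    url, seen = start, []
    for _ in range(max_redirects + 1):
        if url not in hops:
            return "ok", seen + [url]
        seen.append(url)
        url = hops[url]
    return "too_many_redirects", seen


# Misconfigured: NPM forwards to port 80, where Nginx redirects back to HTTPS.
broken = {
    "https://nordabiznes.pl/": "http://backend:80/",
    "http://backend:80/": "https://nordabiznes.pl/",
}
# Correct: NPM forwards to Flask on port 5000, which answers directly.
fixed = {"https://nordabiznes.pl/": "http://backend:5000/"}

print(follow("https://nordabiznes.pl/", broken)[0])
print(follow("https://nordabiznes.pl/", fixed)[0])
```

With the port-80 table the walk never reaches a final URL, which is the behaviour the browser reports as `ERR_TOO_MANY_REDIRECTS`.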
#### Diagnosis
```bash
# 1. Check NPM proxy configuration
ssh maciejpi@10.22.68.250
docker exec nginx-proxy-manager_app_1 \
sqlite3 /data/database.sqlite \
"SELECT id, domain_names, forward_host, forward_port FROM proxy_host WHERE id = 27;"
# Expected output:
# 27|["nordabiznes.pl","www.nordabiznes.pl"]|57.128.200.27|5000
# If forward_port shows 80 → PROBLEM FOUND!
# 2. Test backend directly
curl -I http://57.128.200.27:80/
# If this returns 301 redirect → confirms issue
curl -I http://57.128.200.27:5000/health
# Should return 200 OK if Flask is running
```
#### Solution
**Option A: Fix via NPM Web UI (Recommended)**
```bash
# 1. Access NPM admin panel
open http://10.22.68.250:81
# 2. Navigate to: Proxy Hosts → nordabiznes.pl (ID 27)
# 3. Edit configuration:
# - Forward Hostname/IP: 57.128.200.27
# - Forward Port: 5000 (CRITICAL!)
# - Scheme: http
# 4. Save and test
```
**Option B: Fix via NPM API**
```python
import requests
NPM_URL = "http://10.22.68.250:81/api"
# Log in first to obtain a bearer token (see NPM API docs), then set it here
token = "YOUR_NPM_TOKEN"  # placeholder, replace with the real token
data = {
"domain_names": ["nordabiznes.pl", "www.nordabiznes.pl"],
"forward_scheme": "http",
"forward_host": "57.128.200.27",
"forward_port": 5000, # CRITICAL: Must be 5000!
"certificate_id": 27,
"ssl_forced": True,
"http2_support": True
}
response = requests.put(
f"{NPM_URL}/nginx/proxy-hosts/27",
headers={"Authorization": f"Bearer {token}"},
json=data
)
```
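
Option B needs a bearer token before the `PUT` call. The sketch below shows one way to obtain it, assuming the NPM API exposes `POST /api/tokens` accepting `identity` and `secret` and returning a JSON object with a `token` field; verify this against the API schema of the installed NPM version. The helper names `build_token_request` and `fetch_token` are hypothetical.

```python
import json
import urllib.request

NPM_URL = "http://10.22.68.250:81/api"


def build_token_request(identity, secret, base_url=NPM_URL):
    """Build (but do not send) the login request for the NPM API."""
    # Assumption: the endpoint path and field names match the installed NPM.
    body = json.dumps({"identity": identity, "secret": secret}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/tokens",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_token(identity, secret):
    """Send the login request and return the bearer token string."""
    request = build_token_request(identity, secret)
    with urllib.request.urlopen(request, timeout=10) as resp:
        return json.load(resp)["token"]
```

The returned string would then be used as `token` in the `Authorization` header of the request shown in Option B.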
#### Verification
```bash
# 1. External test (from outside INPI network)
curl -I https://nordabiznes.pl/health
# Expected: HTTP/2 200
# 2. Check NPM logs
ssh maciejpi@10.22.68.250
docker logs nginx-proxy-manager_app_1 --tail 20
# Should show 200 responses, not 301
```
#### Prevention
- **ALWAYS verify port 5000 after ANY NPM configuration change**
- Add monitoring alert for non-200 responses on /health
- Document NPM configuration in change requests
- Test from external network before marking changes complete
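
The monitoring alert for non-200 responses could start from something like the sketch below. The alert levels and the `classify_health` and `probe` names are assumptions for illustration, not an existing part of the portal; redirects are deliberately not followed so that a 301 from a wrong forward port stays visible.

```python
import urllib.error
import urllib.request


def classify_health(status_code):
    """Map an HTTP status from /health to an alert level (hypothetical policy)."""
    if status_code == 200:
        return "ok"
    if status_code in (301, 302, 307, 308):
        return "critical: redirect (check NPM forward port)"
    if status_code in (502, 504):
        return "high: gateway error (check Flask/Gunicorn)"
    return "warning: unexpected status"


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so the 3xx status is reported."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None


def probe(url="https://nordabiznes.pl/health", timeout=5):
    """Fetch the URL once and return its alert level."""
    opener = urllib.request.build_opener(_NoRedirect)
    try:
        with opener.open(url, timeout=timeout) as resp:
            return classify_health(resp.status)
    except urllib.error.HTTPError as exc:
        # Non-2xx responses, including unfollowed redirects, land here.
        return classify_health(exc.code)
    except OSError:
        return "critical: unreachable"
```

Running `probe()` from a host outside the INPI network matches the external test recommended in this section.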
---
### 2.2 502 Bad Gateway
**Severity:** HIGH
#### Symptoms
- Browser shows "502 Bad Gateway" error
- NPM logs show "upstream connection failed"
- Site completely inaccessible
#### Root Causes
1. Flask/Gunicorn service stopped
2. Backend server (57.128.200.27) unreachable
3. Firewall blocking port 5000
#### Diagnosis
```bash
# 1. Check Flask service status
ssh maciejpi@57.128.200.27
sudo systemctl status nordabiznes
# 2. Check if port 5000 is listening
sudo netstat -tlnp | grep :5000
# Expected: gunicorn process listening
# 3. Check Flask logs
sudo journalctl -u nordabiznes -n 50 --no-pager
# 4. Test backend directly
curl http://localhost:5000/health
# Should return JSON with status
```
#### Solution
**If service is stopped:**
```bash
# Restart Flask application
sudo systemctl restart nordabiznes
# Check status
sudo systemctl status nordabiznes
# Verify it's working
curl http://localhost:5000/health
```
**If service won't start:**
```bash
# Check for syntax errors
cd /var/www/nordabiznes
sudo -u www-data /var/www/nordabiznes/venv/bin/python3 -m py_compile app.py
# Check for missing dependencies
sudo -u www-data /var/www/nordabiznes/venv/bin/python3 -c "import flask; import sqlalchemy"
# Check environment variables
sudo -u www-data cat /var/www/nordabiznes/.env | grep -v "PASSWORD\|SECRET\|KEY"
# Try running manually (for debugging)
sudo -u www-data /var/www/nordabiznes/venv/bin/python3 app.py
```
**If network issue:**
```bash
# Test connectivity from NPM to backend
ssh maciejpi@10.22.68.250
curl -I http://57.128.200.27:5000/health
# Check firewall rules
ssh maciejpi@57.128.200.27
sudo iptables -L -n | grep 5000
```
#### Verification
```bash
curl -I https://nordabiznes.pl/health
# Expected: HTTP/2 200
```
---
### 2.3 504 Gateway Timeout
**Severity:** MEDIUM
#### Symptoms
- Browser shows "504 Gateway Timeout"
- Requests take >60 seconds
- Some requests succeed, others time out
#### Root Causes
1. Database query hanging
2. External API timeout (Gemini, PageSpeed, etc.)
3. Insufficient Gunicorn workers
4. Resource exhaustion (CPU, memory)
#### Diagnosis
```bash
# 1. Check Gunicorn worker status
ssh maciejpi@57.128.200.27
ps aux | grep gunicorn
# Look for zombie workers or high CPU usage
# 2. Check database connections
sudo -u www-data psql -h localhost -U nordabiz_app -d nordabiz -c \
"SELECT count(*) FROM pg_stat_activity WHERE datname = 'nordabiz';"
# 3. Check for long-running queries
sudo -u www-data psql -h localhost -U nordabiz_app -d nordabiz -c \
"SELECT pid, now() - query_start AS duration, query
FROM pg_stat_activity
WHERE state = 'active' AND now() - query_start > interval '5 seconds';"
# 4. Check system resources
top -n 1
free -h
df -h
# 5. Check Flask logs for slow requests
sudo journalctl -u nordabiznes -n 100 --no-pager | grep -E "slow|timeout|took"
```
#### Solution
**If database query hanging:**
```bash
# Identify and kill long-running query
sudo -u www-data psql -h localhost -U nordabiz_app -d nordabiz
# Find problematic query
SELECT pid, query FROM pg_stat_activity
WHERE state = 'active' AND now() - query_start > interval '30 seconds';
# Kill it (replace PID)
SELECT pg_terminate_backend(12345);
```
**If resource exhaustion:**
```bash
# Restart Flask to clear memory
sudo systemctl restart nordabiznes
# Consider increasing Gunicorn workers (edit systemd service)
sudo nano /etc/systemd/system/nordabiznes.service
# Change: --workers=4 (adjust based on CPU cores)
sudo systemctl daemon-reload
sudo systemctl restart nordabiznes
```
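
For choosing the `--workers` value, the Gunicorn documentation suggests `(2 x cores) + 1` as a starting point. A minimal sketch of that rule follows; the function name is hypothetical and the result is a starting value to tune under real load, not a guarantee.

```python
import os


def recommended_workers(cpu_cores=None):
    """Return the (2 x cores) + 1 starting point suggested by the Gunicorn docs."""
    cores = cpu_cores if cpu_cores is not None else (os.cpu_count() or 1)
    return 2 * cores + 1


print(recommended_workers(2))  # prints 5
```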
**If external API timeout:**
```bash
# Check if Gemini API is responsive
curl -I https://generativelanguage.googleapis.com/v1beta/models
# Check PageSpeed API
curl -I https://www.googleapis.com/pagespeedonline/v5/runPagespeed
# Check Brave Search API
curl -I https://api.search.brave.com/res/v1/web/search
```
#### Verification
```bash
# Test response time
time curl -I https://nordabiznes.pl/health
# Should complete in < 2 seconds
```
---
### 2.4 SSL Certificate Issues
**Severity:** HIGH
#### Symptoms
- Browser shows "Your connection is not private"
- SSL certificate expired or invalid
- Mixed content warnings
#### Diagnosis
```bash
# 1. Check certificate expiry
echo | openssl s_client -servername nordabiznes.pl -connect nordabiznes.pl:443 2>/dev/null | \
openssl x509 -noout -dates
# 2. Check certificate details
curl -vI https://nordabiznes.pl 2>&1 | grep -E "SSL|certificate"
# 3. Check NPM certificate status
ssh maciejpi@10.22.68.250
docker exec nginx-proxy-manager_app_1 \
sqlite3 /data/database.sqlite \
"SELECT id, nice_name, expires_on FROM certificate WHERE id = 27;"
```
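
The `notAfter` line printed by `openssl x509 -noout -dates` can be turned into a days-remaining figure for alerting. The sketch below assumes the usual `notAfter=Mon DD HH:MM:SS YYYY GMT` layout and an English locale for month names; `days_until_expiry` is a hypothetical helper.

```python
from datetime import datetime, timezone


def days_until_expiry(openssl_dates_output, now=None):
    """Return whole days until the notAfter date in `openssl x509 -dates` output."""
    for line in openssl_dates_output.splitlines():
        if line.startswith("notAfter="):
            raw = line.split("=", 1)[1]
            # Drop the trailing "GMT" and collapse the padded day ("Jun  1").
            stamp = " ".join(raw.split()[:4])
            expiry = datetime.strptime(stamp, "%b %d %H:%M:%S %Y")
            expiry = expiry.replace(tzinfo=timezone.utc)
            now = now or datetime.now(timezone.utc)
            return (expiry - now).days
    raise ValueError("no notAfter line found")
```

A negative result means the certificate has already expired.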
#### Solution
**If certificate expired:**
```bash
# NPM auto-renews Let's Encrypt certificates
# Force renewal via NPM UI or API
# Via UI:
# 1. Access http://10.22.68.250:81
# 2. SSL Certificates → nordabiznes.pl
# 3. Click "Renew" button
# Via CLI (if auto-renewal failed):
ssh maciejpi@10.22.68.250
docker exec nginx-proxy-manager_app_1 \
node /app/index.js certificate renew 27
```
**If mixed content warnings:**
```bash
# Check Flask is generating HTTPS URLs
# Verify in templates: url_for(..., _external=True, _scheme='https')
# Check CSP headers in app.py
grep "Content-Security-Policy" /var/www/nordabiznes/app.py
```
---
### 2.5 DNS Resolution Issues
**Severity:** MEDIUM
#### Symptoms
- `nslookup nordabiznes.pl` fails
- Site accessible by IP but not domain
- Inconsistent access from different networks
#### Diagnosis
```bash
# 1. Check external DNS (OVH)
nslookup nordabiznes.pl 8.8.8.8
# Should return: 85.237.177.83
# 2. Check internal DNS (inpi.local)
nslookup nordabiznes.inpi.local 10.22.68.1
# Should return: 57.128.200.27
# 3. Test from different locations
curl -I -H "Host: nordabiznes.pl" http://85.237.177.83/health
# 4. Check Fortigate NAT rules
# Access Fortigate admin panel and verify NAT entry:
# External: 85.237.177.83:443 → Internal: 10.22.68.250:443
```
#### Solution
**If external DNS issue:**
```bash
# Check OVH DNS settings
# Login to OVH control panel
# Verify A record: nordabiznes.pl → 85.237.177.83
# Verify A record: www.nordabiznes.pl → 85.237.177.83
```
**If internal DNS issue:**
```bash
# Update internal DNS server
# This requires access to INPI DNS management (see dns-manager skill)
```
---
## 3. Application & Service Issues
### 3.1 Application Crash / Won't Start
**Severity:** CRITICAL
#### Symptoms
- Flask service status shows "failed" or "inactive"
- Systemd shows error in logs
- Manual start fails with traceback
#### Diagnosis
```bash
# 1. Check service status
ssh maciejpi@57.128.200.27
sudo systemctl status nordabiznes
# 2. Check recent logs
sudo journalctl -u nordabiznes -n 100 --no-pager
# 3. Try manual start for detailed error
cd /var/www/nordabiznes
sudo -u www-data /var/www/nordabiznes/venv/bin/python3 app.py
# Read the traceback carefully
```
#### Common Root Causes & Solutions
**A. Python Syntax Error**
```bash
# Symptom: SyntaxError in logs
# Cause: Recent code change introduced syntax error
# Fix: Check syntax
sudo -u www-data /var/www/nordabiznes/venv/bin/python3 -m py_compile app.py
# Rollback if necessary
cd /var/www/nordabiznes
sudo -u www-data git log --oneline -5
sudo -u www-data git revert HEAD # or specific commit
sudo systemctl restart nordabiznes
```
**B. Missing Environment Variables**
```bash
# Symptom: KeyError or "SECRET_KEY not found"
# Cause: .env file missing or incomplete
# Fix: Check .env exists and has required variables
sudo -u www-data ls -la /var/www/nordabiznes/.env
sudo -u www-data cat /var/www/nordabiznes/.env | grep -E "^[A-Z_]+=" | wc -l
# Should have ~20 environment variables
# Required variables (add if missing):
# - SECRET_KEY
# - DATABASE_URL
# - GEMINI_API_KEY
# - BRAVE_SEARCH_API_KEY
# - GOOGLE_PAGESPEED_API_KEY
# - ADMIN_EMAIL
# - ADMIN_PASSWORD
```
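
Checking the required variables by hand is error-prone, so a small parser can report which ones are absent or empty. This is a sketch assuming a plain `NAME=value` file without multi-line values; `missing_env_vars` is a hypothetical helper and the list is copied from the required variables named in this section.

```python
REQUIRED = [
    "SECRET_KEY", "DATABASE_URL", "GEMINI_API_KEY", "BRAVE_SEARCH_API_KEY",
    "GOOGLE_PAGESPEED_API_KEY", "ADMIN_EMAIL", "ADMIN_PASSWORD",
]


def missing_env_vars(env_text, required=REQUIRED):
    """Return required names that are absent or empty in .env-style text."""
    defined = {}
    for line in env_text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and lines that are not assignments.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        defined[name.strip()] = value.strip().strip("'\"")
    return [name for name in required if not defined.get(name)]
```

The file is readable only by the service account, so the text would need to be read as `www-data` before being passed in.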
**C. Database Connection Failed**
```bash
# Symptom: "could not connect to server" or "FATAL: password authentication failed"
# Cause: PostgreSQL not running or wrong credentials
# Fix: Check PostgreSQL
sudo systemctl status postgresql
# Test connection
sudo -u www-data psql -h localhost -U nordabiz_app -d nordabiz -c "SELECT 1;"
# If password wrong, update .env and restart
```
**D. Missing Python Dependencies**
```bash
# Symptom: ImportError or ModuleNotFoundError
# Cause: Dependency not installed in venv
# Fix: Reinstall dependencies
cd /var/www/nordabiznes
sudo -u www-data /var/www/nordabiznes/venv/bin/pip install -r requirements.txt
# Verify specific package
sudo -u www-data /var/www/nordabiznes/venv/bin/pip show flask
```
**E. Port 5000 Already in Use**
```bash
# Symptom: "Address already in use"
# Cause: Another process using port 5000
# Fix: Find and kill process
sudo lsof -i :5000
sudo kill <PID>
# Or restart server if unclear
sudo reboot
```
#### Verification
```bash
sudo systemctl status nordabiznes
# Should show "active (running)"
curl http://localhost:5000/health
# Should return JSON
```
---
### 3.2 White Screen / Blank Page
**Severity:** HIGH
#### Symptoms
- Page loads but shows blank white screen
- No error message in browser
- HTML source is empty or minimal
#### Diagnosis
```bash
# 1. Check browser console (F12)
# Look for JavaScript errors
# 2. Check Flask logs
ssh maciejpi@57.128.200.27
sudo journalctl -u nordabiznes -n 50 --no-pager | grep ERROR
# 3. Check template rendering
curl https://nordabiznes.pl/ -o /tmp/page.html
less /tmp/page.html
# Check if HTML is complete
# 4. Check static assets loading
curl -I https://nordabiznes.pl/static/css/styles.css
# Should return 200
```
#### Root Causes & Solutions
**A. Template Rendering Error**
```bash
# Symptom: Jinja2 error in logs
# Cause: Syntax error in template file
# Fix: Check Flask logs for template name
sudo journalctl -u nordabiznes -n 100 | grep -i jinja
# Test template syntax
cd /var/www/nordabiznes
sudo -u www-data /var/www/nordabiznes/venv/bin/python3 -c "
from jinja2 import Environment
# parse() checks syntax only, so custom filters such as local_time do not raise
with open('templates/index.html') as f:
    Environment().parse(f.read())
"
```
**B. JavaScript Error**
```bash
# Symptom: Console shows JS error
# Cause: Syntax error in JavaScript code
# Fix: Check browser console
# Common issues:
# - extra_js block has <script> tags (shouldn't!)
# - undefined variable reference
# - missing semicolon
# Template fix for extra_js:
# WRONG: {% block extra_js %}<script>code</script>{% endblock %}
# RIGHT: {% block extra_js %}code{% endblock %}
```
**C. Database Query Failed**
```bash
# Symptom: 500 error in network tab
# Cause: Database query error preventing page render
# Fix: Check Flask logs
sudo journalctl -u nordabiznes -n 50 | grep -i "sqlalchemy\|database"
# Check database connectivity
sudo -u www-data psql -h localhost -U nordabiz_app -d nordabiz -c "SELECT 1;"
```
---
### 3.3 Search Not Working
**Severity:** MEDIUM
#### Symptoms
- Search returns no results for valid queries
- Search is very slow (>5 seconds)
- Search returns "Database error"
#### Diagnosis
```bash
# 1. Test search endpoint
curl "https://nordabiznes.pl/search?q=test" -v
# 2. Check search_service.py logs
ssh maciejpi@57.128.200.27
sudo journalctl -u nordabiznes -n 100 | grep -i search
# 3. Test database FTS
sudo -u www-data psql -h localhost -U nordabiz_app -d nordabiz
# Test FTS query
SELECT name, ts_rank(search_vector, to_tsquery('polish', 'web')) AS score
FROM companies
WHERE search_vector @@ to_tsquery('polish', 'web')
ORDER BY score DESC LIMIT 5;
# Check pg_trgm extension
SELECT * FROM pg_extension WHERE extname = 'pg_trgm';
```
#### Root Causes & Solutions
**A. Full-Text Search Index Outdated**
```bash
# Symptom: Recent companies don't appear in search
# Cause: search_vector not updated
# Fix: Rebuild FTS index
sudo -u www-data psql -h localhost -U nordabiz_app -d nordabiz
UPDATE companies SET search_vector =
to_tsvector('polish',
COALESCE(name, '') || ' ' ||
COALESCE(description, '') || ' ' ||
COALESCE(array_to_string(services, ' '), '') || ' ' ||
COALESCE(array_to_string(competencies, ' '), '')
);
VACUUM ANALYZE companies;
```
**B. Synonym Expansion Not Working**
```bash
# Symptom: Search for "www" doesn't find "strony internetowe"
# Cause: SYNONYM_EXPANSION dict in search_service.py incomplete
# Fix: Check synonyms
cd /var/www/nordabiznes
grep -A 20 "SYNONYM_EXPANSION" search_service.py
# Add missing synonyms if needed
# Restart service after editing
```
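
To make the failure mode concrete, here is how a synonym dictionary of this kind might be applied to a query. The dictionary contents and the `expand_query` function are hypothetical; the real `SYNONYM_EXPANSION` in `search_service.py` may be structured differently, and only the `www` to `strony internetowe` pair comes from this guide.

```python
# Hypothetical excerpt; the real dict lives in search_service.py.
SYNONYM_EXPANSION = {
    "www": ["strony internetowe", "website"],
}


def expand_query(query, synonyms=SYNONYM_EXPANSION):
    """Return the query terms plus any synonyms, without duplicates."""
    expanded = []
    for term in query.lower().split():
        for candidate in [term, *synonyms.get(term, [])]:
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded
```

A term with no dictionary entry passes through unchanged, which is why a missing synonym shows up as "no results" rather than as an error.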
**C. Search Timeout**
```bash
# Symptom: Search takes >30 seconds
# Cause: Missing database indexes
# Fix: Add indexes
sudo -u www-data psql -h localhost -U nordabiz_app -d nordabiz
CREATE INDEX IF NOT EXISTS idx_companies_search_vector ON companies USING gin(search_vector);
CREATE INDEX IF NOT EXISTS idx_companies_name_trgm ON companies USING gin(name gin_trgm_ops);
VACUUM ANALYZE companies;
```
#### Verification
```bash
# Test search
curl "https://nordabiznes.pl/search?q=web" | grep -c "company-card"
# Should return number of results found
```
---
### 3.4 AI Chat Not Responding
**Severity:** MEDIUM
#### Symptoms
- Chat shows "thinking..." forever
- Chat returns error message
- Empty responses from AI
#### Root Causes & Solutions
See [Section 5.2: Gemini API Issues](#52-gemini-api-issues) for detailed troubleshooting.
Quick check:
```bash
# 1. Verify Gemini API key
ssh maciejpi@57.128.200.27
sudo -u www-data cat /var/www/nordabiznes/.env | grep GEMINI_API_KEY
# Should not be empty
# 2. Test Gemini API directly
curl -H "x-goog-api-key: YOUR_API_KEY" \
"https://generativelanguage.googleapis.com/v1beta/models/gemini-3-flash-preview:generateContent" \
-H "Content-Type: application/json" \
-d '{"contents":[{"parts":[{"text":"Hello"}]}]}'
# 3. Check quota
# Visit: https://console.cloud.google.com/apis/api/generativelanguage.googleapis.com/quotas
```
---
## 4. Database Issues
### 4.1 Database Connection Failed
**Severity:** CRITICAL
#### Symptoms
- Flask logs show "could not connect to server"
- All database queries fail
- 500 error on all pages
#### Diagnosis
```bash
# 1. Check PostgreSQL service
ssh maciejpi@57.128.200.27
sudo systemctl status postgresql
# 2. Check PostgreSQL is listening
sudo netstat -tlnp | grep 5432
# Should show: LISTEN on 127.0.0.1:5432
# 3. Check logs
sudo journalctl -u postgresql -n 50
# 4. Test connection
psql -h localhost -U nordabiz_app -d nordabiz -c "SELECT 1;"
```
#### Solution
**If PostgreSQL is stopped:**
```bash
sudo systemctl start postgresql
sudo systemctl status postgresql
# If fails to start, check logs
sudo journalctl -u postgresql -n 100 --no-pager
```
**If connection refused:**
```bash
# Check pg_hba.conf allows local connections
sudo cat /etc/postgresql/*/main/pg_hba.conf | grep "127.0.0.1"
# Should have: host all all 127.0.0.1/32 md5
# Reload if changed
sudo systemctl reload postgresql
```
**If authentication failed:**
```bash
# Verify user exists
sudo -u postgres psql -c "\du nordabiz_app"
# Reset password if needed
sudo -u postgres psql
ALTER USER nordabiz_app WITH PASSWORD 'NEW_PASSWORD';
\q
# Update .env with new password
sudo nano /var/www/nordabiznes/.env
# Update DATABASE_URL line
# Restart Flask
sudo systemctl restart nordabiznes
```
#### Verification
```bash
sudo -u www-data psql -h localhost -U nordabiz_app -d nordabiz -c "SELECT count(*) FROM companies;"
# Should return count
```
---
### 4.2 Database Query Slow
**Severity:** MEDIUM
#### Symptoms
- Pages load slowly (>5 seconds)
- Database queries take a long time
- High CPU usage on database server
#### Diagnosis
```bash
# 1. Check for slow queries
sudo -u www-data psql -h localhost -U nordabiz_app -d nordabiz
SELECT pid, now() - query_start AS duration, query
FROM pg_stat_activity
WHERE state = 'active' AND now() - query_start > interval '1 second'
ORDER BY duration DESC;
# 2. Check for unused indexes (never scanned since statistics were last reset)
SELECT schemaname, relname, indexrelname, idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0 AND indexrelname NOT LIKE '%pkey';
# 3. Check table statistics
SELECT schemaname, relname, n_live_tup, n_dead_tup,
last_autovacuum, last_autoanalyze
FROM pg_stat_user_tables
WHERE schemaname = 'public'
ORDER BY n_live_tup DESC;
# 4. Enable query logging temporarily
ALTER DATABASE nordabiz SET log_min_duration_statement = 1000;
-- Log queries taking > 1 second
```
#### Solution
**If missing indexes:**
```bash
# Add appropriate indexes based on queries
sudo -u www-data psql -h localhost -U nordabiz_app -d nordabiz
-- Example: Index on foreign key
CREATE INDEX idx_company_news_company_id ON company_news(company_id);
-- Example: Composite index for common query
CREATE INDEX idx_users_email_active ON users(email, is_active);
-- Rebuild search index
REINDEX INDEX idx_companies_search_vector;
VACUUM ANALYZE;
```
**If high dead tuple ratio:**
```bash
# Run vacuum
sudo -u www-data psql -h localhost -U nordabiz_app -d nordabiz
VACUUM ANALYZE;
# For severe cases
VACUUM FULL companies; -- Locks table!
```
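
"High dead tuple ratio" can be quantified from the `n_live_tup` and `n_dead_tup` counters returned in the diagnosis step. The sketch below is illustrative; the 20% threshold is an assumption, not a value taken from this deployment.

```python
def dead_tuple_ratio(n_live_tup, n_dead_tup):
    """Fraction of dead rows in a table, from pg_stat_user_tables counters."""
    total = n_live_tup + n_dead_tup
    return n_dead_tup / total if total else 0.0


def needs_vacuum(n_live_tup, n_dead_tup, threshold=0.2):
    """True when the dead-row fraction exceeds the (assumed) threshold."""
    return dead_tuple_ratio(n_live_tup, n_dead_tup) > threshold
```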
**If table statistics outdated:**
```bash
sudo -u www-data psql -h localhost -U nordabiz_app -d nordabiz
ANALYZE companies;
ANALYZE users;
ANALYZE ai_chat_messages;
```
#### Verification
```bash
# Check query performance improved
\timing
SELECT * FROM companies WHERE name ILIKE '%test%' LIMIT 10;
# Should complete in < 100ms
```
---
### 4.3 Database Disk Full
**Severity:** HIGH
#### Symptoms
- PostgreSQL logs show "No space left on device"
- INSERT/UPDATE queries fail
- Database becomes read-only
#### Diagnosis
```bash
# 1. Check disk usage
ssh maciejpi@57.128.200.27
df -h
# Check /var/lib/postgresql usage
# 2. Check database size
sudo -u www-data psql -h localhost -U nordabiz_app -d nordabiz
SELECT pg_size_pretty(pg_database_size('nordabiz'));
SELECT tablename, pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename))
FROM pg_tables
WHERE schemaname = 'public'
ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC;
# 3. Check WAL files
sudo du -sh /var/lib/postgresql/*/main/pg_wal/
```
#### Solution
**If WAL files accumulating:**
```bash
# Check WAL settings
sudo -u postgres psql -c "SHOW max_wal_size;"
sudo -u postgres psql -c "SHOW wal_keep_size;"
# Trigger checkpoint
sudo -u postgres psql -c "CHECKPOINT;"
```
**If old backups not cleaned:**
```bash
# Remove old backups (keep last 7 days)
find /backup/nordabiz/ -name "*.sql" -mtime +7 -delete
```
**If logs too large:**
```bash
# Truncate old logs
sudo journalctl --vacuum-time=7d
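# Check PostgreSQL log size (Debian/Ubuntu default location)
sudo du -sh /var/log/postgresql/
# Do not run pg_archivecleanup against pg_wal/: it is meant for a separate
# WAL archive directory, and deleting live WAL segments can corrupt the cluster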
```
**Emergency: Archive and purge old data:**
```bash
# Archive old data before deletion
sudo -u www-data psql -h localhost -U nordabiz_app -d nordabiz
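-- Example: Archive old AI chat messages (>6 months)
-- Note: the archive table is created in the same database, and VACUUM FULL
-- needs temporary space for a full table copy, so free some disk space first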
CREATE TABLE ai_chat_messages_archive AS
SELECT * FROM ai_chat_messages
WHERE created_at < NOW() - INTERVAL '6 months';
DELETE FROM ai_chat_messages
WHERE created_at < NOW() - INTERVAL '6 months';
VACUUM FULL ai_chat_messages;
```
---
### 4.4 Database Migration Failed
**Severity:** HIGH
#### Symptoms
- Migration script returns error
- Database schema out of sync with code
- Missing tables or columns
#### Diagnosis
```bash
# 1. Check current schema version
sudo -u www-data psql -h localhost -U nordabiz_app -d nordabiz
\dt
# List all tables
\d companies
# Describe companies table
# 2. Check migration logs
ls -la /var/www/nordabiznes/database/migrations/
# 3. Check Flask-Migrate status (if using Alembic)
cd /var/www/nordabiznes
sudo -u www-data /var/www/nordabiznes/venv/bin/flask db current
```
#### Solution
**If table missing:**
```bash
# Re-run migration script
cd /var/www/nordabiznes
sudo -u www-data psql -h localhost -U nordabiz_app -d nordabiz < database/schema.sql
```
**If a column expected by the code is missing from the table:**
```bash
# Add column manually
sudo -u www-data psql -h localhost -U nordabiz_app -d nordabiz
ALTER TABLE companies ADD COLUMN IF NOT EXISTS new_column VARCHAR(255);
-- Grant permissions
GRANT ALL ON TABLE companies TO nordabiz_app;
```
**If migration stuck:**
```bash
# Rollback last migration
sudo -u www-data /var/www/nordabiznes/venv/bin/flask db downgrade
# Re-apply
sudo -u www-data /var/www/nordabiznes/venv/bin/flask db upgrade
```
#### Verification
```bash
sudo -u www-data psql -h localhost -U nordabiz_app -d nordabiz
\d companies
# Verify schema matches expected structure
```
---
## 5. API Integration Issues
### 5.1 API Rate Limit Exceeded
**Severity:** MEDIUM
#### Symptoms
- 429 "Too Many Requests" errors
- API calls fail with quota exceeded message
- Features stop working after heavy usage
#### Diagnosis
```bash
# 1. Check API usage in database
ssh maciejpi@57.128.200.27
sudo -u www-data psql -h localhost -U nordabiz_app -d nordabiz
-- Gemini API usage today
SELECT COUNT(*), SUM(input_tokens), SUM(output_tokens)
FROM ai_api_costs
WHERE DATE(created_at) = CURRENT_DATE;
-- PageSpeed API usage today
SELECT COUNT(*) FROM company_website_analysis
WHERE DATE(created_at) = CURRENT_DATE;
-- Brave Search API usage this month
SELECT COUNT(*) FROM company_news
WHERE DATE(created_at) >= DATE_TRUNC('month', CURRENT_DATE);
# 2. Check rate limiting logs
sudo journalctl -u nordabiznes -n 100 | grep -i "rate limit\|quota\|429"
```
#### API Quotas Reference
| API | Free Tier Limit | Current Usage Query |
|-----|-----------------|---------------------|
| Gemini AI | 1,500 req/day | `SELECT COUNT(*) FROM ai_api_costs WHERE DATE(created_at) = CURRENT_DATE;` |
| PageSpeed | 25,000 req/day | `SELECT COUNT(*) FROM company_website_analysis WHERE DATE(created_at) = CURRENT_DATE;` |
| Brave Search | 2,000 req/month | `SELECT COUNT(*) FROM company_news WHERE created_at >= DATE_TRUNC('month', CURRENT_DATE);` |
| Google Places | Limited | Check Google Cloud Console |
| MS Graph | Per tenant | Check Azure AD logs |
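
The usage counts returned by the queries in this table can be compared against the limits programmatically. A minimal sketch, with the limits copied from the table and a 90% warning level chosen as an assumption; `quota_status` is a hypothetical helper.

```python
# Limits copied from the quota table; the period is informational only.
QUOTAS = {
    "gemini": (1500, "day"),
    "pagespeed": (25000, "day"),
    "brave": (2000, "month"),
}


def quota_status(api, used, warn_at=0.9):
    """Return (percent_used, level) for a known API and its usage count."""
    limit, _period = QUOTAS[api]
    percent = round(100 * used / limit, 1)
    if used >= limit:
        return percent, "exceeded"
    if used >= warn_at * limit:
        return percent, "warning"
    return percent, "ok"
```

For example, 1,400 Gemini requests in a day comes out at 93.3% and the `warning` level.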
#### Solution
**If Gemini quota exceeded:**
```bash
# Wait until the daily quota resets (midnight Pacific Time per Google's rate-limit docs)
# OR upgrade to paid tier
# Temporary workaround: Disable AI chat
sudo nano /var/www/nordabiznes/app.py
# Comment out @app.route('/chat') temporarily
sudo systemctl restart nordabiznes
```
**If PageSpeed quota exceeded:**
```bash
# Stop SEO audit script
pkill -f seo_audit.py
# Wait until next day
# Consider batching audits to stay under quota
```
**If Brave Search quota exceeded:**
```bash
# Disable news monitoring temporarily
# Wait until next month
# Consider upgrading to paid tier ($5/month for 20k requests)
```
#### Prevention
```bash
# Add quota monitoring alerts
# Create script: /var/www/nordabiznes/scripts/check_api_quotas.sh
#!/bin/bash
# -tA prints the bare number without padding, so the integer test is reliable
GEMINI_COUNT=$(sudo -u www-data psql -h localhost -U nordabiz_app -d nordabiz -tA -c \
"SELECT COUNT(*) FROM ai_api_costs WHERE DATE(created_at) = CURRENT_DATE;")
if [ "$GEMINI_COUNT" -gt 1400 ]; then
    echo "WARNING: Gemini API usage at $GEMINI_COUNT / 1500"
    # Send alert email
fi
# Add to crontab: run hourly
# 0 * * * * /var/www/nordabiznes/scripts/check_api_quotas.sh
```
---
### 5.2 Gemini API Issues
**Severity:** MEDIUM
#### Symptoms
- AI chat returns empty responses
- "Safety filter blocked response" error
- Gemini API timeout
- "Conversation not found" error
#### Diagnosis
```bash
# 1. Test Gemini API directly (run on the production server, where .env lives)
ssh maciejpi@57.128.200.27
GEMINI_KEY=$(sudo -u www-data grep GEMINI_API_KEY /var/www/nordabiznes/.env | cut -d= -f2)
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-3-flash-preview:generateContent" \
-H "x-goog-api-key: $GEMINI_KEY" \
-H "Content-Type: application/json" \
-d '{"contents":[{"parts":[{"text":"Hello, test"}]}]}'
# 2. Check Flask logs for Gemini errors
sudo journalctl -u nordabiznes -n 100 | grep -i gemini
# 3. Check conversation ownership
sudo -u www-data psql -h localhost -U nordabiz_app -d nordabiz
SELECT id, user_id, created_at
FROM ai_chat_conversations
WHERE id = 123; -- Replace with conversation ID
```
#### Common Issues & Solutions
**A. Empty AI Responses**
```bash
# Cause: Safety filters blocking response
# OR context too long
# Check last message for safety filter
sudo -u www-data psql -h localhost -U nordabiz_app -d nordabiz
SELECT message, ai_response, error_message
FROM ai_chat_messages
ORDER BY created_at DESC
LIMIT 5;
# If error_message contains "safety" or "blocked":
# - Rephrase user query to be less controversial
# - No technical fix needed, it's Gemini's safety system
```
**B. Conversation Not Found**
```bash
# Cause: User trying to access someone else's conversation
# Verify conversation ownership
SELECT c.id, c.user_id, u.email
FROM ai_chat_conversations c
JOIN users u ON c.user_id = u.id
WHERE c.id = 123; -- Replace ID
# Fix: Ensure frontend passes correct conversation_id
# OR create new conversation for user
```
**C. Token Limit Exceeded**
```bash
# Cause: Conversation history too long (>200k tokens)
# Check token usage
SELECT id, input_tokens, output_tokens,
input_tokens + output_tokens AS total_tokens
FROM ai_chat_messages
WHERE conversation_id = 123
ORDER BY created_at DESC;
# Fix: Trim old messages
DELETE FROM ai_chat_messages
WHERE conversation_id = 123
AND created_at < (
SELECT created_at
FROM ai_chat_messages
WHERE conversation_id = 123
ORDER BY created_at DESC
LIMIT 1 OFFSET 10
);
```
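
Deleting rows is destructive, so an alternative is to trim the history in application code before it is sent to the model. The sketch below keeps the newest messages that fit a token budget; the function and the message layout are hypothetical, with field names mirroring the `input_tokens` and `output_tokens` columns queried in this section.

```python
def trim_history(messages, max_tokens):
    """Keep the newest messages whose combined token count fits the budget.

    `messages` is a list of dicts ordered oldest to newest, each carrying
    `input_tokens` and `output_tokens`.
    """
    kept, total = [], 0
    for msg in reversed(messages):
        cost = msg["input_tokens"] + msg["output_tokens"]
        if total + cost > max_tokens:
            break  # everything older than this message is dropped as well
        kept.append(msg)
        total += cost
    kept.reverse()  # restore oldest-to-newest order
    return kept
```

Unlike the `DELETE` statement, this leaves the stored conversation intact and only limits what is passed as context.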
**D. API Key Invalid**
```bash
# Symptom: 401 Unauthorized or 403 Forbidden
# Verify API key
sudo -u www-data grep GEMINI_API_KEY /var/www/nordabiznes/.env
# Test key directly
curl -H "x-goog-api-key: YOUR_KEY" \
"https://generativelanguage.googleapis.com/v1beta/models"
# If invalid, regenerate key in Google Cloud Console
# https://console.cloud.google.com/apis/credentials
```
#### Verification
```bash
# Test AI chat endpoint
curl -X POST https://nordabiznes.pl/api/chat \
-H "Content-Type: application/json" \
-d '{"conversation_id":123,"message":"test"}' \
-b "session=YOUR_SESSION_COOKIE"
# Should return JSON with AI response
```
---
### 5.3 PageSpeed API Issues
**Severity:** LOW
#### Symptoms
- SEO audit fails with API error
- PageSpeed scores show as null/0
- Timeout errors in audit script
#### Diagnosis
```bash
# 1. Test PageSpeed API directly (run on the production server, where .env lives)
ssh maciejpi@57.128.200.27
PAGESPEED_KEY=$(sudo -u www-data grep GOOGLE_PAGESPEED_API_KEY /var/www/nordabiznes/.env | cut -d= -f2)
curl "https://www.googleapis.com/pagespeedonline/v5/runPagespeed?url=https://nordabiznes.pl&key=$PAGESPEED_KEY"
# 2. Check audit logs
sudo journalctl -u nordabiznes -n 100 | grep -i pagespeed
# 3. Check recent audits
sudo -u www-data psql -h localhost -U nordabiz_app -d nordabiz
SELECT company_id, url, seo_score, performance_score,
audited_at, error_message
FROM company_website_analysis
ORDER BY audited_at DESC
LIMIT 10;
```
#### Solution
**If API key invalid:**
```bash
# Regenerate key in Google Cloud Console
# https://console.cloud.google.com/apis/credentials?project=gen-lang-client-0540794446
# Update .env
sudo nano /var/www/nordabiznes/.env
# GOOGLE_PAGESPEED_API_KEY=NEW_KEY
sudo systemctl restart nordabiznes
```
**If quota exceeded:**
```bash
# Wait until next day (25k/day limit)
# Check usage
# https://console.cloud.google.com/apis/api/pagespeedonline.googleapis.com/quotas
```
**If timeout:**
```bash
# Increase timeout in seo_audit.py
sudo nano /var/www/nordabiznes/scripts/seo_audit.py
# Find: timeout=30
# Change to: timeout=60
# Or run audits in smaller batches
python seo_audit.py --batch 1-10
# Wait 5 minutes between batches
```
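Splitting the audit into batches can be sketched as below. This is only an illustration, assuming company IDs are available as a list; `batches` is a hypothetical helper and the actual `--batch` handling in `seo_audit.py` may differ:

```python
def batches(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be >= 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


company_ids = list(range(1, 26))  # hypothetical IDs 1..25
for batch in batches(company_ids, 10):
    print(f"audit companies {batch[0]}-{batch[-1]}")
    # Run the audit for this batch here, then pause (e.g. time.sleep(300))
```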
---
### 5.4 Brave Search API Issues
**Severity:** LOW
#### Symptoms
- News monitoring returns no results
- Brave API 429 error
- Invalid search results
#### Diagnosis
```bash
# 1. Test Brave API directly
BRAVE_KEY=$(sudo -u www-data grep BRAVE_SEARCH_API_KEY /var/www/nordabiznes/.env | cut -d= -f2)
curl -H "X-Subscription-Token: $BRAVE_KEY" \
"https://api.search.brave.com/res/v1/news/search?q=test&count=5"
# 2. Estimate usage this month (counts stored news rows, a rough proxy for API calls)
sudo -u www-data psql -h localhost -U nordabiz_app -d nordabiz
SELECT COUNT(*) AS news_rows_this_month
FROM company_news
WHERE created_at >= DATE_TRUNC('month', CURRENT_DATE);
-- Free tier: 2,000/month
# 3. Check for error logs
sudo journalctl -u nordabiznes -n 100 | grep -i brave
```
#### Solution
**If quota exceeded (2000/month):**
```bash
# Wait until next month
# OR upgrade to paid tier
# Temporary: Disable news monitoring
# Comment out news fetch cron job
```
**If API key invalid:**
```bash
# Get new key from https://brave.com/search/api/
# Update .env
sudo nano /var/www/nordabiznes/.env
# BRAVE_SEARCH_API_KEY=NEW_KEY
sudo systemctl restart nordabiznes
```
---
## 6. Authentication & Security Issues
### 6.1 Cannot Login / Session Expired
**Severity:** MEDIUM
#### Symptoms
- "Invalid credentials" despite correct password
- Redirected to login immediately after logging in
- Session expires too quickly
- "CSRF token missing" error
#### Diagnosis
```bash
# 1. Check user exists and is active
ssh maciejpi@57.128.200.27
sudo -u www-data psql -h localhost -U nordabiz_app -d nordabiz
SELECT id, email, is_active, email_verified, failed_login_attempts
FROM users
WHERE email = 'user@example.com';
# 2. Check session configuration
grep -E "SECRET_KEY|PERMANENT_SESSION_LIFETIME" /var/www/nordabiznes/app.py
# 3. Check Flask logs for auth errors
sudo journalctl -u nordabiznes -n 100 | grep -i "login\|session\|auth"
# 4. Test from server (bypass network)
curl -c /tmp/cookies.txt -X POST http://localhost:5000/login \
-d "email=test@nordabiznes.pl&password=TEST_PASSWORD"
```
#### Common Issues & Solutions
**A. Account Locked (Failed Login Attempts)**
```bash
# Check failed attempts
sudo -u www-data psql -h localhost -U nordabiz_app -d nordabiz
SELECT email, failed_login_attempts, last_failed_login
FROM users
WHERE email = 'user@example.com';
# If >= 5 attempts, reset:
UPDATE users SET failed_login_attempts = 0
WHERE email = 'user@example.com';
```
**B. Email Not Verified**
```bash
# Check verification status
SELECT email, email_verified, verification_token, verification_token_expiry
FROM users
WHERE email = 'user@example.com';
# Force verify (for testing)
UPDATE users SET email_verified = TRUE
WHERE email = 'user@example.com';
```
**C. Session Cookie Not Persisting**
```bash
# Check cookie settings in app.py
grep -A 5 "SESSION_COOKIE" /var/www/nordabiznes/app.py
# Should have:
# SESSION_COOKIE_SECURE = True # HTTPS only
# SESSION_COOKIE_HTTPONLY = True # No JS access
# SESSION_COOKIE_SAMESITE = 'Lax' # CSRF protection
# If accessing via HTTP (not HTTPS), session won't work
# Ensure using https://nordabiznes.pl not http://
```
**D. CSRF Token Mismatch**
```bash
# Symptom: "400 Bad Request - CSRF token missing"
# Cause: Form submitted without CSRF token
# Fix: Ensure all forms have:
# {{ form.hidden_tag() }} # WTForms
# OR
# <input type="hidden" name="csrf_token" value="{{ csrf_token() }}">
# Check template
grep -r "csrf_token" /var/www/nordabiznes/templates/login.html
```
**E. Password Hash Algorithm Changed**
```bash
# Symptom: Old users can't login after upgrade
# Check hash format
SELECT id, email, SUBSTRING(password_hash, 1, 20)
FROM users
WHERE email = 'user@example.com';
# Should start with: pbkdf2:sha256: (or scrypt: if the hash was generated with Werkzeug 3.x defaults)
# If it uses another format, user needs password reset
# Send reset email via /forgot-password
```
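The prefix check can also be scripted. A minimal sketch, assuming hashes follow Werkzeug's `<method>$<salt>$<digest>` layout; `hash_method` and `needs_reset` are illustrative names, and the list of supported methods is an assumption to adjust for the installed Werkzeug version:

```python
def hash_method(password_hash):
    """Return the method part of a Werkzeug-style hash string.

    Werkzeug stores hashes as "<method>$<salt>$<digest>", where <method>
    may carry parameters, e.g. "pbkdf2:sha256:600000" or "scrypt:32768:8:1".
    """
    method, sep, _rest = password_hash.partition("$")
    if not sep:
        return None  # not in the expected "<method>$<salt>$<digest>" layout
    return method


def needs_reset(password_hash, supported=("pbkdf2:", "scrypt:")):
    """True when the stored hash uses a method outside `supported`."""
    method = hash_method(password_hash)
    return method is None or not method.startswith(supported)


print(needs_reset("pbkdf2:sha256:600000$salt$digest"))  # False
print(needs_reset("sha1$salt$digest"))                  # True
```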
#### Verification
```bash
# Test login flow
curl -c /tmp/cookies.txt -X POST http://localhost:5000/login \
-d "email=test@nordabiznes.pl&password=TEST_PASSWORD" \
-L -v
# Should see: Set-Cookie: session=...
# Should redirect to /dashboard
```
---
### 6.2 Unauthorized Access / Permission Denied
**Severity:** HIGH
#### Symptoms
- "403 Forbidden" error
- User can access pages they shouldn't
- Admin panel not accessible
#### Diagnosis
```bash
# 1. Check user role
sudo -u www-data psql -h localhost -U nordabiz_app -d nordabiz
SELECT id, email, is_admin, is_norda_member
FROM users
WHERE email = 'user@example.com';
# 2. Check route decorators (they follow the @app.route line, so show the lines after it)
grep -A 3 "@app.route('/admin" /var/www/nordabiznes/app.py
# Should have: @login_required and @admin_required
# 3. Check Flask logs
sudo journalctl -u nordabiznes -n 50 | grep -i "forbidden\|unauthorized"
```
#### Solution
**If user should be admin:**
```bash
# Grant admin role
sudo -u www-data psql -h localhost -U nordabiz_app -d nordabiz
UPDATE users SET is_admin = TRUE
WHERE email = 'admin@nordabiznes.pl';
```
**If authorization check broken:**
```bash
# Check app.py decorators
# Should have:
@app.route('/admin/users')
@login_required
@admin_required
def admin_users():
    ...
# Verify @admin_required is defined:
grep -A 5 "def admin_required" /var/www/nordabiznes/app.py
```
**If company ownership check failed:**
```bash
# Verify company-user association
SELECT c.id, c.name, u.email
FROM companies c
LEFT JOIN users u ON c.id = u.company_id
WHERE c.slug = 'company-slug';
# Update user's company
UPDATE users SET company_id = 123
WHERE email = 'user@example.com';
```
---
### 6.3 Password Reset Not Working
**Severity:** MEDIUM
#### Symptoms
- Password reset email not received
- Reset token expired or invalid
- "Invalid token" error
#### Diagnosis
```bash
# 1. Check user reset token
sudo -u www-data psql -h localhost -U nordabiz_app -d nordabiz
SELECT email, reset_token, reset_token_expiry
FROM users
WHERE email = 'user@example.com';
# 2. Check email service logs
sudo journalctl -u nordabiznes -n 100 | grep -i "email\|smtp"
# 3. Test MS Graph API (email service)
# Check if AZURE_CLIENT_ID, AZURE_CLIENT_SECRET, AZURE_TENANT_ID set
sudo -u www-data grep AZURE /var/www/nordabiznes/.env
```
#### Solution
**If token expired:**
```bash
# Tokens expire after 1 hour
# Generate new token via /forgot-password
# OR manually extend expiry:
UPDATE users SET reset_token_expiry = NOW() + INTERVAL '1 hour'
WHERE email = 'user@example.com';
```
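The expiry rule can be expressed as a small check. A minimal sketch, assuming `reset_token_expiry` is read as a timezone-aware datetime; `token_is_valid` is an illustrative name, not the application's own function:

```python
from datetime import datetime, timedelta, timezone

TOKEN_LIFETIME = timedelta(hours=1)  # matches the 1-hour expiry noted above


def token_is_valid(expiry, now=None):
    """Return True while `expiry` (timezone-aware datetime) is in the future."""
    now = now or datetime.now(timezone.utc)
    return expiry is not None and now < expiry


issued = datetime(2026, 1, 10, 12, 0, tzinfo=timezone.utc)
expiry = issued + TOKEN_LIFETIME
print(token_is_valid(expiry, now=issued + timedelta(minutes=59)))  # True
print(token_is_valid(expiry, now=issued + timedelta(minutes=61)))  # False
```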
**If email not sent:**
```bash
# Check MS Graph credentials
cd /var/www/nordabiznes
sudo -u www-data /var/www/nordabiznes/venv/bin/python3 << 'EOF'
from email_service import EmailService
service = EmailService()
result = service.send_email(
    to_email="test@example.com",
    subject="Test",
    body="Test email"
)
print(result)
EOF
# If fails, check Azure AD app registration
# Ensure "Mail.Send" permission granted
```
**Manual password reset (emergency):**
```bash
# Generate new password hash
sudo -u www-data /var/www/nordabiznes/venv/bin/python3 << 'EOF'
from werkzeug.security import generate_password_hash
password = "NewPassword123"  # Replace with a strong temporary password
print(generate_password_hash(password))
EOF
# Update database
sudo -u www-data psql -h localhost -U nordabiz_app -d nordabiz
UPDATE users SET password_hash = 'HASH_FROM_ABOVE'
WHERE email = 'user@example.com';
```
---
## 7. Performance Issues
### 7.1 Slow Page Load Times
**Severity:** MEDIUM
#### Symptoms
- Pages take >5 seconds to load
- TTFB (Time to First Byte) is high
- Browser shows "waiting for nordabiznes.pl..."
#### Diagnosis
```bash
# 1. Measure response time
time curl -I https://nordabiznes.pl/
# 2. Check Gunicorn worker status
ssh maciejpi@57.128.200.27
ps aux | grep gunicorn
# Look for: one master plus worker processes (for the recommended worker count, see A below)
# 3. Check server load
top -n 1
# Look at: CPU usage, memory usage, load average
# 4. Check database query times
sudo -u www-data psql -h localhost -U nordabiz_app -d nordabiz
SELECT calls, mean_exec_time, query
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;
-- If pg_stat_statements not enabled, see solution below
# 5. Profile Flask app
sudo journalctl -u nordabiznes -n 100 | grep -E "took|slow|timeout"
```
#### Root Causes & Solutions
**A. Too Few Gunicorn Workers**
```bash
# Current Gunicorn processes (the count includes the master, so workers = count - 1)
ps aux | grep gunicorn | grep -v grep | wc -l
# Recommended: (2 x CPU cores) + 1
# For 4 core VM: 9 workers
# Update systemd service
sudo nano /etc/systemd/system/nordabiznes.service
# Change:
ExecStart=/var/www/nordabiznes/venv/bin/gunicorn --workers=9 \
--bind 0.0.0.0:5000 --timeout 120 app:app
sudo systemctl daemon-reload
sudo systemctl restart nordabiznes
```
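The worker formula above is easy to compute on the server itself. A minimal sketch; `recommended_workers` is an illustrative helper, and the result is a rule of thumb rather than a guarantee for this workload:

```python
import os


def recommended_workers(cpu_cores=None):
    """Gunicorn rule of thumb: (2 x CPU cores) + 1."""
    cores = cpu_cores or os.cpu_count() or 1
    return 2 * cores + 1


print(recommended_workers(4))  # 9, matching the 4-core example above
```

If memory is tight (see Section 7.2), a lower count may be the better trade-off.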
**B. Slow Database Queries**
```bash
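# Enable query stats (if not enabled)
sudo -u postgres psql
ALTER SYSTEM SET shared_preload_libraries = 'pg_stat_statements';
# Restart PostgreSQL (required for shared_preload_libraries to take effect)
sudo systemctl restart postgresql
# Create the extension in the application database (once)
sudo -u postgres psql -d nordabiz -c "CREATE EXTENSION IF NOT EXISTS pg_stat_statements;"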
# Check slow queries
sudo -u www-data psql -h localhost -U nordabiz_app -d nordabiz
SELECT calls, mean_exec_time, query
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;
# Add indexes for slow queries (see Section 4.2)
```
**C. External API Timeouts**
```bash
# Check for API timeout logs
sudo journalctl -u nordabiznes -n 200 | grep -i timeout
# Common culprits:
# - Gemini API (text generation)
# - PageSpeed API (site audit)
# - Brave Search API
# Solution: Add caching
# Example: Cache PageSpeed results for 24 hours
# app.py modification (pseudocode):
# if last_audit < 24h ago:
#     return cached_result
# else:
#     fetch new audit
```
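The caching pseudocode above could look roughly like this in Python. It is a sketch under stated assumptions: `TTLCache` is a hypothetical in-process helper (each Gunicorn worker would hold its own copy), and `fake_audit` stands in for the real PageSpeed request:

```python
import time


class TTLCache:
    """Tiny in-memory cache whose entries expire after `ttl_seconds`."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}  # key -> (stored_at, value)

    def get_or_fetch(self, key, fetch):
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]  # still fresh: serve the cached value
        value = fetch()      # missing or expired: fetch and store again
        self._store[key] = (now, value)
        return value


audit_cache = TTLCache(ttl_seconds=24 * 3600)


def fake_audit():
    # Stand-in for the real PageSpeed request (hypothetical)
    return {"performance_score": 0.9}


first = audit_cache.get_or_fetch("https://nordabiznes.pl", fake_audit)
second = audit_cache.get_or_fetch("https://nordabiznes.pl", fake_audit)
print(first is second)  # True: the second call is served from the cache
```

For results that must survive restarts or be shared across workers, storing the audit timestamp in the database (as `audited_at` already does) is the more robust option.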
**D. Missing Static Asset Caching**
```bash
# Check cache headers
curl -I https://nordabiznes.pl/static/css/styles.css | grep -i cache
# Should have: Cache-Control: max-age=31536000
# If missing, add to NPM proxy or app.py (requires: from flask import request):
@app.after_request
def add_cache_header(response):
    if request.path.startswith('/static/'):
        response.cache_control.max_age = 31536000
    return response
```
**E. Large Database Result Sets**
```bash
# Check for N+1 queries or loading too much data
# Example bad query:
# for company in Company.query.all():  # Loads ALL companies!
#     print(company.name)
# Fix: Add pagination
# companies = Company.query.paginate(page=1, per_page=20)
```
#### Verification
```bash
# Test response time
for i in {1..5}; do
time curl -s -o /dev/null https://nordabiznes.pl/
done
# Should average < 500ms
```
---
### 7.2 High Memory Usage
**Severity:** MEDIUM
#### Symptoms
- Server OOM (Out of Memory) errors
- Swapping active (slow performance)
- Gunicorn workers killed by OOM killer
#### Diagnosis
```bash
# 1. Check memory usage
ssh maciejpi@57.128.200.27
free -h
# 2. Check which process using memory
ps aux --sort=-%mem | head -10
# 3. Check for memory leaks
# Monitor over time:
watch -n 5 'ps aux | grep gunicorn | awk "{sum+=\$6} END {print sum/1024 \" MB\"}"'
# 4. Check OOM killer logs
sudo dmesg | grep -i "out of memory\|oom"
```
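The `watch` one-liner above sums the RSS column of `ps aux` for Gunicorn processes. The same calculation as a Python sketch, useful when logging the value over time; `gunicorn_rss_mb` is an illustrative name and the sample output below is made up:

```python
def gunicorn_rss_mb(ps_output):
    """Sum the RSS column (6th field of `ps aux`, in KiB) for gunicorn rows."""
    total_kib = 0
    for line in ps_output.splitlines():
        fields = line.split(None, 10)  # the 11th field is the full command
        if len(fields) < 11 or "gunicorn" not in fields[10]:
            continue
        total_kib += int(fields[5])
    return total_kib / 1024


SAMPLE = """\
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
www-data  1201  0.1  1.2 250000 102400 ?       S    10:00   0:05 /var/www/nordabiznes/venv/bin/gunicorn --workers=4 app:app
www-data  1202  0.3  0.6 240000  51200 ?       S    10:00   0:09 /var/www/nordabiznes/venv/bin/gunicorn --workers=4 app:app
postgres   900  0.0  0.5 210000  40960 ?       Ss   09:00   0:01 /usr/lib/postgresql/bin/postgres
"""
print(f"{gunicorn_rss_mb(SAMPLE):.1f} MB")  # 150.0 MB
```

A total that keeps growing between restarts points to a leak rather than to an oversized worker count.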
#### Solution
**If Gunicorn workers too many:**
```bash
# Reduce workers
sudo nano /etc/systemd/system/nordabiznes.service
# Change: --workers=9 to --workers=4
sudo systemctl daemon-reload
sudo systemctl restart nordabiznes
```
**If memory leak in application:**
```bash
# Restart workers periodically
sudo nano /etc/systemd/system/nordabiznes.service
# Add: --max-requests=1000 --max-requests-jitter=100
# This restarts workers after 1000 requests
sudo systemctl daemon-reload
sudo systemctl restart nordabiznes
```
**If PostgreSQL using too much memory:**
```bash
# Check PostgreSQL memory settings
sudo -u postgres psql -c "SHOW shared_buffers;"
sudo -u postgres psql -c "SHOW work_mem;"
# Reduce if necessary
sudo nano /etc/postgresql/*/main/postgresql.conf
# shared_buffers = 256MB # Was 512MB
# work_mem = 4MB # Was 16MB
sudo systemctl restart postgresql
```
**If server needs more RAM:**
```bash
# Increase RAM by upgrading the OVH VPS plan
# OR add swap space
# Add 2GB swap
sudo fallocate -l 2G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
# Make permanent
echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab
```
---
### 7.3 High CPU Usage
**Severity:** MEDIUM
#### Symptoms
- CPU at 100% constantly
- Server load average > number of cores
- Slow response times
#### Diagnosis
```bash
# 1. Check CPU usage
ssh maciejpi@57.128.200.27
top -n 1
# Look for processes using >80% CPU
# 2. Check load average
uptime
# Load should be < number of CPU cores
# 3. Identify CPU-heavy queries
sudo -u www-data psql -h localhost -U nordabiz_app -d nordabiz
SELECT pid, now() - query_start AS duration, state, query
FROM pg_stat_activity
WHERE state != 'idle'
ORDER BY now() - query_start DESC;
```
#### Solution
**If database query CPU-intensive:**
```bash
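# Kill long-running query (try pg_cancel_backend first; it is less disruptive)
sudo -u www-data psql -h localhost -U nordabiz_app -d nordabiz
SELECT pg_cancel_backend(12345);     -- Replace 12345 with the pid from the diagnosis query
SELECT pg_terminate_backend(12345);  -- Use only if the cancel did not stop the query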
# Add index to optimize query
# See Section 4.2
```
**If AI chat overwhelming CPU:**
```bash
# Add rate limiting to chat endpoint
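# app.py modification (keyword arguments work with both Flask-Limiter 2.x and 3.x;
# current_user is assumed to come from Flask-Login):
from flask_limiter import Limiter
from flask_login import current_user
limiter = Limiter(key_func=lambda: current_user.id, app=app)
@app.route('/api/chat', methods=['POST'])
@limiter.limit("10 per minute")  # Add this
def chat_api():
    ...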
```
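The idea behind a limit such as "10 per minute" can be illustrated in plain Python. This sketch is not how Flask-Limiter is implemented; `SlidingWindowLimiter` is a hypothetical class that only shows the per-user sliding-window logic:

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` per `period_seconds` for each key."""

    def __init__(self, max_calls, period_seconds, clock=time.monotonic):
        self.max_calls = max_calls
        self.period = period_seconds
        self.clock = clock
        self._calls = defaultdict(deque)  # key -> timestamps of recent calls

    def allow(self, key):
        now = self.clock()
        calls = self._calls[key]
        # Drop timestamps that have left the window.
        while calls and now - calls[0] >= self.period:
            calls.popleft()
        if len(calls) >= self.max_calls:
            return False
        calls.append(now)
        return True


limiter = SlidingWindowLimiter(max_calls=10, period_seconds=60)
results = [limiter.allow("user-1") for _ in range(11)]
print(results.count(True), results[-1])  # 10 False
```

State kept in process memory is per worker, so with several Gunicorn workers a shared backend is needed for an exact limit.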
**If search causing high CPU:**
```bash
# Optimize search query
# Use indexes instead of ILIKE
# Cache search results
# Add to app.py (note: lru_cache never expires entries, so results stay cached until restart):
from functools import lru_cache
@lru_cache(maxsize=100)
def search_companies_cached(query):
    return search_companies(db, query)
```
---
## 8. Monitoring & Diagnostics
### 8.1 Health Check Endpoints
```bash
# Application health
curl https://nordabiznes.pl/health
# Expected response:
{
  "status": "healthy",
  "database": "connected",
  "timestamp": "2026-01-10T12:00:00Z"
}
# Database health
sudo -u www-data psql -h localhost -U nordabiz_app -d nordabiz -c "SELECT 1;"
# NPM health (from proxy server)
ssh maciejpi@10.22.68.250
docker ps | grep nginx-proxy-manager
# Should show: Up X hours
# Flask service health
ssh maciejpi@57.128.200.27
sudo systemctl status nordabiznes
# Should show: active (running)
```
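For automated monitoring, the `/health` response can be validated instead of inspected by eye. A minimal sketch, assuming the response body has the shape shown above; `is_healthy` is an illustrative helper:

```python
import json


def is_healthy(body):
    """Return True only for a well-formed, healthy /health response body."""
    try:
        payload = json.loads(body)
    except (TypeError, ValueError):
        return False  # empty, missing, or non-JSON body
    if not isinstance(payload, dict):
        return False
    return payload.get("status") == "healthy" and payload.get("database") == "connected"


print(is_healthy('{"status": "healthy", "database": "connected"}'))  # True
print(is_healthy('<html>502 Bad Gateway</html>'))                    # False
```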
---
### 8.2 Log Locations
```bash
# Flask application logs
sudo journalctl -u nordabiznes -n 100 --no-pager
# Follow live logs
sudo journalctl -u nordabiznes -f
# PostgreSQL logs
sudo journalctl -u postgresql -n 50
# NPM logs
ssh maciejpi@10.22.68.250
docker logs nginx-proxy-manager_app_1 --tail 50 -f
# System logs
sudo journalctl -n 100
# Nginx access logs (on backend)
sudo tail -f /var/log/nginx/access.log
sudo tail -f /var/log/nginx/error.log
```
---
### 8.3 Performance Metrics
```bash
# Response time monitoring
# Create script: /usr/local/bin/check_nordabiz_performance.sh
#!/bin/bash
RESPONSE_TIME=$(curl -w '%{time_total}\n' -o /dev/null -s https://nordabiznes.pl/health)
echo "Response time: ${RESPONSE_TIME}s"
if (( $(echo "$RESPONSE_TIME > 2" | bc -l) )); then
    echo "WARNING: Slow response time!"
fi
# Add to cron: */5 * * * * /usr/local/bin/check_nordabiz_performance.sh
```
```bash
# Database performance
sudo -u www-data psql -h localhost -U nordabiz_app -d nordabiz
-- Connection count
SELECT count(*) FROM pg_stat_activity;
-- Active queries
SELECT count(*) FROM pg_stat_activity WHERE state = 'active';
-- Cache hit ratio (should be > 99%)
SELECT
sum(heap_blks_hit) / (sum(heap_blks_hit) + sum(heap_blks_read)) * 100 AS cache_hit_ratio
FROM pg_statio_user_tables;
-- Table sizes
SELECT schemaname, tablename,
pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) AS size
FROM pg_tables
WHERE schemaname = 'public'
ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC;
```
---
### 8.4 Database Backup Verification
```bash
# Check last backup
ssh maciejpi@57.128.200.27
ls -laht /backup/nordabiz/ | head -10   # -t sorts newest first
# Expected: Daily backups (.sql files)
# Test restore (to test database)
sudo -u postgres createdb nordabiz_test
sudo -u postgres psql nordabiz_test < /backup/nordabiz/nordabiz_YYYY-MM-DD.sql
# Verify restore
sudo -u postgres psql nordabiz_test -c "SELECT count(*) FROM companies;"
# Cleanup
sudo -u postgres dropdb nordabiz_test
```
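Backup freshness can be checked automatically. A minimal sketch, assuming backups follow the `nordabiz_YYYY-MM-DD.sql` naming used above; the function names and the one-day threshold are illustrative:

```python
import re
from datetime import date

BACKUP_NAME = re.compile(r"^nordabiz_(\d{4})-(\d{2})-(\d{2})\.sql$")


def newest_backup_date(filenames):
    """Return the most recent date encoded in the backup file names, or None."""
    dates = []
    for name in filenames:
        match = BACKUP_NAME.match(name)
        if not match:
            continue
        try:
            dates.append(date(*map(int, match.groups())))
        except ValueError:
            continue  # name matched the pattern but is not a real date
    return max(dates) if dates else None


def backup_is_stale(filenames, today, max_age_days=1):
    """True when no backup exists or the newest one is older than max_age_days."""
    newest = newest_backup_date(filenames)
    return newest is None or (today - newest).days > max_age_days


files = ["nordabiz_2026-01-08.sql", "nordabiz_2026-01-09.sql", "notes.txt"]
print(backup_is_stale(files, today=date(2026, 1, 10)))  # False
print(backup_is_stale(files, today=date(2026, 1, 12)))  # True
```

In practice the file list would come from `os.listdir("/backup/nordabiz")`.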
---
## 9. Emergency Procedures
### 9.1 Complete Service Outage
**Severity:** CRITICAL
#### Immediate Actions (First 5 Minutes)
```bash
# 1. Verify outage scope
curl -I https://nordabiznes.pl/health
# If fails, proceed
# 2. Check from internal network
ssh maciejpi@57.128.200.27
curl -I http://localhost:5000/health
# If this works → Network/NPM issue
# If this fails → Application issue
# 3. Notify stakeholders
# Send email/message: "nordabiznes.pl experiencing outage, investigating"
# 4. Check service status
sudo systemctl status nordabiznes
sudo systemctl status postgresql
```
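The triage rule in step 2 can be written down explicitly. A minimal sketch; `classify_outage` is an illustrative helper that takes the results of the external and localhost health checks:

```python
def classify_outage(external_ok, internal_ok):
    """Map the two health-check results to the branch of this procedure to follow."""
    if external_ok:
        return "no outage"
    if internal_ok:
        # The app answers on localhost, so the fault is in front of it.
        return "network/NPM issue"
    return "application issue"


print(classify_outage(external_ok=False, internal_ok=True))   # network/NPM issue
print(classify_outage(external_ok=False, internal_ok=False))  # application issue
```

An application issue may still turn out to be a database problem, which the service status checks in step 4 help to distinguish.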
#### If Network/NPM Issue
```bash
# 1. Verify NPM is running
ssh maciejpi@10.22.68.250
docker ps | grep nginx-proxy-manager
# If not running:
docker start nginx-proxy-manager_app_1
# 2. Check NPM configuration
docker exec nginx-proxy-manager_app_1 \
sqlite3 /data/database.sqlite \
"SELECT id, forward_host, forward_port FROM proxy_host WHERE id = 27;"
# Must show: 27|57.128.200.27|5000
# 3. Check Fortigate NAT
# Access Fortigate admin panel
# Verify: 85.237.177.83:443 → 10.22.68.250:443
```
#### If Application Issue
```bash
# 1. Check Flask service
ssh maciejpi@57.128.200.27
sudo systemctl status nordabiznes
# If failed, check logs
sudo journalctl -u nordabiznes -n 50
# 2. Try restart
sudo systemctl restart nordabiznes
# If restart fails, check manually
cd /var/www/nordabiznes
sudo -u www-data /var/www/nordabiznes/venv/bin/python3 app.py
# Read error message
# 3. Common quick fixes:
# - Syntax error: git revert last commit
# - Database down: sudo systemctl start postgresql
# - Port conflict: sudo lsof -i :5000 && kill PID
```
#### If Database Issue
```bash
# 1. Check PostgreSQL
sudo systemctl status postgresql
# If stopped:
sudo systemctl start postgresql
# If start fails:
sudo journalctl -u postgresql -n 50
# 2. Check disk space
df -h
# If full, clean old backups/logs (see Section 4.3)
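# 3. Emergency: Restore from backup (destructive: drops the current database)
sudo systemctl stop nordabiznes
# If PostgreSQL still responds, save the current state first
sudo -u postgres pg_dump nordabiz > /tmp/nordabiz_pre_restore_$(date +%Y%m%d_%H%M%S).sql
sudo -u postgres dropdb nordabiz
sudo -u postgres createdb nordabiz
sudo -u postgres psql nordabiz < /backup/nordabiz/latest.sql
# Re-grant permissions to nordabiz_app afterwards (see Section 9.2)
sudo systemctl start nordabiznes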
```
---
### 9.2 Data Loss / Corruption
**Severity:** CRITICAL
#### Immediate Actions
```bash
# 1. STOP the application immediately
sudo systemctl stop nordabiznes
# 2. Create emergency backup of current state
sudo -u postgres pg_dump nordabiz > /tmp/nordabiz_emergency_$(date +%Y%m%d_%H%M%S).sql
# 3. Assess damage
sudo -u www-data psql -h localhost -U nordabiz_app -d nordabiz
-- Check table counts
SELECT 'companies' AS table, count(*) FROM companies
UNION ALL
SELECT 'users', count(*) FROM users
UNION ALL
SELECT 'ai_chat_conversations', count(*) FROM ai_chat_conversations;
-- Compare with expected counts (should have ~80 companies, etc.)
```
#### Recovery Procedures
**If recent corruption (< 24 hours ago):**
```bash
# Restore from last night's backup
sudo systemctl stop nordabiznes
sudo -u postgres dropdb nordabiz
sudo -u postgres createdb nordabiz
sudo -u postgres psql nordabiz < /backup/nordabiz/nordabiz_$(date -d yesterday +%Y-%m-%d).sql
# Re-grant permissions
sudo -u postgres psql nordabiz << 'EOF'
GRANT ALL PRIVILEGES ON DATABASE nordabiz TO nordabiz_app;
GRANT ALL ON ALL TABLES IN SCHEMA public TO nordabiz_app;
GRANT USAGE, SELECT ON ALL SEQUENCES IN SCHEMA public TO nordabiz_app;
EOF
sudo systemctl start nordabiznes
```
**If partial data loss:**
```bash
# Identify missing/corrupted records
sudo -u www-data psql -h localhost -U nordabiz_app -d nordabiz
-- Example: Find companies with NULL required fields
SELECT id, slug, name FROM companies WHERE name IS NULL;
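-- Restore specific table from backup
# pg_restore cannot read plain-text .sql dumps, so load the backup into a scratch database
sudo -u postgres createdb nordabiz_restore
sudo -u postgres psql nordabiz_restore < /backup/nordabiz/latest.sql
# Extract the table, review it, then apply the needed rows to nordabiz
sudo -u postgres pg_dump -t companies --data-only nordabiz_restore > /tmp/companies_restore.sql
sudo -u postgres dropdb nordabiz_restore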
```
---
### 9.3 Security Breach
**Severity:** CRITICAL
#### Immediate Actions (First 10 Minutes)
```bash
# 1. ISOLATE the server
ssh maciejpi@57.128.200.27
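# Block all incoming traffic except your IP
# (keep loopback open, otherwise local checks such as psql -h localhost stop working)
sudo iptables -A INPUT -i lo -j ACCEPT
sudo iptables -A INPUT -s YOUR_IP -j ACCEPT
sudo iptables -A INPUT -j DROP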
# 2. Create forensic copy
sudo -u postgres pg_dump nordabiz > /tmp/forensic_$(date +%Y%m%d_%H%M%S).sql
sudo tar czf /tmp/www_forensic.tar.gz /var/www/nordabiznes/
# 3. Check for unauthorized access
sudo -u www-data psql -h localhost -U nordabiz_app -d nordabiz
-- Check for new admin users
SELECT id, email, created_at, is_admin
FROM users
WHERE is_admin = TRUE
ORDER BY created_at DESC;
-- Check for recent logins
SELECT user_id, ip_address, created_at
FROM user_login_history
WHERE created_at > NOW() - INTERVAL '24 hours'
ORDER BY created_at DESC;
# 4. Check logs for suspicious activity
sudo journalctl -u nordabiznes --since "24 hours ago" | grep -iE "admin|delete|drop|unauthorized"
# 5. Notify stakeholders
# Email: "Security incident detected on nordabiznes.pl, investigating"
```
#### Investigation
```bash
# Check for SQL injection attempts
sudo journalctl -u nordabiznes --since "7 days ago" | grep -i "UNION\|DROP\|;--"
# Check for unauthorized file changes
sudo find /var/www/nordabiznes/ -type f -mtime -1 -ls
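# Check for backdoors (searches all Python files recursively, skipping the virtualenv)
sudo grep -rnE "eval\(|exec\(|os\.system|subprocess" --include='*.py' --exclude-dir=venv /var/www/nordabiznes/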
# Check database for malicious data
sudo -u www-data psql -h localhost -U nordabiz_app -d nordabiz
SELECT * FROM users WHERE email LIKE '%<script%';
SELECT * FROM companies WHERE description LIKE '%<script%';
```
#### Remediation
```bash
# 1. Change all passwords
# - Database passwords
# - Admin user passwords
# - API keys
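# 2. Revoke compromised sessions
sudo -u www-data psql -h localhost -U nordabiz_app -d nordabiz
DELETE FROM flask_sessions; -- Force all users to re-login (applies to server-side sessions)
# If sessions are stored in signed cookies (the Flask default), rotate SECRET_KEY instead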
# 3. Update all API keys
sudo nano /var/www/nordabiznes/.env
# Regenerate all keys in respective consoles
# 4. Patch vulnerability
# Based on investigation findings
# 5. Restore normal operation
sudo iptables -F # Clear firewall rules (flushes ALL rules; re-apply any pre-existing ruleset)
sudo systemctl restart nordabiznes
```
---
## 10. Diagnostic Commands Reference
### 10.1 Quick Health Checks
```bash
# Complete health check (run all at once)
echo "=== Application Health ===" && \
curl -I https://nordabiznes.pl/health && \
echo -e "\n=== Service Status ===" && \
ssh maciejpi@57.128.200.27 "sudo systemctl status nordabiznes --no-pager | head -5" && \
echo -e "\n=== Database Connection ===" && \
ssh maciejpi@57.128.200.27 "sudo -u www-data psql -h localhost -U nordabiz_app -d nordabiz -c 'SELECT count(*) FROM companies;'" && \
echo -e "\n=== Server Load ===" && \
ssh maciejpi@57.128.200.27 "uptime"
```
### 10.2 NPM Proxy Diagnostics
```bash
# NPM configuration check
ssh maciejpi@10.22.68.250 "docker exec nginx-proxy-manager_app_1 \
sqlite3 /data/database.sqlite \
\"SELECT id, domain_names, forward_host, forward_port, enabled FROM proxy_host WHERE id = 27;\""
# NPM logs (live)
ssh maciejpi@10.22.68.250 "docker logs nginx-proxy-manager_app_1 --tail 20 -f"
# NPM container status
ssh maciejpi@10.22.68.250 "docker ps | grep nginx-proxy-manager"
# Test backend from NPM server
ssh maciejpi@10.22.68.250 "curl -I http://57.128.200.27:5000/health"
```
### 10.3 Database Diagnostics
```bash
# Database quick stats
sudo -u www-data psql -h localhost -U nordabiz_app -d nordabiz << 'EOF'
SELECT 'Companies' AS metric, count(*) AS value FROM companies
UNION ALL SELECT 'Users', count(*) FROM users
UNION ALL SELECT 'Active sessions', count(*) FROM pg_stat_activity
UNION ALL SELECT 'DB size (MB)', pg_database_size('nordabiz')/1024/1024;
EOF
# Find slow queries
sudo -u www-data psql -h localhost -U nordabiz_app -d nordabiz << 'EOF'
SELECT pid, now() - query_start AS duration, query
FROM pg_stat_activity
WHERE state = 'active' AND now() - query_start > interval '2 seconds'
ORDER BY duration DESC;
EOF
# Check locks
sudo -u www-data psql -h localhost -U nordabiz_app -d nordabiz << 'EOF'
SELECT relation::regclass, mode, granted
FROM pg_locks
WHERE NOT granted;
EOF
```
### 10.4 Performance Diagnostics
```bash
# Response time test (10 requests)
for i in {1..10}; do
curl -w "Request $i: %{time_total}s\n" -o /dev/null -s https://nordabiznes.pl/
done
# Server resource usage
ssh maciejpi@57.128.200.27 "top -b -n 1 | head -20"
# Disk usage
ssh maciejpi@57.128.200.27 "df -h && echo -e '\n=== Top 10 Directories ===\n' && du -sh /* 2>/dev/null | sort -rh | head -10"
# Network connectivity
ping -c 5 nordabiznes.pl
traceroute nordabiznes.pl
# SSL certificate check
echo | openssl s_client -servername nordabiznes.pl -connect nordabiznes.pl:443 2>/dev/null | openssl x509 -noout -dates -subject
```
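The `notAfter` date printed by the certificate check can be turned into a days-remaining figure. A minimal sketch using only the standard library; `days_until_expiry` is an illustrative helper and the sample date is made up:

```python
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after, now=None):
    """`not_after` as printed by openssl, e.g. "Jan 31 00:00:00 2026 GMT"."""
    expires = datetime.fromtimestamp(ssl.cert_time_to_seconds(not_after), tz=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return (expires - now).days


reference = datetime(2026, 1, 1, tzinfo=timezone.utc)
print(days_until_expiry("Jan 31 00:00:00 2026 GMT", now=reference))  # 30
```

A negative result means the certificate has already expired.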
### 10.5 API Integration Diagnostics
```bash
# Test all external APIs
ssh maciejpi@57.128.200.27
cd /var/www/nordabiznes  # The .env paths below are relative to the app directory
# Gemini API
GEMINI_KEY=$(sudo -u www-data grep GEMINI_API_KEY .env | cut -d= -f2)
curl -s -H "x-goog-api-key: $GEMINI_KEY" \
"https://generativelanguage.googleapis.com/v1beta/models" | jq '.models[0].name'
# PageSpeed API
PAGESPEED_KEY=$(sudo -u www-data grep GOOGLE_PAGESPEED_API_KEY .env | cut -d= -f2)
curl -s "https://www.googleapis.com/pagespeedonline/v5/runPagespeed?url=https://nordabiznes.pl&key=$PAGESPEED_KEY" | jq '.lighthouseResult.categories.performance.score'
# Brave Search API
BRAVE_KEY=$(sudo -u www-data grep BRAVE_SEARCH_API_KEY .env | cut -d= -f2)
curl -s -H "X-Subscription-Token: $BRAVE_KEY" \
"https://api.search.brave.com/res/v1/web/search?q=test&count=1" | jq '.web.results[0].title'
# KRS API
curl -s "https://api-krs.ms.gov.pl/api/krs/OdpisAktualny/0000878913?rejestr=P&format=json" | jq '.odpis.dane.dzial1'
```
### 10.6 Git & Deployment Diagnostics
```bash
# Check current deployment version
ssh maciejpi@57.128.200.27 "cd /var/www/nordabiznes && git log --oneline -5"
# Check for uncommitted changes
ssh maciejpi@57.128.200.27 "cd /var/www/nordabiznes && git status"
# Check remote sync
ssh maciejpi@57.128.200.27 "cd /var/www/nordabiznes && git remote -v && git fetch && git status"
# Verify file permissions
ssh maciejpi@57.128.200.27 "ls -la /var/www/nordabiznes/ | head -10"
```
---
## Appendix: Related Documentation
- **System Architecture:** [01-system-context.md](01-system-context.md)
- **Container Diagram:** [02-container-diagram.md](02-container-diagram.md)
- **Deployment Architecture:** [03-deployment-architecture.md](03-deployment-architecture.md)
- **Network Topology:** [07-network-topology.md](07-network-topology.md)
- **Critical Configurations:** [08-critical-configurations.md](08-critical-configurations.md)
- **Security Architecture:** [09-security-architecture.md](09-security-architecture.md)
- **API Endpoints:** [10-api-endpoints.md](10-api-endpoints.md)
- **HTTP Request Flow:** [flows/06-http-request-flow.md](flows/06-http-request-flow.md)
- **Authentication Flow:** [flows/01-authentication-flow.md](flows/01-authentication-flow.md)
- **Incident Report:** [../../INCIDENT_REPORT_20260102.md](../../INCIDENT_REPORT_20260102.md)
---
**Document Status:** ✅ Complete
**Version:** 1.0
**Last Review:** 2026-01-10
---
## Maintenance Notes
**When to Update This Guide:**
1. After any production incident → Add to relevant section
2. When new features added → Add new troubleshooting scenarios
3. When infrastructure changes → Update diagnostic commands
4. Monthly review → Verify commands still work
5. After major version upgrades → Test all procedures
**Contribution Guidelines:**
- Keep solutions actionable (copy-paste commands when possible)
- Include expected output for diagnostic commands
- Reference related architecture docs
- Test all commands before adding
- Use consistent formatting (bash code blocks)
---
**End of Troubleshooting Guide**