# Norda Biznes Hub
**Production-ready Flask web application** providing a business directory and networking platform for members of the Norda Biznes association in Wejherowo and the surrounding area.
**🚀 Status:** LIVE in production since 2025-11-23
**🌐 URL:** https://nordabiznes.pl
**📊 Coverage:** 80 member companies (100% of Norda Biznes membership)
## Overview
Norda Biznes Hub is a **Flask-powered web platform** backed by PostgreSQL, featuring AI-driven search, detailed company profiles, and administrative tools for managing member data. The platform integrates Google Gemini AI for company recommendations, automated news monitoring via the Brave Search API, and SEO and social media auditing tools.
**Key Capabilities:**
- **Company Directory** - Complete catalog of 80 member companies with verified data
- **AI Chat Assistant** - Google Gemini 2.5 Flash-powered conversational search
- **Advanced Search** - Multi-mode search with FTS, fuzzy matching, and synonym expansion
- **Admin Panels** - News moderation, SEO audit, social media tracking, GBP/IT audits
- **User Authentication** - Secure login with email confirmation and role-based access
- **RESTful API** - JSON endpoints for programmatic access to company data
## Features
### User-Facing Features
#### 🏢 Company Directory & Catalog
- **80 member companies** with 100% coverage of Norda Biznes membership
- **6 business categories:** IT, Construction, Services, Production, Trade, Other
- **Comprehensive company profiles** with verified data (NIP, REGON, KRS)
- **Detailed information:** Contact details, services, competencies, social media presence
- **Website technical analysis** for each company
- **Data quality levels:** Basic, enhanced, and complete profiles
#### 🔍 Advanced Search & Discovery
- **Multi-mode search system** with AI-powered capabilities
- **Direct lookup:** NIP/REGON exact matching
- **Keyword search:** Company name, description, services, competencies
- **Synonym expansion:** Automatic keyword expansion (e.g., "strony" → www, web, portal)
- **Full-text search:** PostgreSQL FTS with tsvector indexing
- **Fuzzy matching:** pg_trgm for typo tolerance
- **Weighted scoring:** Prioritized results by relevance
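The synonym expansion step listed above can be sketched as follows. This is only an illustration: `SYNONYMS` and `expand_query` are hypothetical names, not taken from `search_service.py`, and only the "strony" entry comes from this README.

```python
# Hypothetical synonym table; only the "strony" entry mirrors the README example.
SYNONYMS = {
    "strony": ["www", "web", "portal"],
}


def expand_query(query: str, synonyms: dict[str, list[str]] = SYNONYMS) -> list[str]:
    """Return the query terms plus their synonyms, in order and without duplicates."""
    expanded: list[str] = []
    for term in query.lower().split():
        # Keep the original term first, then append any known synonyms.
        for candidate in [term, *synonyms.get(term, [])]:
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


print(expand_query("Strony internetowe"))
# ['strony', 'www', 'web', 'portal', 'internetowe']
```

The expanded term list would then feed the keyword, full-text, and fuzzy stages; how the real service combines and weights them is not shown here.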
#### 💬 AI Chat Assistant
- **Conversational AI** powered by Google Gemini 2.5 Flash
- **Context-aware recommendations** for company discovery
- **Multi-turn conversations** with full history tracking
- **Answer questions** about member companies and services
- **Find companies** by service, competency, or business need
- **Quality assurance:** 15 test cases with 70% pass threshold
- **Rate limiting:** 200 requests/day, 50 requests/hour
#### 🔐 User Authentication & Accounts
- **Secure registration** with email confirmation
- **User profiles** with company affiliation tracking
- **Password reset** functionality
- **Session management** with Flask-Login
- **CSRF protection** on all forms
- **Role-based access** (user vs admin)
#### 🔔 Notifications System
- **Real-time notifications** for user activity
- **Notification types:** New news, news approved/rejected
- **JSON API** for programmatic access
- **Unread count** tracking
### Admin Features (Requires Admin Access)
#### 📰 News Monitoring & Moderation
- **Automated news monitoring** via Brave Search API
- **AI-powered relevance scoring** (0.0-1.0) with Google Gemini
- **News categorization:** News mentions, press releases, awards
- **Admin moderation dashboard** with approve/reject workflow
- **Bulk actions** for efficient moderation
- **Display on company profiles** for approved news
#### 📱 Social Media Audit
- **Track 6 platforms:** Facebook, Instagram, LinkedIn, YouTube, TikTok, Twitter/X
- **Profile verification** and validation
- **Activity monitoring:** Last checked, followers count
- **Missing profile identification** for outreach opportunities
- **Current coverage:** 115 profiles across 53 companies (66% coverage)
- **Batch audit processing** for efficiency
#### 🎯 SEO Dashboard & Audit
- **Comprehensive SEO audits** using Google PageSpeed Insights API
- **Performance scoring** (0-100) for each company website
- **Accessibility scoring** for WCAG compliance
- **Best practices scoring** for web standards
- **On-page SEO analysis:** Meta tags, headings, images
- **Technical SEO checks:** robots.txt, sitemap, canonical tags
- **Historical tracking** of SEO improvements
- **Batch processing** with rate limiting (25,000 req/day)
#### 🏪 Google Business Profile (GBP) Audit
- **Field-by-field completeness checking**
- **Weighted scoring algorithm** (100 points total)
- **AI-powered recommendations** via Google Gemini
- **Historical tracking** of profile improvements
- **Photo requirements analysis**
#### 💻 IT Infrastructure Audit
- **Security posture scoring** (50% weight)
- **Collaboration readiness assessment** (30% weight)
- **Completeness scoring** (20% weight)
- **Maturity level classification**
- **Cross-company collaboration matching**
- **Security elements:** EDR, MFA, Firewall, Backup, DR, VPN, Monitoring
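The 50/30/20 weighting above can be expressed as a small function. This is a sketch under assumptions: the 0-100 scale for each sub-score and the name `overall_it_score` are hypothetical, and the actual logic in `it_audit_service.py` may differ.

```python
# Weights come from the list above; the 0-100 sub-score scale is an assumption.
WEIGHTS = {"security": 0.5, "collaboration": 0.3, "completeness": 0.2}


def overall_it_score(security: float, collaboration: float, completeness: float) -> float:
    """Combine three 0-100 sub-scores into one weighted 0-100 score."""
    scores = {
        "security": security,
        "collaboration": collaboration,
        "completeness": completeness,
    }
    for name, value in scores.items():
        if not 0 <= value <= 100:
            raise ValueError(f"{name} score must be between 0 and 100, got {value}")
    return sum(WEIGHTS[name] * value for name, value in scores.items())


# 0.5 * 80 + 0.3 * 60 + 0.2 * 100
print(overall_it_score(security=80, collaboration=60, completeness=100))
```

Because the weights sum to 1.0, the combined score stays on the same scale as its inputs.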
#### 💬 Forum Management
- **Discussion forum** for member engagement
- **Topic and reply moderation**
- **Community discussions**
#### 📅 Calendar Management
- **Events and meetings calendar**
- **Event creation and management**
- **Member calendar access**
#### 👥 User Management
- **User account administration**
- **Permission management**
- **Admin role assignment**
### Technical Features
#### 🔌 RESTful API
- **JSON endpoints** for programmatic access:
  - `GET /api/companies` - Companies list
  - `GET /api/verify-nip` - NIP verification
  - `GET /api/notifications` - User notifications
  - `GET /health` - Health check
- **Authentication required** for protected endpoints
#### 🛡️ Security & Rate Limiting
- **Flask-Login** for session management
- **Flask-WTF** for CSRF protection on all forms
- **Flask-Limiter** for rate limiting (200 req/day, 50 req/hour)
- **Secure password hashing**
- **SQL injection protection** via SQLAlchemy ORM
- **XSS protection** via Jinja2 auto-escaping
- **API key management** via environment variables
#### 🗄️ Database Architecture
- **PostgreSQL** primary database with 20+ tables
- **SQLAlchemy 2.0** ORM with advanced features
- **Full-text search** with tsvector indexing
- **Fuzzy matching** with pg_trgm extension
- **JSONB support** for flexible data storage
- **Migration system** for schema versioning
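To give a feel for how trigram-based fuzzy matching tolerates typos, here is a pure-Python approximation of the similarity measure described in the pg_trgm documentation. In the application PostgreSQL computes this itself; the function names below are illustrative, and edge cases may differ from the extension.

```python
import re


def trigrams(text: str) -> set[str]:
    """Build a trigram set as the pg_trgm docs describe: lowercase, split into
    alphanumeric words, pad each word with two leading spaces and one trailing."""
    result: set[str] = set()
    for word in re.findall(r"[^\W_]+", text.lower()):
        padded = f"  {word} "
        result.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return result


def similarity(a: str, b: str) -> float:
    """Shared trigrams divided by all distinct trigrams, from 0.0 to 1.0."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


print(sorted(trigrams("cat")))
print(similarity("wejherowo", "wejherwo"))  # a typo still scores well above 0
```

A query with a dropped letter shares most of its trigrams with the correct spelling, which is why it can still match.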
#### 🌍 Multi-Environment Deployment
- **Development:** PostgreSQL via Docker (localhost:5433)
- **Production:** NORDABIZ-01 server (VM 249, 10.22.68.249)
- **Reverse proxy:** NPM on R11-REVPROXY-01
- **SSL/TLS:** Let's Encrypt with auto-renewal
- **Domain:** nordabiznes.pl (DNS in OVH)
- **WSGI:** Gunicorn production server
#### 🔗 External API Integrations
- **Google Gemini AI** - Conversational AI (200 req/day)
- **Google PageSpeed Insights** - SEO audits (25,000 req/day)
- **Google Maps/Places** - Business verification
- **Microsoft Graph API** - Email service
- **KRS Open API** - Polish business registry
- **Brave Search API** - News monitoring (2,000 req/month)
#### ✅ Data Verification & Quality Control
- **NIP verification** via KRS API
- **REGON validation**
- **KRS number verification**
- **Data completeness checks**
- **Quality levels:** Basic, enhanced, complete
- **Automated verification scripts**
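Before calling an external registry, identifiers can be screened locally using the standard NIP and 9-digit REGON check-digit rules. The sketch below is not the project's verification script; the function names are hypothetical, and a passing checksum only means the number is well-formed, not that it belongs to a real company.

```python
NIP_WEIGHTS = (6, 5, 7, 2, 3, 4, 5, 6, 7)
REGON9_WEIGHTS = (8, 9, 2, 3, 4, 5, 6, 7)


def _digits(value: str) -> list[int]:
    """Strip spaces and hyphens; return the digits, or an empty list if invalid."""
    cleaned = value.replace("-", "").replace(" ", "")
    if not (cleaned.isascii() and cleaned.isdigit()):
        return []
    return [int(ch) for ch in cleaned]


def is_valid_nip(nip: str) -> bool:
    """Check the 10-digit NIP: weighted sum of the first 9 digits mod 11."""
    digits = _digits(nip)
    if len(digits) != 10:
        return False
    checksum = sum(w * d for w, d in zip(NIP_WEIGHTS, digits)) % 11
    # A remainder of 10 can never equal a single check digit, so it is rejected.
    return checksum == digits[9]


def is_valid_regon9(regon: str) -> bool:
    """Check the 9-digit REGON: weighted sum mod 11, where 10 maps to 0."""
    digits = _digits(regon)
    if len(digits) != 9:
        return False
    checksum = sum(w * d for w, d in zip(REGON9_WEIGHTS, digits)) % 11
    return (0 if checksum == 10 else checksum) == digits[8]


print(is_valid_nip("123-456-32-18"))  # True
print(is_valid_regon9("123456785"))   # True
```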
#### 🧪 AI Quality Testing Framework
- **Automated AI response evaluation**
- **15 test cases** in 8 business categories
- **70% pass threshold** for quality assurance
- **JSON-based test configuration**
- **Detailed evaluation reports**
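The 70% pass threshold amounts to a simple ratio check. A minimal sketch, assuming each test case yields a pass/fail boolean and that meeting the threshold exactly counts as a pass (`suite_passes` is a hypothetical name, not the API of `ai_quality_evaluator.py`):

```python
def suite_passes(results: list[bool], threshold: float = 0.70) -> bool:
    """Return True when the share of passing test cases meets the threshold."""
    if not results:
        raise ValueError("no test results to evaluate")
    return sum(results) / len(results) >= threshold


# With 15 test cases, 11 passes (about 73%) clear the bar; 10 (about 67%) do not.
print(suite_passes([True] * 11 + [False] * 4))  # True
print(suite_passes([True] * 10 + [False] * 5))  # False
```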
## Project Structure
```
nordabiz/
├── app.py # Main Flask application (routes, auth, API)
├── database.py # SQLAlchemy models (20+ tables)
├── gemini_service.py # Google Gemini AI integration
├── nordabiz_chat.py # AI chat engine with company context
├── search_service.py # Unified search (FTS, fuzzy, synonyms)
├── email_service.py # Microsoft Graph email integration
├── krs_api_service.py # Polish business registry API
├── gbp_audit_service.py # Google Business Profile audit
├── it_audit_service.py # IT infrastructure audit
├── templates/ # Jinja2 HTML templates (30+ files)
│ ├── base.html # Base template with navigation and auth
│ ├── index.html # Company directory listing
│ ├── company_detail.html # Detailed company profile page
│ ├── chat.html # AI chat interface
│ ├── admin/ # Admin dashboards and tools (15 files)
│ ├── auth/ # Authentication flows (5 files)
│ ├── forum/ # Community forum (3 files)
│ ├── calendar/ # Events calendar (4 files)
│ ├── messages/ # Private messaging (4 files)
│ └── errors/ # Error pages (404, 500)
├── static/ # Static assets (CSS, images)
│ ├── css/ # Fluent Design stylesheets
│ │ ├── microsoft-fluent.css
│ │ └── fluent-nordabiz.css
│ └── img/companies/ # Company logos (82 images)
├── database/ # Database schemas and migrations
│ ├── schema.sql # Main PostgreSQL schema
│ ├── README.md # Database documentation
│ └── migrations/ # Versioned migrations (6 files)
├── scripts/ # Production automation scripts
│ ├── seo_audit.py # SEO audit tool (PageSpeed Insights)
│ ├── social_media_audit.py # Social media presence audit
│ ├── seo_analyzer.py # On-page SEO analysis
│ └── company-data-collector.js # Node.js web scraper
├── tests/ # Test suite (7 files)
│ ├── ai_quality_evaluator.py # AI testing framework
│ ├── ai_quality_test_cases.json # 15 test cases in 8 categories
│ ├── test_admin_seo_dashboard.py
│ ├── test_gbp_audit_field_checks.py
│ ├── test_it_audit_collaboration.py
│ └── test_social_media_audit.py
├── data/ # Source data files
│ ├── companies-basic.json # 80 company profiles
│ └── data-sources.md # Data source documentation
├── requirements.txt # Python dependencies (13 packages)
├── docker-compose.yml # PostgreSQL development database
├── .env.example # Environment variables template
├── CLAUDE.md # Project documentation for AI
└── deployment_checklist.md # Production deployment guide
```
## Development Environment Setup
Follow these steps to set up the development environment on your local machine.
### Prerequisites
- **Python 3.9+** - Core programming language
- **PostgreSQL 15+** - Database (via Docker or native installation)
- **Git** - Version control
- **Docker** (optional) - For containerized PostgreSQL
### Installation Steps
#### 1. Clone the Repository
```bash
git clone https://github.com/pienczyn/nordabiz.git
cd nordabiz
```
#### 2. Create Virtual Environment
```bash
# Create virtual environment
python3 -m venv venv
# Activate virtual environment
# On macOS/Linux:
source venv/bin/activate
# On Windows:
venv\Scripts\activate
```
#### 3. Install Python Dependencies
```bash
pip install -r requirements.txt
```
This will install all required packages including:
- Flask 3.0.0 (web framework)
- SQLAlchemy 2.0.23 (ORM)
- psycopg2-binary 2.9.9 (PostgreSQL adapter)
- google-generativeai 0.3.2 (Gemini AI)
- Flask-Login, Flask-WTF, Flask-Limiter (security)
- And more (see `requirements.txt` for complete list)
#### 4. Set Up PostgreSQL Database
**Option A: Using Docker (Recommended for Development)**
```bash
# Start PostgreSQL container
docker compose up -d
# Verify container is running
docker ps | grep nordabiz-postgres
```
This creates a PostgreSQL 15 database on `localhost:5433` with:
- Database: `nordabiz`
- User: `nordabiz_app`
- Password: `dev_password`
**Option B: Native PostgreSQL Installation**
```bash
# Install PostgreSQL (example for Ubuntu/Debian)
sudo apt install postgresql-15 postgresql-contrib-15
# Create database and user
sudo -u postgres psql
```
```sql
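-- Create the role first so that it can own the database
CREATE USER nordabiz_app WITH PASSWORD 'your_password_here';
-- Ownership matters on PostgreSQL 15+, where non-owners can no longer create tables in the public schema by default
CREATE DATABASE nordabiz OWNER nordabiz_app;
\q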
```
#### 5. Initialize Database Schema
```bash
# Apply schema (Docker setup)
docker exec -i nordabiz-postgres psql -U nordabiz_app -d nordabiz < database/schema.sql
# Or for native PostgreSQL
psql -U nordabiz_app -d nordabiz -h localhost < database/schema.sql
```
#### 6. Configure Environment Variables
Create a `.env` file in the project root:
```bash
cp .env.example .env
```
Edit `.env` and configure the following variables:
```bash
# Flask Configuration
SECRET_KEY=your-super-secret-key-change-this
FLASK_ENV=development
# Server Configuration
PORT=5000
HOST=0.0.0.0
# Database Configuration (for Docker setup)
DATABASE_URL=postgresql://nordabiz_app:dev_password@localhost:5433/nordabiz
# Google Gemini API (required for AI chat)
GOOGLE_GEMINI_API_KEY=your_gemini_api_key_here
# Google PageSpeed Insights API (optional, for SEO audits)
GOOGLE_PAGESPEED_API_KEY=your_pagespeed_api_key_here
# Google Places API (optional, for GBP audits)
GOOGLE_PLACES_API_KEY=your_places_api_key_here
# Email Configuration (optional, for user verification)
MAIL_SERVER=smtp.gmail.com
MAIL_PORT=587
MAIL_USE_TLS=True
MAIL_USERNAME=your_email@gmail.com
MAIL_PASSWORD=your_app_password_here
MAIL_DEFAULT_SENDER=noreply@norda-biznes.info
# Application URLs
APP_URL=http://localhost:5000
VERIFY_EMAIL_URL=http://localhost:5000/verify-email
```
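A startup check can catch a missing variable before the first request fails. The sketch below reads the variables listed above from the process environment. Treating `SECRET_KEY` and `DATABASE_URL` as the only required ones is an assumption, and `load_config` is a hypothetical helper, not a function from `app.py`.

```python
import os
from typing import Mapping

# Assumption: these two are mandatory; the API keys are treated as optional.
REQUIRED = ("SECRET_KEY", "DATABASE_URL")


def load_config(env: Mapping[str, str] = os.environ) -> dict:
    """Validate required variables and apply defaults for the optional ones."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError(
            f"Missing required environment variables: {', '.join(missing)}"
        )
    return {
        "SECRET_KEY": env["SECRET_KEY"],
        "DATABASE_URL": env["DATABASE_URL"],
        "PORT": int(env.get("PORT", "5000")),
        "HOST": env.get("HOST", "0.0.0.0"),
        "GOOGLE_GEMINI_API_KEY": env.get("GOOGLE_GEMINI_API_KEY"),
    }
```

Since python-dotenv is among the dependencies, calling `load_dotenv()` before `load_config()` would populate `os.environ` from the `.env` file.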
**Getting API Keys:**
- **Gemini API**: https://ai.google.dev/ (free tier: 200 requests/day)
- **PageSpeed Insights**: https://developers.google.com/speed/docs/insights/v5/get-started (free tier: 25,000 requests/day)
- **Google Places**: https://console.cloud.google.com/apis/credentials (free tier: $200/month credit)
#### 7. Run the Application
```bash
# Ensure virtual environment is activated
source venv/bin/activate # or venv\Scripts\activate on Windows
# Run Flask development server
python3 app.py
```
The application starts on `http://localhost:5000`.
**Default ports:**
- Flask app: `5000` (set `PORT=5001` in `.env` if 5000 is occupied)
- PostgreSQL (Docker): `5433` (mapped from container's 5432)
#### 8. Verify Installation
Open your browser and navigate to:
- **Main app**: http://localhost:5000
- **Health check**: http://localhost:5000/health
- **API test**: http://localhost:5000/api/companies
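The same checks can be scripted instead of opened in a browser. A minimal sketch using only the standard library; `check_endpoint` is a hypothetical helper, and the expected status codes depend on how the application is configured (some endpoints may require authentication).

```python
import urllib.error
import urllib.request


def check_endpoint(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code for url, or 0 if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        return exc.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return 0  # connection refused, DNS failure, timeout, and similar


if __name__ == "__main__":
    for path in ("/", "/health", "/api/companies"):
        print(path, check_endpoint(f"http://localhost:5000{path}"))
```

A status of `0` usually means the development server is not running or is listening on a different port.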
### Development Workflow
```bash
# Start PostgreSQL (if using Docker)
docker compose up -d
# Activate virtual environment
source venv/bin/activate
# Run application
python3 app.py
# In another terminal: Run tests
python -m pytest tests/
# Stop PostgreSQL when done
docker compose down
```
### Troubleshooting
**Database Connection Issues:**
```bash
# Check PostgreSQL is running (Docker)
docker ps | grep nordabiz-postgres
# Check PostgreSQL logs
docker logs nordabiz-postgres
# Restart PostgreSQL container
docker compose restart
```
**Port Already in Use:**
```bash
# Change PORT in .env file
PORT=5001
# Or stop conflicting service
lsof -ti:5000 | xargs kill -9
```
**Missing Dependencies:**
```bash
# Reinstall all dependencies
pip install -r requirements.txt --force-reinstall
```
**Database Schema Issues:**
```bash
# Reset database (Docker)
docker compose down -v # WARNING: Deletes all data!
docker compose up -d
docker exec -i nordabiz-postgres psql -U nordabiz_app -d nordabiz < database/schema.sql
```
For more detailed database setup and management, see `database/README.md`.
## Roadmap & Future Enhancements
While the platform is fully functional and in production, the following features are planned for future releases:
### Enhanced Company Profiles
- [ ] Photo galleries for company showcases
- [ ] Product/service catalog with detailed descriptions
- [ ] Company achievements and awards timeline
- [ ] Employee directory and team profiles
### Networking & Collaboration
- [ ] Direct messaging system between member companies
- [ ] Collaboration opportunities board
- [ ] Joint project proposals and consortium building
- [ ] Recommendation and referral system
### B2B Marketplace
- [ ] Dedicated marketplace for B2B offers
- [ ] Business partner matching algorithm
- [ ] RFP/RFQ posting and bidding system
- [ ] Contract management tools
### Analytics & Reporting
- [ ] Company analytics dashboard
- [ ] Member engagement metrics
- [ ] Platform usage statistics
- [ ] Custom reporting tools
### Integration & Automation
- [ ] Newsletter system for member updates
- [ ] CRM integration capabilities
- [ ] Automated data enrichment workflows
- [ ] Third-party service integrations
**Note:** Features are prioritized based on member feedback and business value. See `CLAUDE.md` for detailed roadmap planning.
## Technology Stack
### Backend
- **Python** 3.9+ - Core programming language
- **Flask** 3.0.0 - Web application framework
- **SQLAlchemy** 2.0.23 - ORM and database abstraction layer
- **PostgreSQL** - Primary relational database (production and development)
- **Gunicorn** - WSGI HTTP server for production deployment
### Security & Authentication
- **Flask-Login** 0.6.3 - User session management and authentication
- **Flask-WTF** 1.2.1 - CSRF protection and form validation
- **Flask-Limiter** 3.5.0 - Rate limiting (200 req/day, 50 req/hour)
### AI & Machine Learning
- **Google Gemini AI** (google-generativeai 0.3.2)
  - Models: gemini-2.5-flash (default), gemini-2.5-flash-lite, gemini-2.5-pro
  - Features: Multi-turn conversations, context-aware recommendations, AI-powered search
  - Limits: Free tier (200 requests/day)
### External API Integrations
1. **Google Gemini AI** - Conversational AI and company recommendations
2. **Google PageSpeed Insights** - SEO and performance analysis (25,000 req/day)
3. **Google Maps/Places** - Business verification and geocoding
4. **Microsoft Graph API** - Email service (OAuth2 with MSAL)
5. **KRS Open API** - Polish business registry data
6. **Brave Search API** - News monitoring and company mentions (2,000 req/month)
### Frontend
- **Jinja2** - Server-side HTML template rendering (30+ templates)
- **CSS3** - Custom styling with Fluent Design System inspiration
- **Vanilla JavaScript** (ES6+) - Dynamic UI interactions, AJAX, form validation
- **No external frameworks** - Custom UI components (modals, toasts, cards, tables)
### SEO & Web Analysis
- **BeautifulSoup4** 4.12.3 - HTML parsing and meta tag extraction
- **lxml** 5.1.0 - Fast XML/HTML processing with XPath support
- **python-whois** 0.9.4 - Domain information and WHOIS lookup
### Infrastructure & Deployment
- **Nginx Proxy Manager** - Reverse proxy on R11-REVPROXY-01 (10.22.68.250)
- **Let's Encrypt** - SSL/TLS certificates with auto-renewal
- **Docker** - PostgreSQL container for local development
- **systemd** - Service management (nordabiznes.service)
- **Git** - Version control (GitHub + Gitea internal)
### Database
- **PostgreSQL** - Primary database with advanced features:
  - Full-text search (FTS) with tsvector
  - Fuzzy matching with pg_trgm extension
  - JSONB for flexible data storage
  - 20+ tables (Company, User, Chat, News, Social Media, SEO, etc.)
  - psycopg2-binary 2.9.9 adapter
### Utilities
- **Flask-Mail** 0.9.1 - Email functionality
- **requests** 2.31.0 - HTTP client for external API calls
- **feedparser** 6.0.10 - RSS/Atom feed parsing
- **python-dotenv** 1.0.0 - Environment variable management
## Norda Biznes Contact Details
- **Address**: ul. 12 Marca 238/5, 84-200 Wejherowo
- **Email**: biuro@norda-biznes.info
- **Phone**: +48 729 716 400
- **Web**: https://norda-biznes.info
## Development
The project is ready to be extended. Subsequent phases may include:
1. Backend (Node.js, Python, PHP)
2. Database (PostgreSQL, MongoDB)
3. Authorization and authentication
4. API for integration with other systems
5. Mobile application
## License
Project created for Norda Biznes - Regionalna Izba Przedsiębiorców (Regional Chamber of Entrepreneurs)