# AI Chat Flow

**Document Version:** 1.0
**Last Updated:** 2026-01-10
**Status:** Production LIVE
**Flow Type:** AI-Powered Company Discovery & Chat

---
## Overview

This document describes the **complete AI chat flow** for the Norda Biznes Partner application, covering:

- **Chat Interface** (`/chat` route)
- **Conversation Management** (start, message, history)
- **Context Building** with the full company database
- **Gemini API Integration** for intelligent responses
- **Cost Tracking** and performance metrics
- **Search Integration** for company discovery

**Key Technology:**

- **AI Model:** Google Gemini 2.5 Flash (`gemini-2.5-flash`)
- **Chat Engine:** `NordaBizChatEngine` (`nordabiz_chat.py`)
- **Gemini Service:** Centralized `GeminiService` (`gemini_service.py`)
- **Search Integration:** Unified `SearchService` (`search_service.py`)
- **Database:** PostgreSQL (conversations, messages, companies)

**Key Features:**

- Full company database context (all 80 companies available to the AI)
- Multi-turn conversation with history (last 10 messages)
- Intelligent company selection by the AI (no pre-filtering)
- Real-time cost tracking (tokens, latency, theoretical cost)
- Free tier usage monitoring (1,500 requests/day limit)
- Compact data format to minimize token usage

**Cost & Performance:**

- **Model:** Gemini 2.5 Flash
- **Pricing:** $0.075/$0.30 per 1M tokens (input/output)
- **Free Tier:** 1,500 requests/day, unlimited tokens
- **Typical Response:** 200-400ms latency, 5,000-15,000 tokens
- **Actual Cost:** $0.00 (free tier)
- **Theoretical Cost:** $0.0008-0.0015 per message (see Section 11.1)

---
## 1. High-Level Chat Flow

### 1.1 Complete Chat Flow Diagram

```mermaid
flowchart TD
    User[User] -->|1. Navigate to /chat| Browser[Browser]
    Browser -->|2. GET /chat| Flask[Flask App<br/>app.py]
    Flask -->|3. Require login| AuthCheck{Authenticated?}

    AuthCheck -->|No| Login[Redirect to /login]
    AuthCheck -->|Yes| ChatUI[Render chat.html]

    ChatUI -->|4. Load UI| Browser
    Browser -->|5. POST /api/chat/start| Flask
    Flask -->|6. Create conversation| ChatEngine[NordaBizChatEngine<br/>nordabiz_chat.py]
    ChatEngine -->|7. INSERT| ConvDB[(ai_chat_conversations)]

    ConvDB -->|8. conversation_id| ChatEngine
    ChatEngine -->|9. Return conversation| Flask
    Flask -->|10. JSON response| Browser

    Browser -->|11. User types message| UserInput[User Message]
    UserInput -->|12. POST /api/chat/:id/message| Flask

    Flask -->|13. Verify ownership| DB[(PostgreSQL)]
    Flask -->|14. send_message| ChatEngine

    ChatEngine -->|15. Save user message| MsgDB[(ai_chat_messages)]
    ChatEngine -->|16. Build context| ContextBuilder[Context Builder<br/>_build_conversation_context]

    ContextBuilder -->|17. Load ALL companies| DB
    ContextBuilder -->|18. Load last 10 messages| MsgDB
    ContextBuilder -->|19. Compact format| Context[Full Context<br/>JSON]

    Context -->|20. Query AI| GeminiService[Gemini Service<br/>gemini_service.py]
    GeminiService -->|21. API call| GeminiAPI[Google Gemini API<br/>gemini-2.5-flash]

    GeminiAPI -->|22. AI response| GeminiService
    GeminiService -->|23. Track cost| CostDB[(ai_api_costs)]
    GeminiService -->|24. Response text| ChatEngine

    ChatEngine -->|25. Count tokens| TokenCounter[Tokenizer]
    TokenCounter -->|26. tokens_input, tokens_output| ChatEngine
    ChatEngine -->|27. Save AI message| MsgDB
    ChatEngine -->|28. Update conversation| ConvDB

    ChatEngine -->|29. Return response| Flask
    Flask -->|30. JSON + tech_info| Browser
    Browser -->|31. Display message| User

    style ChatEngine fill:#4CAF50
    style GeminiService fill:#2196F3
    style ContextBuilder fill:#FF9800
    style DB fill:#9C27B0
```

---
## 2. Chat Initialization Flow

### 2.1 Start Conversation

**Route:** `POST /api/chat/start`
**File:** `app.py` (lines 3511-3533)
**Authentication:** Required (`@login_required`)

```mermaid
sequenceDiagram
    actor User
    participant Browser
    participant Flask as Flask App<br/>(app.py)
    participant Engine as NordaBizChatEngine<br/>(nordabiz_chat.py)
    participant DB as PostgreSQL<br/>(ai_chat_conversations)

    User->>Browser: Click "Start Chat"
    Browser->>Flask: POST /api/chat/start<br/>{title: "Rozmowa..."}

    Note over Flask: @login_required
    Flask->>Flask: Get current_user.id

    Flask->>Engine: start_conversation(<br/> user_id=current_user.id,<br/> title="Rozmowa - 2026-01-10 10:30"<br/>)

    Engine->>Engine: Auto-generate title if not provided
    Engine->>DB: INSERT INTO ai_chat_conversations<br/>(user_id, started_at, title,<br/> conversation_type, is_active,<br/> message_count, model_name)

    DB->>Engine: conversation.id = 123
    Engine->>Flask: Return AIChatConversation object

    Flask->>Browser: JSON {<br/> success: true,<br/> conversation_id: 123,<br/> title: "Rozmowa - 2026-01-10 10:30"<br/>}

    Browser->>User: Chat session ready
```

**Database Operation:**
```sql
INSERT INTO ai_chat_conversations (
    user_id, started_at, conversation_type, title,
    is_active, message_count, model_name, created_at
) VALUES (
    ?, NOW(), 'general', ?,
    TRUE, 0, 'gemini-2.5-flash', NOW()
);
```

**Response:**
```json
{
  "success": true,
  "conversation_id": 123,
  "title": "Rozmowa - 2026-01-10 10:30"
}
```
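The auto-generated title follows the `Rozmowa - YYYY-MM-DD HH:MM` pattern ("Rozmowa" is Polish for "conversation") seen in the response above. A minimal sketch of that fallback; the helper name is illustrative, not the actual `nordabiz_chat.py` code:

```python
from datetime import datetime
from typing import Optional

def default_conversation_title(title: Optional[str] = None) -> str:
    """Return the given title, or auto-generate one.

    Mirrors the documented fallback format: "Rozmowa - 2026-01-10 10:30".
    """
    if title and title.strip():
        return title.strip()
    return f"Rozmowa - {datetime.now():%Y-%m-%d %H:%M}"
```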
---
## 3. Message Flow (Core Chat Logic)

### 3.1 Send Message Sequence

**Route:** `POST /api/chat/<conversation_id>/message`
**File:** `app.py` (lines 3536-3603)
**Authentication:** Required (`@login_required`)

```mermaid
sequenceDiagram
    actor User
    participant Browser
    participant Flask as Flask App
    participant Engine as NordaBizChatEngine
    participant DB as PostgreSQL
    participant Context as Context Builder
    participant Search as SearchService
    participant Gemini as GeminiService
    participant API as Gemini API
    participant CostDB as ai_api_costs

    User->>Browser: Type: "Kto robi strony www?"
    Browser->>Flask: POST /api/chat/123/message<br/>{message: "Kto robi strony www?"}

    Note over Flask: Verify conversation ownership
    Flask->>DB: SELECT * FROM ai_chat_conversations<br/>WHERE id = 123 AND user_id = ?
    DB->>Flask: Conversation found

    Flask->>Engine: send_message(<br/> conversation_id=123,<br/> user_message="Kto robi strony www?",<br/> user_id=current_user.id<br/>)

    Note over Engine: 1. Save user message
    Engine->>DB: INSERT INTO ai_chat_messages<br/>(conversation_id, role='user',<br/> content="Kto robi strony www?")
    DB->>Engine: Message saved

    Note over Engine: 2. Build context with ALL companies
    Engine->>Context: _build_conversation_context(<br/> db, conversation, message<br/>)

    Context->>DB: SELECT * FROM companies<br/>WHERE status = 'active'
    DB->>Context: 80 companies

    Context->>DB: SELECT * FROM ai_chat_messages<br/>WHERE conversation_id = 123<br/>ORDER BY created_at DESC<br/>LIMIT 10
    DB->>Context: Last 10 messages

    Context->>Context: Build compact JSON format<br/>(minimize tokens)
    Context->>Engine: Return full context dict

    Note over Engine: 3. Query AI with full context
    Engine->>Gemini: generate_text(<br/> prompt=system_prompt + context + history,<br/> feature='ai_chat',<br/> user_id=current_user.id,<br/> temperature=0.7<br/>)

    Gemini->>API: POST /v1/models/gemini-2.5-flash:generateContent
    API->>Gemini: AI response text

    Note over Gemini: Track API cost to database
    Gemini->>Gemini: Count tokens (input, output)
    Gemini->>Gemini: Calculate cost<br/>($0.075/$0.30 per 1M tokens)
    Gemini->>CostDB: INSERT INTO ai_api_costs<br/>(api_provider, model_name, feature,<br/> tokens, cost, latency_ms)

    Gemini->>Engine: Return response text

    Note over Engine: 4. Calculate per-message metrics
    Engine->>Engine: tokenizer.count_tokens(user_message)
    Engine->>Engine: tokenizer.count_tokens(response)
    Engine->>Engine: Calculate latency_ms, cost_usd

    Note over Engine: 5. Save AI response
    Engine->>DB: INSERT INTO ai_chat_messages<br/>(conversation_id, role='assistant',<br/> content=response, tokens_input,<br/> tokens_output, cost_usd, latency_ms)

    Note over Engine: 6. Update conversation stats
    Engine->>DB: UPDATE ai_chat_conversations<br/>SET message_count = message_count + 2,<br/> updated_at = NOW()<br/>WHERE id = 123

    Engine->>Flask: Return AIChatMessage object

    Note over Flask: Get free tier usage stats
    Flask->>CostDB: SELECT COUNT(*), SUM(tokens)<br/>FROM ai_api_costs<br/>WHERE DATE(timestamp) = TODAY()
    CostDB->>Flask: requests_today, tokens_today

    Flask->>Browser: JSON {<br/> success: true,<br/> message: "PIXLAB, WebStorm...",<br/> tech_info: {...}<br/>}

    Browser->>User: Display AI response
```

### 3.2 Message Implementation Details

**Input Validation:**

- Message cannot be empty (`.strip()` check)
- Conversation ownership verified (user_id match)
- Conversation must exist and be active

**Database Operations:**
```sql
-- Save user message
INSERT INTO ai_chat_messages (
    conversation_id, created_at, role, content,
    edited, regenerated
) VALUES (?, NOW(), 'user', ?, FALSE, FALSE);

-- Save AI response with metrics
INSERT INTO ai_chat_messages (
    conversation_id, created_at, role, content,
    tokens_input, tokens_output, cost_usd, latency_ms,
    edited, regenerated
) VALUES (?, NOW(), 'assistant', ?, ?, ?, ?, ?, FALSE, FALSE);

-- Update conversation
UPDATE ai_chat_conversations
SET message_count = message_count + 2,
    updated_at = NOW()
WHERE id = ?;
```

**Response Format:**
```json
{
  "success": true,
  "message": "Znalazłem kilka firm zajmujących się stronami www: PIXLAB (www.pixlab.pl, tel: 509 509 689), WebStorm Agencja Interaktywna...",
  "message_id": 456,
  "created_at": "2026-01-10T10:35:22.123456",
  "tech_info": {
    "model": "gemini-2.5-flash",
    "data_source": "PostgreSQL (80 firm Norda Biznes)",
    "architecture": "Full DB Context (wszystkie firmy w kontekście AI)",
    "tokens_input": 8543,
    "tokens_output": 234,
    "tokens_total": 8777,
    "latency_ms": 342,
    "theoretical_cost_usd": 0.00128,
    "actual_cost_usd": 0.0,
    "free_tier": {
      "is_free": true,
      "daily_limit": 1500,
      "requests_today": 47,
      "tokens_today": 423891,
      "remaining": 1453
    }
  }
}
```
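The validation rules above can be sketched as a small pure helper. The function name and the `(status, payload)` return shape are illustrative, not the actual `app.py` code:

```python
from typing import Optional, Tuple

def validate_chat_request(message: str,
                          conversation_owner_id: Optional[int],
                          current_user_id: int) -> Tuple[int, dict]:
    """Apply the documented checks; return (http_status, json_payload).

    conversation_owner_id is None when the (id, user_id) lookup
    matched no conversation.
    """
    # Ownership check first: a missing or foreign conversation is a 404.
    if conversation_owner_id is None or conversation_owner_id != current_user_id:
        return 404, {"success": False, "error": "Conversation not found"}
    # Empty-after-strip messages are rejected with a 400.
    if not message.strip():
        return 400, {"success": False, "error": "Wiadomość nie może być pusta"}
    return 200, {"success": True}
```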
---
## 4. Context Building (Core Intelligence)

### 4.1 Context Building Flow

**Method:** `_build_conversation_context(db, conversation, current_message)`
**File:** `nordabiz_chat.py` (lines 254-310)
**Strategy:** Full database context (the AI does the filtering)

```mermaid
flowchart TD
    Start([User Message:<br/>"Kto robi strony www?"]) --> LoadCompanies[Load ALL active companies<br/>FROM companies WHERE status='active']

    LoadCompanies --> Count[Total: 80 companies]
    Count --> LoadCategories[Load all categories with counts]
    LoadCategories --> LoadHistory[Load last 10 conversation messages<br/>ORDER BY created_at DESC]

    LoadHistory --> BuildContext[Build context dict]

    BuildContext --> CompactFormat[Convert ALL companies<br/>to compact format]

    CompactFormat --> CompactLoop{For each<br/>company}
    CompactLoop -->|Process| CompactFields[Include only non-empty fields:<br/>- name, cat (category)<br/>- desc (description_short)<br/>- history (founding_history)<br/>- svc (services)<br/>- comp (competencies)<br/>- web, tel, mail<br/>- city, year<br/>- cert (top 3 certifications)]

    CompactFields --> SaveTokens[Save tokens by:<br/>- Short field names<br/>- Omit empty fields<br/>- Limit certs to 3]

    SaveTokens --> NextCompany{More<br/>companies?}
    NextCompany -->|Yes| CompactLoop
    NextCompany -->|No| ContextReady[Context ready]

    ContextReady --> ContextDict{Context Dictionary}
    ContextDict --> Field1[conversation_type: 'general']
    ContextDict --> Field2[total_companies: 80]
    ContextDict --> Field3[categories: Array]
    ContextDict --> Field4[all_companies: Array<br/>~8,000-12,000 tokens]
    ContextDict --> Field5[recent_messages: Array<br/>Last 10 messages]

    Field1 & Field2 & Field3 & Field4 & Field5 --> Return[Return to _query_ai]

    style BuildContext fill:#4CAF50
    style CompactFormat fill:#FF9800
    style ContextDict fill:#2196F3
```

### 4.2 Compact Company Format

**Purpose:** Minimize token usage while preserving all important data

**Example Company Object:**
```json
{
  "name": "PIXLAB Sp. z o.o.",
  "cat": "IT i Technologie",
  "desc": "Agencja interaktywna - strony www, sklepy online, aplikacje",
  "history": "Założona przez Macieja Pieńczyńskiego w 2015 roku",
  "svc": ["Strony WWW", "E-commerce", "Aplikacje webowe", "SEO"],
  "comp": ["WordPress", "Shopify", "React", "Node.js"],
  "web": "https://pixlab.pl",
  "tel": "509 509 689",
  "mail": "kontakt@pixlab.pl",
  "city": "Wejherowo",
  "year": 2015,
  "cert": ["ISO 9001", "Google Partner"]
}
```

**Token Savings:**

- Short field names: `svc` instead of `services` (-40%)
- Omit empty fields: a field is only included if data exists (-30%)
- Limit certifications: top 3 instead of all (-20%)
- Compact JSON: no extra whitespace (-10%)

**Typical Token Usage:**

- Single company: ~100-150 tokens (compact)
- All 80 companies: ~8,000-12,000 tokens
- System prompt: ~500 tokens
- Conversation history (10 msgs): ~1,000-2,000 tokens
- **Total input:** ~10,000-15,000 tokens
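The compacting step can be sketched as a standalone function. The `Company` stand-in and the subset of fields shown here follow the documented format; the real `_company_to_compact_dict` covers the full field list and may differ in detail:

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Company:
    """Minimal stand-in for the ORM model (illustrative only)."""
    name: str
    category: Optional[str] = None
    description_short: Optional[str] = None
    founding_history: Optional[str] = None
    services: List[str] = field(default_factory=list)
    certifications: List[str] = field(default_factory=list)

def company_to_compact_dict(c: Company) -> dict:
    """Short keys, skip empty fields, cap certifications at 3."""
    compact = {"name": c.name}
    if c.category:
        compact["cat"] = c.category
    if c.description_short:
        compact["desc"] = c.description_short
    if c.founding_history:
        compact["history"] = c.founding_history
    if c.services:
        compact["svc"] = c.services
    if c.certifications:
        compact["cert"] = c.certifications[:3]  # top 3 only
    return compact
```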
---
## 5. AI Query & Prompt Engineering

### 5.1 AI Query Flow

**Method:** `_query_ai(context, user_message, user_id)`
**File:** `nordabiz_chat.py` (lines 406-481)

```mermaid
flowchart TD
    Start([Context + User Message]) --> BuildPrompt[Build system prompt]

    BuildPrompt --> SystemPrompt[SYSTEM PROMPT:<br/>- Role definition<br/>- Database stats<br/>- Instructions<br/>- Data format guide]

    SystemPrompt --> AddCompanies[Add ALL companies JSON<br/>~8,000-12,000 tokens]

    AddCompanies --> AddHistory[Add conversation history<br/>Last 10 messages]

    AddHistory --> AddUserMsg[Add current user message]

    AddUserMsg --> FullPrompt[Complete prompt ready<br/>~10,000-15,000 tokens]

    FullPrompt --> UseGlobal{use_global_service?}

    UseGlobal -->|Yes - default| GeminiSvc[gemini_service.generate_text]
    UseGlobal -->|No - legacy| DirectAPI[model.generate_content]

    GeminiSvc --> AutoCost[Automatic cost tracking<br/>to ai_api_costs table]
    DirectAPI --> NoCost[No cost tracking]

    AutoCost --> APICall[Gemini API Call<br/>gemini-2.5-flash]
    NoCost --> APICall

    APICall --> Response[AI Response<br/>~200-400 tokens]

    Response --> Return[Return response text]

    style SystemPrompt fill:#4CAF50
    style GeminiSvc fill:#2196F3
    style AutoCost fill:#FF9800
```

### 5.2 System Prompt Structure

**File:** `nordabiz_chat.py` (lines 426-458)

The prompt is written in Polish (the application's language); it is reproduced verbatim below.

```
Jesteś pomocnym asystentem portalu Norda Biznes - katalogu firm
zrzeszonych w stowarzyszeniu Norda Biznes z Wejherowa.

📊 MASZ DOSTĘP DO PEŁNEJ BAZY DANYCH:
- Liczba firm: 80
- Kategorie: IT i Technologie (25), Budownictwo (18), Usługi (15), ...

🎯 TWOJA ROLA:
- Analizujesz CAŁĄ bazę firm i wybierasz najlepsze dopasowania do pytania
- Odpowiadasz zwięźle (2-3 zdania), chyba że użytkownik prosi o szczegóły
- Podajesz konkretne nazwy firm z kontaktem
- Możesz wyszukiwać po: nazwie, usługach, kompetencjach, właścicielach, mieście

📋 FORMAT DANYCH (skróty):
- name: nazwa firmy
- cat: kategoria
- desc: krótki opis
- history: historia firmy, właściciele, założyciele
- svc: usługi
- comp: kompetencje
- web/tel/mail: kontakt
- city: miasto
- cert: certyfikaty

⚠️ WAŻNE:
- ZAWSZE podawaj nazwę firmy i kontakt (tel/web/mail jeśli dostępne)
- Jeśli pytanie o osobę (np. "kto to Roszman") - szukaj w polu "history"
- Odpowiadaj PO POLSKU

🏢 PEŁNA BAZA FIRM (wybierz najlepsze):
[JSON array with all 80 companies in compact format]

# HISTORIA ROZMOWY:
Użytkownik: [previous message 1]
Ty: [previous response 1]
Użytkownik: [previous message 2]
Ty: [previous response 2]
...

Użytkownik: Kto robi strony www?
Ty:
```

**Prompt Engineering Principles:**

1. **Clear role definition:** "Jesteś pomocnym asystentem..." ("You are a helpful assistant...")
2. **Database context:** Total companies, category distribution
3. **Response guidelines:** Concise (2-3 sentences), specific contacts
4. **Data format guide:** Field name abbreviations explained
5. **Search capabilities:** What the AI can search by
6. **Important notes:** Always include contact details; search the `history` field for people
7. **Language:** Always respond in Polish
8. **Full context:** ALL companies provided (the AI does the filtering)
9. **Conversation history:** Last 10 messages for context continuity
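The assembly order above (system prompt, then companies JSON, then history, then the current message) can be sketched as follows. The function name and exact joining are illustrative; only the section labels come from the documented prompt:

```python
import json
from typing import List, Tuple

def build_prompt(system_prompt: str,
                 companies: List[dict],
                 history: List[Tuple[str, str]],
                 user_message: str) -> str:
    """Concatenate the prompt parts in the documented order.

    history is a list of (role, text) pairs, oldest first,
    where role is 'user' or 'assistant'.
    """
    parts = [system_prompt]
    parts.append("🏢 PEŁNA BAZA FIRM (wybierz najlepsze):")
    # Compact JSON, non-ASCII preserved, to keep token usage down.
    parts.append(json.dumps(companies, ensure_ascii=False, separators=(",", ":")))
    parts.append("# HISTORIA ROZMOWY:")
    for role, text in history:
        label = "Użytkownik" if role == "user" else "Ty"
        parts.append(f"{label}: {text}")
    parts.append(f"Użytkownik: {user_message}")
    parts.append("Ty:")
    return "\n".join(parts)
```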
---
## 6. Cost Tracking & Performance

### 6.1 Dual Cost Tracking System

The application uses **TWO levels** of cost tracking:

**Level 1: Global API Cost Tracking** (`ai_api_costs` table)

- Managed by `gemini_service.py`
- Tracks ALL Gemini API calls (chat, image analysis, etc.)
- Automatic via the `_log_api_cost()` method

**Level 2: Per-Message Chat Metrics** (`ai_chat_messages` table)

- Managed by `nordabiz_chat.py`
- Tracks tokens, cost, and latency per chat message
- User-facing metrics for transparency

### 6.2 Cost Tracking Flow

```mermaid
sequenceDiagram
    participant Engine as NordaBizChatEngine
    participant Gemini as GeminiService
    participant API as Gemini API
    participant GlobalDB as ai_api_costs
    participant ChatDB as ai_chat_messages

    Engine->>Gemini: generate_text(<br/> prompt, feature='ai_chat',<br/> user_id=123<br/>)

    Note over Gemini: Start timer
    Gemini->>API: POST /generateContent
    API->>Gemini: Response text
    Note over Gemini: Stop timer (latency_ms)

    Note over Gemini: Count tokens
    Gemini->>Gemini: input_tokens = count_tokens(prompt)
    Gemini->>Gemini: output_tokens = count_tokens(response)

    Note over Gemini: Calculate cost
    Gemini->>Gemini: input_cost = (input/1M) * $0.075
    Gemini->>Gemini: output_cost = (output/1M) * $0.30
    Gemini->>Gemini: total_cost = input_cost + output_cost

    Note over Gemini: Global cost tracking
    Gemini->>GlobalDB: INSERT INTO ai_api_costs<br/>(api_provider='gemini',<br/> model='gemini-2.5-flash',<br/> feature='ai_chat',<br/> user_id=123,<br/> tokens, cost, latency)

    Gemini->>Engine: Return response text

    Note over Engine: Per-message tracking
    Engine->>Engine: tokenizer.count_tokens(user_msg)
    Engine->>Engine: tokenizer.count_tokens(response)
    Engine->>Engine: Calculate cost again (for message record)

    Engine->>ChatDB: INSERT INTO ai_chat_messages<br/>(role='assistant',<br/> content, tokens_input,<br/> tokens_output, cost_usd,<br/> latency_ms)
```

### 6.3 Cost Calculation

**Gemini 2.5 Flash Pricing:**

- **Input:** $0.075 per 1M tokens
- **Output:** $0.30 per 1M tokens
- **Free Tier:** 1,500 requests/day (unlimited tokens)

**Typical Chat Message:**
```
Input:  10,000 tokens (system prompt + companies + history) = $0.00075
Output:    300 tokens (AI response)                         = $0.00009
Total:                                                      = $0.00084
```

**Daily Usage Estimate:**

- 100 chat messages/day
- Average 10,000 input + 300 output tokens
- Theoretical cost: $0.084/day ($2.52/month)
- **Actual cost: $0.00** (free tier covers all usage)
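The arithmetic above is a straight linear rate; a minimal sketch using the documented prices (the function name is illustrative):

```python
# Gemini 2.5 Flash pricing from the table above (USD per 1M tokens).
INPUT_PRICE_PER_M = 0.075
OUTPUT_PRICE_PER_M = 0.30

def gemini_flash_cost(input_tokens: int, output_tokens: int) -> float:
    """Theoretical cost in USD for one request at the documented rates."""
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_M
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_M
    return input_cost + output_cost
```

For the typical message above, `gemini_flash_cost(10_000, 300)` reproduces the $0.00084 figure, and 100 such messages the $0.084/day estimate.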
### 6.4 Free Tier Monitoring

**Function:** `get_free_tier_usage()`
**File:** `app.py`

```python
from datetime import datetime

from sqlalchemy import func

def get_free_tier_usage():
    """Get free tier usage stats for today"""
    db = SessionLocal()
    try:
        today_start = datetime.now().replace(hour=0, minute=0, second=0, microsecond=0)

        stats = db.query(
            func.count(AIAPICostLog.id).label('requests'),
            func.sum(AIAPICostLog.total_tokens).label('tokens')
        ).filter(
            AIAPICostLog.timestamp >= today_start,
            AIAPICostLog.api_provider == 'gemini',
            AIAPICostLog.success == True
        ).first()

        return {
            'requests_today': stats.requests or 0,
            'tokens_today': stats.tokens or 0,
            'daily_limit': 1500,
            'remaining': max(0, 1500 - (stats.requests or 0))
        }
    finally:
        db.close()
```

**Response in `/api/chat/:id/message`:**
```json
{
  "tech_info": {
    "free_tier": {
      "is_free": true,
      "daily_limit": 1500,
      "requests_today": 47,
      "tokens_today": 423891,
      "remaining": 1453
    }
  }
}
```
---
## 7. Conversation History

### 7.1 Get History Flow

**Route:** `GET /api/chat/<conversation_id>/history`
**File:** `app.py` (lines 3606-3634)
**Authentication:** Required (`@login_required`)

```mermaid
sequenceDiagram
    actor User
    participant Browser
    participant Flask as Flask App
    participant Engine as NordaBizChatEngine
    participant DB as ai_chat_messages

    User->>Browser: Load chat history
    Browser->>Flask: GET /api/chat/123/history

    Note over Flask: Verify ownership
    Flask->>DB: SELECT * FROM ai_chat_conversations<br/>WHERE id = 123 AND user_id = ?
    DB->>Flask: Conversation found

    Flask->>Engine: get_conversation_history(123)

    Engine->>DB: SELECT * FROM ai_chat_messages<br/>WHERE conversation_id = 123<br/>ORDER BY created_at ASC

    DB->>Engine: All messages in conversation

    Engine->>Engine: Format messages as dicts
    Engine->>Flask: Return messages array

    Flask->>Browser: JSON {<br/> success: true,<br/> messages: [...]<br/>}

    Browser->>User: Display conversation history
```

**Response Format:**
```json
{
  "success": true,
  "messages": [
    {
      "id": 789,
      "role": "user",
      "content": "Kto robi strony www?",
      "created_at": "2026-01-10T10:35:00.123456",
      "tokens_input": 0,
      "tokens_output": 0,
      "cost_usd": 0.0,
      "latency_ms": 0
    },
    {
      "id": 790,
      "role": "assistant",
      "content": "Znalazłem kilka firm zajmujących się stronami www...",
      "created_at": "2026-01-10T10:35:02.456789",
      "tokens_input": 8543,
      "tokens_output": 234,
      "cost_usd": 0.00128,
      "latency_ms": 342
    }
  ]
}
```
---
## 8. Database Schema

### 8.1 Conversation Tables

**ai_chat_conversations** (conversation metadata)
```sql
CREATE TABLE ai_chat_conversations (
    id SERIAL PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(id) ON DELETE CASCADE,
    started_at TIMESTAMP NOT NULL DEFAULT NOW(),
    updated_at TIMESTAMP,
    conversation_type VARCHAR(50) DEFAULT 'general',
    title VARCHAR(500),
    is_active BOOLEAN DEFAULT TRUE,
    message_count INTEGER DEFAULT 0,
    model_name VARCHAR(100)
);

CREATE INDEX idx_chat_conv_user_id ON ai_chat_conversations(user_id);
CREATE INDEX idx_chat_conv_started_at ON ai_chat_conversations(started_at DESC);
```

**ai_chat_messages** (individual messages)
```sql
CREATE TABLE ai_chat_messages (
    id SERIAL PRIMARY KEY,
    conversation_id INTEGER NOT NULL REFERENCES ai_chat_conversations(id) ON DELETE CASCADE,
    created_at TIMESTAMP NOT NULL DEFAULT NOW(),
    role VARCHAR(20) NOT NULL, -- 'user' or 'assistant'
    content TEXT NOT NULL,
    tokens_input INTEGER,
    tokens_output INTEGER,
    cost_usd DECIMAL(10,6),
    latency_ms INTEGER,
    edited BOOLEAN DEFAULT FALSE,
    regenerated BOOLEAN DEFAULT FALSE
);

CREATE INDEX idx_chat_msg_conv_id ON ai_chat_messages(conversation_id);
CREATE INDEX idx_chat_msg_created_at ON ai_chat_messages(created_at);
```

**ai_api_costs** (global API cost tracking)
```sql
CREATE TABLE ai_api_costs (
    id SERIAL PRIMARY KEY,
    timestamp TIMESTAMP NOT NULL DEFAULT NOW(),
    api_provider VARCHAR(50) NOT NULL, -- 'gemini'
    model_name VARCHAR(100), -- 'gemini-2.5-flash'
    feature VARCHAR(100), -- 'ai_chat', 'image_analysis', etc.
    user_id INTEGER REFERENCES users(id),
    input_tokens INTEGER,
    output_tokens INTEGER,
    total_tokens INTEGER,
    input_cost DECIMAL(10,6),
    output_cost DECIMAL(10,6),
    total_cost DECIMAL(10,6),
    success BOOLEAN DEFAULT TRUE,
    error_message TEXT,
    latency_ms INTEGER,
    prompt_hash VARCHAR(64)
);

CREATE INDEX idx_api_costs_timestamp ON ai_api_costs(timestamp DESC);
CREATE INDEX idx_api_costs_provider ON ai_api_costs(api_provider);
CREATE INDEX idx_api_costs_feature ON ai_api_costs(feature);
CREATE INDEX idx_api_costs_user_id ON ai_api_costs(user_id);
```

### 8.2 Entity Relationships

```mermaid
erDiagram
    users ||--o{ ai_chat_conversations : "has many"
    ai_chat_conversations ||--o{ ai_chat_messages : "contains"
    users ||--o{ ai_api_costs : "generates"

    users {
        int id PK
        varchar email
        varchar name
        boolean is_admin
    }

    ai_chat_conversations {
        int id PK
        int user_id FK
        timestamp started_at
        varchar conversation_type
        varchar title
        boolean is_active
        int message_count
        varchar model_name
    }

    ai_chat_messages {
        int id PK
        int conversation_id FK
        timestamp created_at
        varchar role
        text content
        int tokens_input
        int tokens_output
        decimal cost_usd
        int latency_ms
    }

    ai_api_costs {
        int id PK
        timestamp timestamp
        varchar api_provider
        varchar model_name
        varchar feature
        int user_id FK
        int total_tokens
        decimal total_cost
        int latency_ms
    }
```
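The two chat tables map onto the SQLAlchemy models the application code queries (`AIChatConversation`, `AIChatMessage`). A hedged sketch of those mappings, exercised against an in-memory SQLite database — the column set follows the DDL above, but the real models in the codebase may differ:

```python
from datetime import datetime

from sqlalchemy import (Boolean, Column, DateTime, ForeignKey, Integer,
                        Numeric, String, Text, create_engine)
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class AIChatConversation(Base):
    __tablename__ = 'ai_chat_conversations'
    id = Column(Integer, primary_key=True)
    user_id = Column(Integer, nullable=False)  # FK to users.id in production
    started_at = Column(DateTime, nullable=False, default=datetime.utcnow)
    conversation_type = Column(String(50), default='general')
    title = Column(String(500))
    is_active = Column(Boolean, default=True)
    message_count = Column(Integer, default=0)
    model_name = Column(String(100))

class AIChatMessage(Base):
    __tablename__ = 'ai_chat_messages'
    id = Column(Integer, primary_key=True)
    conversation_id = Column(Integer,
                             ForeignKey('ai_chat_conversations.id'),
                             nullable=False)
    created_at = Column(DateTime, nullable=False, default=datetime.utcnow)
    role = Column(String(20), nullable=False)  # 'user' or 'assistant'
    content = Column(Text, nullable=False)
    tokens_input = Column(Integer)
    tokens_output = Column(Integer)
    cost_usd = Column(Numeric(10, 6))
    latency_ms = Column(Integer)

# Exercise the schema against throwaway SQLite.
engine = create_engine('sqlite://')
Base.metadata.create_all(engine)

with Session(engine) as session:
    conv = AIChatConversation(user_id=1, title='Rozmowa - test',
                              model_name='gemini-2.5-flash')
    session.add(conv)
    session.flush()  # assigns conv.id
    session.add(AIChatMessage(conversation_id=conv.id, role='user',
                              content='Kto robi strony www?'))
    session.commit()
    saved = session.query(AIChatMessage).filter_by(conversation_id=conv.id).all()
```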
---
## 9. Error Handling

### 9.1 Common Error Scenarios

**1. Conversation Not Found**
```python
# app.py
conversation = db.query(AIChatConversation).filter_by(
    id=conversation_id,
    user_id=current_user.id
).first()

if not conversation:
    return jsonify({
        'success': False,
        'error': 'Conversation not found'
    }), 404
```

**2. Empty Message**
```python
message = data.get('message', '').strip()

if not message:
    return jsonify({
        'success': False,
        'error': 'Wiadomość nie może być pusta'  # "Message cannot be empty"
    }), 400
```

**3. Gemini API Error**
```python
# gemini_service.py
try:
    response = self.model.generate_content(prompt)

    # Check safety filters
    if not response.candidates:
        raise Exception("Response blocked by safety filters")

    # Check finish reason
    candidate = response.candidates[0]
    if candidate.finish_reason not in [1, 0]:  # STOP or UNSPECIFIED
        raise Exception(f"Response incomplete: {candidate.finish_reason}")

except Exception as e:
    logger.error(f"Gemini API error: {e}")

    # Log failed request to database
    self._log_api_cost(
        prompt=prompt,
        response_text='',
        input_tokens=self.count_tokens(prompt),
        output_tokens=0,
        success=False,
        error_message=str(e)
    )

    raise Exception(f"Gemini API call failed: {str(e)}")
```

**4. Database Connection Error**
```python
# nordabiz_chat.py
db = SessionLocal()
try:
    # Database operations
    conversation = db.query(AIChatConversation).filter_by(id=conversation_id).first()
    # ...
finally:
    db.close()  # Always close the connection
```

### 9.2 Error Response Format

```json
{
  "success": false,
  "error": "Conversation not found"
}
```

**HTTP Status Codes:**

- `400` - Bad Request (empty message, invalid input)
- `404` - Not Found (conversation doesn't exist)
- `500` - Internal Server Error (Gemini API failure, database error)
---
## 10. Search Integration

### 10.1 Search Service Integration

**Method:** `_find_relevant_companies(db, message)`
**File:** `nordabiz_chat.py` (lines 383-404)
**Status:** DEPRECATED (kept for reference, not used in production)

**Historical Context:**

The chat engine originally used SearchService to **pre-filter** companies before sending them to the AI:

```python
# OLD APPROACH (deprecated):
def _find_relevant_companies(self, db, message):
    """Find companies relevant to user's message"""
    results = search_companies(db, message, limit=10)
    return [result.company for result in results]

# In _build_conversation_context:
relevant_companies = self._find_relevant_companies(db, current_message)
context['companies'] = [self._company_to_compact_dict(c) for c in relevant_companies]
```

**Current Approach:**

Send **ALL companies** to the AI and let it do the filtering:

```python
# NEW APPROACH (current production):
def _build_conversation_context(self, db, conversation, current_message):
    """Build context with ALL companies (not pre-filtered)"""
    all_companies = db.query(Company).filter_by(status='active').all()

    context['all_companies'] = [
        self._company_to_compact_dict(c)
        for c in all_companies
    ]
    return context
```

**Why the Change?**

| Aspect | Old (Pre-filtered) | New (Full Context) |
|--------|-------------------|-------------------|
| **Companies sent** | 8-10 (search filtered) | 80 (all active) |
| **Token usage** | ~1,500 tokens | ~10,000 tokens |
| **Search quality** | Keyword-based, limited | AI-powered, intelligent |
| **Multi-criteria** | Difficult | Excellent |
| **Owner searches** | Impossible | Works perfectly |
| **Cost** | $0.0001/msg | $0.0008/msg |
| **User experience** | Sometimes misses results | Always comprehensive |

**Example:**

- User: "Kto to Roszman?" (Who is Roszman?)
- Old approach: search for "roszman" in services/competencies → 0 results ❌
- New approach: the AI searches the `founding_history` field → finds the company owner ✅
---
## 11. Performance & Optimization
|
|
|
|
### 11.1 Performance Metrics
|
|
|
|
**Typical Chat Message:**
|
|
- **Latency:** 200-400ms
|
|
- **Input tokens:** 8,000-15,000 (system prompt + 80 companies + history)
|
|
- **Output tokens:** 200-500 (AI response)
|
|
- **Total tokens:** 8,500-15,500
|
|
- **Theoretical cost:** $0.0008-0.0015
|
|
- **Actual cost:** $0.00 (free tier)
|
|
|
|
**Database Queries:**
|
|
- Conversation lookup: ~5ms (indexed on user_id, id)
|
|
- All companies query: ~50ms (80 rows, no complex joins)
|
|
- Last 10 messages: ~10ms (indexed on conversation_id, created_at)
|
|
- **Total DB time:** ~65ms
|
|
|
|
**Gemini API:**
|
|
- Network latency: ~100-200ms
|
|
- Processing time: ~100-200ms
|
|
- **Total API time:** ~250-350ms
|
|
|
|
### 11.2 Token Optimization Strategies

**1. Compact Field Names**
```python
# GOOD (saves ~40% tokens):
{"name": "PIXLAB", "svc": ["WWW", "SEO"], "comp": ["WordPress"]}

# BAD (wasteful):
{"company_name": "PIXLAB", "services": ["WWW", "SEO"], "competencies": ["WordPress"]}
```

**2. Omit Empty Fields**
```python
# GOOD (only adds the field if data exists):
compact = {"name": c.name}
if c.description_short:
    compact['desc'] = c.description_short

# BAD (wastes tokens on ""):
compact = {
    "name": c.name,
    "desc": c.description_short or "",
}
```

**3. Limit Arrays**
```python
# GOOD (top 3 certifications):
if c.certifications:
    compact['cert'] = [cert.name for cert in c.certifications[:3]]

# BAD (all certifications, which may be 10+):
compact['cert'] = [cert.name for cert in c.certifications]
```

**4. Compact JSON (no whitespace)**
```python
# GOOD (compact separators drop the spaces after ',' and ':'):
json.dumps(data, ensure_ascii=False, separators=(',', ':'))
# {"name":"PIXLAB","svc":["WWW"]}

# BAD:
json.dumps(data, ensure_ascii=False, indent=2)
# {
#   "name": "PIXLAB",
#   "svc": ["WWW"]
# }
```

**Token Savings:**
- Single company: 200 tokens → 100 tokens (50% reduction)
- 80 companies: 16,000 tokens → 8,000 tokens (50% reduction)
- Cost savings: $0.0016 → $0.0008 per message (50% reduction)
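
Taken together, the four strategies can be sketched as one helper. This is an illustrative sketch, not the actual `_company_to_compact_dict` from `nordabiz_chat.py` — it works on plain dicts instead of ORM objects, and the key names (`desc`, `svc`, `comp`, `cert`) simply follow the examples above:

```python
import json

def company_to_compact_dict(c: dict) -> dict:
    """Token-optimized company record: short keys, no empty fields, capped lists.

    Illustrative sketch -- the production _company_to_compact_dict works on
    ORM objects and may include more fields.
    """
    compact = {"name": c["name"]}                   # always present
    if c.get("description_short"):
        compact["desc"] = c["description_short"]    # omit empty fields
    if c.get("services"):
        compact["svc"] = c["services"]
    if c.get("competencies"):
        compact["comp"] = c["competencies"]
    if c.get("certifications"):
        compact["cert"] = c["certifications"][:3]   # top 3 only
    return compact

def serialize_context(companies: list[dict]) -> str:
    """Compact JSON: no indentation, no spaces after separators."""
    return json.dumps(
        [company_to_compact_dict(c) for c in companies],
        ensure_ascii=False,
        separators=(",", ":"),
    )
```

For example, `serialize_context([{"name": "PIXLAB", "services": ["WWW", "SEO"]}])` yields `'[{"name":"PIXLAB","svc":["WWW","SEO"]}]'`.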

### 11.3 Caching Opportunities (Future)

**Not currently implemented** — all companies are loaded from the database on every message.

**Potential Optimizations:**
1. **Company data caching** (Redis)
   - Cache the all-companies JSON for 5 minutes
   - Invalidate on company data changes
   - Reduces DB query time: 50ms → 5ms
2. **Prompt template caching**
   - Cache the system prompt template
   - Only rebuild when companies change
3. **Conversation context caching**
   - Cache the last 10 messages per conversation
   - Invalidate on new message
   - Reduces DB query time: 10ms → 1ms

**Why Not Implemented Yet:**
- Current performance is acceptable (250-350ms total)
- Database load is low at this scale, and DB queries are not rate-limited
- Premature optimization (80 companies is a small dataset)
- Complexity vs. benefit tradeoff

---

## 12. Security & Access Control

### 12.1 Authentication & Authorization

**All chat routes require authentication:**
```python
@app.route('/chat')
@login_required
def chat():
    """AI Chat interface"""
    return render_template('chat.html')

@app.route('/api/chat/start', methods=['POST'])
@login_required
def chat_start():
    # Only logged-in users can start conversations
    ...

@app.route('/api/chat/<int:conversation_id>/message', methods=['POST'])
@login_required
def chat_send_message(conversation_id):
    # Verify conversation ownership
    conversation = db.query(AIChatConversation).filter_by(
        id=conversation_id,
        user_id=current_user.id  # IMPORTANT: ownership check
    ).first()

    if not conversation:
        return jsonify({'error': 'Conversation not found'}), 404
    ...
```

### 12.2 Input Sanitization

**User message sanitization:**
```python
# app.py
message = data.get('message', '').strip()

# No HTML/JavaScript injection possible:
# - the Gemini API treats all input as plain text
# - the database stores messages as TEXT (no code execution)
```

**No SQL Injection:**
```python
# Safe (parameterized query):
conversation = db.query(AIChatConversation).filter_by(
    id=conversation_id,
    user_id=current_user.id
).first()

# SQLAlchemy emits parameterized SQL, which prevents injection
```
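
Defense in depth still matters when stored messages are rendered back in the browser. Jinja2 autoescaping covers normal `render_template` output; the sketch below shows the equivalent manual step with the stdlib `html.escape`, for any hypothetical code path that assembles HTML by hand:

```python
from html import escape

def safe_render_message(content: str) -> str:
    """Escape user-supplied chat text before embedding it in HTML.

    Jinja2 autoescaping already does this in templates; this manual variant
    is only needed where HTML is built directly (illustrative sketch).
    """
    return escape(content, quote=True)
```

For example, `safe_render_message('<script>alert(1)</script>')` returns `'&lt;script&gt;alert(1)&lt;/script&gt;'`.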

### 12.3 Rate Limiting

**Gemini API Free Tier Limits:**
- 1,500 requests/day
- No per-minute limit
- No token limit

**Application-Level Limits:**
- No rate limiting on chat endpoints yet
- Users must be logged in (reduces abuse)
- Flask-Limiter can be added if needed

**Future Rate Limiting:**
```python
from flask_limiter import Limiter

# flask-limiter >= 2.0 takes key_func as the first argument
limiter = Limiter(key_func=lambda: str(current_user.id), app=app)

@app.route('/api/chat/<int:conversation_id>/message', methods=['POST'])
@login_required
@limiter.limit("60 per hour")  # 60 messages per hour per user
def chat_send_message(conversation_id):
    ...
```

---

## 13. Monitoring & Debugging

### 13.1 Cost Tracking Queries

**Daily API usage:**
```sql
SELECT
    DATE(timestamp) AS date,
    COUNT(*) AS requests,
    SUM(total_tokens) AS tokens,
    SUM(total_cost) AS cost_usd
FROM ai_api_costs
WHERE api_provider = 'gemini'
  AND feature = 'ai_chat'
GROUP BY DATE(timestamp)
ORDER BY date DESC;
```

**Top users by API usage:**
```sql
SELECT
    u.name,
    u.email,
    COUNT(*) AS chat_messages,
    SUM(c.total_tokens) AS total_tokens,
    SUM(c.total_cost) AS total_cost_usd
FROM ai_api_costs c
JOIN users u ON c.user_id = u.id
WHERE c.api_provider = 'gemini'
  AND c.feature = 'ai_chat'
GROUP BY u.id, u.name, u.email
ORDER BY total_cost_usd DESC
LIMIT 10;
```

**Free tier usage today:**
```sql
SELECT
    COUNT(*) AS requests_today,
    SUM(total_tokens) AS tokens_today,
    1500 - COUNT(*) AS remaining_requests
FROM ai_api_costs
WHERE DATE(timestamp) = CURRENT_DATE
  AND api_provider = 'gemini'
  AND success = TRUE;
```
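
The same arithmetic as the query above, wrapped in a small helper that a dashboard or cron check could call with the query result. The 80% warning threshold is illustrative, not part of the current codebase:

```python
FREE_TIER_DAILY_LIMIT = 1500  # Gemini free tier: requests per day

def free_tier_status(requests_today: int,
                     limit: int = FREE_TIER_DAILY_LIMIT) -> dict:
    """Summarize free-tier consumption from today's request count."""
    remaining = max(limit - requests_today, 0)
    used_pct = round(requests_today / limit * 100, 1)
    return {
        "requests_today": requests_today,
        "remaining": remaining,
        "used_pct": used_pct,
        "warning": used_pct >= 80.0,  # illustrative alert threshold
    }
```

For example, `free_tier_status(1200)` reports 300 remaining requests and raises the warning flag at exactly 80% usage.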

### 13.2 Chat Analytics

**Most active conversations:**
```sql
SELECT
    c.id,
    c.title,
    u.name AS user_name,
    c.message_count,
    c.started_at,
    c.updated_at
FROM ai_chat_conversations c
JOIN users u ON c.user_id = u.id
WHERE c.is_active = TRUE
ORDER BY c.message_count DESC
LIMIT 20;
```

**Average response metrics:**
```sql
SELECT
    AVG(tokens_input) AS avg_input_tokens,
    AVG(tokens_output) AS avg_output_tokens,
    AVG(latency_ms) AS avg_latency_ms,
    AVG(cost_usd) AS avg_cost_usd
FROM ai_chat_messages
WHERE role = 'assistant'
  AND created_at > NOW() - INTERVAL '7 days';
```

### 13.3 Error Monitoring

**Failed API requests:**
```sql
SELECT
    timestamp,
    model_name,
    feature,
    error_message,
    latency_ms
FROM ai_api_costs
WHERE success = FALSE
  AND api_provider = 'gemini'
ORDER BY timestamp DESC
LIMIT 20;
```

**Conversations with errors:**
```sql
-- Conversations whose last message is from the user (AI didn't respond)
SELECT
    c.id,
    c.title,
    c.message_count,
    c.updated_at,
    (SELECT content FROM ai_chat_messages
     WHERE conversation_id = c.id
     ORDER BY created_at DESC LIMIT 1) AS last_message
FROM ai_chat_conversations c
WHERE c.message_count % 2 = 1  -- odd count: a user message without a response
  AND c.updated_at > NOW() - INTERVAL '1 hour'
ORDER BY c.updated_at DESC;
```

---

## 14. Future Enhancements

### 14.1 Planned Features

**1. Conversation Context Memory**
- Remember user preferences across sessions
- "Remember that I'm looking for IT services"
- Personalized recommendations

**2. Conversation Sharing**
- Share a conversation URL with other users
- Public vs. private conversations
- Embed a chat widget on company profiles

**3. Voice Input/Output**
- Web Speech API for voice input
- Text-to-speech for AI responses
- Hands-free interaction

**4. Multi-Modal Input**
- Upload images (company logo, product photos)
- Gemini Vision API for image analysis
- "Find companies similar to this logo"

**5. Conversation Search**
- Full-text search across all user conversations
- Filter by date, company mentioned, topic
- Export conversation history

**6. Advanced Analytics**
- Which companies are most recommended by the AI?
- What services are users asking about most?
- Conversation funnel (browse → chat → contact)

### 14.2 Optimization Opportunities

**1. Redis Caching**
```python
# Cache the all-companies JSON
redis_key = f"companies:all:{version_hash}"
cached = redis.get(redis_key)

if cached:
    all_companies = json.loads(cached)
else:
    all_companies = load_from_db()
    redis.setex(redis_key, 300, json.dumps(all_companies))  # 5 min TTL
```

**2. Prompt Compression**
- Use Gemini's context caching feature (when available)
- Cache the system prompt + company database
- Only send the new user message (saves ~90% of input tokens)

**3. Streaming Responses**
```python
@app.route('/api/chat/<int:conversation_id>/message', methods=['POST'])
def chat_send_message(conversation_id):
    # Enable streaming
    response = gemini_service.generate_text(
        prompt=full_prompt,
        stream=True  # Return a generator
    )

    # Server-Sent Events (SSE)
    def generate():
        for chunk in response:
            yield f"data: {json.dumps({'text': chunk.text})}\n\n"

    return Response(generate(), mimetype='text/event-stream')
```

**4. Conversation Summarization**
- Auto-summarize conversations > 20 messages
- Include the summary instead of the full history
- Reduces token usage by ~50%

---

## 15. Troubleshooting Guide

### 15.1 Common Issues

**Issue: "Conversation not found" error**
```
Cause: User trying to access someone else's conversation
Fix: Verify the conversation_id belongs to current_user.id

SQL Debug:
SELECT id, user_id FROM ai_chat_conversations WHERE id = 123;
```

**Issue: Empty AI responses**
```
Cause: Gemini safety filters blocking the response
Fix: Check ai_api_costs for the error_message

SQL Debug:
SELECT error_message, prompt_hash FROM ai_api_costs
WHERE success = FALSE ORDER BY timestamp DESC LIMIT 10;
```

**Issue: Slow response times (> 1 second)**
```
Cause: Large context (many companies, long history)
Fix: Check token counts, consider summarization

SQL Debug:
SELECT tokens_input, tokens_output, latency_ms
FROM ai_chat_messages
WHERE latency_ms > 1000
ORDER BY created_at DESC LIMIT 20;
```

**Issue: "Free tier limit exceeded"**
```
Cause: > 1,500 requests in 24 hours
Fix: Wait for the quota reset (midnight Pacific Time)

SQL Debug:
SELECT COUNT(*) FROM ai_api_costs
WHERE DATE(timestamp) = CURRENT_DATE AND api_provider = 'gemini';
```

### 15.2 Diagnostic Commands

**Check Gemini API connectivity:**
```bash
python3 -c "
from gemini_service import GeminiService
svc = GeminiService()
response = svc.generate_text('Hello', feature='test')
print(response)
"
```

**Verify database connection:**
```bash
psql -U nordabiz_app -d nordabiz -c "
SELECT COUNT(*) AS conversations FROM ai_chat_conversations;
SELECT COUNT(*) AS messages FROM ai_chat_messages;
SELECT COUNT(*) AS api_calls FROM ai_api_costs WHERE api_provider = 'gemini';
"
```

**Test chat flow:**
```python
from nordabiz_chat import NordaBizChatEngine

engine = NordaBizChatEngine()
conv = engine.start_conversation(user_id=1, title="Test")
response = engine.send_message(conv.id, "Test message", user_id=1)
print(f"Response: {response.content}")
```

---

## 16. Related Documentation

- **[Search Flow](./02-search-flow.md)** - Company search integration
- **[Authentication Flow](./01-authentication-flow.md)** - User authentication
- **[Flask Components](../04-flask-components.md)** - Application architecture
- **[External Integrations](../06-external-integrations.md)** - Gemini API details
- **[Database Schema](../05-database-schema.md)** - Database structure

---

## 17. Glossary

| Term | Definition |
|------|------------|
| **NordaBizChatEngine** | Main chat engine class in `nordabiz_chat.py` |
| **GeminiService** | Centralized Gemini API wrapper in `gemini_service.py` |
| **Conversation** | Chat session with multiple messages |
| **Context** | Full company database + history sent to the AI |
| **Compact Format** | Token-optimized company data format |
| **Free Tier** | Google Gemini free tier (1,500 req/day) |
| **Token** | Unit of text (~4 characters) for AI models |
| **Latency** | Response time in milliseconds |
| **Cost Tracking** | Dual-level system (global + per-message) |
| **System Prompt** | Instructions sent to the AI with each query |
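
The glossary's ~4-characters-per-token rule of thumb, combined with the pricing quoted in this document ($0.075/$0.30 per 1M input/output tokens), gives a quick back-of-envelope estimator. Real tokenizer counts will differ somewhat:

```python
# Gemini 2.5 Flash list prices (USD per 1M tokens), per the pricing
# quoted in this document
PRICE_INPUT_PER_M = 0.075
PRICE_OUTPUT_PER_M = 0.30

def estimate_tokens(text: str) -> int:
    """Rough token estimate: ~4 characters per token."""
    return max(1, len(text) // 4)

def theoretical_cost(tokens_input: int, tokens_output: int) -> float:
    """Theoretical cost in USD (actual cost is $0.00 on the free tier)."""
    return (tokens_input / 1_000_000) * PRICE_INPUT_PER_M \
         + (tokens_output / 1_000_000) * PRICE_OUTPUT_PER_M
```

A typical message (10,000 input + 400 output tokens) comes out around $0.00087, consistent with the theoretical per-message range in section 11.1.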

---

## 18. Maintenance

**When to Update This Document:**
- ✅ Gemini model version change (e.g., 2.5 → 3.0)
- ✅ Pricing changes
- ✅ New chat features (voice, images, etc.)
- ✅ Context-building algorithm changes
- ✅ Database schema changes
- ✅ Performance optimization implementations

**Document Owner:** Development Team
**Review Frequency:** Quarterly or after major changes
**Last Review:** 2026-01-10

---

**END OF DOCUMENT**