diff --git a/docs/superpowers/plans/2026-03-28-nordagpt-identity-memory.md b/docs/superpowers/plans/2026-03-28-nordagpt-identity-memory.md new file mode 100644 index 0000000..6541ad9 --- /dev/null +++ b/docs/superpowers/plans/2026-03-28-nordagpt-identity-memory.md @@ -0,0 +1,1926 @@ +# NordaGPT Identity, Memory & Performance — Implementation Plan + +> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking. + +**Goal:** Transform NordaGPT from an anonymous chatbot into a personalized assistant with user identity, persistent memory, smart routing, and streaming responses. + +**Architecture:** Four-phase rollout: (1) inject user identity into AI prompt, (2) smart router + selective context loading, (3) streaming SSE responses, (4) persistent user memory with async extraction. Each phase is independently deployable and testable. + +**Tech Stack:** Flask 3.0, SQLAlchemy 2.0, PostgreSQL, Google Gemini API (3-Flash, 3.1-Flash-Lite), Server-Sent Events, Jinja2 inline JS. 
+
+**Spec:** `docs/superpowers/specs/2026-03-28-nordagpt-identity-memory-design.md`
+
+---
+
+## File Structure
+
+### New files
+
+| File | Responsibility |
+|------|---------------|
+| `smart_router.py` | Classifies query complexity, selects data categories and model |
+| `memory_service.py` | CRUD for user memory facts + conversation summaries, extraction prompt |
+| `context_builder.py` | Loads selective data from DB based on router decision |
+| `database/migrations/092_ai_user_memory.sql` | User memory facts table |
+| `database/migrations/093_ai_conversation_summary.sql` | Conversation summary table |
+
+### Modified files
+
+| File | Changes |
+|------|---------|
+| `database.py` | Add AIUserMemory, AIConversationSummary models (before line 5954) |
+| `nordabiz_chat.py` | Accept user_context, integrate router, selective context, memory injection |
+| `gemini_service.py` | Token counting for streamed responses |
+| `blueprints/chat/routes.py` | Build user_context, add streaming endpoint, memory CRUD routes |
+| `templates/chat.html` | Streaming UI, thinking animation, memory settings panel |
+
+---
+
+## Phase 1: User Identity (Tasks 1-3)
+
+### Task 1: Pass user context from route to chat engine
+
+**Files:**
+- Modify: `blueprints/chat/routes.py:234-309`
+- Modify: `nordabiz_chat.py:163-180`
+
+- [ ] **Step 1: Build user_context dict in chat route**
+
+In `blueprints/chat/routes.py`, modify `chat_send_message()`. 
After line 262 (where `current_user.id` and `current_user.email` are used for limit check), add user_context construction: + +```python +# After line 262, before line 268 +# Build user context for AI personalization +user_context = { + 'user_id': current_user.id, + 'user_name': current_user.name, + 'user_email': current_user.email, + 'company_name': current_user.company.name if current_user.company else None, + 'company_id': current_user.company.id if current_user.company else None, + 'company_category': current_user.company.category.name if current_user.company and current_user.company.category else None, + 'company_role': current_user.company_role or 'MEMBER', + 'is_norda_member': current_user.is_norda_member, + 'chamber_role': current_user.chamber_role, + 'member_since': current_user.created_at.strftime('%Y-%m-%d') if current_user.created_at else None, +} +``` + +- [ ] **Step 2: Pass user_context to send_message()** + +In the same function, modify the `chat_engine.send_message()` call (around line 282): + +```python +# Before: +ai_response = chat_engine.send_message( + conversation_id, + user_message=message, + user_id=current_user.id, + thinking_level=thinking_level +) + +# After: +ai_response = chat_engine.send_message( + conversation_id, + user_message=message, + user_id=current_user.id, + thinking_level=thinking_level, + user_context=user_context +) +``` + +- [ ] **Step 3: Update send_message() signature in nordabiz_chat.py** + +In `nordabiz_chat.py`, modify `send_message()` at line 163: + +```python +# Before: +def send_message( + self, + conversation_id: int, + user_message: str, + user_id: int, + thinking_level: str = 'high' +) -> AIChatMessage: + +# After: +def send_message( + self, + conversation_id: int, + user_message: str, + user_id: int, + thinking_level: str = 'high', + user_context: Optional[Dict[str, Any]] = None +) -> AIChatMessage: +``` + +Add `from typing import Optional, Dict, Any` to imports if not already present. 
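The same `user_context` dict gets rebuilt verbatim in the Task 8 streaming endpoint, so it may be worth extracting a shared helper while touching this code. A sketch under that assumption (`build_user_context` is a hypothetical name, not an existing function in this codebase; the stub user is for illustration only):

```python
from datetime import datetime
from types import SimpleNamespace

def build_user_context(user):
    """Hypothetical helper mirroring the user_context keys from Step 1,
    so the non-streaming and streaming routes can share one builder."""
    company = getattr(user, 'company', None)
    return {
        'user_id': user.id,
        'user_name': user.name,
        'user_email': user.email,
        'company_name': company.name if company else None,
        'company_id': company.id if company else None,
        'company_category': company.category.name if company and company.category else None,
        'company_role': getattr(user, 'company_role', None) or 'MEMBER',
        'is_norda_member': getattr(user, 'is_norda_member', False),
        'chamber_role': getattr(user, 'chamber_role', None),
        'member_since': user.created_at.strftime('%Y-%m-%d') if user.created_at else None,
    }

# Smoke check with a stub user who has no company: every company field degrades to None
stub = SimpleNamespace(id=1, name='Anna Kowalska', email='anna@example.com',
                       company=None, company_role=None, is_norda_member=True,
                       chamber_role=None, created_at=datetime(2024, 5, 1))
ctx = build_user_context(stub)
```

With a helper like this, both `chat_send_message()` and the later streaming route would reduce to `user_context = build_user_context(current_user)`.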
+ +- [ ] **Step 4: Thread user_context through to _query_ai()** + +In `send_message()`, find the call to `_query_ai()` (around line 239) and add user_context: + +```python +# Before: +ai_response_text = self._query_ai(context, original_message, user_id=user_id, thinking_level=thinking_level) + +# After: +ai_response_text = self._query_ai(context, original_message, user_id=user_id, thinking_level=thinking_level, user_context=user_context) +``` + +- [ ] **Step 5: Update _query_ai() signature** + +In `nordabiz_chat.py`, modify `_query_ai()` at line 890: + +```python +# Before: +def _query_ai( + self, + context: Dict[str, Any], + user_message: str, + user_id: Optional[int] = None, + thinking_level: str = 'high' +) -> str: + +# After: +def _query_ai( + self, + context: Dict[str, Any], + user_message: str, + user_id: Optional[int] = None, + thinking_level: str = 'high', + user_context: Optional[Dict[str, Any]] = None +) -> str: +``` + +- [ ] **Step 6: Commit** + +```bash +git add blueprints/chat/routes.py nordabiz_chat.py +git commit -m "refactor(chat): thread user_context from route through to _query_ai" +``` + +--- + +### Task 2: Inject user identity into system prompt + +**Files:** +- Modify: `nordabiz_chat.py:920-930` + +- [ ] **Step 1: Add user identity block to system prompt** + +In `nordabiz_chat.py`, inside `_query_ai()`, find line ~922 where `system_prompt` starts. 
Insert the user identity block BEFORE the main system prompt string (after line 921, before line 922): + +```python + # Build user identity section + user_identity = "" + if user_context: + user_identity = f""" +# AKTUALNY UŻYTKOWNIK +Rozmawiasz z: {user_context.get('user_name', 'Nieznany')} +Firma: {user_context.get('company_name', 'brak')} — kategoria: {user_context.get('company_category', 'brak')} +Rola w firmie: {user_context.get('company_role', 'MEMBER')} +Członek Izby Norda Biznes: {'tak' if user_context.get('is_norda_member') else 'nie'} +Rola w Izbie: {user_context.get('chamber_role') or '—'} +Na portalu od: {user_context.get('member_since', 'nieznana data')} + +ZASADY PERSONALIZACJI: +- Zwracaj się do użytkownika po imieniu (pierwsze słowo z imienia i nazwiska) +- W pierwszej wiadomości konwersacji przywitaj się: "Cześć [imię], w czym mogę pomóc?" +- Na pytania "co wiesz o mnie?" / "kim jestem?" — wypisz powyższe dane + powiązania firmowe z bazy +- Uwzględniaj kontekst firmy użytkownika w odpowiedziach (np. sugeruj partnerów z komplementarnych branż) +- NIE ujawniaj danych technicznych (user_id, company_id, rola systemowa) +""" +``` + +- [ ] **Step 2: Prepend user_identity to system_prompt** + +Find where `system_prompt` is first assigned (line 922) and prepend: + +```python + # Line 922 area - the system_prompt f-string starts here + system_prompt = user_identity + f"""Jesteś pomocnym asystentem portalu Norda Biznes... +``` + +This is a minimal change — just concatenate `user_identity` (which is empty string if no context) before the existing prompt. + +- [ ] **Step 3: Verify syntax compiles** + +```bash +python3 -m py_compile nordabiz_chat.py && echo "OK" +``` + +- [ ] **Step 4: Test locally** + +Start local dev server and send a chat message. Verify in logs that the prompt now contains the user identity block. Check that the AI greets by name. 
+ +```bash +python3 app.py +# In another terminal: +curl -X POST http://localhost:5000/api/chat/1/message \ + -H "Content-Type: application/json" \ + -d '{"message": "Kim jestem?"}' +``` + +(Note: requires auth cookie — easier to test via browser) + +- [ ] **Step 5: Commit** + +```bash +git add nordabiz_chat.py +git commit -m "feat(nordagpt): inject user identity into AI system prompt — personalized greetings and context" +``` + +--- + +### Task 3: Deploy Phase 1 and verify + +**Files:** None (deployment only) + +- [ ] **Step 1: Push to remotes** + +```bash +git push origin master && git push inpi master +``` + +- [ ] **Step 2: Deploy to staging** + +```bash +ssh maciejpi@10.22.68.248 "cd /var/www/nordabiznes && sudo -u www-data git pull && sudo systemctl restart nordabiznes" +``` + +- [ ] **Step 3: Test on staging — verify AI greets by name** + +Open https://staging.nordabiznes.pl/chat, start new conversation, type "Cześć". Verify AI responds with your name. + +Type "Co wiesz o mnie?" — verify AI lists your profile data. + +- [ ] **Step 4: Deploy to production** + +```bash +ssh maciejpi@10.22.68.249 "cd /var/www/nordabiznes && sudo -u www-data git pull && sudo systemctl restart nordabiznes" +curl -sI https://nordabiznes.pl/health | head -3 +``` + +- [ ] **Step 5: Commit deployment notes (update release_notes in routes.py)** + +Add new release entry in `blueprints/public/routes.py` `_get_releases()` function. + +--- + +## Phase 2: Smart Router + Context Builder (Tasks 4-7) + +### Task 4: Create context_builder.py — selective data loading + +**Files:** +- Create: `context_builder.py` + +- [ ] **Step 1: Create context_builder.py with selective loading functions** + +```python +""" +Context Builder for NordaGPT Smart Router +========================================== +Loads only the data categories requested by the Smart Router, +instead of loading everything for every query. 
+""" + +import json +import logging +from typing import Dict, Any, List, Optional +from datetime import datetime, timedelta + +from database import ( + SessionLocal, Company, Category, CompanyRecommendation, + NordaEvent, Classified, ForumTopic, ForumReply, + CompanyPerson, Person, User, CompanySocialMedia, + GBPAudit, CompanyWebsiteAnalysis, ZOPKNews, + UserCompanyPermissions +) +from sqlalchemy import func, desc + +logger = logging.getLogger(__name__) + + +def _company_to_compact_dict(company) -> Dict: + """Convert company to compact dict for AI context. Mirrors nordabiz_chat.py format.""" + return { + 'name': company.name, + 'cat': company.category.name if company.category else None, + 'profile': f'/firma/{company.slug}', + 'desc': company.description_short, + 'about': company.description_full[:500] if company.description_full else None, + 'svc': company.services, + 'comp': company.competencies, + 'web': company.website, + 'tel': company.phone, + 'mail': company.email, + 'city': company.city, + } + + +def build_selective_context( + data_needed: List[str], + conversation_id: int, + current_message: str, + user_context: Optional[Dict] = None +) -> Dict[str, Any]: + """ + Build AI context with only the requested data categories. 
+ + Args: + data_needed: List of category strings from Smart Router, e.g.: + ["companies_all", "companies_filtered:IT", "companies_single:termo", + "events", "news", "classifieds", "forum", "company_people", + "registered_users", "social_media", "audits"] + conversation_id: Current conversation ID for history + current_message: User's message text + user_context: User identity dict + + Returns: + Context dict compatible with nordabiz_chat.py _query_ai() + """ + db = SessionLocal() + context = {} + + try: + # Always load: basic stats and conversation history + active_companies = db.query(Company).filter_by(status='active').all() + context['total_companies'] = len(active_companies) + + categories = db.query(Category).all() + context['categories'] = [ + {'name': c.name, 'slug': c.slug, 'company_count': len([co for co in active_companies if co.category_id == c.id])} + for c in categories + ] + + # Conversation history (always loaded) + from database import AIChatMessage, AIChatConversation + messages = db.query(AIChatMessage).filter_by( + conversation_id=conversation_id + ).order_by(AIChatMessage.created_at.desc()).limit(10).all() + context['recent_messages'] = [ + {'role': msg.role, 'content': msg.content} + for msg in reversed(messages) + ] + + # Selective data loading based on router decision + for category in data_needed: + if category == 'companies_all': + context['all_companies'] = [_company_to_compact_dict(c) for c in active_companies] + + elif category.startswith('companies_filtered:'): + filter_cat = category.split(':', 1)[1] + filtered = [c for c in active_companies + if c.category and c.category.name.lower() == filter_cat.lower()] + context['all_companies'] = [_company_to_compact_dict(c) for c in filtered] + + elif category.startswith('companies_single:'): + search = category.split(':', 1)[1].lower() + matched = [c for c in active_companies + if search in c.name.lower() or search in (c.slug or '')] + context['all_companies'] = [_company_to_compact_dict(c) 
for c in matched[:5]] + + elif category == 'events': + events = db.query(NordaEvent).filter( + NordaEvent.event_date >= datetime.now(), + NordaEvent.event_date <= datetime.now() + timedelta(days=60) + ).order_by(NordaEvent.event_date).all() + context['upcoming_events'] = [ + {'title': e.title, 'date': str(e.event_date), 'type': e.event_type, + 'location': e.location, 'url': f'/kalendarz/{e.id}'} + for e in events + ] + + elif category == 'news': + news = db.query(ZOPKNews).filter( + ZOPKNews.published_at >= datetime.now() - timedelta(days=30), + ZOPKNews.status == 'approved' + ).order_by(ZOPKNews.published_at.desc()).limit(10).all() + context['recent_news'] = [ + {'title': n.title, 'summary': n.ai_summary, 'date': str(n.published_at), + 'source': n.source_name, 'url': n.source_url} + for n in news + ] + + elif category == 'classifieds': + classifieds = db.query(Classified).filter( + Classified.status == 'active', + Classified.is_test == False + ).order_by(Classified.created_at.desc()).limit(20).all() + context['classifieds'] = [ + {'type': c.listing_type, 'title': c.title, 'description': c.description, + 'company': c.company.name if c.company else None, + 'budget': c.budget_text, 'url': f'/b2b/{c.id}'} + for c in classifieds + ] + + elif category == 'forum': + topics = db.query(ForumTopic).filter( + ForumTopic.is_test == False + ).order_by(ForumTopic.created_at.desc()).limit(15).all() + context['forum_topics'] = [ + {'title': t.title, 'content': t.content[:300], + 'author': t.author.name if t.author else None, + 'replies': t.reply_count, 'url': f'/forum/{t.slug}'} + for t in topics + ] + + elif category == 'company_people': + people_query = db.query(CompanyPerson).join(Person).join(Company).filter( + Company.status == 'active' + ).all() + grouped = {} + for cp in people_query: + cname = cp.company.name + if cname not in grouped: + grouped[cname] = [] + grouped[cname].append({ + 'name': cp.person.name, + 'role': cp.role_description, + 'shares': cp.shares_value + }) 
+ context['company_people'] = grouped + + elif category == 'registered_users': + users = db.query(User).filter( + User.is_active == True, + User.company_id.isnot(None) + ).all() + grouped = {} + for u in users: + cname = u.company.name if u.company else 'Brak firmy' + if cname not in grouped: + grouped[cname] = [] + grouped[cname].append({ + 'name': u.name, 'email': u.email, + 'role': u.company_role, 'member': u.is_norda_member + }) + context['registered_users'] = grouped + + elif category == 'social_media': + socials = db.query(CompanySocialMedia).filter_by(is_valid=True).all() + grouped = {} + for s in socials: + cname = s.company.name if s.company else 'Unknown' + if cname not in grouped: + grouped[cname] = [] + grouped[cname].append({ + 'platform': s.platform, 'url': s.url, + 'followers': s.followers_count + }) + context['company_social_media'] = grouped + + elif category == 'audits': + # GBP audits + gbp = db.query(GBPAudit).order_by(GBPAudit.created_at.desc()).all() + seen = set() + gbp_unique = [] + for g in gbp: + if g.company_id not in seen: + seen.add(g.company_id) + gbp_unique.append({ + 'company': g.company.name if g.company else None, + 'score': g.overall_score, 'reviews': g.total_reviews, + 'rating': g.average_rating + }) + context['gbp_audits'] = gbp_unique + + # SEO audits + seo = db.query(CompanyWebsiteAnalysis).all() + context['seo_audits'] = [ + {'company': s.company.name if s.company else None, + 'seo': s.seo_score, 'performance': s.performance_score} + for s in seo + ] + + # If no companies were loaded by any category, load a minimal summary + if 'all_companies' not in context: + context['all_companies'] = [] + + finally: + db.close() + + return context +``` + +- [ ] **Step 2: Verify syntax** + +```bash +python3 -m py_compile context_builder.py && echo "OK" +``` + +- [ ] **Step 3: Commit** + +```bash +git add context_builder.py +git commit -m "feat(nordagpt): add context_builder.py — selective data loading for smart router" +``` + +--- + +### 
Task 5: Create smart_router.py — query classification
+
+**Files:**
+- Create: `smart_router.py`
+
+- [ ] **Step 1: Create smart_router.py**
+
+```python
+"""
+Smart Router for NordaGPT
+==========================
+Classifies query complexity and selects which data categories to load.
+Uses Gemini 3.1 Flash-Lite for fast, cheap classification (~1-2s).
+"""
+
+import json
+import logging
+import time
+from typing import Dict, Any, List, Optional
+
+logger = logging.getLogger(__name__)
+
+# Keyword-based fast routing (no API call needed)
+# Keywords must be lowercase: they are matched against the lowercased message
+FAST_ROUTES = {
+    'companies_all': ['wszystkie firmy', 'ile firm', 'lista firm', 'katalog', 'porównaj firmy'],
+    'events': ['wydarzenie', 'spotkanie', 'kalendarz', 'konferencja', 'szkolenie', 'kiedy'],
+    'news': ['aktualności', 'nowości', 'wiadomości', 'pej', 'atom', 'elektrownia', 'zopk'],
+    'classifieds': ['ogłoszenie', 'b2b', 'zlecenie', 'oferta', 'szukam', 'oferuję'],
+    'forum': ['forum', 'dyskusja', 'temat', 'wątek', 'post'],
+    'company_people': ['zarząd', 'krs', 'właściciel', 'prezes', 'udziały', 'wspólnik'],
+    'registered_users': ['użytkownik', 'kto jest', 'profil', 'zarejestrowany', 'członek'],
+    'social_media': ['facebook', 'instagram', 'linkedin', 'social media', 'media społeczn'],
+    'audits': ['seo', 'google', 'gbp', 'opinie', 'ocena', 'pagespeed'],
+}
+
+# Model selection by complexity
+MODEL_MAP = {
+    'simple': {'model': '3.1-flash-lite', 'thinking': 'minimal'},
+    'medium': {'model': '3-flash', 'thinking': 'low'},
+    'complex': {'model': '3-flash', 'thinking': 'high'},
+}
+
+ROUTER_PROMPT = """Jesteś routerem zapytań. Przeanalizuj pytanie i zdecyduj jakie dane są potrzebne.
+
+Użytkownik: {user_name} z firmy {company_name}
+Pytanie: {message}
+
+Zwróć TYLKO JSON (bez markdown):
+{{
+  "complexity": "simple|medium|complex",
+  "data_needed": ["lista kategorii z poniższych"]
+}}
+
+Kategorie:
+- companies_all — wszystkie firmy (porównania, przeglądy, "ile firm")
+- companies_filtered:KATEGORIA — firmy z kategorii (np. 
companies_filtered:IT)
+- companies_single:NAZWA — jedna firma (np. companies_single:termo)
+- events — nadchodzące wydarzenia
+- news — aktualności, PEJ, ZOPK
+- classifieds — ogłoszenia B2B
+- forum — tematy forum
+- company_people — zarząd, KRS, udziałowcy
+- registered_users — użytkownicy portalu
+- social_media — profile social media firm
+- audits — wyniki SEO/GBP
+
+Zasady:
+- "simple" = jedno pytanie o konkretną rzecz (telefon, adres, link)
+- "medium" = porównanie, lista, filtrowanie
+- "complex" = analiza, strategia, rekomendacje
+- Wybierz MINIMUM kategorii. Nie ładuj niepotrzebnych danych.
+- Jeśli pytanie dotyczy konkretnej firmy, użyj companies_single:nazwa
+- Pytania ogólne o użytkownika (kim jestem, co wiesz) = [] (dane z profilu wystarczą)
+"""
+
+
+def route_query_fast(message: str, user_context: Optional[Dict] = None) -> Optional[Dict[str, Any]]:
+    """
+    Fast keyword-based routing. No API call.
+    Returns routing decision or None if uncertain (needs AI router).
+    """
+    msg_lower = message.lower()
+
+    # Check for personal questions — no data needed
+    personal_patterns = ['kim jestem', 'co wiesz o mnie', 'mój profil', 'moje dane']
+    if any(p in msg_lower for p in personal_patterns):
+        return {
+            'complexity': 'simple',
+            'data_needed': [],
+            'model': '3.1-flash-lite',
+            'thinking': 'minimal',
+            'routed_by': 'fast'
+        }
+
+    # Check for greetings — no data needed
+    greeting_patterns = ['cześć', 'hej', 'witam', 'dzień dobry', 'siema', 'hello']
+    if any(msg_lower.strip().startswith(p) for p in greeting_patterns) and len(message) < 30:
+        return {
+            'complexity': 'simple',
+            'data_needed': [],
+            'model': '3.1-flash-lite',
+            'thinking': 'minimal',
+            'routed_by': 'fast'
+        }
+
+    # Check keyword matches
+    matched_categories = []
+    for category, keywords in FAST_ROUTES.items():
+        if any(kw in msg_lower for kw in keywords):
+            matched_categories.append(category)
+
+    # Check for specific company name mention
+    # Simple heuristic: if message has quotes or specific 
capitalized words + if not matched_categories: + # Can't determine — return None to trigger AI router + return None + + # Determine complexity + if len(matched_categories) <= 1 and len(message) < 80: + complexity = 'simple' + elif len(matched_categories) <= 2: + complexity = 'medium' + else: + complexity = 'complex' + + model_config = MODEL_MAP[complexity] + return { + 'complexity': complexity, + 'data_needed': matched_categories, + 'model': model_config['model'], + 'thinking': model_config['thinking'], + 'routed_by': 'fast' + } + + +def route_query_ai( + message: str, + user_context: Optional[Dict] = None, + gemini_service=None +) -> Dict[str, Any]: + """ + AI-powered routing using Flash-Lite. Called when fast routing is uncertain. + """ + if not gemini_service: + # Fallback: load everything + return _fallback_route() + + user_name = user_context.get('user_name', 'Nieznany') if user_context else 'Nieznany' + company_name = user_context.get('company_name', 'brak') if user_context else 'brak' + + prompt = ROUTER_PROMPT.format( + user_name=user_name, + company_name=company_name, + message=message + ) + + try: + start = time.time() + response = gemini_service.generate_text( + prompt=prompt, + temperature=0.1, + max_tokens=200, + model='gemini-3.1-flash-lite-preview', + thinking_level='minimal', + feature='smart_router' + ) + latency = int((time.time() - start) * 1000) + logger.info(f"Smart Router AI response in {latency}ms: {response[:200]}") + + # Parse JSON from response + # Handle potential markdown wrapping + text = response.strip() + if text.startswith('```'): + text = text.split('\n', 1)[1].rsplit('```', 1)[0].strip() + + result = json.loads(text) + complexity = result.get('complexity', 'medium') + model_config = MODEL_MAP.get(complexity, MODEL_MAP['medium']) + + return { + 'complexity': complexity, + 'data_needed': result.get('data_needed', []), + 'model': model_config['model'], + 'thinking': model_config['thinking'], + 'routed_by': 'ai', + 'router_latency_ms': 
latency + } + + except (json.JSONDecodeError, KeyError, Exception) as e: + logger.warning(f"Smart Router AI failed: {e}, falling back to full context") + return _fallback_route() + + +def route_query( + message: str, + user_context: Optional[Dict] = None, + gemini_service=None +) -> Dict[str, Any]: + """ + Main entry point. Tries fast routing first, falls back to AI routing. + """ + # Try fast keyword-based routing + result = route_query_fast(message, user_context) + if result is not None: + logger.info(f"Smart Router FAST: complexity={result['complexity']}, data={result['data_needed']}") + return result + + # Fall back to AI routing + result = route_query_ai(message, user_context, gemini_service) + logger.info(f"Smart Router AI: complexity={result['complexity']}, data={result['data_needed']}") + return result + + +def _fallback_route() -> Dict[str, Any]: + """Fallback: load everything, use default model. Safe but slow.""" + return { + 'complexity': 'medium', + 'data_needed': [ + 'companies_all', 'events', 'news', 'classifieds', + 'forum', 'company_people', 'registered_users' + ], + 'model': '3-flash', + 'thinking': 'low', + 'routed_by': 'fallback' + } +``` + +- [ ] **Step 2: Verify syntax** + +```bash +python3 -m py_compile smart_router.py && echo "OK" +``` + +- [ ] **Step 3: Commit** + +```bash +git add smart_router.py +git commit -m "feat(nordagpt): add smart_router.py — fast keyword routing + AI fallback" +``` + +--- + +### Task 6: Integrate Smart Router into nordabiz_chat.py + +**Files:** +- Modify: `nordabiz_chat.py:163-282, 347-643, 890-1365` + +- [ ] **Step 1: Add imports at top of nordabiz_chat.py** + +After existing imports (around line 30), add: + +```python +from smart_router import route_query +from context_builder import build_selective_context +``` + +- [ ] **Step 2: Modify send_message() to use Smart Router** + +In `send_message()`, replace the call to `_build_conversation_context()` and `_query_ai()` (around lines 236-239). 
The key change: use the router to decide model and data, then use context_builder for selective loading. + +Find the section where context is built and AI is queried (around lines 236-241): + +```python +# Before (approximately lines 236-241): +# context = self._build_conversation_context(db, conversation, original_message) +# ai_response_text = self._query_ai(context, original_message, user_id=user_id, thinking_level=thinking_level, user_context=user_context) + +# After: +# Smart Router — classify query and select data + model +route_decision = route_query( + message=original_message, + user_context=user_context, + gemini_service=self.gemini_service +) + +# Override model and thinking based on router decision +effective_model = route_decision.get('model', '3-flash') +effective_thinking = route_decision.get('thinking', thinking_level) + +# Build selective context (only requested data categories) +context = build_selective_context( + data_needed=route_decision.get('data_needed', []), + conversation_id=conversation.id, + current_message=original_message, + user_context=user_context +) + +# Use the original _query_ai but with router-selected parameters +ai_response_text = self._query_ai( + context, original_message, + user_id=user_id, + thinking_level=effective_thinking, + user_context=user_context +) +``` + +Note: Keep `_build_conversation_context()` and full `_query_ai()` intact as fallback. The router's `_fallback_route()` loads all data, so it's safe. 
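Because the fast-routing path is pure string logic, the router decision shape can be unit-tested without any API calls or app context. A standalone approximation for illustration: it inlines a two-category slice of the Task 5 keyword table instead of importing `smart_router`, which only exists once Task 5 is done:

```python
# Standalone approximation of route_query_fast's greeting/keyword branches
# (inlined here because smart_router.py is created elsewhere in this plan).
FAST_ROUTES = {
    'events': ['wydarzenie', 'kalendarz'],
    'classifieds': ['ogłoszenie', 'b2b'],
}

def route_query_fast(message):
    msg = message.lower()
    # Short greetings need no portal data at all
    if any(msg.strip().startswith(g) for g in ('cześć', 'hej', 'witam')) and len(message) < 30:
        return {'complexity': 'simple', 'data_needed': [], 'routed_by': 'fast'}
    matched = [cat for cat, kws in FAST_ROUTES.items() if any(k in msg for k in kws)]
    if not matched:
        return None  # uncertain: caller falls back to the AI router
    return {'complexity': 'simple' if len(matched) == 1 else 'medium',
            'data_needed': matched, 'routed_by': 'fast'}

assert route_query_fast('Cześć!')['data_needed'] == []
assert route_query_fast('Jakie wydarzenie jest w kalendarzu?') == {
    'complexity': 'simple', 'data_needed': ['events'], 'routed_by': 'fast'}
assert route_query_fast('Strategia ekspansji?') is None  # AI router takes over
```

A real test module for Task 5 would follow the same pattern against the actual `smart_router.route_query_fast`, including the `None` fallback case that triggers the AI router.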
+
+- [ ] **Step 3: Log routing decisions**
+
+After the route_query call, add logging:
+
+```python
+logger.info(
+    f"NordaGPT Router: user={user_context.get('user_name') if user_context else '?'}, "
+    f"complexity={route_decision['complexity']}, model={effective_model}, "
+    f"thinking={effective_thinking}, data={route_decision['data_needed']}, "
+    f"routed_by={route_decision.get('routed_by')}"
+)
+```
+
+- [ ] **Step 4: Update the GeminiService call in _query_ai() to use effective model**
+
+Currently `_query_ai()` uses `self.gemini_service`, which has a fixed model. We need to pass the router-selected model to the generate_text call, but `route_decision` lives in `send_message()`, not in `_query_ai()`. The cleanest way to bridge that gap is to pass the decision through the context dict.
+
+In `send_message()`, add to context before calling `_query_ai()`:
+```python
+context['_route_decision'] = route_decision
+```
+
+In `_query_ai()`, read it at the generate_text call (around line 1352):
+```python
+route = context.get('_route_decision', {})
+effective_model_id = None
+model_alias = route.get('model')
+if model_alias:
+    from gemini_service import GEMINI_MODELS
+    effective_model_id = GEMINI_MODELS.get(model_alias)
+
+response = self.gemini_service.generate_text(
+    prompt=full_prompt,
+    temperature=0.7,
+    thinking_level=thinking_level,
+    user_id=user_id,
+    feature='chat',
+    model=effective_model_id
+)
+```
+
+- [ ] **Step 5: Verify syntax**
+
+```bash
+python3 -m py_compile nordabiz_chat.py && echo "OK"
+```
+
+- [ ] **Step 6: Commit**
+
+```bash
+git add nordabiz_chat.py
+git commit -m 
"feat(nordagpt): integrate smart router — selective context loading + adaptive model selection"
+```
+
+---
+
+### Task 7: Deploy Phase 2 and verify
+
+- [ ] **Step 1: Push and deploy to staging**
+
+```bash
+git push origin master && git push inpi master
+ssh maciejpi@10.22.68.248 "cd /var/www/nordabiznes && sudo -u www-data git pull && sudo systemctl restart nordabiznes"
+```
+
+- [ ] **Step 2: Test on staging — verify routing works**
+
+Test simple query: "Jaki jest telefon do TERMO?" — should be fast (2-3s), Flash-Lite model.
+Test medium query: "Porównaj firmy budowlane w Izbie" — should load companies_all, medium speed.
+Test complex query: "Jakie firmy mogłyby współpracować przy projekcie PEJ?" — should use full context.
+
+Check logs for routing decisions:
+```bash
+ssh maciejpi@10.22.68.248 "journalctl -u nordabiznes -n 30 --no-pager | grep 'Router'"
+```
+
+- [ ] **Step 3: Deploy to production**
+
+```bash
+ssh maciejpi@10.22.68.249 "cd /var/www/nordabiznes && sudo -u www-data git pull && sudo systemctl restart nordabiznes"
+curl -sI https://nordabiznes.pl/health | head -3
+```
+
+---
+
+## Phase 3: Streaming Responses (Tasks 8-10)
+
+### Task 8: Add streaming endpoint in Flask
+
+**Files:**
+- Modify: `blueprints/chat/routes.py`
+- Modify: `nordabiz_chat.py`
+
+- [ ] **Step 1: Add SSE streaming endpoint**
+
+In `blueprints/chat/routes.py`, add a new route after `chat_send_message()` (after line ~309):
+
+```python
+@bp.route('/api/chat/<int:conversation_id>/message/stream', methods=['POST'])
+@login_required
+@member_required
+def chat_send_message_stream(conversation_id):
+    """Send message to AI chat with streaming response (SSE)"""
+    from flask import Response, stream_with_context
+    import json as json_module
+
+    data = request.get_json()
+    if not data or not data.get('message', '').strip():
+        return jsonify({'error': 'Wiadomość nie może być pusta'}), 400
+
+    message = data['message'].strip()
+
+    # Check limits
+    from nordabiz_chat import check_user_limits
+    limit_result = 
check_user_limits(current_user.id, current_user.email) + if limit_result.get('limited'): + return jsonify({'error': 'Przekroczono limit', 'limit_info': limit_result}), 429 + + # Build user context + user_context = { + 'user_id': current_user.id, + 'user_name': current_user.name, + 'user_email': current_user.email, + 'company_name': current_user.company.name if current_user.company else None, + 'company_id': current_user.company.id if current_user.company else None, + 'company_category': current_user.company.category.name if current_user.company and current_user.company.category else None, + 'company_role': current_user.company_role or 'MEMBER', + 'is_norda_member': current_user.is_norda_member, + 'chamber_role': current_user.chamber_role, + 'member_since': current_user.created_at.strftime('%Y-%m-%d') if current_user.created_at else None, + } + + model_choice = data.get('model') or session.get('chat_model', 'flash') + model_key = '3-flash' if model_choice == 'flash' else '3-pro' + + def generate(): + try: + chat_engine = NordaBizChatEngine(model=model_key) + for chunk in chat_engine.send_message_stream( + conversation_id=conversation_id, + user_message=message, + user_id=current_user.id, + user_context=user_context + ): + yield f"data: {json_module.dumps(chunk, ensure_ascii=False)}\n\n" + except PermissionError: + yield f"data: {json_module.dumps({'type': 'error', 'content': 'Brak dostępu do tej konwersacji'})}\n\n" + except Exception as e: + logger.error(f"Streaming error: {e}") + yield f"data: {json_module.dumps({'type': 'error', 'content': 'Wystąpił błąd'})}\n\n" + + return Response( + stream_with_context(generate()), + mimetype='text/event-stream', + headers={ + 'Cache-Control': 'no-cache', + 'X-Accel-Buffering': 'no', # Disable Nginx buffering + } + ) +``` + +- [ ] **Step 2: Add send_message_stream() to NordaBizChatEngine** + +In `nordabiz_chat.py`, add a new method after `send_message()` (after line ~282): + +```python +def send_message_stream( + self, + 
conversation_id: int, + user_message: str, + user_id: int, + user_context: Optional[Dict[str, Any]] = None +): + """ + Generator that yields streaming chunks for SSE. + Yields dicts: {'type': 'thinking'|'token'|'done'|'error', 'content': '...'} + """ + import time + + db = SessionLocal() + try: + conversation = db.query(AIChatConversation).filter_by( + id=conversation_id, user_id=user_id + ).first() + if not conversation: + yield {'type': 'error', 'content': 'Konwersacja nie znaleziona'} + return + + # Save user message + original_message = user_message + sanitized = self._sanitize_message(user_message) + user_msg = AIChatMessage( + conversation_id=conversation_id, + role='user', + content=sanitized + ) + db.add(user_msg) + db.commit() + + # Smart Router + route_decision = route_query( + message=original_message, + user_context=user_context, + gemini_service=self.gemini_service + ) + + yield {'type': 'thinking', 'content': 'Analizuję pytanie...'} + + # Build selective context + context = build_selective_context( + data_needed=route_decision.get('data_needed', []), + conversation_id=conversation.id, + current_message=original_message, + user_context=user_context + ) + context['_route_decision'] = route_decision + + # Build prompt (reuse _query_ai logic for prompt building) + full_prompt = self._build_prompt(context, original_message, user_context, route_decision.get('thinking', 'low')) + + # Get effective model + from gemini_service import GEMINI_MODELS + model_alias = route_decision.get('model', '3-flash') + effective_model = GEMINI_MODELS.get(model_alias, self.model_name) + + # Stream from Gemini + start_time = time.time() + stream_response = self.gemini_service.generate_text( + prompt=full_prompt, + temperature=0.7, + stream=True, + thinking_level=route_decision.get('thinking', 'low'), + user_id=user_id, + feature='chat_stream', + model=effective_model + ) + + full_text = "" + for chunk in stream_response: + if hasattr(chunk, 'text') and chunk.text: + full_text 
+= chunk.text + yield {'type': 'token', 'content': chunk.text} + + latency_ms = int((time.time() - start_time) * 1000) + + # Save AI response to DB + ai_msg = AIChatMessage( + conversation_id=conversation_id, + role='assistant', + content=full_text, + latency_ms=latency_ms + ) + db.add(ai_msg) + conversation.updated_at = datetime.now() + conversation.message_count = (conversation.message_count or 0) + 2 + db.commit() + + yield { + 'type': 'done', + 'message_id': ai_msg.id, + 'latency_ms': latency_ms, + 'model': model_alias, + 'complexity': route_decision.get('complexity') + } + + except Exception as e: + logger.error(f"Stream error: {e}", exc_info=True) + yield {'type': 'error', 'content': 'Wystąpił błąd podczas generowania odpowiedzi'} + finally: + db.close() +``` + +- [ ] **Step 3: Extract prompt building into reusable method** + +Add a `_build_prompt()` method to `NordaBizChatEngine` that extracts prompt construction from `_query_ai()`. This method builds the full prompt string without calling Gemini: + +```python +def _build_prompt( + self, + context: Dict[str, Any], + user_message: str, + user_context: Optional[Dict[str, Any]] = None, + thinking_level: str = 'low' +) -> str: + """Build the full prompt string. 
Extracted from _query_ai() for reuse in streaming.""" + # Build user identity section + user_identity = "" + if user_context: + user_identity = f""" +# AKTUALNY UŻYTKOWNIK +Rozmawiasz z: {user_context.get('user_name', 'Nieznany')} +Firma: {user_context.get('company_name', 'brak')} — kategoria: {user_context.get('company_category', 'brak')} +Rola w firmie: {user_context.get('company_role', 'MEMBER')} +Członek Izby: {'tak' if user_context.get('is_norda_member') else 'nie'} +Rola w Izbie: {user_context.get('chamber_role') or '—'} +Na portalu od: {user_context.get('member_since', 'nieznana data')} +""" + + # Reuse the existing system_prompt from _query_ai() lines 922-1134 + # This is the same static prompt — extract it to a class attribute or method + # For now, call _query_ai's prompt logic + # NOTE: In implementation, refactor the static prompt into a separate method + # to avoid duplication. The key point is that _build_prompt returns the + # same prompt string that _query_ai would build. + + # ... (reuse existing system prompt construction logic) ... + + return full_prompt +``` + +**Implementation note:** The actual implementation should refactor `_query_ai()` to call `_build_prompt()` internally, then the streaming method also calls `_build_prompt()`. This avoids prompt duplication. 
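+
+The SSE framing produced by `generate()` in Step 1 (`data: <json>\n\n` per event) is easy to mishandle on the client, because a single network read can end in the middle of a frame. The sketch below is a standalone model of that wire format, not part of the codebase: it serializes chunk dicts the same way the endpoint does, then shows that a client which buffers partial data reassembles events correctly even when a read boundary splits a frame.

```python
import json


def sse_frames(chunks):
    """Serialize chunk dicts the way the endpoint's generate() does."""
    for chunk in chunks:
        yield f"data: {json.dumps(chunk, ensure_ascii=False)}\n\n"


def parse_sse(parts):
    """Reassemble events even when reads split a frame mid-line."""
    buffer = ""
    for part in parts:
        buffer += part
        # A complete SSE event ends with a blank line ("\n\n")
        while "\n\n" in buffer:
            frame, buffer = buffer.split("\n\n", 1)
            if frame.startswith("data: "):
                yield json.loads(frame[6:])


frames = "".join(sse_frames([
    {'type': 'token', 'content': 'Dzień '},
    {'type': 'done', 'latency_ms': 1200},
]))
# Simulate a network read boundary falling inside the first frame
stream = [frames[:10], frames[10:]]
events = list(parse_sse(stream))
```

The JavaScript client in Task 9 must apply the same buffering idea (keep the incomplete tail of each read for the next iteration); splitting each read on `\n` in isolation silently drops tokens whose frame straddles two reads.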
+ +- [ ] **Step 4: Verify syntax** + +```bash +python3 -m py_compile nordabiz_chat.py && python3 -m py_compile blueprints/chat/routes.py && echo "OK" +``` + +- [ ] **Step 5: Commit** + +```bash +git add nordabiz_chat.py blueprints/chat/routes.py +git commit -m "feat(nordagpt): add streaming SSE endpoint + send_message_stream method" +``` + +--- + +### Task 9: Frontend streaming UI + +**Files:** +- Modify: `templates/chat.html` + +- [ ] **Step 1: Add streaming sendMessage function** + +In `templates/chat.html`, replace the existing `sendMessage()` function (lines 2373-2454) with a streaming version: + +```javascript +async function sendMessage() { + const input = document.getElementById('messageInput'); + const message = input.value.trim(); + if (!message || isSending) return; + + isSending = true; + document.getElementById('sendBtn').disabled = true; + input.value = ''; + autoResizeTextarea(); + + // Add user message to chat + addMessage('user', message); + + // Create conversation if needed + if (!currentConversationId) { + try { + const startRes = await fetch('/api/chat/start', { + method: 'POST', + headers: {'Content-Type': 'application/json', 'X-CSRFToken': csrfToken}, + body: JSON.stringify({title: message.substring(0, 50)}) + }); + const startData = await startRes.json(); + currentConversationId = startData.conversation_id; + } catch (e) { + addMessage('assistant', 'Błąd tworzenia konwersacji.'); + isSending = false; + document.getElementById('sendBtn').disabled = false; + return; + } + } + + // Add empty assistant bubble with thinking animation + const msgDiv = document.createElement('div'); + msgDiv.className = 'message assistant'; + msgDiv.innerHTML = ` +
        <div class="message-avatar">AI</div>
+        <div class="message-content">
+            <div class="thinking-dots">
+                <span>.</span><span>.</span><span>.</span>
+            </div>
+        </div>
+    `;
+    document.getElementById('chatMessages').appendChild(msgDiv);
+    scrollToBottom();
+
+    const contentDiv = msgDiv.querySelector('.message-content');
+
+    try {
+        const response = await fetch(`/api/chat/${currentConversationId}/message/stream`, {
+            method: 'POST',
+            headers: {'Content-Type': 'application/json', 'X-CSRFToken': csrfToken},
+            body: JSON.stringify({message: message, model: currentModel})
+        });
+
+        if (response.status === 429) {
+            contentDiv.innerHTML = '';
+            contentDiv.textContent = 'Przekroczono limit zapytań.';
+            showLimitBanner();
+            isSending = false;
+            document.getElementById('sendBtn').disabled = false;
+            return;
+        }
+
+        const reader = response.body.getReader();
+        const decoder = new TextDecoder();
+        let fullText = '';
+        let thinkingRemoved = false;
+        let buffer = '';  // holds an incomplete SSE line across reads
+
+        while (true) {
+            const {done, value} = await reader.read();
+            if (done) break;
+
+            // A read can end mid-line, so keep the unfinished tail for the next read
+            buffer += decoder.decode(value, {stream: true});
+            const lines = buffer.split('\n');
+            buffer = lines.pop();
+
+            for (const line of lines) {
+                if (!line.startsWith('data: ')) continue;
+                try {
+                    const chunk = JSON.parse(line.slice(6));
+
+                    if (chunk.type === 'thinking') {
+                        // Keep thinking dots visible
+                        continue;
+                    }
+
+                    if (chunk.type === 'token') {
+                        if (!thinkingRemoved) {
+                            contentDiv.innerHTML = '';
+                            thinkingRemoved = true;
+                        }
+                        fullText += chunk.content;
+                        contentDiv.innerHTML = formatMessage(fullText);
+                        scrollToBottom();
+                    }
+
+                    if (chunk.type === 'done') {
+                        // Add tech info badge
+                        if (chunk.latency_ms) {
+                            const badge = document.createElement('div');
+                            badge.className = 'thinking-info-badge';
+                            badge.textContent = `${chunk.model || 'AI'} · ${(chunk.latency_ms/1000).toFixed(1)}s`;
+                            msgDiv.appendChild(badge);
+                        }
+                        loadConversations();
+                    }
+
+                    if (chunk.type === 'error') {
+                        contentDiv.innerHTML = '';
+                        contentDiv.textContent = chunk.content || 'Wystąpił błąd';
+                    }
+                } catch (e) {
+                    // Skip malformed chunks
+                }
+            }
+        }
+    } catch (e) {
+        contentDiv.innerHTML = '';
+        contentDiv.textContent = 'Błąd połączenia z serwerem.';
+    }
+ + isSending = false; + document.getElementById('sendBtn').disabled = false; +} +``` + +- [ ] **Step 2: Add CSS for thinking animation** + +In `templates/chat.html`, in the `{% block extra_css %}` section, add: + +```css +.thinking-dots { + display: flex; + gap: 4px; + padding: 8px 0; +} + +.thinking-dots span { + animation: thinkBounce 1.4s infinite ease-in-out both; + font-size: 1.5rem; + color: var(--text-secondary); +} + +.thinking-dots span:nth-child(1) { animation-delay: -0.32s; } +.thinking-dots span:nth-child(2) { animation-delay: -0.16s; } +.thinking-dots span:nth-child(3) { animation-delay: 0s; } + +@keyframes thinkBounce { + 0%, 80%, 100% { transform: scale(0); } + 40% { transform: scale(1); } +} +``` + +- [ ] **Step 3: Verify locally and commit** + +```bash +python3 -m py_compile app.py && echo "OK" +git add templates/chat.html +git commit -m "feat(nordagpt): streaming UI — word-by-word response with thinking animation" +``` + +--- + +### Task 10: Deploy Phase 3 and verify streaming + +- [ ] **Step 1: Check Nginx/NPM config for SSE support** + +SSE requires Nginx to NOT buffer the response. The streaming endpoint sets `X-Accel-Buffering: no` header. Verify NPM custom config allows this: + +```bash +ssh maciejpi@10.22.68.249 "cat /etc/nginx/sites-enabled/nordabiznes.conf 2>/dev/null || echo 'Using NPM proxy'" +``` + +If using NPM, the `X-Accel-Buffering: no` header should be sufficient. If not, add to NPM custom Nginx config for nordabiznes.pl: +``` +proxy_buffering off; +proxy_cache off; +``` + +- [ ] **Step 2: Push, deploy to staging, test streaming** + +```bash +git push origin master && git push inpi master +ssh maciejpi@10.22.68.248 "cd /var/www/nordabiznes && sudo -u www-data git pull && sudo systemctl restart nordabiznes" +``` + +Test on staging: open chat, send message, verify text appears word-by-word. 
+ +- [ ] **Step 3: Deploy to production** + +```bash +ssh maciejpi@10.22.68.249 "cd /var/www/nordabiznes && sudo -u www-data git pull && sudo systemctl restart nordabiznes" +curl -sI https://nordabiznes.pl/health | head -3 +``` + +--- + +## Phase 4: Persistent User Memory (Tasks 11-15) + +### Task 11: Database migration — memory tables + +**Files:** +- Create: `database/migrations/092_ai_user_memory.sql` +- Create: `database/migrations/093_ai_conversation_summary.sql` + +- [ ] **Step 1: Create migration 092** + +```sql +-- 092_ai_user_memory.sql +-- Persistent memory for NordaGPT — per-user facts extracted from conversations + +CREATE TABLE IF NOT EXISTS ai_user_memory ( + id SERIAL PRIMARY KEY, + user_id INTEGER NOT NULL REFERENCES users(id) ON DELETE CASCADE, + fact TEXT NOT NULL, + category VARCHAR(50) DEFAULT 'general', + source_conversation_id INTEGER REFERENCES ai_chat_conversations(id) ON DELETE SET NULL, + confidence FLOAT DEFAULT 1.0, + created_at TIMESTAMP DEFAULT NOW(), + expires_at TIMESTAMP DEFAULT (NOW() + INTERVAL '12 months'), + is_active BOOLEAN DEFAULT TRUE +); + +CREATE INDEX idx_ai_user_memory_user_active ON ai_user_memory(user_id, is_active, confidence DESC); +CREATE INDEX idx_ai_user_memory_expires ON ai_user_memory(expires_at) WHERE is_active = TRUE; + +GRANT ALL ON TABLE ai_user_memory TO nordabiz_app; +GRANT USAGE, SELECT ON SEQUENCE ai_user_memory_id_seq TO nordabiz_app; +``` + +- [ ] **Step 2: Create migration 093** + +```sql +-- 093_ai_conversation_summary.sql +-- Auto-generated summaries of AI conversations for memory context + +CREATE TABLE IF NOT EXISTS ai_conversation_summary ( + id SERIAL PRIMARY KEY, + conversation_id INTEGER NOT NULL UNIQUE REFERENCES ai_chat_conversations(id) ON DELETE CASCADE, + user_id INTEGER NOT NULL REFERENCES users(id) ON DELETE CASCADE, + summary TEXT NOT NULL, + key_topics JSONB DEFAULT '[]', + created_at TIMESTAMP DEFAULT NOW(), + updated_at TIMESTAMP DEFAULT NOW() +); + +CREATE INDEX 
idx_ai_conv_summary_user ON ai_conversation_summary(user_id, created_at DESC); + +GRANT ALL ON TABLE ai_conversation_summary TO nordabiz_app; +GRANT USAGE, SELECT ON SEQUENCE ai_conversation_summary_id_seq TO nordabiz_app; +``` + +- [ ] **Step 3: Commit migrations** + +```bash +git add database/migrations/092_ai_user_memory.sql database/migrations/093_ai_conversation_summary.sql +git commit -m "feat(nordagpt): add migrations for user memory and conversation summary tables" +``` + +--- + +### Task 12: Add SQLAlchemy models + +**Files:** +- Modify: `database.py` (insert before line 5954) + +- [ ] **Step 1: Add AIUserMemory model** + +Insert before the `# DATABASE INITIALIZATION` comment (line 5954): + +```python +class AIUserMemory(Base): + __tablename__ = 'ai_user_memory' + + id = Column(Integer, primary_key=True) + user_id = Column(Integer, ForeignKey('users.id', ondelete='CASCADE'), nullable=False) + fact = Column(Text, nullable=False) + category = Column(String(50), default='general') + source_conversation_id = Column(Integer, ForeignKey('ai_chat_conversations.id', ondelete='SET NULL'), nullable=True) + confidence = Column(Float, default=1.0) + created_at = Column(DateTime, default=datetime.utcnow) + expires_at = Column(DateTime) + is_active = Column(Boolean, default=True) + + user = relationship('User') + source_conversation = relationship('AIChatConversation') + + +class AIConversationSummary(Base): + __tablename__ = 'ai_conversation_summary' + + id = Column(Integer, primary_key=True) + conversation_id = Column(Integer, ForeignKey('ai_chat_conversations.id', ondelete='CASCADE'), nullable=False, unique=True) + user_id = Column(Integer, ForeignKey('users.id', ondelete='CASCADE'), nullable=False) + summary = Column(Text, nullable=False) + key_topics = Column(JSON, default=list) + created_at = Column(DateTime, default=datetime.utcnow) + updated_at = Column(DateTime, default=datetime.utcnow) + + user = relationship('User') + conversation = 
relationship('AIChatConversation') +``` + +- [ ] **Step 2: Verify syntax** + +```bash +python3 -m py_compile database.py && echo "OK" +``` + +- [ ] **Step 3: Commit** + +```bash +git add database.py +git commit -m "feat(nordagpt): add AIUserMemory and AIConversationSummary ORM models" +``` + +--- + +### Task 13: Create memory_service.py + +**Files:** +- Create: `memory_service.py` + +- [ ] **Step 1: Create memory_service.py** + +```python +""" +Memory Service for NordaGPT +============================= +Manages persistent per-user memory: fact extraction, storage, retrieval, cleanup. +""" + +import json +import logging +from datetime import datetime, timedelta +from typing import Dict, Any, List, Optional + +from database import SessionLocal, AIUserMemory, AIConversationSummary, AIChatMessage + +logger = logging.getLogger(__name__) + +EXTRACT_FACTS_PROMPT = """Na podstawie tej rozmowy wyciągnij kluczowe fakty o użytkowniku {user_name} ({company_name}). + +Rozmowa: +{conversation_text} + +Istniejące fakty (NIE DUPLIKUJ): +{existing_facts} + +Zwróć TYLKO JSON array (bez markdown): +[{{"fact": "...", "category": "interests|needs|contacts|insights"}}] + +Zasady: +- Tylko nowe, nietrywialne fakty przydatne w przyszłych rozmowach +- Nie zapisuj: "zapytał o firmę X" (to za mało) +- Zapisuj: "szuka podwykonawców do projektu PEJ w branży elektrycznej" +- Max 3 fakty. Jeśli nie ma nowych faktów, zwróć [] +- Kategorie: interests (zainteresowania), needs (potrzeby biznesowe), contacts (kontakty), insights (wnioski/preferencje) +""" + +SUMMARIZE_PROMPT = """Podsumuj tę rozmowę w 1-3 zdaniach. Skup się na tym, czego użytkownik szukał i co ustalono. 
+ +Rozmowa: +{conversation_text} + +Zwróć TYLKO JSON (bez markdown): +{{"summary": "...", "key_topics": ["temat1", "temat2"]}} +""" + + +def get_user_memory(user_id: int, limit: int = 10) -> List[Dict]: + """Get active memory facts for a user, sorted by recency and confidence.""" + db = SessionLocal() + try: + facts = db.query(AIUserMemory).filter( + AIUserMemory.user_id == user_id, + AIUserMemory.is_active == True, + AIUserMemory.expires_at > datetime.now() + ).order_by( + AIUserMemory.confidence.desc(), + AIUserMemory.created_at.desc() + ).limit(limit).all() + + return [ + { + 'id': f.id, + 'fact': f.fact, + 'category': f.category, + 'confidence': f.confidence, + 'created_at': f.created_at.isoformat() + } + for f in facts + ] + finally: + db.close() + + +def get_conversation_summaries(user_id: int, limit: int = 5) -> List[Dict]: + """Get recent conversation summaries for a user.""" + db = SessionLocal() + try: + summaries = db.query(AIConversationSummary).filter( + AIConversationSummary.user_id == user_id + ).order_by( + AIConversationSummary.created_at.desc() + ).limit(limit).all() + + return [ + { + 'summary': s.summary, + 'topics': s.key_topics or [], + 'date': s.created_at.strftime('%Y-%m-%d') + } + for s in summaries + ] + finally: + db.close() + + +def format_memory_for_prompt(user_id: int) -> str: + """Format user memory and summaries for injection into AI prompt.""" + facts = get_user_memory(user_id) + summaries = get_conversation_summaries(user_id) + + if not facts and not summaries: + return "" + + parts = ["\n# PAMIĘĆ O UŻYTKOWNIKU"] + + if facts: + parts.append("Znane fakty:") + for f in facts: + parts.append(f"- [{f['category']}] {f['fact']}") + + if summaries: + parts.append("\nOstatnie rozmowy:") + for s in summaries: + topics = ", ".join(s['topics'][:3]) if s['topics'] else "" + parts.append(f"- {s['date']}: {s['summary']}" + (f" (tematy: {topics})" if topics else "")) + + parts.append("\nWykorzystuj tę wiedzę do personalizacji odpowiedzi. 
Nawiązuj do wcześniejszych rozmów gdy to naturalne.") + + return "\n".join(parts) + + +def extract_facts_async( + conversation_id: int, + user_id: int, + user_context: Dict, + gemini_service +): + """ + Extract memory facts from a conversation. Run async after response is sent. + Uses Flash-Lite for minimal cost. + """ + db = SessionLocal() + try: + # Get conversation messages + messages = db.query(AIChatMessage).filter_by( + conversation_id=conversation_id + ).order_by(AIChatMessage.created_at).all() + + if len(messages) < 2: + return # Too short to extract + + conversation_text = "\n".join([ + f"{'Użytkownik' if m.role == 'user' else 'NordaGPT'}: {m.content}" + for m in messages[-10:] # Last 10 messages + ]) + + # Get existing facts to avoid duplicates + existing = db.query(AIUserMemory).filter( + AIUserMemory.user_id == user_id, + AIUserMemory.is_active == True + ).all() + existing_text = "\n".join([f"- {f.fact}" for f in existing]) or "Brak" + + prompt = EXTRACT_FACTS_PROMPT.format( + user_name=user_context.get('user_name', 'Nieznany'), + company_name=user_context.get('company_name', 'brak'), + conversation_text=conversation_text, + existing_facts=existing_text + ) + + response = gemini_service.generate_text( + prompt=prompt, + temperature=0.1, + max_tokens=300, + model='gemini-3.1-flash-lite-preview', + thinking_level='minimal', + feature='memory_extraction' + ) + + # Parse response + text = response.strip() + if text.startswith('```'): + text = text.split('\n', 1)[1].rsplit('```', 1)[0].strip() + + facts = json.loads(text) + if not isinstance(facts, list): + return + + for fact_data in facts[:3]: + if not fact_data.get('fact'): + continue + memory = AIUserMemory( + user_id=user_id, + fact=fact_data['fact'], + category=fact_data.get('category', 'general'), + source_conversation_id=conversation_id, + expires_at=datetime.now() + timedelta(days=365) + ) + db.add(memory) + + db.commit() + logger.info(f"Extracted {len(facts)} memory facts for user {user_id}") + + 
except Exception as e: + logger.warning(f"Memory extraction failed for conversation {conversation_id}: {e}") + db.rollback() + finally: + db.close() + + +def summarize_conversation_async( + conversation_id: int, + user_id: int, + gemini_service +): + """Generate or update conversation summary. Run async.""" + db = SessionLocal() + try: + messages = db.query(AIChatMessage).filter_by( + conversation_id=conversation_id + ).order_by(AIChatMessage.created_at).all() + + if len(messages) < 2: + return + + conversation_text = "\n".join([ + f"{'Użytkownik' if m.role == 'user' else 'NordaGPT'}: {m.content[:200]}" + for m in messages[-10:] + ]) + + prompt = SUMMARIZE_PROMPT.format(conversation_text=conversation_text) + + response = gemini_service.generate_text( + prompt=prompt, + temperature=0.1, + max_tokens=200, + model='gemini-3.1-flash-lite-preview', + thinking_level='minimal', + feature='conversation_summary' + ) + + text = response.strip() + if text.startswith('```'): + text = text.split('\n', 1)[1].rsplit('```', 1)[0].strip() + + result = json.loads(text) + + existing = db.query(AIConversationSummary).filter_by( + conversation_id=conversation_id + ).first() + + if existing: + existing.summary = result.get('summary', existing.summary) + existing.key_topics = result.get('key_topics', existing.key_topics) + existing.updated_at = datetime.now() + else: + summary = AIConversationSummary( + conversation_id=conversation_id, + user_id=user_id, + summary=result.get('summary', ''), + key_topics=result.get('key_topics', []) + ) + db.add(summary) + + db.commit() + logger.info(f"Summarized conversation {conversation_id}") + + except Exception as e: + logger.warning(f"Conversation summary failed for {conversation_id}: {e}") + db.rollback() + finally: + db.close() + + +def delete_user_fact(user_id: int, fact_id: int) -> bool: + """Soft-delete a memory fact. 
Returns True if deleted."""
+    db = SessionLocal()
+    try:
+        fact = db.query(AIUserMemory).filter_by(id=fact_id, user_id=user_id).first()
+        if fact:
+            fact.is_active = False
+            db.commit()
+            return True
+        return False
+    finally:
+        db.close()
+```
+
+- [ ] **Step 2: Verify syntax**
+
+```bash
+python3 -m py_compile memory_service.py && echo "OK"
+```
+
+- [ ] **Step 3: Commit**
+
+```bash
+git add memory_service.py
+git commit -m "feat(nordagpt): add memory_service.py — fact extraction, summaries, CRUD"
+```
+
+---
+
+### Task 14: Integrate memory into chat flow
+
+**Files:**
+- Modify: `nordabiz_chat.py`
+- Modify: `blueprints/chat/routes.py`
+
+- [ ] **Step 1: Inject memory into system prompt**
+
+In `nordabiz_chat.py`, in the `_build_prompt()` or `_query_ai()` method, after the user identity block and before the data sections, add memory:
+
+```python
+from memory_service import format_memory_for_prompt
+
+# After user_identity block, before data injection:
+user_memory_text = ""
+if user_context and user_context.get('user_id'):
+    user_memory_text = format_memory_for_prompt(user_context['user_id'])
+
+# Prepend to system prompt:
+system_prompt = user_identity + user_memory_text + f"""Jesteś pomocnym asystentem..."""
+```
+
+- [ ] **Step 2: Trigger async memory extraction after response**
+
+In `send_message()` and `send_message_stream()`, after saving the AI response, trigger async extraction using threading:
+
+```python
+import threading
+from memory_service import extract_facts_async, summarize_conversation_async
+
+# After saving AI response to DB (end of send_message/send_message_stream):
+# Async memory extraction — don't block the response.
+# Read ORM attributes BEFORE spawning the thread: after commit the
+# conversation object's attributes are expired, and refreshing them from
+# another thread races with db.close().
+msg_count = conversation.message_count or 0
+
+def _extract_memory():
+    extract_facts_async(conversation_id, user_id, user_context, self.gemini_service)
+    # Summarize every 5 messages
+    if msg_count % 5 == 0:
+        summarize_conversation_async(conversation_id, user_id, self.gemini_service)
+
+threading.Thread(target=_extract_memory, 
daemon=True).start()
+```
+
+- [ ] **Step 3: Add memory CRUD API routes**
+
+In `blueprints/chat/routes.py`, add routes for viewing and deleting memory:
+
+```python
+@bp.route('/api/chat/memory', methods=['GET'])
+@login_required
+@member_required
+def get_user_memory_api():
+    """Get current user's NordaGPT memory facts and summaries"""
+    from memory_service import get_user_memory, get_conversation_summaries
+    return jsonify({
+        'facts': get_user_memory(current_user.id, limit=20),
+        'summaries': get_conversation_summaries(current_user.id, limit=10)
+    })
+
+
+@bp.route('/api/chat/memory/<int:fact_id>', methods=['DELETE'])
+@login_required
+@member_required
+def delete_memory_fact(fact_id):
+    """Delete a memory fact"""
+    from memory_service import delete_user_fact
+    if delete_user_fact(current_user.id, fact_id):
+        return jsonify({'status': 'ok'})
+    return jsonify({'error': 'Nie znaleziono'}), 404
+```
+
+- [ ] **Step 4: Verify syntax**
+
+```bash
+python3 -m py_compile nordabiz_chat.py && python3 -m py_compile blueprints/chat/routes.py && echo "OK"
+```
+
+- [ ] **Step 5: Commit**
+
+```bash
+git add nordabiz_chat.py blueprints/chat/routes.py
+git commit -m "feat(nordagpt): integrate memory into chat — injection, async extraction, CRUD API"
+```
+
+---
+
+### Task 15: Deploy Phase 4 — migrations + code
+
+- [ ] **Step 1: Push to remotes**
+
+```bash
+git push origin master && git push inpi master
+```
+
+- [ ] **Step 2: Deploy to staging with migrations**
+
+```bash
+ssh maciejpi@10.22.68.248 "cd /var/www/nordabiznes && sudo -u www-data git pull"
+ssh maciejpi@10.22.68.248 "cd /var/www/nordabiznes && /var/www/nordabiznes/venv/bin/python3 scripts/run_migration.py database/migrations/092_ai_user_memory.sql"
+ssh maciejpi@10.22.68.248 "cd /var/www/nordabiznes && /var/www/nordabiznes/venv/bin/python3 scripts/run_migration.py database/migrations/093_ai_conversation_summary.sql"
+ssh maciejpi@10.22.68.248 "sudo systemctl restart nordabiznes"
+```
+
+- [ ] **Step 3: Test on staging**
+
+1. Open chat, have a conversation about looking for IT companies +2. Open another chat, ask "o czym rozmawialiśmy?" — verify AI mentions previous topics +3. Check memory API: `curl https://staging.nordabiznes.pl/api/chat/memory` (with auth) +4. Verify facts are extracted + +- [ ] **Step 4: Deploy to production** + +```bash +ssh maciejpi@10.22.68.249 "cd /var/www/nordabiznes && sudo -u www-data git pull" +ssh maciejpi@10.22.68.249 "cd /var/www/nordabiznes && DATABASE_URL=\$(grep DATABASE_URL .env | cut -d'=' -f2) /var/www/nordabiznes/venv/bin/python3 scripts/run_migration.py database/migrations/092_ai_user_memory.sql" +ssh maciejpi@10.22.68.249 "cd /var/www/nordabiznes && DATABASE_URL=\$(grep DATABASE_URL .env | cut -d'=' -f2) /var/www/nordabiznes/venv/bin/python3 scripts/run_migration.py database/migrations/093_ai_conversation_summary.sql" +ssh maciejpi@10.22.68.249 "sudo systemctl restart nordabiznes" +curl -sI https://nordabiznes.pl/health | head -3 +``` + +- [ ] **Step 5: Update release notes** + +Add entry in `blueprints/public/routes.py` `_get_releases()`. + +--- + +## Post-Implementation Checklist + +- [ ] Verify AI greets users by name +- [ ] Verify Smart Router logs show correct classification +- [ ] Verify streaming works on mobile (Android + iOS) +- [ ] Verify memory facts are extracted after conversations +- [ ] Verify memory is private (user A cannot see user B's facts) +- [ ] Verify response times: simple <3s, medium <6s, complex <12s +- [ ] Monitor costs for first week — compare with estimates +- [ ] Send message to Jakub Pornowski confirming speed improvements
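
Two of the checklist items above, memory privacy and soft-delete, can be smoke-tested without the running app. The sketch below is illustrative only: it runs the same shape of queries that `get_user_memory()` and `delete_user_fact()` issue, against an in-memory SQLite adaptation of migration 092 (column types simplified; production is PostgreSQL).

```python
import sqlite3
from datetime import datetime, timedelta

# SQLite adaptation of migration 092 (SERIAL/TIMESTAMP swapped for SQLite types)
con = sqlite3.connect(":memory:")
con.execute("""
CREATE TABLE ai_user_memory (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL,
    fact TEXT NOT NULL,
    category TEXT DEFAULT 'general',
    is_active INTEGER DEFAULT 1,
    expires_at TEXT
)""")
future = (datetime.now() + timedelta(days=365)).isoformat()
con.executemany(
    "INSERT INTO ai_user_memory (user_id, fact, expires_at) VALUES (?, ?, ?)",
    [(1, "szuka podwykonawców PEJ", future), (2, "branża IT", future)],
)

# Privacy: user 1's query must only ever return user 1's facts
rows = con.execute(
    "SELECT fact FROM ai_user_memory WHERE user_id = ? AND is_active = 1", (1,)
).fetchall()

# Soft delete, as delete_user_fact() does: flag off, row retained
con.execute("UPDATE ai_user_memory SET is_active = 0 WHERE id = 1 AND user_id = 1")
remaining = con.execute(
    "SELECT COUNT(*) FROM ai_user_memory WHERE user_id = 1 AND is_active = 1"
).fetchone()[0]
```

On staging, the equivalent spot-check is a `psql` query against the real tables with two test accounts.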