vek1-api — arquitetura
Arquitetura
Stack
| Camada | Usa |
|---|---|
| Web framework | FastAPI 0.104 + Uvicorn |
| Middleware | CORSMiddleware (allow_origins=*), GZipMiddleware (≥1KB), custom logging, custom large-file (50MB) |
| Rate limit | slowapi (opcional — não trava dev sem dep); Limiter(key_func=get_remote_address) |
| DB | Postgres 16 + pgvector 0.8.2 self-hosted (pgvector/pgvector:pg16) |
| DB driver | psycopg2-binary + ThreadedConnectionPool (síncrono) |
| Embeddings | Ollama bge-m3 (1024 dim) — endpoint /api/embed em vault-ollama |
| LLM | DeepSeek (deepseek-chat) via SDK openai com base_url=https://api.deepseek.com |
| Tool calling | OpenAI Function Calling protocol (DeepSeek compatível) |
| HTTP client | httpx (Ollama, Resend, AbacatePay outbound, stock sync) |
| Resend (transactional reset-password) | |
| Validação | Pydantic v2 com ConfigDict(extra="forbid") strict em todos endpoints |
Estrutura do repo
main.py # FastAPI app + endpoints públicos + middlewares + exception handlers
routers/
_deps.py # token scopes, actor, request_id, assert_uuid
auth.py # Better Auth backing store (29 endpoints)
orders.py # orders + payment-settings + abacate-pay webhook
leads.py # leads CRUD + events
agents.py # agents CRUD + KB + evolution-instance lookup
products.py # products + documents + product_files
stores.py # stores + company
dashboard.py # aggregated stats
messages.py # messages_history + audit_log read
token_usage.py # billing telemetry
email.py # Resend send-reset-password
stock.py # decrement/restore + ERP sync
services/
db.py # ThreadedConnectionPool, fetch_one/all/execute, match_documents
embeddings.py # Ollama /api/embed client
rag_service.py # function calling loop, memory injection, tool handlers
file_processor.py # PDF/CSV chunking
auth_service.py # CRUD auth tables + cache em memória
audit_service.py # log_audit fire-and-forget background
orders_service.py # state machine + transitions + stock movements coordination
leads_service.py # ensure/list/patch + similar leads (vector)
agents_service.py # CRUD + active toggle (1-per-store rule)
products_service.py # CRUD + batch upsert
stores_service.py # CRUD + company ensure
dashboard_service.py # joins/aggregates
token_usage_service.py # insert + summary
email_service.py # Resend client
stock_service.py # bidirectional sync (inbound HMAC, outbound HMAC)
config/
agents_config.yaml # prompts + settings + tools por tipo de agente
agent_config_manager.py # load + reload + validation
logging_config.py
init/01-init.sql # pgvector ext + HNSW indexes + RPC match_documents_test
regenerate_embeddings.py # script bulk re-embed
tests/
test_*.py # pytest suite (orders e2e, stock sync, function calling integration)
.claude/agents/vek1-api-architect.md
Dockerfile # python:3.10-slim
docker-compose.yml # postgres + vek1-api + networks
nginx.conf # reverse proxy template
pytest.ini, requirements-docker.txt, requirements.txt
Camadas
| Camada | Onde | Notas |
|---|---|---|
| HTTP surface | routers/*.py |
Pydantic strict + token scope dep + actor ownership check |
| Domain logic | services/*_service.py |
Sem FastAPI imports — usável fora do HTTP |
| Persistence | services/db.py |
psycopg2 pool sync; helpers fetch_one/all/execute retornam dict |
| Audit | services/audit_service.py |
log_audit(actor, action, target_type, target_id, ip, ua, meta) — escrita em audit_log via BackgroundTasks (não bloqueia request) |
| LLM | services/rag_service.py |
Function calling loop; tools como callables registrados |
| Cache | services/auth_service.py |
In-memory dict (singleton) — findUserByEmail + findSessionByToken TTL curto |
Dependency injection padrões
from routers._deps import (
require_app_token, # 401 se X-Internal-Token errado
require_auth_token, # 401 se X-Auth-Token errado
require_webhook_token, # 401 se X-Webhook-Token errado
require_actor_user_id, # 401 se X-Actor-User-Id ausente; valida regex
require_request_id, # gera novo se ausente
assert_uuid, # 400 se UUID inválido (anti-injection em path params)
)
router = APIRouter(
prefix="/internal/orders",
dependencies=[Depends(require_app_token)],
)
@router.post("/{order_id}/transition")
async def transition(
order_id: str,
body: TransitionIn,
actor: str = Depends(require_actor_user_id),
request_id: str = Depends(require_request_id),
):
assert_uuid(order_id, "order_id")
order = assert_owns_order(actor, order_id) # 404 se não dono
...
Ownership checks (assert_owns_*)
Cada domínio com escopo store-level tem helper local:
def assert_owns_store(actor: str, store_id: str) -> None:
if not db.fetch_one("SELECT 1 FROM stores WHERE id = %s AND company_id = %s", (store_id, actor)):
raise HTTPException(404, "store not found")
def assert_owns_order(actor: str, order_id: str) -> dict[str, Any]:
row = db.fetch_one("""
SELECT o.* FROM orders o
JOIN stores s ON s.id = o.store_id
WHERE o.id = %s AND s.company_id = %s LIMIT 1
""", (order_id, actor))
if not row:
raise HTTPException(404, "order not found")
return row
404 (não 403) deliberado — não vaza se recurso existe pra outro tenant.
Exception handlers globais (main.py)
global_exception_handler(Exception)— 500 com stack trace logado, body genérico ({error, message, type, path})http_exception_handler(HTTPException)— passa detail originalvalidation_exception_handler(RequestValidationError)— 422 com lista de campos inválidos
Logs com emoji prefixes pra leitura rápida: 📥 request, 📤 response, ✅ success, ❌ erro, 🚨 exceção crítica.
Schema Postgres
Source-of-truth no vek1 (src/lib/db/schema.ts Drizzle). vek1-api consome (não define).
init/01-init.sql aplica extras:
CREATE EXTENSION pgvector- HNSW index em
documents.embedding - HNSW index em
leads.profile_embedding - Unique compostas custom (
leads (store_id, channel, external_id)) - RPC
match_documents_test(query_embedding, threshold, count, p_agent_id?)— busca semântica com filtro opcional por agent KB (JOIN agent_knowledge_base seagent_idprovided)
Deploy
- Path no VPS:
/home/vek1-api/ - Containers:
vek1-api(127.0.0.1:8000) +vek1-postgres(0.0.0.0:5434) - Compose:
docker compose -f docker-compose.yml up -d --build - Volumes:
./pgdata,./uploads,./config,./logs - Networks:
vek1-net(interno) +vault-site_default(external pra alcançarvault-ollama) - Nginx vhost:
/etc/nginx/sites-available/vek1-api.kodama.solutions(TLS Let's Encrypt)
Performance
- Cold start primeiro embedding Ollama: ~10s (model load)
- Embeddings subsequentes: 0.4–1.7s
- Match pgvector (hnsw, ~100 docs): <15ms
- DeepSeek chat (sem tools): ~1.9s
- Chat com 1 tool call: ~3–5s end-to-end
- Chat com
create_order(search + create + format): ~6–8s /extract-lead: ~1.5–3s- Auth endpoints cached: <30ms; cold: <80ms