C.R.A.I.G is composed of four independent services that communicate asynchronously:
| Service | Role | Port |
|---|---|---|
| craig-auth | JWT authentication, user management, token generation | 8000 |
| craig-api-semantic | Job reception, RabbitMQ publishing, Redis status, webhook receiver | 8001 |
| craig-api-vision | Florence-2 inference: OCR, classification, object detection, captioning | 8002 |
| craig-worker | RabbitMQ consumer — orchestrates LLaVA → vision pipeline → webhook | — |
Each job accepted by craig-api-semantic is published to the craig_tasks RabbitMQ queue as a persistent message.
| URL | Service |
|---|---|
| craig.weesoft.fr | Portal — main entry point |
| craig.lab.weesoft.fr | Lab — document analysis interface |
| craig.feedback.weesoft.fr | Incident — bug & feature reports |
| craig.api.weesoft.fr | Semantic API (public endpoint) |
| craig.auth.weesoft.fr | Auth API |
| craig.monitoring.weesoft.fr | Grafana (SSO) |
| Pipeline JSON | structuration_llm | prompt_by_type | Required? |
|---|---|---|---|
| absent | — | — | Yes — used to generate the pipeline |
| present | absent | — | No — no LLM text step |
| present | present | absent | Yes — becomes the LLM task |
| present | present | present | No — each doc type has its own prompt |
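The four rows of the table above can be read as a small decision function. The following is a minimal sketch, assuming the "Required?" column refers to the free-text question (AI Instructions) field and that the pipeline is a JSON object whose values are step objects with an `action` key. The helper name `question_required` is hypothetical and is not part of the C.R.A.I.G API.

```python
from typing import Optional


def question_required(pipeline: Optional[dict]) -> bool:
    """Return True when the question field must be filled in.

    `pipeline` is the parsed pipeline JSON, or None when the
    Pipeline field is left empty.
    """
    # Row 1: no pipeline, so the question is needed to generate one.
    if not pipeline:
        return True

    llm_steps = [
        step for step in pipeline.values()
        if isinstance(step, dict) and step.get("action") == "structuration_llm"
    ]

    # Row 2: a pipeline without any LLM text step needs no question.
    if not llm_steps:
        return False

    # Rows 3 and 4: required unless every LLM step has its own prompt_by_type.
    return any(not step.get("prompt_by_type") for step in llm_steps)
```

Treating a pipeline with several LLM steps as "required if any step lacks prompt_by_type" is an assumption; the table only describes the single-step case.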
| Model | Type | Use Case |
|---|---|---|
| LLaVA | Multimodal LLM | Pipeline orchestration — analyses image and generates the sequence of vision steps |
| Florence-2-base | Vision model | OCR, image classification (tri), object detection, captioning |
| Mistral | Text LLM | Post-processing: structures raw OCR text into JSON |
| Qwen 2.5 VL | Multimodal LLM | Alternative to LLaVA for pipeline generation, handles complex layouts |
- tri — Classifies the document: determines if it is readable (lisible: true/false) and provides a detailed caption.
- ocr — Extracts all text content from the image using Florence-2's OCR engine.
- detection — Locates and labels objects or regions of interest within the image with bounding boxes.
- poser_question — Answers a specific visual question about the image content (VQA).
- structuration_llm — Sends raw OCR text to a text LLM (Mistral or Qwen) to return structured JSON output.
Advanced users can provide a custom pipeline JSON via the Pipeline field, bypassing the LLaVA generation step:
{
  "etape1": { "action": "tri" },
  "etape2": { "action": "ocr", "if_previous": "etape1.lisible" },
  "etape3": { "action": "structuration_llm", "modele": "mistral" }
}
Steps are executed in alphabetical order. The if_previous field conditionally skips a step if the referenced field is falsy.
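The ordering and skip rule described above can be sketched as a small loop. This is an illustration only, not the worker's actual code: the `handlers` mapping, the `run_pipeline` name, and the `{"skipped": True}` marker are hypothetical, and the step handlers below are stubs that return canned results.

```python
def run_pipeline(pipeline: dict, handlers: dict) -> dict:
    """Run steps in alphabetical key order, honouring if_previous."""
    results = {}
    for name in sorted(pipeline):  # plain string sort of the step keys
        step = pipeline[name]
        condition = step.get("if_previous")
        if condition:
            # "etape1.lisible" -> step "etape1", field "lisible"
            ref_step, _, field = condition.partition(".")
            if not results.get(ref_step, {}).get(field):
                results[name] = {"skipped": True}  # hypothetical marker
                continue
        results[name] = handlers[step["action"]](step, results)
    return results


# Stub handlers standing in for the real vision calls.
handlers = {
    "tri": lambda step, results: {"lisible": False, "description": "blurry scan"},
    "ocr": lambda step, results: {"texte": "...", "succes": True},
}
pipeline = {
    "etape1": {"action": "tri"},
    "etape2": {"action": "ocr", "if_previous": "etape1.lisible"},
}
print(run_pipeline(pipeline, handlers))
```

If the ordering is a plain string sort, as assumed here, `etape10` sorts before `etape2`. Zero-padded keys such as `etape02` would avoid surprises in pipelines with ten or more steps.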
| Panel | Description |
|---|---|
| Architecture | Full Mermaid diagram of C.R.A.I.G components. Shown by default on startup. Zoomable and printable. |
| Explorer | Main document analysis interface — source, pipeline, output. |
| Prompt Catalog | 12 ready-to-use prompts across 5 business domains. Click Apply to set the model and prompt in one click. |
| MCP Connectors | Register external MCP servers and send analysis results to them directly from the sidebar. |
| Settings | Temperature, context window, system prompts, and system health check. |
- Set the Number of pages field before launching the pipeline.
- Upload one image per page. C.R.A.I.G assigns a shared documentId and processes each page in sequence.
- For ≤10 pages: all OCR texts are merged before LLM extraction (aggregation mode).
- For >10 pages: sliding window mode — pages are processed in batches of 5 with context carryover.
- See the Multi-Page Analysis reference for full details.
- Access the catalog via the book icon in the sidebar activity bar.
- 12 prompts across Finance, Accounting, Legal, Technical, and HR domains.
- Each prompt specifies a recommended model (LLaVA, Mistral, or Qwen).
- Click APPLY: the model selector and system prompt update instantly. The catalog closes and the pipeline is ready to run.
- Changing the model mid-analysis automatically cancels the current job.
The Lab polls the status endpoint every 2 seconds. With a Tesla T4 GPU, processing typically completes in 15–60 seconds per page depending on model and document complexity.
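A client can reproduce the same 2-second polling loop outside the Lab. The sketch below is a minimal example under stated assumptions: the `"pending"` status value is taken from the API responses documented on this page, while `poll_until_done`, the caller-supplied `fetch_status` function, and the 120-second timeout are hypothetical. The HTTP call is left to the caller because the path of the status endpoint is not given here.

```python
import time
from typing import Callable


def poll_until_done(fetch_status: Callable[[], dict],
                    interval: float = 2.0,
                    timeout: float = 120.0,
                    sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call fetch_status() every `interval` seconds until the job leaves "pending"."""
    deadline = time.monotonic() + timeout
    while True:
        payload = fetch_status()
        if payload.get("status") != "pending":
            return payload  # "done", or any other terminal status
        if time.monotonic() >= deadline:
            raise TimeoutError("job still pending after %.0f s" % timeout)
        sleep(interval)
```

Passing `sleep` as a parameter keeps the loop easy to test without real delays. With 15 to 60 seconds per page, the timeout should be scaled to the number of pages submitted.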
| Parameter | Default | Description |
|---|---|---|
| Temperature | 0.00 | Controls output randomness. 0.0 = fully deterministic. Raise to 0.3–0.5 for more creative pipeline generation. |
| Context Window (NumCtx) | 8192 | Token context passed to Ollama. Larger values handle longer documents but use more RAM/VRAM. |
| System Prompts | appsettings.json | Edit the system prompt for LLaVA, Qwen and Mistral. Click Apply Prompts — changes are stored in Redis and take effect immediately. |
The Lab offers two complementary ways to control AI behaviour:
| Where | What it controls | When to use |
|---|---|---|
| Settings Panel → System Prompts | The orchestration prompt sent to LLaVA / Mistral / Qwen to generate the pipeline JSON. | When you want to tune how the model interprets a document type (e.g. always extract IBAN, always output French). |
| Pipeline Editor (central zone) | The pipeline JSON itself — the list of actions (tri, ocr, detection…) that Florence-2 will execute. | When you already know exactly which actions to run and want to bypass LLM generation entirely. |
Both can be used together: set the system prompt in Settings to improve automatic generation, and override with the editor when you need precise control.
The Settings panel includes a System Check section that verifies the status of all installed components in real time. No manual SSH required.
| Component | What is checked |
|---|---|
| RabbitMQ | Queue broker connectivity — jobs cannot be submitted if unavailable. |
| Redis | Job status store — polling will not work if unavailable. |
| Ollama | LLM inference service reachability. |
| LLaVA / Mistral / Qwen2.5-VL | Model presence in Ollama (ollama list). Status missing means the model was not pulled yet. |
| Florence-2 | Vision microservice (OCR / detection / tri) reachability. |
Click REFRESH to re-run all checks. Statuses: ok — missing — unavailable.
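- Runtime changes made in the Settings panel are sent via POST /config and stored in Redis under craig:runtime_config.
- A runtime value overrides the appsettings.json value for the current session.
- To make a change permanent, edit appsettings.json and redeploy.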
Server-side defaults live in /opt/craig-services/api-semantic/appsettings.json. Edit this file to permanently change prompts, MCP connectors, or Ollama endpoints. A service restart is required after file changes.
- AI.OllamaModel — Default orchestration model.
- AIProfiles.*.Prompt — Default system prompts (overridable at runtime).
- AIProfiles.*.NumCtx / Temperature — Default context and creativity per model.
- MCPServers — MCP connector endpoints (ERP, Google Drive…).
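The list above marks the file defaults as overridable at runtime. The following sketch shows one way such precedence can work, assuming runtime values simply overlay the file defaults key by key. Only the key names (`AI.OllamaModel`, `AIProfiles.*.Prompt`, `NumCtx`, `Temperature`) come from this page; the nesting, the example values, and the `merge_config` helper are hypothetical, and the service may resolve overrides differently.

```python
import copy


def merge_config(defaults: dict, overrides: dict) -> dict:
    """Recursively overlay runtime overrides on top of file defaults."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical values; only the key names come from the list above.
file_defaults = {
    "AI": {"OllamaModel": "llava"},
    "AIProfiles": {
        "mistral": {"Prompt": "Return JSON.", "NumCtx": 8192, "Temperature": 0.0},
    },
}
runtime_overrides = {"AIProfiles": {"mistral": {"Temperature": 0.3}}}

effective = merge_config(file_defaults, runtime_overrides)
print(effective["AIProfiles"]["mistral"])
```

The merge works on a deep copy, so the file defaults stay untouched and can be restored by discarding the overrides.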
After a successful login, the JWT is stored in the craig_token cookie on the .weesoft.fr domain so it is shared with Lab and Incident automatically.
- Create user — Enter an email and password in the New User form and click Add. A confirmation appears and the user list refreshes.
- List users — All registered accounts are displayed with their login and active/inactive status.
- Delete user — Click the Delete button next to a user. A confirmation dialog prevents accidental deletion.
- Generate API token — Click Token next to a user to generate a long-lived JWT. Copy it to authenticate programmatic API calls.
Use the generated token as a Bearer token in any API call:
curl -X POST https://craig.api.weesoft.fr/ask \
-H "Authorization: Bearer <token>" \
-F "image=@document.png" \
-F "question=Extract the invoice total" \
-F "modeleIA=llava"
- Passwords are hashed server-side. They are never stored in plaintext.
- JWT tokens expire after 24 hours by default. Generated API tokens have a longer TTL.
- The /jobs/{id}/webhook endpoint is internal-only and does not require authentication.
- Grafana monitoring is protected by the Auth service via Caddy forward_auth.
| Type | When to use |
|---|---|
| Bug | Something is broken, produces an error, or does not behave as expected. |
| Feature | A new capability you would like added, or an existing behaviour you would like changed. |
- One report per issue — do not combine several problems in a single submission.
- For bugs, include the exact error message and the steps that led to it.
- For feature requests, explain the use case and the expected outcome.
- If the issue is urgent, contact the WeeSoft team directly in addition to filing a report.
{
  "etape1": {
    "action": "tri" // required
  },
  "etape2": {
    "action": "ocr",
    "if_previous": "etape1.lisible", // optional condition
    "model": "florence-2-base" // optional, default: florence-2-base
  },
  "etape3": {
    "action": "structuration_llm",
    "modele": "mistral", // mistral | qwen2.5vl
    "source": "etape2" // which step provides the text
  }
}
| Action | Model | Output fields |
|---|---|---|
| tri | Florence-2 | description, lisible (bool) |
| ocr | Florence-2 | texte, succes (bool) |
| detection | Florence-2 | bboxes (array), labels (array) |
| poser_question | Florence-2 | reponse, succes (bool) |
| structuration_llm | Mistral / Qwen | resultat_structure (JSON string), succes (bool) |
Use if (or the alias if_previous) to skip a step based on a previous result. Supported syntax:
| Syntax | Behaviour |
|---|---|
| "etapeX.field" | Skip if field is falsy (false, null, 0, empty string) |
| "etapeX.field == true" | Skip unless field is strictly true |
| "etapeX.field == false" | Skip unless field is falsy |
| "etapeX.field == somevalue" | Skip unless field equals somevalue (case-insensitive) |
"etape4": {
"action": "mcp_store_mock",
"if": "etape3.found == true"
}
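The four condition forms in the syntax table can be sketched as a single evaluator. This is an illustrative reading of the table, not the worker's parser: `should_run` is a hypothetical name, `results` is assumed to map step names to their output fields, and treating the literals `true` and `false` case-insensitively is an assumption.

```python
def should_run(condition: str, results: dict) -> bool:
    """Return True when the step guarded by `condition` should execute."""
    left, sep, expected = condition.partition("==")
    step_name, _, field = left.strip().partition(".")
    value = results.get(step_name, {}).get(field)

    if not sep:  # "etapeX.field": run only if the field is truthy
        return bool(value)

    expected = expected.strip()
    if expected.lower() == "true":   # run only if strictly true
        return value is True
    if expected.lower() == "false":  # run only if falsy
        return not value
    # Any other value: case-insensitive string comparison
    return str(value).lower() == expected.lower()


results = {"etape3": {"found": True, "doc_type": "Invoice", "count": 0}}
print(should_run("etape3.found == true", results))
print(should_run("etape3.doc_type == invoice", results))
```

In this sketch a missing step or field evaluates as null, so it counts as falsy for the first and third forms.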
Add prompt_by_type to a structuration_llm step to automatically select the extraction prompt
based on the doc_type returned by a preceding classify step.
When this field is present, the AI Instructions field in the UI is ignored.
{
  "action": "structuration_llm",
  "modele": "mistral",
  "source": "etape1",
  "prompt_by_type": {
    "bank_statement": "Extract IBAN, balances, dates and amounts as JSON.",
    "invoice": "Extract vendor, invoice number, total, VAT and due date as JSON.",
    "CV": "Extract candidate name, skills, experience and last position as JSON.",
    "default": "Summarize the key information as structured JSON."
  }
}
The default key is used when no entry matches the detected doc_type.
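The lookup with fallback described above amounts to a dictionary access. The sketch below assumes an exact, case-sensitive match on doc_type; `select_prompt` is a hypothetical helper name, and raising an error when neither the type nor `default` is present is a choice made for this example.

```python
def select_prompt(prompt_by_type: dict, doc_type: str) -> str:
    """Pick the extraction prompt for doc_type, falling back to "default"."""
    if doc_type in prompt_by_type:
        return prompt_by_type[doc_type]
    if "default" in prompt_by_type:
        return prompt_by_type["default"]
    raise KeyError(f"no prompt for {doc_type!r} and no 'default' entry")


prompts = {
    "bank_statement": "Extract IBAN, balances, dates and amounts as JSON.",
    "invoice": "Extract vendor, invoice number, total, VAT and due date as JSON.",
    "CV": "Extract candidate name, skills, experience and last position as JSON.",
    "default": "Summarize the key information as structured JSON.",
}
print(select_prompt(prompts, "invoice"))
print(select_prompt(prompts, "payslip"))  # falls back to the default prompt
```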
- The user specifies the total number of pages in the document before launching the pipeline.
- Each page image is submitted as a separate job, sharing a common documentId (UUID).
- Each job runs OCR and stores the extracted text in Redis under craig:doc:{documentId}:page:{i}.
- An atomic counter tracks completed pages. When the last page is processed, the final LLM extraction is triggered.
- All temporary Redis keys are cleaned up after the final result is produced.
| Mode | Threshold | Behaviour |
|---|---|---|
| Aggregation | ≤ 10 pages | All OCR texts are concatenated into a single prompt. Mistral extracts the requested information from the full document in one pass. |
| Sliding Window | > 10 pages | Pages are processed in batches of 5. Each batch receives a summary of the previous batch as context. A final synthesis pass consolidates all partial results. |
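The thresholds in the table translate into a simple planning step. The sketch below only illustrates the mode selection and the grouping of pages into batches of 5; the function name `plan_document`, the constant names, and the returned structure are hypothetical, and the summary carryover between batches is not modelled.

```python
AGGREGATION_MAX_PAGES = 10  # threshold from the table above
WINDOW_SIZE = 5             # batch size in sliding window mode


def plan_document(total_pages: int) -> dict:
    """Choose the processing mode and group zero-based page indices into batches."""
    if total_pages < 1:
        raise ValueError("total_pages must be at least 1")
    if total_pages <= AGGREGATION_MAX_PAGES:
        # One pass over the full document.
        return {"mode": "aggregation", "batches": [list(range(total_pages))]}
    batches = [
        list(range(start, min(start + WINDOW_SIZE, total_pages)))
        for start in range(0, total_pages, WINDOW_SIZE)
    ]
    return {"mode": "sliding_window", "batches": batches}


print(plan_document(4))
print(plan_document(12))
```

For a 12-page document this yields three batches, the last one holding the two remaining pages.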
- Upload all page images in the Lab Source panel.
- Set the Number of pages field to the actual page count of the document.
- The label updates automatically: aggregation for ≤10 pages, sliding window for >10.
- A green progress badge shows real-time status: Page 2 / 5 processed…
- The final JSON result is delivered once all pages have been processed.
| Field | Type | Description |
|---|---|---|
| documentId | string (UUID) | Shared identifier across all pages of the same document. |
| pageIndex | integer | Zero-based index of the current page (0 = first page). |
| totalPages | integer | Total number of pages in the document. |
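A client submitting pages itself needs to generate these three fields consistently. The following is a minimal sketch, assuming the client creates the UUID and sends one set of fields per page; `page_job_fields` is a hypothetical helper, and how the fields are attached to each request (form data or JSON) is not covered here.

```python
import uuid


def page_job_fields(total_pages: int) -> list:
    """Build the documentId / pageIndex / totalPages fields for every page."""
    document_id = str(uuid.uuid4())  # one identifier shared by all pages
    return [
        {"documentId": document_id, "pageIndex": index, "totalPages": total_pages}
        for index in range(total_pages)  # zero-based, 0 = first page
    ]


for fields in page_job_fields(3):
    print(fields)
```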
| Status | Meaning |
|---|---|
| page_traitee | This page's OCR is complete. Waiting for remaining pages. |
| document_complet | All pages processed. Final extraction result is included. |
- MCP (Model Context Protocol) is an open standard that allows AI applications to communicate with external tools and data sources via a unified interface.
- C.R.A.I.G is an MCP Host + Client: it initiates connections to MCP Servers and calls their tools.
- MCP Servers expose tools (e.g. insert_row, upload_file, create_issue) that C.R.A.I.G can invoke with the analysis JSON as payload.
- C.R.A.I.G does not store analysis results on disk — all data flows to the external MCP server.
- Open the Lab sidebar and click the plug icon (MCP Connectors panel).
- Click + ADD and enter a name and the SSE URL of the MCP server (e.g. http://127.0.0.1:8010/sse).
- Click SAVE. The connector is stored in Redis and immediately available.
- Connectors declared in appsettings.json are static (read-only). User-added connectors are marked custom and can be deleted.
- Run an analysis in the Lab. Once a result is available, the MCP panel shows a green "Result ready" banner.
- Each connector with status ok shows a SEND button.
- Clicking SEND calls POST /mcp/send. The backend connects to the MCP server via SSE, lists available tools, and calls the first tool with the analysis JSON as argument.
- The tool name and status are displayed below the connector card.
Install a local MCP PostgreSQL server on the VM using supergateway (no Docker image required for the MCP layer).
npm install -g supergateway
# Create the results table in craig-postgres
docker exec craig-postgres psql -U craig_admin -d craig -c "
CREATE TABLE IF NOT EXISTS craig_resultats (
id SERIAL PRIMARY KEY,
document_nom TEXT,
modele TEXT,
question TEXT,
resultat JSONB,
created_at TIMESTAMP DEFAULT NOW()
);"
# Start the MCP server as a systemd service (port 8010)
supergateway --port 8010 \
--stdio "npx -y @modelcontextprotocol/server-postgres \
postgresql://craig_admin:PASSWORD@127.0.0.1:5432/craig"
| Method | Path | Description |
|---|---|---|
| GET | /mcp/status | List all connectors (static + dynamic) with connectivity status. |
| GET | /mcp/connectors | List dynamic connectors stored in Redis. |
| POST | /mcp/connectors | Add a connector {"name": "…", "url": "…"}. |
| DELETE | /mcp/connectors/{name} | Remove a dynamic connector. |
| GET | /mcp/tools/{name} | List available tools on a connector. |
| POST | /mcp/send | Send analysis result to a connector {"serverName": "…", "data": {…}}. |
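The /mcp/send call from the table can also be issued from a script. The sketch below builds the request with the standard library and leaves the network call to the caller. It assumes the MCP routes are served by the Semantic API and accept the same Bearer token as other API calls; neither point is confirmed on this page. `build_mcp_send_request` is a hypothetical helper name.

```python
import json
import urllib.request


def build_mcp_send_request(base_url: str, token: str,
                           server_name: str, data: dict) -> urllib.request.Request:
    """Prepare POST /mcp/send with the documented {"serverName", "data"} body."""
    body = json.dumps({"serverName": server_name, "data": data}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/mcp/send",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # assumption: same JWT as other API calls
        },
    )


request = build_mcp_send_request(
    "https://craig.api.weesoft.fr", "<token>", "mcp-postgres", {"total": 42}
)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```

The connector name `mcp-postgres` is only an example value; use the name shown in the MCP Connectors panel.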
Form data:
image File PNG / JPEG image
question string Natural language question
modeleIA string llava | qwen2.5vl (default: llava)
pipelineSaisi string Optional custom pipeline JSON
Response:
{ "jobId": "uuid" }
Response (pending):
{ "status": "pending" }
Response (done):
{
"status": "done",
"result": "{ pipeline, resultats, resultat_final }"
}
JSON body:
{
"transactionId": "string",
"webhookUrl": "https://your-endpoint/callback",
"imagesBase64": ["base64...", "base64..."],
"questionGlobale": "Extract all dates",
"modeleIA": "llava"
}
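The JSON body above can be assembled from raw image bytes as follows. This is a sketch under assumptions: images are encoded as plain standard base64 without a data-URI prefix, and a UUID is used when no transactionId is supplied (the field is only documented as a string). `build_batch_payload` is a hypothetical helper name.

```python
import base64
import json
import uuid
from typing import List, Optional


def build_batch_payload(images: List[bytes], question: str, webhook_url: str,
                        model: str = "llava",
                        transaction_id: Optional[str] = None) -> dict:
    """Assemble the batch request body from raw image bytes."""
    return {
        "transactionId": transaction_id or str(uuid.uuid4()),
        "webhookUrl": webhook_url,
        "imagesBase64": [base64.b64encode(img).decode("ascii") for img in images],
        "questionGlobale": question,
        "modeleIA": model,
    }


payload = build_batch_payload(
    [b"fake image bytes"], "Extract all dates", "https://your-endpoint/callback"
)
print(json.dumps(payload, indent=2))
```

In real use, each entry of `images` would come from reading a PNG or JPEG file in binary mode.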
| Method | Path | Description |
|---|---|---|
| GET | /health | Returns RabbitMQ + Redis connectivity status |
| GET | /resources | Returns available AI models and actions |
| POST | /jobs/{id}/cancel | Marks a job as cancelled in Redis |
| POST | /jobs/{id}/webhook | Internal — receives results from the worker |
| Container | Image | Port |
|---|---|---|
| craig-postgres | postgres:15 | 5432 (localhost only) |
| craig-rabbit | rabbitmq:3-management | 5672 / 15672 |
| craig-redis | redis:7-alpine | 6379 |
| craig-ollama | ollama/ollama | 11434 |
| craig-prometheus | prom/prometheus | 9090 |
| craig-grafana | grafana/grafana | 3000 |
| Unit | Command |
|---|---|
| craig-auth | uvicorn main:app --host 127.0.0.1 --port 8000 |
| craig-api-semantic | uvicorn main:app --host 127.0.0.1 --port 8001 |
| craig-api-vision | uvicorn vision_service:app --host 127.0.0.1 --port 8002 |
| craig-worker | python start_worker.py |
| mcp-postgres (optional) | supergateway --port 8010 --stdio "npx -y @modelcontextprotocol/server-postgres …" |
- Prometheus scrapes metrics from all FastAPI services via the /metrics endpoint (prometheus-fastapi-instrumentator).
- The worker exposes custom metrics on port 9092: craig_jobs_processed_total labelled by status (success/error).
- Grafana is accessible at craig.monitoring.weesoft.fr. Authentication is delegated to craig-auth via Caddy forward_auth.
# Check the status of all Python services
systemctl status craig-auth craig-api-semantic craig-api-vision craig-worker
# View worker logs in real time
journalctl -u craig-worker -f
# Restart all Python services
systemctl restart craig-auth craig-api-semantic craig-api-vision craig-worker
# Pull a new Ollama model
docker exec craig-ollama ollama pull mistral
# Check MCP Postgres service
systemctl status mcp-postgres
# View all analysis results stored via MCP
docker exec craig-postgres psql -U craig_admin -d craig \
-c "SELECT id, document_nom, question, created_at FROM craig_resultats ORDER BY created_at DESC LIMIT 10;"
27 entries tested — 0 errors, 0 skipped — auto-retry active on 11 entries (avg 2.5 iterations).
| Test | Category | Score | Coverage | Note |
|---|---|---|---|---|
| Bank Statement (2 months) | Finance | 100 | 100% | — |
| Invoice Analysis (2 pages) | Accounting | 97 | 100% | — |
| Contract Analysis (2 pages) | Legal | 88 | 100% | — |
| Finishes Schedule (multi-page) | Technical | 100 | 100% | — |
| CV / Resume (2 pages) | HR | 80 | 60% | missing email, experience |
| Pay Slip (2 months) | HR | 49 | 33% | missing employee_name, period |
| Detect Document Type (2 docs) | Sort | 100 | 100% | — |
| Finance Pack (3 docs) | Sort | 100 | 100% | — |
| Mixed Pack (CV + Contract + Payslip) | Sort | 100 | 100% | — |
| Gift Card — Anonymize (2 cards) | Anonymization | 70 | 100% | — |
| SEPA Transfer Order | Finance | 100 | 100% | — |
| Purchase Order | Accounting | 100 | 100% | — |
| Expense Report | Accounting | 100 | 100% | — |
| Extrait KBIS | Legal | 100 | 100% | — |
| Product Datasheet | Technical | 100 | 100% | — |
| Invoice Reconciliation vs Database | MCP | 100 | 100% | — |
| Lettre manuscrite | Manuscripts | 100 | 100% | — |
| Carte régionale PACA | Cartography | 100 | 100% | — |
| Carte de France | Cartography | 100 | 100% | — |
| Personal Data — GDPR Anonymization | Anonymization | 100 | 100% | — |
| Architectural Floor Plan | Technical | 50 | 0% | missing project_name, scale, area |
| Relevé de compte bancaire (FR) | Finance | 100 | 100% | — |
| Virement SEPA (FR) | Finance | 100 | 100% | — |
| Analyse de facture (FR) | Accounting | 100 | 100% | — |
| Plan d'appartement (FR) | Technical | 63 | 33% | missing project_name, scale |
| Extract + Store in PostgreSQL | MCP | 50 | 0% | missing meta fields |
| Full pipeline: OCR + Embed + Store | MCP | 50 | 0% | missing meta fields |
From a machine with Python 3 and access to the API:
pip install httpx cairosvg
# Auth via JWT token
export CRAIG_TOKEN="<your_jwt>"
# Smoke test on MCP only (~4 min)
python run_benchmark.py --category MCP --include-mcp --output mcp_smoke.json
# Full benchmark with MCP (~25 min)
python run_benchmark.py --include-mcp --output full_with_mcp.json
The --include-hidden flag adds catalog entries marked hidden:true (skipped by default).
| Component | Specification | Notes |
|---|---|---|
| CPU | 4 vCPU (Intel/AMD x86_64) | e.g. GCP n1-standard-4 |
| RAM | 15 GB | 16 GB recommended |
| GPU | 1× NVIDIA Tesla T4 — 16 GB VRAM | or equivalent: L4 (24 GB), A10, V100, RTX 3090/4090 |
| Disk | 100 GB SSD (pd-balanced) | ~25 GB for models, ~30 GB for OS+containers |
| OS | Ubuntu 22.04 LTS (kernel 6.8+) | or Debian 12, RHEL 9 |
| Network | Public IP (static recommended) | + DNS for 4 subdomains, TLS auto via Caddy |
- NVIDIA Driver — 535+ for T4, 550+ for L4
- Docker + Compose — container runtime
- nvidia-container-toolkit — GPU access from containers
- Python 3.11 — backend services (managed via venv)
- PostgreSQL 15 — application database (Docker)
- Redis 7 — job state and dynamic MCP connectors (Docker)
- RabbitMQ 3 — job queue for the worker (Docker)
- Ollama 0.6+ — local LLM runtime (Docker, GPU-enabled)
- Caddy 2 — reverse proxy with automatic Let's Encrypt TLS
| Model | Role | Size |
|---|---|---|
| minicpm-v | Vision (OCR, document understanding) | 5.5 GB |
| mistral-nemo:12b | Text (structuring, delivery, chat) | 7.1 GB |
| qwen2.5vl | Vision alternative (cartography, plans) | 6.0 GB |
| nomic-embed-text | Semantic embeddings (RAG) | 0.3 GB |
| Provider | Instance | GPU |
|---|---|---|
| Google Cloud (current) | n1-standard-4 + T4 | NVIDIA Tesla T4 (16 GB) |
| Google Cloud (upgrade) | g2-standard-4 + L4 | NVIDIA L4 (24 GB) — ~2× faster |
| AWS | g4dn.xlarge | NVIDIA Tesla T4 (16 GB) |
| Azure | NC4as T4 v3 | NVIDIA Tesla T4 (16 GB) |
| OVH / Scaleway | GPU instance with T4 / L4 | Equivalent EU options |
| On-premise | Workstation / server | RTX 3090/4090, A4000+, or 2× RTX 3060 |
- The current production VM runs in preemptible mode on GCP (≈€80/month), trading stability (max 24h uptime, 30s termination notice) for ~65% cost savings.
- A standard non-preemptible setup costs ≈€250/month for the same hardware.
- Recommendation: preemptible is fine for staging/dev workloads or production with an auto-restart mechanism (Cloud Scheduler hitting the start endpoint every 5 minutes).
- Chat mode — describe what you want to extract in plain language. The assistant generates a complete pipeline (sort, OCR, classify, structure_llm, embed, MCP…) and renders it as a chip list inside the conversation bubble.
- Editor mode — the classic step-by-step visual editor. Add, remove, reorder steps; edit each step's parameters (model, engine, prompt, source, conditions).
- A toggle at the top of the panel switches between the two modes at any time; the conversation history is preserved.
| Persona | Typical workflow |
|---|---|
| Business user | Type the request in Chat mode → click APPLY THIS PIPELINE. Done. |
| Hybrid user | Generate a draft in Chat → click EDIT to switch to Editor mode with the pipeline pre-filled → adjust 1-2 steps → APPLY. |
| Power user | Open Editor mode directly → build the pipeline manually step by step (same UX as the legacy Pipeline Creator). |
- Each generated pipeline is shown as a colored chip list (e.g. 1. SORT · florence-2-base, 2. OCR · paddleocr, 3. CLASSIFY, 4. LLM · mistral-nemo) — friendlier than raw JSON.
- The raw JSON is still available via a collapsible ▸ View JSON details block.
- Two action buttons: EDIT (open in Editor mode) and APPLY THIS PIPELINE →.
- Validation badge: green check (Pipeline ready) or red triangle (Pipeline invalid), plus the model that generated it.
- The chat keeps full context: ask follow-up questions like "add a step to save the result in PostgreSQL via MCP" or "change step 4 to use qwen instead of mistral" and the assistant produces an updated pipeline.
- The conversation can be cleared with the Reset button (top-right).
| You may | You may not |
|---|---|
| Use, install, copy and modify C.R.A.I.G for personal, educational or research purposes. | Resell C.R.A.I.G or any modified version of it. |
| Self-host the platform inside your organisation for non-commercial use. | Embed C.R.A.I.G into a commercial product or paid service. |
| Distribute the source to others under the same license. | Offer C.R.A.I.G as a hosted SaaS to third parties. |
| Submit pull requests and improvements (under the same license). | Remove or alter the license notices in the code. |
The full, legally binding text of the PolyForm Noncommercial License 1.0.0 is published by the PolyForm Project. Always refer to the official document — the table above is a plain-English summary, not the legal text.
- Official text — polyformproject.org/licenses/noncommercial/1.0.0
- In the C.R.A.I.G repository — see the LICENSE file at the root.
- License author — PolyForm Project (lead drafter: Heather Meeker).
Any commercial use of C.R.A.I.G — including, but not limited to, reselling, embedding it inside a commercial product, or providing it as a hosted service to third parties — requires a separate commercial license from WeeSoft.
To request a commercial license, write to: contact@weesoft.fr.
- Deployment & installation on your infrastructure (on-premise or private cloud).
- Configuration & integration with your existing systems (databases, ERPs, document repositories) via the MCP layer.
- Custom pipelines tailored to your document types (invoices, contracts, technical plans, audio…).
- Training & onboarding for your teams (Lab usage, prompt design, monitoring).
- Production support with defined SLA, monitoring and incident response.
These services are billed independently of the underlying license and are available for both non-commercial and commercial users. Contact: contact@weesoft.fr.
- Pull requests are welcome — by contributing you agree that your contribution is licensed under the same PolyForm Noncommercial License 1.0.0.
- For larger changes, please open an issue first to discuss the scope.
This page provides a high-level summary of the C.R.A.I.G licensing model. It is not legal advice and does not replace the official PolyForm Noncommercial License 1.0.0 text. In case of any discrepancy between this summary and the official license, the official license text prevails.