WHAT'S NEW
C.R.A.I.G v1.0 is the first public release. This section presents the complete feature set shipped with this version.
RELEASE HIGHLIGHTS
LAB — AI Document Analysis
Upload images or scanned documents and let C.R.A.I.G orchestrate a multi-step AI pipeline to extract, classify, and structure their content.
AUTH — User Management
Centralized user registry with JWT authentication. Create accounts, manage access, and generate API tokens directly from the Portal.
INCIDENT — Bug & Feature Reports
Integrated feedback form to report bugs or request new features. All reports are tracked for continuous improvement.
Multi-Model AI Orchestration
LLaVA generates the pipeline dynamically. Florence-2 handles OCR and visual tasks. Mistral / Qwen structure the extracted data.
Multi-Page Document Analysis
Upload all pages of a document at once. C.R.A.I.G automatically selects aggregation mode (≤10 pages) or sliding window mode (>10 pages) for optimal extraction.
MCP Client Connectors
Send analysis results to external systems via the Model Context Protocol. Register connectors dynamically from the Lab sidebar — no code required.
Prompt Catalog
12 ready-to-use business prompts across Finance, Accounting, Legal, Technical, and HR domains. One click applies the prompt and selects the optimal model.
Architecture Overview
Interactive Mermaid diagram of the full C.R.A.I.G architecture displayed as the Lab home screen. Printable directly from the toolbar.
Async Job Queue
Jobs are processed asynchronously via RabbitMQ. Results are delivered by webhook and stored in Redis for real-time polling.
Monitoring
Prometheus metrics exposed by each service. Grafana dashboards available at craig.monitoring.weesoft.fr (SSO-protected).
TECHNOLOGY STACK
Python 3.11, FastAPI, RabbitMQ, Redis, PostgreSQL 15, Ollama 0.6.8, LLaVA, Mistral, Qwen2.5-VL, Florence-2, MCP (Model Context Protocol), React + TypeScript, Mermaid, Caddy, Docker, Prometheus / Grafana, GCP (Tesla T4 GPU)
OVERVIEW
C.R.A.I.G — Cognitive Reasoning Agent for Intelligent Graphical — is an AI platform designed to automate document analysis using a chain of specialized models.
ARCHITECTURE

C.R.A.I.G is composed of four independent services that communicate asynchronously:

Service | Role | Port
craig-auth | JWT authentication, user management, token generation | 8000
craig-api-semantic | Job reception, RabbitMQ publishing, Redis status, webhook receiver | 8001
craig-api-vision | Florence-2 inference: OCR, classification, object detection, captioning | 8002
craig-worker | RabbitMQ consumer: orchestrates LLaVA → vision pipeline → webhook | -
JOB LIFECYCLE
1. SUBMIT: The frontend (Lab) sends an image and a question via POST /ask. The API returns a jobId immediately.
2. QUEUE: The job is published to the craig_tasks RabbitMQ queue as a persistent message.
3. ORCHESTRATE: The worker consumes the job and calls LLaVA with the image to generate a pipeline JSON describing which vision steps to execute.
4. EXECUTE: The worker executes each pipeline step against craig-api-vision (Florence-2) and, optionally, a text LLM (Mistral/Qwen) for structuring.
5. DELIVER: Results are sent via webhook to craig-api-semantic and stored in Redis. The frontend polls /jobs/{id}/status until the status is "done".
ACCESS
URL | Service
craig.weesoft.fr | Portal (main entry point)
craig.lab.weesoft.fr | Lab (document analysis interface)
craig.feedback.weesoft.fr | Incident (bug & feature reports)
craig.api.weesoft.fr | Semantic API (public endpoint)
craig.auth.weesoft.fr | Auth API
craig.monitoring.weesoft.fr | Grafana (SSO)
LAB
The Lab is the main document analysis interface. Upload an image, select an AI model, ask a question, and get structured results.
GETTING STARTED
1. SIGN IN: Log in with your C.R.A.I.G credentials at craig.lab.weesoft.fr. Your session is shared across the platform.
2. EXPLORE THE ARCHITECTURE: The Lab opens on the Architecture screen, an interactive Mermaid diagram of all C.R.A.I.G components. Use the zoom controls or click FIT to fill the screen. Click LAUNCH THE LAB or any sidebar icon to proceed.
3. UPLOAD A DOCUMENT: Drag and drop one or more images (PNG, JPEG, TIFF, PDF page) into the SOURCE panel, or click to browse. For multi-page documents, set the Number of pages field first. Click RESET next to the Source title to clear all loaded documents and start over.
4. SELECT AN AI MODEL: Choose the orchestration model from the dropdown, or use the Prompt Catalog (book icon) to apply a business prompt, which sets both the model and the instruction in one click.
5. AI INSTRUCTIONS (optional): Type a natural language question or task about the document (e.g. "Extract the invoice total" or "Identify the patient name"). Whether this field is required depends on your pipeline:
Pipeline JSON | structuration_llm | prompt_by_type | Required?
absent | - | - | Yes: used to generate the pipeline
present | absent | - | No: no LLM text step
present | present | absent | Yes: becomes the LLM task
present | present | present | No: each doc type has its own prompt
6. RUN THE ANALYSIS: Click LAUNCH THE PIPELINE. Jobs are processed asynchronously. For multi-page documents, a progress badge shows each page as it completes. The final JSON result appears in the Output panel when all pages are done.
7. SEND TO AN MCP CONNECTOR: Once a result is available, open the MCP Connectors panel (plug icon). Connected servers show a green dot and a SEND button. Click it to push the result to an external system (e.g. PostgreSQL).
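The decision table in step 5 can be read as a small function. The sketch below is only an illustration of that table: the function name is hypothetical, and treating a pipeline with several structuration_llm steps as "required as soon as one of them lacks prompt_by_type" is an assumption the documentation does not state.

```python
import json
from typing import Optional


def instruction_required(pipeline_json: Optional[str]) -> bool:
    """Return True when the AI Instructions field has to be filled in."""
    if not pipeline_json or not pipeline_json.strip():
        # No custom pipeline: the instruction is used to generate one.
        return True
    pipeline = json.loads(pipeline_json)
    llm_steps = [
        step for step in pipeline.values()
        if isinstance(step, dict) and step.get("action") == "structuration_llm"
    ]
    if not llm_steps:
        # Pipeline without an LLM text step: nothing consumes the instruction.
        return False
    # Assumption: one LLM step without prompt_by_type is enough to need it.
    return any("prompt_by_type" not in step for step in llm_steps)


# The four rows of the table, in order.
print(instruction_required(None))
print(instruction_required('{"etape1": {"action": "ocr"}}'))
print(instruction_required('{"etape1": {"action": "structuration_llm"}}'))
```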
AI MODELS
Model | Type | Use Case
LLaVA | Multimodal LLM | Pipeline orchestration: analyses the image and generates the sequence of vision steps
Florence-2-base | Vision model | OCR, image classification (tri), object detection, captioning
Mistral | Text LLM | Post-processing: structures raw OCR text into JSON
Qwen 2.5 VL | Multimodal LLM | Alternative to LLaVA for pipeline generation, handles complex layouts
PIPELINE STEPS (VISION)
  • tri — Classifies the document: determines if it is readable (lisible: true/false) and provides a detailed caption.
  • ocr — Extracts all text content from the image using Florence-2's OCR engine.
  • detection — Locates and labels objects or regions of interest within the image with bounding boxes.
  • poser_question — Answers a specific visual question about the image content (VQA).
  • structuration_llm — Sends raw OCR text to a text LLM (Mistral or Qwen) to return structured JSON output.
PIPELINE (CUSTOM)

Advanced users can provide a custom pipeline JSON via the Pipeline field, bypassing the LLaVA generation step:

{
  "etape1": { "action": "tri" },
  "etape2": { "action": "ocr", "if_previous": "etape1.lisible" },
  "etape3": { "action": "structuration_llm", "modele": "mistral" }
}

Steps are executed in alphabetical order. The if_previous field conditionally skips a step if the referenced field is falsy.
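As a rough illustration of the two rules above, the following sketch runs a pipeline dictionary in key order and skips a step whose if_previous reference is falsy. It is not the worker's code: run_pipeline and the handler functions are hypothetical stand-ins for the real calls to craig-api-vision and the text LLM.

```python
from typing import Any, Callable, Dict

Handler = Callable[[Dict[str, Any]], Dict[str, Any]]


def run_pipeline(pipeline: Dict[str, Dict[str, Any]],
                 handlers: Dict[str, Handler]) -> Dict[str, Dict[str, Any]]:
    """Execute steps in alphabetical key order, honouring if_previous."""
    results: Dict[str, Dict[str, Any]] = {}
    for name in sorted(pipeline):  # plain string sort of the step names
        step = pipeline[name]
        condition = step.get("if_previous")
        if condition:
            ref_step, _, field = condition.partition(".")
            if not results.get(ref_step, {}).get(field):
                results[name] = {"skipped": True}
                continue
        results[name] = handlers[step["action"]](step)
    return results


# Hypothetical handlers that return canned results instead of calling a model.
demo_handlers: Dict[str, Handler] = {
    "tri": lambda step: {"lisible": False, "description": "blurred scan"},
    "ocr": lambda step: {"texte": "...", "succes": True},
    "structuration_llm": lambda step: {"succes": True},
}
demo_pipeline = {
    "etape1": {"action": "tri"},
    "etape2": {"action": "ocr", "if_previous": "etape1.lisible"},
    "etape3": {"action": "structuration_llm", "modele": "mistral"},
}
print(run_pipeline(demo_pipeline, demo_handlers))
```

One consequence worth checking against the real worker: a plain string sort places "etape10" before "etape2", so if the implementation sorts names this way, pipelines with ten or more steps would need zero-padded names such as "etape02".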

SIDEBAR PANELS
Panel | Description
Architecture | Full Mermaid diagram of C.R.A.I.G components. Shown by default on startup. Zoomable and printable.
Explorer | Main document analysis interface: source, pipeline, output.
Prompt Catalog | 12 ready-to-use prompts across 5 business domains. Click Apply to set the model and prompt in one click.
MCP Connectors | Register external MCP servers and send analysis results to them directly from the sidebar.
Settings | Temperature, context window, system prompts, and system health check.
MULTI-PAGE DOCUMENTS
  • Set the Number of pages field before launching the pipeline.
  • Upload one image per page. C.R.A.I.G assigns a shared documentId and processes each page in sequence.
  • For ≤10 pages: all OCR texts are merged before LLM extraction (aggregation mode).
  • For >10 pages: sliding window mode — pages are processed in batches of 5 with context carryover.
  • See the Multi-Page Analysis reference for full details.
PROMPT CATALOG
  • Access the catalog via the book icon in the sidebar activity bar.
  • 12 prompts across Finance, Accounting, Legal, Technical, and HR domains.
  • Each prompt specifies a recommended model (LLaVA, Mistral, or Qwen).
  • Click APPLY: the model selector and system prompt update instantly. The catalog closes and the pipeline is ready to run.
  • Changing the model mid-analysis automatically cancels the current job.
JOB STATUS
  • pending: job is queued or processing
  • done: results are available
  • annule: job was cancelled
  • error: processing failed

The Lab polls the status endpoint every 2 seconds. With a Tesla T4 GPU, processing typically completes in 15–60 seconds per page depending on model and document complexity.
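A client that polls the status endpoint the same way could look like the sketch below. The function name, the 300-second timeout, and the injectable fetch_status callable are assumptions made for illustration; in practice fetch_status would perform GET /jobs/{jobId}/status with a Bearer token and return the decoded JSON.

```python
import time
from typing import Any, Callable, Dict

# Statuses after which polling can stop (everything except "pending").
TERMINAL_STATUSES = {"done", "annule", "error"}


def wait_for_job(fetch_status: Callable[[], Dict[str, Any]],
                 interval: float = 2.0,
                 timeout: float = 300.0,
                 sleep: Callable[[float], None] = time.sleep,
                 clock: Callable[[], float] = time.monotonic) -> Dict[str, Any]:
    """Poll until the job leaves the pending state or the timeout expires."""
    deadline = clock() + timeout
    while True:
        payload = fetch_status()
        if payload.get("status") in TERMINAL_STATUSES:
            return payload
        if clock() >= deadline:
            raise TimeoutError("job still pending after %.0f s" % timeout)
        sleep(interval)


# Demo with canned responses and a no-op sleep, so nothing touches the network.
canned = iter([{"status": "pending"}, {"status": "done", "result": "{}"}])
print(wait_for_job(lambda: next(canned), sleep=lambda seconds: None))
```

Passing sleep and clock as parameters keeps the loop testable without real waiting.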

CONFIGURATION
All parameters can be adjusted at runtime from the Lab's Settings panel. No service restart is required.
SETTINGS PANEL
Parameter | Default | Description
Temperature | 0.00 | Controls output randomness. 0.0 = fully deterministic. Raise to 0.3–0.5 for more creative pipeline generation.
Context Window (NumCtx) | 8192 | Token context passed to Ollama. Larger values handle longer documents but use more RAM/VRAM.
System Prompts | appsettings.json | Edit the system prompt for LLaVA, Qwen and Mistral. Click Apply Prompts: changes are stored in Redis and take effect immediately.
PROMPTS — SETTINGS PANEL VS PIPELINE EDITOR

The Lab offers two complementary ways to control AI behaviour:

Where | What it controls | When to use
Settings Panel → System Prompts | The orchestration prompt sent to LLaVA / Mistral / Qwen to generate the pipeline JSON. | When you want to tune how the model interprets a document type (e.g. always extract IBAN, always output French).
Pipeline Editor (central zone) | The pipeline JSON itself: the list of actions (tri, ocr, detection…) that Florence-2 will execute. | When you already know exactly which actions to run and want to bypass LLM generation entirely.

Both can be used together: set the system prompt in Settings to improve automatic generation, and override with the editor when you need precise control.

SYSTEM CHECK PANEL

The Settings panel includes a System Check section that verifies the status of all installed components in real time. No manual SSH required.

Component | What is checked
RabbitMQ | Queue broker connectivity. Jobs cannot be submitted if unavailable.
Redis | Job status store. Polling will not work if unavailable.
Ollama | LLM inference service reachability.
LLaVA / Mistral / Qwen2.5-VL | Model presence in Ollama (ollama list). Status missing means the model was not pulled yet.
Florence-2 | Vision microservice (OCR / detection / tri) reachability.

Click REFRESH to re-run all checks. Statuses: ok, missing, unavailable.

HOW PROMPTS WORK
1. EDIT: Open the Settings panel, select a model tab (llava / qwen / mistral) and edit the system prompt.
2. APPLY: Click Apply Prompts. The prompt is sent to POST /config and stored in Redis under craig:runtime_config.
3. EFFECT: The worker reads the Redis config on each job. The override replaces the appsettings.json value for the current session.
4. PERSISTENCE: Prompts survive service restarts as long as Redis is running (TTL: none). To make them permanent, update appsettings.json and redeploy.
STATIC CONFIGURATION (appsettings.json)

Server-side defaults live in /opt/craig-services/api-semantic/appsettings.json. Edit this file to permanently change prompts, MCP connectors, or Ollama endpoints. A service restart is required after file changes.

  • AI.OllamaModel — Default orchestration model.
  • AIProfiles.*.Prompt — Default system prompts (overridable at runtime).
  • AIProfiles.*.NumCtx / Temperature — Default context and creativity per model.
  • MCPServers — MCP connector endpoints (ERP, Google Drive…).
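The layering described under HOW PROMPTS WORK (runtime values from Redis taking precedence over the appsettings.json defaults) can be sketched as a dictionary overlay. The shape of the value stored under craig:runtime_config is not documented here; the sketch assumes a JSON object keyed by model name, and effective_profile is a hypothetical helper.

```python
import json
from typing import Any, Dict, Optional


def effective_profile(defaults: Dict[str, Any],
                      runtime_raw: Optional[str],
                      model: str) -> Dict[str, Any]:
    """Overlay the runtime override (if any) on the static defaults."""
    profile = dict(defaults.get("AIProfiles", {}).get(model, {}))
    if runtime_raw:
        # Assumed layout: {"<model>": {"Prompt": "...", "Temperature": ...}}
        overrides = json.loads(runtime_raw).get(model, {})
        profile.update(overrides)
    return profile


# Static defaults as they might appear in appsettings.json (values illustrative).
static_defaults = {
    "AIProfiles": {
        "mistral": {"Prompt": "Return structured JSON.", "Temperature": 0.0, "NumCtx": 8192}
    }
}
# What a GET on craig:runtime_config might return after Apply Prompts.
runtime_value = '{"mistral": {"Prompt": "Always answer in French."}}'
print(effective_profile(static_defaults, runtime_value, "mistral"))
```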
AUTH
The Auth module manages all users and API access. It is accessible from the Portal via the AUTH button (administrator only).
AUTHENTICATION FLOW
1. LOGIN: POST your email and password to /auth/login. A JWT bearer token is returned.
2. SSO COOKIE: The Portal sets a craig_token cookie on the .weesoft.fr domain so it is shared with Lab and Incident automatically.
3. TOKEN VERIFICATION: Each service validates tokens by calling GET /auth/verify on craig-auth. No shared secret is needed between services.
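For programmatic clients, the login and verification calls might be prepared as follows. This sketch only builds the requests and does not send them. The JSON body with email and password fields, and passing the token in an Authorization header to /auth/verify, are assumptions: the documentation names the endpoints but not their exact request format.

```python
import json
import urllib.request

AUTH_BASE = "https://craig.auth.weesoft.fr"  # Auth API host from the ACCESS table


def build_login_request(email: str, password: str,
                        base: str = AUTH_BASE) -> urllib.request.Request:
    """Prepare POST /auth/login (assumed JSON body: email + password)."""
    body = json.dumps({"email": email, "password": password}).encode("utf-8")
    return urllib.request.Request(
        base + "/auth/login",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_verify_request(token: str,
                         base: str = AUTH_BASE) -> urllib.request.Request:
    """Prepare GET /auth/verify (assumed: token sent as a Bearer header)."""
    return urllib.request.Request(
        base + "/auth/verify",
        headers={"Authorization": "Bearer " + token},
        method="GET",
    )


login = build_login_request("user@example.com", "secret")
print(login.get_method(), login.full_url)
```

Sending either request is then a matter of urllib.request.urlopen(request) and decoding the JSON response.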
USER MANAGEMENT
  • Create user — Enter an email and password in the New User form and click Add. A confirmation appears and the user list refreshes.
  • List users — All registered accounts are displayed with their login and active/inactive status.
  • Delete user — Click the Delete button next to a user. A confirmation dialog prevents accidental deletion.
  • Generate API token — Click Token next to a user to generate a long-lived JWT. Copy it to authenticate programmatic API calls.
API TOKEN USAGE

Use the generated token as a Bearer token in any API call:

# Example: submit a job via the semantic API
curl -X POST https://craig.api.weesoft.fr/ask \
  -H "Authorization: Bearer <token>" \
  -F "image=@document.png" \
  -F "question=Extract the invoice total" \
  -F "modeleIA=llava"
SECURITY NOTES
  • Passwords are hashed server-side. They are never stored in plaintext.
  • JWT tokens expire after 24 hours by default. Generated API tokens have a longer TTL.
  • The /jobs/{id}/webhook endpoint is internal-only and does not require authentication.
  • Grafana monitoring is protected by the Auth service via Caddy forward_auth.
INCIDENT
The Incident module lets any authenticated user report a bug or request a new feature. Access it via the Portal → INCIDENT button.
HOW TO SUBMIT A REPORT
1. SELECT REPORT TYPE: Choose Bug to report a malfunction, or Feature to suggest an improvement.
2. FILL IN THE SUBJECT: Write a short, descriptive title summarizing the issue (e.g. "Invisible button on mobile").
3. DESCRIBE THE ISSUE: Provide as much context as possible: steps to reproduce, expected behavior, screenshots if relevant.
4. SEND: Click Send. A confirmation is displayed. You can submit another report immediately.
REPORT TYPES
Type | When to use
Bug | Something is broken, produces an error, or does not behave as expected.
Feature | A new capability you would like added, or an existing behaviour you would like changed.
GOOD PRACTICES
  • One report per issue — do not combine several problems in a single submission.
  • For bugs, include the exact error message and the steps that led to it.
  • For feature requests, explain the use case and the expected outcome.
  • If the issue is urgent, contact the WeeSoft team directly in addition to filing a report.
PIPELINE FORMAT
The pipeline is a JSON object describing the ordered sequence of AI steps to execute on a document. It is generated by LLaVA or provided manually.
STRUCTURE
{
  "etape1": {
    "action": "tri"                   // required
  },
  "etape2": {
    "action": "ocr",
    "if_previous": "etape1.lisible",  // optional condition
    "model": "florence-2-base"        // optional, default: florence-2-base
  },
  "etape3": {
    "action": "structuration_llm",
    "modele": "mistral",              // mistral | qwen2.5vl
    "source": "etape2"                // which step provides the text
  }
}
AVAILABLE ACTIONS
Action | Model | Output fields
tri | Florence-2 | description, lisible (bool)
ocr | Florence-2 | texte, succes (bool)
detection | Florence-2 | bboxes (array), labels (array)
poser_question | Florence-2 | reponse, succes (bool)
structuration_llm | Mistral / Qwen | resultat_structure (JSON string), succes (bool)
CONDITIONAL STEPS

Use if (or the alias if_previous) to skip a step based on a previous result. Supported syntax:

Syntax | Behaviour
"etapeX.field" | Skip if field is falsy (false, null, 0, empty string)
"etapeX.field == true" | Skip unless field is strictly true
"etapeX.field == false" | Skip unless field is falsy
"etapeX.field == somevalue" | Skip unless field equals somevalue (case-insensitive)
// Only store the document if the identity check found a known client
"etape4": {
  "action": "mcp_store_mock",
  "if": "etape3.found == true"
}
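The four condition forms in the syntax table can be expressed as one small evaluator. The sketch below follows the table literally; should_run is a hypothetical name, and details such as whitespace handling and quoting of the compared value are assumptions.

```python
from typing import Any, Dict


def should_run(condition: str, results: Dict[str, Dict[str, Any]]) -> bool:
    """Return True when the step guarded by `condition` should execute."""
    left, separator, right = condition.partition("==")
    step_name, _, field = left.strip().partition(".")
    value = results.get(step_name, {}).get(field)
    if not separator:
        # "etapeX.field": run only if the field is truthy.
        return bool(value)
    expected = right.strip().strip("'\"")
    if expected.lower() == "true":
        # "== true": the field has to be strictly True.
        return value is True
    if expected.lower() == "false":
        # "== false": run only if the field is falsy.
        return not value
    # "== somevalue": case-insensitive comparison.
    return str(value).lower() == expected.lower()


previous = {"etape1": {"lisible": True}, "etape3": {"found": True, "doc_type": "Invoice"}}
print(should_run("etape3.found == true", previous))
print(should_run("etape3.doc_type == invoice", previous))
```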
ADAPTIVE PROMPTS (prompt_by_type)

Add prompt_by_type to a structuration_llm step to automatically select the extraction prompt based on the doc_type returned by a preceding classify step. When this field is present, the AI Instructions field in the UI is ignored.

"etape5": {
  "action": "structuration_llm",
  "modele": "mistral",
  "source": "etape1",
  "prompt_by_type": {
    "bank_statement": "Extract IBAN, balances, dates and amounts as JSON.",
    "invoice": "Extract vendor, invoice number, total, VAT and due date as JSON.",
    "CV": "Extract candidate name, skills, experience and last position as JSON.",
    "default": "Summarize the key information as structured JSON."
  }
}

The default key is used when no entry matches the detected doc_type.
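The selection rule can be sketched in a few lines. select_prompt is a hypothetical helper; exact-match lookup of doc_type and falling back to the UI instruction when no default key exists are assumptions.

```python
from typing import Any, Dict


def select_prompt(step: Dict[str, Any], doc_type: str,
                  ui_instruction: str = "") -> str:
    """Pick the extraction prompt for a structuration_llm step."""
    prompts = step.get("prompt_by_type")
    if not prompts:
        # No adaptive prompts: the AI Instructions field becomes the task.
        return ui_instruction
    if doc_type in prompts:
        return prompts[doc_type]
    # No entry for this doc_type: use the "default" key if present.
    return prompts.get("default", ui_instruction)


example_step = {
    "action": "structuration_llm",
    "prompt_by_type": {
        "invoice": "Extract vendor, invoice number, total, VAT and due date as JSON.",
        "default": "Summarize the key information as structured JSON.",
    },
}
print(select_prompt(example_step, "invoice"))
print(select_prompt(example_step, "passport"))
```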

MULTI-PAGE ANALYSIS
C.R.A.I.G can analyze multi-page documents by processing each page independently via OCR, then aggregating the full text before running the LLM extraction. Two modes are available depending on the number of pages declared by the user.
HOW IT WORKS
  • The user specifies the total number of pages in the document before launching the pipeline.
  • Each page image is submitted as a separate job, sharing a common documentId (UUID).
  • Each job runs OCR and stores the extracted text in Redis under craig:doc:{documentId}:page:{i}.
  • An atomic counter tracks completed pages. When the last page is processed, the final LLM extraction is triggered.
  • All temporary Redis keys are cleaned up after the final result is produced.
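The bookkeeping described in the list above can be illustrated with an in-memory stand-in for Redis. Only the page key pattern craig:doc:{documentId}:page:{i} comes from the documentation; the counter key name, the FakeRedis class, and the use of pageIndex as {i} are assumptions made for this sketch.

```python
from typing import Dict, Optional


class FakeRedis:
    """Tiny in-memory stand-in exposing the few operations the sketch needs."""

    def __init__(self) -> None:
        self.store: Dict[str, object] = {}

    def set(self, key: str, value: str) -> None:
        self.store[key] = value

    def get(self, key: str):
        return self.store.get(key)

    def incr(self, key: str) -> int:
        self.store[key] = int(self.store.get(key, 0)) + 1
        return self.store[key]

    def delete(self, *keys: str) -> None:
        for key in keys:
            self.store.pop(key, None)


def record_page(redis: FakeRedis, document_id: str, page_index: int,
                total_pages: int, text: str) -> Optional[str]:
    """Store one page's OCR text; return the full text once all pages are in."""
    redis.set(f"craig:doc:{document_id}:page:{page_index}", text)
    counter_key = f"craig:doc:{document_id}:done"  # hypothetical counter key
    if redis.incr(counter_key) < total_pages:
        return None  # other pages are still being processed
    page_keys = [f"craig:doc:{document_id}:page:{i}" for i in range(total_pages)]
    full_text = "\n".join(str(redis.get(key) or "") for key in page_keys)
    redis.delete(*page_keys, counter_key)  # clean up temporary keys
    return full_text


fake = FakeRedis()
print(record_page(fake, "doc-1", 1, 2, "second page"))
print(record_page(fake, "doc-1", 0, 2, "first page"))
```

With a real Redis server, INCR is atomic, which is what lets several worker processes agree on which job processed the last page.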
ANALYSIS MODES
Mode | Threshold | Behaviour
Aggregation | ≤ 10 pages | All OCR texts are concatenated into a single prompt. Mistral extracts the requested information from the full document in one pass.
Sliding Window | > 10 pages | Pages are processed in batches of 5. Each batch receives a summary of the previous batch as context. A final synthesis pass consolidates all partial results.
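The mode selection and batching in the table can be sketched as below. plan_analysis is a hypothetical helper; the threshold of 10 and batch size of 5 come from the table, while the behaviour for a trailing partial batch is an assumption.

```python
from typing import List, Tuple


def plan_analysis(total_pages: int, threshold: int = 10,
                  batch_size: int = 5) -> Tuple[str, List[List[int]]]:
    """Return the analysis mode and the zero-based page indexes per LLM pass."""
    if total_pages < 1:
        raise ValueError("total_pages must be at least 1")
    if total_pages <= threshold:
        # Aggregation: every page goes into a single prompt.
        return "aggregation", [list(range(total_pages))]
    # Sliding window: consecutive batches, the last one may be shorter.
    batches = [
        list(range(start, min(start + batch_size, total_pages)))
        for start in range(0, total_pages, batch_size)
    ]
    return "sliding_window", batches


print(plan_analysis(3))
print(plan_analysis(12))
```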
FRONTEND USAGE
  • Upload all page images in the Lab Source panel.
  • Set the Number of pages field to the actual page count of the document.
  • The label updates automatically: aggregation for ≤10 pages, sliding window for >10.
  • A green progress badge shows real-time status: Page 2 / 5 processed…
  • The final JSON result is delivered once all pages have been processed.
JOB PAYLOAD FIELDS
Field | Type | Description
documentId | string (UUID) | Shared identifier across all pages of the same document.
pageIndex | integer | Zero-based index of the current page (0 = first page).
totalPages | integer | Total number of pages in the document.
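A client preparing one job per page could generate these three fields as follows. build_page_payloads is a hypothetical helper, and how the fields are attached to the /ask request (form fields, JSON, or otherwise) is not specified in this document.

```python
import uuid
from typing import Dict, List, Union


def build_page_payloads(total_pages: int) -> List[Dict[str, Union[str, int]]]:
    """Create the per-page fields, all sharing one freshly generated documentId."""
    document_id = str(uuid.uuid4())
    return [
        {"documentId": document_id, "pageIndex": index, "totalPages": total_pages}
        for index in range(total_pages)
    ]


for payload in build_page_payloads(3):
    print(payload)
```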
RESULT STATUS VALUES
Status | Meaning
page_traitee | This page's OCR is complete. Waiting for remaining pages.
document_complet | All pages processed. Final extraction result is included.
MCP CONNECTORS
C.R.A.I.G acts as an MCP client (Model Context Protocol). After an analysis, results can be sent to any registered MCP server — PostgreSQL, GitHub, Slack, Google Drive, and more — without writing any code.
WHAT IS MCP?
  • MCP (Model Context Protocol) is an open standard that allows AI applications to communicate with external tools and data sources via a unified interface.
  • C.R.A.I.G is an MCP Host + Client: it initiates connections to MCP Servers and calls their tools.
  • MCP Servers expose tools (e.g. insert_row, upload_file, create_issue) that C.R.A.I.G can invoke with the analysis JSON as payload.
  • C.R.A.I.G does not store analysis results on disk — all data flows to the external MCP server.
ARCHITECTURE
C.R.A.I.G (MCP Client)
│
├──► supergateway (stdio → SSE bridge)
│      └──► mcp-server-postgres ──► PostgreSQL (craig_resultats)
│
├──► MCP Server GitHub ──► Create issue / push file
├──► MCP Server Slack ──► Post message to channel
└──► MCP Server Drive ──► Upload JSON to Google Drive
REGISTERING A CONNECTOR
  • Open the Lab sidebar and click the plug icon (MCP Connectors panel).
  • Click + ADD and enter a name and the SSE URL of the MCP server (e.g. http://127.0.0.1:8010/sse).
  • Click SAVE. The connector is stored in Redis and immediately available.
  • Connectors declared in appsettings.json are static (read-only). User-added connectors are marked custom and can be deleted.
SENDING A RESULT
  • Run an analysis in the Lab. Once a result is available, the MCP panel shows a green "Result ready" banner.
  • Each connector with status ok shows a SEND button.
  • Clicking SEND calls POST /mcp/send. The backend connects to the MCP server via SSE, lists available tools, and calls the first tool with the analysis JSON as argument.
  • The tool name and status are displayed below the connector card.
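The "list tools, then call the first one" behaviour described above can be sketched without any MCP library by injecting the two operations. send_to_first_tool, the callables, and the shape of the returned dictionary are all hypothetical; how the analysis JSON is mapped onto the tool's arguments is not documented, so the sketch passes it through unchanged.

```python
from typing import Any, Callable, Dict, List


def send_to_first_tool(list_tools: Callable[[], List[str]],
                       call_tool: Callable[[str, Dict[str, Any]], Any],
                       analysis: Dict[str, Any]) -> Dict[str, str]:
    """Call the first tool exposed by a connector with the analysis result."""
    tools = list_tools()
    if not tools:
        return {"status": "error", "detail": "connector exposes no tools"}
    tool_name = tools[0]  # the documented behaviour: first tool in the list
    call_tool(tool_name, analysis)
    return {"status": "ok", "tool": tool_name}


# Demo with fake operations that record the call instead of contacting a server.
calls = []
outcome = send_to_first_tool(
    lambda: ["query", "list_tables"],
    lambda name, payload: calls.append((name, payload)),
    {"resultat_final": {"total": "120.00"}},
)
print(outcome, calls)
```

Because the first tool is selected, the order in which a server lists its tools determines what SEND does; GET /mcp/tools/{name} shows that order.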
POSTGRESQL DEMO SETUP

Install a local MCP PostgreSQL server on the VM using supergateway (no Docker image required for the MCP layer).

# Install supergateway globally
npm install -g supergateway

# Create the results table in craig-postgres
docker exec craig-postgres psql -U craig_admin -d craig -c "
CREATE TABLE IF NOT EXISTS craig_resultats (
id SERIAL PRIMARY KEY,
document_nom TEXT,
modele TEXT,
question TEXT,
resultat JSONB,
created_at TIMESTAMP DEFAULT NOW()
);"

# Start the MCP server on port 8010 (wrap this command in a systemd unit, e.g. mcp-postgres, to run it as a service)
supergateway --port 8010 \
--stdio "npx -y @modelcontextprotocol/server-postgres \
postgresql://craig_admin:PASSWORD@127.0.0.1:5432/craig"
API ENDPOINTS
Method | Path | Description
GET | /mcp/status | List all connectors (static + dynamic) with connectivity status.
GET | /mcp/connectors | List dynamic connectors stored in Redis.
POST | /mcp/connectors | Add a connector {"name": "…", "url": "…"}.
DELETE | /mcp/connectors/{name} | Remove a dynamic connector.
GET | /mcp/tools/{name} | List available tools on a connector.
POST | /mcp/send | Send analysis result to a connector {"serverName": "…", "data": {…}}.
API REFERENCE
This section lists the endpoints exposed by craig-api-semantic (port 8001, public host craig.api.weesoft.fr). All routes require a Bearer token except /health and the webhook receiver.
SUBMIT A JOB
POST /ask

Form data:
  image  File    PNG / JPEG image
  question  string  Natural language question
  modeleIA  string  llava | qwen2.5vl (default: llava)
  pipelineSaisi string Optional custom pipeline JSON

Response:
  { "jobId": "uuid" }
POLL JOB STATUS
GET /jobs/{jobId}/status

Response (pending):
  { "status": "pending" }

Response (done):
  {
    "status": "done",
    "result": "{ pipeline, resultats, resultat_final }"
  }
BATCH PROCESSING
POST /process/batch

JSON body:
{
  "transactionId": "string",
  "webhookUrl": "https://your-endpoint/callback",
  "imagesBase64": ["base64...", "base64..."],
  "questionGlobale": "Extract all dates",
  "modeleIA": "llava"
}
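Building this body from raw image bytes might look like the sketch below. build_batch_body is a hypothetical helper; it assumes imagesBase64 expects standard Base64 strings without a data-URI prefix, which the documentation does not confirm.

```python
import base64
import json
from typing import Any, Dict, List


def build_batch_body(transaction_id: str, webhook_url: str,
                     images: List[bytes], question: str,
                     model: str = "llava") -> Dict[str, Any]:
    """Assemble the JSON body for POST /process/batch."""
    return {
        "transactionId": transaction_id,
        "webhookUrl": webhook_url,
        # Assumption: plain Base64, one string per image, in page order.
        "imagesBase64": [base64.b64encode(data).decode("ascii") for data in images],
        "questionGlobale": question,
        "modeleIA": model,
    }


# Placeholder bytes stand in for real image files read with open(path, "rb").
body = build_batch_body(
    "tx-001",
    "https://your-endpoint/callback",
    [b"page-1-bytes", b"page-2-bytes"],
    "Extract all dates",
)
print(json.dumps(body, indent=2))
```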
OTHER ENDPOINTS
Method | Path | Description
GET | /health | Returns RabbitMQ + Redis connectivity status
GET | /resources | Returns available AI models and actions
POST | /jobs/{id}/cancel | Marks a job as cancelled in Redis
POST | /jobs/{id}/webhook | Internal: receives results from the worker
INFRASTRUCTURE
C.R.A.I.G runs on a single GCP VM (europe-west1-b, Tesla T4 GPU) with Docker for data services, systemd for Python services, and optional MCP servers managed manually by the operator.
DOCKER SERVICES
Container | Image | Port
craig-postgres | postgres:15 | 5432 (localhost only)
craig-rabbit | rabbitmq:3-management | 5672 / 15672
craig-redis | redis:7-alpine | 6379
craig-ollama | ollama/ollama | 11434
craig-prometheus | prom/prometheus | 9090
craig-grafana | grafana/grafana | 3000
SYSTEMD SERVICES
Unit | Command
craig-auth | uvicorn main:app --host 127.0.0.1 --port 8000
craig-api-semantic | uvicorn main:app --host 127.0.0.1 --port 8001
craig-api-vision | uvicorn vision_service:app --host 127.0.0.1 --port 8002
craig-worker | python start_worker.py
mcp-postgres (optional) | supergateway --port 8010 --stdio "npx -y @modelcontextprotocol/server-postgres …"
MONITORING
  • Prometheus scrapes metrics from all FastAPI services via the /metrics endpoint (prometheus-fastapi-instrumentator).
  • The worker exposes custom metrics on port 9092: craig_jobs_processed_total labelled by status (success/error).
  • Grafana is accessible at craig.monitoring.weesoft.fr. Authentication is delegated to craig-auth via Caddy forward_auth.
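For a quick check outside Grafana, the worker's counter can be read from the Prometheus text exposition format. The sketch below parses a made-up sample: the numbers and the HELP text are invented for illustration, and only the metric name and its status label come from the list above.

```python
import re
from typing import Dict

# Invented sample of what the worker's metrics endpoint might return.
SAMPLE = '''# HELP craig_jobs_processed_total Jobs processed by the worker
# TYPE craig_jobs_processed_total counter
craig_jobs_processed_total{status="success"} 41.0
craig_jobs_processed_total{status="error"} 3.0
'''

_LINE = re.compile(
    r'^craig_jobs_processed_total\{status="([^"]+)"\}\s+([0-9.eE+-]+)$'
)


def jobs_by_status(exposition: str) -> Dict[str, float]:
    """Extract the per-status counter values from exposition text."""
    counts: Dict[str, float] = {}
    for line in exposition.splitlines():
        match = _LINE.match(line.strip())
        if match:
            counts[match.group(1)] = float(match.group(2))
    return counts


def error_rate(counts: Dict[str, float]) -> float:
    """Share of processed jobs that ended in error (0.0 when no jobs yet)."""
    total = sum(counts.values())
    return counts.get("error", 0.0) / total if total else 0.0


parsed = jobs_by_status(SAMPLE)
print(parsed, round(error_rate(parsed), 3))
```

The regular expression assumes status is the only label on the metric; additional labels would need a more general parser.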
USEFUL COMMANDS
# Check all service statuses
systemctl status craig-auth craig-api-semantic craig-api-vision craig-worker

# View worker logs in real time
journalctl -u craig-worker -f

# Restart all Python services
systemctl restart craig-auth craig-api-semantic craig-api-vision craig-worker

# Pull a new Ollama model
docker exec craig-ollama ollama pull mistral

# Check MCP Postgres service
systemctl status mcp-postgres

# View all analysis results stored via MCP
docker exec craig-postgres psql -U craig_admin -d craig \
-c "SELECT id, document_nom, question, created_at FROM craig_resultats ORDER BY created_at DESC LIMIT 10;"
BENCHMARK
The C.R.A.I.G benchmark suite runs the catalog pipelines against a reference set of fixtures and grades each result by structural correctness (JSON validity, field coverage) and ground-truth value matching.
LATEST RUN — 12 MAY 2026
  • Average score: 88.7 / 100
  • Perfect (100): 21
  • Partial (60-99): 2
  • Weak (<60): 4

27 entries tested — 0 errors, 0 skipped — auto-retry active on 11 entries (avg 2.5 iterations).

PER-ENTRY RESULTS
Test | Category | Score | Coverage | Note
Bank Statement (2 months) | Finance | 100 | 100% | -
Invoice Analysis (2 pages) | Accounting | 97 | 100% | -
Contract Analysis (2 pages) | Legal | 88 | 100% | -
Finishes Schedule (multi-page) | Technical | 100 | 100% | -
CV / Resume (2 pages) | HR | 80 | 60% | missing email, experience
Pay Slip (2 months) | HR | 49 | 33% | missing employee_name, period
Detect Document Type (2 docs) | Sort | 100 | 100% | -
Finance Pack (3 docs) | Sort | 100 | 100% | -
Mixed Pack (CV + Contract + Payslip) | Sort | 100 | 100% | -
Gift Card — Anonymize (2 cards) | Anonymization | 70 | 100% | -
SEPA Transfer Order | Finance | 100 | 100% | -
Purchase Order | Accounting | 100 | 100% | -
Expense Report | Accounting | 100 | 100% | -
Extrait KBIS | Legal | 100 | 100% | -
Product Datasheet | Technical | 100 | 100% | -
Invoice Reconciliation vs Database | MCP | 100 | 100% | -
Lettre manuscrite | Manuscripts | 100 | 100% | -
Carte régionale PACA | Cartography | 100 | 100% | -
Carte de France | Cartography | 100 | 100% | -
Personal Data — GDPR Anonymization | Anonymization | 100 | 100% | -
Architectural Floor Plan | Technical | 50 | 0% | missing project_name, scale, area
Relevé de compte bancaire (FR) | Finance | 100 | 100% | -
Virement SEPA (FR) | Finance | 100 | 100% | -
Analyse de facture (FR) | Accounting | 100 | 100% | -
Plan d'appartement (FR) | Technical | 63 | 33% | missing project_name, scale
Extract + Store in PostgreSQL | MCP | 50 | 0% | missing meta fields
Full pipeline: OCR + Embed + Store | MCP | 50 | 0% | missing meta fields
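The Coverage column in the table is consistent with "expected fields present divided by expected fields" (for example, 2 of 3 fields missing gives 33%). The sketch below computes that ratio under this reading; it is an inference from the table, not the benchmark's actual grading code, and the field names other than those in the Note column are hypothetical.

```python
from typing import Any, Dict, List, Tuple


def field_coverage(result: Dict[str, Any],
                   expected_fields: List[str]) -> Tuple[int, List[str]]:
    """Return (coverage percentage, missing field names) for one result."""
    if not expected_fields:
        return 100, []
    # A field counts as missing when absent or empty.
    missing = [
        name for name in expected_fields
        if result.get(name) in (None, "", [], {})
    ]
    present = len(expected_fields) - len(missing)
    return round(100 * present / len(expected_fields)), missing


# Hypothetical pay slip result: net_pay is an invented third expected field.
pay_slip = {"employee_name": None, "period": "", "net_pay": "1850.00"}
print(field_coverage(pay_slip, ["employee_name", "period", "net_pay"]))
```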
HOW TO RUN

From a machine with Python 3 and access to the API:

cd craig-api-semantic/benchmark
pip install httpx cairosvg

# Auth via JWT token
export CRAIG_TOKEN="<your_jwt>"

# Smoke test on MCP only (~4 min)
python run_benchmark.py --category MCP --include-mcp --output mcp_smoke.json

# Full benchmark with MCP (~25 min)
python run_benchmark.py --include-mcp --output full_with_mcp.json

The --include-hidden flag adds catalog entries marked hidden:true (skipped by default).

HARDWARE
C.R.A.I.G is fully self-hosted — no third-party AI APIs (OpenAI, Anthropic, etc.) required. All inference runs locally through Ollama on a GPU-equipped Linux host. The configuration below describes the minimum viable production setup, currently in use.
MINIMUM CONFIGURATION (CURRENT PRODUCTION)
Component | Specification | Notes
CPU | 4 vCPU (Intel/AMD x86_64) | e.g. GCP n1-standard-4
RAM | 15 GB | 16 GB recommended
GPU | 1× NVIDIA Tesla T4 (16 GB VRAM) | or equivalent: L4 (24 GB), A10, V100, RTX 3090/4090
Disk | 100 GB SSD (pd-balanced) | ~25 GB for models, ~30 GB for OS + containers
OS | Ubuntu 22.04 LTS (kernel 6.8+) | or Debian 12, RHEL 9
Network | Public IP (static recommended) | DNS for 4 subdomains, automatic TLS via Caddy
SOFTWARE STACK
  • NVIDIA Driver — 535+ for T4, 550+ for L4
  • Docker + Compose — container runtime
  • nvidia-container-toolkit — GPU access from containers
  • Python 3.11 — backend services (managed via venv)
  • PostgreSQL 15 — application database (Docker)
  • Redis 7 — job state and dynamic MCP connectors (Docker)
  • RabbitMQ 3 — job queue for the worker (Docker)
  • Ollama 0.6+ — local LLM runtime (Docker, GPU-enabled)
  • Caddy 2 — reverse proxy with automatic Let's Encrypt TLS
AI MODELS (LOADED IN OLLAMA, ~25 GB TOTAL)
Model | Role | Size
minicpm-v | Vision (OCR, document understanding) | 5.5 GB
mistral-nemo:12b | Text (structuring, delivery, chat) | 7.1 GB
qwen2.5vl | Vision alternative (cartography, plans) | 6.0 GB
nomic-embed-text | Semantic embeddings (RAG) | 0.3 GB
EQUIVALENT CLOUD CONFIGURATIONS
Provider | Instance | GPU
Google Cloud (current) | n1-standard-4 + T4 | NVIDIA Tesla T4 (16 GB)
Google Cloud (upgrade) | g2-standard-4 + L4 | NVIDIA L4 (24 GB), ~2× faster
AWS | g4dn.xlarge | NVIDIA Tesla T4 (16 GB)
Azure | NC4as T4 v3 | NVIDIA Tesla T4 (16 GB)
OVH / Scaleway | GPU instance with T4 / L4 | Equivalent EU options
On-premise | Workstation / server | RTX 3090/4090, A4000+, or 2× RTX 3060
COST REFERENCE (PREEMPTIBLE / SPOT)
  • The current production VM runs in preemptible mode on GCP (≈€80/month), trading stability (max 24h uptime, 30s termination notice) for ~65% cost savings.
  • A standard non-preemptible setup costs ≈€250/month for the same hardware.
  • Recommendation: preemptible is fine for staging/dev workloads or production with an auto-restart mechanism (Cloud Scheduler hitting the start endpoint every 5 minutes).
PIPELINE BUILDER
Pipeline Builder is the unified workspace inside the Lab sidebar where you compose a C.R.A.I.G pipeline. It merges the conversational AI assistant ("Ask C.R.A.I.G") and the visual step editor (formerly "Pipeline Creator") into a single panel, accessible from the sidebar with a single icon.
TWO COMPLEMENTARY MODES
  • Chat mode — describe what you want to extract in plain language. The assistant generates a complete pipeline (sort, OCR, classify, structure_llm, embed, MCP…) and renders it as a chip list inside the conversation bubble.
  • Editor mode — the classic step-by-step visual editor. Add, remove, reorder steps; edit each step's parameters (model, engine, prompt, source, conditions).
  • A toggle at the top of the panel switches between the two modes at any time; the conversation history is preserved.
WORKFLOW
Persona | Typical workflow
Business user | Type the request in Chat mode → click APPLY THIS PIPELINE. Done.
Hybrid user | Generate a draft in Chat → click EDIT to switch to Editor mode with the pipeline pre-filled → adjust 1-2 steps → APPLY.
Power user | Open Editor mode directly → build the pipeline manually step by step (same UX as the legacy Pipeline Creator).
PIPELINE PREVIEW (CHAT MODE)
  • Each generated pipeline is shown as a colored chip list (e.g. 1. SORT · florence-2-base, 2. OCR · paddleocr, 3. CLASSIFY, 4. LLM · mistral-nemo) — friendlier than raw JSON.
  • The raw JSON is still available via a collapsible ▸ View JSON details block.
  • Two action buttons: EDIT (open in Editor mode) and APPLY THIS PIPELINE →.
  • Validation badge: green check (Pipeline ready) or red triangle (Pipeline invalid), plus the model that generated it.
CONVERSATIONAL REFINEMENT
  • The chat keeps full context: ask follow-up questions like "add a step to save the result in PostgreSQL via MCP" or "change step 4 to use qwen instead of mistral" and the assistant produces an updated pipeline.
  • The conversation can be cleared with the Reset button (top-right).
LICENSE
C.R.A.I.G is distributed under the PolyForm Noncommercial License 1.0.0, a source-available license that allows free personal, educational, research and other non-commercial use, while reserving commercial rights to WeeSoft.
AT A GLANCE
You may | You may not
Use, install, copy and modify C.R.A.I.G for personal, educational or research purposes. | Resell C.R.A.I.G or any modified version of it.
Self-host the platform inside your organisation for non-commercial use. | Embed C.R.A.I.G into a commercial product or paid service.
Distribute the source to others under the same license. | Offer C.R.A.I.G as a hosted SaaS to third parties.
Submit pull requests and improvements (under the same license). | Remove or alter the license notices in the code.
OFFICIAL LICENSE TEXT

The full, legally binding text of the PolyForm Noncommercial License 1.0.0 is published by the PolyForm Project. Always refer to the official document — the table above is a plain-English summary, not the legal text.

COMMERCIAL USE

Any commercial use of C.R.A.I.G — including, but not limited to, reselling, embedding it inside a commercial product, or providing it as a hosted service to third parties — requires a separate commercial license from WeeSoft.

To request a commercial license, write to: contact@weesoft.fr.

PROFESSIONAL SERVICES (SEPARATE FROM LICENSING)
  • Deployment & installation on your infrastructure (on-premise or private cloud).
  • Configuration & integration with your existing systems (databases, ERPs, document repositories) via the MCP layer.
  • Custom pipelines tailored to your document types (invoices, contracts, technical plans, audio…).
  • Training & onboarding for your teams (Lab usage, prompt design, monitoring).
  • Production support with defined SLA, monitoring and incident response.

These services are billed independently of the underlying license and are available for both non-commercial and commercial users. Contact: contact@weesoft.fr.

CONTRIBUTING
  • Pull requests are welcome — by contributing you agree that your contribution is licensed under the same PolyForm Noncommercial License 1.0.0.
  • For larger changes, please open an issue first to discuss the scope.
DISCLAIMER

This page provides a high-level summary of the C.R.A.I.G licensing model. It is not legal advice and does not replace the official PolyForm Noncommercial License 1.0.0 text. In case of any discrepancy between this summary and the official license, the official license text prevails.