Independent Engineering · Open AI Integrations
Microsoft Copilot is an excellent answer to most office knowledge work. But not every AI use case can be mapped to it — when sovereignty, cost, or specialized models come into play, it's worth looking at the open LLM ecosystem.
Three reasons against Copilot
We recommend Copilot in most cases: it is well integrated, sits safely inside the Microsoft 365 tenant boundary, and is immediately available to knowledge workers. In three typical situations, however, open AI is the better answer.
You are an association, educational institution, public-sector body, or mid-sized company that is sensitive about data falling under US jurisdiction. Mistral (EU-hosted) or Aleph Alpha (Germany) gives a clean answer outside the reach of the US CLOUD Act. Hosting options include OVHcloud, Hetzner, or STACKIT.
Microsoft 365 Copilot costs around €30 net per user per month. With 1,000 employees, that's €360,000 per year. For many intensive use cases (customer-service pipeline, automated document analysis), a direct LLM API integration is significantly cheaper — typically 30–60% of the license cost.
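The cost comparison above can be checked with a few lines of arithmetic. This is a minimal sketch using only the figures from the text (about €30 per user per month, 1,000 employees, 30–60% of the license cost); the function names are illustrative, and real API spend depends on volume and model choice.

```python
# Worked arithmetic for the license-vs-API comparison.
# Inputs come from the text: ~EUR 30 per user per month, 1,000 employees,
# and an API integration at typically 30-60% of the license cost.

def annual_license_cost(price_per_user_month: float, users: int) -> float:
    """Yearly license spend for a flat per-seat price."""
    return price_per_user_month * users * 12


def api_cost_range(license_cost: float, low: float = 0.30,
                   high: float = 0.60) -> tuple[float, float]:
    """Yearly API spend if it lands between low and high of the license cost."""
    return license_cost * low, license_cost * high


license_cost = annual_license_cost(30.0, 1_000)
api_low, api_high = api_cost_range(license_cost)

print(f"License: EUR {license_cost:,.0f} per year")
print(f"API:     EUR {api_low:,.0f} to {api_high:,.0f} per year")
```

For these inputs the license side comes to €360,000 per year and the API side to roughly €108,000 to €216,000 per year.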
Embedding models for semantic search. Vision models for structured document extraction. Audio models (Whisper, Deepgram) for transcription. Coding models (Claude Sonnet) for code generation. Here you need access to specific models Copilot doesn't expose.
Model selection
Model choice is a pragmatic decision, not an ideological one. A task needs the right tool, not the politically correct one.
| Provider / Model | Strengths | Hosting | Price indication |
|---|---|---|---|
| OpenAI · GPT-4o, GPT-4.1 | All-rounder, excellent multi-modal support, huge ecosystem | USA, EU region via Azure OpenAI | from approx. $2.50 / 1M input tokens |
| Anthropic Claude · Sonnet, Opus | Reasoning, coding, long contexts (1M tokens), safety tuning | USA, AWS Bedrock EU | from approx. $3 / 1M input tokens |
| Mistral · Large, Small | EU provider, good multilingual support, competitive open-source models | France (Mistral), AWS Bedrock EU | from approx. $2 / 1M input tokens |
| Aleph Alpha · Pharia | German provider, well suited to the public sector, focus on EU compliance | Germany (Heidelberg) | individual, license-based |
| Local LLMs · Llama, Qwen, Mistral | Fully isolated operation, no external API costs | Own infrastructure, ideally with GPU | Only hardware cost, from approx. €800 per month (Hetzner GPU) |
Prices as of early 2026, rounded from each provider's list prices. Volume discounts and EU-specific terms aren't reflected here; we calculate these per project.
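To get a feel for what the per-token prices mean at volume, the input-token figures from the table can be turned into a rough monthly estimate. This is a sketch under stated assumptions: the workload (5,000 requests per day, 2,000 input tokens each) is hypothetical, the prices are the "from approx." entry points in the table, and output tokens are left out because the table does not list them, so the result is a lower bound rather than a full bill.

```python
# Input-token list prices from the table (USD per 1M input tokens).
# Output-token prices are not in the table and are deliberately omitted.
PRICE_PER_MILLION_INPUT_USD = {
    "openai": 2.50,
    "anthropic": 3.00,
    "mistral": 2.00,
}


def monthly_input_cost(provider: str, requests_per_day: int,
                       input_tokens_per_request: int, days: int = 30) -> float:
    """Estimated monthly input-token cost in USD for one provider."""
    tokens = requests_per_day * input_tokens_per_request * days
    return tokens / 1_000_000 * PRICE_PER_MILLION_INPUT_USD[provider]


# Hypothetical workload: 5,000 requests per day, 2,000 input tokens each.
for name in PRICE_PER_MILLION_INPUT_USD:
    print(name, round(monthly_input_cost(name, 5_000, 2_000), 2))
```

That workload is 300 million input tokens per month, which works out to $750, $900, and $600 respectively at the listed entry prices.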
Four typical use cases
You have a knowledge base in SharePoint, Confluence, or a contract archive. We build a retrieval pipeline that generates answers from your corpus, with source references, permission filtering, and an audit log. Typically 6–12 weeks to a production-ready pipeline.
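The three properties named above (source references, permission filtering, audit log) can be illustrated with a toy retrieval step. This is a minimal sketch, not a production design: the `embed` function is a word-count stand-in for a real embedding model, the corpus lives in memory instead of a vector store, and all names, groups, and documents are hypothetical.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Chunk:
    source: str                # document reference shown alongside the answer
    text: str
    allowed_groups: frozenset  # groups permitted to read this chunk


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(query, chunks, user_groups, audit_log, top_k=2):
    """Return the best permitted chunks and record the access."""
    # Permission filter runs before ranking, so hidden text never reaches the model.
    visible = [c for c in chunks if c.allowed_groups & user_groups]
    q = embed(query)
    scored = sorted(((cosine(q, embed(c.text)), c) for c in visible),
                    key=lambda pair: pair[0], reverse=True)
    hits = [c for score, c in scored[:top_k] if score > 0]
    audit_log.append({"query": query, "sources": [c.source for c in hits]})
    return hits


def build_prompt(query, hits):
    """Assemble the LLM prompt with source tags for citation."""
    context = "\n".join(f"[{c.source}] {c.text}" for c in hits)
    return f"Answer only from the context and cite sources.\n\n{context}\n\nQuestion: {query}"


corpus = [
    Chunk("contracts/acme.pdf",
          "The Acme contract has a notice period of three months.",
          frozenset({"legal"})),
    Chunk("wiki/onboarding",
          "New employees receive a laptop on their first day.",
          frozenset({"legal", "staff"})),
]
log = []
question = "What is the notice period in the Acme contract?"
print([c.source for c in retrieve(question, corpus, frozenset({"legal"}), log)])
print([c.source for c in retrieve(question, corpus, frozenset({"staff"}), log)])
```

With this sample data, a user in the `legal` group gets `contracts/acme.pdf` as the only hit, while a user in the `staff` group gets no hits because the contract chunk is filtered out before ranking. Both lookups end up in `log`.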
First-contact automation for standard inquiries, with seamless escalation to humans when the bot reaches its limits. Integration with Microsoft Dynamics 365 Customer Service, Intercom, Zendesk, or a custom frontend, with a clear separation between automated and human responses.
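One simple way to picture the escalation rule is a confidence threshold in front of the bot's answer. The sketch below is an assumption, not a description of any specific product integration: where the confidence score comes from, the threshold value, and the `Reply` type are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Reply:
    text: str
    handled_by: str  # "bot" or "human", kept explicit for the customer and the log


def route(inquiry: str, bot_answer: str, confidence: float,
          threshold: float = 0.75) -> Reply:
    """Send the bot answer only when confidence clears the threshold."""
    if confidence >= threshold:
        return Reply(text=bot_answer, handled_by="bot")
    # Below the threshold the inquiry is handed over; the bot draft is not sent.
    return Reply(text=f"Forwarded to a colleague: {inquiry}", handled_by="human")


print(route("Where is my invoice?", "You can find it under Billing.", 0.92))
print(route("I want to dispute a charge.", "Please see our FAQ.", 0.40))
```

Keeping `handled_by` on every reply is what makes the separation between automated and human responses visible in the frontend and in reporting.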
For internal engineering teams: integration with GitHub Copilot Enterprise, Claude Code, Cursor, or a custom-built workflow. Including repository-specific context wiring, audit logs, and compliance setup. We use this ourselves — and advise from experience.
Structured generation of product descriptions, translations, marketing copy. With prompt templates, quality gate, human review step, A/B tests. Typical for e-commerce scaling or multilingual association communications.
EU AI Act — what we build in
The EU AI Act has been in force since August 2024, and its obligations apply in stages:
We build compliance into every AI integration:
Before implementation. For many mid-sized companies, a combined inventory and training concept is the first sensible step, even without a new implementation. More under AI Governance & EU AI Act.
Further reading
AI integration without a custom frontend is usually only half as effective. We build web platforms in which AI is cleanly embedded.
Before implementation: where is the highest AI leverage in your company? 45-min conversation or multi-day workshops.
When Microsoft Copilot is the right answer after all. Copilot adoption, Copilot Studio agents, AI governance.
Inventory, risk classification, AI literacy obligation (Art. 4): relevant even without an implementation project.
FAQ
Three typical reasons: sovereignty (Mistral or Aleph Alpha for strict EU requirements), cost control (with many thousands of requests per day, Copilot becomes more expensive than a self-orchestrated setup), and special model requirements (embedding models for search, vision models for document processing, audio transcription).
OpenAI (GPT-4o, GPT-4.1) for general tasks. Anthropic Claude for complex reasoning and coding tasks. Mistral Large/Small (EU-hosted at Mistral or via AWS Bedrock EU) for sovereign setups. Aleph Alpha for German public-sector projects. Local LLMs (Llama, Qwen, Mistral) on Ollama for fully isolated environments.
Retrieval-augmented generation: you combine an LLM with your own documents so the model generates answers from your knowledge base — not from generic internet knowledge. For internal knowledge bases, contract search, customer-service bots, RAG is today's standard architecture. We typically use PostgreSQL with pgvector or Qdrant for vector search.
We scope and price a discovery spike (2–4 weeks) together with you. A production RAG pipeline built on your document corpus is quoted as a fixed-price range. Ongoing LLM provider API costs are separate, typically €200 to €4,000 per month depending on volume and model choice.
Since February 2025, Article 4 of the EU AI Act applies: the AI literacy (training) obligation for all employees who use AI systems at work. From August 2026 a fine framework takes effect, with fines of up to €35 million or 7% of global annual revenue. We help with inventory, training design, and documentation. More under AI Governance & EU AI Act.
Yes, with caveats. Local models like Llama 3, Qwen, Mistral can run on your own hardware (ideally with a GPU) via Ollama or vLLM. Quality, however, is significantly below GPT-4 or Claude. For highly sensitive use cases (full data isolation) it's a valid option — not recommended for general knowledge work.
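For orientation, this is roughly what a call to a locally running Ollama server looks like using only the Python standard library. It is a sketch under assumptions: Ollama is running on its default port 11434, a model tagged `llama3` has already been pulled, and the helper names `build_request` and `generate` are illustrative.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_request(prompt: str, model: str = "llama3") -> urllib.request.Request:
    """Prepare a non-streaming generate request for a local Ollama server."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")
    return urllib.request.Request(
        OLLAMA_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = "llama3") -> str:
    """Send the request; needs a running Ollama instance with the model pulled."""
    with urllib.request.urlopen(build_request(prompt, model), timeout=120) as resp:
        return json.loads(resp.read())["response"]


if __name__ == "__main__":
    print(generate("Summarize in one sentence: data never leaves this machine."))
```

Because the request goes to `localhost`, no prompt or document text leaves your own infrastructure, which is the point of the fully isolated setup.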
They aren't mutually exclusive. Microsoft Copilot covers the standard Office world (email summaries, Teams notes, Word/Excel). Open AI integrations extend that with use cases Copilot doesn't serve — industry-specific RAG applications, customer-service bots, code assistance with specific models. In many mid-sized companies, both run in parallel.
45 min · free · no obligation
Bring your specific AI use case. We spend 45 minutes on it together: which model? Hosted where? Which compliance constraints? How much effort until production? You talk to the person who would later build it, and you get honest answers, even if the use case doesn't hold up.
Accompanying services
Engineering projects rarely stand alone — license logic, architecture clarification, quality gates, knowledge transfer, and follow-on operations usually run in parallel. Below are the most common accompanying services we add via discovery spikes, fixed-price sprints, or application-care contracts.
Before · Architecture
Before implementation: tenant structure, data model, security concept, integration mapping. The result is an architecture document any engineering team can continue working with — even one other than us.
Before · CSP
Which license bundles for which users, which add-on SKUs are necessary, where you're over- or under-licensed. Sourced as a Microsoft Licensing Partner — with the option to use CSP purely for control without margin maximization.
During · Quality gate
Independent second opinion during an active implementation project — whether we run it or another partner does. CMMI-based quality gates, risk reviews, fixed price per gate.
During · Adoption
Not the classic 2-day workshop forgotten after a week — but a dynamic learning program over 4–6 weeks with initial training, application phases, and follow-up sessions. Training matrix for roles and topics.
After · Operations
After go-live: a plannable application-care contract with a monthly flat rate, SLA-based. Includes releases, hotfixes, extensions, tenant hardening — and continuous accompaniment instead of mere ticket response.
After · Knowledge
When the original developers are gone, the previous partner is no longer reachable, or the documentation is stale — reverse engineering of the existing solution with a documented result: code map, data model, customization inventory.