
Open AI Integrations — when Microsoft Copilot doesn't fit.

Microsoft Copilot is an excellent answer to most office knowledge work. But not every AI use case can be mapped to it — when sovereignty, cost, or specialized models come into play, it's worth looking at the open LLM ecosystem.

  • OpenAI · Claude · Mistral · Aleph Alpha
  • RAG architectures with pgvector, Qdrant
  • EU AI Act · Art. 4 training duty since 2025
  • Local LLMs on Ollama as an option

Three reasons against Copilot

When to choose open AI over Microsoft Copilot.

We recommend Copilot in most cases: it's well integrated, sits safely behind the Microsoft 365 tenant boundary, and is immediately available to knowledge workers. But in three typical situations, open AI is the better answer.

Sovereignty

You are an association, an educational institution, a public-sector body, or a mid-sized company that is sensitive about data under US jurisdiction. Mistral (EU-hosted) or Aleph Alpha (Germany) offers a clean answer beyond the reach of the US CLOUD Act. Hosting at OVHcloud, Hetzner, or STACKIT.

Cost control at scale

Microsoft 365 Copilot costs around €30 net per user per month. With 1,000 employees, that's €360,000 per year. For many intensive use cases (customer-service pipeline, automated document analysis), a direct LLM API integration is significantly cheaper — typically 30–60% of the license cost.
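The arithmetic above can be sketched in a few lines. The license price, the headcount, and the 30–60% band come from the text; the function names are illustrative assumptions, and the result is a rough orientation, not a quote.

```python
# Figure from the text: about EUR 30 net per user per month.
COPILOT_EUR_PER_USER_MONTH = 30


def copilot_annual_cost(users: int, eur_per_user_month: float = COPILOT_EUR_PER_USER_MONTH) -> float:
    """Annual license cost for a given number of users."""
    return users * eur_per_user_month * 12


def api_cost_range(license_cost: float, low_share: float = 0.30, high_share: float = 0.60) -> tuple[float, float]:
    """Typical direct-API cost band as a share of the license cost (30-60% per the text)."""
    return license_cost * low_share, license_cost * high_share


license_cost = copilot_annual_cost(1_000)
low, high = api_cost_range(license_cost)
print(f"Copilot licenses: EUR {license_cost:,.0f} per year")
print(f"Direct API integration: roughly EUR {low:,.0f} to EUR {high:,.0f} per year")
```

The band only holds for the intensive, pipeline-style use cases named above. For general office work across all employees, the comparison looks different.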

Specialized model choice

Embedding models for semantic search. Vision models for structured document extraction. Audio models (Whisper, Deepgram) for transcription. Coding models (Claude Sonnet) for code generation. Here you need access to specific models Copilot doesn't expose.

Model selection

Which model for which task — an honest overview.

Model choice is not an ideological but a pragmatic decision. A task needs the right tool, not the politically correct one.

Provider / Model | Strengths | Hosting | Price indication
OpenAI · GPT-4o, GPT-4.1 | All-rounder, excellent multi-modal support, huge ecosystem | USA, EU region via Azure OpenAI | from approx. $2.50 / 1M input tokens
Anthropic Claude · Sonnet, Opus | Reasoning, coding, long contexts (1M tokens), safety tuning | USA, AWS Bedrock EU | from approx. $3 / 1M input tokens
Mistral · Large, Small | EU provider, good multilingual support, competitive open-source models | France (Mistral), AWS Bedrock EU | from approx. $2 / 1M input tokens
Aleph Alpha · Pharia | German provider, public-sector-oriented, focus on EU compliance | Germany (Heidelberg) | individual, license-based
Local LLMs · Llama, Qwen, Mistral | Fully isolated operation, no external API costs | Own infrastructure, ideally with GPU | Only hardware cost, from approx. €800 per month (Hetzner GPU)

Price indications are rounded list prices per provider as of early 2026. Volume discounts and EU-specific terms aren't reflected here; we calculate per project.
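To turn a per-token list price into a monthly figure, a rough estimate like the following helps. The three prices are the input-token indications from the table; the workload (requests per day, tokens per request) is a hypothetical example. Output tokens are billed separately and are not included here.

```python
# Input-token list prices from the table above, in USD per 1M input tokens.
INPUT_PRICE_USD_PER_MTOK = {
    "openai": 2.50,
    "anthropic": 3.00,
    "mistral": 2.00,
}


def monthly_input_cost(provider: str, requests_per_day: int,
                       input_tokens_per_request: int, days: int = 30) -> float:
    """Rough monthly input-token cost in USD; output tokens are NOT included."""
    tokens = requests_per_day * input_tokens_per_request * days
    return tokens / 1_000_000 * INPUT_PRICE_USD_PER_MTOK[provider]


# Hypothetical workload: 5,000 requests per day with 2,000 input tokens each.
for name in INPUT_PRICE_USD_PER_MTOK:
    print(name, monthly_input_cost(name, 5_000, 2_000))
```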

Four typical use cases

Where open AI creates value for mid-sized companies today.

RAG with your own documents

You have a knowledge base in SharePoint, Confluence, or a contract archive. We build a retrieval pipeline that generates answers from your corpus, with source references, permission filtering, and an audit log. Typically 6–12 weeks to a production pipeline.
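A minimal sketch of the retrieval step, assuming chunks are already embedded and each carries the groups allowed to read it. The in-memory list, the toy two-dimensional vectors, and all names (`Chunk`, `retrieve`, `build_prompt`) are illustrative assumptions. A real pipeline would use a vector store and an embedding model.

```python
import math
from dataclasses import dataclass


@dataclass
class Chunk:
    source: str               # document reference shown to the user
    text: str
    embedding: list[float]
    allowed_groups: set[str]  # who may see this chunk


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec, chunks, user_groups, k=3):
    """Filter by permission first, then return the top-k matches, best first."""
    visible = [c for c in chunks if c.allowed_groups & user_groups]
    return sorted(visible, key=lambda c: cosine(query_vec, c.embedding), reverse=True)[:k]


def build_prompt(question, hits):
    """Number the sources so the model can cite them as [n]."""
    context = "\n".join(f"[{i}] ({c.source}) {c.text}" for i, c in enumerate(hits, 1))
    return f"Answer only from the sources below and cite them as [n].\n{context}\n\nQuestion: {question}"


# Toy corpus with made-up content.
chunks = [
    Chunk("contracts/nda.pdf", "The NDA term is 3 years.", [1.0, 0.0], {"legal", "sales"}),
    Chunk("hr/salaries.xlsx", "Salary bands by level.", [0.9, 0.1], {"hr"}),
    Chunk("wiki/onboarding", "Laptops are ordered via IT.", [0.0, 1.0], {"sales", "hr"}),
]
hits = retrieve([1.0, 0.0], chunks, user_groups={"sales"}, k=2)
print(build_prompt("How long does the NDA run?", hits))
```

Filtering before ranking means a restricted document never reaches the prompt, even when it is the closest match.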

Customer-service bots

First-contact automation for standard inquiries, with seamless escalation to a human when the bot reaches its limits. Integration with Microsoft Dynamics 365 Customer Service, Intercom, Zendesk, or a custom frontend. Automated and human responses stay clearly separated.

Code assistance

For internal engineering teams: integration with GitHub Copilot Enterprise, Claude Code, Cursor, or a custom-built workflow. Including repository-specific context wiring, audit logs, and compliance setup. We use this ourselves — and advise from experience.

Content pipelines

Structured generation of product descriptions, translations, marketing copy. With prompt templates, quality gate, human review step, A/B tests. Typical for e-commerce scaling or multilingual association communications.
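The chain of prompt template, quality gate, and human review can be sketched as follows. The template wording, the banned-phrase list, and the status labels are illustrative assumptions. The LLM call is passed in as a function so the sketch stays provider-neutral.

```python
from string import Template

PROMPT = Template("Write a $language product description for '$product' in at most $max_words words.")


def quality_gate(text: str, max_words: int,
                 banned: tuple[str, ...] = ("guarantee", "best in the world")) -> list[str]:
    """Return a list of problems; an empty list means the draft may go to human review."""
    problems = []
    if not text.strip():
        problems.append("empty")
    if len(text.split()) > max_words:
        problems.append("too long")
    problems += [f"banned phrase: {b}" for b in banned if b in text.lower()]
    return problems


def process(product: str, language: str, generate, max_words: int = 60) -> dict:
    """Fill the template, call the model, and route the draft by gate result."""
    prompt = PROMPT.substitute(product=product, language=language, max_words=max_words)
    draft = generate(prompt)  # any LLM call; hypothetical here
    problems = quality_gate(draft, max_words)
    status = "rejected" if problems else "needs_human_review"
    return {"prompt": prompt, "draft": draft, "status": status, "problems": problems}


# Stand-in for a real model call.
result = process("steel bottle", "English", lambda p: "A sturdy steel bottle for daily use.")
print(result["status"])
```

Note that a draft which passes the gate is not published automatically. It only moves on to the human review step.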

EU AI Act — what we build in

Compliance is part of the architecture, not a retrofitted PDF.

The EU AI Act has been in force since August 2024, and its obligations apply in stages:

  • February 2025: Article 4 applies, the training duty (AI literacy) for all employees who use AI systems at work. This isn't about compulsory click-through slide decks, but about demonstrable AI competence per role.
  • August 2025: Obligations for general-purpose AI providers (OpenAI, Anthropic, Mistral) apply. This affects you indirectly through your contracts with them.
  • August 2026: Obligations for high-risk AI systems apply fully. The fine framework takes effect: up to €35 million or 7% of global annual revenue, whichever is higher.
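The "whichever is higher" rule for the upper fine limit is a simple maximum. The sketch below only restates the two figures named above; the function name is an assumption, and the result is the ceiling of the framework, not a forecast of an actual fine.

```python
def max_fine_eur(global_annual_revenue_eur: float) -> float:
    """Upper limit from the text: EUR 35 million or 7% of global annual revenue, whichever is higher."""
    return max(35_000_000, 0.07 * global_annual_revenue_eur)


# Below EUR 500 million revenue the fixed amount is the higher value; above it, the 7% share.
print(max_fine_eur(100_000_000))
print(max_fine_eur(1_000_000_000))
```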

We build compliance into every AI integration:

  • Inventory of all AI systems with use-case description and risk classification
  • Training concept for affected roles (in collaboration with your HR/compliance)
  • Audit logs at the API request level
  • Model datasheets with clear notes on model origin, training, and limitations
  • Data protection impact assessment under GDPR Art. 35, where required
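As a sketch of what the inventory and request-level audit log from the list above could look like in code: one record per AI system, one JSON line per API request. Field names, risk labels, and the example system are illustrative assumptions. The log stores sizes instead of prompt content to keep personal data out of the log, which is a design choice, not a requirement stated here.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass
class AISystemRecord:
    name: str
    use_case: str
    risk_class: str           # e.g. "minimal", "limited", "high"; labels are an assumption
    model: str
    trained_roles: list[str]  # roles covered by the training concept


def audit_entry(system: AISystemRecord, user_id: str, prompt_chars: int, response_chars: int) -> str:
    """Return one JSON line describing a single API request."""
    return json.dumps({
        "ts": datetime.now(timezone.utc).isoformat(),
        "system": system.name,
        "model": system.model,
        "risk_class": system.risk_class,
        "user": user_id,
        "prompt_chars": prompt_chars,
        "response_chars": response_chars,
    })


# Hypothetical inventory with a single system.
inventory = [AISystemRecord("contract-search", "RAG over the contract archive", "limited",
                            "mistral-large", ["legal"])]
print(json.dumps([asdict(r) for r in inventory], indent=2))
print(audit_entry(inventory[0], "u-17", 420, 1300))
```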

Before implementation. For many mid-sized companies, a combined inventory and training concept is the sensible first step, even without a new implementation. More under AI Governance & EU AI Act.

Further

Where AI integrations dock into the Microsoft ecosystem and your own.

FAQ

What clients ask before the architecture call.

When should we choose open AI over Microsoft Copilot?

Three typical reasons: sovereignty (Mistral or Aleph Alpha for strict EU requirements), cost control (at many thousands of requests per day, Copilot becomes more expensive than a self-orchestrated setup), and special model requirements (embedding models for search, vision models for document processing, audio transcription).

Which AI models do you use?

OpenAI (GPT-4o, GPT-4.1) for general tasks. Anthropic Claude for complex reasoning and coding tasks. Mistral Large/Small (EU-hosted at Mistral or via AWS Bedrock EU) for sovereign setups. Aleph Alpha for German public-sector-oriented projects. Local LLMs (Llama, Qwen, Mistral) on Ollama for fully isolated environments.

What is RAG, and do I need it?

Retrieval-augmented generation: you combine an LLM with your own documents so the model generates answers from your knowledge base — not from generic internet knowledge. For internal knowledge bases, contract search, customer-service bots, RAG is today's standard architecture. We typically use PostgreSQL with pgvector or Qdrant for vector search.
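With PostgreSQL and pgvector, the vector search itself is a single SQL query. The sketch below only builds the query and its parameters. The table `chunks` and its columns (`source`, `content`, `embedding`, `acl_group`) are hypothetical names, and the `%(name)s` placeholders assume a psycopg-style driver. The `<=>` operator is pgvector's cosine distance.

```python
def pgvector_search(query_embedding: list[float], user_groups: list[str], k: int = 5):
    """Build a parameterized similarity query against a hypothetical `chunks` table."""
    sql = (
        "SELECT source, content, embedding <=> %(q)s::vector AS distance "
        "FROM chunks "
        "WHERE acl_group = ANY(%(groups)s) "       # permission filter inside the query
        "ORDER BY embedding <=> %(q)s::vector "    # smallest cosine distance first
        "LIMIT %(k)s"
    )
    params = {
        "q": "[" + ",".join(str(x) for x in query_embedding) + "]",  # pgvector text format
        "groups": user_groups,
        "k": k,
    }
    return sql, params


sql, params = pgvector_search([0.1, 0.2], ["sales"], k=3)
print(sql)
print(params)
```

The pair would then be passed to the driver, for example `cursor.execute(sql, params)`, and the returned rows carry the source references for the answer.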

What does an AI integration cost?

We scope and price a discovery spike (2–4 weeks) together with you. A production RAG pipeline over your document corpus is offered within a fixed-price range. Ongoing LLM provider API costs are separate, typically €200 to €4,000 per month depending on volume and model choice.

What does the EU AI Act mean for my company?

Since February 2025, Article 4 of the EU AI Act applies: a training duty for all employees who use AI systems at work. From August 2026, the fine framework takes effect, with fines of up to €35 million or 7% of global annual revenue, whichever is higher. We help with inventory, training design, and documentation. More under AI Governance & EU AI Act.

Can you also host LLMs on-prem?

Yes, with caveats. Local models like Llama 3, Qwen, Mistral can run on your own hardware (ideally with a GPU) via Ollama or vLLM. Quality, however, is significantly below GPT-4 or Claude. For highly sensitive use cases (full data isolation) it's a valid option — not recommended for general knowledge work.
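For orientation, this is roughly what a call to a locally running Ollama server looks like, using only the Python standard library. The model name `llama3` and the default host are assumptions that depend on your installation. The sketch separates building the request from sending it, so the payload can be inspected without a running server.

```python
import json
import urllib.request


def ollama_request(prompt: str, model: str = "llama3",
                   host: str = "http://localhost:11434") -> urllib.request.Request:
    """Prepare a non-streaming call to Ollama's /api/generate endpoint."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")
    return urllib.request.Request(
        f"{host}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, **kwargs) -> str:
    """Send the request and return the generated text (requires a running Ollama server)."""
    with urllib.request.urlopen(ollama_request(prompt, **kwargs), timeout=120) as resp:
        return json.loads(resp.read())["response"]


req = ollama_request("Summarize this contract clause in one sentence.")
print(req.full_url, req.get_method())
```

No data leaves the machine in this setup, which is the point of the fully isolated option.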

How do Microsoft Copilot and open AI combine?

They aren't mutually exclusive. Microsoft Copilot covers the standard Office world (email summaries, Teams notes, Word/Excel). Open AI integrations extend that with use cases Copilot doesn't serve — industry-specific RAG applications, customer-service bots, code assistance with specific models. In many mid-sized companies, both run in parallel.

45 min · free · no obligation

Book an architecture conversation.

Bring your specific AI use case. We spend 45 minutes on it together: which model? Hosted where? Which compliance constraints? How much effort until production? You talk to the person who will later build it. Honest answers, even if the use case doesn't hold up.

Accompanying services

What typically runs alongside this engineering work.

Engineering projects rarely stand alone — license logic, architecture clarification, quality gates, knowledge transfer, and follow-on operations usually run in parallel. Below are the most common accompanying services we add via discovery spikes, fixed-price sprints, or application-care contracts.

Before · Architecture

Advisory & Architecture

Before implementation: tenant structure, data model, security concept, integration mapping. The result is an architecture document any engineering team can continue working with, even a team other than ours.


Before · CSP

License Advisory & CSP

Which license bundles for which users, which add-on SKUs are necessary, where you're over- or under-licensed. Sourced as a Microsoft Licensing Partner — with the option to use CSP purely for control without margin maximization.


During · Quality gate

Project Assurance

Independent second opinion during an active implementation project — whether we run it or another partner does. CMMI-based quality gates, risk reviews, fixed price per gate.

During · Adoption

Training & learning program

Not the classic 2-day workshop forgotten after a week — but a dynamic learning program over 4–6 weeks with initial training, application phases, and follow-up sessions. Training matrix for roles and topics.


After · Operations

Application Care

After go-live: a predictable application-care contract with a monthly flat rate, SLA-based. Includes releases, hotfixes, extensions, tenant hardening, and continuous support instead of mere ticket handling.


After · Knowledge

Knowledge Recovery

When the original developers are gone, the previous partner is no longer reachable, or the documentation is stale — reverse engineering of the existing solution with a documented result: code map, data model, customization inventory.
