§ 001 — Introduction
Semsudin Sefić
AI Systems Architect
Principal QA & DevOps
I ship production-grade AI systems
that actually work.
Multi-agent platforms, enterprise RAG, and the DevOps discipline to keep them standing. Eleven years turning prototypes into systems CTOs actually trust.
Currently building
- Alva multi-agent platform @ Alfa Laval
- AI-driven QA agents @ Mars Petcare
- CTO @ Qafana.com
11 yrs
In the field
50+
Engineers led
1000+
Tests shipped
750%
Pipeline speed-up
Shipped alongside
Enterprise · SaaS · MarTech · Manufacturing
Alfa Laval
Alva GenAI Platform · 7 yrs, 4 projects
Mars Petcare
Kinship · AdoptAPet · AI-driven QA
APCOA Group
QA Transformation Lead
ERS
Test Automation Architect · Core + Cyprus
SPACE Sweden
Foundational automation · Gaming
Qafana
CTO · Founder
§ 002 — Services
What I do
Three pillars. One thing:
AI systems that ship, and stay shipped.
01
AI Systems Architecture
Multi-agent platforms that survive production.
ReAct agents on LlamaIndex, 10+ LLMs orchestrated through a single interface, full RAG with document parsing, chunking, pgvector. End-to-end streaming. Proper observability via OpenTelemetry. The architecture decisions that separate a cool demo from a system your ops team doesn't hate.
Proof: Shipped Alva at Alfa Laval (React UI, .NET 8 orchestrator, Python agent, 10+ LLMs).
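One way to picture "10+ LLMs orchestrated through a single interface" is a small router that hides which backend serves a given model name and always streams. This is a minimal sketch, not Alva's actual code: `ModelRouter`, `fake_provider`, and the model names are hypothetical, and the providers are stand-ins that echo the prompt instead of calling a real SDK.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterator

# Hypothetical type: a provider is any callable that streams text chunks for a prompt.
StreamFn = Callable[[str], Iterator[str]]


@dataclass
class ModelRouter:
    """Single entry point that hides which backend serves a given model name."""

    providers: Dict[str, StreamFn] = field(default_factory=dict)

    def register(self, name: str, fn: StreamFn) -> None:
        self.providers[name] = fn

    def stream(self, model: str, prompt: str) -> Iterator[str]:
        # Generator: the lookup runs when the caller starts iterating.
        if model not in self.providers:
            raise KeyError(f"unknown model: {model}")
        yield from self.providers[model](prompt)

    def complete(self, model: str, prompt: str) -> str:
        # Non-streaming convenience built on top of the streaming path.
        return "".join(self.stream(model, prompt))


def fake_provider(tag: str) -> StreamFn:
    """Stand-in for a real SDK call (assumption); echoes the prompt word by word."""

    def run(prompt: str) -> Iterator[str]:
        for word in prompt.split():
            yield f"[{tag}]{word} "

    return run


router = ModelRouter()
router.register("model-a", fake_provider("a"))
router.register("model-b", fake_provider("b"))
```

Callers only ever see `router.stream(...)` or `router.complete(...)`, so swapping or adding a backend is a one-line `register` call rather than a change in every consumer.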
02
AI-Driven QA
Tests that write themselves, heal themselves, and ship themselves.
Custom AI agents that auto-generate test plans from UI exploration, create Playwright tests from tickets, self-heal failing tests, and move tickets through the pipeline. Integrated into the dev cycle to catch regressions, OWASP Top 10 vulnerabilities, and performance bottlenecks before QA even touches them.
Proof: 1000+ tests across 6 Mars Petcare apps. Authenticated test pass rate 20% → 100%.
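"Self-healing" can mean many things. One simple form is an ordered fallback over candidate selectors, flagging when the primary one no longer matches so the test can be repaired later. The sketch below assumes that form and is not the agents described above: `Resolution`, `resolve_selector`, and the selectors are hypothetical, and the `exists` callback stands in for a real browser check (in a Playwright suite it might wrap a locator count).

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Resolution:
    selector: str  # the selector that matched
    healed: bool   # True when the primary selector failed and a fallback matched


def resolve_selector(
    candidates: Sequence[str], exists: Callable[[str], bool]
) -> Optional[Resolution]:
    """Return the first candidate selector that matches, in priority order."""
    for index, selector in enumerate(candidates):
        if exists(selector):
            return Resolution(selector=selector, healed=index > 0)
    return None  # nothing matched: a genuine failure, not something to heal


# Hypothetical DOM snapshot: the id changed in a release, the test id survived.
dom = {"[data-testid=login-submit]", "button.primary"}
result = resolve_selector(
    ["#login-btn", "[data-testid=login-submit]", "text=Log in"],
    exists=lambda s: s in dom,
)
```

Here `result.healed` is true, which is the signal a suite could log or turn into a ticket so the primary selector gets updated instead of silently drifting.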
03
Enterprise DevOps
CI/CD pipelines that do the right thing by default.
Unified pipelines across 7 services with smart change detection and parallel deploys. Prod container registry isolation via image promotion. Bicep IaC with RBAC that actually works. Consolidated coverage reporting across .NET and Python. The boring infrastructure that makes the exciting stuff possible.
Proof: System Verification (300% stability gain, 750% faster test execution). Alva (unified 7-service pipeline).
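Change detection in a multi-service pipeline usually boils down to mapping changed file paths to the services that own them, with shared paths triggering everything. A minimal sketch under assumed names: the repo layout in `SERVICE_PATHS` and `SHARED_PATHS` is hypothetical, and the list of changed files would typically come from something like `git diff --name-only`.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical repo layout: service name -> path prefixes it owns.
SERVICE_PATHS: Dict[str, List[str]] = {
    "ui": ["src/ui/"],
    "orchestrator": ["src/orchestrator/"],
    "agent": ["src/agent/"],
}
# Hypothetical shared prefixes: a change here rebuilds every service.
SHARED_PATHS: List[str] = ["infra/", "shared/"]


def affected_services(changed_files: Iterable[str]) -> Set[str]:
    """Return the set of services that need a build and deploy."""
    affected: Set[str] = set()
    for path in changed_files:
        if any(path.startswith(prefix) for prefix in SHARED_PATHS):
            return set(SERVICE_PATHS)  # shared change: everything is affected
        for service, prefixes in SERVICE_PATHS.items():
            if any(path.startswith(prefix) for prefix in prefixes):
                affected.add(service)
    return affected  # files owned by no service (e.g. README.md) are ignored
```

Each service in the returned set can then be built and deployed as an independent parallel job, while untouched services are skipped.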
§ 003 — Selected Work
Case studies
The work behind the metrics.
01 / 04
Feb 2018 – Present
Industrial Manufacturing · Enterprise GenAI
QA Lead → Sr. SDET → AI Systems Architect
Alfa Laval
A seven-year, four-project engagement with Alfa Laval, from modernizing legacy test automation on the Anytime project to architecting their flagship GenAI platform (Alva), with contributions across QA, DevSecOps, and AI systems.
7+
Years engaged
4
Distinct projects
750%
Pipeline speed-up
10+
LLMs orchestrated
02 / 04
Oct 2025 – Mar 2026
MarTech · Consumer
Senior QA Engineer · AI-Driven QA Lead
Mars Petcare — Kinship & AdoptAPet
Built a multi-app Playwright framework from scratch covering 6 projects across two Mars Petcare brands (Kinship US/UK and AdoptAPet). Pioneered AI-driven QA agents that auto-generate test plans, create tests from work items, and self-heal failing suites.
1000+
Automated tests
6
Apps covered
20% → 100%
Auth pass rate
US + UK
Regions supported
03 / 04
Apr 2023 – Dec 2023
Parking Operations · Microservices
Lead QA Engineer · QA Transformation
APCOA Group GmbH
Led the QA transformation on APCOA's core project — defining strategy, processes, testing stages, and criteria for 'definition of done' across a microservices architecture. Built and directed a team of QA professionals integrated into the continuous delivery pipeline.
Lead
Role
Microservices
Architecture
Led
QA team
04 / 04
Apr 2023 – Feb 2025
Enterprise Software · Regulatory Technology
Test Automation Architect · QA Strategy Lead
ERS
Two back-to-back engagements with ERS — first establishing the QA strategy and automation framework on the Core project, then leading the Cyprus rollout with multi-country deployment and a framework migration from Playwright/TypeScript to Cypress/JavaScript.
Orders of magnitude lower
Regression runtime
2 back-to-back
Engagements
Multiple
Countries deployed
§ 004 — Toolbox
Stack
Not every tool. The right ones.
Eleven years of language-hopping and framework-chasing distilled into the stack I actually reach for when the stakes are real.
AI & Agents
- LlamaIndex
- FastAPI
- Azure OpenAI
- Anthropic Claude
- OpenAI GPT-4o
- Mistral
- pgvector
- Tavily
- FLUX
- ReAct agents
- RAG
- Streaming LLM
QA & Testing
- Playwright
- Cypress
- Selenium
- Appium
- JMeter
- LoadRunner
- ISTQB AI Testing
- OWASP scans
- Self-healing tests
- AI test generation
- xUnit
- pytest
DevOps & Infra
- Azure DevOps
- GitHub Actions
- Jenkins
- Bicep IaC
- Docker
- Azure App Service
- Azure Functions
- ACR
- Application Insights
- OpenTelemetry
- Cloudflare
- AWS Lambda
Backend & Data
- C# / .NET 8
- Python 3.12
- TypeScript
- Node.js
- React
- PostgreSQL
- SQL Server
- EF Core
- Clean Architecture
- REST · gRPC
- Azure AD auth
- Auth0
§ Playbook
Free · 12 pages · PDF
The Enterprise GenAI
Reliability Checklist.
42 items across 6 categories. Exactly what I run through when auditing enterprise AI systems. Real incidents, pass/fail format, genuinely useful on the Monday morning after you print it.
§ 005 — Contact
Let's talk
Got an AI platform to ship?
Whether you need a fractional CTO, a multi-agent architecture review, an enterprise QA transformation, or hands-on implementation — pick the fastest path.
→ Option A · Book instantly
30 min · Free
→ Option B · Send a brief
Async · < 24h
Quick facts
- Location: Sarajevo, BiH · GMT+2
- Engagement: Full-time · Contract · Fractional
- Response: < 24 hours