Levent Gündogdu

Senior trading-systems engineer  /  C++ & CUDA, native Bayesian optimisation, low-latency execution, HPC

German citizen  -  based in Portugal, relocating to London

I build native, measurement-driven trading systems: tick-level backtesters, GPU-accelerated optimisers, real-time execution engines, and the observability that proves they work. For the last three years I have designed and run Fathom, an integrated systematic trading platform, on my own capital - trading my own book while building it.


Current work: Fathom

An integrated systematic trading platform: tick-level backtesting, GPU-accelerated Bayesian optimisation, a live C++ execution engine, 3D telemetry, and an autonomous monitoring loop. Self-directed, funded from personal capital.

Native execution engine
The live decision path, rewritten in single-host C++ under a measurement-first discipline. Shared-memory transport and journaled durability are the target end state, with CUDA-resident decisioning as the final step.
C++ is the live execution engine for all 13 trading tracks, cut over track-by-track after a shadow-validation burn-in, gated by a custom divergence comparator with per-track rollback. Eight shared C++ libraries. Measurement landed first: 10 per-hop latency histograms across the live path, 17M+ cross-host observations, single-digit-millisecond decision p99.
Native CUDA Bayesian optimisation engine
Gaussian-process Bayesian optimisation rebuilt from scratch in C++/CUDA to tune strategy parameters, against a PyTorch/BoTorch baseline.
Matérn 5/2 GP, qLogEI acquisition, parallel multi-stream L-BFGS-B. 33x faster end-to-end than the BoTorch baseline on a V100 - 8.1 s to 243 ms per iteration at d=19, n=256. 121 unit tests plus 28 reference checks against SciPy, matching to machine epsilon. Runs as elastic GPU workers pulling jobs over NATS across a six-GPU, three-architecture fleet (RTX 4090 / Ada, V100 / Volta, P100 / Pascal) spanning the production cluster and an external worker host.
Tick-level backtester
A C++20 engine that replays historical market data tick-by-tick through the production strategy chain.
~11K lines of C++20, 133 unit tests, a 10-plugin ordered decision chain. Runs against a TimescaleDB market-data store with 365-day raw-tick retention.
Autonomous monitoring loop
A Model Context Protocol server exposing the platform's internals to LLM agents, with two agents that sweep state on a fixed cadence and publish findings.
46 introspection tools across trading, analytics, market data, infrastructure, and experiments. Two agents (infrastructure and trading) running a local LLM, publishing deduplicated findings over NATS. Scope-based auth; fixture-tested tools.
Real-time 3D telemetry dashboard
A Swift / RealityKit dashboard that renders live platform state as procedural 3D geometry, fed by a custom binary WebSocket protocol.
~27,500 lines of Swift, 10 binary codec versions across 8 stream types. Radar webs, voxel grids of optimisation trials, gauge arcs, atlas-based glyph text. Profiled and optimised: dirty-flagged mesh caching, frame-budgeted scheduling, signpost instrumentation.
Infrastructure
The platform runs on a bare-metal Kubernetes cluster I build and operate end-to-end.
6 nodes: Cilium eBPF, kube-vip HA, Rook/Ceph, three independent PostgreSQL clusters (one TimescaleDB), NATS HA, NVIDIA GPU Operator (RTX 4090 + V100), plus a dedicated GPU dev host. OpenTelemetry to VictoriaMetrics / VictoriaLogs / Tempo. 52-job CI pipeline.

Writing


Contact

levent@feature-it.com

linkedin.com/in/levent-guendogdu