Thanos Ariyanayagam
Software Engineer & Tech Lead @ Google
Architecting planet-scale systems at the intersection of virtualization and AI. I build the high-performance, secure infrastructure that powers Google’s most critical workloads, including Gemini and Waymo.
About Me
I engineer the foundational infrastructure that powers the modern AI era. As a Tech Lead at Google, I specialize in designing planet-scale systems where uncompromising security meets extreme performance. My work directly enables the reliable, high-speed execution of Google’s most critical workloads, including Gemini and Waymo.
My expertise lies at the bleeding edge of virtualization, systems programming, and high-performance computing. Whether I'm architecting a next-generation C++ node runtime that saves tens of thousands of engineering hours, leading fleet-wide isolation initiatives, or crafting patent-pending compiler extensions for 3D deep learning hardware, I thrive on translating highly ambiguous architectural challenges into robust, scalable realities.
I don't just write code; I build resilient engines for scale. Armed with deep expertise in modern C++, distributed systems, and low-level hardware-software co-design, my drive is to push the boundaries of what's possible with today’s hardware to unlock tomorrow's technological leaps.
Experience
Software Engineer III & Tech Lead
Nov 2025 – Present
Leading node-level runtime architecture across AI/ML Infrastructure and virtualization frameworks. Key focus areas:
- Borg Isolation Commitment: Fortified ecosystem security by leading the transparent migration of untrusted workloads into isolated VMs, rigorously balancing security with system efficiency.
- TI-VM Virtualized Runtime: Architecting and optimizing the future of virtualized execution environments across the fleet.
- Confidential AI/ML Infrastructure: Leading node-level runtime architecture for a highly confidential project directly supporting Google's rapid AI growth.
- AI Compute Fungibility: Maximized global fleet capacity by engineering seamless compute fungibility (CPU/GPU/TPU) for high-demand ML workloads, unlocking scale for Gemini.
Software Engineer II
Aug 2024 – Nov 2025
- Identified and architected significant performance improvements for virtualized workloads across Google's fleet, leading to infrastructure cost savings equivalent to 56.7+ SWE years.
- Designed and implemented a novel VM Package Hotplugging system that plugs and unplugs NVMe devices from VMs without relying on passthrough systems or in-VM dependencies. The design unlocked native host-level performance with minimal in-VM overhead and resolved complex filesystem bottlenecks.
- Led the architectural design and C++ implementation of the VM resource management parity initiative, enabling seamless telemetry extraction and in-place VM updates to execute complex, untrusted ML workloads at scale.
Software Engineering Intern
Engineered a high-throughput, thread-safe C++ API in Borglet, using lock-free programming to minimize contention. Designed a novel hierarchical data-sharing solution for modularizing fleet-wide autonomous services within the Borg cluster manager.
Software Engineer Intern
Authored a patent-pending C++ compiler extension for 3D deep learning models on proprietary hardware. Boosted ML inference throughput by 11% via advanced compiler optimizations (operator fusion, memory layout transformations, mixed-precision INT8).
Software Developer Intern
Deployed a highly optimized C++/Go ensembling library that boosted a production NLP API's F1 score by 4.28% and reduced annotation errors by 50%. Built a scalable ML experimentation pipeline on GCP.
STEP Intern
Enhanced C++ infrastructure connecting to Spanner DB for ML anomaly detection, resulting in an 8% increase in bad actor suspensions and a 10% reduction in scan quota.
Projects
OpenStreetMaps GIS
C++
Achieved a >56% runtime improvement in a custom GIS by architecting a concurrent design that uses thread pools to manage asynchronous tasks, parallelizing data parsing and pathfinding heuristics.
NEPIADA Drone Swarms
Python / RL
Designed novel multi-agent Reinforcement Learning algorithms (DQN, PPO) that outperformed state-of-the-art methods in adversarial, partial-information environments simulating drone swarm behavior.
AI Reversi Engine
C
Developed a top-5% ranked Reversi AI in C. Implemented an aggressively optimized Minimax algorithm featuring alpha-beta pruning, transposition tables (memoization), and advanced move ordering to drastically reduce search space under tight constraints.
Technical Arsenal
Languages
- Modern C++ (17/20/23)
- Python
- Go
- C
- SQL & Bash
Infrastructure
- Borg & Kubernetes
- GCP / AWS
- Docker
- Spanner / BigQuery
- gRPC / Protocol Buffers
AI & ML
- TensorFlow
- PyTorch
- RLlib
- Model Pruning / INT8
- Scikit-Learn
Core Domains
- Systems Design
- Compiler Optimization
- High-Performance Computing
- Multi-threading
- Lock-Free Programming