Technical Analysis Shows RTX 5090 eGPU Enables AI and Gaming on M4 MacBook Air
Researcher Scott J. demonstrates that an NVIDIA RTX 5090 external GPU can drive playable gaming and 120x faster AI prompt processing on an M4 MacBook Air via a Linux virtual machine.
A technical analysis by researcher Scott J. demonstrates that an NVIDIA RTX 5090 external GPU can be used with an M4 MacBook Air via a Linux virtual machine, enabling playable gaming and much faster AI inference. The setup requires complex workarounds for Thunderbolt passthrough, including a custom virtual DMA device and kernel patches that work around Apple Silicon hardware limitations. While gaming performance is held back by emulation overhead and Thunderbolt bandwidth, the eGPU delivers a roughly 120x improvement in AI prompt processing speed over native Apple Silicon.
The configuration passes the GPU through Thunderbolt into a 64-bit ARM Linux virtual machine hosted on macOS, which sidesteps the lack of native NVIDIA drivers on Apple Silicon. Getting there involved significant engineering work. The author created a custom virtual PCI device called `apple-dma-pci` to manage Direct Memory Access within the strict 1.5 GB mapping ceiling imposed by the Apple Silicon DART IOMMU. Kernel patches using kprobes were also required, both to work around memory alignment issues in the NVIDIA driver and to enable hardware Total Store Ordering for x86 emulation.
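To illustrate why a fixed mapping ceiling is awkward to manage, here is a minimal Python sketch of a first-fit allocator over a 1.5 GB window. It is not the `apple-dma-pci` implementation, whose internals are not described here; the class `DmaWindow`, its methods, the 16 KiB page size, and the reading of 1.5 GB as 1536 MiB are all assumptions made for illustration.

```python
MIB = 1024 * 1024
CEILING = 1536 * MIB   # 1.5 GB ceiling from the article, read as 1536 MiB (assumption)
PAGE = 16 * 1024       # assumed mapping granularity of 16 KiB


class DmaWindow:
    """Hypothetical first-fit allocator for a bounded DMA address window."""

    def __init__(self, ceiling=CEILING, page=PAGE):
        self.ceiling = ceiling
        self.page = page
        self.mappings = {}  # start offset -> size in bytes

    def _round_up(self, size):
        # Round the request up to a whole number of pages.
        return -(-size // self.page) * self.page

    def map(self, size):
        """Reserve a contiguous range and return its start offset."""
        size = self._round_up(size)
        cursor = 0
        for start in sorted(self.mappings):
            if start - cursor >= size:
                break  # the gap before this mapping is large enough
            cursor = start + self.mappings[start]
        if cursor + size > self.ceiling:
            raise MemoryError("DMA window exhausted or too fragmented")
        self.mappings[cursor] = size
        return cursor

    def unmap(self, start):
        """Release the mapping that begins at `start`."""
        del self.mappings[start]

    def mapped_bytes(self):
        return sum(self.mappings.values())
```

In this sketch, mapping three 512 MiB buffers fills the window. Freeing the first and third leaves 1 GiB free but no contiguous 1 GiB gap, so a 1 GiB request fails even though the total free space would suffice. That is one simplified way a bounded window can fragment over time.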
Gaming benchmarks show that the setup works but remains substantially slower than a native PC configuration. On the M4 MacBook Air, Cyberpunk 2077 at 1080p reached 42 fps with the eGPU, up from 26 fps natively, though the CPU under FEX emulation remains the bottleneck. Doom, which is unplayable on macOS natively due to a lack of OpenGL support, ran at approximately 49 fps in the virtualised environment. The author noted that gaming performance is generally 2 to 4 times slower than on a native PC with the same GPU, owing to the translation layers and Thunderbolt latency.
The largest performance uplift was observed in local artificial intelligence inference. For the Qwen 35B model, the M4 Air’s prompt processing time dropped from 17 seconds to 150 ms with the eGPU, a roughly 120x improvement. Token generation speed rose from approximately 22 tokens per second natively to 155 tokens per second with the external GPU, about seven times faster and ahead of the M4 Max Mac Studio in this specific workload. The RTX 5090 also scaled better under concurrency, maintaining linear throughput increases with multiple requests, whereas Apple Silicon solutions saturated quickly.
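As a quick check on these ratios, the arithmetic can be reproduced from the rounded figures quoted above. The variable names are illustrative. The rounded inputs give a prompt-processing speedup of about 113x, slightly below the reported 120x; presumably the reported figure was derived from unrounded measurements, though that is an assumption.

```python
# Figures as quoted in the article (rounded); names are illustrative.
prompt_native_s = 17.0   # prompt processing time on the M4 Air, seconds
prompt_egpu_s = 0.150    # prompt processing time with the eGPU, seconds
gen_native_tps = 22.0    # token generation natively, tokens per second
gen_egpu_tps = 155.0     # token generation with the eGPU, tokens per second

# Time-based speedup: how many times shorter the eGPU run is.
prompt_speedup = prompt_native_s / prompt_egpu_s

# Rate-based speedup: how many times more tokens per second.
gen_speedup = gen_egpu_tps / gen_native_tps

print(f"prompt processing: {prompt_speedup:.0f}x")
print(f"token generation:  {gen_speedup:.1f}x")
```

The gap between the two ratios suggests the eGPU helps far more with the prompt processing phase than with token-by-token generation.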
The project is currently a proof-of-concept rather than a consumer-ready solution. It requires a special Apple entitlement for driver signing, which the author has requested but not yet received, forcing users to build their own driver versions. Stability issues persist, including Steam crashing in loops due to FEX bugs and potential DMA mapping fragmentation that may require VM restarts. The author is working with upstream QEMU to integrate these patches, though widespread adoption remains dependent on future improvements to Thunderbolt support on Apple Silicon.

