The new Cactus AI inference engine lets mobile devices run local models using one-tenth the RAM through NPU optimization and ...