Q-Infer is based on PowerInfer models, which are stored in a special format called PowerInfer GGUF. This format builds on GGUF and contains both the LLM weights and the predictor weights.

    .
    ├── *.powerinfer.gguf
    ...
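As a quick sanity check before loading a model, the sketch below reads the fixed-size header of a `*.powerinfer.gguf` file. It assumes that PowerInfer GGUF keeps the standard GGUF header layout (version 2 or later): the 4-byte magic `GGUF`, a little-endian `uint32` version, then `uint64` tensor and metadata key/value counts. The function names `read_gguf_header` and `find_powerinfer_models` are hypothetical helpers, not part of Q-Infer or PowerInfer.

```python
import struct
from pathlib import Path

GGUF_MAGIC = b"GGUF"   # first four bytes of a standard GGUF file
HEADER_SIZE = 24       # magic (4) + version (4) + tensor count (8) + KV count (8)


def read_gguf_header(path):
    """Return (version, tensor_count, metadata_kv_count).

    Assumption: the file uses the standard GGUF v2+ header layout.
    """
    with open(path, "rb") as f:
        header = f.read(HEADER_SIZE)
    if len(header) < HEADER_SIZE or header[:4] != GGUF_MAGIC:
        raise ValueError(f"{path} does not start with a GGUF header")
    # "<" selects little-endian with no padding: uint32, uint64, uint64.
    return struct.unpack("<IQQ", header[4:HEADER_SIZE])


def find_powerinfer_models(model_dir):
    """List files matching *.powerinfer.gguf directly inside model_dir."""
    return sorted(Path(model_dir).glob("*.powerinfer.gguf"))
```

A typical use would be to call `find_powerinfer_models` on the model directory and pass each result to `read_gguf_header`. The header alone does not show whether predictor weights are present; that would require reading the tensor list, which this sketch does not do.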