A desktop application written in PyQt5 (Python). It supports OpenAI's ChatGPT API as well as a locally running LLaMA model. Local inference supports 8-bit as well as 4/3/2-bit modes (the model must already be quantized, and CUDA is required). A rough sketch of the two backends is shown below.
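
For orientation, here is a minimal sketch of what the two backends roughly look like in plain Python. This is not the app's actual code: the function names, the `gpt-3.5-turbo` model choice, and the use of the `openai` (pre-1.0 API) and `transformers`/`bitsandbytes` packages are illustrative assumptions.

```python
# Illustrative sketch only; the repository's real module and class names may differ.
import openai  # pip install openai (pre-1.0 ChatCompletion API assumed here)
from transformers import AutoModelForCausalLM, AutoTokenizer  # pip install transformers bitsandbytes


def chat_with_openai(prompt: str) -> str:
    """Send a single prompt to the ChatGPPT... ChatGPT API and return the reply text."""
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    return response["choices"][0]["message"]["content"]


def load_local_llama_8bit(model_path: str):
    """Load a local LLaMA checkpoint for 8-bit inference on a CUDA GPU."""
    tokenizer = AutoTokenizer.from_pretrained(model_path)
    model = AutoModelForCausalLM.from_pretrained(
        model_path,
        load_in_8bit=True,   # 8-bit quantization via bitsandbytes; needs CUDA
        device_map="auto",
    )
    return tokenizer, model
```

The 4/3/2-bit path differs in that it expects a checkpoint that has already been quantized ahead of time, as noted above.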
WIP: should work, but documentation and general cleanup are still in progress.