About
The Ollama skill provides a streamlined interface for deploying and managing local Large Language Model (LLM) inference servers on Bazzite. It uses Podman Quadlet for containerization and follows a single-instance design so that loaded models share GPU memory efficiently. Whether you need to configure hardware acceleration for NVIDIA, AMD, or Intel GPUs, pull open-source models such as Llama 3 and Mistral, or integrate the local API into your development workflow, this skill automates the complex setup and management tasks involved in running local AI. A minimal sketch of that API integration appears below.
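
As an illustration of the API integration mentioned above, the following sketch queries Ollama's standard generate endpoint (`http://localhost:11434/api/generate`) using only Python's standard library. The model name `llama3` is an assumption for this example; substitute any model you have already pulled.

```python
import json
import urllib.request

# Build a POST request against the local Ollama server (default port 11434).
# "llama3" is a placeholder model name -- use one you have pulled locally.
request = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps({
        "model": "llama3",
        "prompt": "Explain what a Podman Quadlet is in one sentence.",
        "stream": False,  # ask for a single JSON object instead of a stream
    }).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

# Send the request and print the generated text from the JSON response.
with urllib.request.urlopen(request) as response:
    result = json.load(response)
    print(result["response"])
```

Because the server listens on localhost by default, the same call works from any language or tool that can issue HTTP requests, which is what makes the local API straightforward to wire into an existing development workflow.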