Ollama 0.1.33 introduces experimental support for concurrency, giving developers and researchers a way to get more out of AI applications on a single machine, such as an Apple silicon laptop. Two new environment variables control it: `OLLAMA_NUM_PARALLEL` lets one model process several requests at the same time, and `OLLAMA_MAX_LOADED_MODELS` lets several models stay loaded at once. This enhancement unlocks new levels of […]
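To make the two variables concrete, here is a minimal Python sketch of how a client might exercise them. It assumes an Ollama server listening on the default local address (`http://localhost:11434`) and uses the `/api/generate` endpoint with streaming turned off. The helper names (`server_env`, `generate`, `run_concurrently`), the values `4` and `2`, and the model names in the usage comment are illustrative assumptions, not part of Ollama itself.

```python
import json
import os
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Assumption: default local Ollama endpoint; change it if your server listens elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"


def server_env(num_parallel, max_loaded_models, base=None):
    """Return a copy of the environment with both concurrency variables set."""
    env = dict(os.environ if base is None else base)
    env["OLLAMA_NUM_PARALLEL"] = str(num_parallel)
    env["OLLAMA_MAX_LOADED_MODELS"] = str(max_loaded_models)
    return env


def generate(model, prompt, url=OLLAMA_URL):
    """Send one non-streaming generate request and return the response text."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]


def run_concurrently(jobs, send=generate, workers=4):
    """Run (model, prompt) jobs on a thread pool; results keep the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda job: send(*job), jobs))


# Usage sketch (needs the ollama binary and already-pulled models; names are placeholders):
#   import subprocess
#   subprocess.Popen(["ollama", "serve"], env=server_env(4, 2))
#   jobs = [("llama3", "Why is the sky blue?"), ("phi3", "Name three primes.")]
#   for text in run_concurrently(jobs):
#       print(text[:80])
```

The variables are read by the server process, so they have to be set in the environment that starts `ollama serve`, not in the client. The thread pool only issues the requests side by side; whether they actually run in parallel depends on the server settings and on available memory.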