You may need to use the gpu_memory_limit and/or lora_on_cpu config options to avoid running out of memory. If you still run out of CUDA memory, you can try merging in system RAM instead.
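The text does not say how to force the merge into system RAM. One common approach, assumed here rather than taken from the source, is to hide every GPU from the process by setting CUDA_VISIBLE_DEVICES to an empty string, so that CUDA-based libraries fall back to the CPU. The helper names `cpu_only_env` and `run_on_cpu` are hypothetical, and the command in the usage section is a placeholder, not the real merge invocation.

```python
import os
import subprocess
import sys


def cpu_only_env(base=None):
    """Return a copy of the environment with all CUDA devices hidden."""
    env = dict(os.environ if base is None else base)
    # Assumption: an empty CUDA_VISIBLE_DEVICES hides every GPU from the
    # child process, so the work is done on the CPU in system RAM.
    env["CUDA_VISIBLE_DEVICES"] = ""
    return env


def run_on_cpu(command):
    """Run `command` (a list of arguments) with GPUs hidden; return its exit code."""
    return subprocess.run(command, env=cpu_only_env(), check=False).returncode


if __name__ == "__main__":
    # Placeholder command: replace it with your tool's actual merge invocation.
    placeholder = [
        sys.executable,
        "-c",
        "import os; print(repr(os.environ['CUDA_VISIBLE_DEVICES']))",
    ]
    print("exit code:", run_on_cpu(placeholder))
```

Running on the CPU is usually much slower than on a GPU, so this is best treated as a fallback after the two config options have been tried.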