Replies: 1 comment
-
@bmabi17 It's a fantastic idea that could align with CodeInterviewAssist's goals of zero cost and full privacy for end users, but there are some challenges we should consider, especially given the hardware many in our community might have, like 4GB GTX cards, and even in comparison with 8GB VRAM RTX cards.

Challenges with 4GB GTX Cards
- Inference Performance: Even if a model fits in VRAM, processing could be sluggish, potentially taking 5-10 seconds per response, which would disrupt real-time features like debugging or solution generation during interviews.
- Resource Contention: With limited system RAM (e.g., 8GB or 16GB total) and a 4GB GPU, multitasking with an IDE, browser, and interview tools (e.g., Zoom) could cause memory swapping or instability.

Comparison with 8GB VRAM RTX Cards

Additional Considerations

Collaborative Next Steps
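To make the 4GB-vs-8GB trade-off concrete, here is a minimal sketch of how a default model could be chosen based on available VRAM. This is not the project's actual logic; the function name `suggestModelForVram`, the thresholds, and the Ollama model tags are illustrative assumptions (quantized builds commonly available through Ollama), and it assumes a Node/TypeScript codebase.

```ts
// Hypothetical sketch: map available VRAM to a default quantized Ollama model tag.
// Thresholds are rough assumptions, not benchmarks; tags are examples only.
function suggestModelForVram(vramGiB: number): string {
  if (vramGiB >= 8) {
    // 8GB RTX cards: an 8B-class 4-bit quant fits with room for context.
    return "llama3.1:8b-instruct-q4_K_M";
  }
  if (vramGiB >= 4) {
    // 4GB GTX cards: a 3B-class model; expect slower, terser responses.
    return "llama3.2:3b-instruct-q4_K_M";
  }
  // CPU/iGPU fallback; latency will be noticeable.
  return "llama3.2:1b-instruct-q4_K_M";
}

console.log(suggestModelForVram(4)); // -> a 3B-class quantized model for 4GB cards
```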
-
Hi,
I was wondering if we could also add support for self-hosted LLM models via Ollama? It exposes OpenAI-compatible API endpoints. This would let end users run the model at zero cost and with full privacy.
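For illustration, here is a minimal sketch of what using Ollama's OpenAI-compatible endpoint could look like from a Node/TypeScript app (assumed stack), using the official `openai` client pointed at a local Ollama server. The model name `llama3.2` is just an example; use whatever model you have pulled locally.

```ts
// Minimal sketch, not the project's actual integration: reuse the OpenAI SDK
// against Ollama's OpenAI-compatible endpoint at http://localhost:11434/v1.
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "http://localhost:11434/v1", // local Ollama server
  apiKey: "ollama", // required by the client, ignored by Ollama
});

async function main() {
  const completion = await client.chat.completions.create({
    model: "llama3.2", // example; any locally pulled model works
    messages: [
      { role: "user", content: "Explain binary search in two sentences." },
    ],
  });
  console.log(completion.choices[0].message.content);
}

main().catch(console.error);
```

Because the endpoint is API-compatible, the existing OpenAI call paths would mostly only need a configurable base URL and model name to support this.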