Best architecture for using an LLM in a laptop app [closed]


I'm designing an app that takes a document and then gets an LLM to review it. Ideally it would:

  1. run on a low-end laptop (i.e. no GPU),
  2. work without internet access,
  3. run with minimal cost, and
  4. have a RAG system that informs the review (a minimal sketch of the retrieval step I mean is just below).
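
To make requirement 4 concrete, here is roughly the retrieval step I have in mind, as a minimal sketch. It assumes the `sentence-transformers` package; the model name is just a small CPU-friendly default and the corpus is a placeholder.

```python
# Minimal local-RAG retrieval sketch: embed reference chunks once, then pull
# the nearest ones for each review. Assumes the `sentence-transformers`
# package; model name is a small CPU-friendly default, corpus is a placeholder.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # ~80 MB, runs fine on CPU

reference_chunks = ["...style guide rules...", "...domain glossary..."]  # placeholder
chunk_vecs = model.encode(reference_chunks, normalize_embeddings=True)

def retrieve_context(query: str, k: int = 3) -> list[str]:
    q = model.encode([query], normalize_embeddings=True)[0]
    scores = chunk_vecs @ q             # cosine similarity (vectors are normalized)
    top = np.argsort(scores)[::-1][:k]  # indices of the k best-matching chunks
    return [reference_chunks[i] for i in top]
```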

I realise I might have to compromise on some of these.

I've thought of three architectures:

A. Keep RAG on the laptop and make remote calls to e.g. ChatGPT for generation. This presumably incurs per-call API costs.
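
For option A, I picture something like the sketch below: retrieval stays local and only generation is paid for. It assumes the official `openai` package (v1 SDK); the model name is just an example, and the context chunks come from the retrieval sketch above.

```python
# Option A sketch: retrieval stays on the laptop, only generation is remote.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def review_document(document: str, context_chunks: list[str]) -> str:
    context = "\n\n".join(context_chunks)
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # any hosted model; billed per token
        messages=[
            {"role": "system",
             "content": "You review documents using the supplied reference material."},
            {"role": "user",
             "content": f"Reference material:\n{context}\n\nDocument to review:\n{document}"},
        ],
    )
    return response.choices[0].message.content

# usage: review_document(doc_text, retrieve_context(doc_text))
```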

B. Run RAG plus a local model, e.g. Mistral, entirely on the laptop. This might need a more powerful machine than I want.
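
For option B, I'd expect something like `llama.cpp` with a quantized model, so everything runs offline on CPU. This sketch assumes the `llama-cpp-python` bindings; the GGUF path and quantization level are placeholders.

```python
# Option B sketch: fully offline inference on CPU.
from llama_cpp import Llama

llm = Llama(
    model_path="./mistral-7b-instruct-q4_k_m.gguf",  # ~4 GB of quantized weights
    n_ctx=4096,     # context window; larger windows cost more RAM
    n_threads=4,    # CPU-only: tune to the laptop's core count
)

def review_document(document: str, context: str) -> str:
    # Mistral-Instruct prompt format
    prompt = (f"[INST] Reference material:\n{context}\n\n"
              f"Review the following document:\n{document} [/INST]")
    out = llm(prompt, max_tokens=512)
    return out["choices"][0]["text"]
```

A 4-bit 7B model needs very roughly 4-6 GB of RAM plus context, which is exactly my "low-end laptop" worry.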

C. Run the whole thing in the cloud and just make calls to it from the laptop, using ChatGPT, Llama, or similar. Obviously this incurs cloud hosting costs.
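
For option C, the laptop would just be a thin client, something like the sketch below. It assumes a cloud VM exposing an OpenAI-compatible endpoint (servers like vLLM and Ollama can serve Llama/Mistral this way); the URL and model name are placeholders.

```python
# Option C sketch: the laptop is a thin client; RAG and the model live on a
# cloud VM behind an OpenAI-compatible endpoint.
import requests

CLOUD_URL = "https://my-llm-vm.example.com/v1/chat/completions"  # hypothetical

def review_document(document: str) -> str:
    resp = requests.post(
        CLOUD_URL,
        json={
            "model": "llama-3.1-8b-instruct",  # whatever the server hosts
            "messages": [
                {"role": "user", "content": f"Review this document:\n{document}"},
            ],
        },
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]
```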

Is there a better/standard way?