AI Agents

DediRock · July 2025

Has anyone successfully written their own AI agent on their local machine? if so what computer specs did you actually need/have?

woinokiz · July 2025

Not , but yes with apis

nekomikoreimu · July 2025

I tried running the Gemma 3 4b model on an RTX 3060 12GB. The performance wasn’t fully satisfying, but it wasn’t bad either.

DediRock · July 2025

@woinokiz said:
Not , but yes with apis

gotcha have you looked into building your own on your local machine?

DediRock · July 2025

@nekomikoreimu said:
I tried running the Gemma 3 4b model on an RTX 3060 12GB. The performance wasn’t fully satisfying, but it wasn’t bad either.

okay that's cool so what were you running your agent for? and then how did it compare to just using say chat GPT on the web browser

wdmg · July 2025

LM Studio is great if you just want to mess around. Ollama's good as well if you want a (imo) nicer API.

rattlecattle · July 2025

AI agents are API calls under the hood with some added features like tool calling and mcp support which otherwise are not usually available on a chat based web interface. The API calls can be to a remote endpoint like OpenAI, OpenRouter etc or a self hosted Ollama instance - doesn't matter. Most of them are OpenAI compatible.

To build such an agent can use simple plain HTTP requests (like python requests). Tool and mcp details can be included in the system prompt itself. Then you would need to parse the response and if check if its a tool call, execute the tool implementation yourself and send the result back to the LLM (another HTTP request actually). LLM's are stateless - do not maintain session, so in each request need to include the past chat history.

This is too much bolierplate, so there are existing AI agent framework like langchain, CrewAI, AgnoAI, autogen on the Python side and myriad of other framework popping up almost daily.

From the specs perspective, a small VPS instance will work fine except for running Ollama.

DediRock · July 2025

@wdmg said:
LM Studio is great if you just want to mess around. Ollama's good as well if you want a (imo) nicer API.

okay awesome thanks yeah I used ollama 3. I created one to help me with my email accounts. But that model would mistake IP addresses and say cigars. small funny stuff like that so I looked into it and I just need a more powerful machine so I can use a different model

emaiI · July 2025

@DediRock said:

@wdmg said:
LM Studio is great if you just want to mess around. Ollama's good as well if you want a (imo) nicer API.

okay awesome thanks yeah I used ollama 3. I created one to help me with my email accounts. But that model would mistake IP addresses and say cigars. small funny stuff like that so I looked into it and I just need a more powerful machine so I can use a different model

Don't do that for anything serious... why not APIs of big providers?

nekomikoreimu · July 2025

@DediRock said:

@nekomikoreimu said:
I tried running the Gemma 3 4b model on an RTX 3060 12GB. The performance wasn’t fully satisfying, but it wasn’t bad either.

okay that's cool so what were you running your agent for? and then how did it compare to just using say chat GPT on the web browser

I remember I once linked it with Copilot in VS Code. Honestly, the web-browser version of ChatGPT seems to perform better. The only real advantage is that you can use it offline, I guess.

woinokiz · July 2025

@DediRock said:

@woinokiz said:
Not , but yes with apis

gotcha have you looked into building your own on your local machine?

Not , but last time I was trying to clone my video then electricity had issues , never seen after that

DediRock · July 2025

@rattlecattle said:
AI agents are API calls under the hood with some added features like tool calling and mcp support which otherwise are not usually available on a chat based web interface. The API calls can be to a remote endpoint like OpenAI, OpenRouter etc or a self hosted Ollama instance - doesn't matter. Most of them are OpenAI compatible.

To build such an agent can use simple plain HTTP requests (like python requests). Tool and mcp details can be included in the system prompt itself. Then you would need to parse the response and if check if its a tool call, execute the tool implementation yourself and send the result back to the LLM (another HTTP request actually). LLM's are stateless - do not maintain session, so in each request need to include the past chat history.

This is too much bolierplate, so there are existing AI agent framework like langchain, CrewAI, AgnoAI, autogen on the Python side and myriad of other framework popping up almost daily.

From the specs perspective, a small VPS instance will work fine except for running Ollama.

Okay well that's a little more in-depth than my current knowledge I will definitely research this a bit more. Okay perfect, did research about a month ago or so now I did not see anything about Lang chain crew AI or the other two you had said, however there was nothing out the box that would do what I wanted to do I use Outlook 2021 currently my local machine. So it seems like some sort of custom-built solution or some variation of it was the only way to do it I have more than just one email box I use in my Outlook, Outlook 365 has autopilot but is nowhere near what I need at least I think but thank you very much that's good stuff for me to read up on

CloudHopper · July 2025

Sounds like you need N8N: https://github.com/n8n-io/n8n

It has integrations for accessing services like Outlook and AI APIs so you can create workflows to read/send emails etc

DediRock · July 2025

@emaiI said:

@DediRock said:

@wdmg said:
LM Studio is great if you just want to mess around. Ollama's good as well if you want a (imo) nicer API.

okay awesome thanks yeah I used ollama 3. I created one to help me with my email accounts. But that model would mistake IP addresses and say cigars. small funny stuff like that so I looked into it and I just need a more powerful machine so I can use a different model

Don't do that for anything serious... why not APIs of big providers?

My understanding, you can't use the bigger ones because you had to make some big old API to call their system, it doesn't live locally on your machine. At least that was my understanding.

DediRock · July 2025

@CloudHopper said:
Sounds like you need N8N: https://github.com/n8n-io/n8n

It has integrations for accessing services like Outlook and AI APIs so you can create workflows to read/send emails etc

That is perfect. Thank you!

DediRock · July 2025

@nekomikoreimu said:

@DediRock said:

@nekomikoreimu said:
I tried running the Gemma 3 4b model on an RTX 3060 12GB. The performance wasn’t fully satisfying, but it wasn’t bad either.

okay that's cool so what were you running your agent for? and then how did it compare to just using say chat GPT on the web browser

I remember I once linked it with Copilot in VS Code. Honestly, the web-browser version of ChatGPT seems to perform better. The only real advantage is that you can use it offline, I guess.

Right, but technically it would be faster if your computer was strong enough to run its own version or engine right? Instead of an API calling ChatGPT, downloading etc. Then there's the problem that ChatGPT does not store all of your data. You had 30 gigabytes of data that you wanted your Agent to pull from, ChatGPT would not be able to do that. It'd be limited correct?

DrNutella · July 2025

Created with n8n on vps. Very easy.

DediRock · July 2025

@woinokiz said:

@DediRock said:

@woinokiz said:
Not , but yes with apis

gotcha have you looked into building your own on your local machine?

Not , but last time I was trying to clone my video then electricity had issues , never seen after that

what do you mean your electricity had issues?

woinokiz · July 2025

@DediRock said:

@woinokiz said:

@DediRock said:

@woinokiz said:
Not , but yes with apis

gotcha have you looked into building your own on your local machine?

Not , but last time I was trying to clone my video then electricity had issues , never seen after that

what do you mean your electricity had issues?

You would know if you were from tier 3 country

DediRock · July 2025

@DrNutella said:
Created with n8n on vps. Very easy.

yes, it sounds like you're a seasoned coder though

DrNutella · July 2025

@DediRock said:

@DrNutella said:
Created with n8n on vps. Very easy.

yes, it sounds like you're a seasoned coder though

JSON at best in this scenario

adanforest · July 2025

Don't know i just use copilot student account.

DediRock · July 2025

@woinokiz said:

@DediRock said:

@woinokiz said:

@DediRock said:

@woinokiz said:
Not , but yes with apis

gotcha have you looked into building your own on your local machine?

Not , but last time I was trying to clone my video then electricity had issues , never seen after that

what do you mean your electricity had issues?

You would know if you were from tier 3 country

tracking

DediRock · July 2025

@adanforest said:
Don't know i just use copilot student account.

gotcha that's only available though on Microsoft 365, I believe? how many email accounts do you have?

adanforest · July 2025

@DediRock said:

@adanforest said:
Don't know i just use copilot student account.

gotcha that's only available though on Microsoft 365, I believe? how many email accounts do you have?

Don't need Microsoft 365, i'm using copilot agent on VS Code

Peppery9 · July 2025

@DediRock said:

@nekomikoreimu said:

@DediRock said:

@nekomikoreimu said:
I tried running the Gemma 3 4b model on an RTX 3060 12GB. The performance wasn’t fully satisfying, but it wasn’t bad either.

okay that's cool so what were you running your agent for? and then how did it compare to just using say chat GPT on the web browser

I remember I once linked it with Copilot in VS Code. Honestly, the web-browser version of ChatGPT seems to perform better. The only real advantage is that you can use it offline, I guess.

Right, but technically it would be faster if your computer was strong enough to run its own version or engine right? Instead of an API calling ChatGPT, downloading etc. Then there's the problem that ChatGPT does not store all of your data. You had 30 gigabytes of data that you wanted your Agent to pull from, ChatGPT would not be able to do that. It'd be limited correct?

I think you're getting a bit mixed up with a local model verses an agent. Ollama or LM Studio are good options for running models locally - in general, you want as many GPUs with as much VRAM as you can get your hands on, and then some. Apple Silicon Macs are also a good choice with their unified memory. You can get a lot out of a small model on consumer hardware but temper your expectations accordingly, don't expect anywhere near ChatGPT-level performance or knowledge. Per-token API pricing is typically cheap enough that it's hard to justify a big investment in hardware.

An agent on the other hand is just an LLM with tools it can use. The tools can run on your local machine even if the model isn't, and they're very lightweight as they don't do any heavy lifting. Claude Desktop and VSCode (+others) can use MCP servers to interact with local apps, databases, files, etc on your machine. There's lots to choose from, or you could always write your own.

30GB is a lot of data for an LLM to process, and far outside any the context window of any model. Depending on what you're trying to do you might need to look into RAG.

@DediRock said:
Outlook 365 has autopilot but is nowhere near what I need at least I think but thank you very much that's good stuff for me to read up on

I have a work-provided Microsoft 365 Copilot subscription. I get some AI summaries and quick reply shortcuts in Outlook and chat buttons everywhere. I find it borderline useless.

Motion3549 · July 2025

not worth ROI.

DediRock · July 2025

@adanforest said:

@DediRock said:

@adanforest said:
Don't know i just use copilot student account.

gotcha that's only available though on Microsoft 365, I believe? how many email accounts do you have?

Don't need Microsoft 365, i'm using copilot agent on VS Code

wow you're right, I thought you needed Microsoft 365 to have co-pilot. Reading into this now, thank you

DediRock · July 2025

@Peppery9 said:

@DediRock said:

@nekomikoreimu said:

@DediRock said:

@nekomikoreimu said:
I tried running the Gemma 3 4b model on an RTX 3060 12GB. The performance wasn’t fully satisfying, but it wasn’t bad either.

okay that's cool so what were you running your agent for? and then how did it compare to just using say chat GPT on the web browser

I remember I once linked it with Copilot in VS Code. Honestly, the web-browser version of ChatGPT seems to perform better. The only real advantage is that you can use it offline, I guess.

Right, but technically it would be faster if your computer was strong enough to run its own version or engine right? Instead of an API calling ChatGPT, downloading etc. Then there's the problem that ChatGPT does not store all of your data. You had 30 gigabytes of data that you wanted your Agent to pull from, ChatGPT would not be able to do that. It'd be limited correct?

I think you're getting a bit mixed up with a local model verses an agent. Ollama or LM Studio are good options for running models locally - in general, you want as many GPUs with as much VRAM as you can get your hands on, and then some. Apple Silicon Macs are also a good choice with their unified memory. You can get a lot out of a small model on consumer hardware but temper your expectations accordingly, don't expect anywhere near ChatGPT-level performance or knowledge. Per-token API pricing is typically cheap enough that it's hard to justify a big investment in hardware.

An agent on the other hand is just an LLM with tools it can use. The tools can run on your local machine even if the model isn't, and they're very lightweight as they don't do any heavy lifting. Claude Desktop and VSCode (+others) can use MCP servers to interact with local apps, databases, files, etc on your machine. There's lots to choose from, or you could always write your own.

30GB is a lot of data for an LLM to process, and far outside any the context window of any model. Depending on what you're trying to do you might need to look into RAG.

@DediRock said:
Outlook 365 has autopilot but is nowhere near what I need at least I think but thank you very much that's good stuff for me to read up on

I have a work-provided Microsoft 365 Copilot subscription. I get some AI summaries and quick reply shortcuts in Outlook and chat buttons everywhere. I find it borderline useless.

yeah you're right on that, the definitions of those two need to be cleared and understood thoroughly. It seems like technical things, finding good definitions of things can be a challenge. but thank you for that, seems like trying to get towards a chat GPT type quality on your local machine is not something that's easily obtainable or done.

DediRock · July 2025

@Motion3549 said:
not worth ROI.

yeah seems to be that way right now.

Howdy, Stranger!

Categories

In this Discussion

AI Agents

Comments

Howdy, Stranger!

Quick Links

Categories

In this Discussion

AI Agents

Comments