How do you sanitize the data you put into commercial LLMs?

As the title suggests, when you copy logs, errors, broken code and whatever else into a commercial clanker, how do you anonymize your personally identifiable information first?

Comments

  • rpqu Member

    Uh sed?

  • Alyx Member, Host Rep

    I would take bets that nobody really does that 🫣

  • zed Member

    wait what

  • nghialele Member

    So people did that? I just throw raw into it

  • JohnFilch123

    I usually use Notepad or Notepad++ and use search & replace to bulk remove/replace data. For domain I usually use example.com, for IPs I usually use ipv4 or ipv6.

    Thanked by 2: oloke, nghialele
  • edited 5:16PM

    @Alyx said:
    I would take bets that nobody really does that 🫣

    Yeah fat chance people would trade in any of that sweet, sweet efficiency gain for something boring like privacy or security. Not like the whole concept is likely to cross their minds at all but even if it did: No way.

    Thanked by 1: TrikeLike
  • TrikeLike Member

    I'm sure someone (maybe you) could vibe-slop something together for this quickly. If that's too much work, I imagine you could just feed it through some lightweight local model that strips this info out first, then copy-paste that into the prompt of your commercial chatbot of choice

  • oloke Member, Host Rep

    @JohnFilch123 said:
    I usually use Notepad or Notepad++ and use search & replace to bulk remove/replace data. For domain I usually use example.com, for IPs I usually use ipv4 or ipv6.

    Actually I do the same for now. Would love to know if there are better solutions. That said, I don't rely on LLMs for really sensitive things anyway.

    I think there are now companies specializing in keeping users from putting sensitive company data into a clanker. I saw one ad on YouTube, can't remember the name of the product.

  • @oloke said: Would love to know if there are better solutions

    I guess for really bulk logs etc. you can run it through your local LLM, which will sanitize it. Or you can probably vibe code a web app in JS or something.

    Thanked by 1: oloke
  • PolyAnthi Member

    I've seen more and more things like https://x.com/MaziyarPanahi/status/2073383825669849118 show up lately on my twitter feed (yes I know it is renamed X, cope).

    I imagine that would be the perfect utilisation, using local models to pre-clean your output to commercial LLMs.

  • Void Member

    So you guys don’t say YOLO and send all the logs and code as is?
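
For anyone who wants something between sed and a local model, the search & replace approach described in the thread can be scripted. The following is only a rough sketch, not a complete PII scrubber: the `Sanitizer` class, its placeholder formats and the `acme-internal.net` domain are all made up for illustration. It swaps emails and IP addresses for numbered placeholders (so the same address always maps to the same placeholder and the log stays readable) and then applies a user-supplied table of literal replacements, e.g. your real domain to example.com. It will miss anything it has no pattern for (API keys, usernames, hostnames you did not list) and may also redact dotted version strings that happen to look like IPv4 addresses.

```python
import ipaddress
import re

# Loose candidate patterns; IP candidates are validated with ipaddress below,
# so timestamps like 12:34:56 and MAC addresses are left alone.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")
IPV4 = re.compile(r"(?<![\w.])(?:\d{1,3}\.){3}\d{1,3}(?!\w|\.\d)")
IPV6 = re.compile(r"(?<![\w:])(?:[0-9A-Fa-f]{0,4}:){2,7}[0-9A-Fa-f]{0,4}(?![\w:])")


def _is_ip(candidate):
    """Return True if the candidate parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(candidate)
        return True
    except ValueError:
        return False


class Sanitizer:
    """Hypothetical helper: replaces emails, IPs and known literal strings."""

    def __init__(self, literals=None):
        # literals: exact strings to swap out, e.g. {"mycorp.net": "example.com"}
        self.literals = dict(literals or {})
        self.seen = {}  # (kind, original value) -> placeholder

    def _placeholder(self, kind, value, template):
        # Reuse the placeholder if this value was seen before, so repeated
        # occurrences of one address stay correlated in the output.
        key = (kind, value.lower())
        if key not in self.seen:
            n = sum(1 for k in self.seen if k[0] == kind) + 1
            self.seen[key] = template.format(n)
        return self.seen[key]

    def sanitize(self, text):
        text = EMAIL.sub(
            lambda m: self._placeholder("email", m.group(), "user{}@example.com"),
            text,
        )
        for kind, pattern in (("ipv4", IPV4), ("ipv6", IPV6)):
            text = pattern.sub(
                lambda m, kind=kind: self._placeholder(
                    kind, m.group(), "<" + kind + "-{}>"
                )
                if _is_ip(m.group())
                else m.group(),
                text,
            )
        # Longest literals first, so a full hostname wins over a substring of it.
        for secret in sorted(self.literals, key=len, reverse=True):
            text = text.replace(secret, self.literals[secret])
        return text


if __name__ == "__main__":
    s = Sanitizer({"acme-internal.net": "example.com"})
    print(s.sanitize(
        "12:34:56 sshd: failed login for bob@acme-internal.net "
        "from 203.0.113.7 via gw.acme-internal.net"
    ))
```

The mapping in `s.seen` stays on your machine, so you can translate placeholders in the model's answer back to the real values by hand. Still worth eyeballing the output before pasting it anywhere.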
