Self-hosted search (4get, searx, etc.)

david Member

For the past few years I've used various searx instances for searching (various, because they tend to stop working at some point and I have to switch).

Recently, I set up a private instance of 4get (4get.ca) on my VPS (password protected). It seems to be a new alternative to searx. It's pretty lightweight. So far it's working well.

Has anybody else used it? Or heard of it?

I'm the only one using it, so I'm hoping my vps IP won't get throttled or blocked. The amount of traffic isn't much, but I'm not sure if the search traffic appears different enough to cause an issue.

Thanked by 1mrTom

Comments

  • Nanja Member

    I don't think you will have an issue.
    The 4get site says it runs on Contabo, and I feel like Contabo is very overshared. I think their resources won't even be looked at.

    You aren't 4get and probably not using contabo, so I guess it depends on what your VPS provider will think.

    I highly doubt they will care about search traffic though, whatever company it is.

    A lot of people use VPS as a VPN, I imagine VPS companies are used to seeing search traffic. That's even if they bother looking.

  • I use whoogle all the time and never get throttled, so I doubt you would get throttled with that on a connection that you or only a few people use.

    Thanked by 1david
  • @Nanja said: A lot of people use VPS as a VPN, I imagine VPS companies are used to seeing search traffic. That's even if they bother looking.

    I believe he is worried about google throttling him not the VPS provider. If your VPS provider throttles you for this they are very very weird lol.

  • david Member

    Yeah, I'm using Vultr. Not worried about them. Just the search providers (google & others).

  • shruub Member

    @david said:
    Yeah, I'm using Vultr. Not worried about them. Just the search providers (google & others).

    For the search providers, I've never had a persistent problem. I had one private instance which never got rate limited (it wasn't even password protected), and nowadays I have a semi-public one (kind of iykyk), which only gets rate limited temporarily if there's lots of traffic.

  • If you don't make it a public instance, who else except you would use it?

  • edited April 12

    @david said:
    For the past few years I've used various searx instances for searching (various, because they tend to stop working at some point and I have to switch).

    Recently, I setup a private instance of 4get (4get.ca) on my vps (password protected). It seems to be a new alternative to searx. It's pretty lightweight. So far it's working well.

    Has anybody else used it? Or heard of it?

    I'm the only one using it, so I'm hoping my vps IP won't get throttled or blocked. The amount of traffic isn't much, but I'm not sure if the search traffic appears different enough to cause an issue.

    Looks nice, better than searx. Wish the creator was not an edgy kid.

  • david Member

    @TheGreatOakley said: Looks nice, better than searx. Would it be possible to install it on shared hosting or a VPS with HestiaCP? Looks like PHP.

    It is PHP. I'm not familiar with the control panel, but you could probably get it to work on the VPS. Maybe shared hosting, too. It needs a custom nginx or Apache setup, so you'd have to have a way to do that.

  • david Member

    @niceghost said: If you don't make it a public instance, who else except you would use it?

    I don't think I'd get rate limited for the amount of traffic. But it might be unique enough to identify it as not organic. I'm not sure if they care or look for that sort of thing. Based on the comments here, I'd guess not.

  • david Member

    Well, that was fast. Google blocked it already. But just that instance, not the IP address. I'm still able to search Google directly from the same IP.

    I tried changing the cookie and the user agent in the scraper, but it didn't help.

  • shruub Member

    @david said:
    Well, that was fast. Google blocked it, already. But just that instance, not the IP address. I'm still able to search google directly from the same IP.

    I tried changing the cookie, and user agent, in the scraper but it didn't help.

    Now that's weird. How many requests did you send? (To be fair, I don't know how 4get's scraper works, but it shouldn't be all that different.)

  • david Member

    I'm not sure how many, but it was just me, doing normal searching.

    It has the option to configure socks5 proxies, so I have it set to send it back through my home IP address for now, but it's a bit slower doing that.

    Maybe I'll try it tomorrow to see if they unblock it. I can't see the response, but I imagine it's getting a "suspicious activity" captcha, and there's no way to get past that.
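
    Since the blocked response isn't visible from the search page, one way to check what the scraper is getting back is to classify the raw response. Below is a minimal Python sketch of that idea, not 4get's actual logic. The status codes, the `/sorry/` URL fragment, and the "unusual traffic" phrase are assumptions about what a block page tends to look like, and `FetchResult` and `looks_blocked` are hypothetical names made up for this example.

```python
from dataclasses import dataclass

# Assumed block indicators. Block pages change over time, so treat these
# as placeholders to adjust, not as a documented contract.
BLOCK_STATUS_CODES = {429, 503}
BLOCK_URL_MARKERS = ("/sorry/",)
BLOCK_BODY_MARKERS = ("unusual traffic",)


@dataclass
class FetchResult:
    """Hypothetical container for what the scraper got back."""
    status: int      # HTTP status code
    final_url: str   # URL after following redirects
    body: str        # response body as text


def looks_blocked(result: FetchResult) -> bool:
    """Heuristically decide whether a response is a block or captcha page."""
    if result.status in BLOCK_STATUS_CODES:
        return True
    if any(marker in result.final_url for marker in BLOCK_URL_MARKERS):
        return True
    body = result.body.lower()
    return any(marker in body for marker in BLOCK_BODY_MARKERS)
```

    This is only a heuristic: a legitimate results page for a query that happens to contain one of the marker phrases would be misclassified, so it's best used for logging rather than for automatic decisions.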

  • david Member

    The scraper's user agent is set to a very old Android mobile browser. Once I got it working with the proxy to my home IP, I tried changing it to a newer user agent, but it wouldn't work. It apparently needs the old one.

  • shruub Member

    @david said:
    The scraper's user agent is set to a very old Android mobile browser. Once I got it working with the proxy to my home IP, I tried changing it to a newer user agent, but it wouldn't work, it apparently needs that.

    Maybe wait a bit and then try on your VPS again with a newer user agent. Google doesn't like unusual user agents, so maybe try a newer Chrome version or something? Otherwise, email the dev kid or something.

  • david Member

    It wouldn't surprise me if different user agents allow certain things that others don't, due to their support profile. It's kind of like what's going on right now with Android custom ROMs and trying to get them to pass Play Integrity, burning through fingerprints.

    Google has been extra busy lately blocking things.

  • @david said:
    It wouldn't surprise me if different user agent's allow certain things that others don't, due to their support profile. Kind of like what's going on right now with android custom roms & trying to get them to pass play integrity, burning through fingerprints.

    Google has been extra busy lately blocking things.

    Blame that on AI companies lol

  • @BruhGamer12 said:

    @david said:
    It wouldn't surprise me if different user agent's allow certain things that others don't, due to their support profile. Kind of like what's going on right now with android custom roms & trying to get them to pass play integrity, burning through fingerprints.

    Google has been extra busy lately blocking things.

    Blame that on AI companies lol

    it was never unblocked for inorganic traffic

    a major source of income is legitimate traffic, bot traffic rather not. lots of time is spent on identifying legitimate traffic over bot traffic. always has been.

    i don't understand the privacy crowd that they think that any of this really matters while living in a state of fear

  • shruub Member

    @lowenduser1 said:
    i don't understand the privacy crowd that they think that any of this really matters while living in a state of fear

    The hell do you mean

  • @shruub said:

    @lowenduser1 said:
    i don't understand the privacy crowd that they think that any of this really matters while living in a state of fear

    The hell do you mean

    well it's either being connected or living inside a cave entirely disconnected. there's no middle ground of a little bit of this and a little bit of that. for performance reasons, sure, although i don't see an issue with that with search engines. privacy wise its rather marking one as the privacy crowd and doing the opposite thing

  • shruub Member

    @lowenduser1 said:

    @shruub said:

    @lowenduser1 said:
    i don't understand the privacy crowd that they think that any of this really matters while living in a state of fear

    The hell do you mean

    well it's either being connected or living inside a cave entirely disconnected. there's no middle ground of a little bit of this and a little bit of that. for performance reasons, sure, although i don't see an issue with that with search engines. privacy wise its rather marking one as the privacy crowd and doing the opposite thing

    Yeah nah. There's a huge difference between being in a huge crowd of requests (especially not to Google, but rather to an engine that doesn't suck) and searching for mini dildos yourself and then getting advertisements for some ...pills.

    You don't need to be part of the "privacy crowd", but saying random stuff without knowing doesn't help either. That's like an "influencer" saying that WestVPN helps against not getting hacked or whatever.

  • @shruub said:

    @lowenduser1 said:

    @shruub said:

    @lowenduser1 said:
    i don't understand the privacy crowd that they think that any of this really matters while living in a state of fear

    The hell do you mean

    well it's either being connected or living inside a cave entirely disconnected. there's no middle ground of a little bit of this and a little bit of that. for performance reasons, sure, although i don't see an issue with that with search engines. privacy wise its rather marking one as the privacy crowd and doing the opposite thing

    Yeah nah. It's a huge difference between being in a huge crowd of requests (especially not to google and rather to an engine that doesn't suck) rather than you searching for mini dildos and then getting advertisments for some ...pills.

    You don't need to be part of the "privacy crowd", but saying random stuff without knowing doesn't help either. That's like an "influencer" saying that WestVPN helps against not getting hacked or whatever.

    Sure, one can spend a bit of time installing uBlock or DNS blocking and get rid of advertisements. You can then search for dildos or whatever like anyone else does.

    but saying random stuff without knowing doesn't help either

    It's not about the amount of requests, rather thinking any of that matters. Some countries are starting to select their hardware but they still will have to deal with existing transport and eco systems. they're now facing the after thought of being in control of the entire chain

  • shruub Member

    @lowenduser1 said:

    @shruub said:

    @lowenduser1 said:

    @shruub said:

    @lowenduser1 said:
    i don't understand the privacy crowd that they think that any of this really matters while living in a state of fear

    The hell do you mean

    well it's either being connected or living inside a cave entirely disconnected. there's no middle ground of a little bit of this and a little bit of that. for performance reasons, sure, although i don't see an issue with that with search engines. privacy wise its rather marking one as the privacy crowd and doing the opposite thing

    Yeah nah. It's a huge difference between being in a huge crowd of requests (especially not to google and rather to an engine that doesn't suck) rather than you searching for mini dildos and then getting advertisments for some ...pills.

    You don't need to be part of the "privacy crowd", but saying random stuff without knowing doesn't help either. That's like an "influencer" saying that WestVPN helps against not getting hacked or whatever.

    Sure one can spend a bit time on installing uBlock or DNS blocking and get rid of advertisements. you can then search for dildos or whatever like anyone else does

    That doesn't necessarily solve the data collection problem though. But as I said, you do you.

    but saying random stuff without knowing doesn't help either

    It's not about the amount of requests, rather thinking any of that matters. Some countries are starting to select their hardware but they still will have to deal with existing transport and eco systems. they're now facing the after thought of being in control of the entire chain

    Are we talking about the same thing?

  • Use whoogle. Once in a while I get strange freezes and it returns a timeout, but overall I am pretty happy with it.

  • david Member

    @lowenduser1 said: i don't understand the privacy crowd that they think that any of this really matters while living in a state of fear

    I'm not living in fear. I don't like being tracked and profiled, though. I don't think it's right. I also like the technical challenge.

  • david Member

    @JohnFilch123 said: use whoogle. Once in a while I get strange freezes, so it returns timeout but overall I am pretty happy with it

    One of the things I like about 4get is that there's a drop-down list of search engines, so you can search different ones, or choose a different default.

    Years ago (before I switched to searx), I tried using different alternate search engines (DuckDuckGo, Startpage, maybe others) for many months. I found that, in general, the search results were not as good as Google's. Maybe that's changed now, though. Even though they weren't as good, maybe 75% of the time they were good enough. But I found I still often had to repeat the search on Google.

    Thanked by 1JohnFilch123
  • david Member

    I switched back to my vps IP address, and google is unblocked now, the next day. So it was temporary.

  • Cerana Member

    Yes, I also used 4get as an alternative for searching. This is a very reliable and effective tool, especially if you have limited resources or want to maintain privacy. As long as you don't have performance or data access issues, you'll probably be happy with it.

  • david Member

    Sometimes searching google fails, but clicking search again usually works. Apparently google is changing their interface, so the scraper needs to be rewritten to support it at some point.

    But I'm still pleased with it overall. It's better than the public searx instances I was using before.

    I added some remote proxies, so it will cycle through them. Sometimes they get blocked, and then later it's fine. I wrote a script and small web interface to let me change the proxies easily.
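
    The cycling behaviour described above (rotate through proxies, skip the ones that are currently blocked, try them again later) can be sketched roughly as follows. This is a minimal Python illustration under assumptions, not the script mentioned in the post and not 4get's own proxy code. `ProxyRotator`, the one-hour cooldown, and the example proxy addresses are all hypothetical.

```python
import time
from typing import Callable, Dict, List, Optional


class ProxyRotator:
    """Round-robin over proxies, skipping any that are cooling down."""

    def __init__(
        self,
        proxies: List[str],
        cooldown_seconds: float = 3600.0,  # assumed cooldown, tune to taste
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        if not proxies:
            raise ValueError("at least one proxy is required")
        self._proxies = list(proxies)
        self._cooldown = cooldown_seconds
        self._clock = clock
        self._blocked_until: Dict[str, float] = {}
        self._next_index = 0

    def mark_blocked(self, proxy: str) -> None:
        """Take a proxy out of rotation until its cooldown has passed."""
        self._blocked_until[proxy] = self._clock() + self._cooldown

    def next_proxy(self) -> Optional[str]:
        """Return the next usable proxy, or None if all are cooling down."""
        now = self._clock()
        for _ in range(len(self._proxies)):
            proxy = self._proxies[self._next_index]
            self._next_index = (self._next_index + 1) % len(self._proxies)
            if self._blocked_until.get(proxy, 0.0) <= now:
                return proxy
        return None


# Usage with placeholder addresses:
rotator = ProxyRotator(["socks5://10.0.0.1:1080", "socks5://10.0.0.2:1080"])
proxy = rotator.next_proxy()
```

    A fixed cooldown matches the observation that blocks tend to clear on their own after a while. It does nothing clever, such as backing off longer for proxies that get blocked repeatedly.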

    Here's a tip for the themes.

    Create a new theme css file that is just the "/* text-result */" section for Gore's theme. It will be similar to the default theme, except with better definition for the search headings and links.
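
    For anyone who would rather script the theme tip than copy and paste, here is a small Python sketch. It assumes the tip means copying that section out of the existing theme file, and that sections are introduced by a comment on its own line (like "/* text-result */") and run until the next such comment. Both are assumptions about the file layout. `extract_css_section` is a hypothetical helper, and the file names in the usage note are placeholders.

```python
def extract_css_section(css_text: str, marker: str = "/* text-result */") -> str:
    """Return the marker line plus everything up to the next comment line.

    Assumes section headers are single-line comments on their own line.
    A standalone comment inside the section would end it early.
    """
    lines = css_text.splitlines()

    # Find the line that is exactly the section marker.
    start = None
    for index, line in enumerate(lines):
        if line.strip() == marker:
            start = index
            break
    if start is None:
        raise ValueError(f"marker not found: {marker}")

    # The section ends at the next standalone comment line, or at end of file.
    end = len(lines)
    for index in range(start + 1, len(lines)):
        stripped = lines[index].strip()
        if stripped.startswith("/*") and stripped.endswith("*/"):
            end = index
            break

    return "\n".join(lines[start:end]).rstrip() + "\n"
```

    Usage would be something like reading the existing theme file into a string, calling `extract_css_section(text)`, and writing the result to a new file in the themes directory (for example `my-theme.css`, a made-up name).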

  • Hannan Member, Host Rep

    Is this very commonly used? We can deploy this as an application on our platform.

  • david Member

    I don't think it's very common.

    Thanked by 1Hannan