PHP or Python for local price tracking website?
My main language is PHP, and I can build a web app with MySQL and a framework.
However, my only Python experience is a Selenium script (backend).
I'm planning a second website (a price tracker for a local e-commerce store) where users receive an email when products on their wish list reach a target price X.
As much as I'd like to take my Python skills to the next level (e.g., writing the scraping scripts and building the frontend with Django), I'm on the fence, because I'm more likely to hit development obstacles than with PHP, in which I have already written a tracker.php that can, e.g., scrape Amazon product prices.
Should I even consider Python + Scrapy + Django, or just go ahead with PHP + cURL/file_get_contents + CodeIgniter (or some other lightweight option with Bootstrap templates)?
Comments
PHP or Golang, better concurrency.
Scraping is easiest with JS, then Go, then PHP (JS > Go > PHP). In the end, I suggest you go with what you already know or what you want to work with.
How often will you scrape the website? How many pages are there to scrape?
Calculate it first. One hour is 3,600 seconds. If you want to scrape 1,000 pages per day, performance doesn't matter at all. But if you want to scrape 10,000 pages every hour, you not only need performance, you will also run into other problems: they can block you because you are wasting their resources, and if they use Cloudflare you will hit CAPTCHA blocks very quickly.
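To make the "calculate it first" step concrete, here is a minimal Python sketch that turns the two figures from this comment (1,000 pages per day and 10,000 pages per hour) into an average request rate. The function names are made up for illustration, and the calculation assumes requests are spread evenly over the period.

```python
SECONDS_PER_HOUR = 3600
SECONDS_PER_DAY = 24 * SECONDS_PER_HOUR


def requests_per_second(pages: int, period_seconds: int) -> float:
    """Average request rate needed to fetch `pages` within `period_seconds`."""
    return pages / period_seconds


def seconds_between_requests(pages: int, period_seconds: int) -> float:
    """Average gap between requests if they are spread evenly over the period."""
    return period_seconds / pages


# 1,000 pages per day: one request roughly every 86 seconds.
light_gap = seconds_between_requests(1000, SECONDS_PER_DAY)

# 10,000 pages per hour: close to 3 requests every second, sustained.
heavy_rate = requests_per_second(10000, SECONDS_PER_HOUR)

print(f"1,000 pages/day   -> one request every {light_gap:.1f} s")
print(f"10,000 pages/hour -> {heavy_rate:.2f} requests per second")
```

At one request every 86 seconds or so, almost any stack is fast enough; at nearly 3 requests per second sustained, blocking and rate limits are likely to matter more than the language.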
Try to go the more official way: you are driving customers to them.
When I was trying to build a similar project 2 years ago, I bothered with scraping at first, but when I asked them, they were more than happy to give me access to their REST API and whitelisted me from any limits/blocks.
Maybe this store has an affiliate program? If so, you should be able to get prices via a REST API. You can also earn some legit money that way.
Interesting insight. If I go ahead, I will probably need to plan some sort of mechanism to spread the product crawl over 24 hours.
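One possible way to spread a crawl over 24 hours, sketched in Python: give every product a fixed minute of the day derived from a hash of its ID, and have a job that runs once per minute crawl only the products assigned to that minute. The `sku-...` IDs and the function names are hypothetical, and this is only one of several reasonable schemes.

```python
import hashlib

MINUTES_PER_DAY = 24 * 60


def crawl_minute(product_id: str) -> int:
    """Map a product ID to a fixed minute of the day (0..1439).

    hashlib is used instead of the built-in hash() because hash() of a
    string changes between Python processes.
    """
    digest = hashlib.sha256(product_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % MINUTES_PER_DAY


def products_due(product_ids, minute_of_day: int):
    """Return the products whose assigned slot is the given minute."""
    return [pid for pid in product_ids if crawl_minute(pid) == minute_of_day]


# Hypothetical product IDs, for illustration only.
catalog = [f"sku-{n}" for n in range(5000)]
due_now = products_due(catalog, minute_of_day=0)
```

A cron job running every minute could call `products_due` with the current minute of the day. Because the slot depends only on the product ID, adding or removing products does not reshuffle the schedule of the others.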
To do it efficiently without forking lots of PHP processes, perhaps you could use some asynchronous PHP like ReactPHP.
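For comparison, the same idea (many requests in flight from a single process, no forking) can be sketched with Python's `asyncio` rather than ReactPHP. This is not ReactPHP code; `fake_fetch` is a stand-in for a real HTTP call, and the concurrency limit of 5 and the example.com URLs are arbitrary assumptions.

```python
import asyncio


async def crawl(urls, fetch, max_concurrent=5):
    """Run fetch(url) for every URL with at most max_concurrent in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def fetch_one(url):
        # The semaphore caps how many fetches run at the same time.
        async with semaphore:
            return await fetch(url)

    # gather() returns results in the same order as the input URLs.
    return await asyncio.gather(*(fetch_one(u) for u in urls))


async def fake_fetch(url):
    """Stand-in for a real HTTP request (hypothetical); just waits briefly."""
    await asyncio.sleep(0.01)
    return f"fetched {url}"


urls = [f"https://example.com/p/{n}" for n in range(20)]
results = asyncio.run(crawl(urls, fake_fetch))
```

The cap on concurrent requests is there mainly to be gentle on the target site, not only for the crawler's benefit.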
Have you considered Laravel? It's very popular in the PHP world, well-documented, and easy to learn (as much as any framework is). I find it quite pleasant.
This. Scraping is always brittle.
Both stacks are OK; neither has any additional advantage over the other.
Scraping is easy. Take advantage of the fact that mobile networks use CGNAT and get a proxy that can change its IP through an API.
I quite often have to call driver.get and requests.get with a sleep() in between so as not to crash the website.
This is with Python & Selenium.
So before thinking about speed, reflect on:
1. Do you need the speed?
2. Can the website handle the speed?
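The sleep-between-requests habit described above can be wrapped in a small helper so it isn't repeated around every call. This is a rough sketch: `Throttle` and `fetch_all` are made-up names, and `fetch` is whatever callable does the actual request.

```python
import time


class Throttle:
    """Enforce a minimum delay between consecutive calls to wait()."""

    def __init__(self, min_interval: float, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request; None before the first

    def wait(self) -> float:
        """Sleep just long enough to respect min_interval; return seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self.min_interval - (now - self._last))
            if delay > 0:
                self._sleep(delay)
        self._last = now + delay
        return delay


def fetch_all(urls, fetch, throttle):
    """Call fetch(url) for each URL, pausing between requests."""
    results = []
    for url in urls:
        throttle.wait()
        results.append(fetch(url))
    return results
```

Assuming the `requests` package is installed, usage could look like `fetch_all(urls, requests.get, Throttle(2.0))`, and a Selenium `driver.get` could be passed in the same way. The 2-second interval is an arbitrary example; the right value depends on what the site can handle.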