Nice post. How did you handle email? Are you using an external API, queues, or do you rely directly on a local Postfix installation? I'm also wondering how you manage to send that many emails without being flagged as spam or having messages fail to send.
Hi. Good question! I just use exim4. I've been running my own email servers for a number of years, so I don't find it too difficult. I do take care to look at bounces and disable those accounts. I also include a link in each email to disable all alerts. My hope is that people will click that rather than mark it as spam. And of course every email address is verified at signup.
I like your article. I got your crawler working on Ubuntu 16.04 and I'm inserting posts into MySQL just fine. It runs every 5 minutes and grabs 3000 posts (about 2100 or so at the time of this posting are not dupes). MySQL holds the post_id, so it knows whether it has seen a post before. I'd like to grab comments too. It would be nice if your example had a bulk method for grabbing comments, like the one for posts. Love that it's in PHP. I do a bunch of stuff in command-line PHP because it's quick and dirty, and with PHP you don't need to npm or pip install anything - it's all just there! :)