Twitter engineer Raghav Jeyaraman explained in a blog post how BotMaker, the company's recently introduced anti-spam system, helped reduce spam content by 40 percent.
BotMaker has been in the works for a long time, and Twitter built it from scratch. It was designed around Twitter's principles on unsolicited content, and it is powerful enough to handle billions of events a day.
Although BotMaker makes the final decision about whether a tweet will be published, it depends on other bots to analyze the tweet.
The post describes how BotMaker works with Scarecrow, a real-time scanning service that is the first to check each tweet for disapproved, spammy or suspicious links. Based on the results, it either lets the tweet through or blocks it. In some cases, the user is asked to enter a CAPTCHA code to verify that the tweet was not posted by a bot.
Twitter's Sniper runs the next scan on tweets approved by Scarecrow, looking for spam signals that are too tricky to evaluate in real time. Tweets that make it through Sniper are then sent for batch analysis, which consists of queries that can be run offline, so they add no latency to posting.
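The staged flow described above can be pictured with a short Python sketch. This is an illustration only, not Twitter's code: the names `Verdict`, `Tweet`, `realtime_check`, `deferred_check` and `process`, the link lists and the repeated-word heuristic are all assumptions made up for this example.

```python
from dataclasses import dataclass, field
from enum import Enum


class Verdict(Enum):
    ALLOW = "allow"
    DENY = "deny"
    CHALLENGE = "challenge"  # ask the user to solve a CAPTCHA


@dataclass
class Tweet:
    user: str
    text: str
    links: list = field(default_factory=list)


# Hypothetical link lists; the post does not describe the real data sources.
BLOCKED_LINKS = {"http://spam.example/win"}
SUSPICIOUS_LINKS = {"http://short.example/x1"}


def realtime_check(tweet):
    """Stage 1 (Scarecrow's role): cheap checks run before publishing."""
    if any(link in BLOCKED_LINKS for link in tweet.links):
        return Verdict.DENY
    if any(link in SUSPICIOUS_LINKS for link in tweet.links):
        return Verdict.CHALLENGE
    return Verdict.ALLOW


def deferred_check(tweet):
    """Stage 2 (Sniper's role): a slower check on tweets that passed stage 1.

    The repeated-word heuristic is invented purely for illustration.
    Returns True when the tweet looks like spam.
    """
    words = tweet.text.lower().split()
    return len(words) >= 4 and len(set(words)) == 1


def process(tweet, batch_queue):
    """Run a tweet through the stages and return a status string."""
    verdict = realtime_check(tweet)
    if verdict is not Verdict.ALLOW:
        return verdict.value
    if deferred_check(tweet):
        return "removed"
    batch_queue.append(tweet)  # Stage 3: left for offline batch queries
    return "published"


queue = []
print(process(Tweet("alice", "hello world"), queue))
print(process(Tweet("bob", "click", ["http://spam.example/win"]), queue))
```

Only the first stage sits in front of the user in this sketch. The batch stage just collects tweets in a queue, which mirrors the point that offline queries do not have to slow down posting.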
Twitter also uses other methods, such as watching the behaviour of accounts whose tweets are marked as spam by several users and analyzing it further to find patterns shared by similar accounts.
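As a rough illustration of the first step in that approach, the sketch below picks out accounts that several distinct users have reported. The function name `accounts_to_review`, the input format and the threshold of three reporters are assumptions for this example. The post does not say how Twitter actually counts or weighs reports.

```python
from collections import defaultdict

REPORT_THRESHOLD = 3  # hypothetical cutoff, not a figure from the post


def accounts_to_review(reports, threshold=REPORT_THRESHOLD):
    """Return accounts reported as spam by at least `threshold` distinct users.

    `reports` is an iterable of (reporter, reported_account) pairs.
    Repeat reports from the same user are counted once.
    """
    reporters = defaultdict(set)
    for reporter, account in reports:
        reporters[account].add(reporter)
    return sorted(
        account for account, who in reporters.items() if len(who) >= threshold
    )


reports = [
    ("u1", "bot9"), ("u2", "bot9"), ("u3", "bot9"),
    ("u1", "carol"), ("u1", "carol"),
]
print(accounts_to_review(reports))
```

Counting distinct reporters rather than raw reports keeps one persistent user from flagging an account alone. The accounts returned here would be candidates for the further pattern analysis the article mentions.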