Google is hiring a new anti-scraping czar, whose job will be to analyze search traffic for the patterns of search scrapers, assess their impact, and work with engineering teams to develop new models that strengthen Google’s defenses.
Search Results Scraping
SEOs rely on SERP tracking companies to provide search results data for understanding search ranking trends, enabling competitive intelligence, and other keyword-related research and analysis.
Many of these companies conduct massive amounts of automated crawling of Google’s search results to take a snapshot of ranking positions and data related to search features triggered by keyword phrases. This scraping is suspected of causing significant changes to what’s reported in Google Search Console.
In the early days of SEO, there used to be a free keyword data source via Yahoo’s Overture, their PPC service. Many SEOs searched on Yahoo so often that their searches unintentionally inflated the keyword volume. Smart SEOs knew better than to optimize for those keyword phrases.
I have suspected that some SEOs may also have intentionally scraped Yahoo’s search results using fake keyword phrases to generate keyword volumes for those queries and mislead competitors into optimizing for phantom search queries.
&num=100 Results Parameter
There is a growing suspicion, backed by Google Search Console data, that search result scraping may have inflated the official keyword impression data. That inflation may be the reason why Search Console data appears to show that AI search results aren’t sending traffic while Google’s internal data shows the opposite.
This suspicion is based on falling keyword impressions that correlate with Google’s recent move to block requests for 100 search results in a single search query, a technique used by various keyword tracking tools.
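To make the technique concrete, here is a minimal sketch of the difference between one request that asks for 100 results and the paginated requests needed to reach the same depth without it. Only the `num` parameter comes from the discussion above. The base URL, the `start` offset parameter, the page size of 10, and the function names are assumptions used for illustration, and the code only builds URL strings without sending any requests.

```python
from urllib.parse import urlencode

# Assumed endpoint, used only to build example strings.
BASE = "https://www.google.com/search"


def single_request_url(query: str, num: int = 100) -> str:
    """Build one URL that asks for `num` results on a single page."""
    return f"{BASE}?{urlencode({'q': query, 'num': num})}"


def paginated_urls(query: str, depth: int = 100, page_size: int = 10) -> list[str]:
    """Build the URLs needed to reach `depth` results at `page_size` per page.

    The `start` offset parameter and the default page size are assumptions.
    """
    return [
        f"{BASE}?{urlencode({'q': query, 'start': start})}"
        for start in range(0, depth, page_size)
    ]


print(single_request_url("running shoes"))
# https://www.google.com/search?q=running+shoes&num=100
print(len(paginated_urls("running shoes")))
# 10
```

Under these assumptions, a tool that once needed one request per keyword would need ten to see the same 100 positions. If a single 100-result page registered an impression for every listing on it, that would also be consistent with the suspected inflation of impression counts for lower-ranked pages.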
Google Anti-Scraping Engineering Analyst
Jamie Indigo posted that Google is looking to hire an Engineering Analyst focused on combatting search scraping.
The responsibilities for the job are:
- “Investigate and analyze patterns of abuse on Google Search, utilizing data-motivated insights to develop countermeasures and enhance platform security. Analyze datasets to identify trends, patterns, and anomalies that may indicate abuse within Google Search.
- Develop and track metrics to measure scraper impact and the effectiveness of anti-scraping defenses. Collaborate with engineering teams to design, test, and launch new anti-scraper rules, models, and system enhancements.
- Investigate proof-of-concept attacks and research reports that identify blind spots and guide the engineering team’s development priorities. Evaluate the effectiveness of existing and proposed detection mechanisms, understanding the impact on scrapers and real users.
- Contribute to the development of signals and features for machine learning models to detect abusive behavior. Develop and maintain threat intelligence on scraper actors, motivations, tactics and the scraper ecosystem.”
What Does It Mean?
There hasn’t been an official statement from Google, but the job listing suggests that Google is moving to curb search results scrapers. If so, Search Console data should become more accurate, so that’s a plus.