How does Similarweb categorize domains?

Every website in Similarweb is assigned a category by the Similarweb categorization engine. The engine categorizes a website using content tags, similarity results, and a learning set of millions of websites with verified category assignments. It can accurately classify a previously unknown website into one of 25 main categories and 219 sub-categories. You can see the full list here.

How does Similarweb measure the accuracy of categories?

Similarweb's categorization engine is based on a proprietary algorithm that both categorizes new sites and reviews sites that already have a category. The engine refreshes each month, alongside the monthly snapshot release. A dedicated team of data scientists works continuously to improve the accuracy of the engine.

The algorithm is based on many inputs, such as: 

  • content on the website itself,

  • search keywords that drive traffic to the website,

  • incoming and outgoing link relationships between websites,

  • and many other data points.

We can also re-categorize some sites manually (for example, if a request comes through to Support).

Do we categorize subdomains?

We don't categorize most subdomains; however, we do make some exceptions:

  • The subdomain itself is in Similarweb's whitelist. For example, "weather" is in the whitelist, so weather.yahoo.com can be categorized, but example.yahoo.com cannot, because the word "example" isn't in the whitelist.

  • The site's main domain is in Similarweb's whitelist, in which case all of its subdomains can be categorized. For example, "Blogspot" is in the whitelist, so different types of blog websites (all of which are subdomains of blogspot.com) can be categorized differently.
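
The two exceptions above can be sketched in a few lines of Python. This is only an illustration of the rule as described in this article, not Similarweb's actual implementation. The names SUBDOMAIN_LABEL_WHITELIST, MAIN_DOMAIN_WHITELIST, split_host, and is_categorizable are hypothetical, the whitelists contain only the two entries mentioned above, and the sketch assumes that the main domain is the last two labels of the host and that the whitelist check applies to the leftmost subdomain label.

```python
# Hypothetical whitelists. Only "weather" and Blogspot are named in the
# article; storing Blogspot as "blogspot.com" is an assumption.
SUBDOMAIN_LABEL_WHITELIST = {"weather"}
MAIN_DOMAIN_WHITELIST = {"blogspot.com"}


def split_host(host: str) -> tuple[str, str]:
    """Split a host name into (subdomain, main_domain).

    Simplification: the main domain is taken to be the last two labels,
    which is not correct for suffixes such as "co.uk".
    """
    labels = host.lower().strip(".").split(".")
    main_domain = ".".join(labels[-2:])
    subdomain = ".".join(labels[:-2])
    return subdomain, main_domain


def is_categorizable(host: str) -> bool:
    """Return True if the host is eligible for its own category."""
    subdomain, main_domain = split_host(host)
    if not subdomain:
        # A main domain with no subdomain is always eligible.
        return True
    if main_domain in MAIN_DOMAIN_WHITELIST:
        # A whitelisted main domain makes all of its subdomains eligible.
        return True
    # Assumption: the whitelist is matched against the leftmost label.
    return subdomain.split(".")[0] in SUBDOMAIN_LABEL_WHITELIST


if __name__ == "__main__":
    for host in ("weather.yahoo.com", "example.yahoo.com", "myblog.blogspot.com"):
        print(host, is_categorizable(host))
```

Under these assumptions, weather.yahoo.com and any subdomain of blogspot.com are eligible for categorization, while example.yahoo.com is not.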

Can you change the category and/or the functionality of a subdomain?

No. You cannot change either unless the subdomain is on the whitelist. To find out whether a particular subdomain is on the whitelist, submit a support ticket.
