Regrettably, the answer is generally: NO.
How is any site to determine what counts as a "malicious" URL? Is
http://deancollli... malicious, or is it a third party's legitimate
name? Trademarks are often the subject of international disputes, and
major companies have a mixed record on registering their trademarks
internationally. Third-party sites attempting to self-legislate what is
correct would be a nightmare (as spam reporting based upon IP address
already can be).
Most spiders and bots look like normal user agents. If one implements
limits, it is generally a losing game: IP addresses change, and
throttling limits can be triggered by look-ahead caching, active users,
restored sessions, and a host of other causes.
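To make the throttling problem concrete, here is a minimal sketch of a
per-address sliding-window limiter. Everything in it is an assumption
for illustration: the class name, the limit of five requests per
second, and the addresses (taken from the reserved documentation
ranges). It is not any particular server's implementation.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per client key within `window` seconds."""

    def __init__(self, limit, window):
        self.limit = limit
        self.window = window
        # Client key -> timestamps of that client's recent accepted requests.
        self.hits = defaultdict(deque)

    def allow(self, key, now):
        recent = self.hits[key]
        # Discard timestamps that have fallen out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.limit:
            return False
        recent.append(now)
        return True


# Hypothetical policy: five requests per second per IP address.
limiter = SlidingWindowLimiter(limit=5, window=1.0)

# One legitimate user whose browser restores eight tabs at once.
restore = [limiter.allow("203.0.113.7", now=0.01 * i) for i in range(8)]

# A scraper rotating through eight addresses, one request per address.
rotating = [limiter.allow(f"198.51.100.{i}", now=0.01 * i) for i in range(8)]

print(restore)   # the last three requests of the real user are refused
print(rotating)  # every scraper request is accepted
```

With these assumed numbers the limiter refuses the legitimate user and
never notices the scraper, which is the losing game described above.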
Searching for key phrases can be a tactic, followed by a notice of
copyright violation to the provider, but that does not scale well.
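A rough sketch of the phrase check itself, assuming the suspect page
text has already been retrieved by some other means. The function names
and the sample phrases are hypothetical. Note that only the matching is
automated here; finding candidate pages and sending each notice remain
manual, which is where the approach stops scaling.

```python
import re


def normalize(text):
    """Collapse runs of whitespace and lowercase, so reflowed copies still match."""
    return re.sub(r"\s+", " ", text).strip().lower()


def find_key_phrases(page_text, phrases):
    """Return the phrases that occur verbatim (after normalization) in page_text."""
    haystack = normalize(page_text)
    return [p for p in phrases if normalize(p) in haystack]


# Hypothetical distinctive phrases taken from one's own content.
phrases = [
    "a host of other causes",
    "there are likely no easy answers",
]

# Hypothetical text of a suspect page, with different line breaks and case.
suspect = """Throttling can be triggered by caching and A HOST
of other   causes, as everyone knows."""

matches = find_key_phrases(suspect, phrases)
print(matches)
```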
A difficult problem. There are likely no easy answers.
- Bob Gezelter, http://www.rlgsc....