A recent Domain Incite article quotes an ICANN Registrar Stakeholder Group (RrSG) claim that over “800,000 domain names have been suspended since the beginning of the year as a result of Whois email verification rules in the new ICANN Registrar Accreditation Agreement (RAA 2013)”. The cause of these suspensions is inaccurate domain registration data, in particular email addresses that do not satisfy the validation criteria in the new agreement.
According to DI, the RrSG claims that the suspension figure represents data collected by registrars representing approximately 75% of registered gTLD domains (com, net, org, biz, …). To put that 800,000 figure into context, and assuming a ballpark estimate of more than 150 million registered domains in the gTLDs, that’s less than 1 per cent. Given the numerous reported and alleged estimates of Whois inaccuracy over the years [1, 2, 3], I find it hard to understand why anyone would be surprised or alarmed by this figure.
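The back-of-envelope arithmetic can be checked in a few lines. This is a minimal sketch: the 150 million total is the ballpark assumption stated above, not a measured count, and the function name `suspended_share` is purely illustrative. It computes the share two ways, against all gTLD domains and against only the roughly 75% held by the reporting registrars.

```python
def suspended_share(suspended: int, total_domains: int, coverage: float = 1.0) -> float:
    """Return suspended domains as a percentage of the domains covered."""
    return 100.0 * suspended / (total_domains * coverage)


SUSPENDED = 800_000        # figure quoted from the RrSG claim
TOTAL_GTLD = 150_000_000   # ballpark assumption, not a measured count
COVERAGE = 0.75            # share of gTLD domains held by the reporting registrars

# Share measured against every registered gTLD domain.
share_all = suspended_share(SUSPENDED, TOTAL_GTLD)

# Share measured against only the domains the reporting registrars sponsor.
share_reporting = suspended_share(SUSPENDED, TOTAL_GTLD, COVERAGE)

print(f"Against all gTLD domains:       {share_all:.2f}%")
print(f"Against reporting registrars':  {share_reporting:.2f}%")
```

Under either denominator the result stays below 1 per cent, so the conclusion does not depend on which base is used.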
Let me temper this seeming insensitivity to possible registrant or Internet user harm or inconvenience with a suggestion: rather than treating this single data point as alarming, or as evidence of a pattern or unintended consequence, consider the unprecedented opportunity offered by the data set that corroborates the claim.
I applaud the registrars who took the time to collect these data.
I encourage the registrars to share the data.
With ICANN’s SSAC.
With respected members of the security, operations, research and public safety communities.
With ICANN’s Identifier Systems SSR team.
These data – the domain names and the associated registration records (Whois) – can be studied to answer important questions related to registrations (legitimate and malicious) and registrant Whois submission practices; for example:
- Are the registrant data other than email addresses evidently inaccurate?
- When did registrants first apply for the domain name?
- Have the registration data – in particular, email address or point of contact information – been modified by the registrant at any time since the original registration?
- Are the domain names on domain or URL block lists?
- Are the email addresses evidently inaccurate, one-time use (throwaway) email addresses, or validly composed email addresses that bounced?
- Are the email addresses associated with other domains outside the data set?
- Who operates the name servers of these domains?
- Do the name servers of these domains have a positive or negative reputation (e.g., are the name servers known to host malicious domains)?
- What services do these domains offer publicly (and uniquely identify in their zone data using A, CNAME, MX, or SRV resource records)?
- What can be deduced from passive DNS data associated with these domains?
- Where are these services hosted?
- Do the hosting providers of these domains have a positive or negative reputation?
- Where do these domains rank in SEO or site popularity rankings?
- Do hosted services or content (e.g., web site) provide evidence that the domain is active, dormant or malicious?
- What is the characteristic use of the domain name (e.g., online presence, merchant, social medium, pay per click, mail exchange, streaming content…)?
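As an illustration of how one of these questions might be approached, the sketch below sorts email addresses into the categories named in the list: evidently inaccurate, throwaway, or validly composed (bounced or not). Everything here is an assumption made for illustration: the sample records are made up, `THROWAWAY_DOMAINS` is a placeholder for whatever disposable-address list a researcher trusts, the syntax check is deliberately loose rather than a full parser, and the bounce flag would have to come from the registrars' own verification logs.

```python
import re
from collections import Counter

# Deliberately loose syntax check (something@label.tld); not a full RFC 5322 parser.
EMAIL_RE = re.compile(r"^[^@\s]+@([A-Za-z0-9-]+\.)+[A-Za-z]{2,}$")

# Hypothetical placeholder; a real study would load a maintained list.
THROWAWAY_DOMAINS = {"throwaway.example"}


def classify_email(address: str, bounced: bool = False) -> str:
    """Place one registrant email address into a coarse category."""
    address = address.strip()
    if not EMAIL_RE.match(address):
        return "evidently inaccurate"
    domain = address.rsplit("@", 1)[1].lower()
    if domain in THROWAWAY_DOMAINS:
        return "throwaway"
    # Whether a message bounced cannot be read from the address itself;
    # that flag must be supplied from verification records.
    return "validly composed, bounced" if bounced else "validly composed"


# Made-up sample records: (email address, did the verification email bounce?)
records = [
    ("none", False),
    ("user@throwaway.example", False),
    ("admin@example.org", True),
    ("owner@example.net", False),
]

tally = Counter(classify_email(addr, bounced) for addr, bounced in records)
for category, count in tally.items():
    print(f"{category}: {count}")
```

A tally of this kind, run over the real suspension data, would begin to show how many suspensions stem from carelessness, how many from deliberate evasion, and how many from delivery failures.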
This list is not exhaustive, but it does illustrate how valuable such data could be if shared with researchers. The data would remain valuable if registrars are willing to repeat the collection periodically and provide additional data points of this kind.
A single data point, especially when presented as an unqualified statistic, rarely provides sufficient insight to characterize the entire set of affected registrants. Drawing conclusions from it without sharing the data or subjecting them to deeper analysis is premature. Rather than invite others to produce similar data without commensurate access to registration data, I encourage the RrSG to work in cooperation with the security, operations and public safety communities to better understand the data already collected.
As always... the opinions I express here are my own.