WHOIS Misuse Study Draft report for public comment
26 NOVEMBER 2013
Nektarios Leontiadis
Nicolas Christin
Carnegie Mellon University
Table of Contents
1. Introduction
2. Background and overview of the study
  2.1. Descriptive study
  2.2. Experimental study
3. Study Samples
  3.1. Selecting a survey panel
  3.2. Creating a microcosm sample of the world’s registered gTLD domain names
    A proportional probability microcosm
    Registrant sample
    Registrar/Registry sample
4. Law Enforcement & Researchers survey
  4.1. Survey methodology and design details
  4.2. Analysis of responses
    Demographics
    Level of expertise
    Attack experiences
    Specific WHOIS misuse incidents
  4.3. Discussion
5. WHOIS misuse reported by Registrants
  5.1. Survey methodology and design details
    Methodology
    Survey translations
    Types of questions
  5.2. Response and error rates
  5.3. Analysis of responses
    Characteristics of the participants
    Reported WHOIS misuse
    Adverse effects
    Countermeasures
  5.4. Discussion
6. Assessing Registrar/Registry anti-harvesting
  6.1. Survey methodology and design
  6.2. Analysis of responses
    Demographics
    Employed anti-harvesting techniques
    Incidents of WHOIS misuse
    Incidents of WHOIS harvesting and their effect in deploying new countermeasures
  6.3. Testing of WHOIS query rate limiting techniques
  6.4. Discussion
7. Experimental Study
  7.1. Registrars
  7.2. Domain names
  7.3. Registrants associated with domains
    Names of Registrants
    Email addresses
    Physical addresses
    Phone numbers
  7.4. Registering domains
  7.5. Duration of the experiment
  7.6. Breakdown of the collected instances of misuse
    Postal address misuse
    Email address misuse
    Attempted malware delivery
    Phone number misuse
    Other types of misuse
  7.7. Overall experiment incidents of WHOIS misuse
  7.8. Discussion
8. Comparative result analysis
  8.1. Correlation between measured and reported incidence of misuse
  8.2. Domain characteristics affecting email address misuse
  8.3. Domain characteristics affecting phone number misuse
  8.4. Domain characteristics affecting postal address misuse
9. Discussion
10. Appendix A – Law Enforcement/Researcher survey
  10.1. Invitation to participate
  10.2. Consent form
  10.3. Survey questions
11. Appendix B – Registrant survey
  11.1. Invitation to participate
  11.2. Consent
  11.3. Survey questions
  11.4. Terms
12. Appendix C – Registrar and Registry Survey
  12.1. Invitation to Participate
  12.2. Consent form
  12.3. Survey questions
13. Bibliography
Executive summary
Does public access to WHOIS-published data lead to a measurable degree of misuse¹? This
study, sponsored by the Internet Corporation for Assigned Names and Numbers (ICANN) and
initiated by ICANN’s Generic Names Supporting Organization (GNSO, 2010), attempts to
answer this question, with a focus on the five most populous generic Top Level Domains
(gTLDs). To do so, we first surveyed experts, law enforcement agents, Registrants, Registrars,
and Registries, and collected their input on the prevalence of WHOIS misuse, thereby obtaining
a descriptive data set. We then complemented this descriptive portion of the study with a set of
experimental measurements of WHOIS misuse, which we obtained by registering 400 domains
in the top five gTLDs across 16 Registrars, associating unique, synthetic WHOIS contact
information with these domains, and monitoring incidents of misuse for a period of 6 months.
The main finding of the descriptive study is that there is a statistically significant occurrence of
WHOIS misuse affecting Registrants’ email addresses, postal addresses, and phone numbers,
published in WHOIS when registering domains in these gTLDs. Overall, we find that 44% of
Registrants experience one or more of these types of WHOIS misuse. Other types of WHOIS
misuse are reported, but at a smaller, non-significant rate. Among those, a handful of reported
cases appear to be highly elaborate attempts to achieve high attack impact.
As a caveat, most findings of the descriptive study are affected by low response rates from the
parties we surveyed. Most importantly, we are unable to draw meaningful conclusions about the
geographical aspects of WHOIS misuse. Indeed, the great majority of survey responses
originated from the US, even though we used a much more geographically diverse Registrant
population sample, and tried to survey Registrants in their native language.
The experimental study corroborates the findings of the descriptive study. In particular, it offers
quantitative insights regarding both the extent of WHOIS misuse and the parameters affecting
WHOIS misuse. A limitation of the experimental study is that the impact of geographical location
on postal address misuse could not be measured, due to the prohibitive cost of
setting up postal boxes in countries without having an actual residence there.
¹ In this study, WHOIS misuse refers to harmful acts that exploit contact information obtained from
WHOIS. Harmful acts may include generation of spam, abuse of personal data, intellectual property theft,
loss of reputation or identity theft, loss of data, phishing and other cybercrime related exploits,
harassment, stalking, or other activity with negative personal or economic consequences.
Among the measurable factors analyzed by this experiment, we identify the gTLD as the sole
statistically-significant characteristic that affects the occurrence of the associated misuse of
phone numbers published in WHOIS. For example, the rates of WHOIS phone number misuse
are negatively correlated with .ORG domains (less misuse), but positively with .BIZ and .INFO
(more misuse).
Similarly, we find that the domain price is negatively correlated with the possibility of misuse of
email addresses published in WHOIS (i.e., experimental domains purchased at greater cost had
less email address misuse). We also discover that .COM, .NET, and .ORG domains are
associated with less email address misuse, while .BIZ domains are associated with more
misuse.
We also studied whether the composition of domain names themselves impacts the probability
of WHOIS misuse. We find that experimental domain names representing natural person names
appear to foster less email misuse, while for other experimental domain name categories (e.g.,
professional, randomly-generated, etc.), WHOIS misuse probability seems independent of the
domain name composition.
We find that WHOIS anti-harvesting techniques, applied at both the Registry and Registrar level,
are statistically significant in reducing the possibility of WHOIS email address misuse. Overall, we
find that experimental WHOIS data registered with Registries/Registrars with no observable
anti-harvesting countermeasures was twice as likely to result in unwanted emails as data
registered where a countermeasure was deployed. We do not offer, however, a comparative
analysis of the effectiveness of specific anti-harvesting techniques against WHOIS misuse, as
any differences we could observe were not statistically significant.
Finally, we do not find other statistically significant correlations between specific Registrars used
to register experimental domains and measured rates of WHOIS misuse.
1. Introduction
WHOIS is an essential information service that primarily allows anyone to map domain names
to Registrants and their contact information. There is increasing anecdotal evidence of misuse
of the data made publicly available through the WHOIS service. For instance, some Registrants²
have reported that their publicly available WHOIS data was used by a third party to register a
domain name similar to the Registrant’s, while listing contact information identical to that
provided by the Registrant. The domain name registered with the fraudulently acquired
Registrant information was subsequently used to impersonate the owner of the original,
legitimate domain, for nefarious purposes. Other studies have concluded that WHOIS data
could be used for phishing attempts (SAC028, 2008), or even for sending spam email (SAC023,
2007).
The purpose of this WHOIS Misuse study is to provide a quantitative and qualitative
assessment of the types of WHOIS data misuse experienced by gTLD domain name
Registrants, the magnitude of these misuse cases, and characteristics, such as anti-harvesting
measures, that may affect misuse.
The study offers the following contributions:
• We test and validate the hypothesis that public access to WHOIS data leads to a
  measurable degree of misuse of certain kinds of gTLD domain name Registrant identity
  and contact information, via a combination of a descriptive study (surveys), and of an
  experimental study.
• We examine gTLD domain names, associated Registry and Registrar anti-harvesting
  characteristics, and their effect on WHOIS misuse.
• We describe the major types of misuse stemming from public WHOIS access to
  Registrant identity and contact data.
• We assess the effectiveness of anti-harvesting defenses against WHOIS misuse.
• We design and describe a large-scale experiment to empirically measure the type and
  extent of misuse of WHOIS information. This empirical work provides a framework for
  the design of similar future studies.
² See http://www.eweek.com/c/a/Security/Whois-Abuse-Still-Out-of-Control/.
The rest of this report is organized as follows. Section 2 provides the background of the study
and its objectives, and section 3 characterizes the population samples we utilized for the
different components of this study. Sections 4, 5, and 6 discuss the descriptive
part of the study; they describe, respectively, the surveys we conducted with
law enforcement and researchers, with Registrants, and with Registrars and Registries.
Section 7 discusses our experimental study, and includes a detailed presentation
of the experimental design and parameters. Section 8 provides an empirical analysis of data
collected from both the descriptive and the experimental parts of the study. Section 9 concludes with
an overall discussion of the study outcomes.
2. Background and overview of the study
Based on their operational agreement with ICANN (ICANN, 2013), all gTLD Registrars are
required to collect Registrant identification and contact information that is subsequently
published in each Registrar’s WHOIS directory. While the original purpose of WHOIS was to
provide the necessary information to get in contact with a Registrant for legitimate purposes
(e.g., abuse notifications or other operational reasons), uncontrolled public access to WHOIS
also allows the collection of the same information for nefarious purposes such as unsolicited
email or phone calls (i.e., spam). The Generic Names Supporting Organization (GNSO), which
is responsible for the development of gTLD domain name policies, including those pertaining to
WHOIS data, identified in a Task Force Report on WHOIS Services (GNSO, 2007) the
possibility of misuse of WHOIS data for phishing and identity theft, among others.
A later study by the ICANN Security and Stability Advisory Committee (SAC023, 2007) looked
into the potential of misuse of email information posted exclusively in WHOIS. During a three-
month measurement study, they registered an arbitrary number of randomly chosen domain
names, with and without the use of privacy and proxy services, and monitored the mailboxes for
spam email. The study found evidence that the public availability of WHOIS data contributes to
the frequency of spam email; and that protective services applied either to all WHOIS data (e.g.
rate limiting) or to WHOIS data associated with a single domain name (e.g. privacy and proxy
services), can deter WHOIS misuse.
This WHOIS misuse study builds on this previous work by providing updated results, and a
more comprehensive set of experiments. This study draws heavily upon the Terms of Reference
for WHOIS Misuse Studies (ICANN, 2009). This work was designed and conducted in response
to the GNSO’s decision to pursue WHOIS studies (GNSO, 2010); the goal of this study is to
provide empirical data to help ICANN determine whether there is substantial WHOIS misuse that
warrants further action. Therefore, this study is designed to pursue the following
objectives:
• Validate or invalidate the hypothesis that public access to gTLD WHOIS data leads to a measurable degree of misuse.
• If the hypothesis is validated, identify major types of misuses stemming from public access to gTLD WHOIS data.
• Determine which anti-harvesting measures appear to be most effective against gTLD WHOIS misuse.
We adopted a two-pronged approach – that is, we conducted both a descriptive study and a
complementary, experimental study. The descriptive study aims at collecting past instances of
misuse cases, through interviews and surveys of potential victims and Registrars/Registries. We
also surveyed law enforcement and cybercrime researchers and agencies that deal with
incidents of misuse, to better determine the nature and overall magnitude of WHOIS misuse.
We complemented the descriptive study with an experimental study. The goal was to acquire
controlled data on misuse events by setting up a representative environment attractive to those
who could be tempted to misuse WHOIS, and to measure the impact of anti-harvesting measures
that could affect the degree of misuse observed.
2.1. Descriptive study
Pursuant to the Terms of Reference, the descriptive study consists of a set of four surveys: a)
Registrant survey, b) Registrar/Registry survey, c) Cybercrime Researchers survey, and d)
Consumer Protection, Regulatory and Law Enforcement organizations survey. Because they
relied on identical questionnaires, we will subsequently consider surveys c) and d) as a single
survey.
The goals of each of the surveys are as follows.
A) Registrant survey. Gathered a representative sample of domain names registered in the
top five gTLDs, and surveyed experiences of specific harmful acts attributed to WHOIS
misuse.
B) Registrar and Registry surveys. Surveyed Registries and Registrars associated with the
registration of the domain name sample from survey (A), to identify WHOIS anti-
harvesting mechanisms employed, and collect aggregate information about known
WHOIS harvesting attacks.
C/D) Cybercrime researchers and law enforcement surveys. These surveys intend to further
broaden the study’s perspective of WHOIS misuse by contacting a representative set of
researchers and consumer protection, regulatory, and law enforcement organizations, to
gather examples and statistics on harmful acts in general, and more specifically those
attributed to WHOIS misuse.
Our goal for survey A was to obtain a representative sample by randomly selecting domain
names from the top five gTLDs, maintaining the population proportions, and to generate study
results with a 95% confidence interval. Owing to the much smaller populations involved, surveys
B and C/D, on the other hand, are intended to provide qualitative insights rather than
quantitative measurements.
2.2. Experimental study
The second facet of this work is an experimental study, which attempts to complement the
observations gained from the descriptive study by gathering a controlled set of network
measurements. The platform that is used for the measurements is a set of domain names,
registered as part of the study across the top five gTLDs through a representative sample of
Registrars, and associated with artificial Registrant identities. The goal is to measure the extent
of illegal or harmful Internet activity experienced by domain name Registrants that can be
exclusively attributed to WHOIS misuse, given that the experimental design eliminates any
extraneous variables that may correlate (positively or negatively) with the observed misuse.
In the surveys collected from the descriptive study, it is hard to completely eliminate external
plausible causes for illegal or harmful Internet activity to draw conclusions on WHOIS misuse
with certainty. For example, a Registrant might experience misuse of a personal phone
number used in the registration of a domain name. However, if that same number is also
listed in the Registrant’s Facebook profile, and the profile has poor privacy controls,
then misuse cannot be attributed to WHOIS with certainty. On the other hand, in the
experimental study, Registrant identities (a term defined in Section 7.3) are artificially
constructed and solely used for the purpose of this experiment.
The experimental study lasted six months, during which we collected emails, voicemails, and
postal mail received by the Registrants associated with the experimental domains. We
registered 400 domains with a geographically diverse set of 16 Registrars, distributed
proportionally across the top 5 gTLDs, with domain names that are classified in four categories
of interest plus one control category. Our analysis provides insights into the different degrees of
correlation between WHOIS misuse and gTLDs, types of misuse, types of domains, cost of
domains, and anti-harvesting techniques deployed. However, the experimental design did not
allow us to gain major insights on how regions and countries are affected by WHOIS misuse; in
particular, we were not able to set up postal boxes outside the United States, because mail
regulations in most countries require proof of residency, and “virtual office” solutions were
prohibitively expensive at the scale at which we needed to run the experiment.
3. Study Samples
In this section we discuss how we created domain name samples and selected invitees for the
different parts of the study. We first describe how we chose the invitees for the researcher and
law enforcement survey, before presenting the sampling process of the domain names and
resulting invitees of the Registrar/Registry and Registrant surveys.
3.1. Selecting a survey panel
As part of the Law Enforcement and Researchers survey, we assembled a geographically
diverse group of experts in the fields of security and privacy affiliated with research institutes,
academia, law enforcement agencies, Internet Service Providers (ISPs), and national data
protection commissions. The goal was to survey experts to whom WHOIS misuse incidents are
reported, to ultimately obtain a qualitative global overview of WHOIS misuse, rather than a mere
collection of individual misuse incidents.
Geographical region   Type of expertise
North America         Agencies to which security incidents are reported
South America         Large commercial vendor research labs
Europe                Large Internet service providers
Africa                Academic cybercrime research organizations
Asia / Pacific        Law enforcement agencies
                      Commercial cybercrime investigators
                      National Data Protection Commissioners
Table 1 Recruiting requirements in terms of geographical region and type of expertise
Our approach for recruiting participants was to build upon contacts established at Carnegie
Mellon University (CMU), with additional input from ICANN to fill coverage gaps. Once this
invitee list was completed, we identified remaining gaps in terms of the type of
expertise we were looking for and geographic coverage, and we filled them by searching
online for additional invitees who matched our
requirements. Table 1 lists the coverage goals for this survey’s participants.
Toward the end of the time interval over which the survey was initially conducted, and despite
the high response rate for an email-based invitation (25%, corresponding to 29
responses out of 114 invitations at the time), an initial analysis of the responses showed
that we had collected only a small number of individual misuse incidents and that we were lacking
coverage for South America. We therefore extended the duration of the survey and invited a
broader population of law enforcement experts attending the Costa Rica ICANN meeting to
participate. The required level of expertise of the additional participants was verified by survey
questions specifically structured for that purpose. Ultimately, the survey was run between
September 2011 and April 2012, and the survey results include the answers of every eligible³
participant who completed the study.
3.2. Creating a microcosm sample of the world’s registered gTLD domain names
Domain name registrations in the top 5 generic Top Level Domains (gTLDs) in the summer of
2011 exceeded 127 million (Table 2). As we aspired to draw conclusions on characteristics of
the gTLD population as a whole, we decided to take a representative sample of those domains
– a microcosm – and employ statistical inference techniques on that microcosm. A similar
technique was employed by the NORC study of the Accuracy of WHOIS Registrant Contact
Information (NORC, 2010), with the exception that this WHOIS misuse study did not attempt to
geographically stratify the sample. The microcosm was selected randomly in an unbiased
proportional way from the population of 127 million.
gTLD     Domains        Proportion
COM       95,185,529    74.54%
NET       14,078,829    11.03%
ORG        9,021,350     7.06%
INFO       7,486,088     5.86%
BIZ        2,127,857     1.67%
TOTAL    127,694,306    100%
Table 2 Number of domain registrations in the top 5 gTLDs in August 2011
We selected such a microcosm to investigate WHOIS misuse from a number of perspectives. At
the most basic level, we surveyed Registrants to learn about their experience of misuse of
personal or corporate information listed in WHOIS. We then surveyed the top 5 gTLD Registries
and the Registrars associated with the sampled domains to understand how they are protecting
the Registrants’ information from WHOIS misuse. Finally, using a subset of the aforementioned
Registrars, we registered 400 test domains using artificial Registrant information, and we
monitored instances of WHOIS misuse experienced by those domains for six months. This
experiment enabled us to correlate domain name and directly-associated or observable
Registrar/Registry characteristics (e.g., gTLD, cost, anti-harvesting) with WHOIS misuse.
³ The eligibility was dependent on the participant being at least 18 years old, and on their explicit consent to participate. These criteria are defined by CMU’s Institutional Review Board (IRB).
A proportional probability microcosm
In November of 2011 we received from ICANN, at our request, a sample of 6,000 domains,
selected randomly from gTLD zone files with equal probability of selection.⁴ Of those 6,000
domains, 83 were not within the top 5 gTLDs to be studied and so were discarded. Additionally,
we were provided with the WHOIS records associated with 5,921 of the domains, obtained over
a period of 18 hours on the day following domain sample generation. We used a WHOIS record
parser internally developed at CMU to convert the loosely formatted WHOIS records into
structured information that allowed further automated processing.
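The CMU parser itself is not described in this report. As a rough illustration of what converting a loosely formatted WHOIS record into structured information can involve, here is a minimal sketch that assumes records consist mostly of "Key: Value" lines. The function name, the comment prefixes it skips, and the sample record are all hypothetical.

```python
from collections import defaultdict


def parse_whois_record(raw: str) -> dict[str, list[str]]:
    """Parse 'Key: Value' lines of a WHOIS record into a dict of lists.

    Keys are lower-cased. Repeated keys (e.g. several name servers)
    accumulate in a list. Blank lines, comment lines, lines without a
    colon, and keys with empty values are skipped.
    """
    fields: dict[str, list[str]] = defaultdict(list)
    for line in raw.splitlines():
        line = line.strip()
        # Assumed comment/notice prefixes; real records vary by server.
        if not line or line.startswith(("%", "#", ">>>")):
            continue
        # Split on the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        value = value.strip()
        if sep and value:
            fields[key.strip().lower()].append(value)
    return dict(fields)


# Hypothetical record; every value is a placeholder.
SAMPLE_RECORD = """\
Domain Name: EXAMPLE.COM
Registrar: Example Registrar, Inc.
Registrant Name: Jane Doe
Registrant Email: jane.doe@example.com
Name Server: NS1.EXAMPLE.COM
Name Server: NS2.EXAMPLE.COM
"""

record = parse_whois_record(SAMPLE_RECORD)
print(record["registrant email"])  # ['jane.doe@example.com']
print(record["name server"])       # ['NS1.EXAMPLE.COM', 'NS2.EXAMPLE.COM']
```

Real WHOIS output differs considerably between Registrars, so a production parser would likely need per-Registrar rules on top of a generic pass like this one.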
With this set of structured WHOIS information, we created a proportional probability microcosm
of the 127 million domains, using the proportions in Table 2. In deciding the size of the
microcosm we used as a baseline the size of the microcosm in (NORC, 2010). In 2009 NORC
assembled a proportional probability sample of 2,400 domains. Taking into account the growth
in the population of domain names under the 5 gTLDs from 2009 to 2011, we created a
proportional probability microcosm of 2,905 domain names, which we used to draw a sample of
domain names for data collection.
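To make the idea of a proportional allocation concrete, the sketch below splits a microcosm of 2,905 domains across the five gTLDs in proportion to the per-gTLD counts listed in Table 2. The report does not state how fractional allocations were rounded; the largest-remainder rule used here is an assumption, as are the function and variable names.

```python
import math

# Domain counts per gTLD as listed in Table 2 (August 2011).
TABLE_2_COUNTS = {
    "COM": 95_185_529,
    "NET": 14_078_829,
    "ORG": 9_021_350,
    "INFO": 7_486_088,
    "BIZ": 2_127_857,
}


def proportional_allocation(counts: dict[str, int], size: int) -> dict[str, int]:
    """Split `size` slots across strata in proportion to `counts`.

    Largest-remainder rounding (an assumption, not taken from the report)
    guarantees that the allocated parts sum exactly to `size`.
    """
    total = sum(counts.values())
    exact = {k: size * v / total for k, v in counts.items()}
    alloc = {k: math.floor(x) for k, x in exact.items()}
    leftover = size - sum(alloc.values())
    # Give the remaining slots to the strata with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda k: exact[k] - alloc[k], reverse=True)
    for k in by_remainder[:leftover]:
        alloc[k] += 1
    return alloc


microcosm = proportional_allocation(TABLE_2_COUNTS, 2905)
print(microcosm)
```

Within each gTLD, the allocated number of domains would then be drawn at random from the domains available for that gTLD.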
Registrant sample
For the purpose of surveying domain Registrants, we needed a representative sample of the
microcosm of domain names, to identify their Registrants and invite them to participate. Our
sample design parameters are listed in Table 3. As an equal probability sample, every domain
in the microcosm has an equal probability of being selected. As with similar studies, we adopted
a confidence interval (CI) of 95% and margin of error (ME) of 5%. With the microcosm of 2,905
domains we estimated that a sample size of 340 Registrants⁵ would provide the necessary
insights for the given CI and ME. Additionally, given that survey participants would be invited
via an email invitation, we projected a 15%-25% response rate. We consequently drew a
sample of 1,619 domains from the microcosm, which, with a 21% response rate, would yield the
desired 340 Registrants. This sample did not explicitly exclude or include Proxy-registered
domain names.
⁴ At one point, we considered duration of registration as a sample parameter, but eventually decided not to use it, owing to the difficulty of properly assessing this parameter.
⁵ $\text{Sample} \ge \frac{n}{1 + \frac{n - 1}{N}}$, where $N = 2905$, $n = \frac{z^2 \times SD}{ME^2}$, $SD = p(1 - p)$, and $p = 0.5$, with $z = 1.96$ for the 95% CI and $ME = 0.05$.
Method of selection      Simple Random Sampling
Confidence interval      95%
Margin of error          5%
Expected response rate   15%-25%
Table 3 Sample design parameters
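The sample-size arithmetic can be checked with a short script. The sketch below assumes the standard Cochran formula with a finite population correction, using z = 1.96 for the 95% confidence parameter; the report's own footnote formula is only partly legible, so this is a plausible reconstruction rather than a confirmed one. Function names and the placeholder domain list are hypothetical.

```python
import math
import random


def required_sample_size(population: int, z: float = 1.96,
                         p: float = 0.5, margin: float = 0.05) -> int:
    """Cochran's sample size with finite population correction (assumed form)."""
    n0 = (z ** 2) * p * (1 - p) / margin ** 2  # size for an infinite population
    return math.ceil(n0 / (1 + (n0 - 1) / population))


def draw_sample(domains: list[str], k: int, seed: int = 0) -> list[str]:
    """Simple random sample without replacement, seeded for repeatability."""
    return random.Random(seed).sample(domains, k)


MICROCOSM_SIZE = 2905   # domains in the microcosm
INVITED = 1619          # domains drawn for the Registrant survey
RESPONSE_RATE = 0.21    # response rate assumed when sizing the draw

needed = required_sample_size(MICROCOSM_SIZE)
expected_responses = round(INVITED * RESPONSE_RATE)
print(needed, expected_responses)  # 340 340

# Placeholder names stand in for the real microcosm domains.
microcosm = [f"domain{i}.example" for i in range(MICROCOSM_SIZE)]
invitees = draw_sample(microcosm, INVITED)
```

Under these assumptions, the computed requirement matches the 340 Registrants stated above, and 1,619 invitations at a 21% response rate yield approximately that number of responses.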
Registrar/Registry sample
Before we provide the details about this sample, we need to clearly define the distinction
between Registrars and Registries. Registrars are entities that process individual domain name
registration requests. Each Registrar operates under agreement with at least one Registry – that
is, an organization responsible for maintaining an authoritative list of all domain names
registered in a given gTLD. For example, VeriSign is the Registry for all domain names
registered in the .COM gTLD; individual Registrars such as GoDaddy and Network Solutions
register .COM domain names under an agreement with VeriSign.
ICANN-accredited gTLD Registrars are responsible for collecting WHOIS information during
domain name registration, but WHOIS data storage and access varies across Registries. Thick WHOIS Registries maintain a central database of all WHOIS information associated with
registered domain names; they can respond directly to WHOIS queries with all available WHOIS
information. Thin WHOIS Registries maintain only basic WHOIS information centrally; they rely
on the Registrar for each domain name to store and supply all other available WHOIS
information.
In this study, we were concerned with the .BIZ, .INFO, and .ORG gTLD thick WHOIS Registries
and the .COM and .NET thin WHOIS Registries. Per the GNSO’s request for this study, we did
not attempt to study domain names registered under other smaller gTLDs or under ccTLDs.
The sample of Registrars and Registries that we surveyed as part of the Registrar and Registry
(R/R) survey is directly associated with the previously described sample of Registrants. We
built a sample of 111 Registrars and Registries by simply looking up the Registrars that
maintain the registration information of the 1,619 sampled domains, and the associated
Registries.
In the case of Registrar affiliates operating as resellers, the association between a domain
name and the Registrar that actually performed its registration cannot be identified in a
straightforward way. That is because WHOIS does not hold information about the Registrar-
Reseller relationship. So, for domains associated with known resellers, we used information in
WHOIS on domains’ name servers to identify some of the Registrars. This approach is based
on the assumption that in many cases domains use the DNS services of the Registrars with
which they are registered. We acknowledge that the method we described is problematic in
cases when (a) a domain has been registered with Registrar A, but the associated DNS server
is hosted by Registrar B, and (b) the Registrant delegates its domain name’s DNS services to a
company C that is not evidently associated with Registrar A. Nevertheless, we believe our
design choice provides a systematic and reproducible method of acquiring the required
information.
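The name-server heuristic described above can be sketched as a suffix lookup. The mapping table below is a small hypothetical example, not the list used in the study, and the function name is illustrative; as the text notes, the guess fails whenever DNS is hosted by a party other than the sponsoring Registrar.

```python
# Hypothetical lookup table: name-server domain suffix -> Registrar.
# The study's actual mapping is not given in the report.
NS_SUFFIX_TO_REGISTRAR = {
    "domaincontrol.com": "GoDaddy",
    "worldnic.com": "Network Solutions",
}


def guess_registrar(name_servers):
    """Return the Registrar suggested by a domain's name servers, or None."""
    for name_server in name_servers:
        host = name_server.lower().rstrip(".")
        for suffix, registrar in NS_SUFFIX_TO_REGISTRAR.items():
            if host == suffix or host.endswith("." + suffix):
                return registrar
    return None  # no known Registrar-operated name server found


print(guess_registrar(["NS51.DOMAINCONTROL.COM.", "ns52.domaincontrol.com"]))
print(guess_registrar(["ns1.example.net"]))
```

Returning None rather than a best guess keeps the unresolved cases visible, which matches the report's acknowledgement that only some Registrars could be identified this way.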
4. Law Enforcement & Researchers survey
We ran an expert survey to gather examples and statistics on illegal or harmful Internet acts (as
defined by ICANN through the Terms of Reference for this and other WHOIS studies) in general,
and more specifically those attributed to WHOIS misuse, and to broaden our perspective of
WHOIS misuse. Survey invitees included a diverse set of researchers and consumer protection,
regulatory, and law enforcement organizations.
4.1. Survey methodology and design details
For the invitation process we built upon contacts established at Carnegie Mellon University and
we requested ICANN’s input in finalizing the list of parties invited to participate in the survey. We
made significant effort to build a geographically diverse set of experts that enabled us to capture
the impact and the extent of WHOIS misuse around the world. We were also able to achieve
diversity in terms of the types of expertise of survey participants. (See Section 3.1 for a
description of the invitee list.)
We used email messages to invite individual experts to participate in the survey. The invitation
contained a short description of the study, information about the principal investigator, and links
to either participate in the survey or opt out from any future messages and reminders from us.
We also offered the option to download the questionnaire and email the responses to us. The
content of the invitation is available in Appendix A – Law Enforcement/Researcher survey:
Invitation to participate.
When participants clicked on the link to participate, they were presented with a consent form that
briefly described the procedures, requirements, risks, benefits, associated compensation (none),
and privacy assurances we offered. The text is available in Appendix A – Law
Enforcement/Researcher survey: Consent form.
The survey lasted 8 months – from August 2011 until May 2012 – and collected responses from
101 participants. The survey was implemented with SurveyMonkey and all connections to this
service were protected with SSL.6 The survey questions are available in Appendix A – Law
Enforcement/Researcher survey: Survey questions. Invitees were assured that all responses
would be treated as confidential, with survey data published only in aggregate, anonymized
form.
6 Using SSL is just one of the measures we took to preserve the confidentiality of responses. In addition, only authorized personnel (researchers on our team) handled the survey responses. At the completion of the study all responses were removed from SurveyMonkey and kept at a secure location at Carnegie Mellon.
4.2. Analysis of responses
In the following sections we first describe the demographics of the participants, which establish
their level of expertise and geographical diversity, and then we delve into the WHOIS misuse-
specific responses. We then provide an overall summary of our findings from this survey.
Demographics
The participants were initially asked to self-classify their occupation (Figure 1) and the type of
employer they are working for (Figure 2). As expected, security researchers and
government/law enforcement agents constituted about 90% of the responses. Based on the
description of the respondents’ employers, it is evident that the government view is over-
represented in responses. However, assuming that government agencies have a more
extensive and clear awareness of the misuse incidents, this characteristic of our population
sample is an acceptable bias.
Figure 1 Occupation of participants (% of participants): Security consultant 25%, Researcher (industry) 20%, Law enforcement agent 20%, Researcher (academia) 12%, Government agency 10%, Other 7%, Manager 5%, Consumer protection agency 1%.
Figure 2 Description of employer (% of participants): Governmental organization 32%, Security industry 23%, Academia 14%, Other IT industry 14%, Not-for-profit NGO 12%, Other 5%.
In terms of geographical coverage, the respondents mainly provided responses for the
American and European continents (Figure 3). While we made significant effort to invite
experts in the Asia, Africa, and Pacific regions, participation from these regions was limited.
Figure 3 Reporting regions (% of participants): North America 37%, South America 32%, Europe 18%, Central America 6%, Africa 4%, Asia 1%, Oceania 1%.
Level of expertise
In the survey we included a set of questions that would inform us about the level and type of
expertise of the participants in the subject we are studying. To that end, we used a Likert scale (1:
low – 5: high) on which participants self-reported their familiarity with the domain name registration process, the
requirement to provide personal information during that process, and the existence of the
WHOIS directory that makes this personal information available to the public.
The results (Table 4) show that the majority of respondents are cognizant of the domain
registration process (mean: 4.1, std.dev: 2.03) and of the requirement to submit personal information
(mean: 4.23, std.dev: 2.06), and that almost 60% of participants rated themselves as experts in the
specifics of the WHOIS directory (mean: 4.35, std.dev: 2.1).
We also included questions that would not only evaluate the participants’ understanding of two
domain-specific notions (WHOIS harvesting, WHOIS anti-harvesting techniques), but would also
give us insight into the level of expert awareness of WHOIS misuse and of the
techniques to thwart it.
Table 4 Familiarity with key domain registration concepts
Concept                                                           1 - Not familiar   2    3 - Know the basics   4     5 - Expert
Domain registration process                                       2%                 0%   23%                   33%   41%
Requirement to supply contact information with domain registration   1%              1%   16%                   35%   46%
Availability of contact information on WHOIS directory            3%                 1%   10%                   30%   56%
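The means quoted in the text can be approximately recovered from the rounded percentages in Table 4. The sketch below treats each row as a discrete distribution over the scores 1 to 5; normalizing rows that sum to 99% is an assumption made to absorb rounding. The spreads obtained this way come out below 1, lower than the standard deviations quoted in the text, which were presumably computed from the raw responses or with a different method.

```python
import math

# Percentages per Likert score (1..5), as listed in Table 4.
LIKERT_SHARES = {
    "Domain registration process": [2, 0, 23, 33, 41],
    "Requirement to supply contact information": [1, 1, 16, 35, 46],
    "Availability of contact information on WHOIS": [3, 1, 10, 30, 56],
}


def likert_summary(percentages):
    """Mean and standard deviation of a 1-5 Likert item given percentage shares."""
    total = sum(percentages)  # may be 99 or 101 because the shares are rounded
    weights = [share / total for share in percentages]
    scores = range(1, 6)
    mean = sum(score * w for score, w in zip(scores, weights))
    variance = sum(w * (score - mean) ** 2 for score, w in zip(scores, weights))
    return mean, math.sqrt(variance)


for concept, shares in LIKERT_SHARES.items():
    mean, std_dev = likert_summary(shares)
    print(f"{concept}: mean={mean:.2f}, std.dev={std_dev:.2f}")
```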
81% of participants stated awareness of WHOIS harvesting, and 63% of WHOIS anti-harvesting
techniques. When the participants were asked to describe some anti-harvesting techniques,
most of them mentioned CAPTCHAs, port 43 rate limiting, and privacy or proxy registration
services.
Attack experiences
In this section of the survey we sought to collect information related to direct and indirect
(reported) experiences of security-related attacks overall, before we considered the role of
WHOIS misuse. The combined measures show the prevalent types of attacks that Internet
users face in general. We then looked for relationships (if any) between
reported security incidents and WHOIS misuse.
Table 5 and Table 6 list a variety of types of security incidents that can be triggered by network
attacks; participants were asked to note the ones that they had directly (Table 5) and indirectly
(Table 6) observed. Not surprisingly, email spam is the most observed type of network attack in
both cases. It is noteworthy, though, that all types of attacks (e.g., postal spam and blackmail)
have a high rate of occurrence. Comparing the directly observed and reported security incidents,
we see a lower rate of reporting of email spam, email viruses, and postal spam. This could be
attributed to the widespread nature of these types of attacks, which could lead victims to
deem reporting such incidents unnecessary.
Table 5 Directly observed network attack experiences (overall, not specifically related to WHOIS misuse), in % of participants
Type of attack                                                   Yes    No
Email spam                                                       97%    3%
Email virus                                                      82%    18%
Malware installation/drive-by downloads                          78%    22%
Phishing                                                         77%    23%
Unauthorized intrusion on servers                                58%    42%
Postal spam                                                      55%    45%
Denial of Service                                                54%    46%
Abuse of personal data or identity theft                         49%    51%
Blackmail/ransom demands/intimidation                            36%    64%
Have experienced attacks, but prefer not to divulge specifics    26%    74%
Vishing (voicemail phishing)                                     20%    80%
Table 6 Security incidents reported to the expert (overall, not specifically related to WHOIS misuse), in % of participants
Type of attack                                                   Yes    No
Email spam                                                       74%    26%
Phishing                                                         69%    31%
Malware installation/drive-by downloads                          67%    33%
Email virus                                                      67%    33%
Abuse of personal data or identity theft                         67%    33%
Unauthorized intrusion on servers                                62%    38%
Denial of Service                                                58%    42%
Blackmail/ransom demands/intimidation                            50%    50%
Postal spam                                                      47%    53%
Vishing (voicemail phishing)                                     39%    61%
Have experienced attacks, but prefer not to divulge specifics    30%    70%
Only 40% of the respondents reported that they consider the possible contribution of WHOIS misuse when analyzing security incidents. This observation has two possible
interpretations (or a combination of both): either misuse of WHOIS data is an attack
vector that security experts underestimate and, thus, do not consider a
valuable aspect to analyze, or WHOIS misuse has been found to be insignificant in examining
security incidents. However, in a few cases, the experts reported that they were able to trace
an attack back to the public availability of WHOIS information, as described next.
Specific WHOIS misuse incidents
In Figure 4 we show that 18 respondents (18%) were able to provide details in relation to 23
individual incidents involving suspected harvesting of WHOIS information.7 The experts directly
experienced about half (45%) of those incidents, as they were the targets of the misuse. In most
of the cases, the effect of the misuse was the receipt of electronic and postal spam mail
containing marketing materials or bills for services that were not requested. However, a few of
those incidents (4) show highly sophisticated planning to extract money, distribute malware, and,
in one case, to poison DNS servers by deploying a phishing attack using WHOIS information. In
another case, Registrant information was used to register numerous domains for illegal
purposes.
7 The nature of the survey (expert survey) does not allow us to extrapolate this rate of WHOIS misuse occurrence, and it is merely an illustration of the kinds of misuse of WHOIS reported on a global scale.
Figure 4 Portion of survey respondents reporting at least one incident of WHOIS misuse (% of participants): Respondents reporting experience of WHOIS misuse incidents 18%, Respondents NOT reporting experience of WHOIS misuse incidents 82%.
The type of personal information most often reported as misused was the email address (16 cases,
or 70% of all 23 cases of misuse). However, there were many instances where Registrant name
(6 cases, 26% of all 23 cases of misuse), postal address (6 cases, 26% of all 23 cases of
misuse), and phone number (4 cases, 17% of all 23 cases of misuse) were misused as well,
either individually or in combination with other personal details. Figure 5 summarizes these
findings.
Figure 5 Breakdown of reported cases of WHOIS misuse, based on the type of personal information misused (% of cases involving specific misuse): Email address 70%, Registrant name 26%, Postal address 26%, Phone number 17%. Certain cases of misuse involved more than one type of information being misused, hence the total is greater than 100%.
Note that the percentages in Figure 5 correspond to the fraction of misuse cases; but recall that
only 18% of our respondents experienced any form of misuse at all. Furthermore, certain cases
involved multiple types of information being misused – and thus the percentages add to more
than 100%.
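The percentages in Figure 5 follow directly from the case counts given in the text. The short sketch below recomputes them and shows why they exceed 100% in total: the per-field counts add up to more than the 23 cases because one case can involve several fields. Variable names are illustrative.

```python
TOTAL_CASES = 23  # reported incidents of suspected WHOIS harvesting

# Number of cases in which each type of information was misused (from the text).
CASES_BY_FIELD = {
    "Email address": 16,
    "Registrant name": 6,
    "Postal address": 6,
    "Phone number": 4,
}

# Share of all cases involving each field, rounded to whole percent.
shares = {field: round(100 * count / TOTAL_CASES) for field, count in CASES_BY_FIELD.items()}

print(shares)
print("sum of per-field counts:", sum(CASES_BY_FIELD.values()), "for", TOTAL_CASES, "cases")
```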
In 11 (48%) of the reported WHOIS misuse cases, experts reported taking no action to mitigate
the misuse (either the effects of it, or a future reoccurrence). However, in 11 out of the 12
remaining cases, where anti-harvesting techniques were subsequently employed, WHOIS
misuse incidents were eliminated. Examples of such techniques include CAPTCHA
challenges and IP blocking, as well as one less technical mechanism in which the legal department of
the affected company identified the WHOIS harvesters and demanded that they destroy the
misused WHOIS data.
4.3. Discussion
We surveyed law enforcement and security research experts to comprehend the extent of
misuse of the publicly available WHOIS information globally. We succeeded in having a
geographically diverse sample with different types of expertise providing us with their insights on
WHOIS misuse. However, as this is an expert survey with a limited population sample, we do
not achieve statistical significance in our findings. (Note that this was not a goal, due to the
inherent nature of an expert survey.)
Overall, we found that, according to experts participating in this survey, WHOIS data misuse is
generally not considered when investigating security incidents, possibly because it is
underestimated as an attack vector. It is also noteworthy that, despite the wide net we cast in
this survey, we were able to collect only a moderate-sized list of WHOIS misuse incidents from
organizations that should have an extensive understanding of the matter. This could mean that
WHOIS misuse is either under-reported or not as prevalent as conjectured. The other parts of
this study attempt to provide a more definitive answer to this question.
We collected reports from a minority of the respondents that they had directly observed WHOIS
misuse incidents. The effects of these incidents range from simple spam to a well-orchestrated
phishing attack with the purpose of DNS poisoning. Additionally, the countermeasures deployed
in those cases (mainly CAPTCHA and IP blocking) were adequate in preventing future WHOIS
misuse incidents. Again, other parts of this study explore anti-harvesting measures more
empirically.
5. WHOIS misuse reported by Registrants
We surveyed a representative sample of top 5 gTLD domain name Registrants described in
Section 3.2 to gain a better understanding of their direct experiences with WHOIS misuse. In the
following sections we first discuss the methodology and design details of the Registrant
survey. We then describe issues that arose during the survey and affected the
representativeness of our findings. Finally, we present our findings on the ways
Registrants experience misuse of their personal information as a consequence of its public
availability in WHOIS.
5.1. Survey methodology and design details
Methodology
We used email messages to invite Registrants to participate in the survey. We acquired the
contact information through the WHOIS entries associated with the domains in our sample. The
invitation contained a short description of the study, information about the principal investigator,
and links to either participate in the survey or opt out from any future messages and reminders
from us. Because this survey was designed to be taken by non-Internet-savvy Registrants, the
invitation briefly described domain registration and the role of WHOIS data in simplified
language, named the sampled domain included in our survey, and
suggested that invitees query that domain name to see data about them published in WHOIS.
We also offered the option to download the questionnaire and email the responses to us. The
content of the invitation is available in Appendix B – Registrant survey: Invitation to participate.
When participants clicked on the link to participate, they were presented with a consent form that
briefly described the procedures, requirements, risks, benefits, associated compensation (entry
into a random prize drawing), and privacy assurances we offered. The text is available in
Appendix B – Registrant survey: Consent.
Between May 2012 and August 2012 we ran two pilots of the survey, which guided us in making
adjustments that increased the observed response rate. The actual survey lasted three and a
half months, from September 2012 until December 2012. The invitations were sent out in stages,
and each group of invitees was offered a period of 5 weeks to complete the survey. We also
scheduled weekly reminders to non-respondents, which increased the response
rate. The survey was implemented with SurveyMonkey and all connections to the service were
protected with SSL.8 Invitees were assured that all responses would be treated as confidential,
with survey data published only in aggregate, anonymized form.
Survey translations
Because the potential for WHOIS misuse is not restricted to English-speaking countries and this
survey was targeted at typical Internet users across the world, we developed translations of our
survey. We relied on native speakers of various languages from CMU for the translations. Our
translators all had a background in computer networking or computer security, which meant they
not only had the required technical background to produce meaningful translations, but they
were also able to integrate nuances of the different cultures, making international invitees
more likely to understand the survey materials and therefore more willing to participate.
Our sample of 1,619 domain name Registrants covers 81 countries. Translating the survey into every
language involved would have required a disproportionate effort, since some languages would map to only a
handful of participants. In addition, the expected low response rate of the survey (15%) was a
good indicator that a number of translations would not be necessary, as the expected number of
responses for certain languages was close to zero, regardless of the language used. We
observed that 90% of our sample was located in just 18 countries, with the other 10% spread
across 63 countries. Hence, we decided to provide translations for the top 90% of the
participants (which includes English speakers), and offer the English version of the survey to the other
10%. We offered the survey in the following languages: English, Chinese, French, Japanese,
Spanish, Italian, and Portuguese. We also intended to have German and Turkish translations,
but were not able to secure proper translations and ended up offering the English version of the
survey to participants from those two countries. This effectively reduced the portion of
participants surveyed in their expected native language to 84.9%.
As the expected response rate for the 10% of the invitees that belong to one of 63 countries is
close to zero, regardless of the language used in the survey, we do not expect that the lack of
translations for this portion affected the outcome of the survey. Invitees from Germany and
Turkey represent 5% of the sample. Considering the expected response rate, and assuming
that none of the invitees from those countries have knowledge of English (which is certainly an
extremely conservative assumption), we estimate that the upper bound of the misrepresented
population is only 0.7%.
8 See footnote 6.
Types of questions
The survey was divided into three parts. The first set of questions was designed to collect data on
the demographics of the participants. The second part of the survey covered seven
different types of misuse of WHOIS (postal spam, email spam, voice spam, identity theft,
unauthorized intrusion to servers, denial of service, and Internet blackmailing), as well as any other type of
misuse a Registrant may have experienced. We requested that the participants optionally
provide a detailed description of their experiences in any of the previous categories. Because the
survey could take up to 30 minutes to complete, and participants might therefore
abandon it before completion, we randomized the sequence of
questions for different types of misuse, in an effort to avoid biases related to the design of the
survey. The third and final part of the survey collected information related to actions taken by
the participants in response to the WHOIS misuse. The survey questions are available in
Appendix B – Registrant survey: Survey questions. Through an online glossary, we also offered
definitions for key terms used in the survey questions, to accommodate typical Internet user
participants not familiar with the technical DNS and cybersecurity jargon. The terms are
available in Appendix B – Registrant survey: Terms.
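The randomization of the misuse sections described above can be illustrated with a per-participant shuffle. This is only a sketch of the idea: the survey ran on SurveyMonkey, whose own randomization feature was presumably used, and the function name and the use of a participant identifier as a seed are assumptions.

```python
import random

# The seven misuse types covered by the second part of the survey.
MISUSE_SECTIONS = [
    "postal spam",
    "email spam",
    "voice spam",
    "identity theft",
    "unauthorized intrusion to servers",
    "denial of service",
    "Internet blackmailing",
]


def section_order(participant_id, sections=MISUSE_SECTIONS):
    """Return the misuse sections in a random order that is stable per participant."""
    rng = random.Random(participant_id)  # seeding keeps the order reproducible
    order = list(sections)               # copy, so the master list is untouched
    rng.shuffle(order)
    return order


print(section_order(1))
print(section_order(2))
```

Because each participant sees the sections in a different order, participants who drop out early do not systematically leave the same sections unanswered.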
5.2. Response and error rates
Between May and August of 2012, we ran two pilots of the Registrant survey to assess possible
issues with the design and/or implementation of the survey. One pilot involved tech-savvy
colleagues at CMU with extensive experience in user surveys. This pilot helped us identify and fix a
number of design issues. The second pilot targeted a broader audience of randomly
selected English-speaking Registrants, and was intended to assess the expected response rate.
As shown in Table 3, we expected a response rate of at least 15%. However, in this second pilot, we did
not receive any responses to the 48 invitations sent. We identified the
excessive length of the survey as a possible problem, as it apparently discouraged participation. Therefore, we
attempted to remedy this by offering entry into a random prize drawing9 to participants who
completed the survey in its entirety. Note that there was no incentive to report having
encountered misuse; respondents were only required to complete survey sections that
pertained to their experiences.
9 The prizes were one Apple iPad 3 and four Apple iPod Shuffles, selected by random drawing among all participants who completed a survey.
Overall, we sent out 1,619 invitations and had 57 participants: 52 in English, 3 in Japanese, and
2 in Spanish, achieving a response rate of 3.6%. Out of these 57 participants, we had 41
complete responses. Such a low number of collected responses affects the precision we had targeted,
namely the error rate. The resulting error rate for the statistic we are measuring (is there observed WHOIS misuse?) is 12.7%. This means that, at the 95% confidence level, the
rate of misuse in the Registrant population is expected to lie within 12.7 percentage points of the
rate measured in our sample. There remains a 5% chance that the actual rate differs from the
measured one by more than 12.7 percentage points (i.e. far more or far less misuse).
We should point out that inviting more Registrants was not expected to help us reach the goal of
5% error rate. If we were to invite every one of the 2,905 Registrants in the Registrant
microcosm, with an observed response rate of 3.6%, we would collect 105 responses. This
number of responses would result in a 9.4% error rate. This lower error rate would be
associated with a higher cost of running the survey, due to additional translations required.
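The error rates quoted above can be approximated with the usual margin-of-error formula for a proportion. The sketch below assumes p = 0.5, z = 1.96, and a finite population correction; taking the 1,619 invited domains as the population for the observed 57 responses is an assumption that roughly reproduces the reported 12.7%, since the report does not spell out its computation.

```python
import math


def margin_of_error(n, population, p=0.5, z=1.96):
    """Margin of error for a proportion, with finite population correction."""
    standard_error = math.sqrt(p * (1 - p) / n)
    correction = math.sqrt((population - n) / (population - 1))
    return z * standard_error * correction


# Observed: 57 participants out of 1,619 invitations (population choice is an assumption).
observed = margin_of_error(n=57, population=1619)

# Hypothetical: invite all 2,905 microcosm Registrants at the observed 3.6% response rate.
responses_if_all_invited = math.ceil(2905 * 0.036)
best_case = margin_of_error(n=responses_if_all_invited, population=2905)

print(f"observed: {observed:.1%}")
print(f"all invited: {responses_if_all_invited} responses, {best_case:.1%}")
```

Because the margin shrinks only with the square root of the number of responses, roughly doubling the responses still leaves the error rate well above the 5% target.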
5.3. Analysis of responses
We start the analysis of the collected responses by first giving an overview of the characteristics
of the sample in terms of the demographics as well as the knowledge reported about the
WHOIS directory. We then delve into the specific types of WHOIS misuse reported.
Characteristics of the participants
From a demographic standpoint, the participants are mainly from English speaking countries
(92%) even though we made efforts – as previously discussed – to include a wide geographical
range of participants. We collected responses from the following countries (in descending order
of number of participants): USA, Japan, United Arab Emirates, Australia, Canada, Switzerland,
Germany, Spain, UK, India, and Mexico (Figure 6). There were also respondents that did not
disclose their location.
Figure 6 Reported origin of participants (number of participants from a single country, in descending order): 31, 2, and 1 for each of nine further countries.
Although each Registrant was surveyed just once, with regard to a single sampled domain name,
the majority of the participants (60%) have more than 10 domains registered, with 9% of the
participants operating a single domain. Additionally, the domains in our sample are mainly
registered by self-described for-profit businesses or organizations (49%), followed by the
domains registered by individuals (33%), and domains registered by non-profit organizations
(14%)10 (Figure 7). Moreover, respondents reported that the largest share of the domains (46.5%) in our
sample are used for commercial activities. Finally, the great majority of the participants (93%)
indicated they are aware that any personally identifiable information included in Registrant name
and contact data can be accessed via the public WHOIS directory.
10 This survey asked Registrants to indicate whether a domain name was registered by an individual, for-profit business or organization, non-profit organization, informal group, or other. Unlike other WHOIS studies (NORC, 2010 and 2013), we did not attempt to verify these answers or to classify entities actually using domains for any stated purpose.
Figure 7 Self-reported use of surveyed domains (% of domains): For-profit business or organization 48.8%, Individual use 32.6%, Non-profit organization 14.0%, Informal interest group 4.7%.
Comparing the self-reported demographics of our survey with the WHOIS-based findings of the
WHOIS Registrant Identification Study (NORC, 2013), we see that the top two categories are
occupied by similar entities in both studies, with individual/natural person Registrants appearing
with roughly the same frequency (30% vs. 33%). In our study, the combined share of categories
representing legal person Registrants is 62% compared to 39% in (NORC, 2013).
Reported WHOIS misuse
We now present our findings for each specific type of WHOIS misuse that we studied. In each
set of questions, we first asked the participants to report whether they had experienced misuse of a
specific type of information supplied when registering their domain. If the answer was yes, we then
asked more specific questions about those misuse incidents.
25 of the respondents (43.9%) reported experiencing some kind of misuse of their WHOIS information. In Table 7 we provide a breakdown of the reported WHOIS misuse for the three
types of information published in WHOIS that are reportedly subject to misuse: postal address, email
address, and phone number.
Table 7 Breakdown of participants reporting misuse, based on the type of reported misuse (% of responses)
                                                                       Phone number   Email address   Postal address   Combined
Experienced misuse, and information was published in WHOIS only        8.8%           14%             21.1%            43.9%
Experienced misuse and attribute misuse to WHOIS                       12.3%          29.8%           29.8%            43.9%
Experienced misuse                                                     22.8%          43.9%           38.6%            73.7%
Postal address misuse
38.6% of surveyed Registrants (22) have received postal spam mailed to an address published
in WHOIS, and 29.8% (17) believed the unsolicited mail resulted from misuse of their WHOIS
postal address. In support of their suspicion, participants provided details of the unsolicited mail;
it was either directly related to one of their domains, or it advertised web services. Moreover,
21.1% (12) of the participants reported that their WHOIS postal address was not published in
any other public directory (e.g. phone book, website, etc.).
The largest group of respondents that have received postal spam (14% of total, 8) experience this
a few times a year, with 11% (6) receiving postal spam a few times a month, and 5% (3) less
than once a year. The reported subjects of the unsolicited correspondence were mainly related
to fake domain name renewals and transfers, followed by messages related to website hosting,
and search-engine optimization (SEO) services.
Email address misuse
25 Registrants (43.9%) reported receiving spam email at an account associated with a WHOIS
email address. 17 of them (29.8% of all respondents) attribute the misuse of their email address to WHOIS
because the topics of the spam emails specifically targeted domain name Registrants (e.g.
domain name transfer offers, domain name SEO offers). 14% (8) of the Registrants stated they
have not listed the misused email address in any other public directory.
The largest group of respondents (10%, 6 Registrants) identifying WHOIS data misuse as a
cause of email spam reported that they receive spam email at the email address published in
WHOIS a few times a day, followed by 9% of responses (5 Registrants) receiving unsolicited
email a few times a week. The topics of the unsolicited messages are similar to the ones
reported for postal spam.
Phone number misuse
22.8% (13) of Registrants reported receiving voicemail spam, with 12.3% (7) attributing the
spam to WHOIS misuse. They were able to associate the voicemails with WHOIS because the
caller either explicitly referred to a domain name under the Registrant’s control or
offered domain services. 9% (5) of the Registrants who claimed to have experienced the
misuse of their WHOIS phone number said they had not listed their number in any other public
directory.
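The percentages in Table 7 and in the three subsections above are all fractions of the 57 survey participants. The sketch below recomputes them from the counts given in the text, as a consistency check; the dictionary keys are illustrative labels, not terms used in the survey.

```python
RESPONDENTS = 57  # total survey participants

# Counts of Registrants per misuse type, taken from the text above.
REPORTED_COUNTS = {
    "postal address": {"experienced": 22, "attributed_to_whois": 17, "whois_only": 12},
    "email address": {"experienced": 25, "attributed_to_whois": 17, "whois_only": 8},
    "phone number": {"experienced": 13, "attributed_to_whois": 7, "whois_only": 5},
}


def as_percent(count, total=RESPONDENTS):
    """Share of all respondents, in percent, rounded to one decimal place."""
    return round(100 * count / total, 1)


for field, counts in REPORTED_COUNTS.items():
    row = {label: as_percent(count) for label, count in counts.items()}
    print(field, row)
```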
Identity theft
Two of the participants reported that they had experienced identity theft, but neither could tie this
to WHOIS misuse.
Unauthorized intrusion to servers
In order to measure the extent of misuse of WHOIS information to gain unauthorized access to
servers, we first asked the participants if they are the system administrators of Internet servers
associated with one of their registered domains. The number of participants that have this role is
very small (7%, 4), with just one person experiencing unauthorized intrusion. That respondent
could not tie the intrusion to WHOIS misuse.
Blackmail
One participant reported being a victim of blackmail¹¹ as a result of their information being
published in the WHOIS directory. The Registrant was allegedly accused by a third-party
company of violating the terms of domain registration because of the name the Registrant chose
for the domain. The Registrant said he was asked to pay some amount to settle, but after
consulting with lawyers, the Registrant decided to not take any action. After a few months, and a
series of emails from the third party, the latter stopped communicating with the Registrant. The
Registrant reported being adversely affected in terms of time (reading emails), and money
(lawyer consultation).
11 We describe this incident as reported by the Registrant, but cannot know the veracity of this claim or whether the domain name dispute was founded.
Other
Although this survey gave Registrants an opportunity to describe WHOIS misuses not otherwise
covered, no participant claimed to have experienced any other type of WHOIS misuse.
Adverse effects
In Figure 8 we present the portion of Registrants that reported they were adversely affected by
the misuse of their information, reportedly caused by WHOIS. In all types of misuse the main
adverse effect is the frustration caused by the extra time the Registrants need to go through the
spam email, postal mail, and voicemail. Spam calls associated with WHOIS misuse, even
though they only occur a few times a year, appear to cause the highest level of frustration
(12.3%), possibly because spammers directly interact with the person picking up the phone.
Spam postal mail causes the least frustration (5.3%): people are used to junk mail, and
WHOIS-associated postal spam is relatively infrequent. WHOIS-related email spam, even though it is
the most prevalent and frequent type of misuse, adversely impacted 10.5% of the Registrants. A
plausible explanation for this discrepancy is that people in general, and Registrants in this case,
are used to receiving many unsolicited emails on a daily basis. Therefore, the marginal cost of
deleting one more spam email originating from WHOIS misuse may be considered negligible
by sampled Registrants.
Figure 8 Portion of participants adversely affected by the misuse of their information published in the WHOIS, broken down into the three main types of misuse.
[Figure data: % adversely affected, by type of misuse: phone number 12.3%; email address 10.5%; postal address 5.3%]
Countermeasures
40% (8) of the 20 Registrant survey participants that have experienced at least one type of
WHOIS misuse reported having taken actions to protect themselves from additional WHOIS
misuse. On the other hand, 60% (12) of Registrants experiencing misuse did not take any
countermeasures. Registrants that took action reported utilizing a combination of the following:
Moving to a different Registrar (3).
Change misused portions of WHOIS information (4).
Change contact addresses and names with ones from a service provider (proxy services) (4).
Change contact addresses with forwarding addresses provided by a service provider (privacy services) (3).
Supply partially incorrect or incomplete information (2).
Apply spam filter or register with an identity theft protection service (5).
The last option attracted the most interest, even though it only deals with the consequences of
the misuse, rather than trying to remedy possible factors leading to the WHOIS misuse itself.
24.5% of participants (14) were aware of strategies used by their domains’ Registrars to deter
WHOIS misuse. Most of the responses cited the availability of proxy and privacy services,
along with the use of CAPTCHAs in web-based WHOIS queries, as part of the Registrars’
strategies against WHOIS misuse.
5.4. Discussion
Getting Registrants to communicate their experiences in terms of the possible misuse of their
personally identifiable information listed in WHOIS proved to be a challenging task. Even with
an incentive to participate (a raffle at the end of the survey), we were only able to collect
responses from a small portion of invitees (57 out of 340, or 17%). However, we were able to get
a clear insight into the prevalence of WHOIS misuse and the specific types of information that are
usually targeted.
Our study showed that 43.9% of Registrants claim to have experienced some type of WHOIS misuse. Given the margin of error of 12.7%, this observation neither confirms nor disproves that WHOIS misuse is affecting the majority of Registrants. It does, however, confirm the hypothesis that public access to WHOIS data leads to a measurable and statistically significant degree of misuse.
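The reasoning about the margin of error can be illustrated with a short calculation. The sketch below assumes a 95% normal-approximation (Wald) interval; the report does not state which method produced its 12.7% figure, so this textbook formula yields a slightly different value and is shown only to make the argument concrete. The function name `margin_of_error` is ours, not the report's.

```python
import math


def margin_of_error(p_hat: float, n: int, z: float = 1.96) -> float:
    """Half-width of a normal-approximation (Wald) confidence interval."""
    return z * math.sqrt(p_hat * (1.0 - p_hat) / n)


n = 57          # number of survey responses (from the report)
p_hat = 0.439   # share of Registrants reporting WHOIS misuse (from the report)

moe = margin_of_error(p_hat, n)   # assumed method, see lead-in
low, high = p_hat - moe, p_hat + moe

print(f"margin of error: {moe:.3f}")
print(f"interval: [{low:.3f}, {high:.3f}]")
# The interval straddles 50%, so a majority is neither confirmed nor ruled out.
print("interval contains 50%:", low < 0.5 < high)
# The interval stays well above 0%, so some misuse is measurable.
print("interval excludes 0%:", low > 0.0)
```

Under this assumption the interval includes 50% but excludes 0%, which matches the two conclusions drawn in the paragraph above.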
The email address is targeted most often, followed closely by the postal address. Phone numbers
are also misused, but with a much smaller occurrence and higher adverse impact per incident.
In terms of certainty of whether the misuse originates from WHOIS, postal address misuse
comes first.
Potential survey biases
We need to contemplate the biases the survey design introduced to evaluate the possibility of
over- or under-reporting of WHOIS misuse. First, by not providing translated versions of the
survey to 15% of the sample, we may have missed some incidents of misuse experienced by
Registrants that do not speak English. However, given the observed response rate (3.6%), the
expected response rate of that portion of the sample (15%) is less than 1% (3.6% of 15%, or
about 0.54%). In other words, even if we had all the possible translations, we expect that we
would not get a statistically significant number of responses from this group.
Another possible bias is that Registrants may be more willing to report a harmful act (e.g.
experience with misuse) rather than a lack of harmful incidents, which could lead to over-
representation of the incidents. In addition, we did not attempt to verify or corroborate any
WHOIS misuse incident, which could lead to false representation of the extent of WHOIS
misuse. However, the strong economic incentive we provided (entry into a random prize
drawing) was given for completing the survey, regardless of the kind of responses entered, and
should mitigate this potential source of bias.
One may argue that as this is a survey with a fair amount of technical content, it is biased
towards tech-savvy participants. We attempt to mitigate this possibility by providing explanatory
links throughout the survey. Additionally, since the registration of a domain assumes some level
of technical understanding about the Internet, we believe that the technical complexity of this
survey should be within the technical understanding of most Registrants.
Finally, as described, the great majority of the survey participants originate from North
America. This fact affects our findings in two ways. First, we are unable to analyze the
geographical distribution of misuse, as the survey suffers from coverage bias. Second, our
findings are descriptive of a narrower portion of the world population than we had wished.
As a result, the survey cannot accurately capture potential geographical diversity in the
occurrence of WHOIS misuse.
6. Assessing Registrar/Registry anti-harvesting
In this section we discuss the WHOIS anti-harvesting techniques offered by the Registrars and
Registries. We first present the results of a survey that collected information from Registrars and
Registries regarding their experiences in terms of WHOIS harvesting incidents and employed
countermeasures. Then, we empirically tested the Registrars’ infrastructures when faced with
WHOIS queries at high rates, and we present our findings here.
6.1. Survey methodology and design
This survey targeted the top five gTLD Registries and a globally diverse sample of Registrars to
collect information related to their experiences in terms of WHOIS misuse incidents, and their
efforts to counter such activity. We used email messages to invite a sample of Registrars and
Registries to participate in the survey. The invitation contained a short description of the study,
information about the principal investigator, and links to either participate in the survey or opt out
from any future messages and reminders from us. We also offered the option to download the
questionnaire and email the responses to us. The content of the invitation is available in
Appendix C – Registrar and Registry Survey: Invitation to Participate.
When invitees click on the link to participate they are presented with a consent form that
describes briefly the procedures, requirements, risks, benefits, associated compensation (none),
and privacy assurances we offered. The text is available in Appendix C – Registrar and Registry
Survey: Consent form. In the consent form we offered assurances in terms of the confidentiality
of the reported results, in that no Registrar or Registry would be mentioned explicitly, and all the
results would be presented in aggregate form.
Before running the survey we ran a pilot with a small number of Registrars to evaluate the
quality of the questions and the related material. Some questions and part of the consent form
were modified to reflect pilot-reported sensitivity, particularly around disclosure of anti-
harvesting techniques.
The Registrar survey lasted 6 months, from March 2012 until September 2012, and collected
in total 22 responses out of 111 invitees. For the invitation process, we used information
associated with the Registrant Survey sample, by identifying the Registrars and Registries that
collect and/or store WHOIS information for those sampled domains. Since our sample is
targeted based on the survey design, we do not make any claims of statistical significance in
terms of the overall gTLD Registrar and Registry population. However, we do claim that we have
collected responses from 22 out of the 107 largest Registrars and, regrettably, despite
personalized invitations and multiple follow-up phone calls to Registry contacts in March 2013,
only one of the top 5 gTLD Registries. The survey was implemented with SurveyMonkey and
all connections to the service were protected with SSL.¹²
6.2. Analysis of responses
We first describe the demographics of the Registrar/Registry survey participants in terms of their
location, the volume of domain registrations and WHOIS queries they process monthly. We then
provide an overall summary of our findings from this survey.
Demographics
The majority of the 22 Registrars that participated in the survey were located in the United States
(5), with the rest distributed across the following countries: China, Germany, Spain, Poland,
Turkey, France, India, South Korea, and UK. About 64% of the Registrars handle under 1
million domain registrations each, and 14% handle between 1 and 10 million registrations each
(Figure 9).
Registrars reported that the most popular method of querying their WHOIS databases is by port
43, which 56% of the Registrars said was used for 100,000 to 10 million queries per month.
Figure 9 Number of domains registered with Registrars participating in survey
[Figure data: % of Registrars by number of domain registrations: exactly or under 100,000: 50%; 100,001 to 1,000,000: 14%; 1,000,001 to 10,000,000: 14%; more than 10,000,000: 0%]
12 See footnote 6.
Table 8 WHOIS queries received by Registrars participating in survey. Note that not all participants answered all questions, so that the columns do not add to 100%.
Volume per month | Port 43 WHOIS protocol query responses | Web form WHOIS query responses | Bulk WHOIS data purchase transactions
Do not know or do not measure | 18% | 27% | 27%
1,000,001 to 10,000,000 | 9% | 5% | 0%
100,001 to 1,000,000 | 32% | 14% | 9%
Exactly or under 100,000 | 14% | 27% | 32%
Employed anti-harvesting techniques
57% of surveyed Registrars and Registries (13 of 23) implement at least one WHOIS
anti-harvesting technique, and in Figure 10 we present a breakdown of the techniques implemented
per Registrar/Registry. 39% (9 of 23) reportedly implement port 43 rate limiting. 56.5% (13 of
23) provide web forms for interactive WHOIS queries, and 39% (9) require an answer to a
CAPTCHA-type challenge to receive the WHOIS response. 30% of surveyed Registrars and
Registries (7) reported that they use permanent IP/domain blacklisting when necessary, while
52% (12) temporarily blacklist abusers of the service for 5 to 10 minutes.
In addition to direct anti-harvesting measures designed to deter active harvesting, we also
asked Registrars and Registries about Privacy and Proxy services that make harvesting less
desirable. Only 22% of surveyed Registrars and Registries (5) said they offer privacy services
that shield contact details of the domain Registrant except for the Registrant’s name, and 9% (2)
said they offer proxy services that completely shield all contact details. However, Registrants
can also use privacy and proxy services offered by third parties that are not Registrars or
Registries. Interestingly, when looking at the Registrant survey responses for Registrants who
chose countermeasures other than privacy and proxy services, surveyed Registrants reported
only one instance where the Registrar did not offer a privacy/proxy service.
Figure 10 Proportion of Registrars and Registries implementing a specific WHOIS anti-harvesting technique.
[Figure data: % of Registrars/Registries by type of anti-harvesting technique implemented: temporary blacklisting 52%; port 43 rate limiting 39%; CAPTCHA type challenge 39%; permanent blacklisting 30%; private registration services 22%; registration via proxy 9%]
Incidents of WHOIS misuse
We inquired about harmful events associated with incidents of alleged WHOIS misuse that were
reported by any Registrant¹³. Table 9 shows the reported events in descending order of
prevalence. At the top of the list is email spam, which was reported to 39% (9 of 23) of the
Registrars. It is followed by phishing (22%, 5), postal spam (17%, 4), email virus (9%, 2), ID
theft (9%, 2), and various forms of blackmail (9%, 2). 26% of the Registrars and Registries (6 of 23) said they were able to verify that the reported harmful acts originated from misuse of the WHOIS information.
Incidents of WHOIS harvesting and their effect in deploying new countermeasures
30% (7) of the surveyed Registrars and Registries have reportedly experienced attempts of
automated harvesting of WHOIS information from their directories, but the respondents did not
classify any as successful. The same respondents also reported that they have adopted new
anti-harvesting techniques in the past 2 years, as a result of the observed attacks. The most
prominent additions to their defenses are permanent and temporary IP and domain blacklisting
along with port 43 rate-limiting (4), privacy/proxy protection services (3), and CAPTCHA (2).
Respondents were not asked to evaluate the perceived effectiveness of these measures.
13 We did not ask Registrars or Registries about specific incidents that were discussed in Registrant survey responses.
Many participants did not provide responses in this section. That can be attributed to the
sensitive nature of the information we requested. Even though we provided assurances for the
safe handling and aggregation/anonymization of any information collected by this survey,
Registrars and Registries appear to be hesitant about providing WHOIS misuse specifics.
Table 9 Registrars receiving reports related to suspected types of WHOIS misuse
Reported type of misuse | % of participants (Registrars)
Email spam | 39%
Phishing | 22%
Postal spam | 17%
Email virus | 9%
Abuse of personal data or identity theft | 9%
Blackmail/ransom demands/intimidation | 9%
Denial of Service | 4%
Vishing (voicemail phishing) | 4%
Unauthorized intrusion on servers | 4%
Registrants have reported experiencing harmful acts, but I prefer not to divulge specifics | 4%
6.3. Testing of WHOIS query rate limiting techniques
We complement our survey with an experimental validation of methods employed by Registrars
and Registries to combat WHOIS misuse. More precisely, we performed two types of tests on a
sample of Registrars and Registries to evaluate the availability and effectiveness of WHOIS
harvesting countermeasures. First, we performed rate-limiting tests on port 43 of Registrars and
Registries, the well-known network port used for the reception of WHOIS queries. Additionally, we
carried out rate-limiting tests for interactive WHOIS query web forms provided by Registries.
Table 10 presents our findings related to the 3 thick Registries that are within the focus of this
study. Based on our test results, we observed that one Registry provides none of the tested
anti-harvesting mechanisms whatsoever; however, the other two Registries employ a
combination of anti-harvesting techniques. For instance, one Registry employs relatively strict
measures by enforcing the use of CAPTCHA, and it allows a very small number of queries to be
issued to port 43 before applying a temporary blacklist.
Type of defense | Number of Registries | Details
Registries with no observed anti-harvesting techniques on port 43 | 1 |
Registries that limit number of queries, then block further requests | 2 | Range of allowed queries: 4 - 40; observed blacklisting duration: 1 minute¹⁴
Table 10 WHOIS query rate limiting at the thick Registries
In testing each “thin WHOIS” Registry’s port 43 rate limiting, we issued a number of WHOIS
queries (1000) targeted at a specific Registrar, by requesting WHOIS information about a
domain registered at that Registrar. We then measured how soon the Registrar would block
further WHOIS query requests.
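The test procedure described above can be sketched as a small probe that counts how many queries are answered before the first refusal. This is an illustration under stated assumptions, not the script used in the study: `queries_before_block`, `simulated_server`, and the `looks_blocked` heuristic are hypothetical names and choices of ours. Only `whois_query` reflects the standard WHOIS exchange (one query line sent to TCP port 43, reply read until the server closes the connection).

```python
import socket
from typing import Callable


def whois_query(server: str, domain: str, timeout: float = 10.0) -> str:
    """Send one WHOIS query over TCP port 43 and return the raw reply."""
    with socket.create_connection((server, 43), timeout=timeout) as conn:
        conn.sendall(domain.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # server closed the connection: reply is complete
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def queries_before_block(query: Callable[[str], str], domain: str,
                         looks_blocked: Callable[[str], bool],
                         max_queries: int = 1000) -> int:
    """Return how many queries were answered before the first refusal."""
    answered = 0
    for _ in range(max_queries):
        try:
            reply = query(domain)
        except OSError:
            break  # refused, reset, or timed out: treated as blocked
        if looks_blocked(reply):
            break
        answered += 1
    return answered


def simulated_server(limit: int) -> Callable[[str], str]:
    """Hypothetical stand-in for a server that answers `limit` queries."""
    answered = 0

    def query(domain: str) -> str:
        nonlocal answered
        if answered >= limit:
            return "Query rate exceeded"
        answered += 1
        return f"Domain Name: {domain}"

    return query


def looks_blocked(reply: str) -> bool:
    # Assumed heuristic: an empty reply or an explicit refusal message.
    return not reply.strip() or "exceeded" in reply.lower()


allowed = queries_before_block(simulated_server(40), "example.com", looks_blocked)
print(allowed)
```

The probe takes the query function as a parameter so that it can be exercised against a simulated server, as here. Pointing it at a live server would mean passing a wrapper around `whois_query` with that server's hostname, which should only be done with the operator's permission.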
We tested Registrars’ rate limiting on port 43 in two stages. The first stage involved the 16
Registrars we used in the experimental study (Table 13). In this case we used domains that we
registered as part of the experiment in order to issue WHOIS queries.
In Table 11 we see that only about half of the Registrars employ rate limiting as an anti-harvesting
technique, while for the others we observed no such measures. On average, they allow 83
queries before they stop responding to additional queries. Just two Registrars in
this group provided information (as part of the WHOIS response message) related to the
duration of the temporary blacklisting, which in both cases was 30 minutes. One Registrar would
not provide responses in a timely manner, causing our testing script to identify this behavior as
a temporary blacklisting. By repeated testing with that Registrar we verified that the error was
not caused by a problem in the testing environment. It is unclear if this was an intended
behavior from the Registrar to prevent automated queries, or if it was just a temporary glitch in
their systems.
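The behavior observed from the outside (a fixed number of answered queries followed by a temporary block) is consistent with a simple server-side policy such as the one sketched below. This is a hypothetical model for illustration only. We did not inspect any Registrar's implementation, and the class name `TemporaryBlacklistLimiter` and its parameters are ours. The example values (83 queries, 30 minutes) are taken from the averages and durations reported above.

```python
class TemporaryBlacklistLimiter:
    """Answer `max_queries` per client, then refuse for `block_seconds`."""

    def __init__(self, max_queries: int, block_seconds: float):
        self.max_queries = max_queries
        self.block_seconds = block_seconds
        self._counts = {}         # client IP -> queries answered since last reset
        self._blocked_until = {}  # client IP -> time at which the block expires

    def allow(self, client_ip: str, now: float) -> bool:
        """Return True if a query from `client_ip` at time `now` is answered."""
        until = self._blocked_until.get(client_ip)
        if until is not None:
            if now < until:
                return False
            # The block has expired: forget it and start counting again.
            del self._blocked_until[client_ip]
            self._counts[client_ip] = 0
        count = self._counts.get(client_ip, 0)
        if count >= self.max_queries:
            # Limit reached: start a temporary block for this client.
            self._blocked_until[client_ip] = now + self.block_seconds
            return False
        self._counts[client_ip] = count + 1
        return True


# One query per second from a single address, 1000 queries in total.
limiter = TemporaryBlacklistLimiter(max_queries=83, block_seconds=30 * 60)
answered = sum(limiter.allow("192.0.2.1", now=float(t)) for t in range(1000))
print(answered)
```

Time is passed in explicitly so the policy can be simulated without waiting. A client sending one query per second sees exactly the pattern measured in our tests: a run of answered queries, then refusals until the block expires.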
It is noteworthy that, when more than 100 queries originate from the same IP address, two of
the Registrars tested with port 43 queries would provide only the name of the gTLD domain
name Registrant and not any other WHOIS address details like email or postal address, and
would invite interested parties to use a web form to acquire more information about domains.
We did not perform any additional measurements using this form.
14 We did not test if the duration of the blacklisting would change, after attempting to harvest WHOIS data for longer periods of time. We also did not vary our query rate to explicitly trigger or bypass rate limits.
Type of defense | Number of Registrars | Details
Registrars with no observed anti-harvesting techniques on port 43 | 6 (37.5%) |
Registrars that limit number of queries, then block further requests | 7 (44%) | Mean allowed queries: 83; standard deviation: 66
Registrars that deter automated queries by delaying their responses by a few seconds | 1 (6%) |
Registrars providing only name of Registrant when harvesting is detected¹⁵ | 2 (12.5%) |
Table 11 Aggregate results of WHOIS query rate limiting at 16 Registrars used in the experimental study.
The second stage involved the remaining Registrars associated with the domains in the
Registrant sample. Here, we queried for domains from the Registrant sample. There were a few
occasions, though, where the domains from the Registrant sample associated with specific
Registrars had expired or moved to different Registrars. In those cases we used the name of
the Registrar itself to initiate WHOIS queries, assuming that a Registrar uses its own
infrastructure to register its own domain. This assumption was not always true in the cases of
reseller Registrars. Therefore cases where a single Registrar appears multiple times were
consolidated and each Registrar counted once.
Similar to Table 11, Table 12 presents our findings for the second part of the Registrar testing.
Registrars are mainly divided between those that do not exhibit any anti-harvesting capacity and
those limiting the number of queries, followed by temporary blocking. However, most of the
Registrars (10 vs. 6) used in the experimental study offer some type of WHOIS anti-harvesting
protection, contrary to the majority of the Registrars in the second set that do not offer any
protection (48 vs. 37).
15 When more than 100 queries originate from the same IP address.
This observed difference in proportions potentially means that, if we conclude that WHOIS anti-
harvesting measures deter WHOIS misuse, then the measured misuse in our experimental
study will represent a lower bound of the total misuse occurrence. This conclusion assumes that
we test a representative sample of Registrars; this assumption appears valid, as the set of
Registrars is itself derived from a representative set of Registrants, as described in Section 3.2.
Type of defense | Number of Registrars | Details
Registrars with no observed anti-harvesting techniques on port 43 | 48 (54%) |
Registrars that limit number of queries, then block further requests | 37 (42%) | Mean allowed queries: 92; standard deviation: 99; 3 Registrars imposed 24 hour ban; 2 Registrars imposed 20 min ban; 1 Registrar imposed 1 min ban; 2 Registrars limit max requests per time frame (day/minute) per IP address; 1 Registrar exponentially increases waiting period if not obeying wait time of 1 second after 15th query
Registrars that deter automated queries by delaying their responses by a few seconds | 3 (3%) |
Registrars providing only name of Registrant | 1 (1%) |
Table 12 Aggregate query rate limiting test results for 89 Registrars appearing in the Registrant sample, but not in the experimental study. We present the findings of this group of Registrars separately, because of the differences in the testing methodology.
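The percentages in Tables 11 and 12 follow directly from the counts, and the comparison between the two groups of Registrars can be reproduced with a short tally. The sketch below uses only the counts printed in the two tables; the helper name `shares` and the shortened row labels are ours.

```python
def shares(counts: dict) -> dict:
    """Convert per-category counts into percentages of the group total."""
    total = sum(counts.values())
    return {label: 100.0 * n / total for label, n in counts.items()}


# Counts as printed in Table 11 (16 Registrars used in the experimental study).
table_11 = {
    "no observed technique": 6,
    "query limit, then block": 7,
    "delayed responses": 1,
    "name of Registrant only": 2,
}

# Counts as printed in Table 12 (89 Registrars from the Registrant sample).
table_12 = {
    "no observed technique": 48,
    "query limit, then block": 37,
    "delayed responses": 3,
    "name of Registrant only": 1,
}

for name, table in (("Table 11", table_11), ("Table 12", table_12)):
    print(name, "total:", sum(table.values()))
    for label, pct in shares(table).items():
        print(f"  {label}: {pct:.1f}%")
```

Comparing the "no observed technique" rows shows the difference in proportions discussed above: a smaller share of the experimental-study Registrars lacked any observed protection than of the remaining Registrars.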
6.4. Discussion
Possibly the most interesting finding of this part of the study is the hesitation of the Registrars
and Registries to provide insights on the reported and experienced incidence of WHOIS misuse,
making it difficult to draw representative conclusions. Considering the questions that were
answered by most of the participants, we see that WHOIS queries are mainly carried out
through port 43, followed by web forms, and then by bulk purchases. However, bulk purchases have the
potential for higher impact in misuse, as the number of WHOIS records exchanged is by
definition very large. Nevertheless, port 43 rate limiting appears to be the most widely adopted
anti-harvesting technique.
It is more insightful to focus on the rate limiting tests that we undertook. The three thick
Registries represented 14.6%¹⁶ of the total domains registered in August of 2011, and the 92
Registrars used to test thin Registries have a combined 77.4% market share¹⁷. With .COM
and .NET domains representing 85.6% of total domains for the same period, the 92 Registrars
cover 66% of the total combined .COM and .NET domain population. Combining information
from Tables 10, 11, and 12 we conclude that 51.4% of tested¹⁸ Registrars and Registries do not
employ any port 43 rate limiting technique, with the remaining 48.6% employing some type of
rate-limiting technique.
The approach pursued by the two Registrars which only provided the name of the gTLD domain
name Registrant and none of the other WHOIS details, referring interested parties to
a web form instead, appears to be an interesting compromise between protection of
personally identifiable information against port 43 harvesting and the contractually mandated
port 43 availability of WHOIS information imposed by ICANN.
In Sections 8.2 and 8.3 we study the correlation of the existence (or lack thereof) of anti-
harvesting mechanisms with the measured occurrence of WHOIS misuse from the experimental
study to evaluate the effectiveness of such measures.
16 BIZ, INFO, and ORG domains combined (Table 2).
17 Market share was estimated using the Registrars’ proportions in the Registrant sample, which is representative of the Internet.
18 Note that we did not test a representative set of Registrars and Registries, but rather a subset of Registrars and Registries used to register sample domain names.
7. Experimental Study
The experimental study attempts to complement the descriptive study by gathering a set of
controlled network measurements. The experimental study aims to capture directly incidents of
WHOIS misuse by registering 400 domain names with a variety of Registrars, using artificial
contact information, and then monitoring possible misuse of this publicly available information
that we did not publish or use anywhere else. The channels that are monitored to this end
include email, postal addresses and phone numbers.
To provide a sound basis for comparison, we built upon the framework laid out in the WHOIS
spam study (SAC023, 2007). The authors of the study set up domains in 3 gTLDs
(.info, .com, .org) and 1 ccTLD (.de), and used contact information they did not publish
anywhere else. They used four different types of registration: 1) with Registrars that provide
anti-harvesting features (e.g., port 43 rate-limiting, CAPTCHAs), 2) registration by proxy, 3)
using a combination of both methods, and 4) using no such method. They then measured the
amount of spam received in all different conditions over a 90-day period. They also provided a
simple data analysis of the different types of spam being received, distinguishing between the
different types of products advertised, and phishing scams.
Taking this study as our starting point, we expanded on it as follows. We used 16 of the most
popular Registrars identified by our Registrant Survey sample to register 400 domains (note that
this expands the study (SAC023, 2007) to NET and BIZ domains; on the other hand, we did not
register any ccTLD domain). For each domain we registered, we set up a WHOIS-published
Registrant contact email address in the form of contact@domain_name.TLD. We also set up
additional unpublished email addresses for each domain name in the form of a “catchall”
account that collects all emails sent to an email address in the form *@domain_name.TLD. The
only location where we published the Registrant email addresses was in WHOIS. In addition, we
set up incoming VoIP numbers (published as WHOIS Registrant contact information) to quantify
the amount of phone spam (and “vishing,” voicemail phishing) received. To reduce personnel
costs, the VoIP accounts were not actively monitored by an individual, but were instead
forwarded to a voicemail box that was periodically checked.
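The address scheme described above allows incoming mail to be sorted by how the sender could have learned the recipient address. The sketch below is an illustration of that idea, not the study's mail-processing code: the function `classify_recipient`, its category labels, and the example domains are hypothetical.

```python
def classify_recipient(address: str, experimental_domains: set) -> str:
    """Classify a recipient address seen on an incoming message.

    "whois-published": the contact@ address, published only in WHOIS.
    "catch-all": any other address at an experimental domain.
    "unrelated": not addressed to an experimental domain at all.
    """
    local, sep, domain = address.strip().lower().rpartition("@")
    if not sep or domain not in experimental_domains:
        return "unrelated"
    if local == "contact":
        return "whois-published"
    return "catch-all"


# Placeholder domains standing in for the registered experimental domains.
domains = {"example.com", "example.net"}

for recipient in ("contact@example.com", "sales@example.net", "someone@example.org"):
    print(recipient, "->", classify_recipient(recipient, domains))
```

Mail addressed to the published contact address points to WHOIS as the source, since that address appeared nowhere else. Mail caught by the catch-all account went to an address that was never published, so it may reflect guessed addresses or other sources instead.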
We also set up 3 PO Box accounts in the United States to detect possible spam postal mail sent
to our experimental domain Registrant names and addresses published only in WHOIS.
In the following sections we describe the design of the different components of the experiment,
followed by the findings of the experiment.
7.1. Registrars
Out of the 107 Registrars associated with the domains in the Registrant sample, we selected a
small subset to register the domains for the experiment (Table 13). We selected Registrars
based on a number of study design parameters that we developed, namely:
Registrars were ordered based on their relative popularity (market share) in our sample
of Registrants, and the most popular ones that satisfied the rest of our design
parameters were selected. This way, we were able to create a test environment that
reflects the experiences of most Registrants.
Each Registrar selected for use in this experiment should allow registration of domain
names in all 5 gTLDs in the study’s scope. This way we would be able to effectively infer
if the measured misuse was affected by the Registrar or by other parameters of the
misused domains.
Each Registrar must allow registrations by individual natural persons. Thus, we did not
test Registrars that provide domain registration services just to businesses. Including
these Registrars would introduce bias to our findings.
Each Registrar selected must allow the purchase of a single domain name, without
requiring purchase of other services for that domain (e.g. hosting).
Each Registrar selected must allow the purchase of domains without us having to reveal
the actual identities of the researchers. We identified one Registrar that required a valid
photo ID of the Registrant; they were consequently omitted from this experiment
because we could not register test domains without having to disclose our identity and
possibly introducing result bias.
We identified three Registrars that only allow domain registration through their affiliated
resellers. These are Enom, Tucows, and Wild West Domains (WWD). In these cases we
tried to identify the resellers used by Registrants in our survey by looking at the name
server information from the domains’ WHOIS records. For example, the response of a
WHOIS query for the domain BEYONDWHOIS.COM¹⁹ states that Tucows is the
domain’s Registrar. However, Tucows does not itself provide the actual domain
registration services. By looking at the name servers in the WHOIS response, we can
identify that the associated domain name server is theplanet.com, which indicates
ThePlanet (now Hover) is likely to be the reseller that was used. Whenever this method
did not reveal the reseller used to register domain names included in our survey, we
randomly selected one of these Registrars’ resellers for use in this experiment.
19 This domain was not part of our surveyed domains and is only listed here for illustration purposes.
GoDaddy, Network Solutions, Dotster, Gandi, Namecheap (Enom), Brinkster (WWD), Hover (Tucows), Tierra, 1and1, Domain People, Xinnet, Name, Joker, Gandi, Onamae, DirectNIC
Table 13 List of 16 Registrars (and affiliates) used for experimental domain registration. Together, these Registrars and affiliates cover 77% of those that appeared in our Registrant Survey sample.
7.2. Domain names
As part of the experimental study, we studied the relationship between the type of domain
name and WHOIS misuse. We registered domains that could be associated with the following
categories:
Completely random domain names composed of 5 to 20 random letters and numbers (e.g. unvdazzihevqnky1das7.biz).
Synthetic domain names (meaning domain names generated simply for the purposes of this study and registered by us) intended to look like individual persons (e.g. Randall-Bilbo.com).
Synthetic domain names composed of two randomly selected words from the English vocabulary (e.g. neatlimbed.net).
Synthetic domain names intended to look like businesses within professional categories (e.g., hiphotels.biz).
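The first three categories above lend themselves to simple generators, sketched below. This is a minimal illustration under our own assumptions and does not reproduce the procedure used to pick the study's domain names: the function names, the word list, and the name lists are hypothetical placeholders.

```python
import random
import string


def random_label(rng: random.Random) -> str:
    """Completely random name: 5 to 20 lowercase letters and digits."""
    length = rng.randint(5, 20)
    alphabet = string.ascii_lowercase + string.digits
    return "".join(rng.choice(alphabet) for _ in range(length))


def person_label(rng: random.Random, first_names: list, last_names: list) -> str:
    """Name intended to look like an individual person, e.g. First-Last."""
    return f"{rng.choice(first_names)}-{rng.choice(last_names)}"


def two_word_label(rng: random.Random, words: list) -> str:
    """Name made of two distinct, randomly selected words."""
    first, second = rng.sample(words, 2)
    return first + second


rng = random.Random(2013)                    # fixed seed for repeatability
words = ["neat", "limbed", "quiet", "river"]  # placeholder vocabulary
first_names = ["Randall", "Alice"]           # placeholder name lists
last_names = ["Bilbo", "Smith"]

print(random_label(rng) + ".biz")
print(person_label(rng, first_names, last_names) + ".com")
print(two_word_label(rng, words) + ".net")
```

Each generator takes an explicit `random.Random` instance so that a fixed seed yields a repeatable set of candidate names. Availability checks and registration itself are outside the scope of this sketch.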
In defining the characteristics of the last category, we selected a taxonomy of professional
categories that may lend themselves to spear-phishing and targeted spam. Additionally, we
hypothesized that domains that would be targeted could also be in the same categories as
domain name categories that were more likely to be abused. For instance, illicit online
pharmacies might prefer to register legitimate pharmacy domain names by harvesting related
WHOIS information. Thus, by registering pharmacy related domains, we hypothesized that we
might possibly observe higher rates of WHOIS misuse.
To this end, we consulted both APWG’s report on “Phishing Activity Trends” (APWG, 2011) and
the spam mailbox of this report’s authors. From the first source, we extracted the professional
categories that were mostly targeted by spam and phishing in the last quarter of 2010 with
percentages of more than 4% in total. More specifically these categories are: Financial services, Payment services, Gaming, Auction and Social networking. From the second
source, based on the kind of spam messages we usually receive (subject and sender) we
qualitatively decided to include professional categories related to medical services, medical equipment, hotels, traveling and delivery and shipping. We also defined three control
categories, which serve to verify whether the above categories are specifically targeted or are
merely general recipients of spam. The three categories are technology, education and
weapons.
7.3. Registrants associated with domains
All registered gTLD domain names are associated with a Registrant Name and/or Registrant
Organization that links the domain name with its beneficial domain user, or with a legal proxy of
the beneficial domain user. This Registrant information appears publicly in WHOIS, and this
experimental study was designed to test whether the data associated with a registered gTLD
domain name is misused. Therefore, for the purpose of the experiment, we created artificial
Registrant Names, one for each domain name we registered. The ultimate goal was to be able
to associate an observation of misuse with a single domain, or with a specific set of domain
names under a specific gTLD, registered at a specific Registrar. A WHOIS record is comprised
of the following pieces of information: Registrant name, postal address, phone number, and
email address. In the following sections we discuss the design details in producing each one of
those.
Names of Registrants
In generating artificial names comparable to names of real persons, we randomly combined
entries from an extensive list of common male and female first names with entries from an
extensive list of common last names. No first name – last name combination was reused, so
that we generated 400 distinct names, each of which serves as a unique association between a
domain and the Registrant.
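A minimal sketch of this pairing step follows. The name lists are short stand-ins and the function name is hypothetical; the lists actually used in the study are not reproduced here.

```python
import random


def make_registrant_names(first_names, last_names, count, rng):
    """Draw `count` distinct first name / last name combinations without reuse."""
    all_pairs = [(first, last) for first in first_names for last in last_names]
    if count > len(all_pairs):
        raise ValueError("not enough distinct combinations for the requested count")
    # rng.sample never returns the same pair twice.
    return [f"{first} {last}" for first, last in rng.sample(all_pairs, count)]


first_names = ["Yvonne", "Hilda", "Sidney", "Colin"]    # stand-in list
last_names = ["Beverly", "Lucas", "Brooke", "Addison"]  # stand-in list
names = make_registrant_names(first_names, last_names, 6, random.Random(1))
print(names)
```

With lists long enough to yield at least 400 combinations, the same call with `count=400` would give one distinct Registrant name per domain.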
Email addresses
For each domain that we registered, we set the DNS MX records to route incoming mail,
through a mail proxy server, to an email server under our full control. The benefit of running our
own mail server is that we can completely control its behavior, disabling any spam filters that
would prevent us from collecting spam email. This email server acted as an aggregator for all
the domain names that we registered. For the purpose of anonymity we rented a virtual server
with Linode.com, which acted as a mail proxy to our email server. This mail proxy server
allowed us to conceal the fact that the mail server was running on a machine at CMU, aggregating
both solicited and unsolicited email sent to all test domains.
Physical addresses
We initially put a lot of effort into finding a service that would enable us to acquire a number of
residential addresses for use with each of the registered domains.
We looked into international, as well as various national postal forwarding services. However we
were unable to find a suitable service. First, in all the countries we surveyed (including the US)
these services often require identification prior to opening a mailbox (e.g., form 1583 in the US),
and limit the number of recipients that can receive mail at this mailbox. Moreover we were
hesitant to trust mail-forwarding services from privately owned service providers. The reason for
the mistrust is that the individuals or companies providing mail-forwarding services may
themselves misuse the postal addresses and therefore contaminate our experimental results.
We decided to register three PO boxes within the US; several domains shared the same PO
box (but had different “contact names”). PO boxes can use street addresses appearing as a
residential address instead of as a PO box. This service (called “street addressing”) typically
uses the street address of the post office branch in which the PO box is situated. We provided
these street addresses when supplying contact information for inclusion in WHOIS data.
PO boxes are typically bound to the name of the person who registered them. We performed
multiple tests on the functionality of the PO boxes to see if, in practice, this was enforced. We
sent letters addressed to random names using the standard PO box addressing format as well
as the street addressing format. The purpose was to see if we would receive the letters without
any problem since the postal mail addressee’s name would not be listed as one of the owners of
the PO box. The letters were received successfully, which was a good indicator that other letters
addressed to any of the artificially created names associated with these mailboxes will likely be
accepted by the post offices (provided the volume of such mail remains low). Interestingly, we
acquired two PO boxes in California, but the same tests that worked at the other locations failed
in CA, rendering them unsuitable for the study.
Each of the three physical addresses was assigned to domains at random, with an equal
probability of selection. In other words, each address is used by approximately 33.3% of the
experimental domains.
Phone numbers
We used Skype Manager to produce phone numbers that were associated with the WHOIS
records of the experimental domains. We used a separate number for each group of domains
within the same Registrar, registered under the same gTLD. For example, all COM domains
registered with GoDaddy shared the same phone number. With this design there was a risk that
we would be unable to associate a spam voice call with a single domain name. Indeed, a phone
spammer does not necessarily mention the domain name or the person he or she is calling about.
Since only the person or domain name acts as a unique identifier, association with a single
domain would then be impossible. However, we can still compare the level of misuse within the same
gTLD across different Registrars and across gTLDs within the same Registrar. Moreover, this
re-use keeps phone costs within reason; on the other hand, registering a separate phone
number for each registered domain would have almost doubled the platform setup cost of the
experiment.
The numbers associated with each Registrant had an area code that matched the location of
the associated PO box.
7.4. Registering domains
We registered in total 400 domain names across the top 5 gTLDs (.COM, .NET, .ORG, .BIZ,
and .INFO) and the 16 Registrars in Table 13. Before registering the domains, we generated
400 unique Registrant combinations, one for each of the domains. Whenever the Registrar
required the inclusion of an organization as part of the Registrant information, we used the
name of the domain’s Registrant, regardless of the category of the domain name being
registered (i.e., none of the domains in the synthetic business name category was registered
with a synthetic business name as its associated organization).
Given the parameters described above, each Registrar was assigned a group of 25 domains.
Each group is distributed evenly across the 5 types of domain names (person name, random
name, synthetic, professional category, control category) and the top 5 gTLDs. In other words,
with each Registrar we registered 5 domains under each gTLD, and the 5 domains consisted of
1 registration per category type. As all the domains under a single combination of gTLD and
Registrar are assigned the same phone number, we utilized 80 Skype numbers for the duration
of the experiment.
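The layout described above can be written out as a small enumeration. This sketch assumes anonymized registrar labels and hypothetical identifiers for phone numbers and PO boxes. It also assumes that one PO box is drawn per Registrar and gTLD group, which is one reading of the description and of Table 14.

```python
import itertools
import random

GTLDS = ["com", "net", "org", "info", "biz"]
CATEGORIES = ["person name", "random name", "synthetic",
              "professional category", "control category"]
REGISTRARS = [f"Registrar {i}" for i in range(1, 17)]  # anonymized labels
PO_BOXES = ["PO1", "PO2", "PO3"]


def build_plan(rng):
    """Enumerate one domain per (Registrar, gTLD, category) combination."""
    groups = {}
    for registrar, gtld in itertools.product(REGISTRARS, GTLDS):
        # One phone number and one randomly chosen PO box per (Registrar, gTLD) group.
        groups[(registrar, gtld)] = {
            "phone": f"phone-{len(groups) + 1}",
            "po_box": rng.choice(PO_BOXES),
        }
    plan = []
    for registrar, gtld, category in itertools.product(REGISTRARS, GTLDS, CATEGORIES):
        group = groups[(registrar, gtld)]
        plan.append({"registrar": registrar, "gtld": gtld, "category": category,
                     "phone": group["phone"], "po_box": group["po_box"]})
    return plan


plan = build_plan(random.Random(13))
print(len(plan), "domains,", len({p["phone"] for p in plan}), "phone numbers")
```

The counts follow directly from the design: 16 Registrars times 5 gTLDs times 5 categories gives 400 domains, and 16 Registrars times 5 gTLDs gives 80 phone numbers.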
For example, given a Registrar R, we registered five domains in each of the five categories of
domain names. Each set of 5 domains has one domain under each of the five
gTLDs: .COM, .NET, .ORG, .BIZ, and .INFO. We created a total of five phone numbers, one for
each of the five gTLDs, and reused them across the different domain category types. Also
domains in each of the five gTLDs were associated randomly with one of the three PO box
addresses. Table 14 provides an example of the set of information required to complete the
registrations with one Registrar. In this experimental study, we used 16 blocks of information
similar to the one presented in Table 14. For each domain, we used the same Registrant
Names, postal/email addresses, and phone numbers for all types of WHOIS contacts (i.e.
Technical and Billing contacts).
The design of the experimental study differs from the design of the Registrant sample selection
in terms of the proportions used to select/register domains. In selecting the Registrant sample,
we utilized the method of proportional probability sampling, with proportions selected to be
equal to the ones on the Internet (see Table 2). On the other hand, the methodology we used to
register domains for the experimental study is similar to an equal probability sample, as there
was an equal number of domains registered in each gTLD. This design choice was motivated by
a desire to balance the costs of running the experimental study while retaining scientific
meaning. We will relate the rates of measured WHOIS misuse with the reported rates of WHOIS
misuse in Section 7.7.
7.5. Duration of the experiment
We started registering domains in the last week of June 2012, and we completed the
registrations four weeks later. The main difficulty that we faced was the time required to
manually register the 400 domains in different registration environments (Registrars); little to no
automation was available across such a range of Registrars. The experiment lasted six months,
ending in the last week of January 2013. All experimental domains were registered using
commercial services offered by Registrars (i.e., we did not use free solutions such as those
provided by DynDNS), and none of the experimental domains was suspended or deleted during
this test period.
Domain name | Domain category | gTLD | Registrant name and organization | Phone number | PO box address | Contact email
theo-lovell | person name | com | Yvonne Beverly | pn1 | PO1 | [email protected]
farouk-head | person name | net | Miek Luo | pn2 | PO1 | [email protected]
neville-llewellyn | person name | org | Hilda Lucas | pn3 | PO3 | [email protected]
sedat-brandon | person name | info | Sidney Charizard | pn4 | PO2 | [email protected]
hubert-germaine | person name | biz | Vivek Christian | pn5 | PO3 | [email protected]
MoK8XlJ7BD | random name | com | Izumi Brooke | pn1 | PO1 | [email protected]
w6ilHlOhVy4PuO8s3gU8 | random name | net | Colin Yushchenko | pn2 | PO1 | contact@w6ilHlOhVy4PuO8s3gU8.net
X6fIq96VvTae | random name | org | Kinch Dana | pn3 | PO3 | [email protected]
frTIg6FfxOWZTe5DL9Xgu4 | random name | info | Tyler Gill | pn4 | PO2 | contact@frTIg6FfxOWZTe5DL9Xgu4.info
6TkOqIg | random name | biz | Kirk Xuereb | pn5 | PO3 | [email protected]
shescoundrel | synthetic | com | Donna Langley | pn1 | PO1 | [email protected]
screwturned | synthetic | net | Sharon Gasparian | pn2 | PO1 | [email protected]
lifethirsting | synthetic | org | Bonnie Addison | pn3 | PO3 | [email protected]
steamertraffic | synthetic | info | Trevor Ryan | pn4 | PO2 | [email protected]
gazellebrown | synthetic | biz | Alexis Chandler | pn5 | PO3 | [email protected]
pediatrictherapyequipment | prof categories | com | Hein Clayden | pn1 | PO1 | contact@pediatrictherapyequipment.com
hotelspell | prof categories | net | Pandora Angelopoulou | pn2 | PO1 | [email protected]
chiropractictherapyequipment | prof categories | org | Chet Miyazaki | pn3 | PO3 | contact@chiropractictherapyequipment.org
chattanoogatherapyequipment | prof categories | info | Barrio Bruce | pn4 | PO2 | contact@chattanoogatherapyequipment.info
hiphotels | prof categories | biz | Stevan Stratford | pn5 | PO3 | [email protected]
techdaft | control | com | Liyuan Thornton | pn1 | PO1 | [email protected]
teachreel | control | net | Molly Tattersall | pn2 | PO1 | [email protected]
techyank | control | org | Vicki Stoner | pn3 | PO3 | [email protected]
weaponsmob | control | info | Dewey Fermi | pn4 | PO2 | [email protected]
fastweapons | control | biz | Mechael Mereon | pn5 | PO3 | [email protected]
Table 14 Example of domain registration details for a single Registrar. Identical information was used for all types of contacts (e.g. Technical and Billing).
7.6. Breakdown of the collected instances of misuse
We next report the level of WHOIS misuse we experienced. More specifically, we report the
amounts of postal, email, and voicemail spam we observed, and we try to characterize different
types of spam within each set. We also analyze the email spam we collected to characterize the
incidents of phishing and malware distribution.
Postal address misuse
As explained in previous sections, we operated three post office (PO) boxes in the state of
Pennsylvania, which we associated randomly with the artificial Registrant identities. The PO box
addresses used were not published in combination with our test domain name in any other
public directory, other than WHOIS. We monitored the contents of the PO boxes biweekly from
June 2012 until January 2013. We categorized the content either as generic spam or targeted spam. We placed mail in the first group if the receiver was not explicitly mentioned by name. A
common example in this category is mail addressed to the generic “PO Box holder.” Two out of
the three boxes would receive this kind of spam mail periodically, and this was observed with
every inspection of those boxes. In addition, there were cases where we received postal
mail addressed to a name that did not match any of the Registrant names associated with a
specific PO box. A reasonable explanation for these instances is that previous owners of the PO
boxes still had mail sent to that location. This kind of mail is still considered
generic spam, and was observed in one of the PO boxes.
TLD | Domain name category | Purpose of postal mail
COM | Professional (auctions) | SEO services
NET | Person name | SEO services
ORG | Person name | Product offer
INFO | Professional (auctions) | Shipping services
Table 15 Observed postal spam attributed to WHOIS misuse. First two rows (same color) represent same Registrar.
We received in total four pieces of postal mail that we classified as targeted WHOIS spam
(Table 15). Two out of four were from the same company; they were received in the same
collection period, and were both dated September 14th 2012. The purpose of both letters was to
sell advertising services for the domain names. The company collects a one-time fee of $85
USD, in exchange for submitting the domain names to search engines and performing search
engine optimization (SEO) on the domains.
Both domains subjected to this postal misuse were registered using the same Registrar. The
third piece of postal mail spam was received from a Registrar towards the end of the experiment
and targeted a domain registered with a different Registrar. The purpose of the letter was to
enroll the recipient in a membership program that provides easy means of sending postal mail
without the need to interact with the US post office. The fourth piece of postal mail spam was
received very close to the end of the experiment and offered a free product in exchange for a
website sign-up.
Surprisingly, the third PO box only received three pieces of generic spam throughout the
duration of the experiment.
Overall, the volume of targeted WHOIS postal spam is very low (4 pieces, 10%), compared to
the 34 pieces 20 of generic postal spam (90%). However, this may be due to the small
geographical diversity that we were able to achieve.
Email address misuse
Each of the 400 domains we registered for the purpose of this experiment has a set of published
and unpublished email addresses. A published email address has the form
contact@domainname (e.g. [email protected]) and is listed only in the WHOIS record of
each domain. However, any email sent to a different recipient under the same domain (e.g.
[email protected]), will still be collected for later analysis; all such email addresses are
deemed “unpublished” addresses, since they are not advertised anywhere, including WHOIS.
By collecting unsolicited emails sent to both published and unpublished addresses, we are able
to provide a meaningful comparison of WHOIS-related spam, and generic (random) spam.
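The distinction between the two kinds of address can be stated as a one-line rule. The sketch below is an illustration only: the function name is hypothetical and the addresses use reserved example domains rather than any domain from the study.

```python
def address_kind(recipient):
    """Classify a recipient as 'published' (contact@<domain>) or 'unpublished'."""
    local_part, _, _domain = recipient.strip().lower().partition("@")
    # Only the contact@ mailbox appears in the WHOIS record of each domain.
    return "published" if local_part == "contact" else "unpublished"


for recipient in ["contact@test-domain.example", "sales@test-domain.example"]:
    print(recipient, "->", address_kind(recipient))
```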
To classify incoming email either as solicited email or as unsolicited bulk email (spam), we used
the definition of spam offered by (Spamhaus.org, 2013). In short, an email is classified as spam
if it is unsolicited, and the recipient has not explicitly consented to receiving such email.
We adapted this definition to our experiment, by considering email originating from each
domain’s respective Registrar as not spam, while any other email is classified as spam. Indeed,
in many cases, the contract between Registrar and Registrant, which is established upon
20 (30 weeks x 1 piece of spam x 2 PO boxes) + 1 of spam from third PO box
registering a domain, gives permission to the Registrar to send informative emails. Since the
Registrant enters freely into this agreement (and can exit freely) by providing the published
email address to receive such notifications, we did not consider email received at the published
addresses from the Registrar as spam. We identified email originating from a domain’s
Registrar by looking at the email headers, extracting the domain part of the sender’s email
address, and comparing this string with the recipient’s respective Registrar.
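The header check described above might look like the following sketch, which uses Python's standard `email` package. The function names and the example domains are hypothetical, and treating subdomains of a Registrar's domain as a match is an assumption of this sketch, not something the study states.

```python
from email import message_from_string
from email.utils import parseaddr


def sender_domain(raw_message):
    """Extract the domain part of the address in the From header."""
    message = message_from_string(raw_message)
    _display_name, address = parseaddr(message.get("From", ""))
    return address.rpartition("@")[2].lower()


def is_from_registrar(raw_message, registrar_domains):
    """True if the sender's domain is one of the Registrar's domains or a subdomain."""
    domain = sender_domain(raw_message)
    return any(domain == d or domain.endswith("." + d) for d in registrar_domains)


raw = (
    "From: Support <news@mail.registrar.example>\n"
    "To: contact@test-domain.example\n"
    "Subject: Renewal notice\n"
    "\n"
    "Your domain is due for renewal.\n"
)
print(is_from_registrar(raw, {"registrar.example"}))
```

Because sender addresses in headers can be forged, a check of this kind can only approximate the true origin of a message.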
Throughout the experiment, published email addresses received 7,609 unsolicited emails, of
which 7,221 (95%) were classified as spam (Figure 11). Of the total 400 domains, 95%21
received unsolicited emails at their published addresses, and 71% received spam email
(Figure 12). Interestingly, 80% of spam emails collected during this experiment were addressed
to the 25 domains of a single Registrar (Registrar 13). As all the domains across all gTLDs
registered with the specific Registrar are equally affected by WHOIS misuse, this observation
does not affect the statistical validity of the results we present, except when it is explicitly stated
herein.
All 1,872 emails received at the unpublished addresses were classified as spam22, and they
were targeted to 15% of the domains23. This observation is a consequence of the definition of
spam we use; since the unpublished addresses are not listed in any public directory and not
shared in any way, all emails received are unsolicited, and therefore counted as spam. Out of
our 400 domains, two specific domains received a disproportionate amount of spam emails in
their unpublished mailboxes. We ascribed this to the possibility that 1) these domains had been
previously registered, and 2) previous owners of those domains were targeted, so that we
inherited the misuse along with the domains. It is thus highly plausible that the misuse
experienced there had a source different from WHOIS misuse. Looking at historical WHOIS
records24 confirmed that both domains had been previously registered (12 years prior, and 5
years prior, respectively) which lends further credence to our hypothesis.
21 All the received emails were unsolicited, and some of them were classified as spam, based on the definition. Therefore, this number is composed by the proportion of domains receiving unsolicited emails and the proportion of domains receiving spam. 22 None of those emails originated from a domain’s Registrar. 23 85% of the domains did not receive any emails at their unpublished email addresses. 24 http://www.domaintools.com/research/whois-history/
Figure 11 Breakdown of the emails collected across all domains, based on their classification.
Email classification | In published addresses | In unpublished addresses
Total emails not classified as spam | 5% | 0%
Total emails classified as spam | 95% | 100%
Figure 12 Breakdown of experimental domains based on the emails they receive. The difference between public and private addresses in receiving email spam is statistically significant.
Domain classification | In published addresses | In unpublished addresses
Domains receiving emails not classified as spam | 24% | 0%
Domains receiving spam | 71% | 15%
Domains not receiving any unsolicited email | 5% | 85%
Number of domains / Total domains (in category)
Domain grouping | Published addresses: received email but no spam | Published addresses: with spam | Unpublished addresses: received email but no spam | Unpublished addresses: with spam
gTLD
COM | 22/80 | 48/80 | 0/80 | 18/80
NET | 19/80 | 52/80 | 0/80 | 11/80
ORG | 16/80 | 45/80 | 0/80 | 11/80
INFO | 20/80 | 62/80 | 0/80 | 15/80
BIZ | 20/80 | 75/80 | 0/80 | 5/80
Domain name category
Control | 20/80 | 58/80 | 0/80 | 12/80
Synthetic | 17/80 | 56/80 | 0/80 | 13/80
Person name | 21/80 | 51/80 | 0/80 | 10/80
Random | 19/80 | 57/80 | 0/80 | 10/80
Professional categories | 20/80 | 60/80 | 0/80 | 15/80
Registrars
Registrar 1 | 0/25 | 23/25 | 0/25 | 1/25
Registrar 2 | 0/25 | 23/25 | 0/25 | 1/25
Registrar 3 | 2/25 | 4/25 | 0/25 | 3/25
Registrar 4 | 0/25 | 5/25 | 0/25 | 3/25
Registrar 5 | 21/25 | 24/25 | 0/25 | 8/25
Registrar 6 | 0/25 | 19/25 | 0/25 | 16/25
Registrar 7 | 0/25 | 25/25 | 0/25 | 0/25
Registrar 8 | 4/25 | 20/25 | 0/25 | 1/25
Registrar 9 | 0/25 | 18/25 | 0/25 | 7/25
Registrar 10 | 0/25 | 13/25 | 0/25 | 0/25
Registrar 11 | 0/25 | 23/25 | 0/25 | 0/25
Registrar 12 | 22/25 | 17/25 | 0/25 | 6/25
Registrar 13 | 25/25 | 25/25 | 0/25 | 13/25
Registrar 14 | 22/25 | 16/25 | 0/25 | 0/25
Registrar 15 | 0/25 | 11/25 | 0/25 | 1/25
Registrar 16 | 1/25 | 16/25 | 0/25 | 0/25
Table 16 Breakdown of collected email based on the nature of the targeted email address (i.e. published or not), the gTLD, the type of the domain name, and the Registrar.
In Table 16 we provide a breakdown of the collected emails based on the domain gTLD, the
domain name category, and the Registrar of the domain.25 The middle column shows the
domains receiving email in their published mailboxes, and the right column domains receiving
email in their unpublished mailboxes. Each of the two columns is further divided between
domains receiving email other than spam, and domains that received spam. Note that these
categories are not mutually exclusive. Each fraction in the table represents the number of
domains in a specific category that received such email, out of the total number of domains in the
category. For example, of the 80 .COM test domains, 48 received spam at a WHOIS published
email address and 18 received spam at an unpublished email address. 22 of those .COM test
domains received non-spam email at the WHOIS-published address.
As expected, looking at the unpublished mailboxes of all domains in all categories we only
observe non-WHOIS-misuse spam. Across all categories there is a seemingly higher
occurrence of spam email in the published mailboxes compared to the unpublished ones. Using
a chi-square test, we find that the difference in proportions of received spam between published
and unpublished addresses is statistically significant when considering the gTLD (p < 0.05) and
the Registrar (p < 0.001), but not the domain name category (p > 0.05). In other words, WHOIS
misuse is present at measurable, statistically-significant levels (as shown by the difference
between published and unpublished addresses receiving spam); domain name category does
not seem to impact the amount of misuse, while the choice of gTLD and Registrar can increase
the rate of WHOIS-attributed email misuse.
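To make the gTLD comparison concrete, the sketch below computes a Pearson chi-square statistic on a 2 x 5 contingency table built from the "with spam" columns of Table 16. The report does not spell out how its test was set up, so this table construction is an assumption and the function name is hypothetical. The critical value 9.488 is the standard 5% threshold for 4 degrees of freedom.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic


# Domains with spam per gTLD (COM, NET, ORG, INFO, BIZ), taken from Table 16.
published_spam = [48, 52, 45, 62, 75]
unpublished_spam = [18, 11, 11, 15, 5]

statistic = chi_square_statistic([published_spam, unpublished_spam])
CRITICAL_5PCT_DF4 = 9.488  # (2 - 1) * (5 - 1) = 4 degrees of freedom
print(round(statistic, 2), statistic > CRITICAL_5PCT_DF4)
```

Under this construction the statistic exceeds the 5% critical value, which agrees with the p < 0.05 result reported for gTLD, although the authors' exact procedure may differ.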
In Section 8.2 we study in detail which parameters of a domain (e.g. price of registration, gTLD,
anti-harvesting techniques employed by the Registrar/Registry, etc.) affect the rates of WHOIS
attributed email misuse.
Attempted malware delivery
We used VirusTotal26 to scan all collected files received as email attachments during the first
four months of the experiment, and to detect malicious software. There is a great variety of
malicious software (malware) that can infect any computer, and which can place the infected
computer under the control of an attacker. For example, so-called “backdoors” can grant the
25 This experiment does not aim to identify specific Registrars, but to look for patterns affecting WHOIS misuse. Therefore, the name of the Registrar is not explicitly provided, and we instead offer anonymized identities. 26 https://www.virustotal.com/
attacker unrestricted remote access to the infected computer. The attacker may use the
backdoor, for example, to steal passwords or personally identifiable information. We followed, in
this respect, ICANN’s Terms of Reference for the various WHOIS studies (ICANN, 2009) in
which the existence of malware in email spam is associated with attempts for identity theft.
In total, we received 496 email messages with any type of attachments, with only 10 of those
targeting published email addresses. These attachments were sent to 10 distinct domains
registered with the same Registrar, and were sent by the same sender. However, all 10
attachments were innocuous, with the content being some form of newsletter.
Of the 486 attachments that were sent to the unpublished email addresses, 76 were found to
contain malware. The recipients of these infected emails were three of our experimental
domains. The analysis of the malware indicated that the 76 infected attachments were
associated with 12 well known families of malware, with 10 being different variants of Trojans.
However, none of the infected attachments targeted any of the published email addresses, and
as such we did not observe any WHOIS attributed malware delivery. This is in line with the
findings of the Registrant survey.
Phone number misuse
As we experience in our daily lives, we often receive phone calls that were intended for a
different recipient, either because of misdialing or because the caller has the wrong contact
information of the person they are trying to reach. These cases, while they represent unwanted
calls, can hardly be classified as WHOIS originating spam. On the other hand, if the call is
unsolicited and the caller offers Internet services (e.g. website development) or is starting a
discussion about a domain name, then we can, with reasonable assurance, associate the call
with WHOIS misuse. There are also instances where the call is unsolicited, and the caller offers
services unrelated to WHOIS. However it is unknown if the caller harvested the number from
WHOIS, or if it was obtained in some other way (e.g., exhaustive dialing of known families of
phone numbers). The experimental design did not involve registering additional unpublished
phone numbers (similar to the unpublished email addresses in the previous section), and, therefore,
we cannot compare the findings of this section to a baseline voicemail spam rate.
In the context of this experiment, we define voicemail spam associated with WHOIS misuse as
any voicemail that has intelligible content and whose content makes reference to a domain name or
Internet related services, or in which the caller states that he found the number online. Voicemails that
do not fall in this category are either categorized as not spam (e.g. misdialing), or as possible
spam (i.e. spam not clearly associated with WHOIS). There is a special case where the caller
makes mention of the name of the person they are trying to reach. In this event, we cross
checked our database of experimental Registrant identities, and if there was a match, the
voicemail was automatically classified as spam, regardless of the content. Voicemails that had
no content or where the content was not comprehensible are shown below but classified
separately from voice spam and non-spam.
We present the overall classification of the received voicemails in Figure 13. We collected in
total 674 voicemails throughout the experiment, and we classified 6% (39) as spam, 15% (102)
as not spam, and 4% (28) as possible spam. An additional 38% (256) of voicemails contained a
recorded message inviting the recipient of the call to “press one to accept”. We started receiving
this type of voicemail on a daily basis, several times a day, starting during the second month of
the experiment. All these voice messages – but one – were directed to a single number we used
in the WHOIS records of the .NET domains we registered at one Registrar. Even though the
content was not adequate to characterize these messages, the persistency indicates that there
is no randomness and we therefore placed them in a special category: interactive spam. Finally,
37% (249) of voicemails were not classified due to the lack of content.
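The percentages above follow from the raw counts, as the short check below shows. The dictionary keys are just labels for the five classes named in the text.

```python
voicemail_counts = {
    "interactive spam": 256,
    "blank": 249,
    "not spam": 102,
    "spam": 39,
    "possible spam": 28,
}

total = sum(voicemail_counts.values())
shares = {label: round(100 * count / total) for label, count in voicemail_counts.items()}

print(total)
for label, share in shares.items():
    print(f"{label}: {share}%")
```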
Figure 13 Characterization of 674 collected voicemails. The 2 categories on the right represent WHOIS-attributed phone number misuse.
Of the 39 pieces of voicemail spam, 77% (30) had the same caller and were originating from the
same company selling website advertising services. This caller placed two phone calls in each
of the numbers, one as an initial contact and one as a follow up. The caller targeted .BIZ
domains registered with five Registrars, .COM domains registered with four Registrars,
and .INFO domains registered with six Registrars. In total, domains registered with 11 out of the
16 Registrars used in the experiment, received this call.
The remaining spam voicemail targeted .BIZ domains registered with four Registrars, .COM
domains registered with three Registrars, and .INFO, .NET, and .ORG domains associated with
1 Registrar each. In one case we observed a particularly elaborate attempt to acquire personally
identifiable information.
In Figure 14 we present a breakdown of domains receiving voicemail spam based on the gTLD
of the domains. .COM, .INFO, and .BIZ received 94% of spam voicemail, with the other 6%
equally divided between .NET and .ORG domains. Overall, 30% of all domains, registered with
14 out of 16 Registrars were affected by WHOIS-originated voice spam misuse.
Figure 14 Breakdown of domains receiving voicemail spam per gTLD.
In Section 8.3 we study in detail which domain characteristics (gTLD, domain category,
Registrar, and domain price) affect the WHOIS attributed voicemail spam rates.
Other types of misuse
We next briefly discuss our findings concerning the other types of WHOIS misuse covered in
the Registrant survey.
Throughout the experimental period we did not detect any unauthorized intrusion or attempts at
Denial of Service attacks against the servers involved in the experiment. It is possible that using
artificial Registrant identities and a proxy to hide the real IP address of the servers acted as a
deterrent to such attempts. However, we cannot validate this hypothesis.
Figure 14 data (% of domains receiving voicemail spam, by gTLD): .COM 23%, .NET 3%, .ORG 3%, .INFO 33%, .BIZ 38%.
We did not observe any blackmailing attempts through the inspection of voicemails, or postal
mail. On the other hand, we could not analyze the contents of each spam email, due to their
sheer volume, and the probabilistic nature of automated detection methods.
Similarly, we did not actively look for cases of identity theft, as this would have necessitated
significantly greater resources to discover.
7.7. Overall experiment incidents of WHOIS misuse
In Table 17 we present the proportions of experimental domains experiencing misuse based on
the harmful act, grouped by their respective gTLDs. The three measured harmful acts are listed
from left to right in a descending order of reported impact (see Figure 8). It is evident that email
WHOIS misuse impacts most of the domains. Voice spam (Phone number) WHOIS misuse
comes right after email misuse in terms of frequency, and postal spam (Postal address) WHOIS
misuse comes last.
Since, as stated in Section 7.4 above, each phone number used in the experimental study is
associated with the five domains registered under the same gTLD and Registrar, we cannot
directly derive the exact amount of phone number misuse each domain experiences. However,
we can get a lower bound on the amount of phone number misuse by considering that each
instance of misuse affects only a single domain, and an upper bound by
assuming that all five domains associated with that specific number are targeted. Since the true
value lies somewhere in between, we report both values in Table 17 and we use the average for
further analysis.
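As a minimal sketch of the bounding argument above, the snippet below takes a lower and an upper bound (in percent) and returns their midpoint, the value used for further analysis. The function name `midpoint` is hypothetical, and the example bounds are the BIZ figures reported in Table 17.

```python
def midpoint(lower_pct: float, upper_pct: float) -> float:
    """Average of the lower and upper bound on phone number misuse (in percent)."""
    if lower_pct > upper_pct:
        raise ValueError("lower bound must not exceed upper bound")
    return (lower_pct + upper_pct) / 2


# Bounds reported for BIZ in Table 17: min 10%, max 50%.
biz_estimate = midpoint(10, 50)
print(f"BIZ phone number misuse estimate: {biz_estimate:.0f}%")
```

The same calculation applied to the COM bounds (5% and 25%) gives the 15% listed for that gTLD.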
Test domains experiencing WHOIS misuse (by type of harmful act; see footnote 27)

| TLD | % Domains experiencing phone # misuse | % Domains experiencing email misuse | % Domains experiencing postal misuse |
|---|---|---|---|
| BIZ | 30% (max 50%, min 10%) | 94% | 0% |
| COM | 15% (max 25%, min 5%) | 60% | 1% |
| INFO | 23% (max 38%, min 8%) | 78% | 1% |
| NET | 4% (max 6%, min 1%) | 65% | 1% |
| ORG | 4% (max 6%, min 1%) | 56% | 0% |
| Total | 5% | 71% | 1% |

Table 17 Portion of domains affected by WHOIS misuse based on TLD and type of harmful act. The harmful acts are ordered based on the reported impact (from the Registrant survey) in decreasing order from left to right.
27 None of the experimental domains experienced any incident of malware delivery or identity theft.
The proportions in the table above do not take into consideration the differences in gTLD
distribution in the Internet, presented in Table 2. Therefore, the table merely presents our findings
and does not allow for a meaningful comparison of the measured WHOIS misuse with the misuse
reported by the Registrars in their survey responses. In Section 8.1 we offer a comparative
analysis of the empirically measured WHOIS misuse and the misuse reported in the
Registrant survey, weighting the empirically measured WHOIS misuse appropriately.
7.8. Discussion
We found evidence that WHOIS data publication contributes measurably to the misuse of
personal information of Registrants. Our experimental domain names’ postal addresses, email
addresses, and phone numbers listed in WHOIS were misused by third parties to advertise
unsolicited services. None of our experimental domain names experienced attempted malware
delivery or identity theft.
The amount of postal spam we received is too low to assess the significance of the parameters
associated with a domain name (such as professional category of domain name) in relation to
the possibility of receiving postal spam. However, the amount of email and voicemail spam
attributable to WHOIS misuse was notably higher than other (non-WHOIS-misuse) email and
voice spam measured for these domains. We were also able to infer that both the choice of
gTLD and the choice of Registrar were statistically significant in the rate of email spam
attributable to WHOIS misuse. In section 8 we perform a more in depth analysis of the extent to
which gTLDs, Registrars, anti-harvesting measures, and other domain parameters impact the
rates of WHOIS-related misuse.
Given the design of this experiment it would be possible to state that there is a causal
relationship between the public availability of personally identifiable information in WHOIS and a
Registrant’s experience of spam email and voicemail. However, there is one additional
explanation for the WHOIS misuse that cannot be controlled for: Registrars may be providing
Registrant information collected during domain registration to third parties (e.g., through bulk
WHOIS access, possibly through private communication with resellers). Investigating this
possibility is out of the scope of this study.
8. Comparative result analysis
In this section we provide a holistic analysis, combining data from different sections, to assess
the hypothesis that there is a statistically significant relationship between WHOIS publication of
Registrant personal information and its apparent misuse by third parties.
8.1. Correlation between measured and reported incidence of misuse
In Section 5 we collected the experiences of Registrants related to the misuse of their personal
information that they attributed to WHOIS publication. Later in Section 7, we presented the
experimental findings that showed that there is indeed measurable misuse attributable to the
availability of such information in WHOIS. In this section we compare the reported rates of
misuse with the measured rates.
We did indeed measure instances of all three kinds of WHOIS misuse reported by the majority
of the Registrars: phone number, postal address, and email address misuse. In Figure 15, we
present the overall measured WHOIS misuse per type of harmful act, taking into consideration
the gTLD global shares as provided in Table 2. We contrast the measured rates with the
equivalent rates collected from the Registrant survey presented in Table 7. More specifically, we
consider the portion of Registrant responses that indicated having experienced
WHOIS-attributed misuse, while claiming they had not published the allegedly misused information
anywhere other than WHOIS. We decided to compare this portion of responses with the
measured misuse rates, since it matches our experimental condition: we too only published the
misused information in WHOIS but not anywhere else.

| | Email address misuse | Phone number misuse | Postal address misuse |
|---|---|---|---|
| Measured misuse rate | 62% | 14% | 1% |
| Reported misuse rate | 14% | 9% | 21% |

Figure 15 Comparison of measured vs. Registrant reported WHOIS misuse rates. The reported rates include error bars representing the 12.5% error rate.
In the cases of email and voicemail spam, we observe that the measured experimental misuse
rates are higher than the Registrant-reported rates. However, we measured lower rates of
postal spam in our experiment than were reported by Registrants. The differences between the
measured and reported misuse rates are statistically significant in the cases of email and postal
address misuse, while in the case of phone number misuse, they are not.
There are a few caveats that should be noted as they may have introduced measurement
biases, possibly affecting the observed differences.
Overall, the low response rate of the Registrant survey, as explained elsewhere, may have led
to inaccurate levels of reported WHOIS misuse. The experimental study, on the other hand, was
conducted in a systematic way and on a larger scale than the final turnout of the Registrant
survey (400 domains vs. 57 domains). In addition, we can be certain that the Registrant
information we used to register the experimental domains has not been published in any other
directory. Similar statements made by Registrants could not be verified with the same level of
certainty.
More specifically, the significant difference in postal spam might be attributed to the limited
geographical diversity of the experimental PO boxes, in combination with the heavy reuse of
postal addresses by our artificial Registrant identities. The aforementioned experimental design
decisions may have driven the measured misuse to lower rates. Additionally, the possible
inability of some Registrants to distinguish WHOIS from non-WHOIS originating misuse may
have resulted in an overestimation in the reported rates. However, considering the frequency of
misuse (e.g. “A few times a year”) and the content of spam mail (e.g. SEO services), the
experimental findings are in line with the reported incidents. Moreover, the duration of the
experiment (6 months) is possibly not adequate to allow for extensive harvesting of postal
addresses. In support of this argument, it is noteworthy that the postal address of a domain we
registered in January of 2012, as part of a pilot of the experimental infrastructure, received its
first WHOIS-attributed spam postal mail 8 months after registering the domain.
Similarly, as discussed in Sections 7.3 and 7.7, the experimental design allows us to measure
occurrences of voicemail spam for groups of domains, registered under the same gTLD and
Registrar. Following up on the arguments offered in Section 7.7, we observe an average
weighted (based on gTLD proportions) voicemail spam rate of 14%, which is very close to the
reported rate. It is possible that the Registrants were especially accurate in reporting this type of
misuse, since they were immediately able to recognize it in terms of adverse effects.
Looking at the incidents of email misuse, we observe a difference of 45% between the
measured and the reported rates. This difference can be attributed to the strict definition of
spam that we adopted in this study, which may have led to the overestimation of the measured
spam rates. Looking at the frequency of WHOIS-related spam, the largest group of Registrants (35%)
reported daily occurrences, with 30% reporting a frequency of a few times per week (the
difference of 5% is within the margin of error). Our measurements showed that most of the
spam-receiving domains received just one piece of spam, but most of those that received
more than one received spam every 10 days on average. This frequency is similar to the
one reported by the Registrants. However, the difference in frequency between the reported and
observed rates can be ascribed to the Registrants’ possible difficulty in identifying WHOIS-
originated email spam (and as such leading to Registrants’ reporting only of spam they believed
to be WHOIS-originated), and to different perceptions of what may constitute spam.
Regarding other types of WHOIS misuse (e.g. identity theft, server intrusion), the findings from
the Registrant survey are fully supported by the experimental study. There was no reported or
experimentally measured WHOIS-attributed misuse, other than the misuse types we already
discussed.
8.2. Domain characteristics affecting email address misuse
In Section 7.6 we showed that email address misuse is affected by the gTLD of the domain and
the choice of Registrar in a statistically significant way. This analysis did not consider possible
correlations between the independent variables, i.e. the domain price, the existence of anti-
harvesting techniques, the gTLD, the domain name category, and the Registrar. In this section
we try to disentangle the effect all five may have on the prevalence of misuse.
Price of purchasing an experimental domain name
The price ranges of the 400 domains we purchased for one year are presented in Figure 16.
These ranges are distributed across 35 price levels, and it is noteworthy that no Registrar
offered a domain in the range of $19.96 to $34.98.
Because the dependent variable is binary in the case of email address misuse we use a
multivariate logit regression. The results of the regression (Table 18) show that the domain price
is statistically significant and negatively correlated with the WHOIS misuse. This coefficient
means that each $1 increase in the price of an experimental domain28 corresponds to a 15%
decrease in the odds of the Registrants experiencing misuse of their email address.29 In other
words, the more expensive the registered domain is, the less email address misuse the
Registrant experiences. Note, however, that none of our experimental domains were in use at
the time we registered them; none required purchase from another Registrant at above-retail
prices.
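The relationship between the regression coefficient and the odds can be sketched as follows. This is an illustrative calculation, not the authors' code: it assumes the standard logit interpretation in which a coefficient b multiplies the odds by exp(b) for each $1 increase, and the helper name `shift_probability` is hypothetical. The coefficient (-0.166) is the one reported in Table 18; the 10% baseline and the $2 price difference are taken from the example in footnote 29.

```python
import math

PRICE_COEFFICIENT = -0.166  # logit coefficient for domain price (Table 18)


def shift_probability(base_prob: float, price_delta: float,
                      coefficient: float = PRICE_COEFFICIENT) -> float:
    """Apply a logit coefficient to a baseline probability via the odds."""
    base_odds = base_prob / (1.0 - base_prob)
    new_odds = base_odds * math.exp(coefficient * price_delta)
    return new_odds / (1.0 + new_odds)


odds_ratio = math.exp(PRICE_COEFFICIENT)  # multiplicative change in odds per $1
print(f"odds ratio per $1: {odds_ratio:.3f}")
print(f"probability after a $2 increase from 10%: {shift_probability(0.10, 2):.3f}")
```

Working through the odds in this way gives a probability of roughly 7.4% for the more expensive domain. This is close to the rough 7.2% quoted in footnote 29, which appears to apply the 15% reduction twice directly to the 10% probability rather than to the odds.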
| Domain registration price | COM | NET | ORG | INFO | BIZ | Overall |
|---|---|---|---|---|---|---|
| Minimum price | $9.47 | $8.59 | $7.17 | $0.99 | $2.99 | $0.99 |
| Maximum price | $35.00 | $35.00 | $38.51 | $38.51 | $38.51 | $38.51 |

Figure 16 Observed minimum and maximum price, per TLD at the 16 Registrars we used for the experimental domain registrations. In total we observed 35 price levels.

| | Coefficient | Odds | Std. Err. | Significance |
|---|---|---|---|---|
| Domain price | -0.166 | 0.846 | 1.376 | p < 0.001 |

Table 18 The logit regression shows that for every $1 increase in the price of a domain, the odds of experiencing WHOIS-attributed misuse of the Registrant’s email address decrease by about 15%.
28 Registered at the lowest possible retail prices offered by each Registrar.
29 For instance, if domain A costs $10 to register, and domain B costs $12 to register, and if domain A has a chance of experiencing email address misuse of 10%, domain B would be expected to have a chance of experiencing misuse roughly equal to 7.2%.

gTLD of experimental domain name
The gTLD, being a categorical variable in the regression, requires a different examination. Using
deviation coding we examined which gTLDs have a statistically significant contribution in the
possibility of receiving email spam attributed to WHOIS misuse. In Table 19 we present our
findings. Green highlighting represents negative correlation with email spam, while red
represents positive correlation. For the domain names included in our experiment, the BIZ gTLD
is highly correlated with spam, while domains under the COM, NET, and ORG gTLDs
experienced less spam.
| gTLD | Correlation to email spam | Rate of change from mean |
|---|---|---|
| BIZ | Positive (p < 0.001) | 21 |
| COM | Negative (p < 0.001) | 0.3 |
| INFO | Not statistically significant | - |
| NET | Negative (p < 0.05) | 0.44 |
| ORG | Negative (p < 0.001) | 0.32 |

Table 19 Overall correlation of TLD with email misuse. Green represents less misuse, while red represents more misuse. Domains under the COM, NET, and ORG gTLDs are less likely to be subject to email misuse, while BIZ domains are more susceptible.
Using dummy coding, we examine in Table 20 how the different gTLDs compare in rates of
observed email address misuse, all else being equal. The gTLDs appearing in the columns are
the point of reference with which gTLDs appearing in the rows are compared. Cell colors
represent the relative contribution of the point of reference gTLD to another gTLD, and the
contents show the level of significance.30 Green highlighting means that the column gTLD
correlates with less spam than the row gTLD, while red highlighting means that the column
gTLD correlates with more spam than the row gTLD. Grey cells represent statistically
insignificant comparisons. For the domain names included in our experiment, we found that BIZ
domains are correlated with higher instances of email misuse compared to all other gTLDs. The
INFO gTLD follows immediately after, and it exhibits lower potential for email address misuse
only when compared to the BIZ gTLD.
30 Only statistically significant comparisons are shown.
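The two coding schemes used above can be illustrated with a short sketch. It is not the authors' code: the function names are hypothetical, and the choice of which gTLD serves as the reference level (dummy coding) or as the omitted level (deviation coding) is an assumption made only for illustration. With dummy coding each coefficient compares a gTLD with the reference gTLD; with deviation coding each coefficient compares a gTLD with the overall mean.

```python
GTLDS = ["BIZ", "COM", "INFO", "NET", "ORG"]


def dummy_code(level: str, levels: list[str], reference: str) -> list[int]:
    """One indicator per non-reference level; the reference row is all zeros."""
    return [1 if level == other else 0 for other in levels if other != reference]


def deviation_code(level: str, levels: list[str], omitted: str) -> list[int]:
    """Like dummy coding, but the omitted level is coded -1 in every column."""
    columns = [other for other in levels if other != omitted]
    if level == omitted:
        return [-1] * len(columns)
    return [1 if level == other else 0 for other in columns]


for gtld in GTLDS:
    # "COM" as reference and "ORG" as omitted level are arbitrary choices.
    print(gtld, dummy_code(gtld, GTLDS, "COM"), deviation_code(gtld, GTLDS, "ORG"))
```

Repeating the dummy-coded regression with each gTLD in turn as the reference level is one way to obtain a pairwise comparison such as the one in Table 20.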
| Row gTLD \ Reference gTLD | COM | NET | ORG | INFO | BIZ |
|---|---|---|---|---|---|
| COM | - | - | - | p < 0.05 (0.26) | p = 0 (0.85) |
| NET | - | - | - | p < 0.1 (0.38) | p = 0 (0.02) |
| ORG | - | - | - | p < 0.05 (0.28) | p = 0 (0.015) |
| INFO | p < 0.05 (3.8) | p < 0.1 (2.6) | p < 0.05 (3.52) | - | p = 0 (0.05) |
| BIZ | p = 0 (71) | p = 0 (48.3) | p = 0 (65.4) | p = 0 (18) | - |

Table 20 Comparison of gTLDs in terms of contribution to WHOIS-attributed email misuse. Columns are the reference TLDs and the color indicates if they contribute more (red) or less (green). P-value is shown where statistically significant. The numbers in parentheses show the rate of change of the conditional mean with respect to a specific value of the categorical variable at a corresponding row.
Category of experimental domain name
Using deviation coding we identified one category with a statistically significant correlation to email
misuse. Domains denoting a person name (like randall-bilbo.com) are negatively correlated to
misuse (p < 0.05) – that is, the possibility of experiencing email address misuse is 37% less
than if the domain name had a different format. Other categories examined in our experiment
(e.g., randomly-generated names, synthetic business names) do not appear to have a
statistically significant role.
This appears to be an important result. However, we point out that all the domain names we
registered with the aim of denoting individuals contain a hyphen, while none of the domain
names we used for the other categories do. It is unclear whether the statistical differences
observed are due to the domain names denoting a person, or because they contain a hyphen.
Anti-harvesting applied to experimental domain name
The existence of anti-harvesting techniques was encoded as a dichotomous categorical variable
that denotes the existence or not of any anti-harvesting techniques for each experimental
domain name. While the Registrars and Registries selected for this experiment employ a variety
of parameters in WHOIS port 43 and web form rate limiting, we chose this simple binary coding
for simplicity in the statistical interpretation.
Using a logistic regression we find that the existence of anti-harvesting techniques is statistically
significant in predicting the potential of email address misuse. Additionally, the possibility of experiencing email misuse without the existence of any anti-harvesting technique is 2.3 times higher than when an anti-harvesting technique is in place.
Registrar of experimental domain name
We encoded each experimental domain name’s Registrar as a 16 part categorical variable
using deviation coding to measure each Registrar’s deviation from the overall mean value. We
did not find any statistically significant contribution. In other words, the choice of Registrar alone
is not adequate to predict the possibility of email address misuse.
8.3. Domain characteristics affecting phone number misuse
We examine the factors that affect the possibility of a Registrant receiving a voicemail in the
three main classes: (a) spam, (b) possible spam, and (c) not spam. We are purposefully not
considering the other two classes (interactive and empty) as they do not present meaningful
outcomes. The factors measured in the experimental study that could affect the type of received
voicemail are the price of a domain, the gTLD, and the Registrar.31 A model that could allow us
to perform this analysis is the multinomial logistic regression. However, multinomial logistic
regressions require a large sample size (i.e. observations) to calculate statistically significant
correlations, which, in the case of our experiment, is not available.
31 The experimental study design did not allow us to associate a received voicemail with the domain name category.
| gTLD | Correlation to voice spam | Rate of change from mean |
|---|---|---|
| BIZ | Positive (p = 0.002) | 7.39 |
| COM | Not statistically significant | - |
| INFO | Positive (p = 0.003) | 5.12 |
| NET | Not statistically significant | - |
| ORG | Negative (p < 0.05) | 0.1 |

Table 21 Correlation of gTLD with WHOIS-attributed phone number misuse.
Therefore we reverted to a basic logistic regression by transforming the multiple-response
dependent variable into a dichotomous one. We did this by conservatively transforming
observations of possible spam into observations of not spam. The independent variables, as in
Section 8.2, are the domain price, the gTLD, the existence of anti-harvesting techniques, and
the Registrar. The categorical variables were coded initially using deviation coding to identify
which had an overall statistical significance.
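The transformation of the dependent variable described above can be sketched as follows. The label strings and the function name are hypothetical; only the mapping itself (possible spam counted as not spam, interactive and empty voicemails excluded) comes from the text.

```python
from typing import Optional

EXCLUDED = {"interactive", "empty"}  # classes not considered in the analysis


def to_binary_outcome(voicemail_class: str) -> Optional[int]:
    """Return 1 for spam, 0 for not spam, None for excluded classes."""
    if voicemail_class in EXCLUDED:
        return None
    if voicemail_class == "spam":
        return 1
    if voicemail_class in {"possible spam", "not spam"}:
        return 0  # conservative choice: possible spam is treated as not spam
    raise ValueError(f"unknown voicemail class: {voicemail_class!r}")


observations = ["spam", "possible spam", "not spam", "empty", "spam"]
outcomes = [y for y in map(to_binary_outcome, observations) if y is not None]
print(outcomes)
```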
gTLD of experimental domain name
The gTLD was the only variable with statistical significance. Table 21 shows how the five gTLDs
are correlated with the measured WHOIS-attributed phone number misuse. Domains under the
BIZ and INFO gTLDs are correlated with higher misuse, while domains under the ORG gTLD
are correlated with lower misuse.
We also looked into how each gTLD affects the phenomenon of WHOIS-attributed phone
number misuse, in comparison to the other gTLDs. Table 22 presents our findings. Cell colors
represent the relative contribution of the point of reference gTLD to another gTLD, and the
contents show the level of significance.32 Green highlighting means that the column gTLD
correlates with less voicemail spam than the row gTLD, while red highlighting means that the
column gTLD correlates with more voicemail spam than the row gTLD. Grey cells represent
statistically insignificant comparisons. We found that BIZ and INFO domains are correlated with
higher instances of phone number misuse compared to all other gTLDs.
32 Only statistically significant comparisons are shown.
| Row gTLD \ Reference gTLD | COM | NET | ORG | INFO | BIZ |
|---|---|---|---|---|---|
| COM | - | - | - | p = 0.01 (0.12) | p = 0.001 (0.09) |
| NET | - | - | - | p < 0.05 (0.07) | p = 0.01 (0.05) |
| ORG | - | - | - | p < 0.001 (0.02) | p = 0.001 (0.01) |
| INFO | p = 0.001 (7.9) | p < 0.05 (13.5) | p < 0.001 (47.9) | - | - |
| BIZ | p = 0.001 (11.4) | p = 0.01 (19.4) | p = 0.001 (69.1) | - | - |

Table 22 Comparison of gTLDs in terms of contribution to WHOIS-attributed phone number misuse. Columns are the reference TLDs and the color indicates if they contribute more (red) or less (green). P-value is shown where statistically significant. The numbers in parentheses show the rate of change of the conditional mean with respect to a specific value of the categorical variable at a corresponding row.
Our decision to code possible voicemail spam as not spam may underestimate the extent of
misuse, and therefore the coefficients. However we believe that this is a conservative approach
that prevents possible false positives from being considered in our model.
Anti-harvesting applied to experimental domain name
We did not find any statistically significant correlation between use or non-use of anti-harvesting
measures and the rate of phone number misuse observed for our experimental domain names.
Registrar of experimental domain name
We did not find any statistically significant correlation between Registrar and the rate of phone
number misuse observed for our experimental domain names.
Price of experimental domain name
We did not find any statistically significant correlation between retail price of an experimental
domain name and the rate of phone number misuse observed for those names.
8.4. Domain characteristics affecting postal address misuse
The level of postal address misuse that we observed during the experimental study
was minimal, as we have discussed in previous sections. Therefore we cannot
provide any meaningful analysis regarding the domain name characteristics that affect this type
of misuse.
9. Discussion
In this work we undertook a combination of descriptive and experimental studies to examine the
hypothesis that WHOIS-published data leads to a measurable degree of misuse, and to cast
light on the experience of WHOIS misuse from the viewpoint of Registrants, Registrars,
Registries, experts, and law enforcement agencies.
We surveyed 101 experts and law enforcement agents, and the majority of participants (60%)
indicated that WHOIS misuse is usually not considered when investigating security incidents.
This finding, combined with the fact that WHOIS misuse is a real and measurable phenomenon,
reveals that WHOIS is an underestimated vulnerability that experts should consider more
consistently. However, the views of the experts may be affected by underreporting of incidents
of WHOIS misuse and inconsistencies between self-reported WHOIS misuse and actual
(experimentally-measured) WHOIS misuse.
In the few cases where experts were able to report on specific cases of WHOIS misuse (23
cases reported by 18% of participants), the adverse effects were similar to the cases that were
reported through the Registrant survey, and measured through the experimental study (e.g.
postal and email spam). However, a few targeted cases of WHOIS misuse (4 out of 23) had
potentially significant impact (e.g. fraud to extract money); due to their rare occurrence, we were
not able to observe or measure similar cases in the other parts of this study. Finally, the experts
stated that anti-harvesting techniques deployed subsequently did in fact deter reoccurrence of
WHOIS misuse in 11 out of 12 incidents. We made similar observations in the experimental part
of this study, where we identified a statistically significant effect of anti-harvesting techniques in
thwarting WHOIS-originated email spam.
Through our Registrant survey, we were able to gather information about experiences of
WHOIS misuse only from 57 Registrants (out of 1619 invitations to participate, with a target of
340 participants), despite our effort to attract participation by offering a chance to win attractive
prizes. This low response rate (3.6%) demonstrated how difficult such survey-based studies are
to run over the Internet. In addition it serves as a reminder that Internet-based surveys should
be minimalistic in terms of extent and terminology used. Given the limited turnout of the
Registrar and Registrant surveys in this particular study, simply limiting ourselves to the expert
survey and the experiment would have sufficed to achieve the same level of statistical
significance as we obtained in our entire study.
In our limited sample, we found that Registrants experienced measurable and statistically significant WHOIS misuse. Specifically, the prevalent types of misuse are
associated with phone numbers, email addresses, and postal addresses published exclusively
in WHOIS. More specifically:
- 29.8% of surveyed Registrants reported WHOIS email address misuse
- 12.3% of surveyed Registrants reported WHOIS phone number misuse
- 29.8% of surveyed Registrants reported WHOIS postal address misuse
No other type of misuse (e.g. identity theft) was reported or measured at a statistically
significant level.
Possibly the most interesting finding of the Registrar and Registry survey was their hesitation
to participate (22 participants out of 111 invitations). This could have been due to concerns over
possible consequences of public disclosure of confidential business practices.
Nevertheless, the survey provides insights on the reported and experienced incidence of
WHOIS misuse. Registrars and Registries reported that WHOIS queries are mainly carried out
through port 43, followed by web forms, and then by bulk purchases. However the latter has the
potential for higher impact in misuse, as the number of WHOIS records exchanged is by
definition very large. Nevertheless, port 43 rate limiting appears to be the most widely adopted
anti-harvesting technique.
We performed rate-limiting tests for the 92 “thin” Registrars in our sample (representing a
combined 77.4% market share in August 2011) and the three “thick” Registries. We found that
54% of the Registrars and Registries we tested do not employ any port 43 rate limiting
technique with the remaining 46% employing some type of rate limiting (e.g. IP blacklisting,
CAPTCHAs, combination of techniques).
Through the experimental study we found statistically significant evidence of
WHOIS-originated misuse targeting the email addresses of Registrants. 71% of the 400 experimental
domains experienced email address misuse. More specifically,
- 94% of .BIZ domains,
- 78% of .INFO domains,
- 65% of .NET domains,
- 60% of .COM domains, and
- 56% of .ORG domains
were affected by email address misuse attributed to WHOIS.
The occurrence of email misuse can be empirically predicted by taking into account the cost of a
domain, the gTLD, and the existence of anti-harvesting mechanisms. When comparing the
contribution of the top 5 gTLDs in predicting the relative occurrence of WHOIS-originated email
misuse, the .BIZ domains rank first in being vulnerable to email address misuse.
Considering the misuse of Registrants’ phone numbers – measured in the experimental study
as voicemail spam – we found that 5% of the experimental domains were affected by phone
number misuse, with the following breakdown per gTLD:
- 30% of .BIZ domains,
- 23% of .INFO domains,
- 15% of .COM domains,
- 4% of .NET domains, and
- 4% of .ORG domains.
There is a statistically significant correlation between the choice of gTLD and the WHOIS-
attributed phone number misuse. As with email spam, BIZ and INFO gTLDs are correlated with
more misuse, while ORG is correlated with less misuse. However, we found no relationship
between cost of a domain or existence of anti-harvesting mechanisms and Registrant phone
number misuse.
The type of the domain name (for the five types we studied) cannot adequately predict any type
of WHOIS misuse, with the exception of domain names denoting a person’s full name (in this
study, formatted as firstname-lastname). This domain name format resulted in a 37% reduction
in email address misuse, compared to the other types of domain names.
The volume of collected postal spam attributed to WHOIS misuse, even though it is non-zero, is
too low to allow any inferences. Overall 1% of experimental domains were subject to postal
address misuse.
Comparing the number of Registrants experiencing WHOIS-originated email misuse (17%) with
the measured WHOIS-originated email misuse in the experimental study (62% of domains) we
note that the difference is well beyond the margin of error. Therefore we believe that
WHOIS-attributed spam email occurrence is underreported, possibly because Registrants find email
spam to be less impactful than phone or postal spam.
In addition, the reported occurrence of WHOIS-originated postal address misuse (22%)
compared to the total of three instances of measured misuse (1% of domains) is an indication
that operating no more than three mailboxes as part of the experimental study
possibly hindered a more adequate measurement of postal spam misuse. However, as
we observed through the pilot of the experimental study, it can take more than 6 months to
receive a piece of WHOIS-related spam postal mail. If this observation is generally true, then
the duration of this study may have contributed to the low representation of WHOIS-attributed
postal address misuse.
On the other hand, the measured occurrence of phone number misuse (14%) is very similar to
the reported misuse (13%), and within the margin of error, possibly because Registrants find
phone misuse more impactful, and they are therefore more prone to report it.
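The comparisons in the preceding paragraphs can be checked with simple arithmetic. The sketch below is only a rough illustration, not the statistical test used in the study: it assumes that the 12.5% error rate quoted for the Registrant survey in Figure 15 can be read as an absolute margin in percentage points, and it uses the measured and reported rates quoted in this section.

```python
MARGIN_OF_ERROR = 12.5  # percentage points (assumed reading of the 12.5% error rate)

# (measured %, reported %) as quoted in the discussion above
RATES = {
    "email": (62, 17),
    "phone": (14, 13),
    "postal": (1, 22),
}


def within_margin(measured: float, reported: float,
                  margin: float = MARGIN_OF_ERROR) -> bool:
    """True when the two rates differ by no more than the margin."""
    return abs(measured - reported) <= margin


for misuse_type, (measured, reported) in RATES.items():
    gap = abs(measured - reported)
    verdict = "within" if within_margin(measured, reported) else "beyond"
    print(f"{misuse_type}: gap of {gap} points, {verdict} the margin of error")
```

Under this reading, only the phone number rates agree within the margin, which matches the qualitative conclusions stated above.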
Revisiting the main hypothesis we set out to test in this WHOIS misuse study, namely, that
public access to WHOIS data leads to a measurable degree of misuse, we conclude
that this hypothesis is validated, in a statistically significant way, both via measurements and via
surveys. The main types of misuse we found are voice spam, email spam, and postal spam.
Although we found other types of misuse as well (e.g. malware and DDoS attacks), the surveys
and experiments did not yield enough such instances to be statistically significant for the
purposes of this study. Through our controlled measurement experiments, we found anti-
harvesting mechanisms were a deterrent to misusing a WHOIS-published email address. On
the other hand, anti-harvesting does not appear to significantly impact the other types of misuse
considered in this study.
10. Appendix A – Law Enforcement/Researcher survey
10.1. Invitation to participate
Dear [Insert name here],
We are researchers at Carnegie Mellon University, conducting a study sponsored by ICANN on
misuse of gTLD WHOIS data, that is, harmful acts such as spam, phishing, identity theft, and
stalking which Registrants believe were sent using WHOIS-published contact information.
(Please see: http://blog.icann.org/2011/04/cylab-at-carnegie-mellon-university-selected-to-conduct-study-of-whois-misuse/comment-page-1/).
As part of this study, we are planning to interview and survey a
number of cyber security researchers, law enforcement agents, consumer protection agencies
from various countries, about security incidents they have observed in the field. Given your
noted expertise in the field, we would be delighted to have the opportunity to interview you. We
are aiming to complete gathering answers to this survey by [closing date here].
Shall you be interested in a phone or email interview, please let us
know by responding to this email or by contacting Nicolas Christin
at +1-412-268-4432. We have also made an online survey available at
[Insert link here].
The survey (or equivalently, the phone interview) should not take more
than 15 minutes of your time, and is a vital component of the study.
Note that, since the study is commissioned by ICANN, participation
presents a great opportunity to have an impact on policy making.
Thank you in advance for your time and consideration. We look forward to
your contribution.
Sincerely,
--
Nicolas Christin, Ph.D. and Nektarios Leontiadis
Carnegie Mellon University CyLab
10.2. Consent form
This survey is part of a research study conducted by Prof. Nicolas Christin at Carnegie Mellon
University.
The purpose of the research is to investigate the extent to which public availability of certain
information online leads to the information being misused by unauthorized parties.
Procedures
Participants are expected to answer a survey. The expected duration of participation is 15
minutes.
Participant Requirements
Participation in this study is limited to individuals age 18 and older.
Risks
The risks and discomfort associated with participation in this study are no greater than those
ordinarily encountered in daily life or during other online activities.
Benefits
There may be no personal benefit from your participation in the study but the knowledge
received may be of value to humanity.
Compensation & Costs
There is no compensation for participation in this study. There will be no cost to you if you
participate in this study.
Confidentiality
By participating in this research, you understand and agree that Carnegie Mellon may be
required to disclose your consent form, data and other personally identifiable information as
required by law, regulation, subpoena or court order. Otherwise, your confidentiality will be
maintained in the following manner:
Your data and consent form will be kept separate. Your consent form will be stored in a locked
location on Carnegie Mellon property and will not be disclosed to third parties. By participating,
you understand and agree that the data and information gathered during this study may be used
by Carnegie Mellon and published and/or disclosed by Carnegie Mellon to others outside of
Carnegie Mellon. However, your name, address, contact information and other direct personal
identifiers in your consent form will not be mentioned in any such publication or dissemination of
the research data and/or results by Carnegie Mellon.
Right to Ask Questions & Contact Information
If you have any questions about this study, you should feel free to ask them by contacting the
Principal Investigator now at
Dr. Nicolas Christin
Carnegie Mellon INI & CyLab
4720 Forbes Avenue, CIC Room 2108
Pittsburgh, PA 15213 USA
Phone: 412-268-4432
Email: [email protected]
If you have questions later, desire additional information, or wish to withdraw your participation
please contact the Principal Investigator by mail, phone or e-mail in accordance with the contact
information listed above.
If you have questions pertaining to your rights as a research participant, or to report objections
to this study, you should contact the Research Regulatory Compliance Office at Carnegie
Mellon University. Email: [email protected] . Phone: 412-268-1901 or 412-268-5460.
The Carnegie Mellon University Institutional Review Board (IRB) has approved the use of
human participants for this study.
Voluntary Participation
Your participation in this research is voluntary. You may discontinue participation at any time
during the research activity.
I am age 18 or older. Yes No
I have read and understand the information above. Yes No
I want to participate in this research and continue with the survey. Yes No
10.3. Survey questions
Thank you for agreeing to participate in a network security survey assembled by Carnegie Mellon CyLab. We appreciate your taking the time to answer the following questions thoroughly. If you have any questions about this survey, or the underlying study, please contact Dr. Nicolas Christin at <[email protected]>.
1. How would you best describe your occupation?
- Researcher (Academia)
- Researcher (Industry)
- Consultant
- Law enforcement agent
- Consumer protection agency
- Other (Please describe)
2. Which category best describes your employer:
- Academia
- Security industry
- Other IT industry
- Not-for-profit Non-Governmental Organization (NGO)
- Governmental organization
- Other (Please describe)
3. a) Which country are you based in?
(Drop down list)
b) If different from the country you are based in, which geographic area are you providing input about:
- Africa
- North America
- South America
- Asia
- Europe
- Oceania
- Same as 3a.
4. Are you familiar with the process of DNS name registration?
[1: Not familiar - 3: Know the basics - 5: Expert]
5. a) Are you familiar with how domain Registrants are required to supply contact information when registering a domain?
[1: Not familiar - 3: Know the basics - 5: Expert]
b) Are you familiar with how this Registrant contact information can be queried or obtained by third parties via WHOIS?
[1: Not familiar - 3: Know the basics - 5: Expert]
5. Do you know what WHOIS harvesting is?
[Yes/No]
If yes, provide a one-line summary of what it is: (Open ended field)
6. Are you aware of any WHOIS anti-harvesting techniques?
[Yes/No]
If yes, please describe: (Open ended field)
Page break.
Answers to the following set of questions may contain sensitive or private data. Let us reiterate that 1) you do not have to answer questions you do not feel comfortable answering, and 2) should you decide to answer, your individual answers will be protected: only the research team at Carnegie Mellon will be able to view your individual answers (others would only have access to aggregate statistics or reports), and data will be stored encrypted. In addition, except for members of the research team, your identity and the identity of your organization will not be tied to specific answers, unless you explicitly grant us permission to do so (see question 12).
7. In the course of your employment, have you directly experienced any of the following network security-related attacks caused by outsiders?
[Yes/No for all questions]
- Denial of Service
- Phishing
- Vishing (voicemail phishing)
- Email spam
- Postal spam
- Email virus
- Abuse of personal data or identity theft
- Malware installation/drive by downloads
- Unauthorized intrusion on servers
- Blackmail/ransom demands/intimidation
- Have experienced attacks, but prefer not to divulge specifics
- Other (Please describe)
- Prefer not to answer
8. In the course of your employment, have any of the following network security-related attacks been reported to you or to your organization by a third party?
[Yes/No for all questions]
- Denial of Service
- Phishing
- Vishing (voicemail phishing)
- Email spam
- Postal spam
- Email virus
- Malware installation/drive by downloads
- Unauthorized intrusion on servers
- Abuse of personal data or identity theft
- Blackmail/ransom demands/intimidation
- Have experienced attacks, but prefer not to divulge specifics
- Other (Please describe)
- Prefer not to answer
9. Can you or your organization supply aggregate reports or statistics on security incidents that you have collected?
- Yes
- No
9a. If yes, can you give an online pointer to the resources?
- Open ended field
10. Have you or your organization analyzed whether WHOIS contact data played a role in security incidents?
- Yes
- No
10b. If yes, can you give details about how WHOIS contact data was analyzed and to what extent (aggregate statistics are appropriate here):
- Open ended field
Specific incidents
11. Have you ever observed, directly or indirectly, individual incidents (as opposed to collecting aggregate data, per the previous questions) involving harvesting of WHOIS data?
Please distinguish between each incident. For each incident you are aware of, please answer the following questions:
- How did you become aware of the incident? (Experienced yourself, reported to you, reported to your organization, heard from third parties)
- Which elements of WHOIS Registrant contact information were misused?
- Are you aware of any measures that were taken to protect the Registrant's WHOIS contact information (“countermeasures”)?
- If yes, have you had any similar incident after the deployment of countermeasures?
- Can you provide any other details?
As an example: Alice reported to me that her email was published as the WHOIS contact for ABC Corp. Alice subsequently received phishing emails containing details available through WHOIS but published in no other Internet location.
12. If applicable, can you disclose which organization you represent in your answers?
- [open ended field]
- Prefer not to answer
13. If needed, would you be available for follow-up discussion to clarify some of your answers to this survey?
- Yes
- No
11. Appendix B – Registrant survey
11.1. Invitation to participate
Dr. Nicolas Christin
Carnegie Mellon University – CyLab
4720 Forbes Avenue, CIC Rm 2108
Pittsburgh, PA 15213 USA
http://www.andrew.cmu.edu/user/nicolasc/
Please click here to verify authenticity of this email:
http://dogo.ece.cmu.edu/whois-study/
Dear [FirstName], Sampled Domain Name: [CustomData]
Interested in winning the new Apple iPad 4G or an Apple iPod Shuffle? Read on.
We are computer security researchers in Carnegie Mellon University’s Cyber Security Lab
(CyLab) (http://www.cylab.cmu.edu). We are conducting a study that may help reduce Internet-
based crimes, and we need your help!
At some point – perhaps when you created a website or an email account – you registered a
domain name. During registration, you were asked to provide contact details (name, email,
phone number, address). These details are published in a public Internet directory called
"WHOIS." ANYONE, including us, can look up this directory to find out registration information.
By sharing your experience as a domain name Registrant, you can help us better understand
potential misuses of WHOIS registration data.
The results of this study will help the Internet community to fight various forms of online crime.
We will NOT collect your personal information, unless you specifically give us permission to
contact you to discuss this survey. Information about this option is available at the end of the
survey.
The survey should take about 30 minutes to complete, and will ask questions about the domain
name you have registered and your experience using it.
You can complete the survey in two ways:
- Complete and submit an on-line survey form by clicking [SurveyLink] (PREFERRED),
- Download survey questions from http://dogo.ece.cmu.edu/whois-study/WHOIS_Misuse_Survey_Registrant_Printable.pdf and email answers to whois-
We aim to complete this survey by [closing date here]. Please click on the link below if you do
not wish to participate or receive further communication from us. You will not be contacted
further.
[RemoveLink]
If you fully complete the survey, you will be entered in a drawing for a chance to win one new
iPad (“iPad 3”) 16GB with 4G, or one of four 2GB iPod Shuffles.
Thank you very much for your time and consideration. We look forward to hearing from you.
Sincerely,
--
Nicolas Christin, Ph.D.
Carnegie Mellon University CyLab
11.2. Consent
This survey is part of a research study conducted by Prof. Nicolas Christin at Carnegie Mellon
University.
The purpose of the research is to investigate the extent to which public availability of certain
information online leads to the information being misused by unauthorized parties.
Procedures
Participants are expected to answer a survey. The expected duration of participation is 30
minutes.
Participant Requirements
Participation in this study is limited to individuals age 18 and older.
Risks
The risks and discomfort associated with participation in this study are no greater than those
ordinarily encountered in daily life or during other online activities.
Benefits
There may be no personal benefit from your participation in the study, but the knowledge
received may be of value to humanity.
Compensation & Costs
By fully completing the survey, you will be entered in a drawing for a chance to win an Apple
iPad 4G, or one of four Apple iPod Shuffles. There will be no cost to you if you participate in this
study.
Confidentiality
By participating in this research, you understand and agree that Carnegie Mellon may be
required to disclose your consent form, data and other personally identifiable information as
required by law, regulation, subpoena or court order. Otherwise, your confidentiality will be
maintained in the following manner:
Your data and consent form will be kept separate. Your consent form will be stored in a locked
location on Carnegie Mellon property and will not be disclosed to third parties. By participating,
you understand and agree that the data and information gathered during this study may be used
by Carnegie Mellon and published and/or disclosed by Carnegie Mellon to others outside of
Carnegie Mellon. However, your name, address, contact information and other direct personal
identifiers in your consent form will not be mentioned in any such publication or dissemination of
the research data and/or results by Carnegie Mellon.
Right to Ask Questions & Contact Information
If you have any questions about this study, you should feel free to ask them by contacting the
Principal Investigator now at
Dr. Nicolas Christin
Carnegie Mellon INI & CyLab
4720 Forbes Avenue, CIC Room 2108
Pittsburgh, PA 15213 USA
Phone: 412-268-4432
Email: [email protected]
If you have questions later, desire additional information, or wish to withdraw your participation
please contact the Principal Investigator by mail, phone or e-mail in accordance with the contact
information listed above.
If you have questions pertaining to your rights as a research participant, or to report objections
to this study, you should contact the Research Regulatory Compliance Office at Carnegie
Mellon University. Email: [email protected] . Phone: 412-268-1901 or 412-268-5460.
The Carnegie Mellon University Institutional Review Board (IRB) has approved the use of
human participants for this study.
Voluntary Participation
Your participation in this research is voluntary. You may discontinue participation at any time
during the research activity.
I am age 18 or older. Yes No
I have read and understand the information above. Yes No
I want to participate in this research and continue with the survey. Yes No
11.3. Survey questions
Thank you very much for completing this survey conducted by Carnegie Mellon CyLab in the
United States (http://www.cylab.cmu.edu). We contacted you because you registered one or
more of the domain names that appear in a random sample being examined by this survey. By
sharing your experience as a domain name Registrant, you can help us make the Internet a
safer place! If you have any questions about this survey, or the underlying study, please contact
Dr. Nicolas Christin at <[email protected]>.
This survey is commissioned by ICANN, the Internet Corporation for Assigned Names and
Numbers. ICANN coordinates the assignment of domain names, and is in charge of the policies
governing the WHOIS directory. The results of this study will help ICANN to take steps to reduce
WHOIS misuse.
First, let us start with a brief explanation to help you complete this survey. A domain name
identifies an Internet resource like a website or email service (google.com, verizon.net, cmu.edu
etc.). At some point – perhaps when you created a website or an email account – you obtained
a domain name. That process is called “domain registration.” Companies that provide domain
registration services are known as “Registrars.” Examples include GoDaddy, Tucows and
ENOM. During registration, your Registrar asked you to provide contact details (name, email,
phone number, address). These details are published in a public Internet directory called
"WHOIS" and ANYONE (including us) can access it.
We value your privacy. We assure you that all of your survey answers will be treated as
confidential and we will use them only for aggregate statistical analysis. By this we mean that no
entity will be able to associate a specific answer to you. Your personal contact details or
individual answers will NOT be disclosed to anyone outside of our research team.
1. How many domain names have you currently registered?
- 1
- 2-10
- More than 10
2. Please list all of the domain names that you have registered. If you registered more than one name, please separate them with commas (,) – for example, “mycorp1.com, mycorp2.com.”
[Open ended]
2.1 Please tell us the “sampled domain name” that appears in your survey invitation letter.
[Open ended]
When answering the questions that follow, please think about your experiences as the Registrant of this sampled domain and about communication sent to addresses that you supplied when registering that domain. Before continuing, you may find it helpful to look up your own domain in WHOIS using http://whois.domaintools.com.
3. Thinking about why you registered this domain name and how you use it, please indicate
which of the following categories best describes you as this domain name’s Registrant:
- I registered the domain for my own use as an Individual
- I registered the domain for use by a For-profit business or organization
- I registered the domain for use by a Non-profit organization
- I registered the domain for use by an informal interest group (e.g., tennis club)
- Other (please specify)
3.1 Is this domain name used for any commercial activities – for example, to sell or advertise
goods or services or to collect donations?
- Yes
- No
- Not sure or prefer not to answer
4. Please indicate the country that you identified when you registered this domain name. Note:
WHOIS identifies several contacts for each domain name, including an administrative contact
(usually you) and a technical contact (may be your Internet service provider). Here, we are
interested in the country identified in YOUR contact details.
(Drop down list)
5. Please identify the Registrar (that is, the registration service provider) from whom you
obtained this domain name. If you do not know or recall, you may leave this blank.
[Open ended field]
6. Before taking this survey, did you know that the contact details which you provided during
domain registration would be publicly available on the Internet through “WHOIS”?
[Yes/No]
7. Since registering this domain name, have you ever received unsolicited postal mail at any of
the postal addresses that you specified in contact details during domain registration?
[YES/NO]
7.1 [If yes to Q7] Do you have reason to suspect that you received this unsolicited postal
mail because your postal address was published in WHOIS?
[YES/NO]
7.1.1 [If yes to Q7.1] Why do you think so?
[Open ended field]
7.1.2 [If yes to Q7.1] Is the postal address published in another public directory or
Internet source (for example, a phone book, a website, your email signature)?
[Yes/No]
7.1.3 [If yes to Q7.1] How often do you receive unsolicited postal mail at the
postal addresses published in WHOIS?
- A few times in a week
- A few times in a month
- A few times in a year
- Less than once in a year
7.1.4 [If yes to Q7.1] When was the last time that you experienced this?
- Within this week
- Within this month
- Within the past three months
- Within this year
- More than a year ago (please specify)
7.1.5 [If yes to Q7.1] Please describe reasons for which you were contacted in these cases (e.g., a domain name hosting services offer)
[Open ended]
7.1.6 [If yes to Q7.1] If you know or can recall who contacted you in a recent case, please tell us more about that entity (e.g., sender’s name, type of company)
[Open ended]
7.1.7 [If yes to Q7.1] Did this unsolicited postal mail have any adverse impact on you?
- Yes (describe)
- No
7.2 [If no to Q7.1] Could the postal address have been obtained from another public
directory or Internet source (for example, a phone book, a website, your email
signature)?
[Yes/No]
7.2.1 [If no to Q7.2] How do you think your postal address was obtained?
[Open ended]
8. Since registering this domain name, have you ever received unsolicited electronic mail at any of the email addresses that you specified in contact details during domain registration?
[YES/NO]
8.1 [If yes to Q8] Do you have reason to suspect that you received those emails because
your email address was published in WHOIS?
[YES/NO]
8.1.1 [If yes to Q8.1] Please specify why you think so.
[Open ended field]
8.1.2 [If yes to Q8.1] Is the misused email address published in another public
directory or Internet source (for example, a website, your email signature,
Facebook, Twitter)?
[Yes/No]
8.1.3 [If yes to Q8.1] How often do you experience misuse of your email address published in WHOIS?
- A few times a day
- A few times in a week
- A few times in a month
- A few times in a year
- Less than once in a year
8.1.4 [If yes to Q8.1] When was the last time that you experienced this?
- Within this week
- Within this month
- Within the past three months
- Within this year
- More than a year ago (please specify)
8.1.5 [If yes to Q8.1] Please describe the reasons for which you were contacted in these cases (e.g., a domain name hosting services offer, targeted phishing email)
[Open ended]
8.1.6 [If yes to Q8.1] If you know or can recall who contacted you in a recent case, please tell us more about that entity (e.g., sender’s name, type of company)
[Open ended]
8.1.7 [If yes to Q8.1] Did this unsolicited email have any adverse impact on you?
- Yes (describe)
- No
8.2 [If no to Q8.1] Could the email address have been obtained from another public directory or Internet source (for example, a website, your email signature, Facebook, Twitter)?
[Yes/No]
8.2.1 [If no to Q8.2] How do you think your email address was obtained?
[Open ended]
9. Since registering this domain name, have you ever received unsolicited voice calls at the
phone number(s) that you specified in contact details during domain registration?
[YES/NO]
9.1 [If yes to Q9] Do you have reason to suspect that those unsolicited voice calls
happened because your phone number(s) are published in WHOIS?
[YES/NO]
9.1.1 [If yes to Q9.1] Please specify why you think so.
[open ended]
9.1.2 [If yes to Q9.1] Is the misused phone number(s) published in another public directory or Internet source (for example, a phone book, a website, your email signature)?
[Yes/No]
9.1.3 [If yes to Q9.1] How often do you experience misuse of your phone number(s) published in WHOIS?
- A few times a day
- A few times in a week
- A few times in a month
- A few times in a year
- Less than once in a year
9.1.4 [If yes to Q9.1] When was the last time that you experienced this?
- Within this week
- Within this month
- Within the past three months
- Within this year
- More than a year ago (please specify)
9.1.5 [If yes to Q9.1] Please describe the reasons for which you were contacted
in these cases (e.g., a domain name hosting services offer)
[Open ended]
9.1.6 [If yes to Q9.1] If you know or can recall who contacted you in a recent case, please tell us more about that entity (e.g., sender’s name, type of company).
[Open ended]
9.1.7 [If yes to Q9.1] Did these unsolicited calls have any adverse impact on you?
- Yes (describe)
- No
9.2 [If no to Q9.1] Could the phone number have been obtained from another public directory or Internet source (for example, a phone book, a website, your email signature)?
[Yes/No]
9.2.1 [If no to Q9.2] How do you think your phone number(s) was obtained?
[Open ended]
10. Since registering this domain name, have you ever had your identity (e.g. name, address,
phone number) abused or stolen? An example would be fraudulent use of your identity (without
your knowledge) to apply for a credit card or receive financial services.
[YES/NO]
10.1 [If yes to Q10] Was this identity specified in contact details during domain
registration?
[Yes/No]
10.1.1 [If yes to Q10.1] Do you have reason to suspect that the identity abuse happened because your identity details are published in WHOIS?
[YES/NO]
10.1.1.1 [If yes to Q10.1.1] Please specify why you think so.
[Open ended]
10.1.1.2 [If yes to Q10.1.1] Are the misused identity details published in another public directory or Internet source (for example, your email signature, a workplace directory, Facebook)?
[Yes/No]
10.1.1.3 [If yes to Q10.1.1] How many times has your identity published in WHOIS been abused or stolen?
- Once
- Twice
- Three times
- More than three times (please indicate)
10.1.1.4 [If yes to Q10.1.1] When was the last time that you experienced this?
- Within this week
- Within this month
- Within the past three months
- Within this year
- More than a year ago (please specify)
10.1.1.5 [If yes to Q10.1.1] Please describe how your identity details were misused (e.g. issuing of a loan, credit card)
[Open ended]
10.1.1.6 [If yes to Q10.1.1] If you know or suspect who is responsible for this identity abuse/theft please tell us more about that entity (e.g., name, relationship to you if any).
[Open ended]
10.1.1.7 [If yes to Q10.1.1] Please describe the adverse impact of this identity abuse/theft on you. For example, would you rate the impact as minor, major, or severe?
[Open ended]
10.1.2 [If no to Q10.1.1] Could the identity details have been obtained from another public directory or Internet source (for example, your email signature, a workplace directory, Facebook)?
[Yes/No]
10.1.2.1 [If no to Q10.1.2] How do you think identity details were
obtained?
[Open ended]
11. Are there any Internet servers (web, email, etc.) now reachable using the domain name that
you registered?
[YES/NO]
11.1 [If yes to Q11] Are you the system administrator of these servers? That is, do you
own and operate the computer on which the server runs? (If your servers are hosted by
a web or email services provider, the answer to this question should be NO. If you’re not
sure about the answer, chances are good it should be NO.)
[YES/NO]
11.1.1 [If yes to Q11.1] Since registering this domain name, have you ever
experienced unauthorized intrusion into servers within this domain for which you
have administrative rights?
[YES/NO]
11.1.1.1 [If yes to Q11.1.1] Do you have reason to suspect that the
unauthorized intrusion(s) happened because your identity details are
published in WHOIS?
[YES/NO]
11.1.1.1.1 [If yes to Q11.1.1.1] Please specify why you think so.
[Open ended]
11.1.1.1.2 [If yes to Q11.1.1.1] Are the misused identity details published in another public directory or Internet source (for example, your email signature, a workplace directory, Facebook)?
[Yes/No]
11.1.1.1.3 [If yes to Q11.1.1.1] How many times have you
observed intrusions into your server(s) that you can relate to your
identity details published in WHOIS?
- Once
- Twice
- Three times
- More than three times (please indicate)
11.1.1.1.4 [If yes to Q11.1.1.1] When was the last time that you experienced this?
- Within this week
- Within this month
- Within the past three months
- Within this year
- More than a year ago (please specify)
11.1.1.1.5 [If yes to Q11.1.1.1] Please describe the adverse effect
and severity of the unauthorized intrusion (e.g. web site
defacement)
[Open ended]
11.1.1.1.6 [If yes to Q11.1.1.1] If you know or suspect who was behind a recent intrusion, please tell us more about that entity (e.g., source IP address or domain name).
[Open ended]
11.1.2 [If yes to Q11.1] Have any of the servers in your domain(s) been a victim of denial of service (DoS) attack? (If unsure, the answer should be NO.)
[YES/NO]
11.1.2.1 [If yes to Q11.1.2] Do you think the DoS attack happened
because your identity details are published in WHOIS?
[YES/NO]
11.1.2.1.1 [If yes to Q11.1.2.1] Why do you think so?
[Open ended]
11.1.2.1.2 [If yes to Q11.1.2.1] Are the misused identity details published in another public directory or Internet source (for example, your email signature, a workplace directory, Facebook)?
[Yes/No]
11.1.2.1.3 [If yes to Q11.1.2.1] How many times have you experienced a DoS attack against one or more of the servers within this domain that you attribute to WHOIS misuse?
- Once
- Twice
- Three times
- More than three times (please indicate)
11.1.2.1.4 [If yes to Q11.1.2.1] When was the last time that you experienced this?
- Within this week
- Within this month
- Within the past three months
- Within this year
- More than a year ago (please specify)
11.1.2.1.5 [If yes to Q11.1.2.1] Please describe the adverse impact of the attack (e.g., unable to provide services to customers)
[Open ended]
11.1.2.1.6 [If yes to Q11.1.2.1] If you know or suspect who was behind a recent attack, please tell us more about that entity (e.g., caller’s name, type of company)
[Open ended]
12. Since registering this domain name, have you ever been a victim of blackmail or intimidation?
[YES/NO]
12.1 [If yes to Q12] Was the identity (e.g., name, address, phone number, etc) that was the target of blackmail or intimidation specified in contact details during domain registration?
[Yes/No]
12.1.1 [If yes to Q12.1] Do you have reason to suspect that the blackmail or intimidation was related to the fact that your identity details are published in WHOIS?
[YES/NO]
12.1.1.1 [If yes to Q12.1.1] Please specify why you think so.
[Open ended]
12.1.1.2 [If yes to Q12.1.1] Are the misused identity details published in another public directory or Internet source (for example, email signature, workplace directory, Facebook)?
[Yes/No]
12.1.1.3 [If yes to Q12.1.1] How many times have you been blackmailed or intimidated using your identity details published in WHOIS?
- Once
- Twice
- Three times
- More than three times (please indicate)
12.1.1.4 [If yes to Q12.1.1] When was the last time that you experienced this?
- Within this week
- Within this month
- Within the past three months
- Within this year
- More than a year ago (please specify)
12.1.1.5 [If yes to Q12.1.1] Please describe a recent incident (e.g., how you got blackmailed or intimidated).
[Open ended]
12.1.1.6 [If yes to Q12.1.1] If you know or suspect who was behind a recent incident, please tell us more about that entity (e.g., name, relationship to you if any)
[Open ended]
12.1.1.7 [If yes to Q12.1.1] Please describe the adverse impact this incident had on you. For example, would you rate the incident’s impact as minor, major, or severe?
[Open ended]
13. Have you received any other type of harmful Internet communication or experienced any
other harmful acts that you have reason to believe may represent WHOIS data misuse?
[Yes/No]
13.1 [If yes to 13] Please tell us what you experienced, why you believe WHOIS contact
details for this domain name might have played a role, and whether the contact details
misused in this incident were available from any other source.
[Open ended]
14. If you believe that the information you used for domain name registration has been misused
in any way, and you have indicated this in any one of the previous questions, did you
subsequently take any measures to avoid WHOIS misuse in the future?
[I have experienced misuse and taken measures/I have experienced misuse and not taken
measures/I have not experienced misuse]
14.1 [If yes to Q14] Please tell us about the measures that you took. Check all steps that
you tried and explain any additional strategies you tried that are not listed below:
- Cancelling your domain name’s registration or moving it to a different Registrar.
- Changing your email address or domain name or any other misused WHOIS data.
- Replacing your own WHOIS contact addresses with forwarding addresses supplied by
a service provider (such as your domain’s Registrar).
- Replacing your WHOIS contact names and addresses with the names and addresses
of a service provider (for example, someone registering the domain name on your
behalf).
- Supplying partially incorrect or incomplete information when re-registering the domain
name or updating its WHOIS contact details (e.g., using a fake street number with
everything else valid)
- Supplying completely fake information when re-registering the domain name or
updating its WHOIS contact details.
- Applying a spam filter or registering with an identity theft protection service or some
other step to deal with the consequences of WHOIS misuse (as opposed to reducing
misuse itself).
- Other (please describe)
[Important note: As previously stated, your individual answer to this question is completely confidential and will NOT be shared with your Registrar or ICANN.]
15. Are you aware of any strategies that your domain name’s current Registrar may be taking to
reduce or protect against WHOIS data misuse?
[YES/NO]
15.1 [If yes to Q15] Please describe: [open ended field]
16. Do you grant us permission to contact you further in case we need clarifications about your
answers to this survey?
[YES/NO]
16.1 [If yes to Q16] If yes, please enter your email here.
[Open ended]
11.4. Terms
Carnegie Mellon University
Definition of Terms - WHOIS Misuse Survey
The following are the descriptions for the technical terms used in the ICANN WHOIS Misuse
Study being conducted by CMU. These descriptions will help you understand both the general
meaning of a term and its specific meaning as applied in this study.
Identity Theft
Identity theft occurs when someone uses your personally identifiable information, like your name,
address, phone number, Social Security number (or national identification number), or credit
card number, without your permission, to commit fraud or other crimes. Some examples of
identity theft include renting an apartment, obtaining a credit card, or establishing a telephone
account in your name, without your permission.
Identity thieves steal information by going through trash looking for bills or other paper with your
personally identifiable information, soliciting your information by sending emails pretending to be
your bank (see also Phishing), calling your financial institution while pretending to be you, etc.
Thieves may also be able to get some personally identifiable information by searching WHOIS
for domain name contact names and addresses.
For additional information about Identity Theft and examples, see the United States Federal
Trade Commission website.
Blackmail
In common usage, blackmail is a crime involving threats to reveal substantially true and/or false
information about a person to the public, a family member, or associates unless a demand is
met. Blackmail can include coercion involving threats of physical harm, criminal prosecution, or
taking the person's money or property. In the context of WHOIS misuse, blackmailers may obtain
some personally identifiable information by searching WHOIS for domain name contact names
and addresses.
For additional information about Blackmail, see the Wikipedia website.
Email Spam
Spam email is an unsolicited mail message, sent to your email address without your permission.
The sender of spam is commonly called a "spammer." Spammers send the same email to a
large number of email addresses. They may obtain email addresses from many different
sources such as websites and chat forums. It is also possible for spammers to search WHOIS
for domain name contact email addresses.
Spam email is often used to advertise (or sell) legal and illegal products and to attempt to steal
sensitive information like credit card numbers (see also Phishing). Products commonly
advertised by spam include prescription drugs, herbal medications, replica watches, online
gambling and pornography.
For additional information about Email Spam, see SpamHaus.
Postal Spam
Postal spam is unsolicited postal mail sent to a residential or commercial postal mailbox or
another postal address, and is similar in concept to email spam (see Email Spam). Postal
spammers may obtain postal addresses from many different sources, both offline and on-line,
including searching WHOIS for domain name contact postal addresses.
Phishing
Phishing attacks attempt to steal your personally identifiable information (see also Identity Theft)
and financial account information. A common tactic used during phishing attacks is sending
spam emails that contain links to counterfeit websites (see also Email Spam). Phishing emails
may contain details about recipients, obtained from many different sources, including searching
WHOIS for domain name contact names, addresses and phone numbers.
The attacker can use techniques to hide the identity of the phishing message's true sender and
make the email look like someone else sent it. For example, a phishing email may appear to
come from a legitimate bank, but when you click on the link, you may be taken to a website
designed to look like the bank's website. This may trick you into divulging sensitive data such as
banking or other website account usernames and passwords.
Alternatively, when you click on a phishing email link, you may be taken to a website that
attempts to automatically install malicious software on your computer without your permission or
knowledge. For example, a key-logger program may be installed to send everything that you
type (e.g., passwords) to a remote attacker.
For additional information about Phishing, see this United States Federal Trade Commission
alert.
Vishing
Vishing attacks attempt to steal your personally identifiable information (see also Identity Theft)
and financial account information. Vishing attacks are similar to phishing attacks (see Phishing),
but are conducted using voice or telephone calls instead of email messages. The attacker can
use techniques to hide the vishing caller's true caller identification number and make the caller's
number appear to be another party's number. Vishing attack victims may be tricked into
revealing sensitive information.
For example, the attacker may call you, claiming to be a representative of a bank, and request
your banking information for administrative purposes. Alternatively, upon receiving a vishing call,
you may hear an automated voice message requesting you to immediately call a specified
number to verify account details. But that number reaches the attacker, not your bank.
For additional information about Vishing, see this United States Federal Bureau of Investigation
(FBI) consumer alert.
Email Virus
The most generic definition of an email virus is malicious software (also called "malware")
delivered as an email file attachment. When the recipient opens the attached file, the malicious
software is installed or otherwise activated. The malicious software may damage data or
services on the recipient's computer. It may also carry out harmful actions on behalf of the
attacker. Common examples include deleting files, sending spam emails (see Email Spam) on
the attacker's behalf, tracking the user's actions, and downloading and installing additional
malicious software. Mail messages that carry viruses may be sent to email addresses obtained
from many different sources, including searching WHOIS for domain name contact addresses.
For additional information about Email Viruses, see Carnegie Mellon My Secure Cyberspace.
Denial of Service (DoS)
In a denial-of-service attack, an attacker attempts to prevent legitimate users from accessing or
making use of information or services. By targeting your computer and its network connection,
or the computers and network of Internet sites that you are trying to use, an attacker may be
able to prevent you from accessing email, websites, online service provider accounts (banking
sites, etc.), or other services that rely on the computers or networks that are under DoS attack.
Not all disruptions to service are the result of a DoS attack. There may be technical problems
with a particular network, or system administrators may be performing maintenance. However,
the following symptoms could indicate a DoS attack:
- unusually slow performance when opening files or accessing websites,
- unavailability of a particular website,
- inability to access any website, or
- a dramatic increase in the amount of spam that you receive.
DoS attacks may be launched against targets identified in many different sources, including
searching WHOIS for domain name contact names and addresses.
For additional information about DoS, see the United States Computer Emergency Readiness Team
(US-CERT) website.
Unauthorized Intrusion
Unauthorized intrusion occurs when an attacker gains access to services or information on a
computer system without the owner's permission. It is also possible that the attacker is a
legitimate user of the computer system, but has managed to obtain a higher level of access
than that user is authorized to have.
Unauthorized intrusion can happen in many ways. Some common techniques used by intruders
are sending malicious messages to the target's computer through the network, tricking the
administrator of the computer system into installing malicious software (see also Phishing), and
guessing the administrator's account username and password. Unauthorized intrusions may be
launched against targets identified in many different sources, including searching WHOIS for
domain name contact names and addresses.
For additional information about Unauthorized Intrusions, see Carnegie Mellon My Secure
Cyberspace.
Document Information
This document was prepared to help users completing surveys being conducted by computer
security researchers at Carnegie Mellon University - CyLab. This document is for research and
education purposes only, and is not for commercial or business purposes. Anyone can use this
document in part or whole by citing all the sources cited in this document, and adhering to the
terms of use specified by the sources cited in this document. All queries regarding this
document should be directed to [email protected].
Acknowledgement of sources
All sources used to create this document are specified below. Some sentences have been
quoted verbatim or with slight modifications to assist readers with limited knowledge of
computer terminology. Further, certain references to United States specific terminology (e.g.,
Social Security Number) have been reduced as this document is intended for use by an
international audience.
Identity Theft: http://www.ftc.gov/bcp/edu/microsites/idtheft/consumers/about-identity-theft.html#Whatisidentitytheft
Denial of Service: http://www.us-cert.gov/cas/tips/ST04-015.html
Phishing: http://www.icann.org/en/general/glossary.htm#P
Blackmail: http://en.wikipedia.org/wiki/Blackmail
Email Spam: http://www.spamhaus.org/definition.html
Email Viruses: http://www.mysecurecyberspace.com/encyclopedia/index/intrusion.html
Phishing: http://www.ftc.gov/bcp/edu/pubs/consumer/alerts/alt127.shtm
Vishing: http://www.fbi.gov/news/stories/2010/november/cyber_112410
Unauthorized Intrusions: http://www.mysecurecyberspace.com/encyclopedia/index/intrusion.html
12. Appendix C – Registrar and Registry Survey
12.1. Invitation to Participate
Carnegie Mellon University - CyLab
Dr. Nicolas Christin
http://www.andrew.cmu.edu/user/nicolasc/
Email: Please click here to verify authenticity of this email
Dear [Firstname] [Lastname],
We are researchers at Carnegie Mellon University in the United States, conducting a study
commissioned by ICANN on the extent to which public WHOIS contact data for gTLD domains
is misused to commit harmful acts such as spam, phishing, identity theft, stalking, etc. One
survey will target Registrants who believe cases of misuse have originated from WHOIS-
published contact data. We are asking gTLD Registries and a geographically diverse set of
Registrars to participate in this second related survey to learn how WHOIS data for sampled
domain names could possibly have been obtained (e.g., supported query vectors, applied anti-
harvesting measures). Because your organization is a Registry or Registrar associated with a
domain name included in our study’s random sample, we would like to learn about your WHOIS
data access practices. Your participation in this survey presents a great opportunity to share
your insights as a Service Provider about how prevalent public WHOIS data misuse may or may
not be and ways you think have been most effective in deterring WHOIS harvesting.
Please visit this link [URL] to view more information about this study, including important term
definitions that may be useful when you answer this survey and information about how we will
treat business-sensitive and personal information that you choose to share with us.
Should you be interested in participating in the survey, please do so in any of the following
ways:
- Complete and submit an on-line survey form by clicking [URL] (preferred),
- Download survey questions form [URL] and email answers to [email protected],
or
- Schedule a phone interview by responding to this email.
We are aiming to complete this survey by [closing date here]. Please note: If you do not wish to
participate and receive further communication from us, please click the link below, and you will
be automatically removed from our mailing list.
[RemoveLink]
The survey (or equivalently, the phone interview) should take about 25 minutes of your time,
and is a vital component of the study. If you explicitly permit us to do so, we may follow up with
you by phone or email in case we wish to clarify some of your answers.
Thank you in advance for your time and consideration. We look forward to your contribution.
Sincerely,
--
Nicolas Christin, Ph.D.
Carnegie Mellon University CyLab
12.2. Consent form
This survey is part of a research study conducted by Prof. Nicolas Christin at Carnegie Mellon
University.
The purpose of the research is to investigate the extent to which public availability of certain
information online leads to the information being misused by unauthorized parties.
Procedures
Participants are expected to answer a survey. The expected duration of participation is 25
minutes.
Participant Requirements
Participation in this study is limited to individuals age 18 and older.
Risks
The risks and discomfort associated with participation in this study are no greater than those
ordinarily encountered in daily life or during other online activities.
Benefits
There may be no personal benefit from your participation in the study but the knowledge
received may be of value to humanity.
Compensation & Costs
There is no compensation for participation in this study. There will be no cost to you if you
participate in this study.
Confidentiality
By participating in this research, you understand and agree that Carnegie Mellon may be
required to disclose your consent form, data and other personally identifiable information as
required by law, regulation, subpoena or court order. Otherwise, your confidentiality will be
maintained in the following manner:
Your data and consent form will be kept separate. Your consent form will be stored in a locked
location on Carnegie Mellon property and will not be disclosed to third parties. By participating,
you understand and agree that the data and information gathered during this study may be used
by Carnegie Mellon and published and/or disclosed by Carnegie Mellon to others outside of
Carnegie Mellon. However, your name, address, contact information and other direct personal
identifiers in your consent form will not be mentioned in any such publication or dissemination of
the research data and/or results by Carnegie Mellon.
Right to Ask Questions & Contact Information
If you have any questions about this study, you should feel free to ask them by contacting the
Principal Investigator now at
Dr. Nicolas Christin
Carnegie Mellon INI & CyLab
4720 Forbes Avenue, CIC Room 2108
Pittsburgh, PA 15217 USA
Phone: 412-268-4432
Email: [email protected]
If you have questions later, desire additional information, or wish to withdraw your participation
please contact the Principal Investigator by mail, phone or e-mail in accordance with the contact
information listed above.
If you have questions pertaining to your rights as a research participant, or wish to report
objections to this study, you should contact the Research Regulatory Compliance Office at Carnegie
Mellon University. Email: [email protected] . Phone: 412-268-1901 or 412-268-5460.
The Carnegie Mellon University Institutional Review Board (IRB) has approved the use of
human participants for this study.
Voluntary Participation
Your participation in this research is voluntary. You may discontinue participation at any time
during the research activity.
I am age 18 or older. Yes No
I have read and understand the information above. Yes No
I want to participate in this research and continue with the survey. Yes No
12.3. Survey questions
We are researchers at Carnegie Mellon University in the United States, conducting a study
commissioned by ICANN on the extent to which public WHOIS contact data for gTLD domains
is misused to commit harmful acts such as spam, phishing, identity theft, stalking, etc. One
survey will target Registrants who believe cases of misuse have originated from WHOIS-
published contact data. We are asking gTLD Registries and a geographically diverse set of
Registrars to participate in this second related survey to learn how WHOIS data for sampled
domain names could possibly have been obtained (e.g., supported query vectors, applied anti-
harvesting measures).
Because your organization is a Registry or Registrar associated with a domain name included in
our study’s random sample, we would like to learn about your WHOIS data access practices.
Your participation in this survey presents a great opportunity to share your insights as a Service
Provider about how prevalent public WHOIS data misuse may or may not be and ways you
think have been most effective in deterring WHOIS harvesting.
Please visit this link [URL] to view more information about this study, including important term
definitions that may be useful when you answer this survey and information about how we will
treat business-sensitive and personal information that you choose to share with us. If you have
any questions about this survey, or the underlying study, please contact Dr. Nicolas Christin at
This first set of questions is intended to capture general characteristics of the gTLD domain name Registrar or registration services that your organization provides.
0. Does your organization operate a gTLD Registry? If so, please list the generic top-level
domains for which your organization is responsible, separating values with commas (,).
[Open ended]
1. Does your organization operate as a gTLD domain name Registrar? If so, please list the
generic top-level domain(s) under which your organization offers registration services,
separating values with commas (,).
[Open ended]
2. In which country is your headquarters located?
(Drop down list)
3. Please indicate, by order of magnitude, the number of individual domain names for which your organization provides registration services (directly or indirectly), as of the response date.
- Exactly or under 100 000
- 100 001 to 1 000 000
- 1 000 001 to 10 000 000
- More than 10 000 000
4. Please indicate, by order of magnitude, the monthly number of WHOIS queries that you receive and respond to via any of the following means, without regard to the number of WHOIS records actually returned in those responses:
a) Port 43 WHOIS protocol query responses/month
- Exactly or under 100 000
- 100 001 to 1 000 000
- 1 000 001 to 10 000 000
- More than 10 000 000
- Do not know or do not measure
b) Web form WHOIS query responses/month
- Exactly or under 100 000
- 100 001 to 1 000 000
- 1 000 001 to 10 000 000
- More than 10 000 000
- Do not know or do not measure
c) Bulk WHOIS data purchase transactions/month
- Exactly or under 100 000
- 100 001 to 1 000 000
- 1 000 001 to 10 000 000
- More than 10 000 000
- Do not know or do not measure
d) Other WHOIS data request methods (please describe method and frequency)
[Open ended]
WHOIS anti-harvesting techniques
The following set of questions is intended to explore WHOIS anti-harvesting techniques that Registries and Registrars may have implemented to reduce WHOIS misuse, resist DoS attacks, or improve operating efficiency. Your response to these questions will help us better understand the extent to which Registries and Registrars have already taken steps to deter WHOIS harvesting and how.
5. Do you currently implement any techniques to deter WHOIS data harvesting?
[Yes/No]
[If no to Q5, next section]
[New page]
This section explores WHOIS anti-harvesting techniques that you may have implemented. For each technique, we will ask you to provide a short description of key parameters used to trigger or tune your implementation. This information will help us better understand both common and innovative techniques and assess their apparent impact on WHOIS misuse frequency.
Rate limiting
Here we ask you to describe two techniques used to limit WHOIS resolution request rates: limiting the requests sent to the well-known WHOIS port 43, and providing a web form for submitting WHOIS requests, which makes machine-automated request generation more difficult.
6. Do you implement Port 43 rate limiting?
[Yes/No]
6.1 [If yes to Q6] Please describe key parameters that affect your implementation, such as
thresholds used to activate the lock and the duration of the lock.
[Open ended]
7. Do you support WHOIS Query through a web form?
[Yes/No]
7.1 [If yes to Q7] Does submitting a WHOIS Query to your web form require answering a
CAPTCHA prompt?
[Yes/No]
7.2 [If yes to Q7] Do you apply any other (non-CAPTCHA) rate limiting to web form queries?
[Yes/No]
7.2.1 [If yes to Q7.2] Please describe key parameters that affect your web form rate limiting
implementation, such as thresholds used to activate the lock and the duration of the lock
[Open ended]
[New page]
Blacklisting
Here we ask you to describe anti-harvesting methods that involve some kind of sender blacklisting, used to prevent suspected WHOIS data harvesters from performing an unreasonably large number of WHOIS resolution requests.
8. Do you implement Permanent IP or Domain Name-based Blacklisting of suspected WHOIS
data harvesters?
[Yes/No]
9. Do you implement Temporary IP or Domain Name-based Blacklisting of suspected WHOIS data harvesters?
[Yes/No]
10. If you implement any form of blacklisting (either listed above, or other), please provide details about the types of blacklisting you have implemented, the criteria used to identify suspected WHOIS data harvesters, and any thresholds and durations used to determine when to activate or remove the blacklist.
[Open ended]
Privacy or Proxy registration services
Here we ask you to identify any services that you may already offer to domain name Registrants to address their concerns about harvesting of data published in WHOIS.
11. Do you offer a service which provides alternate WHOIS contact information and mail
forwarding services while not actually shielding the Registered Name Holder’s identity?
- Yes
- No
- Unknown or prefer not to answer
12. Do you offer a service which registers domain names on a customer's behalf and then
licenses domain use so that your identity and contact information is published in WHOIS?
- Yes
- No
- Unknown or prefer not to answer
13. Other techniques. Please provide a detailed description of any other techniques you have
implemented to deter misuse of WHOIS data obtained by harvesters.
[Open ended]
[New page]
The following set of questions will help us understand to what extent you receive feedback from domain name Registrants regarding harmful acts which they believe were carried out using WHOIS contact information. We are particularly interested in understanding whether these incidents can in fact be correlated to the information available on-line only through WHOIS.
14. Have any of the following harmful acts ever been reported to your organization by domain
name Registrants who suspected they were experiencing WHOIS data misuse? Please refer to
[URL] for our description of the following terms.
[Check all that apply]
- Denial of Service
- Phishing
- Vishing (voice phishing)
- Email spam
- Postal spam
- Email virus
- Unauthorized intrusion on servers
- Abuse of personal data or identity theft
- Blackmail/ransom demands/intimidation
- Registrants have reported experiencing harmful acts, but I prefer not to divulge specifics
- Other (Please describe)
- Prefer not to answer
15. Was your organization able to identify whether WHOIS contact data was in fact misused or
found to play a role in any of the above-reported incidents?
- Yes
- No
15.1. [If yes to Q15] Please supply details about how WHOIS contact data was misused and to
what extent. You may describe particular incidents involving suspected misuse and/or
aggregate statistics about how often misuse was either confirmed or ruled out.
- Open ended field
[New page]
The following set of questions will help us understand how Registries and Registrars take steps to combat WHOIS harvesting if and when such activity is detected.
16. Have you experienced any known WHOIS data harvesting incidents (successful or otherwise)?
[Yes/No]
16.1 [If yes to Q16] Can you provide any statistics that might quantify how often your
organization experiences WHOIS harvesting attempts (e.g., frequency of WHOIS rate limit
triggering or blacklisting)?
[Open ended field]
16.2 Can you describe how successful WHOIS data harvesting attempts (if any) have been detected and investigated?
[Open ended field]
17. Were your WHOIS anti-harvesting techniques implemented or adapted in response to past harvesting incidents?
[Yes/No]
17.1 [If yes to Q17] Which of the following anti-harvesting techniques (if any) did you adopt within the last 2 years after you realized that you were being targeted by WHOIS data harvesters?
[Check all that apply among the following:]
- Port 43 rate limiting
- Query only through a web form
- CAPTCHA
- Permanent IP or Domain Name Blacklisting
- Temporary IP or Domain Name Blacklisting
- Privacy or Proxy registration services
- Other anti-harvesting technique (please explain)
17.2 [If yes to Q17] For each technique deployed or adapted as a countermeasure, please give a short description of your rationale and the extent to which it has proven effective as a deterrent.
[Open ended]
[New page]
Finally, we wish to consider other access paths that harvesters can use to obtain WHOIS data. Your answer may help us assess alternative ways that misused WHOIS data may have been obtained and the possible impact of affiliates on WHOIS misuse frequency.
18. Please provide a list of key affiliates and partners that purchase domain name-related services from your organization (e.g., bulk purchase of domain names for resale, bulk access to WHOIS data).
[Open ended]
19. Do you grant us permission to contact you further in case we need clarifications about your
answers to this survey?
[YES/NO]
19.1 [If yes to Q19] If yes, please enter your email here.
[Open ended]
13. Bibliography
APWG. (2011). Phishing Attack Trends Report - Q2 2010. Anti-Phishing Working Group.
GNSO. (2007). Retrieved from http://gnso.icann.org/en/issues/whois-privacy/whois-services-final-tf-report-12mar07.htm
ICANN. (2009). Terms of reference for WHOIS misuse studies. Retrieved from
http://gnso.icann.org/issues/whois/tor-whois-misuse-studies-25sep09-en.pdf
NORC. (2010). Draft Report for the Study of the Accuracy of WHOIS Registrant Contact Information. University of Chicago.
SAC023. (2007). Is the WHOIS Service a Source for email Addresses for Spammers?
SAC028. (2008). SSAC Advisory on Registrar Impersonation – Phishing Attacks.