ADHD: Adaptive Design Honeypot Deployments
Kate Highnam
Imperial College London
London, UK
Zach Hanif
Independent Researcher
Washington, D.C., USA
Ellie Van Vogt
Imperial College London
London, UK
Sonali Parbhoo
Imperial College London
London, UK
Sergio Maffeis
Imperial College London
London, UK
Nicholas R. Jennings
Loughborough University
Loughborough, UK
ABSTRACT
Traditional honeypot deployments expose vulnerable systems in large quantities for extended periods of time to gather empirical information on intrusion techniques. This deployment strategy can be too slow to respond to emerging threats and gives attackers the opportunity to develop detection techniques against the honeypots employed. In this poster, we present a novel approach to honeypot deployments that optimises the allocation of resources (based on observed events) for the purpose of proving or disproving an initial hypothesis about the environment. Our adaptive design (AD) honeypot deployment is inspired by the clinical trial community: it is a variant of the randomised controlled trial (RCT), which measures how a particular “treatment” affects a population. While more restrictive in the breadth of questions it can answer than a traditional deployment, our directed approach quickly answers specific questions such as “Does inserting this API vulnerability increase the chances of seeing exploits on the SQL server?” and “Are attackers constantly exploiting misconfigured cloud instances across cloud regions in the U.S.?”. We run a study to answer the latter question and compare the RCT, AD, and traditional deployment methods. By conducting studies with a control, we uncover (with high confidence) the cause-and-effect relationship a vulnerability has with the likelihood of system exploitation.
CCS CONCEPTS
• Security and privacy → Vulnerability management; Network
security.
KEYWORDS
datasets, honeypots, randomised controlled trials, security, intrusion
research
ACM Reference Format:
Kate Highnam, Zach Hanif, Ellie Van Vogt, Sonali Parbhoo, Sergio Maffeis, and Nicholas R. Jennings. 2023. ADHD: Adaptive Design Honeypot
Deployments. In Proceedings of The 28th European Symposium on Research
in Computer Security (ESORICS ’23). ACM, New York, NY, USA, 3 pages.
https://doi.org/XXXXXXX.XXXXXXX
Permission to make digital or hard copies of all or part of this work for personal or
classroom use is granted without fee provided that copies are not made or distributed
for profit or commercial advantage and that copies bear this notice and the full citation
on the first page. Copyrights for components of this work owned by others than ACM
must be honored. Abstracting with credit is permitted. To copy otherwise, or republish,
to post on servers or to redistribute to lists, requires prior specific permission and/or a
fee. Request permissions from permissions@acm.org.
ESORICS ’23, September 25–29, 2023, The Hague, The Netherlands
© 2023 Association for Computing Machinery.
ACM ISBN 978-x-xxxx-xxxx-x/YY/MM. . . $15.00
https://doi.org/XXXXXXX.XXXXXXX
1 MOTIVATION AND METHODOLOGY
Traditional honeypot deployments (or “vanilla” deployments, as we will call them in this paper) expose a large number of identical vulnerable systems for a particular, long period of time [1, 4, 5, 7]. While sufficiently large and long-lived vanilla deployments all but guarantee observations and can summarise the general state of automated threats, they carry several risks and costs that may sometimes be unacceptable. For example, leaving a meaningful quantity of identical honeypots online gives adversaries the opportunity to identify the presence of the monitoring tools employed within the honeypot. This can hinder observations and render the tools useless, e.g., when adversaries stop acting after detecting active monitoring or debugging tools. Additionally, large-scale deployments cost time and money, which absorbs budget and can hinder or preclude timely observations.
As an alternative to the vanilla deployment, we propose to save resources and limit exposure by directing honeypot studies to answer pre-specified critical security questions about the current state of the environment. Our new deployment strategy answers “What if...?” and “Why...?” questions using a control group: a set of deployed honeypots that remain unchanged while others are altered. This allows us to measure the effect of our change on the environment. These techniques are slightly adapted from those used to design clinical trials. See Table 1 for some of the terminology from healthcare mapped to security as it is used to define our work.
In healthcare, a typical control group study follows the design of a randomised controlled trial (RCT), the gold standard for clinical trial methods. RCTs are used to minimise the impact of researcher biases while evaluating causal relationships [3]. Our method is based on adaptive design (AD), a variant of the RCT that incorporates pre-planned opportunities to modify aspects of an ongoing trial in response to data accumulated during the study, without invalidating its integrity [6, 8, 10]. Unlike AD in healthcare, we use the Kaplan-Meier function to calculate the likelihood of an event (i.e., a honeypot being exploited) in order to encourage infection, since infections are the observations of interest for intrusion research. As shown in Figure 1, RCT and AD both account for known conditions and unforeseen events (e.g., a pandemic or war) that might require the trial to end early, by separating the trial into multiple stages and running interim analyses. We show these methods can significantly reduce the cost of the study while answering security questions through preset objectives.
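For reference, the Kaplan-Meier estimator mentioned above is the standard non-parametric estimator of a survival function; the interpretation here (a honeypot “surviving” until it is exploited) follows the text, while the formula itself is the textbook definition rather than anything specific to our deployments:

\[
\hat{S}(t) \;=\; \prod_{i:\, t_i \le t} \left( 1 - \frac{d_i}{n_i} \right),
\]

where the $t_i$ are the observed exploitation times, $d_i$ is the number of honeypots exploited at time $t_i$, and $n_i$ is the number of honeypots still online and unexploited just before $t_i$. The estimated likelihood that a honeypot has been exploited by time $t$ is then $1 - \hat{S}(t)$.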
In this paper, we present the first control-based deployment strategy for honeypots that optimises resource allocation and limits honeypot exposure. The strategy is then employed in an exemplary study to determine the impact of an ssh vulnerability on cloud servers across the United States, comparing our AD to vanilla and RCT deployments.
Table 1: Mapping of healthcare terminology to security terminology for control trials deploying honeypots.

Healthcare                       Security
“trial”                          “a study comparing honeypots with and without a vulnerability”
“study population”               “a collection of individual honeypots”
“patient”, “participant”         “a honeypot”
“recruiting more subjects”       “starting more honeypots with specific characteristics”
“infection”                      “exploitation”
“disease”                        “attacker technique for exploit”
“intervention”, “treatment”      “corruption” or “the presence or insertion of a vulnerability”
“treated”                        “corrupted”
We find that AD reduces costs and can answer a directed question in a shorter amount of time while limiting the likelihood of error. This study ran using automated scripts, presenting the first automated control study.
2 EXPERIMENTS AND RESULTS
We compare the vanilla, RCT, and AD in separate honeypot deployments (i.e., trials) for the same study, following Figure 1. Our study population consists of inactive cloud servers with no applications or connections besides our monitoring in an isolated Docker container. Our corruption for the study is an ssh vulnerability: the honeypots are altered to accept any password for four fake IT user accounts. Because we never log in, any user login observed is considered malicious. Our control group only accepts “password” as the password for the same user accounts, which we know are also scanned for by attackers [4]. Our hypothesis for the study is that the chosen corruption significantly increases the likelihood of exploitation in the U.S. during local working hours. This closely follows the work of Highnam et al. [2], which contains the dataset used as our pilot study and our honeypot setup.
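As a minimal sketch of how the two arms could differ (our own illustration, not the authors' released honeypot code), the corruption reduces to an SSH password-authentication policy. The account names below are hypothetical placeholders, since the paper only states that four fake IT accounts were used.

```python
# Sketch of the control vs. corrupted ssh authentication policy (illustrative only).
import paramiko

FAKE_IT_ACCOUNTS = {"itadmin", "backup", "helpdesk", "sysops"}  # hypothetical names


class HoneypotAuthPolicy(paramiko.ServerInterface):
    def __init__(self, corrupted: bool):
        # corrupted=True  -> treatment arm: any password is accepted
        # corrupted=False -> control arm: only the weak password "password" works
        self.corrupted = corrupted

    def check_auth_password(self, username: str, password: str) -> int:
        if username not in FAKE_IT_ACCOUNTS:
            return paramiko.AUTH_FAILED
        if self.corrupted or password == "password":
            return paramiko.AUTH_SUCCESSFUL
        return paramiko.AUTH_FAILED

    def get_allowed_auths(self, username: str) -> str:
        return "password"


if __name__ == "__main__":
    # Exercise the policy without any networking: the corrupted arm accepts an
    # arbitrary password, while the control arm only accepts "password".
    assert HoneypotAuthPolicy(True).check_auth_password("itadmin", "hunter2") == paramiko.AUTH_SUCCESSFUL
    assert HoneypotAuthPolicy(False).check_auth_password("itadmin", "hunter2") == paramiko.AUTH_FAILED
    assert HoneypotAuthPolicy(False).check_auth_password("itadmin", "password") == paramiko.AUTH_SUCCESSFUL
```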
Each trial took approximately 12 hours; in the multi-stage trials (RCT and AD), the duration is the same but divided into three four-hour stages. After each stage, AD uses infection rates to determine allocation proportions by region. RCT maintains the same allocation proportions, redeploying the same number of control and corrupted honeypots unless a predefined early stopping criterion is met. The honeypots are deployed in one of four cloud provider regions geographically located within the U.S. Each region has distinct IP ranges, so IP scanning or region-specific attacks should be observable in the logs.
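The paper states only that per-region infection rates drive the next stage's allocation in AD; the exact rule is not given. The sketch below shows one simple reading, allocating the next stage's budget proportionally to smoothed infection rates; the proportional rule, the smoothing constant, and the region names are our assumptions.

```python
# Hedged sketch of a possible per-stage allocation update for the AD trial.
from typing import Dict


def next_stage_allocation(
    deployed: Dict[str, int],    # honeypots deployed per region in the last stage
    exploited: Dict[str, int],   # honeypots exploited per region in the last stage
    budget: int,                 # total honeypots available for the next stage
    smoothing: float = 1.0,      # keeps a region alive even with zero infections
) -> Dict[str, int]:
    rates = {
        region: (exploited.get(region, 0) + smoothing) / (deployed[region] + smoothing)
        for region in deployed
    }
    total = sum(rates.values())
    # Allocate the budget proportionally to smoothed infection rates.
    # Rounding may leave the total a honeypot or two off the exact budget.
    return {region: round(budget * rate / total) for region, rate in rates.items()}


if __name__ == "__main__":
    deployed = {"us-east-1": 12, "us-east-2": 12, "us-west-1": 12, "us-west-2": 12}
    exploited = {"us-east-1": 9, "us-east-2": 4, "us-west-1": 2, "us-west-2": 7}
    print(next_stage_allocation(deployed, exploited, budget=40))
```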
Over the 12-hour vanilla trial, which deployed 140 corrupted honeypots, there was no obvious pattern in the order in which the IP addresses were hit, and only 3 honeypots remained unexploited by the end.
end. Following power analysis [9] to limit error, the RCT deployed
Vanilla
Deploy triple
the given
Population Size
(3Ntotal ):
Corrupted only
Wait for
Trial
Duration
(3 ⨉ Stage
Length)
Calculate Power Analysis for Population Size per Stage (Ntotal )
Deploy the given Population Size (Ntotal ):
Equal parts control and corrupted
End Trial
Wait for Stage Duration
Randomized Control
Time to
Stop?
Yes
No
Adaptive Design
Gather Events from
Stage Logs
Calculate new stage
deployment
allocations
Update proportions
of incidence
Deploy new
allocations
Wait for Stage
Duration
Time to
Stop?
Yes
No
Start Trial
Set initial parameters
Figure 1: Flow diagram of the three trial designs documented.
“Time to Stop?” step is when the trial could end early.
48 honeypots at each stage, split equally between regions and cor- ruption status (control or corrupted). The AD trial started the same
as the RCT, but in later stages it automatically updated the allocation proportions, requesting fewer honeypots overall (AD: 119, RCT: 144). Despite deploying fewer honeypots, the AD trial observed more exploitations than the RCT (AD: 50, RCT: 42).
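The paper does not report the parameters of the power analysis [9] used to size each RCT stage. The sketch below shows one conventional way such a calculation could be done for comparing exploitation proportions between the control and corrupted arms; the baseline rate, expected corrupted rate, alpha, and power are illustrative assumptions, not the values used in the study.

```python
# Hedged sketch of a per-stage sample-size (power) calculation for comparing
# exploitation proportions between control and corrupted honeypots.
from statsmodels.stats.proportion import proportion_effectsize
from statsmodels.stats.power import NormalIndPower

p_control = 0.30    # assumed exploitation rate for control honeypots in one stage
p_corrupted = 0.70  # assumed exploitation rate once the ssh corruption is added

effect_size = proportion_effectsize(p_corrupted, p_control)  # Cohen's h
n_per_arm = NormalIndPower().solve_power(
    effect_size=effect_size,
    alpha=0.05,              # two-sided type-I error
    power=0.80,              # 1 - type-II error
    ratio=1.0,               # equal control and corrupted arms
    alternative="two-sided",
)
print(f"honeypots needed per arm per stage: {n_per_arm:.0f}")
```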
This study demonstrates that, while an observational study (i.e., the vanilla trial) records more intrusions, the presence of the control group (as in RCT and AD) enables us to identify the corruption effect. Our AD shows it is capable of confirming the corruption effect more cheaply and quickly than the RCT. Had the difference due to the corruption been less apparent a priori (e.g., when altering multiple points of entry or limiting sequences of vulnerability exploits), deploying control honeypots amongst the corrupted ones would still provide counterfactual information for confidence in the corruption’s (causal) effect.
REFERENCES
[1] Samuel Kelly Brew and Emmanuel Ahene. 2022. Threat Landscape Across Multiple Cloud Service Providers Using Honeypots as an Attack Source. In Frontiers in
Cyber Security: 5th International Conference, FCS 2022, Kumasi, Ghana, December
13–15, 2022, Proceedings. Springer, 163–179.
[2] Kate Highnam, Kai Arulkumaran, Zachary Hanif, and Nicholas R. Jennings. 2021.
BETH Dataset: Real Cybersecurity Data for Unsupervised Anomaly Detection
Research. The Conference on Applied Machine Learning in Information Security
(CAMLIS) (2021).
[3] Sherilyn Houle. 2015. An introduction to the fundamentals of randomized controlled trials in pharmacy research. The Canadian journal of hospital pharmacy
68, 1 (2015), 28.
[4] Christopher Kelly, Nikolaos Pitropakis, Alexios Mylonas, Sean McKeown, and
William J Buchanan. 2021. A comparative analysis of honeypots on different
cloud platforms. Sensors 21, 7 (2021), 2433.
[5] Stefan Machmeier. 2023. Honeypot Implementation in a Cloud Environment.
arXiv preprint arXiv:2301.00710 (2023).
[6] Philip Pallmann, Alun W Bedding, Babak Choodari-Oskooei, Munyaradzi Dimairo, Laura Flight, Lisa V Hampson, Jane Holmes, Adrian P Mander, Lang’o
Odondi, Matthew R Sydes, et al. 2018. Adaptive designs in clinical trials: why
use them, and how to run and report them. BMC medicine 16, 1 (2018), 1–15.
[7] Niels Provos and Thorsten Holz. 2007. Virtual honeypots: from botnet tracking to
intrusion detection. Pearson Education.
[8] Nigel Stallard, Lisa Hampson, Norbert Benda, Werner Brannath, Thomas Burnett,
Tim Friede, Peter K Kimani, Franz Koenig, Johannes Krisam, Pavel Mozgunov,
et al. 2020. Efficient adaptive designs for clinical trials of interventions for
COVID-19. Statistics in Biopharmaceutical Research 12, 4 (2020), 483–497.
[9] Kristian Thorlund, Shirin Golchi, Jonas Haggstrom, and Edward Mills. 2019.
Highly Efficient Clinical Trials Simulator (HECT): Software application for planning and simulating platform adaptive trials. Gates Open Research 3 (2019).
[10] CH van Werkhoven, S Harbarth, and MJM Bonten. 2019. Adaptive designs in
clinical trials in critically ill patients: principles, advantages and pitfalls. Intensive
Care Medicine 45, 5 (2019), 678–682.
Received 21 July 2023