Security & Privacy, TU Wien ‒

Roles

Assistant Professor

Contact

Courses

2026S

Project in Computer Science 1 / PR / 192.021
Project in Computer Science 2 / PR / 192.022
Seminar for PhD Students / SE / 192.060
Bachelor Thesis / PR / 192.061
Scientific Research and Writing / SE / 193.052

2025W

Project in Computer Science 1 / PR / 192.021
Project in Computer Science 2 / PR / 192.022
Seminar for PhD Students / SE / 192.060
Bachelor Thesis / PR / 192.061
Machine Learning for Computer Security / VU / 192.172

Projects (at TU Wien)

BREADS : Building Robust and Explainable AI-based Defenses for Computer Security
2024 - 2030 / Vienna Science and Technology Fund (WWTF)

Publications (created while at TU Wien)

2025

Intriguing Properties of Adversarial ML Attacks in the Problem Space [Extended Version]
Cortellazzi, J., Quiring, E., Arp, D., Pendlebury, F., Pierazzi, F., & Cavallaro, L. (2025). Intriguing Properties of Adversarial ML Attacks in the Problem Space [Extended Version]. ACM Transactions on Privacy and Security, 28(4), 1–37.
DOI: 10.1145/3742895 Metadata
Abstract
Recent research efforts on adversarial machine learning (ML) have investigated problem-space attacks, focusing on the generation of real evasive objects in domains where, unlike images, there is no clear inverse mapping to the feature space (e.g., software). However, the design, comparison, and real-world implications of problem-space attacks remain underexplored. This article makes three major contributions. Firstly, we propose a general formalization for adversarial ML evasion attacks in the problem-space, which includes the definition of a comprehensive set of constraints on available transformations, preserved semantics, absent artifacts, and plausibility. We shed light on the relationship between feature space and problem space, and we introduce the concept of side-effect features as the by-product of the inverse feature-mapping problem. This enables us to define and prove necessary and sufficient conditions for the existence of problem-space attacks. Secondly, building on our general formalization, we propose a novel problem-space attack on Android malware that overcomes past limitations in terms of semantics and artifacts. We have tested our approach on a dataset with 150K Android apps from 2016 and 2018 which show the practical feasibility of evading a state-of-the-art malware classifier along with its hardened version. Thirdly, we explore the effectiveness of adversarial training as a possible approach to enforce robustness against adversarial samples, evaluating its effectiveness on the considered machine learning models under different scenarios. Our results demonstrate that “adversarial-malware as a service” is a realistic threat, as we automatically generate thousands of realistic and inconspicuous adversarial applications at scale, where on average it takes only a few minutes to generate an adversarial instance.
Seeing through: analyzing and attacking virtual backgrounds in video calls
Weißberg, F., Hilgefort, J. M., Grogorick, S., Arp, D., Eisenhofer, T., Eisemann, M., & Rieck, K. (2025). Seeing through: analyzing and attacking virtual backgrounds in video calls. In SEC ’25: Proceedings of the 34th USENIX Conference on Security Symposium (pp. 6561–6580). Association for Computing Machinery.
Metadata
Abstract
Video calls have become an essential part of remote work. They enable employees to collaborate from different locations, including their homes. Transmitting video from the personal living environment, however, poses a privacy risk: Colleagues may gain insight into private information through details in the background. To limit this risk, video conferencing services implement virtual backgrounds that conceal the real environment during a video call. Unfortunately, this protection suffers from imperfections and pixels from the environment occasionally become visible. In this paper, we investigate this privacy leak. We analyze the virtual background techniques used in two major video conferencing services (Zoom and Google) and determine how pixels of the environment leak. Based on this analysis, we propose a reconstruction attack: This attack removes the virtual background by re-purposing the video conferencing software and uses semantic segmentation to filter out the video caller. As a result, only pixels leaking from the environment remain and can be aggregated into a reconstructed image. We examine the efficacy of this attack in a quantitative and qualitative evaluation. In comparison to previous studies, our attack recovers at least 53% more leaked pixels from a video call, exposing larger areas of the environment. We thus conclude that virtual backgrounds currently do not provide an adequate protection in practice.
Rule Extraction and Interaction-Aware Explainability for AI-Driven Malware Detection
Anthony, P., Galadima, K. R., Adams, Z., Onoja, M., Arp, D., Homola, M., & Balogh, Š. (2025). Rule Extraction and Interaction-Aware Explainability for AI-Driven Malware Detection. In A. Hogan, K. Satoh, H. Dağ, A.-Y. Turhan, D. Roman, & A. Soylu (Eds.), Rules and Reasoning : 9th International Joint Conference, RuleML+RR 2025, Istanbul, Turkey, September 22–24, 2025, Proceedings (pp. 137–155). Springer.
DOI: 10.1007/978-3-032-08887-1_9 Metadata
Abstract
As machine learning becomes integral to malware detection, the demand for interpretability has become critical, not only to understand model decisions, but also to support actionable insights for analysts. While post-hoc techniques like SHAP, LIME, and Anchor offer feature attributions or instance-level rules, they fail to capture generalized semantic patterns across malware samples. To address this, we propose a unified and extensible explainability framework for binarized malware features, offering three levels of interpretability: (1) first-order explanations (individual feature effects), (2) second-order explanations (pairwise interactions revealing nonlinear dependencies), and (3) higher-order, rule-based explanations that formalize joint feature contributions for deeper analytical insight. Our framework builds on an MLP-based detector trained on the EMBER dataset. It first uses SHAP to assess global feature relevance and then introduces two key extensions: (i) a SHAP-based interaction formalism that reveals synergistic and antagonistic effects among features, and (ii) a generalized Anchor algorithm that extracts symbolic, reusable rules to illuminate model behavior and malware patterns. Our global rules achieve an F1 score of 83% on EMBER and perfectly reconstruct nonlinear decision boundaries in synthetic benchmarks (100% F1 on the XoR dataset). Analysis of EMBER’s extracted rules reveals that the black-box model’s logic often relies on structural anomalies, prioritizing statistical patterns rather than capturing meaningful behavioral patterns indicative of known malware tactics.

2024

Pitfalls in Machine Learning for Computer Security
Arp, D., Quiring, E., Pendlebury, F., Warnecke, A., Pierazzi, F., Wressnegger, C., Cavallaro, L., & Rieck, K. (2024). Pitfalls in Machine Learning for Computer Security. Communications of the ACM, 67(11), 104–112.
DOI: 10.1145/3643456 Metadata
Abstract
With the growing processing power of computing systems and the increasing availability of massive datasets, machine learning algorithms have led to major breakthroughs in many different areas. This development has influenced computer security, spawning a series of work on learning-based security systems, such as for malware detection, vulnerability discovery, and binary code analysis. Despite great potential, machine learning in security is prone to subtle pitfalls that undermine its performance and render learning-based systems potentially unsuitable for security tasks and practical deployment. In this paper, we look at this problem with critical eyes. First, we identify common pitfalls in the design, implementation, and evaluation of learning-based security systems. We conduct a study of 30 papers from top-tier security conferences within the past 10 years, confirming that these pitfalls are widespread in the current security literature. In an empirical analysis, we further demonstrate how individual pitfalls can lead to unrealistic performance and interpretations, obstructing the understanding of the security problem at hand. As a remedy, we propose actionable recommendations to support researchers in avoiding or mitigating the pitfalls where possible. Furthermore, we identify open problems when applying machine learning in security and provide directions for further research.

Daniel Christopher Arp