Web Intellectual Property at Risk: Preventing Unauthorized Real-Time Retrieval by Large Language Models

Abstract

Protecting cyber intellectual property (IP), such as web content, is an increasingly critical concern. The rise of large language models (LLMs) with online retrieval capabilities enables convenient access to information but often undermines the rights of original content creators. In response, we propose a novel defense framework that empowers web content creators to safeguard their web-based IP from unauthorized real-time extraction and redistribution by LLMs.

Publication
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing (EMNLP 2025)
Zhuangdi Zhu
Assistant Professor (Tenure-Track)

My research centers on accountable, scalable, and trustworthy AI, including decentralized machine learning, knowledge transfer for supervised and reinforcement learning, and debiased representation learning.