Web Intellectual Property at Risk: Preventing Unauthorized Real-Time Retrieval by Large Language Models

Yisheng Zhong, Yizhu Wen, Junfeng Guo, Mehran Kafai, Heng Huang, Hanqing Guo, Zhuangdi Zhu

May 2025

Abstract

The protection of cyber intellectual property (IP), such as web content, is an increasingly critical concern. The rise of large language models (LLMs) with online retrieval capabilities enables convenient access to information but often undermines the rights of original content creators. In response, we propose a novel defense framework that empowers web content creators to safeguard their web-based IP from unauthorized real-time extraction and redistribution by LLMs.

Type

Preprint

Publication

Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing (EMNLP 2025)

Zhuangdi Zhu

Assistant Professor (Tenure-Track)

My research centers on accountable, scalable, and trustworthy AI, including decentralized machine learning, knowledge transfer for supervised and reinforcement learning, and debiased representation learning.