Web Intellectual Property at Risk: Preventing Unauthorized Real-Time Retrieval by Large Language Models
Yisheng Zhong, Yizhu Wen, Junfeng Guo, Mehran Kafai, Heng Huang, Hanqing Guo, Zhuangdi Zhu
May 2025
Abstract
Protecting cyber intellectual property (IP), such as web content, is an increasingly critical concern. The rise of large language models (LLMs) with online retrieval capabilities makes information convenient to access but often undermines the rights of original content creators. In response, we propose a novel defense framework that empowers web content creators to safeguard their web-based IP from unauthorized real-time extraction and redistribution by LLMs.
Publication
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing (EMNLP 2025)