Connect the Dots: Knowledge Graph-Guided Crawler Attack on Retrieval-Augmented Generation Systems

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

Stealing attacks pose a persistent threat to the intellectual property of deployed machine-learning systems. Retrieval-augmented generation (RAG) intensifies this risk by extending the attack surface beyond model weights to the knowledge base, which often contains IP-bearing assets such as proprietary runbooks, curated domain collections, or licensed documents. Recent work shows that multi-turn questioning can gradually steal corpus content from RAG systems, yet existing attacks are largely heuristic and often plateau early. We address this gap by formulating RAG knowledge-base stealing as an adaptive stochastic coverage problem (ASCP), where each query is a stochastic action and the goal is to maximize the conditional expected marginal gain (CMG) in corpus coverage under a query budget. Bridging ASCP to real-world black-box RAG knowledge-base stealing raises three challenges: CMG is unobservable, the natural-language action space is intractably large, and feasibility constraints require stealthy queries that remain effective under diverse architectures. We introduce RAGCrawler, a knowledge graph-guided attacker that maintains a global attacker-side state to estimate coverage gains, schedule high-value semantic anchors, and generate non-redundant natural queries. Across four corpora and four generators with a BGE retriever, RAGCrawler achieves 66.8% average coverage (up to 84.4%) within 1,000 queries, improving coverage by 44.90% relative to the strongest baseline. It also reduces the queries needed to reach 70% coverage by at least 4.03x on average and enables surrogate reconstruction with answer similarity up to 0.699. Our attack also generalizes to retriever switching and to newer RAG techniques such as query rewriting and multi-query retrieval. These results highlight the urgent need to protect RAG knowledge assets.


💡 Research Summary

The paper addresses a newly emerging threat to Retrieval‑Augmented Generation (RAG) services: the theft of the underlying knowledge base (KB) that the retriever draws from. While prior model‑stealing work focuses on extracting model weights, RAG systems expose proprietary documents—runbooks, licensed manuals, internal tickets—through every generated answer. Existing multi‑turn stealing attacks rely on simple heuristics such as continuing the last response or expanding on extracted keywords. These approaches quickly drift off‑topic or repeatedly query the same semantic region, causing the coverage of the hidden corpus to plateau well before the query budget is exhausted.

To overcome these limitations, the authors formalize KB stealing as an instance of the Adaptive Stochastic Coverage Problem (ASCP), which they term “RAG Crawling.” In this formulation, each query is an action that stochastically reveals a subset of hidden items (documents) under an unknown realization of the world. The attacker’s objective is to maximize the expected number of unique items uncovered given a fixed query budget B. Because the coverage function is adaptively monotone and adaptively submodular, the classic adaptive‑greedy policy, which at each step selects the query with the largest Conditional Expected Marginal Gain (CMG), enjoys a (1‑1/e) approximation guarantee relative to the optimal adaptive policy. This provides a solid theoretical foundation for a near‑optimal (not globally optimal) stealing strategy.
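The adaptive‑greedy rule described above can be illustrated on a toy instance. The sketch below is not the paper's implementation: it assumes the attacker knows, for every candidate query, the probability of revealing each document, which is exactly the information that is unobservable in the black‑box setting. The names `sample_realization`, `expected_gain`, and `adaptive_greedy` are hypothetical.

```python
import random

def sample_realization(queries, rng):
    """Fix the hidden world state: which documents each query would reveal."""
    return {
        q: {doc for doc, p in probs.items() if rng.random() < p}
        for q, probs in queries.items()
    }

def expected_gain(probs, covered):
    """Expected number of not-yet-covered documents a query reveals (toy CMG)."""
    return sum(p for doc, p in probs.items() if doc not in covered)

def adaptive_greedy(queries, realization, budget):
    """Issue up to `budget` queries, always taking the largest expected gain.

    queries: dict query -> {doc_id: reveal probability} (assumed known here).
    realization: dict query -> set of documents actually revealed.
    """
    covered, history = set(), []
    remaining = dict(queries)  # each query is issued at most once
    for _ in range(budget):
        if not remaining:
            break
        best = max(remaining, key=lambda q: expected_gain(remaining[q], covered))
        if expected_gain(remaining[best], covered) == 0:
            break  # no remaining query is expected to reveal anything new
        covered |= realization[best]  # observe the outcome, update the state
        history.append(best)
        del remaining[best]
    return covered, history

if __name__ == "__main__":
    toy_queries = {
        "a": {"d1": 0.9, "d2": 0.8},
        "b": {"d2": 0.8, "d3": 0.7},
        "c": {"d3": 0.4},
    }
    world = sample_realization(toy_queries, random.Random(0))
    print(adaptive_greedy(toy_queries, world, budget=2))
```

Because outcomes of different queries are independent in this toy model, conditioning on past observations only removes already‑covered documents from the sum. In a real attack the reveal probabilities have to be estimated, which is the role the attacker‑side state plays in RAGCrawler.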

Implementing the adaptive‑greedy principle in a realistic black‑box setting raises three practical challenges: (1) CMG is unobservable because the attacker lacks direct access to the corpus; (2) the natural‑language query space is effectively infinite, making exhaustive search infeasible; (3) queries must remain innocuous, comply with service policies, and stay effective under common defenses such as query rewriting and multi‑query retrieval.

The proposed solution, RAGCrawler, consists of three tightly coupled stages that together instantiate the adaptive‑greedy policy under real‑world constraints.

  1. Knowledge‑Graph Construction – After each response, the attacker extracts entities and relations (e.g., issue, module, hot‑fix) and inserts them into an attacker‑side knowledge graph (KG). The KG serves as a global state, tracking which facts have been revealed and which relational gaps remain. By maintaining this structured summary, the attacker can estimate the marginal contribution of future queries to overall coverage.

  2. Strategy Scheduling – Using the KG’s growth dynamics and historical gain statistics, the system selects a set of high‑value “semantic anchors.” Anchors correspond to uncovered relations or low‑frequency nodes (e.g., an issue whose hot‑fix has never been seen). This step dramatically prunes the action space, focusing the search on queries that are expected to yield the highest CMG.

  3. Query Generation – The chosen anchors are transformed into natural‑language queries via prompt engineering and a history‑aware deduplication module. The generator produces fluent, policy‑compliant questions that avoid repeated retrieval of already‑known documents while still steering the retriever toward the targeted gaps.
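The three stages above can be sketched as a single crawl loop. This is a minimal illustration under strong assumptions, not the authors' code. Responses are assumed to arrive as `head | relation | tail` lines, whereas a real attacker would extract triples from free text. Anchors are ranked by a hypothetical heuristic: least‑queried first, then fewest known relations. Queries come from one fixed template. All class and function names are invented for the example.

```python
from collections import defaultdict

class AttackerKG:
    """Attacker-side knowledge graph: entity -> {(relation, other_entity)}."""
    def __init__(self):
        self.edges = {}

    def add_triples(self, triples):
        """Insert (head, relation, tail) triples; return how many were new."""
        new = 0
        for head, rel, tail in triples:
            out = self.edges.setdefault(head, set())
            if (rel, tail) not in out:
                out.add((rel, tail))
                new += 1
            self.edges.setdefault(tail, set())  # register the tail as a node
        return new

def extract_triples(response):
    """Stage 1 stand-in: parse 'head | relation | tail' lines (assumed format)."""
    triples = []
    for line in response.splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) == 3 and all(parts):
            triples.append(tuple(parts))
    return triples

def schedule_anchors(kg, times_queried):
    """Stage 2: rank entities, least-queried and least-connected first."""
    return sorted(kg.edges, key=lambda e: (times_queried[e], len(kg.edges[e]), e))

def generate_query(anchor, asked):
    """Stage 3: template a question; return None if it was already asked."""
    query = f"What is documented about {anchor}, and what is it related to?"
    key = " ".join(query.lower().split())
    if key in asked:
        return None
    asked.add(key)
    return query

def crawl(rag, seed_query, budget):
    """Alternate between querying the victim and updating the attacker state."""
    kg, asked, times_queried = AttackerKG(), set(), defaultdict(int)
    query = seed_query
    for _ in range(budget):
        kg.add_triples(extract_triples(rag(query)))
        query = None
        for anchor in schedule_anchors(kg, times_queried):
            times_queried[anchor] += 1
            query = generate_query(anchor, asked)
            if query is not None:
                break
        if query is None:  # every anchor has already been asked about
            break
    return kg

def toy_rag(query):
    """Hypothetical victim: returns the facts whose key appears in the query."""
    facts = {
        "login-timeout": "login-timeout | fixed_by | hotfix-12",
        "hotfix-12": "hotfix-12 | patches | auth-module",
        "auth-module": "auth-module | owned_by | platform-team",
    }
    return "\n".join(text for key, text in facts.items() if key in query)
```

Calling `crawl(toy_rag, "Tell me about login-timeout.", budget=10)` walks from the seed issue to its hot‑fix, the patched module, and the owning team, then stops once every anchor has been asked about. The fixed template keeps the sketch short; the summary describes prompt‑engineered queries and history‑aware deduplication, which this string‑level check only approximates.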

The authors evaluate RAGCrawler on four corpora (including a proprietary troubleshooting knowledge base) and four generators paired with a BGE retriever, and additionally test victims that use query rewriting, multi‑query retrieval, or a hardened “safeguard” configuration. With a budget of 1,000 queries, RAGCrawler achieves an average coverage of 66.8% and a peak of 84.4%, a 44.90% relative improvement over the strongest baseline. It also reduces the number of queries needed to reach 70% coverage by at least a factor of 4.03 on average. To demonstrate the practical impact, the stolen documents are used to reconstruct a surrogate RAG system, whose answers reach an answer‑similarity score of up to 0.699 (ROUGE‑L) with respect to the original service. The approach also remains effective when the victim switches retrievers, indicating strong adaptability to evolving RAG designs.
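The two headline metrics, coverage under a query budget and queries needed to reach a coverage threshold, can be computed from a per‑query retrieval log. The sketch below assumes coverage means the fraction of distinct corpus documents exposed so far; the paper's exact definition may differ. The function names are hypothetical, and any logs passed in would be the evaluator's own ground‑truth records, not something the attacker observes.

```python
def coverage_curve(retrieved_per_query, corpus_size):
    """Cumulative fraction of distinct documents exposed after each query."""
    seen, curve = set(), []
    for docs in retrieved_per_query:
        seen.update(docs)
        curve.append(len(seen) / corpus_size)
    return curve

def queries_to_reach(curve, threshold):
    """1-based index of the first query at which coverage >= threshold, else None."""
    for i, coverage in enumerate(curve, start=1):
        if coverage >= threshold:
            return i
    return None

def query_reduction(attack_curve, baseline_curve, threshold):
    """How many times fewer queries the attack needs than the baseline."""
    attack_n = queries_to_reach(attack_curve, threshold)
    baseline_n = queries_to_reach(baseline_curve, threshold)
    if attack_n is None or baseline_n is None:
        return None  # at least one method never reaches the threshold
    return baseline_n / attack_n
```

Returning `None` when a method never reaches the threshold matters in practice: if a baseline plateaus below 70% within the budget, a finite reduction factor cannot be computed for it, which may be one reason such results are reported as lower bounds ("at least").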

The paper’s contributions are threefold: (i) a rigorous theoretical model of RAG knowledge‑base stealing as ASCP, with provable near‑optimality guarantees; (ii) a concrete attacker‑side KG‑driven framework that estimates CMG, schedules high‑gain actions, and generates stealthy natural queries; (iii) extensive empirical validation across diverse datasets and defensive settings, establishing the seriousness of the threat. The findings call for immediate development of defenses tailored to protect the knowledge base in RAG pipelines—such as response randomization, selective document exposure, and anomaly detection on query patterns—to mitigate the risk of large‑scale IP exfiltration.

