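Retrieval-Augmented Generation (RAG) is one of the main approaches to putting large language models into production: it addresses the limitations of an LLM by supplying it with up-to-date data as context. A typical use case is a chatbot built on a company's own documents and knowledge base, which lets employees search internal documentation quickly. RAG also suits domain-specific GenAI applications such as healthcare, finance, and legal services. Naive RAG performs well on simple questions but falls short on complex tasks. This article examines the limitations of Naive RAG and shows how introducing an agentic approach makes a RAG system more capable and practical.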
01.
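Naive RAG performs well on simple questions, for example:
"What are Tesla's main risk factors?" (based on Tesla's 2021 10-K report)
"What features does Milvus 2.4 have?" (based on the Milvus 2.4 release notes)
With more complex questions, however, the limitations of Naive RAG become apparent.
Summarization questions: for example, "Give me a summary of the company's 10-K annual report." Naive RAG struggles to produce a comprehensive summary without losing important information.
Comparison questions: for example, "What are the differences between Milvus 2.4 and Milvus 2.3?" Naive RAG struggles to compare multiple documents effectively.
Structured analysis and semantic search: for example, "Tell me the risk factors of the best-performing ride-hailing company in the US." Naive RAG struggles when a question combines structured analysis with semantic search.
General multi-part questions: for example, "Tell me the arguments supporting X in article A, then the arguments supporting Y in article B, make a table following our internal style guide, and then draw your own conclusion from these facts." Naive RAG struggles with multi-step, multi-part tasks.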
02.
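Why Naive RAG has these pain points
Single-pass processing: Naive RAG usually handles a query in one shot and lacks multi-step reasoning.
No query understanding or planning: Naive RAG cannot analyze the complexity of a query in depth, nor can it plan a task.
No tool use: Naive RAG cannot call external tools or APIs to help complete a task.
No reflection or error correction: Naive RAG cannot improve its output based on feedback.
No memory (stateless): Naive RAG does not remember the conversation history, so it cannot keep context consistent across a multi-turn conversation.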
03.
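From RAG to Agentic RAG
To overcome the limitations of Naive RAG, we can introduce an agentic approach that makes the RAG system more intelligent and flexible.
Routing
Routing is the simplest form of agentic reasoning. Given a user query and a set of choices, the system outputs a subset of those choices and routes the query to the appropriate processing modules.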
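As a minimal sketch of the routing step, the snippet below selects a subset of candidate modules for a query. The keyword matching only stands in for an LLM-based selector, and the module names and keyword lists are hypothetical, chosen purely for illustration.

```python
from typing import Dict, List

# Hypothetical routing table: module name -> trigger keywords.
# A real agent would ask an LLM to choose; keywords keep the sketch runnable.
ROUTES: Dict[str, List[str]] = {
    "summary_tool": ["summary", "summarize", "overview"],
    "vector_tool": ["what", "which", "how", "feature"],
}


def route(query: str, routes: Dict[str, List[str]] = ROUTES) -> List[str]:
    """Return the subset of module names whose keywords appear in the query."""
    words = query.lower().split()
    selected = [
        name for name, keys in routes.items() if any(k in words for k in keys)
    ]
    # Fall back to every module when nothing matches.
    return selected or list(routes)
```

The fallback to all modules is one possible design choice; a stricter router could instead return an empty list and let the caller decide.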
Tool use
The agent calls external tools or APIs to help complete a task, for example a weather API to fetch the latest weather information.
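Tool use can be pictured as a registry that maps tool names to callables. The sketch below assumes a stand-in `get_weather` function that returns canned data rather than calling a real weather service; the registry layout and all names are illustrative, not part of any library.

```python
from typing import Callable, Dict


def get_weather(city: str) -> str:
    """Stand-in for a real weather API call; returns canned data."""
    fake_data = {"Beijing": "sunny, 25C", "Shanghai": "rain, 22C"}
    return fake_data.get(city, "unknown")


# Registry mapping tool names to callables, so an agent can dispatch by name.
TOOLS: Dict[str, Callable[..., str]] = {"get_weather": get_weather}


def call_tool(name: str, **kwargs) -> str:
    """Look up a tool by name and invoke it with keyword arguments."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**kwargs)
```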
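Query/task planning
The query is decomposed into sub-queries that can be processed in parallel. Each sub-query can be run against any set of RAG pipelines, which improves both efficiency and accuracy.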
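The decomposition described here can be sketched as a planner that emits sub-queries and an executor that runs them concurrently. The `plan` function below is a toy that produces one sub-query per subject (an LLM would normally do the decomposition), and `pipeline` is any callable standing in for a RAG pipeline; both are assumptions for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def plan(query: str, subjects: List[str]) -> List[str]:
    """Toy planner: one sub-query per subject."""
    return [f"{query} ({s})" for s in subjects]


def run_plan(
    sub_queries: List[str], pipeline: Callable[[str], str]
) -> Dict[str, str]:
    """Execute sub-queries concurrently and collect answers keyed by sub-query."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        # pool.map preserves input order, so answers line up with sub_queries.
        answers = list(pool.map(pipeline, sub_queries))
    return dict(zip(sub_queries, answers))
```

Threads fit here because RAG pipelines are typically I/O-bound (network calls to a vector store and an LLM).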
Reflection
Feedback is used to improve the agent's execution and reduce errors; the feedback can come from the LLM itself.
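One way to picture reflection is a generate-critique-retry loop. In the sketch below, `generate` and `critique` are hypothetical callables standing in for LLM calls: `critique` returns `None` when the answer is acceptable and feedback text otherwise.

```python
from typing import Callable, Optional, Tuple


def reflect_loop(
    generate: Callable[[str, Optional[str]], str],
    critique: Callable[[str], Optional[str]],
    query: str,
    max_rounds: int = 3,
) -> Tuple[str, int]:
    """Generate an answer, ask a critic for feedback, and retry until it passes.

    Returns the last answer and the number of rounds used.
    """
    feedback: Optional[str] = None
    answer = ""
    for round_no in range(1, max_rounds + 1):
        # The previous round's feedback (if any) is passed back to the generator.
        answer = generate(query, feedback)
        feedback = critique(answer)
        if feedback is None:
            return answer, round_no
    return answer, max_rounds
```

Capping the number of rounds keeps the loop from running indefinitely when the critic is never satisfied.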
Memory
Besides the current query, the conversation history can also be fed into the RAG pipeline, which keeps context consistent across a multi-turn conversation.
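A minimal sketch of such memory is a bounded buffer of past turns that is rendered in front of each new query. The `ChatMemory` class and its prompt layout are assumptions for illustration, not a LlamaIndex API.

```python
from collections import deque
from typing import Deque, Tuple


class ChatMemory:
    """Keeps the most recent turns and prepends them to each new query."""

    def __init__(self, max_turns: int = 5) -> None:
        # deque with maxlen silently drops the oldest turn when full.
        self.turns: Deque[Tuple[str, str]] = deque(maxlen=max_turns)

    def add(self, user: str, assistant: str) -> None:
        self.turns.append((user, assistant))

    def build_prompt(self, query: str) -> str:
        """Render history plus the current query as one prompt string."""
        lines = []
        for user, assistant in self.turns:
            lines.append(f"User: {user}")
            lines.append(f"Assistant: {assistant}")
        lines.append(f"User: {query}")
        return "\n".join(lines)
```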
04.
Practice
We build an Agentic RAG example with Milvus and LlamaIndex.
First, we split the Milvus 2.3 and 2.4 release notes with LlamaIndex's SentenceWindowNodeParser and import the resulting nodes into Milvus.
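    from llama_index.core import StorageContext, VectorStoreIndex
    from llama_index.core.node_parser import SentenceWindowNodeParser
    from llama_index.vector_stores.milvus import MilvusVectorStore

    # `documents` is assumed to hold the already-loaded release note documents.
    node_parser = SentenceWindowNodeParser.from_defaults(
        window_size=3,
        window_metadata_key="window",
        original_text_metadata_key="original_text",
    )
    # Extract nodes from documents
    nodes = node_parser.get_nodes_from_documents(documents)

    vector_store = MilvusVectorStore(
        dim=1536,
        uri="http://localhost:19530",
        collection_name="agentic_rag",
        overwrite=True,
        enable_sparse=False,
        hybrid_ranker="RRFRanker",
        hybrid_ranker_params={"k": 60},
    )
    storage_context = StorageContext.from_defaults(vector_store=vector_store)
    index = VectorStoreIndex(nodes, storage_context=storage_context)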
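To make the sentence-window idea concrete, the sketch below turns each sentence into a node and stores its neighbouring sentences under a `window` key, mirroring the `window` and `original_text` metadata keys used above. It is a simplified stand-in and not LlamaIndex's implementation; the regex sentence splitter in particular is an assumption.

```python
import re
from typing import Dict, List


def sentence_windows(text: str, window_size: int = 3) -> List[Dict[str, str]]:
    """Split text into sentences; attach up to `window_size` neighbours per side."""
    sentences = [
        s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()
    ]
    nodes = []
    for i, sentence in enumerate(sentences):
        lo = max(0, i - window_size)
        hi = min(len(sentences), i + window_size + 1)
        nodes.append(
            {
                "original_text": sentence,  # what gets embedded and matched
                "window": " ".join(sentences[lo:hi]),  # context given to the LLM
            }
        )
    return nodes
```

Retrieval matches on the single sentence, while the wider window is what is handed to the LLM, which is the role `MetadataReplacementPostProcessor` plays in the query step.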
Next, we define two agent tools: a vector query tool and a summary tool. The vector query tool uses Milvus hybrid search, and the summary tool uses LlamaIndex's SummaryIndex to summarize the document chunks.
    from typing import List, Optional

    from llama_index.core import SummaryIndex
    from llama_index.core.postprocessor import MetadataReplacementPostProcessor
    from llama_index.core.tools import FunctionTool, QueryEngineTool
    from pymilvus.model.reranker import BGERerankFunction

    # `name` identifies the document (the trace below shows "milvus_2_3" and
    # "milvus_2_4") and `vector_index` is that document's VectorStoreIndex;
    # both are defined outside this snippet.

    def vector_query(query: str, page_numbers: Optional[List[int]] = None) -> str:
        # The target key defaults to `window` to match the node_parser's default
        postproc = MetadataReplacementPostProcessor(target_metadata_key="window")
        # BAAI/bge-reranker-base is a cross-encoder model
        # link: https://huggingface.co/BAAI/bge-reranker-base
        rerank = BGERerankFunction(
            top_n=3, model_name="BAAI/bge-reranker-base", device="cpu"
        )
        # The query engine wraps the generator and drives retrieval and generation
        query_engine = vector_index.as_query_engine(
            similarity_top_k=3,
            # Hybrid mode is supported from Milvus 2.4; use "default" for earlier versions
            vector_store_query_mode="hybrid",
            node_postprocessors=[postproc, rerank],
        )
        response = query_engine.query(query)
        return response

    vector_query_tool = FunctionTool.from_defaults(
        name=f"vector_tool_{name}", fn=vector_query
    )

    summary_index = SummaryIndex(nodes)
    summary_query_engine = summary_index.as_query_engine(
        response_mode="tree_summarize",
        use_async=True,
    )
    summary_tool = QueryEngineTool.from_defaults(
        name=f"summary_tool_{name}",
        query_engine=summary_query_engine,
        description=f"Useful for summarization questions related to {name}",
    )
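The vector query tool relies on hybrid search, and the vector store in this walkthrough is configured with `hybrid_ranker="RRFRanker"` and `k=60`. As a hedged sketch of what reciprocal rank fusion does, the function below merges several ranked ID lists using the standard formula score(d) = Σ 1 / (k + rank). The 1-based ranks and the tie-breaking rule are assumptions here; Milvus's internal implementation may differ in such details.

```python
from collections import defaultdict
from typing import Dict, List


def rrf_fuse(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse ranked ID lists with reciprocal rank fusion."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) for every document it contains.
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))
```

A larger `k` flattens the difference between top and lower ranks, so no single retriever dominates the fused order.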
Finally, LlamaIndex's ReActAgentWorker combines the vector query tool and the summary tool with the ReAct capabilities of reflection, routing, and tool use to implement Agentic RAG.
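    # `agent` is the agent built from ReActAgentWorker with the tools above
    # (its construction is not shown in this snippet).
    response = agent.query(
        "Tell me about the difference between milvus 2.3 and milvus 2.4?"
    )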
Below is the agent's reasoning process for this question, showing the ReAct steps of thought, action, and observation.
Thought: The user is asking about the differences between Milvus 2.3 and Milvus 2.4. I need to gather information about both versions to provide a comprehensive comparison.
Action: vector_tool_milvus_2_3
Action Input: {'query': "What's new in Milvus 2.3?"}
Thought: It appears there is a persistent issue with the tool input. I will try using a different tool to gather the necessary information.
Action: summary_tool_milvus_2_3
Action Input: {'input': "What's new in Milvus 2.3?"}
Observation: Milvus 2.3 introduces several new features and enhancements, including GPU support, Arm64 support, a refactored QueryNode, and a merged IndexCoord and DataCoord for simplified deployment. It also includes a new NATS-based message queue, upsert API, range search, count statement, cosine metrics, raw vectors in search returns, ScaNN index support, iterator functionality, JSON_CONTAINS expressions, and CDC support. Additionally, there are performance improvements such as MMap for capacity increase, optimized data filtering, growing index, and increased resource usage in multi-core environments. The release also features a new load balancer, dynamic configuration support, tracing support, reorganized error codes, and upgraded tools like Birdwatcher and Attu.
Thought: I have gathered information about the new features and enhancements in Milvus 2.3. Now, I need to gather similar information for Milvus 2.4 to compare the two versions.
Action: summary_tool_milvus_2_4
Action Input: {'input': "What's new in Milvus 2.4?"}
Observation: Milvus 2.4 introduces several new features and improvements. Key highlights include:
...
These updates aim to enhance system performance, improve observability, and streamline the codebase for increased simplicity.
Thought: I now have detailed information about the new features and enhancements in both Milvus 2.3 and Milvus 2.4. I can compare the two versions based on these details.
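The Thought / Action / Observation trace above follows the ReAct pattern, which can be sketched as a small loop. This is an illustrative stand-in and not the ReActAgentWorker implementation: `policy` is a hypothetical callable playing the role of the LLM, and `tools` maps tool names to plain functions.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A step is (thought, action_name, action_input); action_name None means "finish".
Step = Tuple[str, Optional[str], str]


def react_loop(
    policy: Callable[[List[str]], Step],
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 5,
) -> Tuple[str, List[str]]:
    """Alternate Thought -> Action -> Observation until the policy finishes."""
    trace: List[str] = []
    for _ in range(max_steps):
        thought, action, action_input = policy(trace)
        trace.append(f"Thought: {thought}")
        if action is None:
            return action_input, trace  # action_input carries the final answer
        trace.append(f"Action: {action}")
        try:
            observation = tools[action](action_input)
        except Exception as exc:
            # Surface tool errors as observations so the policy can switch tools.
            observation = f"Error: {exc}"
        trace.append(f"Observation: {observation}")
    return "max steps reached", trace
```

Feeding errors back as observations is what lets an agent recover the way the trace shows, where a failing vector tool call is followed by a switch to the summary tool.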
The final answer is shown below. It not only lists the features of Milvus 2.3 and Milvus 2.4 in detail, but also summarizes and compares them.
Answer: Milvus 2.3 and Milvus 2.4 both introduce significant new features and improvements, but they focus on different aspects of the system. Here are the key differences:

### Milvus 2.3:
1. **GPU and Arm64 Support**: Introduces GPU support and Arm64 support.
2. **Refactored QueryNode**: Simplifies deployment by merging IndexCoord and DataCoord.
3. **New Message Queue**: Implements a NATS-based message queue.
4. **New APIs and Features**: Includes upsert API, range search, count statement, cosine metrics, raw vectors in search returns, ScaNN index support, iterator functionality, JSON_CONTAINS expressions, and CDC support.
5. **Performance Improvements**: Features MMap for capacity increase, optimized data filtering, growing index, and increased resource usage in multi-core environments.
6. **Load Balancer and Dynamic Configuration**: Introduces a new load balancer and dynamic configuration support.
7. **Tool Upgrades**: Upgrades tools like Birdwatcher and Attu.

### Milvus 2.4:
1. **New GPU Index - CAGRA**: Introduces a state-of-the-art GPU-based graph index for significant performance boosts.
2. **Multi-vector and Hybrid Search**: Supports storing vector embeddings from multiple models and conducting hybrid searches.
3. **Sparse Vectors**: Supports efficient approximate semantic nearest neighbor search for keyword interpretation and analysis.
4. **Grouping Search**: Allows aggregation of search results by specific scalar fields.
5. **Inverted Index and Fuzzy Match**: Improves scalar query performance and supports fuzzy matches.
6. **New Data Types**: Supports Float16 and BFloat16 vector data types.
7. **L0 Segment**: Enhances performance of delete and upsert operations by recording deleted data.
8. **Refactored BulkInsert**: Improves logic for importing multiple files in a single request.
9. **Memory-mapped Storage**: Optimizes memory usage by mapping file content into memory.
10. **Milvus-CDC**: Improves stability for capturing and synchronizing incremental data.

### Summary:
- **Milvus 2.3** focuses on broadening hardware support, enhancing deployment simplicity, and introducing various new APIs and performance improvements.
- **Milvus 2.4** emphasizes advanced search capabilities, new data types, improved memory management, and further performance optimizations.

Both versions aim to enhance system performance and usability, but Milvus 2.4 introduces more advanced search features and optimizations compared to Milvus 2.3.
05.
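Conclusion
The shift from Naive RAG to Agentic RAG marks an important step forward for retrieval-augmented generation. Introducing an agentic approach significantly improves a RAG system's ability to handle complex tasks, which makes it more intelligent and flexible and broadens its practical applications. We can expect more innovative applications built on Agentic RAG to drive large-model technology forward.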
Ref: https://docs.google.com/presentation/d/1IWjo8bhoatWccCfGLYw_QhUI4zfF-MujN3ORIDCBIbc/edit#slide=id.g2c00d03c26e_0_64
About the author
Milvus Ambassador (北辰使者): Zang Wei (臧偉)