Query2doc: Query Expansion with Large Language Models, Wang et al., arXiv 2023. [Paper]
Generative and Pseudo-Relevant Feedback for Sparse, Dense and Learned Sparse Retrieval, Mackie et al., arXiv 2023. [Paper]
Generative Relevance Feedback with Large Language Models, Mackie et al., SIGIR 2023 (short paper). [Paper]
GRM: Generative Relevance Modeling Using Relevance-Aware Sample Estimation for Document Retrieval, Mackie et al., arXiv 2023. [Paper]
Large Language Models Know Your Contextual Search Intent: A Prompting Framework for Conversational Search, Mao et al., arXiv 2023. [Paper]
Precise Zero-Shot Dense Retrieval without Relevance Labels, Gao et al., ACL 2023. [Paper]
Query Expansion by Prompting Large Language Models, Jagerman et al., arXiv 2023. [Paper]
Large Language Models are Strong Zero-Shot Retriever, Shen et al., arXiv 2023. [Paper]
Enhancing Conversational Search: Large Language Model-Aided Informative Query Rewriting, Ye et al., EMNLP 2023 (Findings). [Paper]
Can Generative LLMs Create Query Variants for Test Collections? An Exploratory Study, Alaofi et al., SIGIR 2023 (short paper). [Paper]
Corpus-Steered Query Expansion with Large Language Models, Lei et al., EACL 2024 (short paper). [Paper]
Large Language Model based Long-tail Query Rewriting in Taobao Search, Peng et al., WWW 2024. [Paper]
Can Query Expansion Improve Generalization of Strong Cross-Encoder Rankers?, Li et al., SIGIR 2024. [Paper]
Query Performance Prediction using Relevance Judgments Generated by Large Language Models, Meng et al., arXiv 2024. [Paper]
RaFe: Ranking Feedback Improves Query Rewriting for RAG, Mao et al., arXiv 2024. [Paper]
Crafting the Path: Robust Query Rewriting for Information Retrieval, Baek et al., arXiv 2024. [Paper]
Query Rewriting for Retrieval-Augmented Large Language Models, Ma et al., arXiv 2023. [Paper]
1.2 Fine-tuning Methods
QUILL: Query Intent with Large Language Models using Retrieval Augmentation and Multi-stage Distillation, Srinivasan et al., EMNLP 2022 (Industry). [Paper] (This paper explores fine-tuning methods in its baseline experiments.)
1.3 Knowledge Distillation Methods
QUILL: Query Intent with Large Language Models using Retrieval Augmentation and Multi-stage Distillation, Srinivasan et al., EMNLP 2022 (Industry). [Paper]
Knowledge Refinement via Interaction Between Search Engines and Large Language Models, Feng et al., arXiv 2023. [Paper]
Query Rewriting for Retrieval-Augmented Large Language Models, Ma et al., arXiv 2023. [Paper]
2. Retriever
2.1 Leveraging LLMs to Generate Search Data
InPars: Data Augmentation for Information Retrieval using Large Language Models, Bonifacio et al., arXiv 2022. [Paper]
Pre-training with Large Language Model-based Document Expansion for Dense Passage Retrieval, Ma et al., arXiv 2023. [Paper]
InPars-v2: Large Language Models as Efficient Dataset Generators for Information Retrieval, Jeronymo et al., arXiv 2023. [Paper]
Promptagator: Few-shot Dense Retrieval From 8 Examples, Dai et al., ICLR 2023. [Paper]
AugTriever: Unsupervised Dense Retrieval by Scalable Data Augmentation, Meng et al., arXiv 2023. [Paper]
UDAPDR: Unsupervised Domain Adaptation via LLM Prompting and Distillation of Rerankers, Saad-Falcon et al., arXiv 2023. [Paper]
Soft Prompt Tuning for Augmenting Dense Retrieval with Large Language Models, Peng et al., arXiv 2023. [Paper]
CONVERSER: Few-shot Conversational Dense Retrieval with Synthetic Data Generation, Huang et al., ACL 2023. [Paper]
Leveraging LLMs for Synthesizing Training Data Across Many Languages in Multilingual Dense Retrieval, Thakur et al., arXiv 2023. [Paper]
Questions Are All You Need to Train a Dense Passage Retriever, Sachan et al., ACL 2023. [Paper]
Beyond Factuality: A Comprehensive Evaluation of Large Language Models as Knowledge Generators, Chen et al., EMNLP 2023. [Paper]
Gecko: Versatile Text Embeddings Distilled from Large Language Models, Lee et al., arXiv 2024. [Paper]
Improving Text Embeddings with Large Language Models, Wang et al., ACL 2024. [Paper]
2.2 Employing LLMs to Enhance Model Architecture
Text and Code Embeddings by Contrastive Pre-Training, Neelakantan et al., arXiv 2022. [Paper]
Fine-Tuning LLaMA for Multi-Stage Text Retrieval, Ma et al., arXiv 2023. [Paper]
Large Dual Encoders Are Generalizable Retrievers, Ni et al., EMNLP 2022. [Paper]
Task-aware Retrieval with Instructions, Asai et al., ACL 2023 (Findings). [Paper]
Transformer Memory as a Differentiable Search Index, Tay et al., NeurIPS 2022. [Paper]
Large Language Models are Built-in Autoregressive Search Engines, Ziems et al., ACL 2023 (Findings). [Paper]
ChatRetriever: Adapting Large Language Models for Generalized and Robust Conversational Dense Retrieval, Mao et al., arXiv. [Paper]
How Does Generative Retrieval Scale to Millions of Passages?, Pradeep et al., ACL 2023. [Paper]
CorpusLM: Towards a Unified Language Model on Corpus for Knowledge-Intensive Tasks, Li et al., SIGIR. [Paper]