Awesome List Updates on Dec 11, 2023
2 awesome lists updated today.
1. Awesome Azure OpenAI LLM
What is RAG (Retrieval-Augmented Generation)?
In a 2020 paper, Meta (then Facebook) introduced retrieval-augmented generation, a framework that gives LLMs access to information beyond their training data. Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks: [cnt] [22 May 2020]
- RAG-sequence: We retrieve k documents and use them to generate all the output tokens that answer a user query.
- RAG-token: We retrieve k documents, use them to generate the next token, then retrieve k more documents, use them to generate the next token, and so on. This means that we could end up retrieving several different sets of documents while generating a single answer to a user’s query.
- Of the two approaches proposed in the paper, the RAG-sequence implementation is used almost exclusively in industry. It is cheaper and simpler to run than the alternative, and it produces great results. cite [30 Sep 2023]
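To make the distinction concrete, here is a minimal, self-contained sketch. `retrieve`, `generate_answer`, and `generate_next_token` are hypothetical stand-ins for a real vector store and LLM, not part of the paper: RAG-sequence retrieves once per query, while RAG-token re-retrieves before every generated token.

```python
# Toy illustration of RAG-sequence vs RAG-token; the retriever and "LLM" are stubs.
from typing import List

CORPUS = [
    "Azure AI Search supports vector search and a semantic ranker.",
    "RAG combines an information retrieval component with a text generator.",
    "LoRA fine-tunes a small number of adapter parameters.",
]

def retrieve(query: str, k: int = 2) -> List[str]:
    """Toy keyword retriever: rank documents by word overlap with the query."""
    words = set(query.lower().split())
    return sorted(CORPUS, key=lambda d: -len(words & set(d.lower().split())))[:k]

def generate_answer(query: str, docs: List[str]) -> str:
    """Placeholder for one LLM call that conditions on all retrieved docs."""
    return f"(answer to {query!r} grounded in {len(docs)} documents)"

def generate_next_token(query: str, docs: List[str], prefix: List[str]) -> str:
    """Placeholder for a single-token LLM step."""
    return f"tok{len(prefix)}"

def rag_sequence(query: str, k: int = 2) -> str:
    # Retrieve once; the same k documents condition every output token.
    docs = retrieve(query, k)
    return generate_answer(query, docs)

def rag_token(query: str, k: int = 2, max_tokens: int = 5) -> str:
    # Re-retrieve before each token, so different tokens can draw on different documents.
    prefix: List[str] = []
    for _ in range(max_tokens):
        docs = retrieve(query + " " + " ".join(prefix), k)
        prefix.append(generate_next_token(query, docs, prefix))
    return " ".join(prefix)

print(rag_sequence("what does RAG combine?"))
print(rag_token("what does RAG combine?"))
```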
LlamaIndex
- LlamaIndex Overview (Japanese) [17 Jul 2023]
- LlamaIndex Tutorial: A Complete LlamaIndex Guide [18 Oct 2023]
- Multimodal RAG Pipeline ref [Nov 2023]
Vector Database Comparison
- Not All Vector Databases Are Made Equal: a printed copy is provided to work around Medium's access limits. doc [2 Oct 2021]
Vector Database Comparison / Vector Database Options for Azure
- Pgvector extension on Azure Cosmos DB for PostgreSQL: ref [13 Jun 2023]
- Vector Search in Azure Cosmos DB for MongoDB vCore [23 May 2023]
- Azure Cache for Redis Enterprise: Enterprise Redis Vector Search Demo [22 May 2023]
Vector Database Comparison / Lucene based search engine with OpenAI Embedding
- Vector Search with OpenAI Embeddings: Lucene Is All You Need: Our experiments were based on Lucene 9.5.0, but indexing was a bit tricky because the HNSW implementation in Lucene restricts vectors to 1024 dimensions, which was not sufficient for OpenAI’s 1536-dimensional embeddings. Although the fix for this issue, making vector dimensions configurable on a per-codec basis, has been merged into the Lucene source trunk git (⭐2.5k), it has not yet been folded into a Lucene release as of early August 2023. [29 Aug 2023]
Microsoft Azure OpenAI relevant LLM Framework / Lucene based search engine with OpenAI Embedding
- Kernel Memory (⭐1.4k): Kernel Memory (formerly Semantic Memory (SM)) is an open-source service and plugin specialized in the efficient indexing of datasets through custom continuous data hybrid pipelines. [Jul 2023]
- FLAML (⭐3.8k): A lightweight Python library for efficient automation of machine learning and AI operations. FLAML provides a seamless interface for AutoGen, AutoML, and generic hyperparameter tuning. [Dec 2020]
- Memory in Semantic Kernel vs Kernel Memory (formerly Semantic Memory (SM)): Kernel Memory is designed to efficiently handle large datasets and extended conversations. Deploying the memory pipeline as a separate service can be beneficial when dealing with large documents or long bot conversations. ref (⭐2k)
Azure Reference Architectures / Azure AI Search
- Azure Cognitive Search has been rebranded as Azure AI Search; it supports vector search and a semantic ranker. [16 Nov 2023]
Azure Enterprise Services / Azure AI Search
- Azure OpenAI Service On Your Data in Public Preview ref [19 Jun 2023]
- Azure OpenAI Finetuning: Babbage-002 is $34/hour, Davinci-002 is $68/hour, and Turbo is $102/hour. ref [16 Oct 2023]
- Customer Copyright Commitment: protects customers from certain IP claims related to AI-generated content. ref [16 Nov 2023]
Semantic Kernel / Semantic Kernel Planner
- Stepwise Planner released. The Stepwise Planner features the "CreateScratchPad" function, acting as a 'Scratch Pad' to aggregate goal-oriented steps. [16 Aug 2023]
Prompt Engineering / Prompt Template Language
- ReAct: [cnt]: (Reasoning and Acting) combines reasoning traces with actions, grounding the model in external sources. ref [6 Oct 2022]
- Zero-shot
- Large Language Models are Zero-Shot Reasoners: [cnt]: Let’s think step by step. [24 May 2022]
- Few-shot Learning
- OpenAI: Language Models are Few-Shot Learners: [cnt] [28 May 2020]
- Retrieval Augmented Generation (RAG): [cnt]: To address knowledge-intensive tasks, RAG combines an information retrieval component with a text generator model. [22 May 2020]
- Chain-of-Verification reduces Hallucination in LLMs: [cnt]: A four-step process that consists of generating a baseline response, planning verification questions, executing verification questions, and generating a final verified response based on the verification results (a minimal sketch follows this list). [20 Sep 2023]
- Reflexion: [cnt]: Language Agents with Verbal Reinforcement Learning. 1. Reflexion uses verbal reinforcement to help agents learn from prior failings. 2. Reflexion converts binary or scalar feedback from the environment into verbal feedback in the form of a textual summary, which is then added as additional context for the LLM agent in the next episode. 3. It is lightweight and doesn’t require finetuning the LLM. [20 Mar 2023] / git (⭐2.2k)
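A minimal sketch of two of the techniques above, zero-shot chain-of-thought and Chain-of-Verification; `llm()` is a hypothetical stand-in for any chat-completion call, not a real API:

```python
def llm(prompt: str) -> str:
    """Hypothetical stand-in for a chat-completion call."""
    return f"<model output for: {prompt.splitlines()[0][:48]}...>"

# Zero-shot CoT: appending "Let's think step by step." elicits intermediate reasoning.
def zero_shot_cot(question: str) -> str:
    return llm(f"{question}\nLet's think step by step.")

# Chain-of-Verification (CoVe): the four-step process described above.
def chain_of_verification(question: str) -> str:
    baseline = llm(question)  # 1. generate a baseline response
    plan = llm(
        f"Question: {question}\nDraft answer: {baseline}\n"
        "List verification questions that would check this draft for factual errors."
    )  # 2. plan verification questions
    answers = llm(f"Answer each verification question independently:\n{plan}")  # 3. execute them
    return llm(
        f"Question: {question}\nDraft answer: {baseline}\n"
        f"Verification Q&A:\n{answers}\n"
        "Write a final answer corrected against the verification results."
    )  # 4. generate the final verified response

print(zero_shot_cot("A train covers 60 km in 45 minutes. What is its speed in km/h?"))
print(chain_of_verification("Name three politicians who were born in Boston."))
```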
LangChain Agent & Memory / Criticism to LangChain
- What’s your biggest complaint about langchain?: ref [May 2023]
LangChain vs Competitors / LangChain vs LlamaIndex
- Basically LlamaIndex is a smart storage mechanism, while LangChain is a tool to bring multiple tools together. cite [14 Apr 2023]
LangChain vs Competitors / LangChain vs Semantic Kernel vs Azure Machine Learning Prompt flow
What's the difference between LangChain and Semantic Kernel?
LangChain has many agents, tools, plugins etc. out of the box. Moreover, LangChain has roughly 10x the popularity, and therefore about 10x the developer activity to improve it. On the other hand, Semantic Kernel's architecture and code quality are better, which is quite promising for Semantic Kernel. ref (⭐21k) [11 May 2023]
- Using Prompt flow with Semantic Kernel: ref [07 Sep 2023]
Finetuning / PEFT: Parameter-Efficient Fine-Tuning (Youtube) [24 Apr 2023]
- PEFT: Parameter-Efficient Fine-Tuning. PEFT is an approach that fine-tunes only a small number of parameters (see the sketch after this list). [10 Feb 2023]
- Scaling Down to Scale Up: A Guide to Parameter-Efficient Fine-Tuning: [cnt] [28 Mar 2023]
- QLoRA: Efficient Finetuning of Quantized LLMs: [cnt]: Backpropagates gradients through a 4-bit quantized pre-trained language model into Low-Rank Adapters (LoRA). git (⭐9.8k) [23 May 2023]
- LIMA: Less Is More for Alignment: [cnt]: fine-tuned with the standard supervised loss on only 1,000 carefully curated prompts and responses, without any reinforcement learning or human preference modeling. LIMA demonstrates remarkably strong performance, either equivalent or strictly preferred to GPT-4 in 43% of cases. [18 May 2023]
- LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models: [cnt]: A combination of sparse local attention and LoRA git (⭐2.6k) [21 Sep 2023]
- Key takeaways from LongLoRA: 1. LoRA alone is not sufficient for long-context extension. 2. Although dense global attention is needed during inference, fine-tuning can be done with sparse local attention, i.e., shift short attention (S2-Attn). 3. S2-Attn can be implemented with only two lines of code in training.
- QA-LoRA: [cnt]: Quantization-Aware Low-Rank Adaptation of Large Language Models. A method that integrates quantization and low-rank adaptation for large language models. git (⭐107) [26 Sep 2023]
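As a rough illustration of the PEFT/LoRA entries above, here is a minimal sketch using the Hugging Face transformers and peft libraries; the base model ("gpt2") and the hyperparameters are illustrative assumptions, not the settings used in the papers:

```python
# Minimal LoRA setup with Hugging Face transformers + peft.
# "gpt2" and the hyperparameters below are illustrative assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, TaskType, get_peft_model

base = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                        # rank of the low-rank update matrices
    lora_alpha=16,              # scaling applied to the update
    lora_dropout=0.05,
    target_modules=["c_attn"],  # GPT-2's fused attention projection
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the adapter weights require gradients

# QLoRA follows the same pattern, but the base model is first loaded 4-bit quantized
# (e.g. via bitsandbytes) before the LoRA adapters are attached.
```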
Finetuning / Llama Finetuning
- Multi-query attention (MQA): [cnt] [22 May 2023]
- Comprehensive Guide for LLaMA with RLHF: StackLLaMA: A hands-on guide to train LLaMA with RLHF [5 Apr 2023]
RLHF (Reinforcement Learning from Human Feedback) & SFT (Supervised Fine-Tuning) / Llama Finetuning
- Libraries: TRL, trlX (⭐4.4k), Argilla
TRL covers the pipeline from the Supervised Fine-tuning (SFT) step and the Reward Modeling (RM) step to the Proximal Policy Optimization (PPO) step (see the sketch after this list).
The three steps in the process: 1. pre-training on large web-scale data, 2. supervised fine-tuning on instruction data (instruction tuning), and 3. RLHF. ref [ⓒ 2023]
- Reinforcement Learning from AI Feedback (RLAIF): [cnt]: Uses AI feedback to generate instructions for the model. TL;DR: CoT (Chain-of-Thought, improved), Few-shot (not improved). Only explores the task of summarization. After training on a few thousand examples, performance is close to training on the full dataset. RLAIF vs RLHF: In many cases, the two policies produced similar summaries. [1 Sep 2023]
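A condensed sketch of the SFT → RM → PPO flow referenced above, loosely following TRL's quickstart pattern of that period; the model name and the hard-coded reward are placeholders (a real run first trains a reward model on preference data, e.g. with TRL's RewardTrainer), and exact arguments may differ between TRL versions:

```python
# One RLHF/PPO step with TRL: the policy carries a value head, a frozen copy acts as the KL reference.
# "gpt2" stands in for an SFT-tuned checkpoint; the reward is a dummy scalar instead of a
# score from a trained reward model.
import torch
from transformers import AutoTokenizer
from trl import PPOConfig, PPOTrainer, AutoModelForCausalLMWithValueHead

model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForCausalLMWithValueHead.from_pretrained(model_name)
ref_model = AutoModelForCausalLMWithValueHead.from_pretrained(model_name)
ppo_trainer = PPOTrainer(PPOConfig(batch_size=1, mini_batch_size=1), model, ref_model, tokenizer)

query = tokenizer("Explain RLHF in one sentence.", return_tensors="pt").input_ids[0]
response = ppo_trainer.generate(
    [query], return_prompt=False, max_new_tokens=24, pad_token_id=tokenizer.eos_token_id
)[0]

reward = [torch.tensor(1.0)]                            # placeholder for a reward-model score
stats = ppo_trainer.step([query], [response], reward)   # one PPO update, KL-penalized against ref_model
```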
Model Compression for Large Language Models / Llama Finetuning
- A Survey on Model Compression for Large Language Models ref [15 Aug 2023]
Pruning and Sparsification / Llama Finetuning
- Wanda Pruning: [cnt]: A Simple and Effective Pruning Approach for Large Language Models [20 Jun 2023] ref
Knowledge Distillation: Reducing Model Size with Textbooks / Llama Finetuning
- Orca 2: [cnt]: Orca learns from rich signals from GPT-4, including explanation traces, step-by-step thought processes, and other complex instructions, guided by teacher assistance from ChatGPT. ref [18 Nov 2023]
- Distilled Supervised Fine-Tuning (dSFT)
- Zephyr 7B: [cnt] Zephyr-7B-β is the second model in the series, a fine-tuned version of mistralai/Mistral-7B-v0.1 that was trained on a mix of publicly available, synthetic datasets using Direct Preference Optimization (DPO). ref [25 Oct 2023]
- Mistral 7B: [cnt]: Outperforms Llama 2 13B on all benchmarks. Uses Grouped-query attention (GQA) for faster inference. Uses Sliding Window Attention (SWA) to handle longer sequences at a smaller cost. ref [10 Oct 2023]
Other techniques and LLM patterns / Llama Finetuning
- Large Transformer Model Inference Optimization: Besides the increasing size of SoTA models, there are two main factors contributing to the inference challenge ... [10 Jan 2023]
3. Visual Prompting & Visual Grounding / Llama Finetuning
- Visual Prompting [21 Nov 2022]
- Andrew Ng’s Visual Prompting Livestream [24 Apr 2023]
OpenAI's Roadmap and Products / OpenAI's plans according to Sam Altman
- OpenAI’s CEO Says the Age of Giant AI Models Is Already Over ref [17 Apr 2023]
- Q* (pronounced Q-Star): The model, called Q*, was able to solve basic maths problems it had not seen before, according to the tech news site The Information. ref [23 Nov 2023]
OpenAI's Roadmap and Products / GPT-4 details leaked unverified
- The Dawn of LMMs: [cnt]: Preliminary Explorations with GPT-4V(ision) [29 Sep 2023]
- GPT-4 details leaked
- GPT-4 is a language model with approximately 1.8 trillion parameters across 120 layers, 10x larger than GPT-3. It uses a Mixture of Experts (MoE) model with 16 experts, each having about 111 billion parameters. Utilizing MoE allows for more efficient use of resources during inference, needing only about 280 billion parameters and 560 TFLOPs, compared to the 1.8 trillion parameters and 3,700 TFLOPs required for a purely dense model.
- The model is trained on approximately 13 trillion tokens from various sources, including internet data, books, and research papers. To reduce training costs, OpenAI employs tensor and pipeline parallelism, and a large batch size of 60 million. The estimated training cost for GPT-4 is around $63 million. ref [Jul 2023]
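A quick back-of-the-envelope check of the (unverified) figures quoted above:

```python
# Sanity check of the leaked, unverified GPT-4 figures quoted above.
experts = 16
params_per_expert = 111e9                                  # ~111B parameters per expert
print(f"expert parameters ≈ {experts * params_per_expert / 1e12:.2f}T")   # ≈ 1.78T, the quoted ~1.8T

# Assuming top-2 routing (an assumption, not stated in the leak), only two experts are
# touched per token; adding the shared attention/embedding weights lands near the quoted ~280B.
active_experts = 2
print(f"active expert parameters ≈ {active_experts * params_per_expert / 1e9:.0f}B")
```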
OpenAI's Roadmap and Products / OpenAI Products
- OpenAI DevDay 2023: GPT-4 Turbo with 128K context, Assistants API (Code Interpreter, Retrieval, and function calling), GPTs (custom versions of ChatGPT: ref), Copyright Shield, Parallel Function Calling, JSON Mode, Reproducible outputs (see the API sketch after this list) [6 Nov 2023]
- ChatGPT can now see, hear, and speak: It has recently been updated to support multimodal capabilities, including voice and image. [25 Sep 2023] Whisper (⭐67k) / CLIP (⭐24k)
- GPT-3.5 Turbo Fine-tuning: Fine-tuning for GPT-3.5 Turbo is now available, with fine-tuning for GPT-4 coming this fall. [22 Aug 2023]
- DALL·E 3: In September 2023, OpenAI announced their latest image model, DALL·E 3. git (⭐11k) [Sep 2023]
- ChatGPT Enterprise: Removes GPT-4 usage caps and performs up to two times faster. ref [28 Aug 2023]
- Custom instructions: In a nutshell, the Custom Instructions feature is a cross-session memory that allows ChatGPT to retain key instructions across chat sessions. [20 Jul 2023]
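A minimal sketch of two DevDay features from the first item above, JSON mode and seed-based reproducible outputs, using the OpenAI Python SDK (v1 style); the model name and prompt are assumptions, and an `OPENAI_API_KEY` must be set:

```python
# JSON mode + reproducible outputs with the OpenAI Python SDK (v1).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

completion = client.chat.completions.create(
    model="gpt-4-1106-preview",                  # GPT-4 Turbo with 128K context
    seed=42,                                     # best-effort reproducible outputs
    response_format={"type": "json_object"},     # JSON mode: the reply is valid JSON
    messages=[
        {"role": "system", "content": "Reply with a JSON object containing keys 'feature' and 'date'."},
        {"role": "user", "content": "Which event introduced the Assistants API?"},
    ],
)
print(completion.choices[0].message.content)
print(completion.system_fingerprint)             # changes when the backend configuration changes
```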
Trustworthy, Safe and Secure LLM / GPT series release date
- The Foundation Model Transparency Index: [cnt]: A comprehensive assessment of the transparency of foundation model developers ref [19 Oct 2023]
- Hallucinations: [cnt]: A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions [9 Nov 2023]
Large Language Model Is: Abilities / GPT series release date
- Emergent Abilities of Large Language Models: [cnt]: Large language models can develop emergent abilities, which are not explicitly trained but appear at scale and are not present in smaller models. These abilities can be enhanced using few-shot and augmented prompting techniques. ref [15 Jun 2022]
- Multitask Prompted Training Enables Zero-Shot Task Generalization: [cnt]: A language model trained on various tasks using prompts can learn and generalize to new tasks in a zero-shot manner. [15 Oct 2021]
- Language Modeling Is Compression: [cnt]: Lossless data compression, while trained primarily on text, compresses ImageNet patches to 43.4% and LibriSpeech samples to 16.4% of their raw size, beating domain-specific compressors like PNG (58.5%) or FLAC (30.3%). [19 Sep 2023]
- LLMs Represent Space and Time: [cnt]: Large language models learn world models of space and time from text-only training. [3 Oct 2023]
- Large Language Models for Software Engineering: [cnt]: Survey and Open Problems, Large Language Models (LLMs) for Software Engineering (SE) applications, such as code generation, testing, repair, and documentation. [5 Oct 2023]
- LLMs for Chip Design: Domain-Adapted LLMs for Chip Design [31 Oct 2023]
Large Language Models (in 2023) / GPT series release date
Evolutionary Tree of Large Language Models / GPT series release date
- A Survey of Large Language Models: [cnt] / git (⭐9.8k) [31 Mar 2023], continually updated
- LLM evolutionary tree: [cnt]: A curated list of practical guide resources of LLMs (LLMs Tree, Examples, Papers) git (⭐9.1k) [26 Apr 2023]
LLM Materials for East Asian Languages / Japanese
- LLM Research Projects: an index of blog articles (Japanese) [27 Jul 2023]
- rinna: rinna's 3.6-billion-parameter Japanese GPT language model [17 May 2023]
- rinna: bilingual-gpt-neox-4b: a Japanese-English bilingual large language model [17 May 2023]
Learning and Supplementary Materials / Korean
- Attention Is All You Need: [cnt]: 🏆 The Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely (the core attention formula is shown after this list). [12 Jun 2017] Illustrated transformer
- Must read: the 100 most cited AI papers in 2022: doc [8 Mar 2023]
- The Best Machine Learning Resources: doc [20 Aug 2017]
- What are the most influential current AI Papers?: NLLG Quarterly arXiv Report 06/23 git (⭐8) [31 Jul 2023]
- Comparing Adobe Firefly, Dalle-2, OpenJourney, Stable Diffusion, and Midjourney: Generative AI for images [20 Jun 2023]
- Open Problem and Limitation of RLHF: [cnt]: Provides an overview of open problems and the limitations of RLHF [27 Jul 2023]
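For reference, the scaled dot-product attention at the heart of the Transformer cited above, where Q, K, and V are the query, key, and value matrices and d_k is the key dimension:

```latex
\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right) V
```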
Section 11: Datasets for LLM Training / OSS Alternatives for OpenAI Code Interpreter (aka. Advanced Data Analytics)
- LLM-generated datasets:
- Self-Instruct: [cnt]: Starts from a seed task pool of human-written instructions and uses the model itself to bootstrap new instruction data (see the sketch after this list). [20 Dec 2022]
- Self-Alignment with Instruction Backtranslation: [cnt]: Without human seeding, use LLM to produce instruction-response pairs. The process involves two steps: self-augmentation and self-curation. [11 Aug 2023]
- RedPajama: LLaMA training dataset of over 1.2 trillion tokens git (⭐4.5k) [17 Apr 2023]
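A minimal sketch of the Self-Instruct bootstrapping loop mentioned above; `llm()` is a hypothetical stand-in for a real model call, and a naive string-similarity check replaces the paper's ROUGE-L filter:

```python
# Self-Instruct style bootstrapping: grow an instruction pool from human-written seeds.
import difflib
import random

def llm(prompt: str) -> str:
    """Hypothetical LLM call; a real run would query an actual model."""
    return "Summarize the following article in three bullet points."

seed_pool = [
    "Translate the sentence into French.",
    "Write a haiku about autumn.",
    "Classify the sentiment of this product review.",
]

def is_novel(candidate: str, pool: list, threshold: float = 0.7) -> bool:
    """Keep only instructions that are not near-duplicates of the existing pool."""
    return all(difflib.SequenceMatcher(None, candidate, p).ratio() < threshold for p in pool)

pool = list(seed_pool)
for _ in range(10):
    # Sample a few in-context examples and ask the model for a new, different instruction.
    examples = "\n".join(f"- {p}" for p in random.sample(pool, k=min(3, len(pool))))
    candidate = llm(f"Here are some task instructions:\n{examples}\nWrite one new, different instruction.")
    if is_novel(candidate, pool):   # filter near-duplicates before adding back to the pool
        pool.append(candidate)

print(f"{len(pool)} instructions after bootstrapping")
```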
Challenges in evaluating AI systems / Math
- Pretraining on the Test Set Is All You Need: [cnt]
- On that note, in the satirical Pretraining on the Test Set Is All You Need paper, the author trains a small 1M parameter LLM that outperforms all other models, including the 1.3B phi-1.5 model. This is achieved by training the model on all downstream academic benchmarks. It appears to be a subtle criticism underlining how easily benchmarks can be "cheated" intentionally or unintentionally (due to data contamination). cite [13 Sep 2023]
2. Awesome CakePHP
Templating
- 🍰 Templating (⭐1) - HTML snippets as value objects, (Font) icons, and templating topics.