Introduction to branched RAG

Mar 26, 2025 - 14:41

This article was originally published on IBM Developer by Rajeev Mishra, Diwakar Kumar, Aditi Chawla, and Sunaina Saxena.

Branched retrieval-augmented generation (branched RAG) is an advanced iterative approach to information retrieval and response generation. It works by breaking down complex queries into multiple sub-questions, enabling more targeted retrieval of information. Each sub-question is used to gather specific insights, refining the overall context. The model then synthesizes the final response by integrating insights from these branched retrievals, leading to a more accurate and comprehensive answer.

Traditional RAG pipelines often struggle with complex queries that involve multiple dimensions or layers of information. For example, the query "What is the impact of machine learning on healthcare and finance?" spans two distinct sectors (healthcare and finance), each requiring specialized information. Branched RAG handles such queries by breaking them down into simpler sub-questions, one per sector in this case, so that each retrieval is more precise and relevant.
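The decompose, retrieve, and synthesize flow described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names (decompose, retrieve, branched_rag), the three-sentence in-memory corpus, and the keyword-overlap scoring are all assumptions made for the example. In a real pipeline an LLM would generate the sub-questions and the final answer, and a vector store would handle retrieval.

```python
import re

# Toy corpus standing in for a real document store (hypothetical data).
CORPUS = [
    "Machine learning helps healthcare providers detect diseases earlier.",
    "Machine learning in finance powers fraud detection and credit scoring.",
    "Cloud computing reduces infrastructure costs.",
]


def tokenize(text):
    """Lowercase a string and return its set of word tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def decompose(query):
    """Split 'X on A and B?' into one sub-question per topic.

    A real system would ask an LLM to do this; the regex is a stand-in.
    """
    match = re.match(r"(.* on )(.+?) and (.+?)\?$", query)
    if not match:
        return [query]  # nothing to branch on
    stem, first, second = match.groups()
    return [f"{stem}{first}?", f"{stem}{second}?"]


def retrieve(sub_question, corpus, top_k=1):
    """Rank documents by keyword overlap with the sub-question."""
    q_tokens = tokenize(sub_question)
    ranked = sorted(
        corpus,
        key=lambda doc: len(q_tokens & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def branched_rag(query, corpus):
    """Run one retrieval branch per sub-question and merge the context."""
    branches = {sq: retrieve(sq, corpus) for sq in decompose(query)}
    # Synthesis placeholder: a real pipeline would pass this context to an LLM.
    context = [doc for docs in branches.values() for doc in docs]
    return branches, context
```

Calling branched_rag with the example query produces one branch for healthcare and one for finance, and each branch pulls a different document from the corpus. A single retrieval over the whole query would have to rank both topics at once.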

In this article, we’ll describe an Investment Advisor Tool that uses branched RAG to deliver personalized financial advice by deconstructing user queries into sub-questions and retrieving relevant information from financial reports, market analyses, and investment guidelines.

The following graphic represents our branched-RAG implementation flow:

[Figure: branched-RAG implementation flow]

The data collection framework draws on several financial data sources, each of which is open source or offers a free tier:

  • yfinance: An open-source Python library that provides access to Yahoo Finance data without an API key, offering granular stock and ETF information at intervals as short as one minute. Its easy setup makes it well suited to high-frequency data collection.

  • News API: Enables searching and retrieving live articles from the web based on criteria such as keywords, publication date, source, and language. Results can be sorted by date, relevancy, or popularity. An API key is required for access, and the free development tier allows up to 100 requests per day.

  • Alpha Vantage: A popular service for financial data, including stock prices, forex, and cryptocurrency information. The API is simple to use, with a free tier permitting up to 5 requests per minute and 500 requests per day.

  • Finnhub: Provides real-time stock market, cryptocurrency, and financial news data, making it a powerful tool for developers working on financial analysis and trading systems. The free tier allows up to 60 API calls per minute, subject to an additional cap of 30 API calls per second.

  • Polygon.io Stocks API: Provides REST endpoints to access the latest market data from all U.S. stock exchanges. It also offers insights into company financials, stock market holidays, corporate actions, and more. The free plan allows up to 5 requests per minute.
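Because each free tier in the list above caps the request rate, a collector that polls several of these services needs some form of client-side throttling. The following is a minimal sketch of a sliding-window rate limiter. The RateLimiter class and the LIMITERS table are hypothetical names introduced for this example and are not part of any of the listed APIs. The numbers are the per-minute limits quoted above; daily quotas are not modeled.

```python
import time
from collections import deque


class RateLimiter:
    """Sliding-window limiter: at most `max_calls` per `period` seconds."""

    def __init__(self, max_calls, period, clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock    # injectable so the limiter can be tested
        self._sleep = sleep
        self._calls = deque()  # timestamps of calls still inside the window

    def acquire(self):
        """Block until a call is allowed, then record it."""
        while True:
            now = self._clock()
            # Drop timestamps that have left the window.
            while self._calls and now - self._calls[0] >= self.period:
                self._calls.popleft()
            if len(self._calls) < self.max_calls:
                self._calls.append(now)
                return
            # Window is full: wait until the oldest call expires.
            self._sleep(self.period - (now - self._calls[0]))


# Per-minute free-tier limits as quoted in the list above.
LIMITERS = {
    "alpha_vantage": RateLimiter(max_calls=5, period=60),
    "finnhub": RateLimiter(max_calls=60, period=60),
    "polygon": RateLimiter(max_calls=5, period=60),
}
```

A collector would call LIMITERS["polygon"].acquire() immediately before each Polygon.io request, so the sixth request within a minute waits instead of being rejected by the service.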

The function recommendation_and_financials_data in the following code snippet gathers stock recommendations and financial data for the list of symbols in data["stocks"]. Using the yfinance library, it retrieves the relevant information for each symbol, structures it as JSON, and saves it to designated files in the specified directory.

The function also checks for existing data and preserves it, so that newly retrieved data is merged with previously stored information instead of overwriting it.

[Image: code snippet for recommendation_and_financials_data]
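The original snippet is not reproduced in this text, so the following is only a sketch of the merge-and-save behavior described above, not the authors' code. The names save_merged, collect, and fetch are hypothetical. The fetch argument stands in for the yfinance calls and is assumed to return a JSON-serializable dictionary for a symbol. Storing one JSON file per symbol is also an assumption.

```python
import json
from pathlib import Path


def save_merged(symbol, new_data, directory):
    """Merge `new_data` into <directory>/<symbol>.json, keeping existing keys."""
    path = Path(directory) / f"{symbol}.json"
    existing = {}
    if path.exists():
        # Preserve whatever was stored by earlier runs.
        existing = json.loads(path.read_text(encoding="utf-8"))
    merged = {**existing, **new_data}  # new values win on key collisions
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(merged, indent=2), encoding="utf-8")
    return merged


def collect(data, fetch, directory):
    """Fetch and store data for every symbol in data["stocks"].

    `fetch` is a hypothetical callable standing in for the yfinance lookups.
    """
    return {
        symbol: save_merged(symbol, fetch(symbol), directory)
        for symbol in data["stocks"]
    }
```

Running collect twice with different fetchers, for example one returning recommendations and one returning financials, leaves both sets of keys in the same file.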

Implementing branched RAG with watsonx

To create our Investment Advisor Tool, we used these components:

  • LLMs available in watsonx.ai
  • A Milvus vector store
  • LlamaIndex
  • IBM Code Engine (for deployment)

The full article is available on IBM Developer.