Advanced RAG with Re-Ranking | Groq | Pinecone | Cohere | Langchain | Ollama embeddings


Hey there! Today, we’re diving into building a cool RAG (retrieval-augmented generation) app. Sometimes our app gets back chunks of text from the vector database that don’t quite match the question. No worries, though! We’ve got a plan to fix it by re-ranking our results with Cohere’s platform API.

Here’s our game plan:

  1. Starting Point: We’ll start by grabbing info from PDFs, and we’ll use Langchain to help us split those docs apart.
  2. Getting Fancy with Embeddings: Then, we’ll use Ollama’s free embeddings to turn our text chunks into vectors.
  3. Database Magic: Pinecone’s Vector Database will help us keep track of everything smoothly.
  4. Re-Ranking: With Cohere’s API, we’ll tweak our results to make them better.
  5. Accessing LLMs for Free: We’re also using Groq API to access LLMs (large language models) for free. That’s right, no cost!
  6. Testing, Testing: Finally, we’ll test everything with some question answering to make sure it’s all working like a charm.

So, let’s get started and make our RAG app top-notch!
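Before wiring up the real services, here’s a toy sketch (plain Python, no APIs) of the retrieve-then-rerank pattern the steps above describe: a fast vector search narrows things down to a few candidates, and then a second, stronger scorer reorders that short list. The tiny corpus and the keyword scorer below are just stand-ins for Pinecone and Cohere.

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def retrieve(query_vec, corpus, top_k=3):
    # Stage 1: rank all chunks by vector similarity (Pinecone's job at scale).
    ranked = sorted(corpus, key=lambda item: cosine(query_vec, item["vec"]), reverse=True)
    return ranked[:top_k]

def rerank(query_text, candidates, scorer):
    # Stage 2: reorder the short candidate list with a stronger, slower scorer
    # (the role Cohere's rerank endpoint plays in the real app).
    return sorted(candidates, key=lambda item: scorer(query_text, item["text"]), reverse=True)

corpus = [
    {"text": "chunk about pandas",   "vec": [1.0, 0.0]},
    {"text": "chunk about pinecone", "vec": [0.9, 0.1]},
    {"text": "chunk about cooking",  "vec": [0.0, 1.0]},
]
hits = retrieve([1.0, 0.05], corpus, top_k=2)
best = rerank("pinecone", hits, scorer=lambda q, t: q in t)
print(best[0]["text"])  # → chunk about pinecone
```

Notice the vector search alone ranks the “pandas” chunk first; the re-ranker fixes that. That’s exactly the failure mode we’re solving for.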

I am using a Jupyter notebook for this; you can use any editor or notebook. I have a demo.pdf file in the same folder where I created my notebook.

Let’s first grab all the required API keys.

Pinecone

Go to https://app.pinecone.io and create an API key.

After creating the API key, store it in an env.py file, from where you will load it. Then create an index in Pinecone.
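As a sketch, env.py can simply hold the keys as module-level strings (the variable names below are my own choice, not required by any library — and keep this file out of version control):

```python
# env.py — simple local secrets file; add it to .gitignore.
PINECONE_API_KEY = "your-pinecone-api-key"
GROQ_API_KEY = "your-groq-api-key"
COHERE_API_KEY = "your-cohere-api-key"
```

In the notebook you can then do `from env import PINECONE_API_KEY` and so on.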

Add an index name, set the dimensions to 768 (matching our embedding model), keep the default serverless capacity mode (which is in public preview), and create the index with the other default options.

Groq

For the Groq API key, go to GroqCloud (https://console.groq.com/), create an account, create an API key, and store it in the env.py file.

Cohere

For Cohere, go to https://dashboard.cohere.com/, create an account, create an API key, and store it in the env.py file.


Ollama

For Ollama embeddings, we have to download Ollama on our system and pull an embedding model. Refer to https://medium.com/@gabrielrodewald/running-models-with-ollama-step-by-step-60b6f6125807 for this setup.

Now below is the code…

YouTube Tutorial Video
