I have a PDF that describes different mortgage loan programs; each program has its own criteria and requirements.
I want a smart chatbot that knows every program's criteria and returns accurate values when prompted.
I used:
1. Pinecone as my vector store (dimension 1536)
   - Extracted text from a PDF reader
   - Chunk size of 14000
   - Overlap of 20
2. OpenAI embeddings
3. OpenAI model gpt-3.5-turbo-16k
4. Langchain load_qa_chain
   - Prompt template
   - ConversationWindowMemory
   - chain_type: stuff
5. Semantic search
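To make the chunk size and overlap settings concrete, here is a minimal plain-Python sketch of fixed-size splitting with overlap. It is only an illustration, not the actual splitter used in the setup above: the function name `split_text` is hypothetical, and it assumes the sizes are counted in characters rather than tokens.

```python
def split_text(text: str, chunk_size: int = 14000, overlap: int = 20) -> list[str]:
    """Split text into fixed-size chunks; consecutive chunks share `overlap` characters.

    Assumption: sizes are measured in characters, not tokens.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks
```

With these settings a chunk is large enough to hold several programs at once, and an overlap of 20 characters carries almost no context from one chunk into the next.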
The reason I used a chunk size of 14k (which might be wrong) is that otherwise the chain does not get the relevant context document for a user query. For example, when I ask: “How many loan programs are there?” the AI answers: “There are 4 programs.”
In reality there are 10 programs. Right now about 60% of the responses are correct: it usually gives the correct criteria, but sometimes it mixes in another program's criteria, which is wrong.
Yesterday I saw that we can use agents with our chains, so that an agent can call the chain when it wants to search the vector store. (But you guys can guide me on this.)
Secondly, I think the parameters I am using right now, such as the chain type, are not right for my use case, or maybe I am missing something. Maybe a different approach is needed in my scenario.
For your understanding, the PDF has 10 programs and they are separate, not interrelated. There are several conditions on the LTVs (loan-to-value ratios), which I think confuse ChatGPT, so it gives wrong values.
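Since the programs are independent, one possible direction (an assumption, not something tried above) is to split the text per program and keep the program name with each piece, so that one program's criteria are never stored in the same chunk as another's. The sketch below is plain Python; the function name `split_by_program`, the heading pattern `Program N: ...`, and the sample text with its LTV values are all hypothetical, and the real PDF layout may well differ.

```python
import re

def split_by_program(text: str, heading_pattern: str = r"^Program \d+:.*$") -> list[dict]:
    """Return one record per program: {"program": heading line, "text": body}.

    Assumption: every program starts with a heading line matching heading_pattern.
    """
    matches = list(re.finditer(heading_pattern, text, flags=re.MULTILINE))
    records = []
    for i, match in enumerate(matches):
        # A program's body runs until the next heading, or to the end of the text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        records.append({
            "program": match.group(0).strip(),
            "text": text[match.end():end].strip(),
        })
    return records

# Made-up sample text, only to show the shape of the output.
sample = """Program 1: Alpha
Max LTV 80%
Program 2: Beta
Max LTV 90%
"""
records = split_by_program(sample)
```

Each record could then be embedded on its own, with the program name stored as metadata, so that a query about one program can be restricted to that program's chunks.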
I am really in distress right now; any suggestions would be highly appreciated. Thanks
P.S. I am ready to provide more info if anyone needs it for a better understanding.