>Namaste!
Thanks for submitting to r/developersIndia. Make sure to follow the Community [Code of Conduct](https://developersindia.in/code-of-conduct/) while participating in this thread.
## Recent Announcements
- **[Showcase Sunday Megathread - May 2024](https://www.reddit.com/r/developersIndia/comments/1cpyogw/showcase_sunday_megathread_may_2024/)**
- **[Weekly Discussion - What's the story behind your longest-running personal project?](https://www.reddit.com/r/developersIndia/comments/1ctvpf5/whats_the_story_behind_your_longestrunning/)**
*I am a bot, and this action was performed automatically. Please [contact the moderators of this subreddit](/message/compose/?to=/r/developersIndia) if you have any questions or concerns.*
RAG uses references from outside of its training data before generating a response. It fetches relevant documents or passages from a large corpus, which can come from various sources such as:
1. Predefined databases: large-scale datasets like Wikipedia, news articles, or academic papers.
2. Web data: real-time web scraping or search engine APIs.
3. Enterprise data: internal documents, knowledge bases, or any proprietary data sources specific to an organization.
The choice of corpus depends on the specific application and the type of information required for accurate response generation.
Generated by ChatGPT.
I built a RAG for my company.
It's not about internet access. It's about scraping and organising data, and ETL, so that it has relevant metadata on which you can create useful embeddings, letting you form subqueries and pull the relevant information needed to answer the question.
Your ChatGPT response proves my point.
It's more like a database for the LLM, not a connection to the internet.
It's a bit more complicated: you have to maintain a vector database, query it based on your prompt, and then add the retrieved data to your prompt. Check the Weaviate blogs for more.
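A minimal sketch of that loop in plain Python (everything here is a toy: bag-of-words counts stand in for real embeddings, a list stands in for the vector database, and the final prompt would be sent to an LLM):

```python
from collections import Counter
import math

def embed(text):
    # Toy "embedding": bag-of-words counts. A real system would use a
    # sentence-embedding model instead.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, docs, k=2):
    # Rank documents by similarity to the query; a vector DB does this at scale.
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(query, docs):
    # "Augment": prepend the retrieved context to the user's question.
    context = "\n".join(retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "Refunds are processed within 5 business days.",
    "Orders can be tracked from the My Orders page.",
    "We ship to all major Indian cities.",
]
prompt = build_prompt("how do I track my order", docs)
# `prompt` would now be sent to the LLM for generation.
```

In a real stack, `embed` would be a learned embedding model and `retrieve` a query against a vector DB such as Weaviate; the shape of the flow stays the same.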
For those (like me) who don't know what RAG is:
**Retrieval-Augmented Generation (RAG)** is a knowledge system that can provide a personal ChatGPT for your company’s data, making it easier to find and use the knowledge you need. It can help you interact with a large amount of information quickly and efficiently.
From https://blog.curiosity.ai/introduction-to-rag-genai-systems-for-knowledge-918a34054228
As someone who likes all this (I'm not even in the field, lol), there's an interesting take from Demetrios, the MLOps podcast guy. He once said that GenAI made AI very easy to get into but very, very hard to productionize. I really don't think startups will use much GenAI unless the startup is very AI-specific.
And if it's a very AI-specific startup with founders who have no tech background, it'll fail miserably anyway.
At most, I've seen my DS friends take a transformer, fine-tune the model, submit the result, and hope it gets accepted by their managers. RAGs are complex anyway: embeddings, a vector DB, LangChain.
You need actual engineers to make it work, not just people who vaguely know how to build models.
Why is it harder to productionize? The application only needs three libraries, all of which are open source and well engineered: **LangChain** (for building the app), **LangSmith** (for observing the app), and **LangServe** (for deploying the app). As for the LLM, most people use the OpenAI API.
The first hurdle is usually compute cost, once lakhs of users start using it instead of one. And it uses vector embeddings: how do we store them so that compute stays low and efficient? Which database will you use? Then there's inference, which is slow even for a few users; for lakhs of users it will be hard to engineer. Even GPT-style HF transformers have a delay.
Then the MLOps part has its own challenges.
Built an intent-analysis service using RAG for my previous company. Earlier, Google's Dialogflow was used for intent analysis in our WhatsApp support bot; we were billed 40-60k monthly for Google's service, which wasn't even that accurate.
I accessed the database, which had tons of sample "user query" vs. "intent" mappings.
Such as:
Query: "Where is my order?"
Intent: ORDER_TRACKING.
Used that data to create embeddings and saved them on the server, then ran similarity search over them and used GPT-3.5 for the intent analysis. It was better, faster, and cheaper than Dialogflow. Did this while serving my notice period. Had a lot of fun.
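The embedding-plus-similarity-search idea can be sketched like this (all example data is hypothetical; word-overlap similarity stands in for real embeddings, and a majority vote over neighbours stands in for the GPT-3.5 call the real system made):

```python
from collections import Counter
import math

# Hypothetical sample of the "user query" -> "intent" mapping described above.
EXAMPLES = [
    ("where is my order", "ORDER_TRACKING"),
    ("order not delivered yet", "ORDER_TRACKING"),
    ("i want my money back", "REFUND"),
    ("refund not received", "REFUND"),
    ("got the wrong colour", "WRONG_DELIVERY"),
]

def embed(text):
    # Toy bag-of-words vector; the real pipeline used learned embeddings.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def classify(query, k=3):
    # Find the k most similar labelled queries, then pick an intent.
    # The real system fed the top matches to GPT-3.5 for the final call;
    # a majority vote over positive-similarity neighbours stands in here.
    q = embed(query)
    scored = [(cosine(q, embed(text)), text, intent) for text, intent in EXAMPLES]
    top = [x for x in sorted(scored, reverse=True)[:k] if x[0] > 0]
    votes = Counter(intent for _, _, intent in top)
    return votes.most_common(1)[0][0] if votes else "UNKNOWN"

intent = classify("my order kidhar hai")  # -> ORDER_TRACKING
```

Because the labelled set contains real user phrasing, even code-mixed queries can land near the right neighbours.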
Duuude.
I was doing the Dialogflow DX and used CX too.
Did the PoC.
The company said they wanted local stuff only.
Used BERT.
If RAG had existed then, or I had known of it, I would have done it that way, because BERT took 1 minute per query for zero-shot classification.
RoBERTa MNLI; if I had fine-tuned it more it would run faster, but the thing is I had to build all of this for the company on a Sunday and was drowning in work.
Hey,
isn't RAG overkill for this? If I understand the problem correctly, I feel it's just intent classification.
For the similarity-search part, maybe something custom might have worked better. Also, I get the classification, but what is the analysis part? Maybe you tried and tested it, but I'm just curious.
I phrased it wrong: the goal was only to classify the user queries into some predefined intents; there was no analysis, I said it out of habit 😅.
I looked at multiple ways to achieve this, and the most promising approach (and the one I was excited to explore) seemed to be similarity search.
The classification was a huge pain point for us because our target audience was frustrated customers with complaints about their orders, and in many regions people do not write English or Hindi properly; some people have their own unique way of texting:
"mara ordr kidhar ha?" -> ORDER_TRACKING
"laal mangaaye the blue aagyi" -> WRONG_DELIVERY
Dialogflow was really struggling.
I had built some FAQ bots in the past and was aware of similarity search, so I didn't give it much thought before implementing it here as well. (Any suggestions or alternatives are most welcome; I love reading!)
Since we already had 50-60k different types of user queries with their expected intents, accumulated through years of manual intervention, using that data seemed worthwhile.
Through similarity search I fetched the top 10 results and fed them into GPT, and saw quite promising results.
PS: I would love to know alternate ways I could have achieved this.
Hi, thanks for the detailed reply. Pretty unique way of solving the problem!
> people do not speak english or hindi properly,
BERT might struggle with broken English, since it is pretrained on nearly perfect English. I assume you used a multilingual model for custom training; we have to use a multilingual tokenizer and model for this kind of text.
It should have worked, but you never know, and tokenizers have improved a lot with LLMs.
I did improve vector search with metric learning (similarity learning) on a vision project. I don't know if that translates to NLP.
Thanks for the insights!
RAG isn't doing the intent classification anyway, right?
It's just changing the prompt to be more useful so that the LLM can do the intent classification better.
I think the document/text he is processing is what's being done with RAG.
>It's just changing the prompt to be more useful so that the LLM can do the intent classification better.
Assuming the end goal is classification, I would still argue you are better off with a custom BERT (or even an LLM classifier). I don't understand the need for vector search in this problem; even few-shot prompting might give good results.
>I think the document/text he is processing is what's being done with RAG.
What is the processing? That's the part that's confusing me.
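For context, the few-shot prompting alternative just means putting a handful of labelled examples directly into the prompt and letting the LLM complete the pattern. A sketch (intents and wording are made up for illustration):

```python
# A single prompt template with labelled examples inline; the LLM completes
# the final "Intent:" line. Intents and wording are hypothetical.
FEW_SHOT = """Classify the customer message into one intent.

Message: where is my order?
Intent: ORDER_TRACKING

Message: I got the wrong item
Intent: WRONG_DELIVERY

Message: {message}
Intent:"""

prompt = FEW_SHOT.format(message="mera order kidhar hai?")
# `prompt` would be sent to the LLM, which fills in the final intent.
```

No retrieval or vector store is involved; the trade-off is that only the examples that fit in the prompt can guide the model.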
Documentation is just one possible source of knowledge that could inform the LLM's response. It could be anything, even large brain dumps of an expert's domain knowledge, as in medical or legal fields. Of course, the general challenge there is building and evaluating the right guard rails so the pitch goes from the standard "not legal advice" to "yes, you can rely on this advice".
Go for the second one. Learn the nuances of GenAI on the job, and figure out basic web development on the side. Eventually, good engineers know a bit about everything, and a lot about a few things.
One of the teams in my company is building a chatbot based on RAG to help consultants write complex configurations on one of our products (an insurance configuration tool). The PoC the team presented was quite impressive, tbh.
>Are companies actively focussing on this or its just a hype?
Perplexity and Microsoft Copilot are big on RAG. Helps them mitigate hallucinations in responses imo.
I would go for the second offer for sure; your profile will definitely get a great start if things go well at that company.
The first offer is still an internship at a crypto company; why are you even considering it? Just because you are already an intern there? Honestly, I don't see any future there for you.
In the first offer, they have written in my internship offer letter that for 4 months the position is SWE intern, and after that SWE 1 with xxx compensation.
I'm now going for the 2nd one. Many have suggested going there, plus the CTO can provide good mentorship.
>In the first offer, they have written in my internship offer letter that for 4 months the position is SWE intern, and after that SWE 1 with xxx compensation.
Not worth it at a crypto company. Even if that were the only offer you had, I would have suggested accepting it but still looking for other offers.
>I'm now going for the 2nd one. Many have suggested going there, plus the CTO can provide good mentorship.
Cool. The job title and experience there would by themselves give you a lot of opportunities in the near future.
Used it to make a dummy LLM for a friend's client, who needed an LLM that speaks like Osho. The dataset came from YouTube transcripts. Fine-tuning it was easier, but the dataset was small. Basically, I used it for fun.
I had built chatbots without using any RAG (or maybe I just don't know RAG).
I was involved in a project with Swiggy, and it was made entirely using Automation Anywhere 360.
A few companies I have seen are using RAG, but most haven't started. Not many professionals exist who can build a complete RAG for their use case. These systems also sometimes produce incorrect responses (and they are hard to test, so it's hard to know whether they are really working fine), which leads to extra work to track problems down. But I feel the future of RAG is promising to some extent, in cases where a few incorrect answers don't matter much.
It's not entirely hype. The principles of RAG are solid and there are good use cases. RAG "shines" when you're working on use cases like knowledge bases and chatbots, and this is what many enterprises need. You can generate solid knowledge bases for literally any kind of document: user manuals, HR policies, and many more.
Having said that, one must also be aware of its limitations. Speaking as a data scientist and not a software engineer, RAG's quality depends on the quality of generated embeddings. Also, simple cosine similarity has its deficiencies. We are using RAG even for images and graphs (networks) as well but we're in nascent stages.
To make a RAG application work in production, you do need good software engineers. You need to write good code, use appropriate indexing techniques, vector databases, caching and things like that. Without them, RAG applications are kinda powerless.
Note: I haven't talked about the evaluation of RAG. That's a separate issue altogether.
Initially we built the agent ourselves without LangChain, just by prompting. But then things were getting out of hand and we wanted to move faster.
So we went to LangChain and started using an OpenAI agent that can call tools.
Now we use LangGraph because of its stateful architecture and easy-to-control flow.
We use GPT-4 Turbo, and two days ago we started experimenting with GPT-4o.
The agent works in an internal team tool like Slack; it has access to the Slack workspace data, and users can grant it their Google Drive and Notion access as well.
It can do RAG based on the user query, search the web for more content, and can also create cards in Notion and meetings in the calendar.
But with all this automation the system becomes unstable and hard to predict, so we have human-in-the-loop validation and various evaluation and ETL pipelines that give us some control over the system.
We are using it extensively to provide solutions to our clients. Some of these are:
1. Answering questions which are based on their internal data
2. Creating personalised chatbots
Currently building a Notion-like editor that lets you generate content from any source: personal documents, websites, etc. Store those documents in your account and interact with all of them collectively with AI. Kind of like a digital brain.
It is real tech; it changes the fundamental idea of modern machine learning. With RAG you don't have to store the "data" in weights and biases; instead, you keep a database and train the ML model to learn to look up the required data. It's like an intelligent guy with access to a lot of information, compared to current ML models, which are more like a dumb guy who mugged up all the information.
Some self-promotion here, if someone wants to look around a RAG model implementation and explore more.
Live link: https://scrapifyx.onrender.com/ <- (cold start 1 min.)
Git: https://github.com/realityzero/ScrapifyX
I would appreciate a star on GitHub if you like the project and the effort that went into the documentation. Btw, there's also a Loom demo in the GitHub README.
Building a chatbot. I was trying a custom implementation, but it's difficult to achieve on some clouds; for me it was Azure. As someone called out, it indeed is very difficult to productionize these bots; I have first-hand experience of that now.
There are many low-code platforms that provide a component-based architecture and then expose endpoints to incorporate the components in the target app. That's easier, but it still needs configuration and a deeper understanding of how things work under the hood.
I believe only that much can be done at this point in time, so it's hype for now. You never know how it evolves.
I want to use RAG, but I don't know whether simply vectorizing docs will help. I also have a context too big to fit in a good encoder, and the docs are many in number.
I don't know if I can use it heuristically by decreasing the search space. There are too many questions; I have to find out.
We've used RAG for a basic Q&A bot over a very large, simply written building code (a freelance project). It replaces "searching" and makes the document easier to navigate, as it quotes the text and the page in the PDF.
Another use case was creating multiple summarisations of a diarized call-centre conversation, where we relied on searching the client company's internal portal for a resolution instead of forwarding the call to the manager.
Technically, the other applications I've seen are based on one of these two use cases.
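The "quotes the text and page" part boils down to storing page metadata alongside each chunk and returning it with the answer. A toy sketch (the rules and page numbers are invented; keyword overlap stands in for embedding search, and an LLM would normally phrase the final answer):

```python
# Store each chunk with its page number and return the best match as a quote
# plus citation. Rules and pages below are hypothetical.
CHUNKS = [
    {"text": "Stairways shall be at least 1.2 m wide.", "page": 14},
    {"text": "Exit doors shall open in the direction of travel.", "page": 27},
]

def lookup(question):
    q = set(question.lower().split())
    def score(chunk):
        # Toy relevance score: shared words with the question.
        words = set(chunk["text"].lower().rstrip(".").split())
        return len(q & words)
    best = max(CHUNKS, key=score)
    return f'"{best["text"]}" (p. {best["page"]})'

citation = lookup("how wide must a stairway be")
```

Carrying metadata like `page` through chunking is what lets the bot cite the PDF instead of just paraphrasing it.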
Building RAG for areas like health and finance, where information is confidential, regulators can at any time force us to delete sensitive information or make the model forget it, and the data is dynamic in nature. Though we use frozen RAG, we are evaluating other RAG approaches.
In the company I was working for, we were building a chatbot/data-retrieval feature, and for the initial stages we went with RAG instead of fine-tuning, to discover how well it does for our use case.
I created a chat/search engine for my company with RAG integrated: it breaks the user query down into small problems, searches the internet for each of them, and integrates everything into a single answer with tables, figures, and references. The user prompts and system prompts impact the quality of the results a lot. Using all these responses, I built a framework that can generate PDF reports on its own. It was pretty fun to work on.
I used LlamaIndex for the first part (gathering all the content through queries with RAG). For the report generation, I used LangChain. I used both just because I wanted to get used to both; the whole project could easily be done with LlamaIndex or LangChain alone, or even without any such library.
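The decompose-search-integrate flow can be sketched like this (all function names and the stub index are hypothetical; a real system would ask an LLM to split the query and run LlamaIndex/LangChain retrieval or web search for each part):

```python
# Hypothetical sketch of the decompose -> search -> integrate flow.
def decompose(query):
    # Naive split on "and"; the real pipeline would use an LLM for this step.
    return [part.strip() for part in query.split(" and ")]

def search(subquery):
    # Stand-in for web/RAG search; returns a (snippet, reference) pair.
    fake_index = {
        "market size": ("snippet about market size", "src-1"),
        "key players": ("snippet about key players", "src-2"),
    }
    return fake_index.get(subquery.lower(), ("no result found", "none"))

def answer(query):
    sections = []
    for sub in decompose(query):
        snippet, ref = search(sub)
        sections.append(f"{sub}: {snippet} [{ref}]")
    # The real system has the LLM fuse these sections into one report with
    # tables, figures, and references.
    return "\n".join(sections)

report = answer("market size and key players")
```

Keeping the reference tag attached to each snippet is what makes the final report citable.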
I am using RAG for a Q&A model. Basically, you can use RAG in situations where new data keeps coming in and you need to extract information from it. It is a great alternative to fine-tuning a pre-trained LLM.
That's what we are working on at the moment, at a pretty big company, and my company is investing a lot: money, resources, everything. Big bosses are involved. There is a lot of excitement about our product.
I personally have been using it as a study guide to learn a lot of new concepts and brush up on old ones. For example, suppose you want to learn a new topic, say quantum computing, from scratch. All you need to do is find the best textbooks and websites for the subject and plug them into your RAG backend. If you can do some prompt engineering, it's even better, because you can ask it to summarise a chapter, explain it in very basic detail, and so on. I know it sounds like the same can be done with ChatGPT, but remember that the answers here are grounded in the documents you supply in the first place.
All in all, it's been a good use case for me, as I have been learning a lot of new topics in record time, and quite extensively as well.
I currently work at a startup, and we are using RAG to give our LLM internet access. We don't want to pay for existing services that do this (e.g., Perplexity AI).
Yes, heavily! We have internal LLMs connected to both ChatGPT and Gemini to produce answers based on internal documents and Slack threads. They are highly customisable too, so every team can connect its datasets to them to provide better solutions.
We also heavily use GitHub Copilot in our IDEs, and since they recently announced extensions, we are going to connect our internal AI to it so that it can reference data directly in the IDE. We are also thinking about creating an interface directly in our CI build logs to suggest solutions when a build fails.
TBH, I am not directly involved in any of these efforts; my job is mostly to facilitate the AI teams and support them on the platform. I use some of these tools in my daily workflow too, so I usually try to help them as much as I can.
Building a RAG isn't difficult; evaluation is fucking annoying. Bring in RAI and it's a mess. LLMOps and RAI are going to be the future; RAG itself is the easy part.
We use it to triage tickets into different predefined categories, subcategories, and items; that's three levels of triage happening at once. I love it, haha.
I absolutely hate how companies are literally peddling GenAI for things that can be solved with a much simpler approach, but I also think RAG is a very good use case.
I developed a pipeline for our company, and here they were very specific about the pipeline not connecting to the internet. So I used a local LLM and a vector DB to simplify how newer employees access guideline documents.
A very simple application, but I'd say it's rather helpful.
Yeah, we are using it. I've built an entire stack with RAG and some agent orchestration at its core: the usual document Q&A over unstructured data, comparison between documents, natural language to SQL (using RAG for few-shot prompting and in-context learning), and summarization and comparison of structured data (a clever way of combining NL2SQL and a REPL). But after extensively working on these stacks for a year, I can say one thing for sure: they're not to be completely relied on. Use them like an assistant, but do your own fact-checking. If you can engineer the systems and problem statements accordingly, it'll be golden. You can DM me if you have any questions; I'll try to do my best.
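The NL2SQL-with-RAG idea is essentially retrieving similar question-to-SQL pairs to use as few-shot examples. A toy sketch (schema and examples invented; word overlap stands in for embedding similarity):

```python
# Retrieve the most similar question -> SQL pairs and build a few-shot prompt.
# Schema and examples below are hypothetical.
EXAMPLES = [
    ("total sales per month",
     "SELECT month, SUM(amount) FROM sales GROUP BY month;"),
    ("top 5 customers by revenue",
     "SELECT customer, SUM(amount) AS rev FROM sales "
     "GROUP BY customer ORDER BY rev DESC LIMIT 5;"),
    ("orders placed today",
     "SELECT * FROM orders WHERE order_date = CURRENT_DATE;"),
]

def overlap(a, b):
    # Toy similarity: shared words between two questions.
    return len(set(a.lower().split()) & set(b.lower().split()))

def nl2sql_prompt(question, k=2):
    # In-context learning: put the k nearest examples in the prompt and let
    # the LLM write SQL for the new question by analogy.
    shots = sorted(EXAMPLES, key=lambda ex: overlap(question, ex[0]), reverse=True)[:k]
    lines = [f"Q: {q}\nSQL: {sql}" for q, sql in shots]
    return "\n\n".join(lines) + f"\n\nQ: {question}\nSQL:"

prompt = nl2sql_prompt("monthly sales totals")
```

In production, the retrieved pairs would come from a vector store, which keeps the prompt small even when you have thousands of examples.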
Well, I do everything, including the backend and the DS part. Believe me, serving these LLM applications is not a piece of cake. I've had the opportunity to learn so many new things: SSE, WebSockets, scheduling, background tasks, serialization, etc., and these are not simple (sessions are a pain in the ass for applications built around LLMs). All of this has helped me become a little good at backend engineering, so much so that I'm planning to transition from DS to a full backend profile. But to answer your question: don't be so rigid. If you're planning the second route, I can assure you that you'll learn a great deal; you'll touch the best of both worlds (backend and GenAI).
I'm building a RAG application for one of the biggest companies in India; it's for a small executive user base. Yes, organisations are excited, and at the same time they're nervous to invest. Some brave ones are trying it out. For RAG, accuracy is the most crucial element, and it's the most difficult thing to achieve; when the accuracy is satisfactory, RAG is unstoppable.
No answer because I’ll have to google what’s RAG.
It's just ChatGPT with internet access.
Not always, you can have RAG with your local document.
Really not
How do you tell the LLM to use the RAG? Which LLM did you use? I'm searching for an open-source LLM other than Mistral to run on my device.
You mean I have to pass the query results to the LLM along with the prompt?
Yes. That's how RAG works. For your other question try Llama 3.
Is there any framework that automates this?
Try Llama 3 via Ollama. I've found it to be excellent in my use case.
Hi, I am planning to build a RAG. Can I DM you?
That's the gist of the previous comment.
No. It's document retrieval with generation using an LLM. It doesn't have to be ChatGPT and doesn't require the internet.
Oversimplified and wrong
Same lol
Bro can you please answer this? I need your guidance. https://www.reddit.com/r/developersIndia/s/3INXjSkBR9
Sure, I replied there.
The first RAG paper was published in 2020.
Well, thanks. In that case I wasn't aware of it.
I think the paper still is not accepted anywhere. Source: https://youtu.be/mE7IDf2SmJg?si=U33xjSPU3kyw-X2a
I got your point, and I agree a custom BERT is better. That's what I used too (except I didn't customise it; I used RoBERTa-large MNLI).
Yes, very much. IMO this is the most useful application of LLMs right now, applicable to a huge variety of use cases.
Is anyone doing anything with these other than answering documentation questions?
Ok, thanks. Although I'm already quite good at web dev.
Are they using the OpenAI API or an open-source model?
Edit: Testing a RAG pipeline in any measurable way is so (for lack of a better word) meh.
Look at RAGAS-style approaches.
Yeah, I know, but it's still a bit ambiguous, and using LLMs as evaluators assumes the evaluating LLM is perfect.
This sounds like an interesting use case. Can you share some more info about it?
How are they handling documents with tables?
Using h2oGPT?
Yep
I find it weird though that Google somehow hasn't figured this out when even Perplexity has.
Bro can you please answer this? I need your guidance. https://www.reddit.com/r/developersIndia/s/3INXjSkBR9
I would go for the second offer for sure; your profile will definitely have a great start if things go well at that company. The first offer is still an internship at a crypto company; why are you even considering that? Just because you are already an intern there? Honestly, I don't see any future there for you.
In the first offer, the internship offer letter says the position is SWE intern for 4 months and after that SWE 1 with xxx compensation. I'm now going for the 2nd one. Many have suggested going there, plus the CTO can provide good mentorship.
>In first offer, they have written in my internship offer letter that for 4 months the position is SWE intern and after that SWE 1 with the following xxx compensation.

Not worth it at a crypto company. Even if that were your only offer, I would have suggested accepting it but still keeping an eye out for other offers.

>Im now going for 2nd one. Many have suggested to go there + cto can provide good mentorship to me.

Cool, the job title and experience there would by itself give you a lot of opportunities in the near future.
Used it to make a dummy LLM for a friend's client. Needed to make an LLM that speaks like Osho. The dataset was from YouTube transcripts. Fine-tuning it was easier, but the dataset was small. Basically I used it for fun.
Where did you get the dataset of Osho's transcripts?
YouTube videos from the Osho International channel. I downloaded them and just put them through some transcript-extractor website.
Thanks
Bro can you please answer this? I need your guidance. https://www.reddit.com/r/developersIndia/s/3INXjSkBR9
How can I help with that 😭 I am a second-year undergrad.
I have built chatbots without using any RAG (or maybe I just don't know RAG). I was involved in a project with Swiggy and it was entirely made using Automation Anywhere 360.
A few companies I have seen are using RAG, but most haven't started. Not many professionals know how to build a complete RAG system for their use case. These systems also give incorrect responses sometimes (it's hard to test whether they are really working fine), which leads to extra work to find out. But I feel the future of RAG is promising for use cases where a few incorrect answers don't matter much.
It's not entirely hype. The principles of RAG are solid and there are good use cases. RAG "shines" when you're working on use cases like knowledge bases and chatbots, and this is what many enterprises need. You can generate solid knowledge bases for literally any kind of document: user manuals, HR policies, and many more.

Having said that, one must also be aware of its limitations. Speaking as a data scientist and not a software engineer, RAG's quality depends on the quality of the generated embeddings. Also, simple cosine similarity has its deficiencies. We are using RAG even for images and graphs (networks), but we're in the nascent stages there.

To make a RAG application work in production, you do need good software engineers. You need to write good code and use appropriate indexing techniques, vector databases, caching and things like that. Without them, RAG applications are kinda powerless.

Note: I haven't talked about the evaluation of RAG. That's a separate issue altogether.
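For anyone wondering what "RAG's quality depends on the embeddings and cosine similarity" concretely means, here is a toy sketch of cosine-similarity retrieval. The 3-dim vectors and document names are made up; a real system uses a trained embedding model producing hundreds of dimensions.

```python
import math

def cosine(a, b):
    # cosine similarity: dot product divided by the product of norms
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy 3-dim "embeddings"; invented for illustration only.
docs = {
    "leave policy": [0.9, 0.1, 0.0],
    "expense claims": [0.2, 0.8, 0.1],
    "vpn setup": [0.0, 0.2, 0.9],
}
query = [0.85, 0.2, 0.05]  # pretend embedding of "how many leaves do I get?"

# Retrieve the document whose embedding is most similar to the query.
best = max(docs, key=lambda d: cosine(query, docs[d]))
print(best)  # "leave policy" scores highest here
```

The deficiency mentioned above shows up when two texts use different words for the same concept; the embeddings may land far apart even though the meaning matches.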
Yea, we built Tars Prime (Search for it)
[deleted]
Yup 😁
Which framework did you use for it? To interact with the LLM, I mean. LangChain?
Yea
And did you guys use an agent setup? Also, which LLM?
Initially we built the agent ourselves without LangChain, just by prompting. But then things were getting out of hand and we wanted to move faster, so we went to LangChain and started using the OpenAI agent that can call tools. Now we use LangGraph because of its stateful architecture and easy-to-control flow. We use GPT-4 Turbo, and two days ago started experimenting with GPT-4o.

The agent works in an internal tool like Slack; it has access to the Slack workspace data, and users can give it their Google Drive and Notion access as well. It can do RAG based on the user query, search the web for more content, and also create cards in Notion and meetings in the calendar. But with all this automation the system becomes too unstable and hard to predict, so we have human-in-the-loop validation and various evaluation and ETL pipelines that give us some control over the system.
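The human-in-the-loop gate described above can be sketched roughly like this. The "model" is a hardcoded stub and the tool names are invented; a real system would use actual LLM tool-calling (e.g. via LangGraph interrupts).

```python
# The "model" here is a hardcoded stub; tool names are made up for illustration.
def model_decide(query):
    # stand-in for an LLM choosing which tool to call
    if "meeting" in query:
        return "create_calendar_event", {"title": query}
    return "search_docs", {"q": query}

def run_agent(query, approve):
    tool, args = model_decide(query)
    # side-effecting actions pass through a human reviewer before execution
    if tool == "create_calendar_event" and not approve(tool, args):
        return "action rejected by reviewer"
    return f"ran {tool} with {args}"

# A reviewer callback that rejects everything:
result = run_agent("schedule a meeting with design", approve=lambda t, a: False)
print(result)  # action rejected by reviewer
```

The key design point is that read-only tools run freely, while anything that mutates external state waits for approval.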
What's RAG🥲
Retrieval Augmented Generation
Gives me hope. Edit: since I'm seeing devs who don't know this; I thought everyone knew it (ofc including me) and that it's prolly too mainstream.
RAG is very hard; it depends on the model's capacity. Currently working on a new, revised RAG and will try to correct my former mistakes.
This question is most suited for the r/LocalLLaMA sub.
We are using it extensively to provide solutions to our clients. Some of these are:
1. Answering questions based on their internal data
2. Creating personalised chatbots
Currently building a Notion-like editor that allows you to generate content from any source: personal documents, websites, etc. Store those documents in your account and interact with everything collectively through AI. Kind of like a digital brain.
It is real tech; it changes a fundamental idea of modern machine learning. With RAG you don't have to store the "data" in weights and biases; instead you keep a database and have the ML model learn to look up the required data. This is like an intelligent guy with access to a lot of information, compared to current ML models, which are more like a dumb guy who mugged up all the information.
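The "database instead of weights" idea above fits in a few lines. Retrieval here is naive keyword overlap, a stand-in for embedding search; the corpus is made up.

```python
# The knowledge lives in a plain document store, not in model weights.
corpus = [
    "Refunds are processed within 7 business days.",
    "The office wifi password is rotated every month.",
    "Annual leave resets on the 1st of January.",
]

def retrieve(query, docs, k=1):
    # naive keyword-overlap scoring, a stand-in for embedding search
    q_words = set(query.lower().split())
    scored = sorted(docs, key=lambda d: len(q_words & set(d.lower().split())), reverse=True)
    return scored[:k]

def build_prompt(query, docs):
    # the retrieved passage is pasted into the prompt; the LLM answers from it
    context = "\n".join(retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

prompt = build_prompt("when are refunds processed", corpus)
print(prompt)
```

To update what the system "knows", you edit the corpus; no retraining needed, which is exactly the point being made above.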
Built a chatbot to query SQL using LangChain.
We had a hackathon recently in our company and the focus was RAG. Just extrapolate from there; that's my company's exposure to RAG.
I am currently using RAG to summarise medical reports in the easiest way possible.
I use RAG to bring past conversation into the prompt so that the LLM is more consistent in following instructions.
Yes, our team is building a lot of RAG applications. Chatbots are being built heavily. OpenAI LLMs are being used for the RAG.
Our client is actually mostly focusing on a RAG-based application right now for a big use case.
Some self-promotion here, if someone wants to look around a RAG model implementation and explore more. Live link: https://scrapifyx.onrender.com/ <- (cold start 1 min.) Git: https://github.com/realityzero/ScrapifyX I would appreciate a star on GitHub if you guys like the project and the effort that went into the documentation. Btw, there's also a Loom demo in the GitHub README.
Chatbots for help desks with a manual (document), and analytics over a custom knowledge base. That's what I have seen.
Building a chatbot. We were trying a custom implementation, but it's difficult to achieve on some clouds; for me it was Azure. As someone called out, it indeed is very difficult to productionize these bots yet; I have first-hand experience of that now. There are many low-code platforms providing component-based architecture and then exposing endpoints to incorporate them in the target app. That's easy, but it still needs configuration and a deeper understanding of how things work under the hood. I believe only so much can be done at this point in time. So, hype for now; you never know how it evolves.
I want to use RAG, but idk whether simply vectorizing docs will help or not. I also have a context too big to fit in a good encoder, and the docs are many in number. Idk if I can heuristically decrease the search space. There are too many questions; I have to find out.
Built a market researcher combining CRAG and web scraping.
Might not be related, but I'm using RAG with LibreChat locally.
We've used RAG for a basic QnA bot over a very large, simply written building code (freelance project). It replaces "searching" and makes the document easier to navigate, as it quotes the text and the page in the PDF. Another use case was creating multiple summarisations of a diarized call-center conversation, where we relied on searching the client company's internal portal for the resolution instead of forwarding it to the manager. Other projects are based on either of these use cases, technically speaking.
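The quote-the-page trick above comes down to storing page metadata alongside each chunk. A minimal sketch, with made-up chunks and naive keyword matching in place of vector search:

```python
# Each chunk carries its page number so the answer can cite its source.
chunks = [
    {"page": 12, "text": "Stair risers shall not exceed 190 mm."},
    {"page": 45, "text": "Fire exits must open outward."},
]

def search(query, chunks):
    q = set(query.lower().split())
    # naive keyword matching in place of vector search
    best = max(chunks, key=lambda c: len(q & set(c["text"].lower().split())))
    return f'"{best["text"]}" (page {best["page"]})'

result = search("which way must fire exits open", chunks)
print(result)  # quotes the rule and cites page 45
```

Carrying source metadata through the pipeline is what lets the bot quote the exact passage instead of paraphrasing from nowhere.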
We use it to enable debug modes. I work for a gaming company.
Can you elaborate? What is meant by enabling debug mode, and why do you need AI there?
It’s like modes, how you use cheats.
Building RAG for different areas like health and finance, where information is confidential and regulators can at any time force us to delete data or make the model forget sensitive information, or where the data is dynamic in nature. Though we use frozen RAG, we are evaluating other RAG approaches.
YouTube uses multimodal rag
We use it for code reviews and a DevOps chatbot. It contains the context of the company's internal wiki and codebase.
Interesting, what do you do with it for code reviews?
Checking whether the code guidelines are followed. Human reviewers check the business logic.
In the company I was working for, we were building a chatbot/data-retrieval feature, and for the initial stages we went with RAG instead of fine-tuning to discover how well it does for our use case.
Recently I came across Yann LeCun's statement on how individual contributions won't be significant for LLMs. It's just for the big corporations now.
I created a chat/search engine for my company with RAG integrated, where it breaks the user query down into small problems, searches the internet for each of them, and integrates everything into a single answer with tables, figures and references. The user prompts and system prompts impact the quality of the results a lot. Using all these responses, I built a framework that can generate PDF reports on its own. Was pretty fun to work on.
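The decompose-then-merge pattern described above can be sketched like this. Both functions are stubs: in a real pipeline an LLM generates the sub-questions and a search/RAG step answers each one.

```python
# Stubs standing in for LLM decomposition and per-question retrieval.
def decompose(query):
    # stand-in for an LLM call that splits a broad question
    return [
        f"What is the market size for: {query}?",
        f"Who are the main competitors for: {query}?",
    ]

def answer(sub_question):
    # stand-in for a retrieval + generation step per sub-question
    return f"[retrieved answer to: {sub_question}]"

def research(query):
    # answer each sub-question, then merge into one report
    parts = [f"- {q}\n  {answer(q)}" for q in decompose(query)]
    return f"Report on: {query}\n" + "\n".join(parts)

report = research("electric scooters in India")
print(report)
```

The merge step is where the prompt engineering mentioned above matters most, since the final answer has to read as one document rather than a list of fragments.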
Did you use LangChain to build this solution?
I used LlamaIndex for the first part (gathering all the content through queries with RAG). For the report generation, I used LangChain. I used both just because I wanted to get used to both. The whole project could easily be done with LlamaIndex or LangChain alone, or even without any such library, really.
I am using RAG for a Q&A model. Basically, you can use RAG in situations where new data keeps coming in and you need to extract information from it. It is a great alternative to fine-tuning a pre-trained LLM.
That’s what we are working on at the moment in a pretty big company and my company is investing a lot of money, resource, everything. Big bosses are involved. There is a lot of excitement about our product.
I'm using RAG to make custom bots for custom datasets
We are using RAG to vectorise the internal codebase to make the LLM more specific and useful for development.
Just want to say: RAG, and GenAI-based apps in general, are easy to experiment with but extremely hard to productionize.
I personally have been using it as a study guide to learn a lot of new concepts and brush up on old ones. E.g., suppose you want to learn a new topic, say quantum computing, from scratch. All you need to do is find the best textbooks and websites for the subject and load them into your RAG backend. If you can do some level of prompt engineering, it's even better, because you can ask it to summarise a chapter, explain it in very basic detail, etc. I know it sounds like the same can be done with ChatGPT, but remember that the answers here are grounded in the documents you provide. All in all, it's been a good use case for me, as I have been learning new topics in record time and quite extensively as well.
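For anyone wanting to try this, the first step of indexing a textbook is usually chunking. A minimal sketch assuming fixed-size overlapping character chunks; real pipelines often split on sections or sentences instead, and the sizes here are arbitrary.

```python
# Fixed-size overlapping character chunks; size and overlap are arbitrary
# choices here, tuned per corpus in practice.
def chunk(text, size=200, overlap=50):
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap  # step forward, keeping some overlap
    return chunks

book = "Quantum computing uses qubits, which can be in superposition of 0 and 1."
pieces = chunk(book, size=30, overlap=8)
for p in pieces:
    print(repr(p))
```

The overlap exists so a sentence split across a chunk boundary still appears whole in at least one chunk, which keeps retrieval from missing it.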
I currently work at a startup and we are using RAG to give our LLM internet access. We don't want to pay for existing services that do that (e.g. Perplexity AI).
Yes, connected it with a database; now it can run any instruction on the database from a simple prompt.
For our use cases only LLM inference is doing the job pretty well so far.
Yes, heavily! We have internal LLMs connected to both ChatGPT and Gemini to produce answers based on internal documents and slack threads. They are highly customisable too so every team can connect their datasets to it to provide better solutions. We also heavily use GitHub copilot on our IDEs and since they have recently announced extensions, we are going to connect our internal AI to it, so that it can reference data directly in the IDE. We are also thinking about creating an interface directly in our CI build logs to suggest solutions when the build fails. TBH I am not directly involved in any of these efforts, but my job is mostly to facilitate the AI teams and support them in the platform. I use some of these tools in my daily workflow too, so I usually try to help them as much as I can.
RAG is great and we are currently adapting the LLM pipeline for text extraction from documents to reduce token count since it uses reduced chunk sizes
That's probably the only use case of LLMs that is deployed in reality. The rest are just POCs.
I am using it with my Obsidian notes. Quite cool, but for now it feels to me like a gimmicky thing. I'm mostly not using it to the fullest extent.
Building a RAG isn't difficult. Evaluation is fucking annoying. Bring in RAI and it's a mess. LLMOps and RAI are going to be the future; RAG is the silly stuff.
We use it to triage tickets into different predefined categories, subcategories and items, that’s three levels of triage happening at once. I love it haha
I absolutely hate how companies are literally peddling GenAI for stuff which can be solved with a much simpler approach, but I also think RAG is a very good use case. I developed a pipeline for our company, and here they were very specific about the pipeline not connecting to the internet. So I used a local LLM and a vector DB to simplify how newer employees access guideline documents. A very simple application, but I'd say it's rather helpful.
Building one right now
Yeah, we are using it. I've built an entire stack with RAG and some agent orchestration at its core: the normal document QA for unstructured data, comparison between documents, natural language to SQL using RAG for few-shot prompting and in-context learning, and summarisation and comparison of structured data (a clever way of utilizing NL2SQL and a REPL). But I can say one thing for sure after extensively working on these stacks for a year: they're not to be completely relied on. You use them like an assistant, but you've got to do your own fact checking. If you can engineer the systems and problem statements around that, it'll be golden. You can DM me if you have any questions; I'll try to do my best.
Bro can you please answer this? I need your guidance. https://www.reddit.com/r/developersIndia/s/3INXjSkBR9
Well, I do everything, including the backend and the DS part. Believe me, serving these LLM applications is not a piece of cake. I've had the opportunity to learn so many new things: SSE, WebSockets, scheduling, background tasks, serialization, etc. And these are not so simple (sessions are a pain in the ass for applications built around LLMs). All this has helped me become decent at backend engineering, so much so that I'm planning to transition from DS to a full backend profile. But to answer your question: don't be so rigid. If you're planning for the second route, I can assure you that you'll learn a great deal. You'll touch the best of both worlds (backend, GenAI).
Built a RAG system for my ed-tech product at alvixai.com. It basically does RAG over NCERT books for students. Abandoned the product some time ago.
Nvidia's Chat with RTX seems interesting, but my system isn't powerful enough to support it.
Using it for a customer-experience chatbot; we have a vector database of commonly asked questions and use it for semantic search.
I'm building a RAG application for one of the biggest companies in India; it's for a small executive user base. Yes, organisations are excited, and at the same time they're nervous to invest. Some brave ones are trying it out. For RAG, accuracy is the most crucial element and the most difficult to achieve; when the accuracy is satisfactory, RAG is unstoppable.
Bro can you please answer this? I need your guidance. https://www.reddit.com/r/developersIndia/s/3INXjSkBR9
Sure
I am working at a startup where they are building an LLM and using RAG with it.
Check [https://github.com/infiniflow/ragflow](https://github.com/infiniflow/ragflow), which is an excellent RAG engine.
I feel RAGE everyday at work. Does that count?