
Conversation

@Delacrobix
Contributor

No description provided.

@gitnotebooks

gitnotebooks bot commented Sep 18, 2025

Contributor

Can you amend the output to ask the LLM to include sources? This will make it easier for the audience to find the applicable document from the dataset.

Contributor Author

I added citations to the LLM’s responses!
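One common way to get citations from the model is to number the retrieved documents in the context and ask for `[n]` markers in the answer. A minimal sketch of that idea, assuming documents with `title`/`content` keys; the prompt actually used in the commit may differ:

```python
def build_prompt(question, documents):
    """Build a RAG prompt that asks the model to cite numbered sources.

    Hypothetical helper for illustration; not the exact committed code.
    """
    # Number each retrieved document so the model can reference it as [n]
    context = "\n\n".join(
        f"[{i + 1}] {doc['title']}: {doc['content']}"
        for i, doc in enumerate(documents)
    )
    return (
        "Answer the question using only the context below. "
        "Cite the sources you used as [n] markers, and list the "
        "cited document titles at the end of your answer.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
```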

Contributor

I'm slightly worried about including an example suggesting that a particular technology (specifically Elasticsearch) is slow, since this is on Elasticsearch Labs, especially since one of the other models suggests it's inefficient. It might be worth amending the transcripts to include a common slowness use case (such as sharding or node issues at high volume) and regenerating the answer. Alternatively, I would change it to different technologies.

Contributor Author

You’re right, I will make sure to avoid mentioning specific names in the datasets. I removed those names and used generic ones; for example, I replaced Elasticsearch with “Database” and Redis with “Cache implementation.” Related commit.



## Stats
✅ Indexed 5 documents in 250ms
Contributor

Why does the indexing differ between models? I would expect indexing to be a one-off operation independent of the model. This doesn't make sense to me. Should it be removed or clarified?

Contributor Author

I’m not sure why it differs between tests; I think it’s better to remove it. I did.

Contributor

I would change this example to something generic such as "Why is the sky blue?" or something else. As a developer this comes across as quite cringy to me.

Contributor Author

Changed. I added a new file showing the output response. Related commit.

from openai import OpenAI

ES_URL = "http://localhost:9200"
ES_API_KEY = "your-api-key-here"
Contributor

I would change the URL, API key and LOCAL_AI_URL values to environment variables that are loaded via something like dotenv and a local .env file. While this is fine for local development, developers will need to tidy it up when they move this to production, so let's set the example now.

Contributor Author

Ok! Added dotenv support! Related commit.
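For reference, the dotenv-based configuration suggested above might look roughly like this (a sketch, not the exact committed code; the variable names mirror the excerpt, and the `LOCAL_AI_URL` default is an assumption):

```python
import os

try:
    # pip install python-dotenv
    from dotenv import load_dotenv
    load_dotenv()  # reads key=value pairs from a local .env file
except ImportError:
    pass  # fall back to plain environment variables

# Environment-driven config with local-development defaults
ES_URL = os.getenv("ES_URL", "http://localhost:9200")
ES_API_KEY = os.getenv("ES_API_KEY")
LOCAL_AI_URL = os.getenv("LOCAL_AI_URL", "http://localhost:8080/v1")
```

A matching `.env` file would then hold the real values and stay out of version control.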

ai_client = OpenAI(base_url=LOCAL_AI_URL, api_key="sk-x")


def build_documents(dataset_folder, index_name):
Contributor

Should this be called load_documents, as it's opening text files? It's not really building the documents from scratch, which is misleading.

Contributor Author

Done! Method renamed!
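After the rename, the function might look roughly like this, reconstructed from the excerpts in this thread (a sketch, not the exact committed code):

```python
import os


def load_documents(dataset_folder, index_name):
    """Load .txt files from dataset_folder as bulk-indexable documents."""
    documents = []
    for filename in os.listdir(dataset_folder):
        if not filename.endswith(".txt"):
            continue
        filepath = os.path.join(dataset_folder, filename)
        # utf-8 so non-ASCII characters in the dataset read correctly
        with open(filepath, "r", encoding="utf-8") as file:
            documents.append({
                "_index": index_name,
                "_source": {"title": filename, "content": file.read()},
            })
    return documents
```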

if filename.endswith(".txt"):
    filepath = os.path.join(dataset_folder, filename)

    with open(filepath, "r", encoding="utf-8") as file:
Contributor

I would add a comment explaining why you've used utf-8 encoding here.

Contributor Author

Comment added!

}


def index_documents():
Contributor

Add top level comments for each function explaining what they do.

Contributor Author

Added function descriptions! Related commit.
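A sketch of the kind of top-level description added per function; the wording in the actual commit may differ, and the body is omitted here:

```python
def index_documents():
    """Read the .txt dataset from disk and bulk-index it into
    Elasticsearch, timing the operation.

    Hypothetical docstring sketch for illustration only.
    """
    pass  # body omitted in this sketch
```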

start_time = time.time()

try:
    response = ai_client.chat.completions.create(
Contributor

I would perhaps add a comment making clear that this is a simple generate rather than streaming of the response token by token.

Contributor Author

Added a clarifying comment!
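The distinction being clarified is roughly this (a sketch assuming the OpenAI-compatible client from the excerpt; the helper name `generate_answer` is hypothetical):

```python
def generate_answer(ai_client, model_name, prompt):
    # Simple (non-streaming) generation: the call blocks until the model
    # has produced the whole answer, then returns it in one response
    # object. Passing stream=True would instead yield chunks to iterate
    # over token by token.
    response = ai_client.chat.completions.create(
        model=model_name,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```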

try:
    start_time = time.time()

    success, _ = helpers.bulk(
Contributor

You should add the index creation code either here based on the condition that the index doesn't exist, or in a separate utility function. For semantic text you'll need to specify that mapping when creating the index, and that step is missing here.

Contributor Author

I added the index creation as a method in the script. I also added a verification step before bulk-indexing the data.
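The checked creation step might look something like this, assuming the 8.x Elasticsearch Python client's `indices.exists`/`indices.create` API; the helper name and field names are illustrative, not the exact committed code:

```python
def create_index_if_missing(es_client, index_name):
    """Create the index with a semantic_text mapping if it doesn't exist."""
    # Verification step: skip creation when the index is already there
    if es_client.indices.exists(index=index_name):
        return
    # semantic_text fields must be declared in the mapping at creation
    # time so Elasticsearch generates embeddings when documents are indexed
    es_client.indices.create(
        index=index_name,
        mappings={
            "properties": {
                "title": {"type": "text"},
                "content": {"type": "semantic_text"},
            }
        },
    )
```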
