
Commit 051e9e3

Add black formatter as pre-commit step (#189)
* Add `black-jupyter` pre-commit hook
* apply `black` formatting everywhere
* CI: remove Android from runner, set timeout=60
* directly remove irrelevant software without action
* add df output
1 parent 9d9d275 commit 051e9e3

File tree: 77 files changed (+8535 -8637 lines)


.github/workflows/tests.yml (+9 -1)

```diff
@@ -30,14 +30,22 @@ jobs:
       ports:
         - 9200:9200
     steps:
+      - name: Remove irrelevant software # to free up required disk space
+        run: |
+          df -h
+          sudo rm -rf /opt/ghc
+          sudo rm -rf /opt/hostedtoolcache/CodeQL
+          sudo rm -rf /usr/local/lib/android
+          sudo rm -rf /usr/share/dotnet
+          df -h
      - name: Checkout
        uses: actions/checkout@v4
      - name: Setup python
        uses: actions/setup-python@v5
        with:
          python-version: '3.10'
      - name: Setup nbtest
-        run: make nbtest
+        run: make install-nbtest
      - name: Warm up
        continue-on-error: true
        run: sleep 30 && PATCH_ES=1 ELASTIC_CLOUD_ID=foo ELASTIC_API_KEY=bar bin/nbtest notebooks/search/00-quick-start.ipynb
```

.pre-commit-config.yaml (+4 -0)

```diff
@@ -13,3 +13,7 @@ repos:
         # generic [...]_PASSWORD=[...] pattern
         - --additional-pattern
         - '_PASSWORD=[0-9a-zA-Z_-]{10}'
+  - repo: https://github.com/ambv/black
+    rev: 24.1.1 # Use latest tag on GitHub
+    hooks:
+      - id: black-jupyter
```
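The `black-jupyter` hook rewrites Python files and notebooks into black's style before a commit is accepted. As a minimal illustration (not taken from the repo), black's most visible change in this commit is quote normalization, which is purely cosmetic:

```python
# Illustration only: black normalizes single quotes to double quotes
# without changing the values the code produces.
before = {'cloud_id': 'foo', 'api_key': 'bar'}   # pre-black style
after = {"cloud_id": "foo", "api_key": "bar"}    # black's output style

# the two literals are semantically identical
assert before == after
```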

CONTRIBUTING.md (+13 -3)

````diff
@@ -5,24 +5,34 @@ If you would like to contribute new example apps to the `elasticsearch-labs` rep
 ## Before you start
 
 Prior to opening a pull request, please:
-- Create an issue to [discuss the scope of your proposal](https://github.com/elastic/elasticsearch-labs/issues). We are happy to provide guidance to make for a pleasant contribution experience.
-- Sign the [Contributor License Agreement](https://www.elastic.co/contributor-agreement/). We are not asking you to assign copyright to us, but to give us the right to distribute your code without restriction. We ask this of all contributors in order to assure our users of the origin and continuing existence of the code. You only need to sign the CLA once.
+1. Create an issue to [discuss the scope of your proposal](https://github.com/elastic/elasticsearch-labs/issues). We are happy to provide guidance to make for a pleasant contribution experience.
+2. Sign the [Contributor License Agreement](https://www.elastic.co/contributor-agreement/). We are not asking you to assign copyright to us, but to give us the right to distribute your code without restriction. We ask this of all contributors in order to assure our users of the origin and continuing existence of the code. You only need to sign the CLA once.
+3. Install pre-commit...
 
 ### Pre-commit hook
 
 This repository has a pre-commit hook that ensures that your contributed code follows our guidelines. It is strongly recommended that you install the pre-commit hook on your locally cloned repository, as that will allow you to check the correctness of your submission without having to wait for our continuous integration build. To install the pre-commit hook, clone this repository and then run the following command from its top-level directory:
 
 ```bash
-make pre-commit
+make install
 ```
 
 If you do not have access to the `make` utility, you can also install the pre-commit hook with Python:
 
 ```bash
 python -m venv .venv
+.venv/bin/pip install -qqq -r requirements-dev.txt
 .venv/bin/pre-commit install
 ```
 
+Now it can happen that you get an error when you try to commit, for example if your code or your notebook was not formatted with the [black formatter](https://github.com/psf/black). In this case, please run this command from the repo root:
+
+```bash
+make pre-commit
+```
+
+If you now include the changed files in your commit, it should succeed.
+
 ## General instruction
 
 - If the notebook or code sample requires signing up a Elastic cloud instance, make sure to add appropriate `utm_source` and `utm_content` in the cloud registration url. For example, the Elastic cloud sign up url for the Python notebooks should have `utm_source=github&utm_content=elasticsearch-labs-notebook` and code examples should have `utm_source=github&utm_content=elasticsearch-labs-samples`.
````
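The utm guideline above can be sketched as a query-string construction; the base URL below is a placeholder, only the `utm_source`/`utm_content` values come from the guideline:

```python
from urllib.parse import urlencode

# hypothetical base signup URL; substitute the real registration URL
base = "https://cloud.elastic.co/registration"
params = {"utm_source": "github", "utm_content": "elasticsearch-labs-notebook"}
notebook_url = f"{base}?{urlencode(params)}"
# -> ...?utm_source=github&utm_content=elasticsearch-labs-notebook
```

For code samples, the same construction applies with `utm_content=elasticsearch-labs-samples`.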

Makefile (+14 -10)

```diff
@@ -1,20 +1,24 @@
 # this is the list of notebooks that are integrated with the testing framework
 NOTEBOOKS = $(shell bin/find-notebooks-to-test.sh)
+VENV = .venv
 
-.PHONY: install pre-commit nbtest test notebooks
+.PHONY: install install-pre-commit install-nbtest test notebooks
 
-test: nbtest notebooks
+test: install-nbtest notebooks
 
 notebooks:
 	bin/nbtest $(NOTEBOOKS)
 
-install: pre-commit nbtest
+pre-commit: install-pre-commit
+	$(VENV)/bin/pre-commit run --all-files
 
-pre-commit:
-	python -m venv .venv
-	.venv/bin/pip install -qqq -r requirements-dev.txt
-	.venv/bin/pre-commit install
+install: install-pre-commit install-nbtest
 
-nbtest:
-	python3 -m venv .venv
-	.venv/bin/pip install -qqq elastic-nbtest
+install-pre-commit:
+	python -m venv $(VENV)
+	$(VENV)/bin/pip install -qqq -r requirements-dev.txt
+	$(VENV)/bin/pre-commit install
+
+install-nbtest:
+	python3 -m venv $(VENV)
+	$(VENV)/bin/pip install -qqq elastic-nbtest
```

bin/mocks/elasticsearch.py (+9 -9)

```diff
@@ -8,30 +8,30 @@ def patch_elasticsearch():
 
     # remove the path entry that refers to this directory
     for path in sys.path:
-        if not path.startswith('/'):
+        if not path.startswith("/"):
             path = os.path.join(os.getcwd(), path)
-        if __file__ == os.path.join(path, 'elasticsearch.py'):
+        if __file__ == os.path.join(path, "elasticsearch.py"):
             sys.path.remove(path)
             break
 
     # remove this module, and import the real one instead
-    del sys.modules['elasticsearch']
+    del sys.modules["elasticsearch"]
     import elasticsearch
 
     # restore the import path
     sys.path = saved_path
 
-    # preserve the original Elasticsearch.__init__ method
+    # preserve the original Elasticsearch.__init__ method
     orig_es_init = elasticsearch.Elasticsearch.__init__
 
     # patched version of Elasticsearch.__init__ that connects to self-hosted
     # regardless of connection arguments given
     def patched_es_init(self, *args, **kwargs):
-        if 'cloud_id' in kwargs:
-            assert kwargs['cloud_id'] == 'foo'
-        if 'api_key' in kwargs:
-            assert kwargs['api_key'] == 'bar'
-        return orig_es_init(self, 'http://localhost:9200')
+        if "cloud_id" in kwargs:
+            assert kwargs["cloud_id"] == "foo"
+        if "api_key" in kwargs:
+            assert kwargs["api_key"] == "bar"
+        return orig_es_init(self, "http://localhost:9200", timeout=60)
 
     # patch Elasticsearch.__init__
     elasticsearch.Elasticsearch.__init__ = patched_es_init
```
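The mock's core trick is saving a class's original `__init__` and substituting a wrapper that ignores the caller's connection arguments. A minimal sketch of the same pattern, with a stand-in class instead of the real `elasticsearch.Elasticsearch`:

```python
# Stand-in client class; in the mock this is elasticsearch.Elasticsearch.
class Client:
    def __init__(self, url):
        self.url = url

# preserve the original __init__ before replacing it
orig_init = Client.__init__

def patched_init(self, *args, **kwargs):
    # discard whatever was passed and connect to the local instance instead
    orig_init(self, "http://localhost:9200")

Client.__init__ = patched_init

# any constructor arguments are now silently redirected
c = Client("https://some-cloud-endpoint.example")
assert c.url == "http://localhost:9200"
```

This lets notebooks written against Elastic Cloud credentials run unmodified against the self-hosted CI instance.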

bin/nbtest (+1 -1)

```diff
@@ -2,7 +2,7 @@
 SCRIPT_DIR=$( cd -- "$( dirname -- "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )
 
 if [[ ! -f $SCRIPT_DIR/../.venv/bin/nbtest ]]; then
-    make nbtest
+    make install-nbtest
 fi
 
 if [[ "$PATCH_ES" != "" ]]; then
```

example-apps/chatbot-rag-app/api/chat.py (+20 -12)

```diff
@@ -36,31 +36,39 @@ def ask_question(question, session_id):
     if len(chat_history.messages) > 0:
         # create a condensed question
         condense_question_prompt = render_template(
-            'condense_question_prompt.txt', question=question,
-            chat_history=chat_history.messages)
+            "condense_question_prompt.txt",
+            question=question,
+            chat_history=chat_history.messages,
+        )
         condensed_question = get_llm().invoke(condense_question_prompt).content
     else:
         condensed_question = question
 
-    current_app.logger.debug('Condensed question: %s', condensed_question)
-    current_app.logger.debug('Question: %s', question)
+    current_app.logger.debug("Condensed question: %s", condensed_question)
+    current_app.logger.debug("Question: %s", question)
 
     docs = store.as_retriever().invoke(condensed_question)
     for doc in docs:
-        doc_source = {**doc.metadata, 'page_content': doc.page_content}
-        current_app.logger.debug('Retrieved document passage from: %s', doc.metadata['name'])
-        yield f'data: {SOURCE_TAG} {json.dumps(doc_source)}\n\n'
+        doc_source = {**doc.metadata, "page_content": doc.page_content}
+        current_app.logger.debug(
+            "Retrieved document passage from: %s", doc.metadata["name"]
+        )
+        yield f"data: {SOURCE_TAG} {json.dumps(doc_source)}\n\n"
 
-    qa_prompt = render_template('rag_prompt.txt', question=question, docs=docs,
-                                chat_history=chat_history.messages)
+    qa_prompt = render_template(
+        "rag_prompt.txt",
+        question=question,
+        docs=docs,
+        chat_history=chat_history.messages,
+    )
 
-    answer = ''
+    answer = ""
     for chunk in get_llm().stream(qa_prompt):
-        yield f'data: {chunk.content}\n\n'
+        yield f"data: {chunk.content}\n\n"
         answer += chunk.content
 
     yield f"data: {DONE_TAG}\n\n"
-    current_app.logger.debug('Answer: %s', answer)
+    current_app.logger.debug("Answer: %s", answer)
 
     chat_history.add_user_message(question)
     chat_history.add_ai_message(answer)
```
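The `yield f"data: ...\n\n"` lines above follow the server-sent-events wire format: each event is a `data:` payload terminated by a blank line. A minimal sketch of that framing, with a placeholder tag instead of the app's actual constants:

```python
# DONE_TAG here is a placeholder, not the app's real constant.
DONE_TAG = "[DONE]"

def stream_answer(chunks):
    # each SSE event is a "data: ..." payload terminated by a blank line
    for chunk in chunks:
        yield f"data: {chunk}\n\n"
    # a final sentinel event tells the client the stream is complete
    yield f"data: {DONE_TAG}\n\n"

events = list(stream_answer(["Hello, ", "world"]))
# events[-1] == "data: [DONE]\n\n"
```

The sentinel event is what lets the browser-side reader know when to stop listening rather than waiting for a closed connection.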

example-apps/chatbot-rag-app/api/llm_integrations.py (+35 -13)

```diff
@@ -5,37 +5,54 @@
 
 LLM_TYPE = os.getenv("LLM_TYPE", "openai")
 
+
 def init_openai_chat(temperature):
     OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
-    return ChatOpenAI(openai_api_key=OPENAI_API_KEY, streaming=True, temperature=temperature)
+    return ChatOpenAI(
+        openai_api_key=OPENAI_API_KEY, streaming=True, temperature=temperature
+    )
+
+
 def init_vertex_chat(temperature):
     VERTEX_PROJECT_ID = os.getenv("VERTEX_PROJECT_ID")
     VERTEX_REGION = os.getenv("VERTEX_REGION", "us-central1")
     vertexai.init(project=VERTEX_PROJECT_ID, location=VERTEX_REGION)
     return ChatVertexAI(streaming=True, temperature=temperature)
+
+
 def init_azure_chat(temperature):
-    OPENAI_VERSION=os.getenv("OPENAI_VERSION", "2023-05-15")
-    BASE_URL=os.getenv("OPENAI_BASE_URL")
-    OPENAI_API_KEY=os.getenv("OPENAI_API_KEY")
-    OPENAI_ENGINE=os.getenv("OPENAI_ENGINE")
+    OPENAI_VERSION = os.getenv("OPENAI_VERSION", "2023-05-15")
+    BASE_URL = os.getenv("OPENAI_BASE_URL")
+    OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
+    OPENAI_ENGINE = os.getenv("OPENAI_ENGINE")
     return AzureChatOpenAI(
         deployment_name=OPENAI_ENGINE,
         openai_api_base=BASE_URL,
         openai_api_version=OPENAI_VERSION,
         openai_api_key=OPENAI_API_KEY,
         streaming=True,
-        temperature=temperature)
+        temperature=temperature,
+    )
+
+
 def init_bedrock(temperature):
-    AWS_ACCESS_KEY=os.getenv("AWS_ACCESS_KEY")
-    AWS_SECRET_KEY=os.getenv("AWS_SECRET_KEY")
-    AWS_REGION=os.getenv("AWS_REGION")
-    AWS_MODEL_ID=os.getenv("AWS_MODEL_ID", "anthropic.claude-v2")
-    BEDROCK_CLIENT=boto3.client(service_name="bedrock-runtime", region_name=AWS_REGION, aws_access_key_id=AWS_ACCESS_KEY, aws_secret_access_key=AWS_SECRET_KEY)
+    AWS_ACCESS_KEY = os.getenv("AWS_ACCESS_KEY")
+    AWS_SECRET_KEY = os.getenv("AWS_SECRET_KEY")
+    AWS_REGION = os.getenv("AWS_REGION")
+    AWS_MODEL_ID = os.getenv("AWS_MODEL_ID", "anthropic.claude-v2")
+    BEDROCK_CLIENT = boto3.client(
+        service_name="bedrock-runtime",
+        region_name=AWS_REGION,
+        aws_access_key_id=AWS_ACCESS_KEY,
+        aws_secret_access_key=AWS_SECRET_KEY,
+    )
     return BedrockChat(
         client=BEDROCK_CLIENT,
         model_id=AWS_MODEL_ID,
         streaming=True,
-        model_kwargs={"temperature":temperature})
+        model_kwargs={"temperature": temperature},
+    )
+
 
 MAP_LLM_TYPE_TO_CHAT_MODEL = {
     "azure": init_azure_chat,
@@ -44,8 +61,13 @@ def init_bedrock(temperature):
     "vertex": init_vertex_chat,
 }
 
+
 def get_llm(temperature=0):
     if not LLM_TYPE in MAP_LLM_TYPE_TO_CHAT_MODEL:
-        raise Exception("LLM type not found. Please set LLM_TYPE to one of: " + ", ".join(MAP_LLM_TYPE_TO_CHAT_MODEL.keys()) + ".")
+        raise Exception(
+            "LLM type not found. Please set LLM_TYPE to one of: "
+            + ", ".join(MAP_LLM_TYPE_TO_CHAT_MODEL.keys())
+            + "."
+        )
 
     return MAP_LLM_TYPE_TO_CHAT_MODEL[LLM_TYPE](temperature=temperature)
```
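`MAP_LLM_TYPE_TO_CHAT_MODEL` is a factory-dispatch table: a dict from backend name to initializer function, consulted at call time. A minimal sketch of the same pattern, with toy initializers standing in for the real chat model constructors:

```python
# Toy initializers in place of init_openai_chat, init_bedrock, etc.
def init_a(temperature):
    return ("model-a", temperature)

def init_b(temperature):
    return ("model-b", temperature)

FACTORIES = {"a": init_a, "b": init_b}

def get_model(llm_type, temperature=0):
    # unknown keys fail fast with a message listing the valid choices
    if llm_type not in FACTORIES:
        raise ValueError(
            "LLM type not found. Please set LLM_TYPE to one of: "
            + ", ".join(FACTORIES.keys())
            + "."
        )
    return FACTORIES[llm_type](temperature=temperature)

assert get_model("a") == ("model-a", 0)
```

Adding a new backend then only requires writing one initializer and registering it in the dict; no branch in `get_llm` changes.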

example-apps/chatbot-rag-app/data/index_data.py (+9 -7)

```diff
@@ -61,14 +61,16 @@ def main():
 
     print(f"Loading data from ${FILE}")
 
-    metadata_keys = ['name', 'summary', 'url', 'category', 'updated_at']
+    metadata_keys = ["name", "summary", "url", "category", "updated_at"]
     workplace_docs = []
-    with open(FILE, 'rt') as f:
+    with open(FILE, "rt") as f:
         for doc in json.loads(f.read()):
-            workplace_docs.append(Document(
-                page_content=doc['content'],
-                metadata={k: doc.get(k) for k in metadata_keys}
-            ))
+            workplace_docs.append(
+                Document(
+                    page_content=doc["content"],
+                    metadata={k: doc.get(k) for k in metadata_keys},
+                )
+            )
 
     print(f"Loaded {len(workplace_docs)} documents")
 
@@ -92,7 +94,7 @@ def main():
         index_name=INDEX,
         strategy=ElasticsearchStore.SparseVectorRetrievalStrategy(model_id=ELSER_MODEL),
         bulk_kwargs={
-            'request_timeout': 60,
+            "request_timeout": 60,
         },
     )
```