Commit 573567e

Merge pull request #327 from raccoongang/golub-sergey/OeX_ElasticSearch/feature/transition-from-ES1.5-to-ES7
[BD-19] Transition to the new Elasticsearch libs version for cs_comments_service
2 parents 609eef0 + af1dbf1 commit 573567e

21 files changed, +361 -374 lines changed

.travis/docker-compose-travis.yml

Lines changed: 12 additions & 1 deletion
@@ -3,8 +3,15 @@ version: "2"
 
 services:
   elasticsearch:
-    image: elasticsearch:1.5.2
+    image: elasticsearch:7.8.0
     container_name: "es.edx"
+    environment:
+      - discovery.type=single-node
+      - bootstrap.memory_lock=true
+      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
+    volumes:
+      - data01:/usr/share/elasticsearch/data
+      - ./elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml
   mongo:
     image: mongo:3.2.21
     container_name: "mongo.edx"
@@ -15,6 +22,7 @@ services:
       - ..:/edx/app/forum/cs_comments_service
     environment:
       MONGOID_AUTH_MECH: ""
+      SEARCH_SERVER_ES7: "http://elasticsearch:9200"
   forum:
     extends: forum-base
     command: tail -f /dev/null
@@ -27,3 +35,6 @@ services:
     depends_on:
       - "elasticsearch"
       - "mongo"
+
+volumes:
+  data01:

.travis/elasticsearch.yml

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+network.host: 0.0.0.0
+cluster.routing.allocation.disk.watermark.low: 150mb
+cluster.routing.allocation.disk.watermark.high: 100mb
+cluster.routing.allocation.disk.watermark.flood_stage: 50mb

Gemfile

Lines changed: 2 additions & 2 deletions
@@ -38,8 +38,8 @@ gem 'will_paginate_mongoid', "~>2.0"
 gem 'rdiscount'
 gem 'nokogiri', "~>1.8.1"
 
-gem 'elasticsearch', '~> 1.1.2'
-gem 'elasticsearch-model', '~> 0.1.9'
+gem 'elasticsearch', '~> 7.8.0'
+gem 'elasticsearch-model', '~> 7.1.0'
 
 gem 'dalli'

Gemfile.lock

Lines changed: 14 additions & 14 deletions
@@ -50,25 +50,25 @@ GEM
     docile (1.3.2)
     domain_name (0.5.20170404)
       unf (>= 0.0.5, < 1.0.0)
-    elasticsearch (1.1.2)
-      elasticsearch-api (= 1.1.2)
-      elasticsearch-transport (= 1.1.2)
-    elasticsearch-api (1.1.2)
+    elasticsearch (7.8.0)
+      elasticsearch-api (= 7.8.0)
+      elasticsearch-transport (= 7.8.0)
+    elasticsearch-api (7.8.0)
       multi_json
-    elasticsearch-model (0.1.9)
+    elasticsearch-model (7.1.0)
       activesupport (> 3)
-      elasticsearch (> 0.4)
+      elasticsearch (> 1)
       hashie
-    elasticsearch-transport (1.1.2)
-      faraday
+    elasticsearch-transport (7.8.0)
+      faraday (~> 1)
       multi_json
     enumerize (2.1.2)
       activesupport (>= 3.2)
     factory_girl (4.8.0)
       activesupport (>= 3.0.0)
     faker (1.7.3)
       i18n (~> 0.5)
-    faraday (0.12.1)
+    faraday (1.0.1)
       multipart-post (>= 1.2, < 3)
     ffi (1.9.18)
     formatador (0.2.5)
@@ -84,7 +84,7 @@ GEM
     guard-unicorn (0.2.0)
       guard (>= 1.1)
     hashdiff (0.3.4)
-    hashie (3.5.5)
+    hashie (4.1.0)
     http-cookie (1.0.3)
       domain_name (~> 0.5)
     i18n (0.9.5)
@@ -117,8 +117,8 @@ GEM
     mongoid_magic_counter_cache (1.1.1)
       mongoid
       rake
-    multi_json (1.12.1)
-    multipart-post (2.0.0)
+    multi_json (1.15.0)
+    multipart-post (2.1.1)
     nenv (0.3.0)
     netrc (0.11.0)
     newrelic_rpm (5.6.0.349)
@@ -221,8 +221,8 @@ DEPENDENCIES
   dalli
   delayed_job
   delayed_job_mongoid
-  elasticsearch (~> 1.1.2)
-  elasticsearch-model (~> 0.1.9)
+  elasticsearch (~> 7.8.0)
+  elasticsearch-model (~> 7.1.0)
   enumerize
   factory_girl (~> 4.0)
   faker (~> 1.6)

README.rst

Lines changed: 12 additions & 19 deletions
@@ -35,45 +35,38 @@ Install the requisite gems:
 
     $ bundle install
 
-To initialize the index:
+To initialize indices:
 
-Setup the search index. Note that the command below creates an alias with a unique name (e.g.
-content_20161220185820323), and assigns it a known alias: content. If you choose not to use the command below, you
-should still opt to reference your index by an alias rather than the actual index name. This will enable you to swap out
-indices (e.g. rebuild_index) without having to take downtime or modify code with a new index name.
+Setup search indices. Note that the command below creates `comments_20161220185820323` and
+`comment_threads_20161220185820323` indices and assigns `comments` and `comment_threads` aliases. This will enable you
+to swap out indices (e.g. rebuild_index) without having to take downtime or modify code with a new index name.
 
 .. code-block:: bash
 
     $ bin/rake search:initialize
 
-To validate the 'content' alias exists and contains the proper mappings:
+To validate indices exist and contain the proper mappings:
 
 .. code-block:: bash
 
-    $ bin/rake search:validate_index
+    $ bin/rake search:validate_indices
 
-To rebuild the index:
+To rebuild indices:
 
-To rebuild a new index from the database and then point the alias 'content' to it, you can use the
-rebuild_index task. This task will also run catchup before and after the alias is moved, to minimize time where the
-alias does not contain all documents.
+To rebuild new indices from the database and then point the aliases `comments` and `comment_threads` to each index
+which has equivalent index prefix, you can use the rebuild_indices task. This task will also run catch up before
+and after aliases are moved, to minimize time where aliases do not contain all documents.
 
 .. code-block:: bash
 
-    $ bin/rake search:rebuild_index
-
-To rebuild a new index without moving the alias and without running catchup, use the following:
-
-.. code-block:: bash
-
-    $ bin/rake search:rebuild_index[false]
+    $ bin/rake search:rebuild_indices
 
 You can also adjust the batch size (e.g. 200) and the sleep time (e.g. 2 seconds) between batches to lighten the load
 on MongoDB.
 
 .. code-block:: bash
 
-    $ bin/rake search:rebuild_index[true,200,2]
+    $ bin/rake search:rebuild_indices[200,2]
 
 Run the server:
 
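
Review note: the README change above describes timestamped indices (`comments_<timestamp>`, `comment_threads_<timestamp>`) sitting behind the stable aliases `comments` and `comment_threads`. The following is a minimal sketch of that naming and alias-swap idea, not the rake task's actual implementation. The helper names `timestamped_index_names` and `alias_swap_actions` are hypothetical, and the timestamp format is only inferred from the example suffix `20161220185820323`.

```ruby
# Alias names taken from the README text; everything else here is illustrative.
ALIASES = %w[comments comment_threads].freeze

# Hypothetical helper: derive one timestamped index name per alias,
# e.g. "comments_20161220185820323" (year..second plus milliseconds).
def timestamped_index_names(now = Time.now.utc)
  suffix = now.strftime('%Y%m%d%H%M%S%L')
  ALIASES.map { |name| [name, "#{name}_#{suffix}"] }.to_h
end

# Hypothetical helper: build the request body for a single alias update.
# Removing the old index and adding the new one in the same request means
# readers of the alias never observe a moment where it points nowhere.
def alias_swap_actions(new_indices, old_indices = {})
  actions = []
  new_indices.each do |alias_name, new_index|
    old_index = old_indices[alias_name]
    actions << { remove: { index: old_index, alias: alias_name } } if old_index
    actions << { add: { index: new_index, alias: alias_name } }
  end
  { actions: actions }
end
```

With a configured client, a body like this could be passed to `client.indices.update_aliases(body: ...)`. Whether the rake tasks do exactly this is not shown in the diff.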

api/search.rb

Lines changed: 68 additions & 56 deletions
@@ -1,82 +1,94 @@
 def get_thread_ids(context, group_ids, local_params, search_text)
-  filters = []
-  filters.push({term: {commentable_id: local_params['commentable_id']}}) if local_params['commentable_id']
-  filters.push({terms: {commentable_id: local_params['commentable_ids'].split(',')}}) if local_params['commentable_ids']
-  filters.push({term: {course_id: local_params['course_id']}}) if local_params['course_id']
+  must = []
+  filter = []
+  must.push({term: {commentable_id: local_params['commentable_id']}}) if local_params['commentable_id']
+  must.push({terms: {commentable_id: local_params['commentable_ids'].split(',')}}) if local_params['commentable_ids']
+  must.push({term: {course_id: local_params['course_id']}}) if local_params['course_id']
+  must.push(
+    {
+      multi_match: {
+        query: search_text,
+        fields: [:title, :body],
+        operator: :AND
+      }
+    }
+  )
+  group_id = local_params['group_id']
+
+  if group_id
+    filter.push(
+      {:bool => {:must_not => {:exists => {:field => :group_id}}}},
+      {:term => {:group_id => group_id}}
+    )
+  end
 
-  filters.push({or: [
-    {not: {exists: {field: :context}}},
-    {term: {context: context}}
-  ]})
+  filter.push(
+    {:bool => {:must_not => {:exists => {:field => :context}}}},
+    {:term => {:context => context}}
+  )
 
   unless group_ids.empty?
-    filters.push(
-      {
-        bool: {
-          should: [
-            {:not => {:exists => {:field => :group_id}}},
-            {:terms => {:group_id => group_ids}}
-          ]
-        }
-      }
+    filter.push(
+      {:bool => {:must_not => {:exists => {:field => :group_id}}}},
+      {:terms => {:group_id => group_ids}}
     )
   end
 
   body = {
-      size: CommentService.config['max_deep_search_comment_count'].to_i,
-      sort: [
-          {updated_at: :desc}
-      ],
-      query: {
-          filtered: {
-              query: {
-                  multi_match: {
-                      query: search_text,
-                      fields: [:title, :body],
-                      operator: :AND
-                  }
-              },
-              filter: {
-                  bool: {
-                      must: filters
-                  }
-              }
-          }
+    size: CommentService.config['max_deep_search_comment_count'].to_i,
+    sort: [
+      {updated_at: :desc}
+    ],
+    query: {
+      bool: {
+        must: must,
+        should: filter
       }
+    }
   }
 
-  response = Elasticsearch::Model.client.search(index: Content::ES_INDEX_NAME, body: body)
+  response = Elasticsearch::Model.client.search(index: TaskHelpers::ElasticsearchHelper::INDEX_NAMES, body: body)
 
   thread_ids = Set.new
   response['hits']['hits'].each do |hit|
-    case hit['_type']
-      when CommentThread.document_type
-        thread_ids.add(hit['_id'])
-      when Comment.document_type
-        thread_ids.add(hit['_source']['comment_thread_id'])
-      else
-        # There shouldn't be any other document types. Nevertheless, ignore them, if they are present.
-        next
+    if hit['_index'].include? CommentThread.index_name
+      thread_ids.add(hit['_id'])
+    elsif hit['_index'].include? Comment.index_name
+      thread_ids.add(hit['_source']['comment_thread_id'])
+    else
+      # There shouldn't be any other indices. Nevertheless, ignore them, if they are present.
+      next
     end
   end
   thread_ids
 end
 
 def get_suggested_text(search_text)
   body = {
-      suggestions: {
-          text: search_text,
-          phrase: {
-              field: :_all
-          }
+    suggest: {
+      body_suggestions: {
+        text: search_text,
+        phrase: {
+          field: :body
+        }
+      },
+      title_suggestions: {
+        text: search_text,
+        phrase: {
+          field: :title
+        }
       }
+    }
   }
-  response = Elasticsearch::Model.client.suggest(index: Content::ES_INDEX_NAME, body: body)
-  suggestions = response.fetch('suggestions', [])
-  if suggestions.length > 0
-    options = suggestions[0]['options']
-    if options.length > 0
-      return options[0]['text']
+
+  response = Elasticsearch::Model.client.search(index: TaskHelpers::ElasticsearchHelper::INDEX_NAMES, body: body)
+  body_suggestions = response['suggest'].fetch('body_suggestions', [])
+  title_suggestions = response['suggest'].fetch('title_suggestions', [])
+
+  [body_suggestions, title_suggestions].each do |suggestion|
+    if suggestion.length > 0
+      options = suggestion[0]['options']
+      return options[0]['text'] if options.length > 0
     end
   end
 
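
Review note: the old `filtered` query applied the context and group conditions as hard filters (`bool.must` over an `or` of "field missing" and "field matches"). In the new query they are passed as `should` clauses next to `must`. In Elasticsearch, when a `bool` query also has `must` or `filter` clauses, `minimum_should_match` defaults to 0, so those `should` clauses only influence scoring and do not exclude documents. If the intent is to keep the old restricting behaviour, one possible shape is sketched below. The helper names (`missing_or_match`, `build_thread_search_body`, `thread_ids_from_hits`) are hypothetical and this is not the code in the commit. The default index prefixes and the default size of 5000 (mirroring `max_deep_search_comment_count`) are assumptions.

```ruby
require 'set'

# Hypothetical helper: "the field is absent OR the clause matches",
# expressed as a nested bool so it can be used inside a filter context.
def missing_or_match(field, clause)
  {
    bool: {
      should: [
        { bool: { must_not: { exists: { field: field } } } },
        clause
      ],
      minimum_should_match: 1
    }
  }
end

# Hypothetical helper: build a search body whose context/group conditions
# restrict the result set instead of only affecting the score.
def build_thread_search_body(search_text, context:, course_id: nil, group_ids: [], size: 5000)
  filters = []
  filters << { term: { course_id: course_id } } if course_id
  filters << missing_or_match(:context, { term: { context: context } })
  filters << missing_or_match(:group_id, { terms: { group_id: group_ids } }) unless group_ids.empty?

  {
    size: size,
    sort: [{ updated_at: :desc }],
    query: {
      bool: {
        must: [{ multi_match: { query: search_text, fields: [:title, :body], operator: :AND } }],
        filter: filters
      }
    }
  }
end

# Map hits to thread ids by index name, in the spirit of the hit['_index']
# check in the diff; the prefix defaults are assumptions.
def thread_ids_from_hits(hits, thread_index: 'comment_threads', comment_index: 'comments')
  hits.each_with_object(Set.new) do |hit, ids|
    if hit['_index'].start_with?(thread_index)
      ids.add(hit['_id'])
    elsif hit['_index'].start_with?(comment_index)
      ids.add(hit['_source']['comment_thread_id'])
    end
  end
end
```

Because the results are sorted by `updated_at`, moving these conditions from `should` to `filter` would not change ordering; it would only change which documents are returned.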

config/application.yml

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 level_limit: 3
 api_key: <%= ENV['API_KEY'] || 'PUT_YOUR_API_KEY_HERE' %>
-elasticsearch_server: <%= ENV['SEARCH_SERVER'] || 'http://localhost:9200' %>
+elasticsearch_server: <%= ENV['SEARCH_SERVER_ES7'] || 'http://localhost:9200' %>
 max_deep_search_comment_count: 5000
 enable_search: true
 default_locale: <%= ENV['SERVICE_LANGUAGE'] || 'en-US' %>
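
Review note: the `config/application.yml` change swaps the environment variable that supplies the Elasticsearch URL from `SEARCH_SERVER` to `SEARCH_SERVER_ES7`, keeping `http://localhost:9200` as the fallback. The sketch below shows how such a line resolves, assuming the file is rendered through ERB before being parsed as YAML (the `<%= %>` tags suggest this, but the loading code is not part of this diff). The helper name `elasticsearch_server` is hypothetical.

```ruby
require 'erb'
require 'yaml'

# Template line copied from the diff above.
CONFIG_TEMPLATE = "elasticsearch_server: <%= ENV['SEARCH_SERVER_ES7'] || 'http://localhost:9200' %>\n"

# Hypothetical helper: render the ERB tags, then parse the result as YAML.
def elasticsearch_server(template = CONFIG_TEMPLATE)
  YAML.safe_load(ERB.new(template).result)['elasticsearch_server']
end
```

A deployment that still exports only `SEARCH_SERVER` would therefore fall back to `http://localhost:9200` after this change.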
