* [ConveRT finetuned on Ubuntu](#convert-finetuned-on-ubuntu)
* [Keras layers](#keras-layers)
* [Encoder client](#encoder-client)
* [Citations](#citations)
* [Development](#development)
## ConveRT
This is the ConveRT dual-encoder model, using subword representations and lighter-weight, more efficient transformer-style blocks to encode text, as described in [the ConveRT paper](https://arxiv.org/abs/1911.03688).
It provides powerful representations for conversational data, and can also be used as a response ranker.
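Response ranking with a dual encoder reduces to scoring each candidate response by the similarity between its encoding and the context encoding. A minimal sketch of that ranking step in plain Python, assuming you already have the encodings (the function name here is hypothetical; the actual encodings come from the TFHub module):

```python
def rank_responses(context_vec, response_vecs):
    """Rank candidate responses by dot product with the context encoding.

    Assumes the vectors are comparable (e.g. L2-normalized), as is usual
    for dual-encoder models. Returns candidate indices, best first.
    """
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    scores = [dot(context_vec, r) for r in response_vecs]
    return sorted(range(len(response_vecs)), key=lambda i: -scores[i])
```

For example, `rank_responses([1.0, 0.0], [[0.0, 1.0], [0.9, 0.1]])` returns `[1, 0]`, ranking the second candidate first.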
The model costs under $100 to train from scratch, can be quantized to under 60MB, and is competitive with larger Transformer networks on conversational tasks.
We share an unquantized version of the model, facilitating fine-tuning. Please [get in touch](https://www.polyai.com/contact/) if you are interested in using the quantized ConveRT model. The Tensorflow Hub url is:
## Multi-Context ConveRT
This is the multi-context ConveRT model from [the ConveRT paper](https://arxiv.org/abs/1911.03688), which uses extra contexts from the conversational history to refine the context representations. This is an unquantized version of the model. The Tensorflow Hub url is:
## ConveRT finetuned on Ubuntu
This is the multi-context ConveRT model, fine-tuned to the DSTC7 Ubuntu response ranking task. It has the exact same signatures as the extra context model, and has TFHub uri `http://models.poly-ai.com/ubuntu_convert/v1/model.tar.gz`. Note that this model requires prefixing the extra context features with `"0: "`, `"1: "`, `"2: "` etc.
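The `"0: "`, `"1: "` prefixing can be sketched in plain Python. `prefix_extra_contexts` is a hypothetical helper, and whether index 0 should be the most recent or the oldest extra context is an assumption to verify against [`encoder_client.py`](encoder_client.py):

```python
def prefix_extra_contexts(extra_contexts):
    # Prefix each extra context with its index: "0: ", "1: ", "2: ", ...
    # (assumed ordering: element 0 is the most recent extra context).
    return ["{}: {}".format(i, text) for i, text in enumerate(extra_contexts)]
```

For example, `["how do I install it?", "hi there"]` becomes `["0: how do I install it?", "1: hi there"]`.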
The [`dstc7/evaluate_encoder.py`](dstc7/evaluate_encoder.py) script demonstrates using this encoder to reproduce the results from [the ConveRT paper](https://arxiv.org/abs/1911.03688).
Internally it implements caching, deduplication, and batching, to help speed up encoding. Note that because it does batching internally, you can pass very large lists of sentences to encode without going out of memory.
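The deduplicate-then-batch idea can be illustrated with a short sketch. This is not the actual `encoder_client.py` implementation; `encode_batch` stands in for whatever function calls the model:

```python
def encode_unique_in_batches(sentences, encode_batch, batch_size=64):
    # Encode each distinct sentence exactly once, in fixed-size batches,
    # then scatter the results back into the original order.
    unique = list(dict.fromkeys(sentences))  # order-preserving dedup
    encodings = {}
    for start in range(0, len(unique), batch_size):
        batch = unique[start:start + batch_size]
        for sentence, vector in zip(batch, encode_batch(batch)):
            encodings[sentence] = vector
    return [encodings[s] for s in sentences]
```

Bounding the batch size keeps peak memory flat regardless of input length, which is why very large lists can be encoded safely.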
# Citations
* [ConveRT: Efficient and Accurate Conversational Representations from Transformers](https://arxiv.org/abs/1911.03688)
```bibtex
@article{Henderson2019convert,
title={{ConveRT}: Efficient and Accurate Conversational Representations from Transformers},
author={Matthew Henderson and I{\~{n}}igo Casanueva and Nikola Mrk\v{s}i\'{c} and Pei-Hao Su and Tsung-Hsien Wen and Ivan Vuli\'{c}},
journal={arXiv preprint arXiv:1911.03688},
year={2019}
}
```