Commit
Showing 5 changed files with 114 additions and 4 deletions.
@@ -65,10 +65,64 @@ audio_file = "audio.wav"

```python
prediction = asr.predict(audio_file)
```

# Pipeline

Pipelines are an easy way to use models for inference. They are objects that abstract away most of the library's complex code and offer a simple API for several tasks, such as Masked Language Modeling and Sentiment Analysis.

**bert-base-wolof** is a BERT-base model pretrained on the Wolof language.
**sora-wolof** is a RoBERTa model pretrained on the Wolof language.

## Models in Wolof library

| Model name | Number of layers | Attention Heads | Embedding Dimension | Total Parameters |
| :------: | :---: | :---: | :---: | :---: |
| `bert-base-wolof` | 6 | 12 | 514 | 56.9 M |
| `soraberta-base` | 6 | 12 | 514 | 83 M |

## Using Soraberta or BERT-base-wolof

```python
>>> from wolof import Pipeline
>>> unmasker = Pipeline(task='fill-mask', model_name='abdouaziiz/bert-base-wolof')
>>> unmasker("kuy yoot du [MASK].")

[{'sequence': '[CLS] kuy yoot du seqet. [SEP]',
  'score': 0.09505125880241394,
  'token': 13578},
 {'sequence': '[CLS] kuy yoot du daw. [SEP]',
  'score': 0.08882280439138412,
  'token': 679},
 {'sequence': '[CLS] kuy yoot du yoot. [SEP]',
  'score': 0.057790059596300125,
  'token': 5117},
 {'sequence': '[CLS] kuy yoot du seqat. [SEP]',
  'score': 0.05671025067567825,
  'token': 4992},
 {'sequence': '[CLS] kuy yoot du yaqu. [SEP]',
  'score': 0.0469999685883522,
  'token': 1735}]
```
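
The call returns a list of candidate completions, each a dict with `sequence`, `score`, and `token` keys. As a minimal sketch, the highest-scoring sequence can be picked out and stripped of its `[CLS]` and `[SEP]` markers. The helper name `best_completion` is hypothetical and not part of the `wolof` library, and the sample data is copied from the output above, so no model is downloaded:

```python
# Two of the predictions shown above, copied verbatim from the README output.
predictions = [
    {"sequence": "[CLS] kuy yoot du seqet. [SEP]", "score": 0.09505125880241394, "token": 13578},
    {"sequence": "[CLS] kuy yoot du daw. [SEP]", "score": 0.08882280439138412, "token": 679},
]


def best_completion(preds):
    """Return the highest-scoring sequence without the [CLS]/[SEP] markers.

    Hypothetical helper for illustration; not part of the wolof library.
    """
    # Pick the candidate with the largest score.
    top = max(preds, key=lambda p: p["score"])
    # Drop the special tokens and the surrounding whitespace they leave behind.
    text = top["sequence"].replace("[CLS]", "").replace("[SEP]", "")
    return text.strip()


print(best_completion(predictions))
```

The same function should work on the full five-item list, since it relies only on the `score` and `sequence` keys.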

The ***`task`*** argument accepts the following values: `'fill-mask'` and `'sentiment-analysis'`.
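
Since only two task values are listed, a caller may want to reject anything else before a model is loaded. The following is a sketch under that assumption; `SUPPORTED_TASKS` and `check_task` are hypothetical names, and the `wolof` library itself may not perform this check:

```python
# Task names taken from the README sentence above.
SUPPORTED_TASKS = ("fill-mask", "sentiment-analysis")


def check_task(task):
    """Return `task` unchanged if it is listed, otherwise raise ValueError.

    Hypothetical helper for illustration; not part of the wolof library.
    """
    if task not in SUPPORTED_TASKS:
        raise ValueError(
            f"Unsupported task {task!r}; expected one of {SUPPORTED_TASKS}"
        )
    return task


print(check_task("fill-mask"))
```

Failing early this way keeps a typo in the task name from surfacing later as a less readable error inside the model-loading code.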

You can check out the examples in `examples/`.

<hr>

## Author
- Abdou Aziz DIOP @abdouaziz
- email : [email protected]

@@ -5,7 +5,7 @@

```diff
 setuptools.setup(
     name="wolof",
-    version="0.0.1",
+    version="0.0.3",
     author="Abdou Aziz DIOP",
     author_email="[email protected]",
     description="wolof is a python library for the Wolof language",
```

@@ -0,0 +1,3 @@

```python
from .asr import *

from .model import *
```

@@ -0,0 +1,25 @@

```python
from transformers import pipeline


class Pipeline(object):
    """
    The pipelines are a great and easy way to use models for inference. These pipelines are objects that abstract most of the complex code from the library,
    offering a simple API dedicated to several tasks, including Named Entity Recognition, Masked Language Modeling,
    Sentiment Analysis, Feature Extraction and Question Answering.
    """

    def __init__(self, task, model_name="abdouaziiz/bert-base-wolof"):
        """
        Initialize the model
        Args:
            task (str): The task to run, e.g. 'fill-mask' or 'sentiment-analysis'
            model_name (str): The name of the model to load
        """
        self.task = task
        self.model_name = model_name
        # Delegate model loading to the transformers pipeline factory
        self.pipe = pipeline(self.task, model=self.model_name)

    def __call__(self, text):
        # Run the wrapped pipeline on the input text
        return self.pipe(text)
```
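
Because the constructor calls `transformers.pipeline` directly, creating a `Pipeline` loads a model, which makes the wrapper awkward to exercise offline. One possible variation, sketched here under the assumption that an optional `factory` argument would be acceptable (it is not in the commit), lets a stand-in be injected. `InjectablePipeline` and `fake_factory` are hypothetical names:

```python
class InjectablePipeline:
    """Variant of the wrapper that accepts an injectable pipeline factory.

    Hypothetical sketch; the `factory` parameter is not part of the commit.
    """

    def __init__(self, task, model_name="abdouaziiz/bert-base-wolof", factory=None):
        if factory is None:
            # Import lazily so the class can be defined without transformers installed.
            from transformers import pipeline as factory
        self.task = task
        self.model_name = model_name
        # Same delegation as the committed class: factory(task, model=...).
        self.pipe = factory(task, model=model_name)

    def __call__(self, text):
        return self.pipe(text)


def fake_factory(task, model):
    """Stand-in for transformers.pipeline that echoes its inputs."""
    def run(text):
        return [{"task": task, "model": model, "text": text}]
    return run


unmasker = InjectablePipeline("fill-mask", factory=fake_factory)
print(unmasker("kuy yoot du [MASK]."))
```

When `factory` is left as `None`, the class falls back to the real `transformers.pipeline`, so the behaviour in normal use is intended to match the committed wrapper.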