README.md
+29 -9
@@ -43,29 +43,49 @@ Texplot is a little program that turns a document into a network of terms that a

 ## Generating graphs

-The easiest way to build out a graph is to use the `frequent` function, which wraps up all the intermediate steps of tokenizing the text, computing the term distance matrix, generating the per-word topic lists, etc. (Or, use the `clumpy` function, which tries to pick words that concentrate really tightly in specific parts of the text). First, spin up a virtualenv:
+The easiest way to build out a graph is to use the `textplot` executable, which wraps up all the intermediate steps of tokenizing the text, estimating probability densities for the words, and indexing the distance matrix.
+
+First, install Textplot via PyPI:
+
+`pip3 install textplot`
+
+Or, clone the repo and install the package manually:

 ```bash
-virtualenv env
+pyvenv env
 . env/bin/activate
-pip install -r requirements.txt
+pip3 install -r requirements.txt
+python3 setup.py install
 ```

+Then, generate graphs with:
+
+`textplot generate [] []`
+
+
+
+
+
 Then, fire up an IPython terminal and build a network:

 ```bash
-In [1]: from textplot import frequent
+In [1]: from textplot.helpers import build_graph
+
+In [2]: g = build_graph('../texts/war-and-peace.txt')
- **(int) `term_depth=500`** - The number of terms to include in the network. Right now, the code just takes the top X most frequent terms, after stopwords are removed.
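
The `term_depth` selection described above can be sketched roughly as follows. This is a minimal illustration in plain Python, not Textplot's actual implementation: the function name `most_frequent_terms`, the regex tokenizer, and the tiny `STOPWORDS` set are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list for illustration only;
# it is not the list Textplot ships with.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it"}


def most_frequent_terms(text, term_depth=500, stopwords=STOPWORDS):
    """Return up to `term_depth` of the most frequent non-stopword terms."""
    # Assumed tokenizer: lowercase the text and keep alphabetic runs.
    tokens = re.findall(r"[a-z]+", text.lower())
    # Count only the tokens that survive stopword removal.
    counts = Counter(t for t in tokens if t not in stopwords)
    # most_common() sorts by descending count; ties keep first-seen order.
    return [term for term, _ in counts.most_common(term_depth)]


sample = "The war and the peace of the war in the war"
top_terms = most_frequent_terms(sample, term_depth=2)
print(top_terms)
```

A larger `term_depth` keeps rarer terms in the candidate set, at the cost of a bigger network to compute and draw.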