On the use of cooccurgen.py #21

Dear William,
I’d like to use your code to induce a sentiment lexicon from a new corpus. In your answer to [issue #8](https://github.com/williamleif/socialsent/issues/8), you wrote that the first step is to “Use representations/cooccurgen.py to process a corpus and construct co-occurrence matrices.”
Looking at cooccurgen.py, it seems to take as input a corpus in the COHA word_lemma_pos format, and it also needs a file called index.pkl.
- Do I have to transform my corpus into a tabular format like the COHA format?
- How is the index.pkl file created?
- Is there any way to use the script starting from a raw corpus?
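
To make the last two questions concrete, here is a minimal sketch of what I imagine the preprocessing from a raw corpus could look like. It is only a guess: I am assuming that index.pkl is a pickled dictionary mapping each word to an integer id, and `build_index` and `count_cooccurrences` are hypothetical helpers of mine, not functions from the repository.

```python
import pickle
from collections import Counter


def build_index(sentences):
    """Map each distinct lowercased token to an integer id (order of first appearance)."""
    index = {}
    for sentence in sentences:
        for token in sentence.lower().split():
            index.setdefault(token, len(index))
    return index


def count_cooccurrences(sentences, index, window=2):
    """Count (center_id, context_id) pairs within a symmetric window of each token."""
    counts = Counter()
    for sentence in sentences:
        ids = [index[t] for t in sentence.lower().split() if t in index]
        for i, center in enumerate(ids):
            lo = max(0, i - window)
            hi = min(len(ids), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[(center, ids[j])] += 1
    return counts


# Toy raw corpus: one whitespace-tokenized sentence per string.
corpus = ["the movie was great", "the plot was dull"]
index = build_index(corpus)

# Assumption: the index is stored as a pickle; here it is only round-tripped
# in memory instead of being written to a file named index.pkl.
restored_index = pickle.loads(pickle.dumps(index))

counts = count_cooccurrences(corpus, restored_index)
```

If the script expects something different from this (for example lemmas or POS tags rather than surface tokens), that is exactly what I would like to understand.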

Thanks a lot in advance!
Best,
Rachele
