On the use of cooccurgen.py #21

@RacheleSprugnoli

Dear William,
I’d like to use your code to induce a sentiment lexicon from a new corpus. In your answer to issue #8, you wrote that the first step is to “Use representations/cooccurgen.py to process a corpus and construct co-occurrence matrices.”
Looking at cooccurgen.py, it seems to take as input a corpus in the COHA word_lemma_pos format, and it also needs a file called index.pkl.

  • Do I have to transform my corpus into a tabular format like the COHA format?
  • How is the index.pkl file created?
  • Is there any way to use the script starting from a raw corpus?

Thanks a lot in advance!
Best,
Rachele
