Channel: Economics Job Market Rumors Topic: Text Analysis Question

StatsBro on "Text Analysis Question"


^^ Yes, it's very context dependent, but in general I would start with frequent terms. The usual way to do it is to specify a sparsity level during the cleaning process to remove sparse terms, so you only keep terms that show up in a large enough share of observations. You'll have to play with the sparsity level to get the number of terms you want, b/c if the cutoff is too permissive you can generate a huge number of variables/terms.

So after removing sparse terms, stop words, etc., you should have a list of frequent terms that are hopefully meaningful for your text.
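To make that concrete, here is a rough sketch in plain Python of the kind of filtering described above. Everything in it is made up for illustration: the toy documents, the stop-word list, and the names `tokenize`, `frequent_terms`, and `max_sparsity` are all hypothetical. It also assumes "sparsity" means the share of documents that do not contain a term; the package you actually use may define its threshold differently, so check its docs.

```python
from collections import Counter
import re

# Hypothetical toy corpus and stop-word list, for illustration only.
DOCS = [
    "The central bank raised the interest rate",
    "The bank cut the interest rate again",
    "Inflation and the interest rate moved together",
    "A quirky anecdote about the weather",
]
STOP_WORDS = {"the", "a", "and", "about", "again"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def frequent_terms(docs, stop_words, max_sparsity):
    """Return {term: document frequency} for terms whose sparsity
    (assumed here: share of documents NOT containing the term)
    is at most max_sparsity."""
    doc_freq = Counter()
    for doc in docs:
        # Use a set so each term counts once per document.
        doc_freq.update(set(tokenize(doc)) - stop_words)
    n_docs = len(docs)
    return {
        term: df
        for term, df in doc_freq.items()
        if 1 - df / n_docs <= max_sparsity
    }


# A strict cutoff keeps only terms found in at least half the documents;
# a permissive cutoff keeps nearly every non-stop-word term.
strict = frequent_terms(DOCS, STOP_WORDS, max_sparsity=0.5)
loose = frequent_terms(DOCS, STOP_WORDS, max_sparsity=0.99)
print(sorted(strict))
print(len(strict), len(loose))
```

With this definition, raising `max_sparsity` toward 1 keeps more terms, which is the "huge number of variables" problem on a real corpus. In practice you would rerun it with a few different cutoffs and eyeball the surviving terms.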

