
StatsBro on "Text Analysis Question"

Yes. Create a document-term matrix from the data: a matrix of dummies indicating which words appear in each observation (you can also use bigrams, trigrams, etc.). You'll likely have to do some cleaning first, e.g. stemming and removing punctuation, numbers, named entities, etc.
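A rough sketch of what that could look like, using only the Python standard library. The function names (`tokenize`, `ngrams`, `document_term_matrix`) and the sample texts are made up for illustration, and the plural stripping is only a crude stand-in for real stemming. Named-entity removal is not attempted here.

```python
import re

def tokenize(text):
    """Lowercase, drop punctuation and digits, and crudely strip a plural 's'."""
    words = re.findall(r"[a-z]+", text.lower())
    # Crude stand-in for stemming: a real stemmer (e.g. Porter) does far more.
    return [
        w[:-1] if len(w) > 3 and w.endswith("s") and not w.endswith("ss") else w
        for w in words
    ]

def ngrams(tokens, n):
    """Return the n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def document_term_matrix(docs, n=1):
    """Return (vocab, matrix); matrix[i][j] is 1 if term j occurs in document i."""
    doc_terms = [set(ngrams(tokenize(d), n)) for d in docs]
    vocab = sorted(set().union(*doc_terms))
    matrix = [[1 if term in terms else 0 for term in vocab] for terms in doc_terms]
    return vocab, matrix

# Hypothetical sample observations.
docs = [
    "Great service, friendly staff!",
    "Slow service and rude staff.",
    "Great prices in 2 stores.",
]
vocab, dtm = document_term_matrix(docs)
for term, column in zip(vocab, zip(*dtm)):
    print(term, column)
```

Passing `n=2` gives bigram dummies instead of single words. For real data a library vectorizer and a proper stemmer would probably be a better choice than this hand-rolled version.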

Then you can run regressions, compute correlations and term frequencies, build models, etc.
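As a small illustration of that last step, here is a sketch that counts how many observations mention each term and correlates each term dummy with an outcome. The tiny matrix, the `rating` outcome, and the `pearson` helper are all hypothetical, invented just to show the idea.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy) if vx and vy else 0.0

# Hypothetical toy data: rows are observations, columns are term dummies.
vocab = ["great", "rude", "service"]
dtm = [
    [1, 0, 1],
    [0, 1, 1],
    [1, 0, 0],
    [0, 1, 0],
]
rating = [5, 1, 4, 2]  # hypothetical outcome, one value per observation

columns = list(zip(*dtm))  # one tuple of 0/1 values per term
doc_freq = {term: sum(col) for term, col in zip(vocab, columns)}
corr = {term: pearson(col, rating) for term, col in zip(vocab, columns)}

for term in vocab:
    print(term, doc_freq[term], round(corr[term], 3))
```

With a single 0/1 dummy as the only regressor, the OLS slope is just the difference in mean outcome between observations that mention the term and those that don't, so a correlation like this is a quick first look before fitting a fuller model.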

