
Economist on "Text Analysis Question"

Yes. Create a document-term matrix from the data: a matrix of dummies for word mentions in each observation (you can also use bigrams, trigrams, etc.). You'll likely have to do some cleaning first, e.g. stemming and removing punctuation, numbers, named entities, etc.
Then you can reg away.
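
For what it's worth, here is a minimal sketch of that kind of document-term matrix in Python, using only the standard library. The function names (`tokenize`, `document_term_matrix`), the `min_docs` cutoff, and the sample sentences are all made up for illustration. Cleaning here is limited to lowercasing and dropping punctuation and numbers; stemming and named-entity removal are left out and would need an NLP library.

```python
import re
from collections import Counter


def tokenize(text, ngram=1):
    """Lowercase the text and keep alphabetic runs only (drops punctuation and numbers)."""
    words = re.findall(r"[a-z]+", text.lower())
    # Join consecutive words into n-grams; ngram=1 returns single words.
    return [" ".join(words[i:i + ngram]) for i in range(len(words) - ngram + 1)]


def document_term_matrix(docs, ngram=1, min_docs=1):
    """Return (vocab, matrix): one row per document, one 0/1 dummy per term.

    min_docs is an illustrative assumption: terms mentioned in fewer
    documents than this are dropped from the vocabulary.
    """
    tokenized = [set(tokenize(doc, ngram)) for doc in docs]
    # Count the number of documents that mention each term.
    doc_freq = Counter(term for terms in tokenized for term in terms)
    vocab = sorted(term for term, n in doc_freq.items() if n >= min_docs)
    matrix = [[1 if term in terms else 0 for term in vocab] for terms in tokenized]
    return vocab, matrix


# Hypothetical observations, for illustration only.
docs = [
    "Prices rose 3% in 2020.",
    "Prices fell; wages rose.",
    "Wages were flat.",
]

vocab, dtm = document_term_matrix(docs)
for row in dtm:
    print(row)
```

Each column of `dtm` is then a dummy that can go on the right-hand side of a regression. Passing `ngram=2` builds the same matrix over bigrams, and raising `min_docs` drops terms that show up in too few observations to be useful.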

OK, but to do that I need to know ex ante which words I want to test. My question is more about which words I should look at. Are the most frequent ones the only candidates?

btw, I'm pretty new to this.

