Yes. Create a document-term matrix from the data: a matrix of dummies indicating whether each word appears in each observation (you can also do bigrams, trigrams, etc.). You'll likely have to do some cleaning first, e.g. stemming and removing punctuation, numbers, named entities, etc.
Then you can regress away on those dummies.
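In case it helps, here is a rough sketch of building such a matrix in plain Python. The toy `docs`, the helper names `tokenize` and `document_term_matrix`, and the `min_docs` cutoff are all made up for illustration; it only lowercases and drops punctuation and numbers, so stemming and named-entity removal would need a proper NLP library on top.

```python
import re
from collections import Counter

# Hypothetical toy data: one text per observation.
docs = [
    "Prices rose 3% in 2020, prices kept rising.",
    "The market fell; investors sold.",
    "Investors bought as prices rose!",
]

def tokenize(text):
    """Lowercase and keep only alphabetic runs (drops punctuation and numbers)."""
    return re.findall(r"[a-z]+", text.lower())

def document_term_matrix(texts, min_docs=1):
    """Return (vocab, rows), where rows[i][j] is 1 if vocab[j] appears in texts[i]."""
    token_sets = [set(tokenize(t)) for t in texts]
    # Count in how many documents each word appears (document frequency).
    doc_freq = Counter(tok for s in token_sets for tok in s)
    # Keep only words that appear in at least `min_docs` documents.
    vocab = sorted(w for w, n in doc_freq.items() if n >= min_docs)
    rows = [[1 if w in s else 0 for w in vocab] for s in token_sets]
    return vocab, rows

vocab, dtm = document_term_matrix(docs, min_docs=2)
print(vocab)
for row in dtm:
    print(row)
```

Each column of `dtm` is then one dummy regressor. Dropping words that show up in very few observations (the `min_docs` cutoff here) is just one possible way to keep the number of columns manageable.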
Ok, but to do that I need to know ex ante which words I want to test. My question is more about which words I should look at. Are the most frequent ones the only candidates?
btw, I'm pretty new to this.