sanitizing tweets

February 12, 2018

Problem
You have the text of a tweet and you want to strip out the noise (smileys, emojis, etc.).

Solution
See https://github.com/s/preprocessor, a Python library for cleaning tweet text. It's customizable: you can choose what to remove (URLs, smileys, etc.).
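
To make the "select what to remove" idea concrete, here is a minimal standard-library sketch. It is not the preprocessor library's code or API: the `PATTERNS` table, the `sanitize` function and the regexes are hypothetical stand-ins, and the patterns are deliberately rough (they only cover common smileys and the main emoji ranges).

```python
import re

# Hypothetical pattern table for illustration only; these are rough
# approximations, not the rules used by the preprocessor library.
PATTERNS = {
    "url": re.compile(r"https?://\S+|www\.\S+"),
    "mention": re.compile(r"@\w+"),
    "hashtag": re.compile(r"#\w+"),
    # Western-style smileys such as :) ;-) =D :P, as standalone tokens.
    "smiley": re.compile(r"(?<!\S)[:;=][\-o*']?[)\](\[dDpP/\\|](?!\S)"),
    # Main emoji blocks, misc symbols/dingbats, and the variation selector.
    "emoji": re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF\uFE0F]"),
}


def sanitize(text, remove=("url", "smiley", "emoji")):
    """Remove the selected kinds of items and collapse leftover whitespace."""
    for name in remove:
        text = PATTERNS[name].sub("", text)
    return " ".join(text.split())


if __name__ == "__main__":
    tweet = "Great talk :) 😀 https://example.com/x #python"
    print(sanitize(tweet))
    print(sanitize(tweet, remove=("url", "hashtag")))
```

Passing a different `remove` tuple changes what gets stripped, which is the same kind of knob the library exposes through its own options. For real data the library is likely the better choice, since hand-written regexes like these miss plenty of edge cases.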