So, I’ve heard that ML manipulates tokens and specifically for the English corpora they take place of words. If we want model to be polite and not to speak uncomfortable language we can remove certain words from the internal array where all tokens and their associative data are stored, for example “fuck”.
Why? There are so many snowflakes around, that you’d get texts like this:
I’d say get a life and don’t be offended so much. On the other hand, the human plague would be solved fast when nobody would fuck anymore. You’re on to something. ;)