profanity filter

for ref:!/glitchchord?path=public%2Fjs%2Fchat.js%3A124%3A3

In this chat I would like to have a swear filter that gets words from a DB or array and censors them before saving.

before: oh swearword
after: oh *********

Okay, so you want a profanity filter. Let’s get started.

Filtering sounds easy, but in reality it is not.
People will separate letters or replace them with look-alike characters from other fonts to bypass detection, and may even send multiple messages that together form a bad word (this also includes text art). They can even use other languages and encryption. This is very annoying and hard to prevent, but I will list some ways that can improve detection.

  1. Remove all non-alphabetic characters
    Strip every character that is not a letter. This removes special characters, numbers, look-alike font characters, and more. It improves detection; there are ways around it, but it is still worth doing.
  2. Make the entire string lowercase or uppercase
    An obvious one, but it stops people from mixing lowercase and uppercase characters to bypass detection.
  3. Use regular expressions
    This is a really good one, since we can detect by pattern instead of by exact text.
  4. Use artificial intelligence
    A modern and growing technology. You can train a model to detect bad stuff almost like a human would. This would work very well but takes time to train.
  5. Filter using a list of bad words from an array
    The good old method, but easy to bypass.
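
Here is a rough sketch of how steps 1, 2, 3 and 5 could fit together in JavaScript. The word list and the names BAD_WORDS, normalize, buildPattern, censor and containsBadWord are made up for illustration and are not taken from your chat.js. It also assumes every entry in the list is plain lowercase a-z.

```javascript
// Hypothetical word list; in a real chat this could be loaded from a DB.
// Assumption: entries contain only lowercase a-z.
const BAD_WORDS = ["swearword", "badword"];

// Steps 1 and 2: lowercase the text and drop everything that is not a-z.
// Use the result only for detection, not as the text to display.
function normalize(text) {
  return text.toLowerCase().replace(/[^a-z]/g, "");
}

// Step 3: build a pattern that allows any non-letters between the letters,
// so "S.w.e.a.r.W.o.r.d" is matched as well as "swearword".
function buildPattern(word) {
  const body = word.split("").join("[^a-z]*");
  return new RegExp(body, "gi");
}

// Step 5: replace every match with asterisks of the same length.
function censor(text, words = BAD_WORDS) {
  let result = text;
  for (const word of words) {
    result = result.replace(buildPattern(word), (match) => "*".repeat(match.length));
  }
  return result;
}

// Detection only: true if the normalized text contains any listed word.
function containsBadWord(text, words = BAD_WORDS) {
  const flat = normalize(text);
  return words.some((word) => flat.includes(word));
}

console.log(censor("oh swearword"));
```

With this list, censor("oh swearword") gives "oh *********", like in your example. The trade-off is false positives: because separators are ignored, two innocent words next to each other can also match, so the list and patterns usually need tuning.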

When and where should I filter the string?
There are two ways.

  1. Filter the string on the server before sending it to other clients
    A good one, since it keeps the database clean of bad words. However, if the server gets hacked, the hacker can still send bad stuff.
  2. Filter the string on the client before showing it to the user
    A really good one, since even if a hacker gets into the server, or someone bypasses the filtering on their own client, the other clients still have an untampered filter and theirs will work just fine.
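
Nothing stops you from doing both. A minimal sketch, assuming the same helper is shipped to the server and the client; BLOCKED, censorText, handleIncomingMessage, formatForDisplay and the in-memory storedMessages array are all hypothetical stand-ins, not your actual chat code or database.

```javascript
// Hypothetical word list and a minimal censor helper shared by both sides.
// Assumption: entries contain only letters, so they are safe inside a RegExp.
const BLOCKED = ["swearword"];

function censorText(text) {
  return BLOCKED.reduce(
    (out, word) => out.replace(new RegExp(word, "gi"), (m) => "*".repeat(m.length)),
    text
  );
}

// Server side: censor before saving, so the stored copy stays clean.
const storedMessages = []; // stand-in for the real database
function handleIncomingMessage(user, text) {
  const message = { user, text: censorText(text) };
  storedMessages.push(message);
  return message; // this is what would be sent on to the other clients
}

// Client side: censor again right before display, in case something
// unfiltered reached this client anyway.
function formatForDisplay(message) {
  return `${message.user}: ${censorText(message.text)}`;
}

const saved = handleIncomingMessage("ann", "oh swearword");
console.log(formatForDisplay(saved));
```

Censoring twice is harmless here because already censored text contains nothing left to match.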

But it turns out you can’t prevent it entirely. It’s a constant battle.

