WSpell ActiveX Spelling Checker
This article describes some techniques for using WSpell to detect the presence of specific words in text, such as profanity.
Usually, WSpell is used to detect words which are not present in a dictionary of words. Words not in the dictionary are deemed to be misspelled and are reported so they can be corrected. Detecting the presence of specific words is, in certain respects, the reverse of this.
Each word in a dictionary used by WSpell has an action code associated with it. When a word in the text being checked matches a word in a dictionary, WSpell examines the action code associated with the word and performs the indicated action. The most common action tells WSpell to ignore or skip over the word, usually because the word is correctly spelled and needs no further handling.
Other action codes supported by WSpell cause the word to be automatically or conditionally replaced with another word. These actions are typically used to "auto correct" certain frequently misspelled words, such as replacing "recieve" with "receive." Automatic or conditional replacements are not made by WSpell directly. Instead, WSpell reports to the calling application (i.e., your application) that the word should be replaced with another word. Normally, the calling application then makes the replacement by calling certain methods in the WSpell API, perhaps after confirming the replacement with the user. The key thing to note here is that certain words can be assigned an action code which causes WSpell to report to your application when those words are encountered. This is exactly what is needed to detect the presence of those words.
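The lookup-and-report behavior described above can be modeled in a few lines. The following is only a minimal sketch in Python, written independently of WSpell: the action names, the `Report` type, and the `check_text` function are hypothetical and are not part of the WSpell API. It also assumes case-insensitive matching, which this article does not specify.

```python
import re
from typing import Dict, Iterator, NamedTuple, Tuple

# Hypothetical action codes for this sketch only. WSpell's real constants
# are listed in its programmer's guide.
IGNORE = "ignore"
CONDITIONAL_CHANGE = "conditional_change"


class Report(NamedTuple):
    word: str         # the word as it appeared in the text
    offset: int       # character offset of the word in the text
    replacement: str  # replacement string stored with the dictionary entry


def check_text(text: str, dictionary: Dict[str, Tuple[str, str]]) -> Iterator[Report]:
    """Yield a report for each word whose dictionary entry asks to be reported."""
    for match in re.finditer(r"[A-Za-z']+", text):
        # Assumption: lookups are case-insensitive.
        entry = dictionary.get(match.group().lower())
        if entry is None:
            continue  # unknown word: a real checker would flag a misspelling here
        action, replacement = entry
        if action == CONDITIONAL_CHANGE:
            # The checker does not edit the text; it only tells the caller.
            yield Report(match.group(), match.start(), replacement)
        # IGNORE: the word is skipped silently.


dictionary = {
    "the": (IGNORE, ""),
    "recieve": (CONDITIONAL_CHANGE, "receive"),
}

reports = list(check_text("Did you recieve the letter?", dictionary))
```

In this model the caller receives one report for "recieve" and decides what to do with it. That is the hook the rest of the article relies on: the report does not have to lead to a replacement at all.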
Entries in text dictionaries contain a word, an action, and a replacement word. Suppose you want to detect the presence of the words "dog," "cat," or "pig" in the text (we'll pretend these words are profanity). You could create a new text dictionary (by calling the CreateUserDictionary method) and add three entries to it, one for each word you want to detect. Actions which can be associated with words in dictionaries are listed in the WSpell programmer's guide under "User dictionary action codes." We'll use wSpellConditionalChangeAction as the action. The replacement word will be an encoded string. The encoded string will tell our application that the word is profanity. To keep things simple, we'll just use "XXX" as the replacement word. The three entries can be created by calling the AddToUserDictionary method:
wspell1.AddToUserDictionary("profanity.tlx", "dog", wSpellConditionalChangeAction, "XXX");
wspell1.AddToUserDictionary("profanity.tlx", "cat", wSpellConditionalChangeAction, "XXX");
wspell1.AddToUserDictionary("profanity.tlx", "pig", wSpellConditionalChangeAction, "XXX");
With this dictionary in use, WSpell will fire the ConditionallyChangeWord event whenever "dog", "cat", or "pig" is encountered in the text. Your application can examine the ReplacementWord property to determine whether the word is profanity: if ReplacementWord is "XXX", a match has been found.
Private Sub WSpell1_ConditionallyChangeWord()
    ' "XXX" is the marker stored as the replacement word for each profanity entry.
    If wspell1.ReplacementWord = "XXX" Then
        MsgBox "Tsk, tsk tsk!"
    End If
End Sub
The replacement word can be any string, so additional information can be encoded in it. For example, you might want to follow "XXX" with a digit indicating the "offensiveness," with "1" meaning "mildly inappropriate" and "9" being reserved for words that would make Tony Soprano blush.
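One way to build and interpret such an encoded replacement word is sketched below in Python, independently of WSpell. The helper names `encode_profanity` and `decode_profanity` are hypothetical, and the sketch assumes every profanity entry is stored as the marker "XXX" followed by exactly one digit from 1 to 9.

```python
from typing import Optional

MARKER = "XXX"  # the marker string used in this article's example


def encode_profanity(severity: int) -> str:
    """Build the replacement string to store with a dictionary entry."""
    if not 1 <= severity <= 9:
        raise ValueError("severity must be between 1 and 9")
    return f"{MARKER}{severity}"


def decode_profanity(replacement: str) -> Optional[int]:
    """Return the severity digit, or None if this is not a profanity marker."""
    if len(replacement) != len(MARKER) + 1 or not replacement.startswith(MARKER):
        return None  # an ordinary replacement word, or a malformed marker
    digit = replacement[len(MARKER):]
    if digit not in "123456789":
        return None
    return int(digit)
```

An event handler could pass the reported replacement word to a function like `decode_profanity` and treat a `None` result as an ordinary auto-correct suggestion, so the same dictionary mechanism can serve both purposes.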
The same approach can be used in other circumstances where you want to detect the presence of certain words: categorizing e-mail into folders, filtering spam, detecting part numbers, and so on.
Copyright © 2006 Wintertree Software Inc. Last modified |