
Spell Checker: Operations on Text (NLP)

Operations on text:

  • Split
  • Delete
  • Swap
  • Replace
  • Insert

Split

We split the word at every position, character by character, so that each result is a (left, right) pair. A word of n characters gives n + 1 pairs.

def split(word):
    # Collect every (left, right) pair, from ('', word) to (word, '').
    parts = []
    for i in range(len(word) + 1):
        parts.append((word[:i], word[i:]))
    return parts

split('datatoinfinity')
Output:
[('', 'datatoinfinity'),
 ('d', 'atatoinfinity'),
 ('da', 'tatoinfinity'),
 ('dat', 'atoinfinity'),
 ('data', 'toinfinity'),
 ('datat', 'oinfinity'),
 ('datato', 'infinity'),
 ('datatoi', 'nfinity'),
 ('datatoin', 'finity'),
 ('datatoinf', 'inity'),
 ('datatoinfi', 'nity'),
 ('datatoinfin', 'ity'),
 ('datatoinfini', 'ty'),
 ('datatoinfinit', 'y'),
 ('datatoinfinity', '')]
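
The same split can also be written as a list comprehension. This is a minimal sketch that should be equivalent to the loop above, shown only as a more compact alternative; the short test word 'data' is chosen here for illustration.

```python
def split(word):
    # One (left, right) pair per cut position, including both ends.
    return [(word[:i], word[i:]) for i in range(len(word) + 1)]

pairs = split("data")
print(len(pairs))  # 5
print(pairs[2])    # ('da', 'ta')
```

Joining the left and right part of any pair gives back the original word, which is what lets the other operations rebuild a modified word from a pair.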

Delete

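def delete(word):
    output = []
    for l, r in split(word):
        # Skip the last split, where r is empty and nothing can be deleted.
        if r:
            output.append(l + r[1:])
    return output

delete('hello')
Output:
['ello', 'hllo', 'helo', 'helo', 'hell']

Each candidate drops exactly one character from the word. 'helo' appears twice because deleting either 'l' gives the same result.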
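
Because repeated letters produce repeated candidates, it can help to collapse the list into a set. The sketch below assumes a delete that skips the final split (where the right part is empty), so the unchanged word is not among the results; the set step is an optional addition, not part of the code above.

```python
def split(word):
    return [(word[:i], word[i:]) for i in range(len(word) + 1)]

def delete(word):
    # Assumption: skip the split whose right part is empty.
    return [l + r[1:] for l, r in split(word) if r]

candidates = delete("hello")
print(candidates)               # ['ello', 'hllo', 'helo', 'helo', 'hell']
print(sorted(set(candidates)))  # ['ello', 'hell', 'helo', 'hllo']
```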

Swap

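def swap(word):
    output = []
    for l, r in split(word):
        # Swapping needs at least two characters to the right of the split.
        if len(r) > 1:
            output.append(l + r[1] + r[0] + r[2:])
    return output

swap('Hello')
Output:
['eHllo', 'Hlelo', 'Hello', 'Helol']

Swapping the two adjacent 'l' characters returns 'Hello' unchanged.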
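
Swap is the operation that undoes a transposition typo. As a small illustration, the sketch below restates split and swap in compact form and applies them to the misspelling 'teh', an example word chosen here and not taken from the text above.

```python
def split(word):
    return [(word[:i], word[i:]) for i in range(len(word) + 1)]

def swap(word):
    # Exchange the first two characters of the right part at each split.
    return [l + r[1] + r[0] + r[2:] for l, r in split(word) if len(r) > 1]

print(swap("teh"))  # ['eth', 'the']
```

A word of n characters has n - 1 adjacent pairs, so swap returns n - 1 candidates.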

Replace

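def replace(word):
    characters = 'abcdefghijklmnopqrstuvwxyz'
    output = []
    for l, r in split(word):
        # Skip the last split: with an empty r there is no character to replace.
        if r:
            for char in characters:
                output.append(l + char + r[1:])
    return output

len(replace('lave'))
Output:
104

Each of the 4 positions is replaced by each of the 26 letters, giving 4 × 26 = 104 candidates.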
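
To see what the replace candidates look like, the sketch below counts them and checks for a few plausible corrections. It assumes a replace that skips the final split (where the right part is empty), so only existing characters are replaced and nothing is appended to the end of the word.

```python
LETTERS = "abcdefghijklmnopqrstuvwxyz"

def split(word):
    return [(word[:i], word[i:]) for i in range(len(word) + 1)]

def replace(word):
    # Assumption: skip the split whose right part is empty.
    return [l + c + r[1:] for l, r in split(word) if r for c in LETTERS]

candidates = replace("lave")
print(len(candidates))       # 104
print(len(set(candidates)))  # 101
print("love" in candidates)  # True
```

The unique count is lower because replacing a character with itself returns 'lave' once per position.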

Insert

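def insert(word):
    characters = 'abcdefghijklmnopqrstuvwxyz'
    output = []
    for l, r in split(word):
        # Insert each letter at this split position, including both ends.
        for char in characters:
            output.append(l + char + r)
    return output

len(insert('lve'))
Output:
104

A 3-character word has 4 insertion positions, so there are 4 × 26 = 104 candidates.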
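
One way the four edit operations might be combined into a spell checker is to pool all candidates and keep only those found in a word list. The sketch below is illustrative only: edits1, suggestions, and the toy VOCABULARY are hypothetical names and data introduced here, and the guards on empty right parts are an assumption.

```python
LETTERS = "abcdefghijklmnopqrstuvwxyz"

# Hypothetical toy vocabulary, for illustration only.
VOCABULARY = {"love", "live", "leave", "have", "hello"}

def split(word):
    return [(word[:i], word[i:]) for i in range(len(word) + 1)]

def edits1(word):
    """All strings one delete, swap, replace, or insert away from word."""
    pairs = split(word)
    deletes = [l + r[1:] for l, r in pairs if r]
    swaps = [l + r[1] + r[0] + r[2:] for l, r in pairs if len(r) > 1]
    replaces = [l + c + r[1:] for l, r in pairs if r for c in LETTERS]
    inserts = [l + c + r for l, r in pairs for c in LETTERS]
    return set(deletes + swaps + replaces + inserts)

def suggestions(word):
    # Keep only the candidates that are known words.
    return sorted(edits1(word) & VOCABULARY)

print(suggestions("lve"))  # ['live', 'love']
```

Returning a set removes the duplicates that the individual operations produce, and the intersection with the vocabulary discards the many candidates that are not real words.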
