Addressing Viterbi Algorithm challenges

EliteAI
3 min read · Mar 10, 2021

Part 2: Viterbi decoding algorithm for POS tagging

Website: https://eliteaihub.com/

In the last blog I discussed how to implement the Viterbi algorithm from scratch for decoding, i.e., finding the most likely tag sequence. Today we will identify some of its challenges and try to address them. Let's go through the algorithm as well:

[Figure: the Viterbi algorithm]

1. Underflow and Overflow

Except for rather short sequences, in which n is relatively small (say, a couple hundred or so), the algorithm above will fail, because we are trying to represent numbers that are too small (or too large) for the computer to handle. What's worse, you will usually receive no warning or error that something has gone wrong. Basically, the issue is that in most programming languages there is a limit on how small (or large) a number can be represented. (For example, in a couple of languages that I use, the lower limit seems to be around 10^-323, and the upper limit around 10^308.) Anything smaller (or larger) than this will be treated as exactly zero (or infinity). This is referred to as "arithmetic underflow" (or "arithmetic overflow").
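A quick way to see the problem: multiply a few hundred modest probabilities together and the product silently collapses to zero, while the equivalent sum of logs stays perfectly representable. (The constant 0.01 below is just an illustrative probability, not a value from the algorithm.)

```python
import math

# Multiplying 200 transition/emission probabilities of 0.01 each:
prob = 1.0
for _ in range(200):
    prob *= 0.01
print(prob)        # 0.0 -- the true value, 1e-400, underflows to zero

# The same quantity in log space is perfectly representable:
log_prob = sum(math.log(0.01) for _ in range(200))
print(log_prob)    # about -921.03
```

Note that no exception is raised: the product just becomes 0.0, which is exactly the silent failure described above.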

The standard solution to this problem is to work with logs; this trick works in a lot of other problems as well.
Denote l = log p, e.g., l(z1) = log p(z1), l(zt | zt−1) = log p(zt | zt−1), and l(xt | zt) = log p(xt | zt). Under this transformation, the products in the Viterbi recursion become sums of log-probabilities, which stay in a safe numeric range.
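Here is a minimal sketch of the decoder rewritten in log space. The dict-based parameter layout and all names (`viterbi_log`, `log_pi`, `log_A`, `log_B`) are my own illustrative choices under these assumptions, not code from the original post:

```python
import math

NEG_INF = float("-inf")

def viterbi_log(obs, states, log_pi, log_A, log_B):
    """Viterbi in log space: products become sums, so long sequences
    no longer underflow. log_pi[s], log_A[s][s2] and log_B[s][o] hold
    log-probabilities (NEG_INF plays the role of a zero probability)."""
    V = [{s: log_pi[s] + log_B[s].get(obs[0], NEG_INF) for s in states}]
    back = [{}]
    for t in range(1, len(obs)):
        V.append({})
        back.append({})
        for s in states:
            # best predecessor: max over sums instead of max over products
            best_prev, best_score = max(
                ((p, V[t - 1][p] + log_A[p][s]) for p in states),
                key=lambda x: x[1])
            V[t][s] = best_score + log_B[s].get(obs[t], NEG_INF)
            back[t][s] = best_prev
    # backtrack from the best final state
    last = max(V[-1], key=V[-1].get)
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    return list(reversed(path))

# Toy two-tag example (illustrative numbers, not from the post):
states = ["N", "V"]
log_pi = {"N": math.log(0.6), "V": math.log(0.4)}
log_A = {"N": {"N": math.log(0.3), "V": math.log(0.7)},
         "V": {"N": math.log(0.8), "V": math.log(0.2)}}
log_B = {"N": {"dog": math.log(0.9), "runs": math.log(0.1)},
         "V": {"dog": math.log(0.1), "runs": math.log(0.9)}}
print(viterbi_log(["dog", "runs"], states, log_pi, log_A, log_B))  # ['N', 'V']
```

Because only additions and comparisons are involved, scores for sequences thousands of tokens long remain ordinary finite floats.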

2. Dealing with unknown words

If a word o does not occur in the training text, then bi(ot) = 0. Besides being a very bad estimate of the true probability, such zero probabilities mean that every possible state sequence for an observation sequence containing an unknown word also gets probability 0. Here I used a maximum-entropy classifier to predict the emission probabilities for the unknown words, based on regex features of the word:

regex_patterns = [
    r'^[a-zA-Z]*[0-9]+',              # word contains a number
    r'^[A-Z]',                        # word starts with an upper-case letter
    r'[a-zA-Z]+(ed|ing|es)$',         # word ends with 'ed', 'ing' or 'es'
    r'^[A-Z]{2,}$',                   # word is all upper-case
    r'.*s$',                          # plural nouns
    r'.*(ment|tion|ness|ism|ance)$',  # noun-forming suffixes
    r".*'s$",                         # possessive nouns -- words ending with 's
    r'.*(able|ible|est|ful)$',        # adjectives -- end with able, ible, est, ful
    r'.*(ly|er)$',                    # adverbs -- end with ly or er
]
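Such patterns can be turned into binary features for the classifier. Below is a minimal sketch of a feature extractor; the helper name `word_features` and the trimmed pattern list are my own illustrative choices, not code from the original post (the resulting vectors would be fed to a maximum-entropy classifier such as scikit-learn's `LogisticRegression`):

```python
import re

# Hypothetical helper: one 0/1 feature per regex pattern, to feed a
# maximum-entropy classifier that estimates emission probabilities
# for unknown words.
def word_features(word, patterns):
    return [1 if re.search(p, word) else 0 for p in patterns]

# A trimmed, illustrative subset of the patterns above:
patterns = [
    r'^[a-zA-Z]*[0-9]+',       # contains a number
    r'[a-zA-Z]+(ed|ing|es)$',  # verbal suffixes
    r'.*(ly|er)$',             # adverb-like suffixes
]
print(word_features("running", patterns))  # [0, 1, 0]
```

At decoding time, whenever bi(ot) would be zero because ot is unseen, the classifier's predicted distribution over tags can stand in for the emission probabilities.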

Sources:

[1] Chapter draft, Stanford Lagunita (CS-224N): https://lagunita.stanford.edu/c4x/Engineering/CS-224N/asset/slp4.pdf

Thanks for reading. If you have any feedback, please feel free to reach out by commenting on this post.

Links to the YouTube videos:

Implementing Viterbi algorithm : https://youtu.be/DmQgcZ-L1Uk

Viterbi decoding algorithm : https://youtu.be/yGtC10JJNq8

Machine Learning : https://youtu.be/Q3mZui3H6MM

Other Stories:

a. Viterbi Decoding model from scratch: https://eliteai-coep.medium.com/build-viterbi-decoding-model-from-scratch-aa6d39726877

b. Threats of adversarial attacks in deep learning: https://eliteai-coep.medium.com/threat-of-adversarial-attacks-on-deep-learning-in-computer-vision-5cb7ef50f020

c. Building Ngram model from scratch: https://eliteai-coep.medium.com/building-n-gram-language-model-from-scratch-9a5ec206b520

d. Ngram Language models : https://eliteai-coep.medium.com/n-gram-language-models-55a0c60a3343
