Other Student Works

Document Type


Publication Date

Spring 2-19-2020


Part-of-speech (POS) tagging is the process of marking each word in a text with its part of speech, based on both its definition and its relationship with adjacent and related words in a phrase, sentence, or paragraph. POS taggers fall into two distinct groups: rule-based and stochastic. In this paper, a rule-based POS tagger is developed for the English language using Lex and Yacc. The tagger uses a small set of simple rules along with a small dictionary to generate sequences of tokens.
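As a rough illustration of the dictionary-plus-rules idea described above, the following plain C sketch tags a single word. It is not the paper's implementation: the word list, the tag names (`DET`, `NOUN`, `VERB`, and so on), the suffix rules, and the function names `tag_word` and `has_suffix` are all assumptions made for this example. In a Lex and Yacc tagger, the lookup would typically live in the lexer's pattern actions instead.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mini-dictionary; the paper's actual word list is not shown here. */
struct entry {
    const char *word;
    const char *tag;
};

static const struct entry DICTIONARY[] = {
    {"the", "DET"}, {"a", "DET"},   {"dog", "NOUN"},
    {"is", "VERB"}, {"in", "PREP"}, {"and", "CONJ"},
};

/* Return 1 if `word` ends with `suffix` and is strictly longer than it. */
static int has_suffix(const char *word, const char *suffix)
{
    size_t wl = strlen(word);
    size_t sl = strlen(suffix);
    return wl > sl && strcmp(word + wl - sl, suffix) == 0;
}

/* Tag one word: dictionary lookup first, then assumed suffix rules. */
const char *tag_word(const char *word)
{
    char lower[64];
    size_t i;

    /* Work on a lowercase copy so that "The" matches "the". */
    for (i = 0; word[i] != '\0' && i < sizeof lower - 1; i++)
        lower[i] = (char)tolower((unsigned char)word[i]);
    lower[i] = '\0';

    /* Step 1: exact match against the small dictionary. */
    for (i = 0; i < sizeof DICTIONARY / sizeof DICTIONARY[0]; i++)
        if (strcmp(lower, DICTIONARY[i].word) == 0)
            return DICTIONARY[i].tag;

    /* Step 2: simple suffix rules (illustrative assumptions only). */
    if (has_suffix(lower, "ly"))
        return "ADV";
    if (has_suffix(lower, "ing") || has_suffix(lower, "ed"))
        return "VERB";
    if (has_suffix(lower, "ous") || has_suffix(lower, "ful"))
        return "ADJ";

    /* Step 3: fall back to noun when nothing else applies. */
    return "NOUN";
}
```

Under these assumed rules, `tag_word("The")` yields `"DET"` from the dictionary, while `tag_word("quickly")` yields `"ADV"` from the suffix rule. A rule-based tagger of the kind the abstract describes would additionally use the neighboring tags, for instance through grammar rules in Yacc, which this single-word sketch does not attempt.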


Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.


An article on a rule-based part-of-speech tagger implemented in C using Flex and Yacc.


