Towards Building a Modern Written Tamil Treebank

Parameswari, K.; Sarveswaran, K.

URI: http://repo.lib.jfn.ac.lk/ujrr/handle/123456789/9036

Date: 2022

Abstract:

This paper describes the creation of a morphosyntactically annotated treebank for modern written Tamil that follows the Universal Dependencies (UD) framework and is intended to support the implementation and evaluation of Tamil dependency parsers. The treebank currently consists of 534 sentences. In addition to outlining the annotation methodology, the paper discusses constructions unique to Tamil and explains the sub-relations and language-specific relations that were introduced. This carefully annotated treebank can also serve as a benchmark dataset for evaluating Tamil Natural Language Processing (NLP) tools. The treebank will be extended to cover more complex Tamil constructions, and its annotations will be enriched by incorporating the Enhanced Universal Dependencies scheme.
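
UD treebanks are normally distributed in the tab-separated CoNLL-U format, in which sub-relations are written as a universal relation plus a subtype after a colon (for example `nsubj:pass`). The following is a minimal sketch of how such a file could be read and its relations tallied. It assumes the treebank is available as CoNLL-U text; the sample sentence is an invented English toy example, not taken from the treebank, and the names `read_conllu` and `relation_counts` are hypothetical helpers introduced only for illustration.

```python
from collections import Counter

# Toy CoNLL-U fragment: invented English example for illustration only,
# NOT a sentence from the Tamil treebank described in the abstract.
SAMPLE = """\
# sent_id = toy-1
# text = The book was read.
1\tThe\tthe\tDET\t_\t_\t2\tdet\t_\t_
2\tbook\tbook\tNOUN\t_\t_\t4\tnsubj:pass\t_\t_
3\twas\tbe\tAUX\t_\t_\t4\taux:pass\t_\t_
4\tread\tread\tVERB\t_\t_\t0\troot\t_\tSpaceAfter=No
5\t.\t.\tPUNCT\t_\t_\t4\tpunct\t_\t_
"""


def read_conllu(text):
    """Yield each sentence as a list of 10-field token rows (hypothetical helper)."""
    sentence = []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if sentence:
                yield sentence
                sentence = []
        elif line.startswith("#"):
            # Sentence-level comments (sent_id, text, ...) are skipped here.
            continue
        else:
            fields = line.split("\t")
            if len(fields) != 10:
                raise ValueError(f"expected 10 fields, got {len(fields)}: {line!r}")
            # Skip multiword-token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
            if "-" in fields[0] or "." in fields[0]:
                continue
            sentence.append(fields)
    if sentence:
        yield sentence


def relation_counts(sentences):
    """Count universal relations and, separately, subtyped relations."""
    universal = Counter()
    subtyped = Counter()
    for sent in sentences:
        for fields in sent:
            deprel = fields[7]  # DEPREL is the eighth CoNLL-U column
            base, _, subtype = deprel.partition(":")
            universal[base] += 1
            if subtype:
                subtyped[deprel] += 1
    return universal, subtyped


sentences = list(read_conllu(SAMPLE))
universal, subtyped = relation_counts(sentences)
print("sentences:", len(sentences))
print("universal relations:", dict(universal))
print("subtyped relations:", dict(subtyped))
```

Run against a full treebank file, a tally of this kind gives a quick check on the sentence count and on which sub-relations and language-specific relations actually occur in the annotation.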
