Framework for Question-Answering in Sanskrit through Automated Construction of Knowledge Graphs [PDF]
Sanskrit (saṃskṛta) enjoys one of the largest and most varied literature in the whole world. Extracting the knowledge from it, however, is a challenging task due to multiple reasons including complexity of the language and paucity of standard ...
Hrishikesh Terdalkar, A. Bhattacharya
SanskritShala: A Neural Sanskrit NLP Toolkit with Web-Based Interface for Pedagogical and Annotation Purposes [PDF]
We present a neural Sanskrit Natural Language Processing (NLP) toolkit named SanskritShala (a school of Sanskrit) to facilitate computational linguistic analyses for several tasks such as word segmentation, morphological tagging, dependency parsing, and ...
Jivnesh Sandhan +4 more
San-BERT: Extractive Summarization for Sanskrit Documents using BERT and it's variants [PDF]
In this work, we develop language models for the Sanskrit language, namely Bidirectional Encoder Representations from Transformers (BERT) and its variants: A Lite BERT (ALBERT), and Robustly Optimized BERT (RoBERTa) using Devanagari Sanskrit text corpus.
Kartikeya Bhatnagar +3 more
Sāmayik: A Benchmark and Dataset for English-Sanskrit Translation [PDF]
We release Sāmayik, a dataset of around 53,000 parallel English-Sanskrit sentences, written in contemporary prose. Sanskrit is a classical language still in sustenance and has a rich documented heritage.
Ayush Maheshwari +5 more
A Benchmark and Dataset for Post-OCR text correction in Sanskrit [PDF]
Sanskrit is a classical language with about 30 million extant manuscripts fit for digitisation, available in written, printed or scanned-image forms.
Ayush Maheshwari +3 more
TransLIST: A Transformer-Based Linguistically Informed Sanskrit Tokenizer [PDF]
Sanskrit Word Segmentation (SWS) is essential in making digitized texts available and in deploying downstream tasks. It is, however, non-trivial because of the sandhi phenomenon that modifies the characters at the word boundaries, and needs special ...
Jivnesh Sandhan +5 more
Automatic Speech Recognition in Sanskrit: A New Speech Corpus and Modelling Insights [PDF]
Automatic speech recognition (ASR) in Sanskrit is interesting, owing to the various linguistic peculiarities present in the language. The Sanskrit language is lexically productive, undergoes euphonic assimilation of phones at the word boundaries and ...
D. Adiga +5 more
A Novel Neural Machine Translation Approach for low-resource Sanskrit-Hindi Language pair
Sanskrit is one of the earliest native languages and is correctly described as "the gods' language" because of its wide use in Indian religious literature from the past. However, it is becoming less popular in modern India. Due in significant part to the
N. Sethi, Amita Dev, Poonam Bansal
Handwritten Vedic Sanskrit Text Recognition Using Deep Learning and Convolutional Neural Networks
Recognizing Vedic Sanskrit text is essential for accessing classical Indo-Aryan language, predominantly utilized in the Vedas. Currently, there is limited awareness about the Vedas, making this field a highly demanding and challenging area in pattern ...
Ashi Maheshwari et al.
Data-driven dependency parsing of Vedic Sanskrit
This paper describes the first data-driven parser for Vedic Sanskrit, an ancient Indo-Aryan language in which a corpus of important religious and philosophical texts has been composed.
Oliver Hellwig +2 more

