Building a semantic role labelling system for Vietnamese
Content
Authors: Thai-Hoang Pham; Xuan-Khoai Pham; Phuong Le-Hong
Abstract: Semantic role labelling (SRL) is a natural language processing task that detects and classifies the semantic arguments associated with the predicates of a sentence. It is an important step towards understanding the meaning of natural language. SRL systems exist for well-studied languages such as English, Chinese and Japanese, but no such system exists for Vietnamese. In this paper, we present the first SRL system for Vietnamese, with encouraging accuracy. We first demonstrate that simply applying SRL techniques developed for English does not give good accuracy for Vietnamese. We then introduce a new algorithm for extracting candidate syntactic constituents, which is much more accurate than the node-mapping algorithm commonly used in the identification step. Finally, in the classification step, in addition to the common linguistic features, we propose novel features that prove useful for SRL. Our SRL system achieves an F1 score of 73.53% on the Vietnamese PropBank corpus. The system, including software and corpus, is available as an open source project, and we believe it is a good baseline for the development of future Vietnamese SRL systems.
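For readers unfamiliar with the reported metric, the following is a minimal sketch of how an F1 score can be computed for SRL output. It assumes exact-match scoring, where a predicted argument counts as correct only if its predicate, span and role label all match the gold annotation. The abstract does not state the paper's exact evaluation protocol, so this scoring rule, the `srl_f1` function, the tuple layout and the example role labels are all illustrative assumptions rather than the authors' implementation.

```python
from typing import Set, Tuple

# Hypothetical representation of one labelled argument:
# (predicate_index, start_token, end_token, role_label)
Argument = Tuple[int, int, int, str]


def srl_f1(gold: Set[Argument], predicted: Set[Argument]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) under exact-match scoring (an assumption)."""
    # An argument is correct only if predicate, span and label all match.
    correct = len(gold & predicted)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy data: three gold arguments; the system mislabels the second one.
gold = {(1, 0, 0, "Arg0"), (1, 2, 3, "Arg1"), (1, 4, 5, "ArgM-TMP")}
predicted = {(1, 0, 0, "Arg0"), (1, 2, 3, "Arg2"), (1, 4, 5, "ArgM-TMP")}

precision, recall, f1 = srl_f1(gold, predicted)
print(f"P={precision:.4f} R={recall:.4f} F1={f1:.4f}")
```

Because F1 is a harmonic mean, a system must find the right constituents (identification) and assign the right roles (classification) to score well; an error in either step lowers both precision and recall, as the mislabelled argument in the toy data shows.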
Related Posts
Operate an Intelligent Call Center with Conversational AI
The contact center remains a “priority investment” for businesses that want to serve customers quickly and effectively. Artificial Intelligence (AI) has proven itself to be the key solution to sophisticated business problems, and 79% of call centers expect to invest in this technology (Deloitte, 2021). Notably, Conversational AI solutions help automate two-way interaction with customers, allowing … Continued
A Faster R-CNN Approach for Partially Occluded Robot Object Recognition
Content Authors: Delowar Hossain, Sivapong Nilwong, Duc Dung Tran, Genci Capi Abstract: Many objects in household and industrial environments are commonly found partially occluded. In this paper, we address the problem of recognizing objects for use in partially occluded object recognition. To enable the use of more expensive features and classifiers, a region proposal network (RPN) which … Continued
Multifeature Image Indexing for Robot Localization in Textureless Environments
Content Authors: Tran Duc Dung; Delowar Hossain; Shin-ichiro Kaneko; Genci Capi Abstract: Robot localization is an important task for mobile robot navigation. There are many methods focused on this issue. Some methods are implemented in indoor and outdoor environments. However, robot localization in textureless environments is still a challenging task. This is because in these environments, the scene … Continued
Towards Understanding User Requests in AI Bots
Content Authors: Luong Chi Tho, Tran Thi Oanh Abstract: This paper presents the task of deeply analyzing user requests: the situation in ordering bots where users input an utterance, the bots would hopefully extract its full product descriptions and then parse them to recognize each product information (PI). This information is useful to help bots better understand … Continued
Understanding what the users say in chatbots: A case study for the Vietnamese language
Content Authors: Luong Chi Tho, Tran Thi Oanh Abstract: This paper presents a study on understanding what the users say in chatbot systems: the situation where users input utterances and bots would hopefully detect intents and recognize the corresponding contexts implied by the utterances. This helps bots better understand what users are saying, and act upon a much wider … Continued
A combined syntactic-semantic embedding model based on lexicalized tree-adjoining grammar
Content Authors: Le Hong Phuong, Dang Hoang Vu Abstract: This paper presents a joint syntactic-semantic embedding model which not only uses syntactic information to enrich the word embeddings but also generates distributed representations for the syntactic structures themselves. The syntactic input to our model comes from a Lexicalized Tree-Adjoining Grammar parser. The word embeddings from our … Continued
An Efficient Algorithm for the k-Dominating Set Problem on Very Large-Scale Networks (Extended Abstract)
Content Authors: Tran The Trung, Nguyen Minh Hai, Ha Minh Hoang, Hoang Thai Dinh, Eryk Dutkiewicz, Diep N. Nguyen Abstract: The minimum dominating set problem (MDSP) aims to construct the minimum-size subset D ⊂ V of a graph G = (V, E) such that every vertex has at least one neighbor in D. The problem is proved to be NP-hard. In a recent industrial application, we encountered a more general variant … Continued
Introducing a Large-Scale Dataset for Vietnamese POS Tagging on Conversational Texts
Content Authors: Oanh Tran, Tu Pham, Vu Dang, Bang Nguyen Abstract: This paper introduces a large-scale human-labeled dataset for the Vietnamese POS tagging task on conversational texts. To this end, we propose a new tagging scheme (with 36 POS tags) consisting of exclusive tags for special phenomena of conversational words, develop the annotation guideline and manually annotate 16.310K sentences … Continued
Online Adaptation of Language Models for Speech Recognition
Content Authors: Dang Hoang Vu, Van Huy Nguyen, Phuong Le-Hong Abstract: Hybrid models of speech recognition combine a neural acoustic model with a language model, which rescores the output of the acoustic model to find the most linguistically likely transcript. Consequently the language model is of key importance in both open and domain specific speech recognition and … Continued
Towards Task-Oriented Dialogue in Mixed Domains
Content Authors: Luong Chi Tho, Le Hong Phuong Abstract: This work investigates the task-oriented dialogue problem in mixed-domain settings. We study the effect of alternating between different domains in sequences of dialogue turns using two related state-of-the-art dialogue systems. We first show that a specialized state tracking component in multiple domains plays an important role and gives … Continued
Building Vietnamese Linguistic Resources for Social Network Text Analysis
Content Authors: The-Tuyen Nguyen; Xuan-Luong Vu; Phuong Le-Hong Abstract: In this paper, we report our work on building linguistic resources for Vietnamese social network text analysis in multiple domains. We first describe our annotation methodology, including guideline development, annotation software and quality assurance. We then present results of the first pilot phase of the project. Finally, we outline some … Continued