A new approach to automatically find and fix erroneous labels in dependency parsing treebanks

Bilgin, Metin

dc.contributor.author	Bilgin, Metin
dc.contributor.buuauthor	BİLGİN, METİN
dc.contributor.department	Faculty of Engineering
dc.contributor.department	Department of Computer Engineering
dc.contributor.researcherid	AAH-2049-2021
dc.date.accessioned	2024-06-28T10:51:07Z
dc.date.available	2024-06-28T10:51:07Z
dc.date.issued	2021-05-01
dc.description.abstract	Dependency Parsing (DP) determines, for each sentence in a text, the head-dependent (upper-term/sub-term) relations between the words that make up the sentence. DP produces meaningful information for high-level applications. Correct labeling of the text corpus used in DP studies is therefore very important: studies carried out on a wrongly labeled corpus will produce erroneous results. Whether a corpus is labeled manually or automatically, faulty cases will occur. Because of human factors or the annotations used for labeling, treebanks will contain faulty labels. Detecting and correcting possible faulty labels is therefore very important for increasing the accuracy of the studies to be carried out. Manual correction of possible faulty labels requires great effort and time. The purpose of this study is to create a model that automatically finds possible faulty labels and suggests new labels for them. The proposed model aims to detect and correct possible faulty labels in a text corpus and to increase consistency among the corpora of the same language. Having the model suggest new labels for faulty ones is a great convenience for the language expert who reviews them. Another advantage of the model is its language-independent structure. In experimental studies for Turkish, the model succeeded in finding and correcting potentially faulty labels. An increase in accuracy was also observed in studies carried out for languages other than Turkish. The accuracy of the results obtained by the system was analyzed with the help of 10 different language experts.
dc.identifier.doi	10.34028/iajit/18/3/12
dc.identifier.endpage	364
dc.identifier.issn	1683-3198
dc.identifier.issue	3
dc.identifier.startpage	356
dc.identifier.uri	https://doi.org/10.34028/iajit/18/3/12
dc.identifier.uri	https://iajit.org/PDF/May%202021,%20No.%203/19927.pdf
dc.identifier.uri	https://hdl.handle.net/11452/42581
dc.identifier.volume	18
dc.identifier.wos	000667208600012
dc.indexed.wos	WOS.SCI
dc.language.iso	en
dc.publisher	Zarka Private Univ
dc.relation.journal	International Arab Journal of Information Technology
dc.relation.publicationcategory	Article - International Refereed Journal
dc.rights	info:eu-repo/semantics/openAccess
dc.subject	Errors
dc.subject	Natural language processing
dc.subject	Dependency parsing
dc.subject	Universal dependency
dc.subject	Error detection
dc.subject	Treebank consistency
dc.subject	Science & technology
dc.subject	Technology
dc.subject	Computer science, artificial intelligence
dc.subject	Computer science, information systems
dc.subject	Engineering, electrical & electronic
dc.subject	Computer science
dc.subject	Engineering
dc.title	A new approach to automatically find and fix erroneous labels in dependency parsing treebanks
dc.type	Article
dspace.entity.type	Publication
local.contributor.department	Faculty of Engineering/Department of Computer Engineering
relation.isAuthorOfPublication	cf59076b-d88e-4695-a08c-b06b98b4e25a
relation.isAuthorOfPublication.latestForDiscovery	cf59076b-d88e-4695-a08c-b06b98b4e25a