Compounding is one of the most productive word formation processes in many languages and is therefore a main source of data sparsity in language modeling. Many solutions have been suggested to model compound words, most of which break the compound into its constituents and train a new model with them. In earlier work, we argued that this approach is suboptimal and we presented a novel technique that clusters new, domain-specific compound words together with their semantic heads. The clusters were then used to build a class-based n-gram model that enabled a reliable estimation of n-gram probabilities, without the need for additional training data. In this paper, we investigate how this "semantic head mapping" can best be made an integral part of the language modeling strategy. We find that, with some adaptations, our technique is capable of producing more accurate compound probability estimates than a baseline word-based n-gram language model, and that these estimates lead to a significant word error rate reduction for Dutch read speech.
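To make the idea of a class-based estimate concrete, here is a minimal sketch of how an unseen compound could borrow probability mass from its head. It is not the authors' implementation: the toy corpus, the function names `find_head` and `compound_bigram_prob`, the longest-known-suffix rule for finding the head, the unsmoothed bigram counts, and the uniform within-class distribution are all assumptions made for illustration.

```python
from collections import Counter

# Toy Dutch training text; a real model would use a large corpus.
corpus = "de voorzitter sprak de voorzitter zweeg de minister sprak".split()
unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))
vocabulary = set(corpus)


def find_head(compound, vocabulary, min_len=3):
    """Return the longest proper suffix of `compound` that is a known word.

    Assumption: the head is the rightmost constituent and is in-vocabulary.
    """
    for start in range(1, len(compound) - min_len + 1):
        suffix = compound[start:]
        if suffix in vocabulary:
            return suffix
    return None


def compound_bigram_prob(prev, compound, class_size=2):
    """Estimate P(compound | prev) as P(head | prev) * P(compound | class).

    Assumption: the class holds `class_size` members (the head plus its
    mapped compounds) with a uniform within-class distribution.
    """
    head = find_head(compound, vocabulary)
    if head is None or unigrams[prev] == 0:
        return 0.0
    p_head = bigrams[(prev, head)] / unigrams[prev]  # unsmoothed MLE
    p_member = 1.0 / class_size
    return p_head * p_member


head = find_head("fractievoorzitter", vocabulary)
prob = compound_bigram_prob("de", "fractievoorzitter")
```

In this toy setting the unseen compound "fractievoorzitter" is mapped to the head "voorzitter", and its probability after "de" is 2/3 × 1/2 = 1/3, whereas a plain word-based bigram model would have no counts for it at all. A practical system would add smoothing and a better-informed within-class distribution.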
from #MedicinebyAlexandrosSfakianakis via xlomafota13 on Inoreader http://ift.tt/2cQqx2T
Monday, 12 September 2016
Improving N-Gram probability estimates by compound-head clustering