Publication date: November 2017
Source: Computer Speech & Language, Volume 46
Author(s): Scott Piao, Fraser Dallachy, Alistair Baron, Jane Demmen, Steve Wattam, Philip Durkin, James McCracken, Paul Rayson, Marc Alexander
Automatic extraction and analysis of meaning-related information from natural language data has been an important issue in a number of research areas, such as natural language processing (NLP), text mining, corpus linguistics, and data science. An important aspect of such information extraction and analysis is the semantic annotation of language data using a semantic tagger. In practice, various semantic annotation tools have been designed to carry out different levels of semantic annotation, such as topics of documents, semantic role labeling, named entities or events. Currently, the majority of existing semantic annotation tools identify and tag partial core semantic information in language data, but they tend to be applicable only to modern language corpora. While such semantic analyzers have proven useful for various purposes, a semantic annotation tool that is capable of annotating deep semantic senses of all lexical units, or all-words tagging, is still desirable for a deep, comprehensive semantic analysis of language data. With large-scale digitization efforts underway, delivering historical corpora with texts dating from the last 400 years, a particularly challenging aspect is the need to adapt the annotation in the face of significant word meaning change over time. In this paper, we report on the development of a new semantic tagger (the Historical Thesaurus Semantic Tagger), and discuss challenging issues we faced in this work. This new semantic tagger is built on existing NLP tools and incorporates a large-scale historical English thesaurus linked to the Oxford English Dictionary. Employing contextual disambiguation algorithms, this tool is capable of annotating lexical units with a historically valid, highly fine-grained semantic categorization scheme that contains about 225,000 semantic concepts and 4,033 thematic semantic categories.
In terms of novelty, it is adapted for processing historical English data, with rich information about historical usage of words and a spelling variant normalizer for historical forms of English. Furthermore, it is able to make use of knowledge about the publication date of a text to adapt its output. In our evaluation, the system achieved encouraging accuracies ranging from 77.12% to 91.08% on individual test texts. Applying time-sensitive methods improved results by as much as 3.54% and by 1.72% on average.
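The idea of using a text's publication date to adapt the tagger's output can be illustrated with a small sketch. This is not the authors' implementation: the `Sense` record, the `TOY_LEXICON` entries, the attestation years, and the fallback rule are all assumptions made up for illustration. It only shows one plausible way to restrict candidate senses to those attested around the date of a text before any contextual disambiguation runs.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Sense:
    """Hypothetical sense record; field names are assumptions."""
    category: str                     # illustrative semantic category label
    first_year: int                   # first attested year (invented value)
    last_year: Optional[int] = None   # None means the sense is still current


# Toy lexicon. The dates are invented for illustration and are NOT taken
# from the Historical Thesaurus or the Oxford English Dictionary.
TOY_LEXICON: Dict[str, List[Sense]] = {
    "nice": [
        Sense("foolish", 1300, 1600),
        Sense("pleasant", 1750, None),
    ],
}


def senses_valid_in(word: str, year: int,
                    lexicon: Dict[str, List[Sense]] = TOY_LEXICON) -> List[Sense]:
    """Return the candidate senses whose attestation span covers `year`.

    If no sense covers the year, fall back to all candidates so that a
    later disambiguation step still has something to choose from
    (an assumed design choice, not one described in the abstract).
    """
    candidates = lexicon.get(word.lower(), [])
    valid = [
        s for s in candidates
        if s.first_year <= year and (s.last_year is None or year <= s.last_year)
    ]
    return valid or candidates


if __name__ == "__main__":
    for year in (1550, 1700, 1800):
        print(year, [s.category for s in senses_valid_in("nice", year)])
```

Filtering before disambiguation keeps the two concerns separate: the date narrows the candidate set, and a contextual method then picks among what remains.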
Tuesday, 30 May 2017
A time-sensitive historical thesaurus-based semantic tagger for deep semantic annotation