Background: The large-scale analysis of phenomic data (i.e., the full set of phenotypic traits of an organism, such as shape, metabolic substrates, and growth conditions) in microbial bioinformatics has been hampered by the lack of tools to rapidly and accurately extract phenotypic data from existing legacy text in the field of microbiology. To quickly obtain knowledge on the distribution and evolution of microbial traits, an information extraction system was needed that could extract phenotypic characters from large numbers of taxonomic descriptions for use as input to existing phylogenetic analysis software packages.
Results: We report the development and evaluation of Microbial Phenomics Information Extractor (MicroPIE, version 0.1.0). MicroPIE is a natural language processing application that uses a robust supervised classification algorithm (Support Vector Machine, SVM) to identify the characters described in sentences of prokaryotic taxonomic descriptions, followed by a combination of algorithms that apply linguistic rules together with groups of known terms to extract characters and their character states. The input to MicroPIE is a set of taxonomic descriptions (clean text). The output is a taxon-by-character matrix, with taxa in the rows and a set of 42 pre-defined characters (e.g., optimum growth temperature) in the columns. The performance of MicroPIE was evaluated against a gold standard matrix and against a second matrix made by undergraduate students. Compared to the gold standard, MicroPIE extracted 21 characters (50%) with a Relaxed F1 score > 0.80 and 16 characters (38%) with Relaxed F1 scores between 0.50 and 0.80. Inclusion of a character prediction component (SVM) improved the overall performance of MicroPIE, notably the precision. Evaluated against the same gold standard, MicroPIE performed significantly better than the undergraduate students.
Conclusion: MicroPIE is a promising new tool for the rapid and efficient extraction of phenotypic character information from prokaryotic taxonomic descriptions. However, further development, including incorporation of ontologies, will be necessary to improve the performance of the extraction for some character types.
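To make the shape of the output and the evaluation concrete, here is a minimal Python sketch of a term-and-rule extractor that fills a taxon-by-character matrix and scores one character against a gold standard. It is not MicroPIE's code: the term lists, the temperature rule, the function names, and the sample descriptions are all hypothetical, the SVM sentence-classification step is left out, and the score uses exact matching of character states, which is a simplification and not necessarily the paper's Relaxed F1.

```python
import re

# Hypothetical groups of known terms; MicroPIE's real term lists are not
# given in the abstract.
CHARACTER_TERMS = {
    "cell shape": ["rod-shaped", "coccoid", "spiral"],
    "gram stain": ["gram-negative", "gram-positive"],
}

# Hypothetical linguistic rule for one numeric character.
TEMP_RULE = re.compile(
    r"optimum (?:growth )?temperature (?:is|of) (\d+(?:\.\d+)?)\s*°?C",
    re.IGNORECASE,
)


def extract_characters(description):
    """Return {character: set of character states} for one description."""
    states = {}
    lowered = description.lower()
    for character, terms in CHARACTER_TERMS.items():
        found = {term for term in terms if term in lowered}
        if found:
            states[character] = found
    match = TEMP_RULE.search(description)
    if match:
        states["optimum growth temperature"] = {match.group(1)}
    return states


def build_matrix(descriptions):
    """Build a taxon-by-character matrix as {taxon: {character: states}}."""
    return {taxon: extract_characters(text) for taxon, text in descriptions.items()}


def f1_for_character(extracted, gold, character):
    """Exact-match F1 for one character (an assumed, simplified metric)."""
    tp = fp = fn = 0
    for taxon, gold_row in gold.items():
        predicted = extracted.get(taxon, {}).get(character, set())
        expected = gold_row.get(character, set())
        tp += len(predicted & expected)
        fp += len(predicted - expected)
        fn += len(expected - predicted)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Invented example descriptions and gold standard, for illustration only.
    descriptions = {
        "Taxon A": "Cells are Gram-negative, rod-shaped. "
                   "Optimum growth temperature is 37 °C.",
        "Taxon B": "Cells are coccoid and Gram-positive.",
    }
    gold = {
        "Taxon A": {"cell shape": {"rod-shaped"}, "gram stain": {"gram-negative"}},
        "Taxon B": {"cell shape": {"coccoid"}, "gram stain": {"gram-positive"}},
    }
    matrix = build_matrix(descriptions)
    for taxon, row in matrix.items():
        print(taxon, row)
    print("cell shape F1:", f1_for_character(matrix, gold, "cell shape"))
```

Scoring each character separately mirrors how the abstract reports results per character; a real system would also need to handle negation, synonyms, and value ranges, which plain substring matching does not.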
from #MedicinebyAlexandrosSfakianakis via xlomafota13 on Inoreader http://ift.tt/2hovkGt
Monday, 12 December 2016
Microbial phenomics information extractor (MicroPIE): a natural language processing tool for the automated acquisition of prokaryotic phenotypic characters from text sources