Tuesday, 4 June 2019

Computer Assisted Radiology and Surgery

EasyLabels: weak labels for scene segmentation in laparoscopic videos

Abstract

Purpose

We present a different approach to weakly annotating laparoscopic images for segmentation and show experimentally that a network trained on these annotations with partial cross-entropy achieves accuracy close to that obtained with fully supervised approaches.

Methods

We propose an approach that relies on weak annotations provided as stripes over the different objects in the image and partial cross-entropy as the loss function of a fully convolutional neural network to obtain a dense pixel-level prediction map.
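As a minimal sketch of the partial cross-entropy idea, assume that pixels without a stripe annotation carry a hypothetical marker value `UNLABELED` and that the network outputs per-pixel class probabilities. The loss is then the usual cross-entropy averaged over the annotated pixels only. The function name and data layout are illustrative and are not taken from the paper.

```python
import math

UNLABELED = -1  # hypothetical marker for pixels without a stripe annotation


def partial_cross_entropy(probs, labels, eps=1e-12):
    """Mean cross-entropy over labeled pixels only.

    probs  -- list of per-pixel class-probability lists (each sums to 1)
    labels -- list of per-pixel class indices, or UNLABELED
    """
    total = 0.0
    count = 0
    for pixel_probs, label in zip(probs, labels):
        if label == UNLABELED:
            continue  # unlabeled pixels contribute nothing to the loss
        total += -math.log(max(pixel_probs[label], eps))
        count += 1
    return total / count if count else 0.0


# Four pixels, two classes; only the first and last pixel are annotated.
probs = [[0.9, 0.1], [0.5, 0.5], [0.2, 0.8], [0.25, 0.75]]
labels = [0, UNLABELED, UNLABELED, 1]
loss = partial_cross_entropy(probs, labels)
```

Because the two unannotated pixels are skipped, changing their predicted probabilities leaves `loss` unchanged.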

Results

We validate our method on three different datasets, providing qualitative results for all of them and quantitative results for two of them. The experiments show that our approach is able to obtain at least \(90\%\) of the accuracy obtained with fully supervised methods for all the tested datasets, while requiring \(\sim 13\times\) less time to create the annotations compared to full supervision.

Conclusions

With this work, we demonstrate that laparoscopic data can be segmented using very little annotated data while maintaining levels of accuracy comparable to those obtained with full supervision.



Position-based modeling of lesion displacement in ultrasound-guided breast biopsy

Abstract

Purpose

Although ultrasound (US) images represent the most popular modality for guiding breast biopsy, malignant regions are often missed by sonography, thus preventing accurate lesion localization, which is essential for a successful procedure. Biomechanical models can support the localization of suspicious areas identified on a preoperative image during US scanning since they are able to account for anatomical deformations resulting from US probe pressure. We propose a deformation model which relies on a position-based dynamics (PBD) approach to predict the displacement of internal targets induced by probe interaction during US acquisition.

Methods

The PBD implementation available in NVIDIA FleX is exploited to create an anatomical model capable of deforming online. Simulation parameters are initialized on a calibration phantom under different levels of probe-induced deformations; then, they are fine-tuned by minimizing the localization error of a US-visible landmark of a realistic breast phantom. The updated model is used to estimate the displacement of other internal lesions due to probe-tissue interaction.
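To illustrate what a position-based dynamics solver does at its core, the sketch below projects a single distance constraint between two particles, in the standard PBD formulation. It does not use the NVIDIA FleX API, and the particle names, the 2D setting, and the parameter values are assumptions made for the example.

```python
import math


def project_distance_constraint(p1, p2, w1, w2, rest_length, stiffness=1.0):
    """Move two particles so their distance approaches rest_length.

    p1, p2 -- positions as (x, y) tuples
    w1, w2 -- inverse masses (0 pins a particle in place)
    """
    dx, dy = p1[0] - p2[0], p1[1] - p2[1]
    dist = math.hypot(dx, dy)
    if dist == 0.0 or w1 + w2 == 0.0:
        return p1, p2  # direction undefined or both particles pinned
    # Signed constraint violation, shared between particles by inverse mass.
    scale = stiffness * (dist - rest_length) / (dist * (w1 + w2))
    new_p1 = (p1[0] - w1 * scale * dx, p1[1] - w1 * scale * dy)
    new_p2 = (p2[0] + w2 * scale * dx, p2[1] + w2 * scale * dy)
    return new_p1, new_p2


# A pinned "probe" particle (w=0) sits 0.5 away from a tissue particle whose
# rest distance is 1.0, so the projection pushes the tissue particle away.
probe, tissue = (0.0, 0.0), (0.5, 0.0)
probe, tissue = project_distance_constraint(probe, tissue, 0.0, 1.0, 1.0)
```

A full solver would repeat such projections over many constraints per time step; parameters like `stiffness` are the kind of quantity that a calibration step could tune.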

Results

The localization error obtained when applying the PBD model remains below 11 mm for all the tumors, even for input displacements on the order of 30 mm. The proposed method obtains results aligned with finite element (FE) models with faster computational performance, suitable for real-time applications. In addition, it outperforms the rigid model used to track lesion position in US-guided breast biopsies, at least halving the localization error for all the displacement ranges considered.

Conclusion

The position-based dynamics approach has proved successful in modeling breast tissue deformations during US acquisition. Its stability, accuracy and real-time performance make such a model suitable for tracking lesion displacement during US-guided breast biopsy.



IJCARS—IPCAI 2019 special issue: conference information processing for computer-assisted interventions, 10th international conference 2019—part 1


Automatic self-gated 4D-MRI construction from free-breathing 2D acquisitions applied on liver images

Abstract

Purpose

MRI slice reordering is a necessary step when three-dimensional (3D) motion of an anatomical region of interest has to be extracted from multiple two-dimensional (2D) dynamic acquisition planes, e.g., for the construction of motion models used for image-guided radiotherapy. Existing reordering methods focus on obtaining a spatially coherent reconstructed volume for each time. However, little attention has been paid to the temporal coherence of the reconstructed volumes, which is of primary importance for accurate 3D motion extraction. This paper proposes a fully automatic self-sorting four-dimensional MR volume construction method that ensures the temporal coherence of the results.

Methods

First, a pseudo-navigator signal is extracted for each 2D dynamic slice acquisition series. Then, a weighted graph is created using both spatial and motion information provided by the pseudo-navigator. The volume at a given time point is reconstructed by following the shortest paths in the graph, starting from that time point of a reference slice chosen based on its pseudo-navigator signal.
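The shortest-path step can be sketched with Dijkstra's algorithm on a toy graph. The node encoding as `(slice_index, time_index)` pairs and the edge weights are made up for illustration. The abstract does not specify how the actual weights combine spatial and motion information, so this shows only the generic graph search.

```python
import heapq


def shortest_paths(graph, source):
    """Dijkstra: return (distance, predecessor) dicts from source.

    graph -- dict mapping node -> list of (neighbor, non-negative weight)
    """
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, weight in graph.get(node, []):
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                prev[neighbor] = node
                heapq.heappush(heap, (candidate, neighbor))
    return dist, prev


# Toy graph: nodes are (slice_index, time_index) pairs; weights are made up.
graph = {
    (0, 3): [((1, 2), 0.4), ((1, 5), 0.1)],
    (1, 2): [((2, 7), 0.2)],
    (1, 5): [((2, 7), 0.6), ((2, 1), 0.3)],
}
dist, prev = shortest_paths(graph, (0, 3))
```

Following `prev` back from a node in a neighboring slice recovers which time frame of each slice was chained to the reference frame `(0, 3)`.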

Results

The proposed method is evaluated against two state-of-the-art slice reordering algorithms on a prospective dataset of 12 volunteers using both spatial and temporal quality metrics. The automated end-exhale extraction showed results close to the median value of the manual operators. Furthermore, the results of the validation metrics show that the proposed method outperforms state-of-the-art methods in terms of both spatial and temporal quality.

Conclusion

Our approach is able to automatically detect the end-exhale phases within one given anatomical position and cope with irregular breathing.



Uncertainty-aware performance assessment of optical imaging modalities with invertible neural networks

Abstract

Purpose

Optical imaging is evolving as a key technique for advanced sensing in the operating room. Recent research has shown that machine learning algorithms can be used to address the inverse problem of converting pixel-wise multispectral reflectance measurements to underlying tissue parameters, such as oxygenation. Assessment of the specific hardware used in conjunction with such algorithms, however, has not properly addressed the possibility that the problem may be ill-posed.

Methods

We present a novel approach to the assessment of optical imaging modalities, which is sensitive to the different types of uncertainties that may occur when inferring tissue parameters. Based on the concept of invertible neural networks, our framework goes beyond point estimates and maps each multispectral measurement to a full posterior probability distribution which is capable of representing ambiguity in the solution via multiple modes. Performance metrics for a hardware setup can then be computed from the characteristics of the posteriors.

Results

Application of the assessment framework to the specific use case of camera selection for physiological parameter estimation yields the following insights: (1) estimation of tissue oxygenation from multispectral images is a well-posed problem, while (2) blood volume fraction may not be recovered without ambiguity. (3) In general, ambiguity may be reduced by increasing the number of spectral bands in the camera.

Conclusion

Our method could help to optimize optical camera design in an application-specific manner.



Weakly supervised convolutional LSTM approach for tool tracking in laparoscopic videos

Abstract

Purpose

Real-time surgical tool tracking is a core component of the future intelligent operating room (OR), because it is instrumental in analyzing and understanding surgical activities. Current methods for surgical tool tracking in videos need to be trained on data in which the spatial positions of the tools are manually annotated. Generating such training data is difficult and time-consuming. Instead, we propose to use solely binary presence annotations to train a tool tracker for laparoscopic videos.

Methods

The proposed approach is composed of a CNN + Convolutional LSTM (ConvLSTM) neural network trained end to end, but weakly supervised on tool binary presence labels only. We use the ConvLSTM to model the temporal dependencies in the motion of the surgical tools and leverage its spatiotemporal ability to smooth the class peak activations in the localization heat maps (Lh-maps).

Results

We build a baseline tracker on top of the CNN model and demonstrate that our approach based on the ConvLSTM outperforms the baseline in tool presence detection, spatial localization, and motion tracking by over \(5.0\%\), \(13.9\%\), and \(12.6\%\), respectively.

Conclusions

In this paper, we demonstrate that binary presence labels are sufficient for training a deep learning tracking model using our proposed method. We also show that the ConvLSTM can leverage the spatiotemporal coherence of consecutive image frames across a surgical video to improve tool presence detection, spatial localization, and motion tracking.



Deformable multimodal registration for navigation in beating-heart cardiac surgery

Abstract

Purpose

Minimally invasive beating-heart surgery is currently performed using endoscopes and without navigation. Registration of intraoperative ultrasound to a preoperative cardiac CT scan is a valuable step toward image-guided navigation.

Methods

The registration was achieved by first extracting a representative point set from each ultrasound image in the sequence using a deformable registration. A template shape representing the cardiac chambers was deformed through a hierarchy of affine transformations to match each ultrasound image using a generalized expectation maximization algorithm. These extracted point sets were matched to the CT by exhaustively searching over a large number of precomputed slices of 3D geometry. The result is a similarity transformation mapping the intraoperative ultrasound to preoperative CT.

Results

Complete data sets were acquired for four patients. Transesophageal echocardiography ultrasound sequences were deformably registered to a model of oriented points with a mean error of 2.3 mm. Ultrasound and CT scans were registered with a mean error of 3 mm, which is comparable to the error of 2.8 mm expected by merging ultrasound registration with uncertainty of cardiac CT.

Conclusion

The proposed algorithm registered 3D CT with dynamic 2D intraoperative imaging. The algorithm aligned the images in both space and time, needing neither dynamic CT imaging nor intraoperative electrocardiograms. The accuracy was sufficient for navigation in thoracoscopically guided beating-heart surgery.



Catheter localization in 3D ultrasound using voxel-of-interest-based ConvNets for cardiac intervention

Abstract

Purpose

Efficient image-based catheter localization in 3D US during cardiac interventions is highly desired, since it facilitates the operation procedure, reduces the patient risk and improves the outcome. Current image-based catheter localization methods are not efficient or accurate enough for real clinical use.

Methods

We propose a catheter localization method for 3D cardiac ultrasound (US). The catheter candidate voxels are first pre-selected by the Frangi vesselness filter with adaptive thresholding, after which a triplanar-based ConvNet is applied to classify the remaining voxels as catheter or not. We propose a Share-ConvNet for 3D US, which reduces the computational complexity by sharing a single ConvNet for all orthogonal slices. To boost the performance of the ConvNet, we also employ two-stage training with weighted cross-entropy. Using the classified voxels, the catheter is localized by a model fitting algorithm.
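The triplanar input can be pictured as three orthogonal 2D patches cut through a candidate voxel, each of which could then be fed to the same shared 2D network. The sketch below covers only the patch extraction, on a nested-list volume. The function name, the `volume[z][y][x]` indexing, and the patch size are assumptions for illustration, not details from the paper.

```python
def triplanar_patches(volume, z, y, x, half):
    """Return the three orthogonal 2D patches centered on voxel (z, y, x).

    volume -- nested list indexed as volume[z][y][x]
    half   -- patch half-width; the voxel must lie at least `half` voxels
              away from every face of the volume
    """
    zs = range(z - half, z + half + 1)
    ys = range(y - half, y + half + 1)
    xs = range(x - half, x + half + 1)
    xy_plane = [[volume[z][j][i] for i in xs] for j in ys]  # fixed z
    xz_plane = [[volume[k][y][i] for i in xs] for k in zs]  # fixed y
    yz_plane = [[volume[k][j][x] for j in ys] for k in zs]  # fixed x
    return xy_plane, xz_plane, yz_plane


# 5x5x5 toy volume whose voxel value encodes its own (z, y, x) index.
volume = [[[100 * k + 10 * j + i for i in range(5)] for j in range(5)]
          for k in range(5)]
xy_plane, xz_plane, yz_plane = triplanar_patches(volume, 2, 2, 2, half=1)
```

All three patches share the center voxel, and because they have the same shape a single 2D classifier can process each of them in turn.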

Results

To validate our method, we have collected challenging ex vivo datasets. Extensive experiments show that the proposed method outperforms state-of-the-art methods and can localize the catheter with an average error of 2.1 mm in around 10 s per volume.

Conclusion

Our method can automatically localize the cardiac catheter in challenging 3D cardiac US images. The efficiency and localization accuracy of the proposed method are considered promising for catheter detection and localization during clinical interventions.



Toward an automatic preoperative pipeline for image-guided temporal bone surgery

Abstract

Purpose

Minimally invasive surgery is often built upon a time-consuming preoperative step consisting of segmentation and trajectory planning. At the temporal bone, a complete automation of these two tasks might lead to faster interventions and more reproducible results, benefiting clinical workflow and patient health.

Methods

We propose an automatic segmentation and trajectory planning pipeline for image-guided interventions at the temporal bone. For segmentation, we use a shape-regularized deep learning approach that is capable of automatically detecting even the cluttered tiny structures specific to this anatomy. We then perform trajectory planning for both linear and nonlinear interventions on these automatically segmented risk structures.

Results

We evaluate the usability of segmentation algorithms for planning access canals to the cochlea and the internal auditory canal on 24 CT data sets of real patients. Our new approach achieves similar results to the existing semiautomatic method in terms of Dice but provides more accurate organ shapes for the subsequent trajectory planning step. The source code of the algorithms is publicly available.
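For reference, the Dice score used in this comparison measures the overlap between a predicted and a reference mask. The sketch below computes it on flattened binary masks. The example masks are made up, and returning 1.0 when both masks are empty is a convention chosen here, not something stated in the abstract.

```python
def dice_coefficient(pred, truth):
    """Dice = 2 * |A and B| / (|A| + |B|) for two flat binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    size_sum = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if size_sum == 0 else 2.0 * intersection / size_sum


# Made-up masks: three foreground voxels each, two of them shared.
pred = [1, 1, 0, 0, 1, 0]
truth = [1, 0, 0, 1, 1, 0]
score = dice_coefficient(pred, truth)
```

Dice counts overlapping voxels only, so two segmentations with similar scores can still differ in shape, which is why shape accuracy is reported separately from Dice.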

Conclusion

Automatic segmentation and trajectory planning for various clinical procedures at the temporal bone are feasible. The proposed automatic pipeline leads to an efficient and unbiased workflow for preoperative planning.



Face detection in the operating room: comparison of state-of-the-art methods and a self-supervised approach

Abstract

Purpose

Face detection is a needed component for the automatic analysis and assistance of human activities during surgical procedures. Efficient face detection algorithms can indeed help to detect and identify the persons present in the room and also be used to automatically anonymize the data. However, current algorithms trained on natural images do not generalize well to the operating room (OR) images. In this work, we provide a comparison of state-of-the-art face detectors on OR data and also present an approach to train a face detector for the OR by exploiting non-annotated OR images.

Methods

We propose a comparison of six state-of-the-art face detectors on clinical data using multi-view OR faces, a dataset of OR images capturing real surgical activities. We then propose to use self-supervision, a domain adaptation method, for the task of face detection in the OR. The approach makes use of non-annotated images to fine-tune a state-of-the-art detector for the OR without using any human supervision.

Results

The results show that the best model, namely the tiny face detector, yields an average precision of 0.556 at intersection over union of 0.5. Our self-supervised model using non-annotated clinical data outperforms this result by 9.2%.
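Average precision "at intersection over union of 0.5" means that a detection counts as correct only if its box overlaps a ground-truth face box with IoU of at least 0.5. The sketch below shows that overlap test. The box format `(x_min, y_min, x_max, y_max)` and the example coordinates are assumptions for illustration.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Made-up boxes: two 40x40 squares offset by 10 pixels in each direction.
detection = (10, 10, 50, 50)
ground_truth = (20, 20, 60, 60)
iou = box_iou(detection, ground_truth)
is_match = iou >= 0.5  # threshold used in the reported metric
```

Here the overlap is 900 of 2300 square pixels, so this detection would not count as a match at the 0.5 threshold.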

Conclusion

We present the first comparison of state-of-the-art face detectors on OR images and show that results can be significantly improved by using self-supervision on non-annotated data.



Alexandros Sfakianakis
Anapafseos 5, Agios Nikolaos
Crete, Greece, 72100
2841026182
6948891480
