Intuitive Slice-based Exploration of Volumetric Medical Data

Medical imaging technologies have become an essential component of many areas of health care. Volume visualization (VV) of medical data is an invaluable support in tasks such as clinical diagnosis, treatment planning, surgery rehearsal, education, and research. Several algorithms and systems have been developed to enable visualization of, and interaction with, volumetric data. Slice-based visualization methods dominate the exploration of medical volumes since they allow a more detailed analysis of the data. However, intensive training is usually required before a user can explore the data effectively. In this paper, we present a novel slice-based methodology whose objective is to facilitate the exploration of medical volumetric data. The proposed method uses augmented reality principles to determine the spatial position and orientation of rigid planar objects within a defined real-world space that represents the medical volumetric information. The results of a usability study indicate the feasibility of employing this technique for natural human-computer interaction with medical data, with the potential to make medical volume exploration easier and more intuitive.


INTRODUCTION
structural, morphologic and functional information in the form of 3D or higher-dimensional data (i.e., volumetric data), which is crucial for the proper diagnosis, treatment, and evaluation of the progress of patients [1]. Analysis of these data requires sophisticated computerized quantification and visualization tools [2]. Medical visualization includes modalities ranging from a single 2D image, through series of 2D images formed by thin slices of an anatomic volume and 3D images, to 4D images that represent how a volume of interest changes over time [3]. 3D volume visualization (VV) of medical data, such as those obtained using confocal microscopy, computed tomography (CT), magnetic resonance imaging (MRI), positron emission tomography (PET), or functional MRI, is an important aid for diagnosis, treatment planning, surgery rehearsal, education, and research [4].
Current visualization systems consist of one or more graphical display devices (e.g., 2D and stereoscopic 3D screens) and one or more navigation devices (e.g., mouse, touchpad, touchscreen, tactile and force feedback, multiaxis, and voice-commanded controllers) [5,6]. Several methods have been developed to enable the visualization of, and interaction with, volumetric data [7,8,9,10]. These methods can be grouped into three basic categories: slice-based, surface rendering, and volume rendering [11].
Slice-based methods consist of extracting sections of a medical volume using cutting planes that can be orthogonal (i.e., axial, coronal, and sagittal) or arbitrary if computed employing Multiplanar Reconstruction (MPR).
Surface rendering methods consist of extracting contours and constructing a polygonal mesh to define the surface of the structure to be visualized. Volume rendering methods make 3D data visible by assigning specific values of color and opacity to each volume element (voxel) and projecting the result onto the image plane.
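As a rough illustration of the volume rendering idea described above, the following minimal sketch composites scalar samples along a single viewing ray from front to back. The function names, the grayscale simplification, and the early-termination threshold are assumptions made for this example and do not come from any particular system.

```python
def composite_ray(samples, transfer):
    """Front-to-back compositing of scalar samples along one ray.

    `transfer` is a hypothetical transfer function mapping a voxel value
    to an (intensity, opacity) pair, both in the range 0..1.
    """
    color, alpha = 0.0, 0.0
    for value in samples:
        c, a = transfer(value)
        # Each sample contributes in proportion to the remaining transparency.
        color += (1.0 - alpha) * a * c
        alpha += (1.0 - alpha) * a
        if alpha >= 0.99:  # early ray termination (assumed threshold)
            break
    return color, alpha
```

Repeating this for one ray per pixel of the image plane yields the projected image; a color transfer function would carry three intensity channels instead of one.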
While 3D visualization often provides an overview of the spatial relations of the regions of interest, which is useful for the diagnosis of complex pathologies, 2D visualization such as slice-based methods remains the dominant mode of exploring volumes since it enables a more detailed analysis of the data [12,13].
However, extracting 2D slices with the desired orientation within a volume is a challenging task because traditional user interfaces employ mediating devices (e.g., mouse and keyboard) and planar surfaces (e.g., monitors and displays) that operate with 2D projections of 3D data [14], increasing the end user's cognitive effort [15]. Therefore, one possible solution for VV is the direct exploration of 3D images through a physical real-world environment, for example by means of augmented reality technologies, instead of a fixed 3D virtual space representation [16].
Augmented reality (AR) refers to technologies that allow users to visualize and interact with computer-generated images superimposed over their view of the real world. To achieve a convincing augmentation effect, it is necessary to accurately estimate the position and orientation of the relevant objects in physical space and to align (register) them with the virtual objects, considering the point of view of the observer, interactively, in real time and in three dimensions [17]. This is known as the pose estimation problem and consists of extracting 3D spatial information from 2D images by using a camera as a measuring device. The pose of a 3D rigid object indicates its position t and its orientation R with respect to a global reference system (Fig. 1). Numerous solutions to this inverse problem have been proposed over time [18]. In general, it is possible to estimate the object pose if the correspondences between some 3D features and their projections on the image plane are known.
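To make the roles of R and t concrete, the sketch below shows the forward model that pose estimation inverts: a 3D point in the object frame is moved into the camera frame with R and t and then projected onto the image plane. It assumes an ideal pinhole camera without lens distortion; the intrinsic parameters (fx, fy, cx, cy) and the function name are introduced here for illustration only.

```python
def project_point(R, t, intrinsics, X):
    """Project a 3D point X (object frame) to pixel coordinates (u, v).

    R: 3x3 rotation as a list of rows, t: translation vector,
    intrinsics: (fx, fy, cx, cy) of an assumed ideal pinhole camera.
    """
    # Rigid transform into the camera frame: Xc = R X + t
    Xc = [sum(R[i][j] * X[j] for j in range(3)) + t[i] for i in range(3)]
    if Xc[2] <= 0:
        raise ValueError("point is not in front of the camera")
    fx, fy, cx, cy = intrinsics
    # Perspective division followed by the intrinsic scaling and offset
    u = fx * Xc[0] / Xc[2] + cx
    v = fy * Xc[1] / Xc[2] + cy
    return (u, v)
```

Pose estimation searches for the R and t under which the projections of the known 3D features best match their observed image positions.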
Perspective-n-Point (PnP) is a special case of pose estimation (6 DoF) [19]. PnP algorithms are applicable only if a set of 3D-2D correspondences has previously been established. In general, these correspondences can be obtained using either marker-based or markerless approaches. In marker-based AR, one or more cameras are employed to automatically detect and recognize a specific known pattern (i.e., fiducial markers) and use  [20]. Other markerless methods use information from sensors such as GPS, depth cameras, or ultrasonic, magnetic, and inertial devices (or some combination of them) to estimate the spatial position and orientation of objects [21,22,23]. Another important problem is the visualization of the virtual objects within the real world. This is generally accomplished with head-mounted displays, holographic displays, or by superimposing images on a computer, cell phone, or tablet screen that shows the images acquired by a camera. It has been shown that the use of augmented reality technologies in health care has the potential to improve existing techniques and procedures in areas such as health care education [24,25,26], medical training [25,27], mentoring [28], and assisted interventions [29,30,31].
In this paper, we present a slice-based methodology for the exploration of 3D medical data. We use the aug-    The proposed method is based on the idea of mapping a real-world space to a virtual space where a virtual volume can be explored through the manipulation of a tangible object. In particular, we propose tracking the position and orientation of a rigid planar object with respect to a defined space. Then, we use the spatial configuration of the object to define a plane that intersects the virtual volume. Finally, a 2D slice of the 3D medical image is generated (Fig. 5). In this work, we present two implementations of the proposed methodology that differ in the techniques used to solve the pose estimation problem.
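The final step, generating a 2D slice from a plane that intersects the volume, can be sketched as follows. This is a minimal illustration and not the implementation used in this work: it assumes the plane is already expressed in voxel coordinates by an origin and two in-plane step vectors, stores the volume as nested lists indexed `volume[z][y][x]`, and uses nearest-neighbour sampling. All names are hypothetical.

```python
import math

def extract_slice(volume, origin, u_axis, v_axis, width, height):
    """Sample the volume on the plane origin + i*u_axis + j*v_axis.

    Points are (x, y, z) in voxel coordinates. Samples that fall outside
    the volume are set to 0.
    """
    depth, rows, cols = len(volume), len(volume[0]), len(volume[0][0])
    image = []
    for j in range(height):
        row = []
        for i in range(width):
            x, y, z = (origin[k] + i * u_axis[k] + j * v_axis[k]
                       for k in range(3))
            # Nearest-neighbour lookup (round half up)
            xi, yi, zi = (math.floor(c + 0.5) for c in (x, y, z))
            inside = 0 <= xi < cols and 0 <= yi < rows and 0 <= zi < depth
            row.append(volume[zi][yi][xi] if inside else 0)
        image.append(row)
    return image
```

With `u_axis = (1, 0, 0)` and `v_axis = (0, 1, 0)` this reduces to an axial slice; any other pair of vectors gives an oblique section, which is the case produced by a freely held plane. A practical system would likely use trilinear interpolation instead of nearest-neighbour lookup.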

RGB-D data based method
We the HSV color space. Next, a threshold segmentation is applied to the hue and saturation channels to obtain a binary image that helps us determine the regions of the image that correspond to the spherical markers. Due to illumination changes, the binary image may contain objects other than the four spheres. To remove the non-sphere objects, we apply a connected-component labeling algorithm [32] to determine the area of every object in the binary image, which allows us to discard possible noise (i.e., small pixel regions with color values inside the threshold). By keeping only the four largest objects in the scene, we identify the region belonging to each sphere. Finally, we apply a hole-filling algorithm [33] to remove possible holes in the segmented spheres caused by variations in the illumination of the scene, and we employ a mean filter to smooth the edges of the segmented spheres (Fig. 6).
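The thresholding and labeling steps above can be sketched as follows. This is only an illustration under simplifying assumptions: the image is a nested list of (r, g, b) tuples, connectivity is 4-neighbour, and the threshold values are supplied by the caller. The hole-filling and mean-filter steps are omitted, and the function name and parameters are hypothetical.

```python
import colorsys
from collections import deque

def segment_markers(rgb, hue_range, min_sat, keep=4):
    """Return the `keep` largest connected regions that pass the threshold.

    rgb: rows of (r, g, b) tuples in 0..255.
    hue_range: (low, high) hue bounds in 0..1; min_sat: saturation bound.
    Each region is a list of (row, col) pixel positions.
    """
    h, w = len(rgb), len(rgb[0])
    # Binary mask from the hue and saturation channels
    mask = [[False] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            hue, sat, _ = colorsys.rgb_to_hsv(*(ch / 255.0 for ch in rgb[r][c]))
            mask[r][c] = hue_range[0] <= hue <= hue_range[1] and sat >= min_sat
    # Connected-component labeling with a breadth-first search
    seen = [[False] * w for _ in range(h)]
    regions = []
    for r in range(h):
        for c in range(w):
            if not mask[r][c] or seen[r][c]:
                continue
            queue, region = deque([(r, c)]), []
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append(region)
    # Small regions are treated as noise; only the largest ones are kept
    regions.sort(key=len, reverse=True)
    return regions[:keep]
```

The 2D centroid of each returned region is then simply the mean of its pixel coordinates.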
(1) Then, we need to estimate the coordinates of the centroids of the spherical markers in the real world.
The pseudo-inverse matrix is computed with the singular value decomposition (SVD) algorithm [34].
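For reference, the standard relation between the SVD and the Moore-Penrose pseudo-inverse is given below. The symbol A stands for a generic matrix and b for a generic right-hand side; both are introduced here as assumptions for illustration.

```latex
A = U \Sigma V^{\top}
\quad\Longrightarrow\quad
A^{+} = V \Sigma^{+} U^{\top},
\qquad
(\Sigma^{+})_{ii} =
\begin{cases}
1/\sigma_i, & \sigma_i > 0,\\
0, & \sigma_i = 0,
\end{cases}
```

Here \Sigma^{+} has the shape of \Sigma^{\top}, and x = A^{+} b is the minimum-norm least-squares solution of A x = b.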
A slice of the virtual volume can be extracted by find-

Fiducial markers based method
In planar targets, square fiducial markers simplify the pose estimation problem by providing four key points [35] . The second method relies on the use of fiducial markers that are placed on the objects to track.
In our case, we attached one marker, Mp, to one side of the rigid planar object employed to slice the medical volume, and used a second marker, Mv, as a reference surface that defines the real-world space onto which the virtual volume is mapped. To detect the markers, a fixed RGB camera is placed facing down at a height L2 ≤ 1 meter (Fig. 7).
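Since both markers are observed by the same camera, one plausible way to express the slicing plane in the reference space is to compose the two camera-frame poses. The sketch below assumes that a marker detector has already returned, for each marker, a rotation matrix and a translation vector mapping marker coordinates to camera coordinates; the function names are hypothetical and this is not necessarily the procedure used in this work.

```python
def invert_pose(R, t):
    """Inverse of the rigid transform x -> R x + t."""
    Rt = [[R[j][i] for j in range(3)] for i in range(3)]  # R transposed
    t_inv = [-sum(Rt[i][j] * t[j] for j in range(3)) for i in range(3)]
    return Rt, t_inv

def compose(Ra, ta, Rb, tb):
    """Transform that applies (Rb, tb) first and (Ra, ta) second."""
    R = [[sum(Ra[i][k] * Rb[k][j] for k in range(3)) for j in range(3)]
         for i in range(3)]
    t = [sum(Ra[i][k] * tb[k] for k in range(3)) + ta[i] for i in range(3)]
    return R, t

def plane_pose_in_reference(R_v, t_v, R_p, t_p):
    """Pose of marker Mp in the frame of marker Mv.

    (R_v, t_v) and (R_p, t_p) are the marker-to-camera poses of Mv and Mp.
    """
    R_inv, t_inv = invert_pose(R_v, t_v)
    return compose(R_inv, t_inv, R_p, t_p)
```

Because the result is relative to Mv, it is unaffected by the exact placement of the camera as long as both markers remain visible.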
The height H < L2 is calculated using the number of slices N and the spacing of the DICOM data. With W as a known lateral length of the reference surface, a square prism of dimensions W × W × H can be defined.
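The exact formula for H is not given here. As a purely hypothetical illustration, the sketch below chooses H so that the prism keeps the proportions of the scanned volume, and then maps a point inside the prism to continuous voxel coordinates. It assumes a volume whose in-plane physical extent is square (matching the W × W base); the parameter names `cols`, `rows`, `pixel_spacing`, and `slice_spacing` are introduced for this example.

```python
def prism_height(W, n_slices, slice_spacing, cols, pixel_spacing):
    """Height H that preserves the volume's aspect ratio (assumed rule)."""
    return W * (n_slices * slice_spacing) / (cols * pixel_spacing)

def physical_to_voxel(p, W, H, cols, rows, n_slices):
    """Map p = (x, y, z) in the W x W x H prism to voxel coordinates.

    The prism origin is assumed to sit at one corner of the reference
    surface, with z pointing up towards the camera.
    """
    x, y, z = p
    return (x / W * (cols - 1),
            y / W * (rows - 1),
            z / H * (n_slices - 1))
```

For example, with W = 0.3 m and a volume of 256 × 256 × 100 voxels at 1 mm spacing in every direction, this rule gives H ≈ 0.117 m.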
Again, we need to calculate a mapping between this physical space and the virtual space containing the medical volumetric data.  (Fig. 8).

RESULTS AND DISCUSSION
We implemented the RGB-D based method in C++ employing the ITK library [36] for the loading and processing of the medical images, the VTK [37] library for the visualization of the results, and OpenNI [38]   principle is based on the detection of a dot pattern projected by an infrared laser, the performance of the depth sensor can be affected by natural light. These conditions limit the application of the proposed implementation in a real-life setup.
The fiducial markers based method was implemented in C++ using the OpenCV library [39] and the ArUco library [40] to determine the position and orientation of the fiducial markers, and the VTK library [37] for the loading and processing of the medical images. The object's position processing part was executed on an NVIDIA Jetson TK1, and the image corresponding to the section of the volume was displayed on a PC with a Core i5 processor @ 4 GHz and 8 GB of RAM, using Unity as the 3D graphics engine. In all tests, we used an HD webcam calibrated off-line. Figure 10 depicts some results of using the fiducial markers based method to explore a medical volume reconstructed from a CT scan of a head.
Note that, as with the RGB-D method, the user was able to easily explore different sections of the volume by manipulating the slicing plane.
Unlike the RGB-D approach, in this case it was possible to set up the system in a smaller area, which makes the plane easier to manipulate. This makes the implementation more feasible to incorporate in a clinic or to use as a desktop device. In addition, since this implementation does not require a depth camera, which is usually more expensive than a regular webcam, it can be built at a lower cost.
Therefore, we believe that the fiducial markers based implementation is more suitable for practical use.

Usability Study
To evaluate the perception of users when using the proposed methodology for the exploration of volumetric medical data, we employed two usability measurement tools: the Single Ease Question (SEQ) [41] and the System Usability Scale (SUS) [42]. SEQ is a rating scale that assesses how difficult users find a task and is administered immediately after a user attempts a task in a usability test. In our case, the question asked was: "Overall, how difficult or easy was the task to complete?", rated from 1 to 5, where 1 means very difficult and 5 means very easy.
Similarly, SUS is a reliable tool for measuring the usability of a system and is currently the industry standard. It consists of a 10-item questionnaire with five response options, from Strongly agree (5) to Strongly disagree (1). The items employed were:
1. I think that I would like to use this system frequently
2. I found the system unnecessarily complex
3. I thought the system was easy to use
4. I think that I would need the support of a technical person to be able to use this system
5. I found the various functions in this system were well integrated
6. I thought there was too much inconsistency in this system
7. I would imagine that most people would learn to use this system very quickly
8. I found the system very cumbersome to use
9. I felt very confident using the system
10. I needed to learn a lot of things before I could get going with this system
Additionally, we asked the participants how familiar they were with handling medical volumes such as CT or MRI, on a five-point scale where one means not familiar at all and five means very familiar. A total of fifty people aged 19 to 23 were asked to use the proposed method. They explored a brain MRI for two minutes without any explanation of how the system should be used or what it was supposed to do. Then, we asked the participants to answer the questionnaires according to their experience with the system. The average SEQ score obtained was 4.36. Figure 11 depicts the histogram of the obtained SEQ scores. Note that the majority of the participants considered the system easy or very easy to use. The average SUS score obtained was 72.55, which corresponds to a subjective rating of "good" according to the scale. Figures 12 and 13 depict the histograms of the obtained scores for all the participants and for each question, respectively.
Note that most of the people indicated that they think that the system was easy to use, that they would like to use the system frequently and that they imagine that most people would learn to use the system very quickly. However, the results indicate that some people think that they would need the support of a technical person and that they would have to learn more things before starting to use the system.
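For readers unfamiliar with how a single SUS score is derived from the ten responses, the conventional scoring rule can be sketched as follows: odd-numbered (positively worded) items contribute the response minus 1, even-numbered (negatively worded) items contribute 5 minus the response, and the sum is multiplied by 2.5 to give a value between 0 and 100. The function name is chosen for this example.

```python
def sus_score(responses):
    """Compute the SUS score (0..100) from ten responses in 1..5.

    responses[0] is item 1, responses[9] is item 10.
    """
    if len(responses) != 10 or any(r not in range(1, 6) for r in responses):
        raise ValueError("expected ten responses between 1 and 5")
    total = 0
    for item, r in enumerate(responses, start=1):
        # Odd items are positively worded, even items negatively worded
        total += (r - 1) if item % 2 == 1 else (5 - r)
    return total * 2.5
```

The average reported above would then be the mean of this per-participant score over all respondents.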

CONCLUSIONS
One of the main limitations of current computer-based methods for the analysis of medical image volumes is the skill required to control the mouse and keyboard in such a way that the desired interaction is achieved. Therefore, physicians typically rely on expert radiologists to search the data and report printouts of the slices that may contain the most useful information. The slice-based methodology for volume viewing that we have presented has the potential to make medical volume exploration easier and more intuitive, since it is based on the use of a manageable, controlled rigid plane in the real world. Therefore, it is possible that future versions of this system will be present in medical facilities and health clinics as a tool that helps to further explore the data produced by volumetric medical modalities, potentially translating into better diagnosis and treatment of diseases.
However, to convert the proposed system into a successful product, it would be necessary to overcome current limitations such as the need for sufficient environmental illumination to detect the rigid plane accurately. Moreover, it would be important to support more advanced interactions with the medical volumes, such as those available in commercial slicing software (e.g., segmentation, classification, and isosurface reconstruction). Overcoming these limitations and incorporating VR and AR headsets are the subject of future work.
Despite these limitations, our results indicate the feasibility of employing the proposed methodology to enable any user to explore volumetric medical data without the requirement of any training or instruction. This natural intuitive capability of the proposed system could be an advantage for physicians who want to explore medical data but are not familiar with the complicated software and systems currently available on the market.