Head-to-Head Comparison of Real-Time Video Tablet Platforms for Telestroke Applications

Telestroke is an integral platform for physicians to provide accurate diagnosis and treatment to patients with a possible stroke in remote locations. We evaluated the performance of two iPad applications, Clear Sea and Cisco Jabber, for potential usability in telestroke. We conducted a single-blind study wherein 15 volunteers underwent 4 separate assessments using abbreviated versions of the National Institutes of Health Stroke Scale (NIHSS), on-site and off-site, each using Cisco Jabber and Clear Sea. Both volunteer and investigator surveys were collected. Perceptions and usability of each application were measured by grading each variable on a score of 1 to 5 on the Likert scale. The Cisco Jabber mean (±SD) total score was 4.15 ± 0.78 versus 3.88 ± 0.82 for Clear Sea (P = 0.18), indicating 91% probability that Jabber was superior to Clear Sea. A sample of 60 volunteers would have 80% power. The maximum difference was noted in image quality, where Cisco Jabber scored 3.93 ± 0.82 and Clear Sea scored 3.60 ± 0.78 (P = 0.13). Since the directionality of the experiment was not predetermined, a two-tailed paired t-test was used to arrive at a conclusion. With the statistical results shown above we concluded that there was a modest preference for Cisco Jabber over Clear Sea, but a larger trial with stroke patients was still warranted.


Introduction
The use of telemedicine in stroke assessment is one that has been established and studied over the years, reducing disparities in healthcare delivery between rural and urban medical centers [1]. An example of this is Mayo Clinic, which has been deploying telemedicine carts in various locations in the state of Arizona since the year 2007 [1]-[3]. Timely delivery of treatment to a patient with acute ischemic stroke depends upon an accurate history and assessment. The American Heart Association (AHA) and the American Stroke Association (ASA) have guidelines in place for validating the reliability of the National Institutes of Health Stroke Scale (NIHSS) conducted via high quality video teleconferencing (HQ VTC) when face-to-face assessment is not an option [3] [4]. Demaerschalk et al. analyzed the use of HQ VTC on smartphones and have particularly addressed these concerns in a recent study (STRokE DOC AZ TIME trial) [5]. Smartphones are both relatively inexpensive and already widely used by physicians, and this study demonstrated high physician satisfaction along with high NIHSS score correlation. Verification of the viability of handheld platforms for HQ VTC in telestroke has opened the doors for investigation of other portable devices, such as tablets [3] [4]. If the cost of telemedicine can be decreased while increasing or maintaining connection quality, its application and development can become even more ubiquitous. Mobile tablets like the iPad enable installation of applications like LifeSize Clear Sea and Cisco Jabber. These applications are useful for high quality desktop or mobile/tablet video conferencing. High quality video conferencing helps enable better quality of assessment [6].
The purpose of this project was to compare the efficacy of two real-time video applications, Clear Sea and Cisco Jabber, from a patient standpoint while evaluating a stroke patient in a remote setting. This was done with the help of volunteers who work at the Mayo Clinic serving patients; they evaluated the two applications with the examiner being off-site and on-site. Hence, a total of two evaluations for each application were collected. The volunteers enrolled were members of the Volunteer Services at Mayo Clinic and were not medical care personnel.
Further described are the methods in which this study was carried out along with the preferred statistical method of analysis and a brief review of the results. This is followed by a discussion and conclusion highlighting the pros and cons of the software used in this experiment.

Methods
The study was designed as a comparative technology quality assessment. We conducted focused mock stroke consultations with the help of volunteers, utilizing specific parts of the NIHSS (mentioned in detail below) to assess the effectiveness of each platform, as the reliability of remote NIHSS assessments compared to face-to-face evaluations has already been established [3] [4]. These methods focused on a comparison of the two applications' merits rather than comparison with a face-to-face assessment. The telemedicine cart was located at the Mayo Clinic Hospital in Phoenix, Arizona, with the assessment being conducted both from an off-Mayo network campus site (e.g., home office) and from an on-campus (on the Mayo network) site to evaluate quality of connection and viability of a portable tablet.
In this study, the handheld tablet device of choice was the iPad. Both Clear Sea and Cisco Jabber applications were tested in the two different locations for a comparison of performance. As the Clear Sea application allowed for manipulation of the camera on the telemedicine cart, elements of zoom, pan, and tilt were also assessed for this particular platform to provide preliminary insight as to the desirability or functionality of these features.
The on-site assessments involved elements taken from the NIHSS, including the following: 1b) Level of consciousness questions: volunteers were asked the current month and their age; 1c) Level of consciousness commands: volunteers were instructed to open/close their eyes and make a fist and release; 4) Facial palsy: volunteers were asked to show their teeth or smile widely and raise their eyebrows; 9) Best language: volunteers were asked to read standard sentences; 10) Dysarthria: volunteers were asked to read words off a standard list. Element 1b of the NIHSS was largely used to test initial sound quality, 1c and 4 dealt with image quality on the investigator end, and 9 and 10 involved testing image quality on the volunteer end.
Off-site assessments involved the use of NIHSS element 2) Best Gaze: horizontal eye movements, which assessed video quality for the investigator end; and NIHSS element 9) Best Language: object identification and picture description, which was used to assess video quality on the volunteer end. Audio quality was assessed through investigator-volunteer interactions throughout the session.
Order of the applications was randomized to try to minimize bias for each volunteer from on-site to off-site assessments. Different NIHSS elements assessing the same basic quality were used for on-site versus off-site assessments to reduce bias while maintaining consistency as well.
The primary goal of this study was to assess technology for quality control rather than to assess patients.
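The randomization of application order described above could be sketched as follows. This is only an illustration: the text does not state how the order was actually generated, so the function name `assign_orders`, the per-site independent shuffle, and the optional seed are all assumptions.

```python
import random

APPS = ["Clear Sea", "Cisco Jabber"]
SITES = ["on-site", "off-site"]


def assign_orders(volunteer_ids, seed=None):
    """Give each volunteer an independently shuffled application order per site.

    Hypothetical helper: the study's real randomization procedure is not
    described, so this is one plausible scheme rather than the authors' method.
    """
    rng = random.Random(seed)  # a seed makes the schedule reproducible
    schedule = {}
    for vid in volunteer_ids:
        # rng.sample returns a new shuffled list and leaves APPS untouched
        schedule[vid] = {site: rng.sample(APPS, k=len(APPS)) for site in SITES}
    return schedule


# 15 volunteers, each with 2 sites x 2 applications = 4 assessments
schedule = assign_orders(range(1, 16), seed=2024)
for vid in (1, 2, 3):
    print(vid, schedule[vid])
```

Shuffling independently for each site means a volunteer does not necessarily meet the applications in the same order on-site and off-site, which is one way to keep order effects from lining up with a particular application.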

Evaluation Tool
The evaluation tool was a Likert scale from 1 to 5 for the four variables, with 1 being the lowest (worst) rating and 5 being the highest (best) rating. These variables were chosen based on relevance to a preliminary assessment of the different applications' potentials for use in telemedicine. Volunteers were given four forms to fill out, one for each application tested at each site. The volunteer evaluation forms asked for responses to two different main categories, "Perception" and "Usability", with the following subcategories: 1) Patient perception: a) Sound quality; b) Image quality. 2) Overall usability: a) Ease of communication; b) Quality of the connection.
For a more delineated description of each numerical rating, the following descriptors were given to study participants for clarification: 1 = very poor; 2 = poor; 3 = neutral/no opinion either way; 4 = good; 5 = very good. The volunteers were also presented with the option to include comments or other points of clarification for their reasoning.
On the investigator end, a similar sheet was used for each application with the following additions: 1) Interface design for navigation; 2) Clear Sea specific features: a) Zoom; b) Pan; c) Tilt.
The study was a single-blind study in which the volunteers were unaware of the application being used. The investigator was not blinded, so as to note which application was in use. Additional comments were also recorded as needed.

Statistical Methods for Analysis
Means and standard deviations were calculated for relevant sets of data. The two applications were compared on the basis of the aforementioned four variables. Since the groups assessing the two applications consisted of the same 15 volunteers, the groups were considered to be matched. As directionality of expected effect (which application is better) was not previously determined, a two-tailed paired t test was used for our assessment. The difference between groups was also assessed by calculating the Bayes posterior probability with a noninformative prior.
For comparing the two applications as a whole, all four variables were averaged out on-site as well as off-site, to arrive at a conclusion.
To rule out any bias involving on-site or off-site assessments, we compared the four variables at either site as well (Table 1).
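The paired analysis described in this section can be illustrated with a minimal sketch. It uses only the Python standard library and integrates the Student's t density numerically. The ratings are hypothetical placeholders, not study data, and the function names are illustrative. The posterior step assumes the standard noninformative prior for a normal mean with unknown variance, under which the posterior probability that the mean difference is positive equals the t CDF evaluated at the observed t statistic.

```python
import math
from statistics import mean, stdev


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """CDF of Student's t via Simpson's rule on [0, |x|] (steps must be even)."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


def paired_comparison(a, b):
    """Return (t, two-tailed p, posterior P(mean of a - b > 0)).

    Assumes the paired differences are not all identical (nonzero SD).
    """
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    df = n - 1
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    p_two_tailed = 2 * (1 - t_cdf(abs(t), df))
    # Noninformative prior: posterior of the mean difference is a shifted,
    # scaled t distribution, so P(difference > 0) is the t CDF at t.
    posterior = t_cdf(t, df)
    return t, p_two_tailed, posterior


# Hypothetical Likert ratings from 15 matched volunteers (NOT study data)
jabber = [4, 5, 4, 3, 5, 4, 4, 5, 3, 4, 5, 4, 4, 3, 5]
clear_sea = [4, 4, 3, 4, 5, 3, 4, 4, 3, 4, 4, 5, 3, 3, 5]
t, p, post = paired_comparison(jabber, clear_sea)
print(f"t = {t:.2f}, two-tailed P = {p:.2f}, posterior = {post:.2f}")
```

Under these assumptions, when the observed difference is positive the posterior probability is 1 - P/2, so a two-tailed P of 0.18 corresponds to 1 - 0.09 = 0.91, which matches the 91% figure reported for the total scores.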

Results
The data were analyzed as a comparison of the two applications as a whole, as well as each application's performance on-site versus off-site.
Table 1 presents the results for Clear Sea vs. Cisco Jabber on-site and off-site, as well as the combined results for each application. On comparing each of the four variables, as shown in Table 1, we concluded that though there was no statistically significant difference between the two applications, Cisco Jabber scored consistently higher than the Clear Sea application with minimal variance amongst volunteers. The maximum difference was noted in the image quality, where Clear Sea scored 3.6 ± 0.78 and Cisco Jabber scored 3.93 ± 0.82, P = 0.14 (P > 0.05). However, this difference is not statistically significant.
When each application was assessed as a whole, Clear Sea scored a mean of 3.88 ± 0.82 and Cisco Jabber scored 4.16 ± 0.78 with a P value of 0.18 (P > 0.05). Hence the difference is not statistically significant.
Table 2 presents the results for on-site vs. off-site, regardless of technology. The results show no statistically significant difference in the performance (P = 0.25, P > 0.05) even though the off-site location scored higher than the on-site.
As the variation for the investigator data is not nearly as great as the variation found in volunteer data, only the overall averages were considered, as shown in Table 1.
In on-site versus off-site assessments, the on-site average was 4.27 while the off-site assessment average was 4.40.
For the Clear Sea additional evaluations, zoom was rated 3, and pan and tilt were rated 2. The mean score for Clear Sea was 4.31 and Cisco Jabber was 4.36. Thus, the difference between the Clear Sea and Cisco Jabber scores was 0.05, which can be considered negligible.

Discussion
From the results, there is an indication that Cisco Jabber may trend toward the perception of better performance than Clear Sea in all of the categories evaluated even though there was no statistically significant difference noticed. There was also indication of a slight preference for the off-site assessment (4.15 ± 0.77) over the on-site assessment (3.89 ± 0.93) in all categories. These results can be seen in Table 1 and Table 2 with calculated P values.
Several of the volunteers noted that the iPad yielded very jumpy or shaky images when it was being handheld, and setting the device down on a table with a prop seemed to resolve this. Additionally, the images had to be held extremely close to the camera for some of the volunteers in order for the pictures or words to be clearly identified. In terms of video quality, the lower averages seen for Clear Sea can be explained through image rendering. Volunteers often reported that the Clear Sea application yielded distorted images with more pixelation than that present for Cisco Jabber. During trial periods, Clear Sea was confirmed to result in pixelation more consistently when extreme movements were made. While pixelation was evident for both applications, Clear Sea recovered from it much more slowly than Cisco Jabber did. Hence the maximum difference noted between the two was with regard to image quality (Clear Sea = 3.6 ± 0.84 and Cisco Jabber = 3.93 ± 0.82). When the internet connection was unstable, Clear Sea had an average lag of 4 to 6 seconds, whereas Cisco Jabber had a lag of 1 to 3 seconds. This lag was not nearly as evident with the video image, but was noticeable with audio communication. Rendering issues were particularly evident in larger areas like the face or any images and text that the volunteer was asked to observe. Clear Sea appeared to respond more slowly and experience more problems with the assessments, although part of this could be mitigated by having a stable internet connection. Clear Sea also froze up or lagged more often than Cisco Jabber, which oftentimes would experience tiling of the image without any present lag or freezing.
One volunteer noted that the Clear Sea sound quality was not as good as that of Cisco Jabber for both on-site and off-site assessments, which accounts for the lower averages obtained for the former application. Volume was adjustable for the telemedicine cart, although some of the volunteers who may have been hard of hearing had to adjust the volume to a rather high level in order to clearly comprehend the investigator. Though adjusting the volume was an option, sound clarity seemed to be the major issue when comparing the two applications.
The Clear Sea screen depicting the investigator was more zoomed in, and thus the space available for use was smaller than that in Cisco Jabber. There was no noticeable way to fix this issue using the application from the investigator end, and the only solution would have been to place the iPad further away, which would compromise many other aspects of the videoconferencing technology. This did not really affect use of the application for stroke assessments and was noted to be a minor inconvenience.
While the investigator results also found a preference for Cisco Jabber (4.32 versus 4.23), the two apps were equally ranked from an off-site setting (both at 4.40). Overall, there was a slight preference for the off-site setting as well (4.27 for the on-site and 4.40 for the off-site).
The off-site preference can be attributed to internet connection quality. While on the internal wireless network of the hospital, there were oftentimes hiccups in the internet resulting in slight patches of lag. This was resolved by being on a home wireless network, which did not undergo the stress of a corporate wireless network and thus was more insulated from external circumstances. Video quality generally remained comparable for the two applications. The video was sometimes a little blurred depending on the internet connection and how fast movements were made due to delayed rendering, but it did not tile or freeze for the most part. This was true even for situations where the video on the volunteer end did tile. Internet connections dropping temporarily or completely to the point that the videoconference call was dropped resulted in frozen images on the investigator side as well, but beyond these extreme cases of frozen images, blurring was more common for the Clear Sea application.
In terms of sound quality, it was sometimes hard to hear some of the volunteers who were not able to speak very loudly. Even with the iPad turned to full volume, the microphone attached to the telemedicine cart was not sensitive enough to adjust speech clarity. Both interfaces for the applications were well-designed and easy to navigate, so the assessment values for this particular quality remained constant from assessment to assessment.
Cameras featuring zoom, pan, and tilt capabilities allow clinicians some autonomy to independently observe and examine a patient in general and specific neurological features such as pupillary dilation, extraocular movement, nystagmus, and other necessary components of the NIHSS examination. As a result, zoom, pan, and tilt cameras may have advantages for transmitting video during telestroke consultations. However, the technical issues observed appeared to interfere with these features' optimal functioning.
Another important fact is that although development in video conferencing software continues to proliferate with an increasing number of vendors entering the field, the core technology and standards used to transmit audio and video data have seen incremental changes spread out over a long timeline. Many of the standards (H.323, SIP) used today in video conferencing were developed in the 1990s. The graphical user interface and implementation of the standards are variable between vendors and, as a result, provide opportunities for comparison.

Conclusions
In conclusion, the data indicate 91% probability that the mean total score for Cisco Jabber is higher than for Clear Sea. There was also an investigator preference for Cisco Jabber during on-site assessments, but the two applications were evenly ranked for remote assessments. Remote/off-site assessment ratings exceeded those of on-site assessments in both volunteer and investigator surveys, most likely due to quality of internet connections. The Clear Sea application features (zoom, pan and tilt) did not seem to influence ratings on any of the parameters. It appears that the iPad in conjunction with these two particular applications presents another potentially viable alternative for telestroke delivery. Tablets for telestroke may offer advantages through portability, low cost, and operator convenience.
Limitations to this study include volunteer number and the use of volunteers rather than stroke patients. While the goal of this study was to preliminarily assess two different applications and thus volunteers are appropriate for the scope, eventual acceptance of any platform would require clinical application with actual implementation of the NIHSS when it matters. Beyond this, a sample size consisting of 15 volunteers was not large enough for statistical power to determine whether or not a true difference exists between the two samples. Other confounding variables include intrinsic qualities of different applications, camera quality on the telemedicine platform and the internet connection at any given point in time.
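The sample-size limitation can be made concrete with a rough power calculation. The sketch below uses the common normal-approximation formula for a two-tailed paired test; it is not necessarily the method the authors used. The effect size is an assumption: it is back-calculated from the reported P = 0.18 with 15 volunteers, which corresponds to a t statistic of roughly 1.4, because the standard deviation of the paired differences is not reported.

```python
import math
from statistics import NormalDist


def sample_size_paired(effect_size, alpha=0.05, power=0.80):
    """Normal-approximation sample size for a two-tailed paired test.

    effect_size is the mean of the paired differences divided by their
    standard deviation (a standardized effect, assumed here, not reported).
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for two-tailed alpha
    z_power = z.inv_cdf(power)          # quantile for the desired power
    return math.ceil(((z_alpha + z_power) / effect_size) ** 2)


# Assumed effect size: t of about 1.41 from 15 pairs gives t / sqrt(n)
assumed_effect = 1.41 / math.sqrt(15)
print(sample_size_paired(assumed_effect))
```

With this assumed effect size the formula gives a figure on the order of the 60 volunteers quoted in the abstract. The result is sensitive to small changes in the assumed t statistic, so it should be read as an order-of-magnitude check rather than a planning number.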

Table 2. On-Site versus Off-Site.