Design and voice-based control of a nasal endoscopic surgical robot

In traditional nasal surgery, surgeons are prone to fatigue and hand tremor when holding the endoscope for a long time. Some complex operations require an assistant surgeon to hold the endoscope. To address these problems, the authors design a remote centre of motion (RCM) based nasal robot and propose a voice-based robot control method. First, through the operation space analysis of nasal surgery, the design scheme of the robot based on the RCM mechanism is proposed. On this basis, the design parameters of the robot are analysed to complete the entire design of the robot. Then, considering that the surgeon's hands are occupied by surgical instruments during complex surgical operations, a voice-based robot control method is proposed. This method obtains direction instructions from surgeons by analysing the movement of the endoscopic image. Afterward, a commercial speech recognition interface is used to realise an offline grammar control word library compatible with both Chinese and English, and the overall strategy of robot control is proposed. Finally, an experimental platform for virtual robot control is established, and the voice-based robot control experiment is performed. The results show that the proposed voice-based control method is feasible, and it provides guidance for the subsequent development and control of the actual robot system.


| INTRODUCTION
In traditional nasal surgery, the surgeon usually holds a nasal endoscope with the left hand and operates a surgical instrument with the right hand. This operation mode has several disadvantages: the surgeon is prone to fatigue and hand tremor when holding the endoscope for a long time, which affects the stability of the surgical field of view. Moreover, for some complex surgical operations, an assistant surgeon is required to hold the endoscope while the chief surgeon operates the surgical instruments with both hands. Their movements easily interfere with each other, and coordination is difficult. To address these issues, researchers have tried to use robots instead of the assistant surgeon to hold the nasal endoscope and act as the third hand of the chief surgeon during nasal surgery. A nasal endoscopic surgical robot can help surgeons achieve the 'one surgeon-three hands' operation mode, thereby reducing the surgeon's burden and improving the quality of surgery.
At present, the endoscope robots suitable for nasal surgery can be divided into two categories: passive endoscope robots and active endoscope robots. Passive endoscope robots were the first to be adopted in nasal endoscopic surgery because of their simple structure and low cost. For example, the passive endoscope robot Tiska [1] was developed by the Karl Storz company in Germany. All joints of the robot adopt a passive structure, and the locking and loosening of each joint are realised through the principle of electromagnetic friction. The passive endoscope robot Unitrac Arm [2,3] developed by the Aesculap company uses pneumatic locking. The surgeon presses the pneumatic switch to release each ball joint, and the robot can then be freely dragged to the desired pose; when the pneumatic switch is released, each ball joint is locked and the robot keeps its current pose. The passive endoscope robot Endo Arm [4][5][6][7] developed by the Olympus company also adopts pneumatic reverse locking technology to improve the safety of the robot. Sun et al. developed a passive endoscope robot for nasal surgery [8]. This robot also uses pneumatic reverse braking technology, and in-vitro animal experiments show that it can meet the requirements of nasal surgery.
The above-mentioned passive endoscope robots can basically assist the surgeon in holding the endoscope during surgery. However, they still require manual operation to lock or loosen all joints during surgery. As a result, pose adjustment is troublesome, and the degree of automation and intelligence is relatively low. With the development of minimally invasive surgery and robotics technology, researchers have turned their attention to more intelligent active endoscope robots. For example, Taylor et al. at the Johns Hopkins University developed a robot [9] whose most distinctive feature is that the end-effector is designed based on the remote centre of motion (RCM) mechanism to prevent the nostril from being pulled during surgery. The AESOP robot [10][11][12][13][14][15] was developed by the Computer Motion company. The endoscope pose of this robot can be controlled by the surgeon's voice, but special voice collection training is required before surgery. The Endo Assist robot [16,17] developed by the Armstrong company can be controlled through the surgeon's head movement. Clinical comparative experiments show that the robot can shorten the operation time and improve the operation efficiency. Rilk et al. at TU Braunschweig built an active endoscope robot system based on an industrial manipulator [18][19][20][21][22] and verified the feasibility of using a robot to automatically obtain the surgical field of view during nasal surgery. Navarro-Alarcon et al. of the Chinese University of Hong Kong have developed an endoscope robot for nasal surgery [23][24][25], which uses a wearable foot pedal to control the movement of the robot. The proposed interactive mode frees the surgeon's hands to a certain extent. In addition, Hu et al. of the Shenzhen Institute of Advanced Technology have developed an endoscope robot for nasal surgery [8][26][27][28].
In general, current passive endoscope robots have the advantages of simple structure and low cost, but they require the surgeon to manually adjust the pose of the nasal endoscope during surgery, which is cumbersome and offers little intelligence. Multi-arm active robots, such as the Da Vinci robot and the Micro-Hand robot, have larger end-effectors and can only be applied to surgical types with large surgical areas such as the abdominal cavity. They are not suitable for surgical types with narrow surgical areas such as the nasal cavity. Current single-port surgical robots have a large end diameter, which also makes them difficult to apply to nasal surgery. Using the existing endoscope robots, surgeons can achieve intraoperative human-robot interaction through wearable pedals, voice (pre-operative training required), head movement, navigation tracking etc. However, the efficiency of human-robot interaction during surgery is relatively low, and the learning curve is relatively long. With the gradual maturity of speech recognition technology, it is increasingly used in the control of robots [29][30][31][32][33]. With the latest speech technology, speech recognition no longer needs preoperative training and can be compatible with multiple languages. Its combination with robotics technology helps to realise multi-language natural interaction between robots and humans, and improves the level of interaction control.
To meet the current clinical needs of nasal surgery, the authors design a nasal endoscope robot based on the RCM mechanism. The endoscope movement of the robot can be controlled by both Chinese and English speech recognition, so that the robot can replace the assistant surgeon in holding the endoscope during nasal surgery. The voice-based control of the robot enables the surgeon to achieve 'one surgeon-three hands' surgical operation, which helps to reduce the surgeon's operation burden and improve the surgeon's operation ability.
The rest of this article is organised as follows. In Section 2, through the analysis of nasal surgery, an endoscope robot suitable for nasal surgery is designed. In Section 3, the voice-based control method of the endoscope robot is introduced in detail. The proposed voice-based control method is simulated and verified in Section 4, and the conclusion is summarised in Section 5.
Figure 1 shows the anatomical structure of the human nasal cavity and the workspace of the nasal endoscope. It can be seen from the figure that, owing to the complex spatial anatomy of the nasal cavity, the workspace of the endoscope is very complicated; on the whole, however, the spatial distribution of the nasal cavity and sinuses presents an 'inverted funnel' shape with a small entrance and a large internal space.

| Workspace analysis of nasal endoscope
Due to the 'inverted funnel' shape of the nasal cavity anatomy, the nasal endoscope and instruments swing around the nostrils during surgery to form a workspace similar to the 'inverted funnel' shape [21]. From the perspective of robot workspace matching, the nasal endoscope workspace is similar to the workspace of the RCM mechanism. Therefore, the RCM mechanism has naturally become the preferred end-effector of nasal endoscopic surgical robots. The RCM mechanism is commonly adopted in the design of the end-effector in minimally invasive surgical robots. It can effectively prevent the nasal endoscope from pulling the tissue at the nostrils during surgery, thereby improving the motion safety of the robot. Based on the workspace characteristics of the nasal endoscope, the authors design an RCM-based endoscope robot for nasal surgery. The robot has seven degrees of freedom. The positioning arm adopts the prismatic-revolute-revolute (PRR) configuration, which is convenient for the initial positioning during surgery. The RCM mechanism is adopted for the end-effector to realise the pulling protection of the nostril during nasal surgery.

FIGURE 1 Anatomical structure of nasal cavity and workspace of nasal endoscope [21]
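The geometric meaning of the RCM constraint can be illustrated with a short numerical sketch. It is not taken from the authors' design: it assumes a straight endoscope whose axis always passes through a fixed RCM point at the nostril, and the angle names `pitch` and `yaw`, the axis convention, and the function name `endoscope_tip` are hypothetical choices made for illustration only.

```python
import math

def endoscope_tip(rcm, pitch, yaw, depth):
    """Tip position of a straight endoscope pivoting about the RCM point.

    rcm   -- (x, y, z) of the remote centre of motion (assumed at the nostril)
    pitch -- rotation about the x axis in radians (assumed convention)
    yaw   -- rotation about the y axis in radians (assumed convention)
    depth -- insertion depth beyond the RCM point, same length unit as rcm
    """
    # Unit vector along the endoscope axis: (0, 0, 1) rotated about x by
    # pitch and then about y by yaw.
    direction = (
        math.cos(pitch) * math.sin(yaw),
        -math.sin(pitch),
        math.cos(pitch) * math.cos(yaw),
    )
    # The tip always lies on a line through the RCM point, so sweeping the
    # two angles and the depth traces out a cone-shaped (funnel-like) workspace.
    return tuple(c + depth * d for c, d in zip(rcm, direction))
```

Whatever the two angles are, the tip stays exactly `depth` away from the RCM point, which is why the nostril itself is not dragged sideways when the endoscope swings.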

| Analysis of robot design parameters
The nasal endoscopic surgical robot designed here is a 7-DOF robot, in which the positioning arm is a 3-DOF cylindrical coordinate PRR configuration; its structure diagram is shown in Figure 2(a). The first joint is a prismatic joint, which is used to move the entire robot in a direction perpendicular to the operating table. The second and third joints are revolute joints, which are used to adjust the position of the robot in the plane of the operating table. Figure 2(b) shows the schematic diagram of the 4-DOF revolute-revolute-prismatic-revolute (RRPR) end-effector based on the RCM mechanism. DOFs 4 and 5 are revolute joints, which are used to construct the RCM mechanism and adjust the posture of the nasal endoscope during surgery. DOF 6 is a prismatic joint, which is used to control the depth of the nasal endoscope into the nostril, and DOF 7 is a revolute joint, which is used to control the angle of the nasal endoscope.
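The joint layout described above can be summarised as a small data structure. The following is only an illustrative sketch: the class name `Joint`, the field names, and the helper `configuration` are hypothetical, while the joint types and roles are those stated in the text.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Joint:
    index: int
    kind: str   # "P" = prismatic, "R" = revolute
    group: str  # "positioning arm" or "end-effector"
    role: str

JOINTS = [
    Joint(1, "P", "positioning arm", "motion perpendicular to the operating table"),
    Joint(2, "R", "positioning arm", "position in the plane of the operating table"),
    Joint(3, "R", "positioning arm", "position in the plane of the operating table"),
    Joint(4, "R", "end-effector", "RCM mechanism, endoscope posture"),
    Joint(5, "R", "end-effector", "RCM mechanism, endoscope posture"),
    Joint(6, "P", "end-effector", "depth of the endoscope into the nostril"),
    Joint(7, "R", "end-effector", "angle of the endoscope"),
]

def configuration(group):
    """Return the joint-type string of one group, e.g. 'PRR'."""
    return "".join(j.kind for j in JOINTS if j.group == group)
```

Calling `configuration("positioning arm")` and `configuration("end-effector")` reproduces the PRR and RRPR configurations named in the text.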
The robot design parameters determine the size of each joint of the robot and the size of the robot workspace, which in turn affect the compactness of the entire robot structure. Trevillot et al. [34] obtained the parameter range of each DOF of the end-effector during nasal surgery through a typical nasal surgery experiment. On the basis of Trevillot's parameters, the design parameters of the end-effector are determined as shown in Table 1.

| Design of robot positioning arm
Based on the positioning arm configuration determined above, the positioning arm is designed in detail, as shown in Figure 3. The up and down movement of joint 1 is realised by using a motor to drive a ball screw pair, and the two horizontal revolute joints are realised by using a motor and a harmonic reducer to directly drive the load. Each joint is equipped with zero point and limit position photoelectric switches for the preoperative zero return operation of the system and for protection of the robot motion range. To prevent the robot from losing its current posture under gravity when the power is cut off, a pulley and a wire rope are used to balance the load of joint 1 in the vertical direction.

| Design of end-effector based on RCM mechanism
Considering factors such as the compactness of the RCM mechanism, the type of driving joints and the difficulty of manufacturing, the parallelogram-based RCM mechanism is selected as the design basis of the end-effector. Together with the self-rotation angle control mechanism and the feeding depth mechanism, it forms the detailed design of the robot end-effector. The detailed structure of the RCM-based end-effector is shown in Figure 4.

| The overall structure of the nasal surgical robot
Integrating the positioning arm and the RCM-based end-effector, the overall design model of the nasal endoscopic surgical robot is obtained, as shown in Figure 5. The Denavit-Hartenberg (DH) parameters of the robot are shown in Table 2. The drivers of the robot are distributed beside each joint motor, which helps to reduce the number of wires in the system and facilitates postoperative cleaning and disinfection of the robot.
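As a reminder of how a DH table is used, the sketch below chains homogeneous transforms built from DH rows. It assumes the standard (distal) DH convention, which the text does not specify, and the rows in the usage note are placeholders rather than the values of Table 2. The function names are hypothetical.

```python
import math

def dh_transform(theta, d, a, alpha):
    """Homogeneous transform of one link under the standard DH convention
    (an assumption; the source does not state which convention is used)."""
    ct, st = math.cos(theta), math.sin(theta)
    ca, sa = math.cos(alpha), math.sin(alpha)
    return [
        [ct, -st * ca, st * sa, a * ct],
        [st, ct * ca, -ct * sa, a * st],
        [0.0, sa, ca, d],
        [0.0, 0.0, 0.0, 1.0],
    ]

def matmul(A, B):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def forward_kinematics(dh_rows):
    """Multiply the link transforms in order, from the base to the tip.

    dh_rows -- iterable of (theta, d, a, alpha) tuples, one per joint.
    """
    T = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    for theta, d, a, alpha in dh_rows:
        T = matmul(T, dh_transform(theta, d, a, alpha))
    return T
```

For example, the placeholder rows `[(0.0, 100.0, 0.0, 0.0), (math.pi / 2, 0.0, 200.0, 0.0)]` describe a vertical prismatic lift of 100 followed by a revolute link of length 200 turned by 90 degrees, and the last column of the resulting matrix gives the tip position.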

| Motion analysis of nasal endoscope
During nasal surgery, surgeons use nasal endoscopic images to determine the surgical status and make a diagnosis. As shown in Figure 6, surgeons need four DOFs (eight motion directions) to control the nasal endoscopic field of view: up and down, left and right, forward and backward, and clockwise and counter clockwise. The motion of these four DOFs corresponds to the velocity of the nasal endoscope tip in the nasal endoscope coordinate system $O_{camera}$. Considering that the nasal endoscope also has four DOFs under the RCM constraints (three rotational motions and one translational motion), the velocity of the endoscopic field of view can be converted into the corresponding velocity $V_{end}$ in the RCM coordinate system, as shown in Figure 7. After obtaining the endoscope velocity $V_{end}$ of the robot end-effector, the inverse Jacobian matrix of the robot can be used to obtain the corresponding speed $\dot{q}$ of each joint:
$$\dot{q} = J^{-1} V_{end}$$
where $\dot{q}$ is the joint speed of the robot, $J^{-1}$ is the inverse of the robot Jacobian matrix, and $V_{end}$ is the endoscope tip speed of the robot.
With the computed speed of each joint of the robot, the motor of each joint of the robot can be driven to realise the control of the endoscopic surgical field of view.
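The mapping from the endoscope tip speed to the joint speeds can be sketched numerically. The example below is a minimal illustration, not the authors' implementation: it assumes a square, non-singular Jacobian (for instance, the 4-DOF end-effector with the positioning arm held fixed) and solves the linear system by Gaussian elimination instead of forming the inverse explicitly. The function name `solve_joint_speeds` is hypothetical.

```python
def solve_joint_speeds(J, v_end):
    """Solve J * q_dot = v_end for q_dot (equivalent to q_dot = J^-1 * v_end).

    J     -- square Jacobian as a list of rows (assumed non-singular)
    v_end -- endoscope tip speed in the RCM coordinate system
    """
    n = len(J)
    # Augmented matrix [J | v_end]; copied so the inputs are not modified.
    M = [[float(x) for x in row] + [float(v)] for row, v in zip(J, v_end)]
    for col in range(n):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("Jacobian is singular at this configuration")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution.
    q_dot = [0.0] * n
    for r in reversed(range(n)):
        s = sum(M[r][c] * q_dot[c] for c in range(r + 1, n))
        q_dot[r] = (M[r][n] - s) / M[r][r]
    return q_dot
```

Raising an error near a singular configuration is a design choice of this sketch; a real controller would more likely limit the joint speeds or use a damped solution.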

| Design of offline grammar rule library for both Chinese and English
Now, to realise the voice-based control of the nasal robot, it is necessary to design the voice motion instructions (in Chinese and English) corresponding to the motion direction of the nasal endoscope, and design the grammatical rules for the voice motion instructions. The authors use the application programming interface (API) function library from iFlytek (a leading company in the field of speech recognition) to design intraoperative voice motion instructions. Ten different instructions are considered here. (These command words were determined after discussing with the surgeons of our partner hospital.) They are start, stop, up, down, left, right, forward, backward, clockwise, and counter clockwise. Based on the determined ten voice motion instructions, the grammar rule library for offline speech recognition can be designed as shown in Figure 8. The offline speech recognition grammar rule library is not only suitable for Chinese command words but also for English command words. Moreover, the required command words can be added or modified as needed during surgery.
On the basis of the above-mentioned offline speech recognition grammar rule library and the working principle of iFlytek offline command word recognition, the offline recognition process of robot motion instructions is obtained as shown in Figure 9.
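The idea of one rule library serving both languages can be sketched as a lookup from spoken phrases to motion instructions. This sketch does not use or reproduce the iFlytek API or its grammar file format; it only illustrates the mapping. The English words are the ten instructions listed above, whereas the Chinese words and all identifiers (`COMMAND_WORDS`, `parse_command`) are assumptions made for illustration, since the source does not list the Chinese command words.

```python
# Motion identifiers mapped to their spoken forms. The Chinese words are
# illustrative assumptions, not the authors' actual command words.
COMMAND_WORDS = {
    "start": ["start", "开始"],
    "stop": ["stop", "停止"],
    "up": ["up", "向上"],
    "down": ["down", "向下"],
    "left": ["left", "向左"],
    "right": ["right", "向右"],
    "forward": ["forward", "前进"],
    "backward": ["backward", "后退"],
    "clockwise": ["clockwise", "顺时针"],
    "counterclockwise": ["counter clockwise", "逆时针"],
}

def _normalise(phrase):
    """Lower-case the phrase and drop whitespace and hyphens."""
    return "".join(phrase.lower().split()).replace("-", "")

# Reverse lookup: normalised spoken phrase -> motion identifier.
LOOKUP = {
    _normalise(word): motion
    for motion, words in COMMAND_WORDS.items()
    for word in words
}

def parse_command(phrase):
    """Return the motion identifier for a recognised phrase, or None."""
    return LOOKUP.get(_normalise(phrase))
```

Adding or modifying a command word then amounts to editing one entry of the table, which mirrors the statement that command words can be added or modified as needed.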

| Voice-based control of the nasal endoscopic surgical robot
According to the endoscope motion analysis and the offline grammar rule library design, the entire voice control process is obtained as shown in Figure 10. First, the surgeon uses a wireless headset microphone to control the motion direction of the surgical field of view (i.e. the endoscope motion of the robot) while both hands are occupied by other surgical instruments. The offline speech recognition system recognises the surgeon's voice motion instructions (start/stop, up/down etc.) and displays them on the surgeon's operation interface in real time (refer to the disc-shaped motion direction display module in Figure 11). At the same time, the robot controls the motion direction of the endoscope according to the voice motion instructions and thus controls the surgical field of view, so that the surgeon can achieve 'one surgeon-three hands' surgical operations.
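The control flow described above can be sketched as a small state machine that turns each recognised instruction into a velocity command in the endoscope coordinate system. Several points are assumptions of this sketch rather than statements from the source: the axis assignment of the eight directions, the rule that motion instructions are ignored until 'start' has been spoken, and all names (`DIRECTIONS`, `VoiceController`, `handle`).

```python
# Unit directions in the camera frame as (vx, vy, vz, wz).
# The axis assignment is an assumption made for illustration.
DIRECTIONS = {
    "up": (0, 1, 0, 0), "down": (0, -1, 0, 0),
    "left": (-1, 0, 0, 0), "right": (1, 0, 0, 0),
    "forward": (0, 0, 1, 0), "backward": (0, 0, -1, 0),
    "clockwise": (0, 0, 0, 1), "counterclockwise": (0, 0, 0, -1),
}

ZERO = (0.0, 0.0, 0.0, 0.0)

class VoiceController:
    """Maps recognised instructions to camera-frame velocity commands."""

    def __init__(self, speed=1.0):
        self.enabled = False  # motion is gated until 'start' is heard
        self.speed = speed    # hypothetical speed magnitude

    def handle(self, command):
        """Return the velocity command for one recognised instruction."""
        if command == "start":
            self.enabled = True
            return ZERO
        if command == "stop":
            self.enabled = False
            return ZERO
        if not self.enabled or command not in DIRECTIONS:
            # Unknown words and motions issued while stopped produce no motion.
            return ZERO
        return tuple(self.speed * c for c in DIRECTIONS[command])
```

The returned velocity would then be converted to the RCM coordinate system and to joint speeds as described in the motion analysis.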

| EXPERIMENTS AND DISCUSSION
To verify the proposed robot control method, a robot-assisted nasal surgery simulation experimental system is developed. Experiments are carried out to verify the effectiveness of the proposed voice-based control method of the nasal endoscopic robot.

| Experimental setup
The simulation experiment system is shown in Figure 11, which includes a wireless microphone system, a nasal endoscopic surgical robot simulation system, a virtual nasal endoscopy system, and a digital nasal cavity model.

| Voice-based control experiments of the nasal robot
For robot surgical field of view control based on speech recognition, the recognition success rate of offline voice command words is very important. Therefore, experimental tests are conducted on the eight predefined voice command words (i.e. up and down, left and right, forward and backward, and clockwise and counter clockwise). Each of the eight voice command words is tested 10 times in each group, and 10 groups are tested in total. The experimental process is defined as follows. First, as the eight predefined voice command words are spoken, the near-field speech recognition system captures the voice command information through the wireless headset microphone. Through the offline command grammar rule library, the voice command and its confidence level are recognised. Then, the robot motion speed in Cartesian coordinates is obtained through the correspondence, set in advance, between the robot motion commands and the voice commands. Finally, the obtained robot motion speed is visualised and fed back to the surgeon. At the same time, the joint speed of the robot is obtained based on the inverse Jacobian matrix of the robot, and the robot is driven to realise the voice control of the surgical field of view. The experimental results are shown in Figure 12.
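The success rate used in this protocol can be computed as the fraction of correctly recognised utterances over all groups. The sketch below is illustrative only: the function name `recognition_rates` and the data layout are hypothetical, and the outcomes in the usage note are made-up placeholders, not the experimental data of the authors.

```python
def recognition_rates(results):
    """Per-word recognition success rate.

    results -- {word: [group, ...]}, where each group is a list of booleans,
               True meaning the utterance was recognised correctly.
    """
    rates = {}
    for word, groups in results.items():
        # Flatten all groups of one word into a single list of outcomes.
        outcomes = [ok for group in groups for ok in group]
        rates[word] = sum(outcomes) / len(outcomes)
    return rates
```

With 10 groups of 10 repetitions, each word contributes 100 outcomes; for instance, placeholder data in which every group of a word contains 8 successes and 2 failures yields a rate of 0.8 for that word.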
It can be seen from Figure 12 that the recognition accuracy of the voice control command word 'counterclockwise' is low, about 80%, while the recognition accuracy of the remaining seven command words averages above 90%. Analysis of the causes shows that words with overly long pronunciation are not suitable for offline command word recognition; two to three syllables are generally preferred. In addition, command words should be easy to pronounce and differ clearly from one another in pronunciation. According to this selection principle, the command word is modified ('counterclockwise' changed to 'counterclock') and retested. The recognition result of the new command word is shown in Figure 13. It can be seen that after the command word is optimised, the recognition accuracy of all voice command words averages above 90%. According to our discussion with the surgeons of our partner hospital, this success rate can basically meet the control requirement of the nasal robot when the movement distance of each voice command is set reasonably (for example, when the end-effector motion distance in the corresponding motion direction is set to 2.0 mm for each voice command). For the Chinese voice command words, the test process is the same as for the English ones, and the recognition accuracy is similar. For simplicity, detailed test results are not shown here. Considering the motion safety of the robot, the recognition process needs to be optimised to further improve the recognition accuracy of command words. In addition, foot pedals can be used to control the robot motion speed, start and stop etc. to improve the motion safety of the robot.

FIGURE 12 The recognition success rate of robot voice control command words
FIGURE 13 The recognition success rate of robot voice control command words (after voice command words optimisation)

HE ET AL.

| CONCLUSION
To solve the problems of hand-held endoscopes in traditional nasal endoscopic surgery, the authors designed a nasal endoscopic surgical robot based on the RCM mechanism to replace the assistant surgeon in holding the endoscope during nasal surgery. A voice-based control method is proposed so that the surgeon can achieve 'one surgeon-three hands' surgical operation, which helps to improve the surgeon's operation ability. First, with respect to the inverted funnel shaped workspace of the nasal endoscope, a nasal endoscopic robot based on the RCM mechanism is designed. The positioning arm of the robot adopts the PRR configuration, which is convenient for the initial positioning during surgery, and the RCM mechanism is adopted at the end-effector of the robot to realise the pulling protection of the nostril during surgery. Then, considering the surgeon's need to operate surgical instruments with both hands, a robot control method based on speech recognition is proposed. This method analyses the eight motion directions of the endoscopic image and obtains the corresponding motion speed of the endoscope tip under RCM constraints. Based on a commercial speech recognition software interface, an offline grammar rule library for robot motion instructions compatible with both Chinese and English is implemented. Combined with the surgical procedure, the overall strategy of robotic voice-based control is established. Finally, a virtual robot experimental system is built to verify the robot control method based on speech recognition. The experimental results show that the voice-based control method can realise the surgical field of view control of robotic nasal surgery, which provides guidance for the subsequent development of the actual robot system. The recognition accuracy of the eight voice motion commands proposed by the authors is over 90% on average.
For the selection of offline command words, the pronunciation length has a considerable influence on the recognition results. Words that are too long or too short are not suitable, and 2-3 syllables are preferred. In addition, command words should be easy to pronounce and differ clearly from one another in pronunciation. In future work, a prototype of the nasal endoscopic robot will be manufactured, and experiments on robotic nasal endoscopic surgery based on voice control will be carried out. Considering the motion safety of the robot, the recognition algorithm needs to be optimised to further improve the recognition accuracy of command words. In addition, pedals can also be introduced to control the motion speed of the robot and improve its motion safety.