Probabilistic movement primitive based motion learning for a lower limb exoskeleton with black-box optimization

Source: 优秀文章 (Excellent Articles). Published: 2023-03-23

Jiaqi WANG, Yongzhuo GAO, Dongmei WU, Wei DONG

State Key Laboratory of Robotics and Systems, Harbin Institute of Technology, Harbin 150001, China

Abstract: As a wearable robot, an exoskeleton provides a direct transfer of mechanical power to assist or augment the wearer’s movement with an anthropomorphic configuration. When an exoskeleton is used to facilitate the wearer’s movement, a motion generation process often plays an important role in high-level control. One of the main challenges in this area is to generate in real time a reference trajectory that is parallel with human intention and can adapt to different situations. In this paper, we first describe a novel motion modeling method for a lower limb exoskeleton based on probabilistic movement primitive (ProMP), a new and powerful representative tool for generating motion trajectories. To adapt the trajectory to different situations when the exoskeleton is used by different wearers, we propose a novel motion learning scheme based on the black-box optimization (BBO) algorithm PI$^{\mathrm{BB}}$ combined with ProMP. The motion model is first learned by ProMP offline, and can then generate reference trajectories for use by exoskeleton controllers online. PI$^{\mathrm{BB}}$ is adopted to learn and update the model for online trajectory generation, which gives the system the capability to adapt and reduces the effects of uncertainties. Simulations and experiments involving six subjects using the lower limb exoskeleton HEXO demonstrate the effectiveness of the proposed methods.

Key words: Lower limb exoskeleton; Human-robot interaction; Motion learning; Trajectory generation; Movement primitive; Black-box optimization

In recent years, there has been growing interest in the application of robotic exoskeletons as a solution to assist people undertaking activities. Many lower extremity exoskeletons have been developed and are widely used in power augmentation (Guizzo and Goldstein, 2005; Kazerooni and Steger, 2006; Zoss et al., 2006; Walsh et al., 2007), walking assistance (Sankai, 2010; Hassan et al., 2014), and rehabilitation training (Colombo et al., 2000; Veneman et al., 2007; Strausser and Kazerooni, 2011; Esquenazi et al., 2012; Sanz-Merodio et al., 2014). One of the toughest issues in this area is that, as a typical human-robot coupling system, the exoskeleton should work cooperatively with the human wearer (Deng et al., 2020). Although appropriate control strategies are developing quickly, motion learning remains one of the main research subjects in the field of exoskeleton robots (Lee et al., 2015; Yan et al., 2015; Xu and Sun, 2018). A humanlike reference trajectory can help the exoskeleton system achieve favorable human-robot interaction and is directly related to the comfort of the wearer. In addition, a trajectory in parallel with human intention is considered to help smooth the movement and optimize the mechanical efficiency to save energy.

Motion trajectory generation has been extensively investigated by researchers in the field of human-robot interaction. A model-based strategy is a classic method to generate trajectories for lower limbs (Kagawa et al., 2015; Kazemi and Ozgoli, 2019). Based on the model and the stability criteria, like the link model (Fu and Chen, 2008), inverted pendulum model (IPM) (Komura et al., 2005), and zero-moment point (ZMP) model (Vukobratović and Borovac, 2004; Al-Shuka et al., 2016; He et al., 2017), the trajectories are generated using mathematical expressions. This kind of method relies on the accuracy of the human-exoskeleton and environment model, so its effectiveness is limited by this objective condition (Kazemi and Ozgoli, 2019). Therefore, the trajectory has poor adaptability to the actual environment and poor robustness to disturbance.

The role of the exoskeleton is to provide walking assistance in coordination with a person. Naturally, exoskeletons generate trajectories by emulating human movement. It is essential to detect and realize the wearer’s movement instead of using a predefined motion. Some intelligent technologies have been developed for this, and learning from demonstration has recently gained considerable interest in studies of robot systems (Yang et al., 2019; Deng et al., 2020). Movement primitive (MP) is a well-established approach for representing and generating movement from demonstration (Schaal et al., 2003; Krüger et al., 2007; Kulić et al., 2012). Ijspeert et al. (2002, 2013) proposed a tool named dynamic movement primitive (DMP) for representing rhythmic and discrete trajectories. In Huang et al. (2018), DMP was combined with locally weighted regression (LWR) to model exoskeleton trajectories. Representing motion by means of MP is considered motion generation. Continuous learning is also required to achieve the flexibility needed in a human-robot coupling system. An exoskeleton with self-adaptive motion learning is adaptable to different wearers and environments, and can reduce the effect of uncertainties and disturbances. Yuan et al. (2020) proposed a trajectory-learning scheme for motion generation based on path integrals (PI$^2$) combined with DMP. Huang et al. (2019) proposed coupled cooperative primitives (based on DMP) to learn the motion, using policy improvement with PI$^2$ to update the parameters. Their results, and those of other studies, demonstrated the stable performance of the system after the motion learning converged. However, too many iterations are needed in the learning process. This may not be a serious issue if the exoskeleton maintains a steady walking pace, but such learning copes poorly with complex and variable walking situations. Besides, the need for almost 30 iterations every time the subject is changed is a challenge.

Almost all the motion generations of the lower limb exoskeleton can be learned by PI$^2$, because PI$^2$ is an efficient and easy-to-implement algorithm of reinforcement learning (RL) (Theodorou et al., 2010; Schmidhuber, 2015). However, to further improve the performance, a more efficient algorithm is needed. We have shown that the algorithm PI$^{\mathrm{BB}}$, devised by Stulp and Sigaud (2012), outperforms PI$^2$ in terms of convergence speed and final cost. As a modification of PI$^2$, PI$^{\mathrm{BB}}$ simplifies the exploration and parameter update methods of PI$^2$. In essence, PI$^{\mathrm{BB}}$ is a kind of black-box optimization (BBO) algorithm.

The convergence speed depends not only on the learning algorithm, but also on the representative ability of MP. DMP is a commonly used trajectory-based representation approach (Yang et al., 2019). However, DMP is more suitable for learning a point-to-point trajectory because of the convergent nature of its attractor. Generalization to new and unseen situations in DMPs is limited, so further work is needed for representing optimal behaviors (Stulp and Sigaud, 2012). Moreover, traditional MP may suffer from relatively low speed and low accuracy. Therefore, in this paper, some new MP concepts are presented and implemented. A novel probabilistic movement primitive (ProMP) was proposed by Paraschos et al. (2013, 2018). In ProMP, a probability distribution is used to encode the movement, as this is often a requirement for representing optimal behaviors. In contrast, deterministic approaches such as DMP can represent only suboptimal ones. Moreover, capturing the variance of the movement leads to better generalization capabilities (Todorov and Jordan, 2002; Schaal et al., 2005). Most importantly, unlike past approaches (d’Avella and Bizzi, 2005) that learn movements from a single demonstration, ProMP can be learned from multiple demonstrations by incorporating the variance. This increases flexibility and enhances the advantageous properties of the representation. For exoskeletons, this kind of representation learned from multiple motion habits can generate a more general human motion trajectory.

In this paper, we propose a novel motion learning scheme for a lower limb exoskeleton. For motion generation, a powerful motion representative tool, ProMP, is used to model exoskeleton motion trajectories from multiple demonstrations. To our knowledge, ProMP has not previously been used for motion planning of an exoskeleton. Then, for motion adaptation, the optimization algorithm PI$^{\mathrm{BB}}$ is adopted to learn and update the ProMP motion model online, so the exoskeleton can adapt to different wearers and variable environments. Simulations and experiments demonstrate the effectiveness of the proposed methods. The motion learning can quickly adapt to a new wearer and generate a trajectory in parallel with human intention. The convergence speed is higher than that of the existing methods. The human-exoskeleton system can obtain better flexibility and faster movement coordination.

Fig. 1 shows the working framework of the exoskeleton system with the proposed motion learning strategy. The major subsystems in this generalized framework include a hierarchical control structure (Tucker et al., 2015), the wearer, and the exoskeleton. The proposed motion learning strategy occupies a high-level layer of the structure as shown in the solid box, consisting of initial motion generation shown by the dotted line, and motion adaptation shown by the dashed box. The motion model is first generated offline from trajectory demonstration, and then updated online by an optimization algorithm. The trajectory learned offline is regarded as the reference trajectory for online working. The trajectory is changed and updated when the subject begins to move. The data from the actual joint trajectories are used in the PI$^{\mathrm{BB}}$ algorithm to calculate the corresponding cost value, and then the parameters are updated. Next, the ProMP algorithm with the updated parameters generates new desired trajectories for the lower limb exoskeleton. In the following subsections, we explain how the motion is represented and learned by ProMP, and provide details about the trajectory adaptation.

Fig. 1 The working framework of the exoskeleton system with the proposed motion learning strategy

2.1 Motion representation

In ProMP, a probabilistic model built on basis functions is introduced to represent the trajectory. The trajectory distribution of the lower limb exoskeleton in this research is defined in the joint space. $q_t$ and $\dot{q}_t$ represent the joint angular position and joint angular velocity, respectively, of each degree of freedom (DOF) at time $t$. An underlying weight vector $\omega$ is used to produce a single trajectory. A linear basis function model is used to describe the state of the joint:
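In the standard ProMP formulation of Paraschos et al. (2013), this linear model is commonly written as below. The expression is a sketch in the notation of this section, assuming the state $y_t$ stacks the joint position and velocity:

```latex
y_t = \begin{bmatrix} q_t \\ \dot{q}_t \end{bmatrix}
    = \Phi_t^{\mathsf{T}} \omega + \varepsilon_t
```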

where $\Phi_t = [\varphi_t, \dot{\varphi}_t]$ is the $N \times 2$ dimensional time-dependent basis function matrix, and $N$ defines the number of basis functions. $\varepsilon_t \sim \mathcal{N}(0, \Sigma_t)$ is zero-mean Gaussian noise.

Based on Spiegelhalter et al. (2002) and Tucker et al. (2015), with $\omega$ following a Gaussian distribution $\omega \sim p(\omega; \theta) = \mathcal{N}(\omega \mid \mu_\omega, \Sigma_\omega)$ with parameters $\theta$, the trajectory distribution is introduced as
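In the standard ProMP formulation, the trajectory distribution is obtained by marginalizing out the weight vector. A sketch of that expression, assuming $\Sigma_y$ denotes the observation noise covariance, is:

```latex
p(y_t; \theta) = \int \mathcal{N}\!\left(y_t \mid \Phi_t^{\mathsf{T}} \omega,\ \Sigma_y\right)
                 \mathcal{N}\!\left(\omega \mid \mu_\omega,\ \Sigma_\omega\right) \mathrm{d}\omega
```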

The distribution $p(y_t; \theta)$ defines the hierarchical Bayesian model whose parameters are given by the parameters $\theta$ and the observation noise variance $\Sigma_y$.

Temporal modulation is needed for adapting to changes in walking speed. A phase variable $z$ is introduced to separate the movement from the time signal. The phase can be any function that monotonically increases with time $t$, and the speed of the movement can be modulated by modifying the rate $\alpha$ of the phase variable. In this study, $z_t$ is adopted as

At the beginning of the movement, the phase $z_0$ is defined as 0, and at the end, the phase is $z_E = 1$. The basis function $\varphi_t$ now directly depends on the phase instead of the time:
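The effect of temporal modulation can be illustrated with a minimal sketch. It assumes a linear phase $z_t = \alpha t$, which is one common choice satisfying $z_0 = 0$ and $z_E = 1$; the cycle durations and the function name `phase` are hypothetical.

```python
import numpy as np

def phase(t, alpha):
    """Linear phase z_t = alpha * t (assumed form), increasing monotonically with time."""
    return alpha * np.asarray(t, dtype=float)

# Two gait cycles of different duration, both sampled at 150 time steps.
t_slow = np.linspace(0.0, 1.2, 150)   # 1.2 s per cycle (hypothetical)
t_fast = np.linspace(0.0, 0.8, 150)   # 0.8 s per cycle (hypothetical)

# Choosing alpha = 1 / cycle duration makes the phase run from z_0 = 0 to z_E = 1.
z_slow = phase(t_slow, alpha=1.0 / 1.2)
z_fast = phase(t_fast, alpha=1.0 / 0.8)
```

Because the basis functions are evaluated at the phase rather than at the time, both cycles map to the same phase grid, so a single weight vector reproduces the same joint trajectory shape at either walking speed.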

2.2 Motion initial learning

The probabilistic model represents the trajectory distribution based on a basis function. For human walking motion, the von Mises basis functions $b_i$ (Jenison and Fissell, 1995) for rhythmic movement are used to model periodicity in the phase variable $z$:

where $h$ denotes the width of the basis, and $c_i$ is the center of the $i$th basis function. Then, it is normalized by
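A minimal sketch of normalized von Mises basis functions follows. It assumes the common rhythmic ProMP form $b_i(z) = \exp(\cos(2\pi(z - c_i))/h)$ with centers spaced evenly over one period; the function name and the spacing of the centers are assumptions for illustration.

```python
import numpy as np

def normalized_von_mises(z, n_basis, h=0.05):
    """Return an (n_basis, T) matrix of normalized von Mises basis functions."""
    z = np.atleast_1d(np.asarray(z, dtype=float))
    centers = np.arange(n_basis) / n_basis            # assumed: evenly spaced on [0, 1)
    # Subtracting 1 inside the exponent only rescales every basis by the same
    # constant, which cancels in the normalization and avoids large exponentials.
    b = np.exp((np.cos(2.0 * np.pi * (z[None, :] - centers[:, None])) - 1.0) / h)
    return b / b.sum(axis=0, keepdims=True)           # each column sums to 1

z = np.linspace(0.0, 1.0, 150)
Phi = normalized_von_mises(z, n_basis=10)
```

The cosine makes every basis function periodic in the phase, so the value at $z = 1$ matches the value at $z = 0$, which suits a repeating gait cycle.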

The distribution $p(y_t; \theta)$ for time step $t$ is shown in Eq. (7), by which the mean and the variance for any time point $t$ can be evaluated:
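Because the weight distribution and the observation noise are both Gaussian, the marginal has a closed form. In the standard ProMP formulation it is written as below, which is assumed here to be the form of Eq. (7):

```latex
p(y_t; \theta) = \mathcal{N}\!\left(y_t \mid \Phi_t^{\mathsf{T}} \mu_\omega,\
                 \Phi_t^{\mathsf{T}} \Sigma_\omega \Phi_t + \Sigma_y\right)
```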

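To generate motion, $p(\omega; \theta)$ needs to be learned from multiple demonstrations. Assuming that there are $M$ demonstration trajectories, the weight for each trajectory is estimated using linear ridge regression, where $Y_m$ represents the positions at all time steps of the $m$th demonstration trajectory, and $\lambda$ is a regression parameter. Then the parameters $\theta = \{\mu_\omega, \Sigma_\omega\}$ are obtained using the maximum likelihood estimation algorithm. The mean $\mu_\omega$ and covariance $\Sigma_\omega$ are computed from the samples $\omega_m$: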
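The two learning steps can be sketched as follows, under stated assumptions: the ridge solution is taken as $\omega_m = (\Phi\Phi^{\mathsf{T}} + \lambda I)^{-1}\Phi Y_m$ for a position-only basis matrix $\Phi$ of shape $N \times T$, and the demonstrations are synthetic curves rather than recorded gait data. All function names are hypothetical.

```python
import numpy as np

def normalized_von_mises(z, n_basis, h=0.05):
    z = np.asarray(z, dtype=float)
    centers = np.arange(n_basis) / n_basis
    b = np.exp((np.cos(2.0 * np.pi * (z[None, :] - centers[:, None])) - 1.0) / h)
    return b / b.sum(axis=0, keepdims=True)                   # shape (N, T)

def fit_weights(Y, Phi, lam=0.01):
    """Ridge regression for one demonstration: (Phi Phi^T + lam I)^-1 Phi Y."""
    n_basis = Phi.shape[0]
    return np.linalg.solve(Phi @ Phi.T + lam * np.eye(n_basis), Phi @ Y)

def learn_weight_distribution(demos, Phi, lam=0.01):
    """Maximum likelihood mean and covariance of the per-demonstration weights."""
    W = np.stack([fit_weights(Y, Phi, lam) for Y in demos])   # shape (M, N)
    mu = W.mean(axis=0)
    centered = W - mu
    Sigma = centered.T @ centered / W.shape[0]                # divides by M, not M - 1
    return mu, Sigma

# Synthetic stand-ins for M = 3 joint trajectories (not experimental data).
z = np.linspace(0.0, 1.0, 150)
demos = [a * np.sin(2.0 * np.pi * z) + 0.1 * np.sin(4.0 * np.pi * z)
         for a in (0.35, 0.40, 0.45)]
Phi = normalized_von_mises(z, n_basis=10)
mu_w, Sigma_w = learn_weight_distribution(demos, Phi)
mean_traj = Phi.T @ mu_w                                      # mean position trajectory
```

The mean trajectory `Phi.T @ mu_w` plays the role of the offline reference trajectory, and `Sigma_w` captures how much the demonstrations differ from one another.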

2.3 Motion adaptation

The optimization algorithm adopted in this study is PI$^{\mathrm{BB}}$. This kind of policy improvement algorithm is updated based on each improved execution, or “roll-out.” Based on a total of $K$ alternative trajectories with slight differences, policy improvement methods then update the parameter vector $\omega \to \omega_{\mathrm{new}}$ such that the policy is expected to incur lower costs.

The policy perturbation during a roll-out is generated from the model of the trajectory with noise

Then, based on Paraschos et al. (2018), the cost of the $k$th roll-out trajectory is

where $M_{t,k}$ is a projection matrix onto the range space of $\Phi_t$ under the metric $J^{-1}$, and $r_{t,k}$ is the immediate cost of the $k$th trajectory at time $t$.

For the $k$th roll-out trajectory, the immediate cost function calculated from the sensing signal feedback is defined as follows:

where $q_t$ represents the joint angle of the exoskeleton, and $q_t^d$ is the desired position of the wearer. Then the overall trajectory cost $R$ is

The probability of the $k$th roll-out trajectory is obtained by mapping the cost of each trajectory to [0, 1] through the softmax function, as shown in Eq. (15):

where the parameter $\gamma$ is a constant coefficient within (0, 1]. It can be seen from Eq. (15) that the higher the cost, the lower the probability, which drives PI$^{\mathrm{BB}}$ to converge toward parameters with low cost.
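The cost-to-probability mapping can be sketched as below, assuming the softmax takes the form $P_k = e^{-S_k/\gamma} / \sum_{j} e^{-S_j/\gamma}$ as read from Algorithm 1. The function name and the example costs are hypothetical.

```python
import numpy as np

def rollout_probabilities(S, gamma=0.5):
    """Map roll-out costs S_k to probabilities with a softmax over -S_k / gamma."""
    S = np.asarray(S, dtype=float)
    # Shifting all costs by min(S) leaves the softmax unchanged and keeps
    # the exponentials in a numerically safe range.
    e = np.exp(-(S - S.min()) / gamma)
    return e / e.sum()

costs = [3.0, 1.0, 2.0]            # hypothetical costs of K = 3 roll-outs
P = rollout_probabilities(costs)
```

A smaller $\gamma$ concentrates the probability mass on the cheapest roll-outs, while $\gamma = 1$ gives the flattest weighting allowed by the stated range.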

The final parameter is updated through reward-weighted averaging:

The process of PI$^{\mathrm{BB}}$ for motion model adaptation is shown in Algorithm 1, which details the dashed box shown in Fig. 1. The index notations in this paper are listed in Table 1.

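Algorithm 1 Motion adaptation
Input: initial state of the parameter $\omega$ ($\delta\omega$ is weighted averaging), the basis function $\Phi_t$, and desired trajectory $y^d$
Output: parameter vector $\omega$
1 for $k = 1, 2, \ldots, K$
2   Sample $\varepsilon_k \sim \mathcal{N}(0, \Sigma)$
3   Roll-out: $y_{t,k} = \Phi_t^{\mathsf{T}}(\omega + \varepsilon_k)$
4   $r_{t,k} = (q_{t,k} - q_{t,k}^{d})^2$
5   Compute trajectory cost: $M_{t,k} = \dfrac{J^{-1}\Phi_{t,k}\Phi_{t,k}^{\mathsf{T}}}{\Phi_{t,k}^{\mathsf{T}} J^{-1} \Phi_{t,k}}$
6   $S_k = \sum_{t=0}^{E-1} r_{t,k} + \frac{1}{2}\sum_{t=1}^{E-1} (\omega + M_{t,k}\varepsilon_k)^{\mathsf{T}} J (\omega + M_{t,k}\varepsilon_k)$
7 end for
8 for $k = 1, 2, \ldots, K$
9   Compute the probability of each roll-out:
10  $P_k = \dfrac{e^{-\frac{1}{\gamma} S_k}}{\sum_{k=1}^{K} e^{-\frac{1}{\gamma} S_k}}$
11 end for
12 Cost-weighted averaging: $\delta\omega = \sum_{k=1}^{K} P_k \cdot \varepsilon_k$
13 Update: $\omega_{\mathrm{new}} \leftarrow \omega + \delta\omega$
14 Overall trajectory cost: $R = \frac{1}{E}\sum_{t=1}^{E} r_t$
15 until the overall trajectory cost $R$ converges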
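The loop in Algorithm 1 can be sketched in simplified form. The sketch below makes several assumptions that are not taken from the paper: the roll-out cost is the plain mean squared tracking error (the projection-matrix term of $S_k$ is omitted), the exploration covariance is a fixed $\sigma^2 I$, costs are rescaled to [0, 1] before the softmax, and the current and desired trajectories are synthetic. All names are hypothetical.

```python
import numpy as np

def normalized_von_mises(z, n_basis, h=0.05):
    z = np.asarray(z, dtype=float)
    centers = np.arange(n_basis) / n_basis
    b = np.exp((np.cos(2.0 * np.pi * (z[None, :] - centers[:, None])) - 1.0) / h)
    return b / b.sum(axis=0, keepdims=True)                  # shape (N, T)

def trajectory_cost(w, Phi, q_des):
    """Overall cost R: mean squared tracking error over one gait cycle."""
    return float(np.mean((Phi.T @ w - q_des) ** 2))

def pibb_update(w, Phi, q_des, rng, K=50, sigma=0.05, gamma=0.1):
    """One black-box update: perturb, evaluate, softmax-weight, average."""
    eps = rng.normal(0.0, sigma, size=(K, w.size))           # eps_k ~ N(0, sigma^2 I)
    S = np.array([trajectory_cost(w + e, Phi, q_des) for e in eps])
    spread = S.max() - S.min()
    S01 = (S - S.min()) / spread if spread > 0 else np.zeros(K)  # assumed rescaling
    P = np.exp(-S01 / gamma)
    P /= P.sum()
    return w + P @ eps                                       # w_new = w + sum_k P_k eps_k

rng = np.random.default_rng(0)
z = np.linspace(0.0, 1.0, 150)
Phi = normalized_von_mises(z, n_basis=10)
q_now = 0.40 * np.sin(2 * np.pi * z) + 0.10 * np.sin(4 * np.pi * z)        # synthetic
q_des = 0.50 * np.sin(2 * np.pi * z + 0.3) + 0.15 * np.sin(4 * np.pi * z)  # synthetic

# Start from weights fitted to the current trajectory, then adapt toward q_des.
w = np.linalg.solve(Phi @ Phi.T + 0.01 * np.eye(10), Phi @ q_now)
costs = [trajectory_cost(w, Phi, q_des)]
for _ in range(30):
    w = pibb_update(w, Phi, q_des, rng)
    costs.append(trajectory_cost(w, Phi, q_des))
```

Only the scalar cost of each roll-out enters the update, which is what makes the method a black-box optimizer: no gradient of the cost with respect to $\omega$ is needed.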

Table 1 Definition of the index notations

In this section, we describe simulations and experiments conducted on a lower limb exoskeleton to verify the proposed motion learning scheme. To test the feasibility of the proposed method before implementing it on the hardware platform, we performed simulations.

3.1 Simulations

3.1.1 Motion generation

Simulations were implemented to first verify the representation ability of the proposed motion model, and compare it with the existing classical methods. First, a curve generated by the second-order Fourier series was used to imitate a human walking trajectory:

This reference trajectory consists of periodic sine waves with different frequencies and amplitudes. ProMP represents and learns the demonstration trajectory based on Section 2.2. The regression parameter $\lambda$ is generally set to 0.01, and the basis function width $h$ is 0.05. The number of basis functions is crucial to the representative ability of primitives. Fig. 2a shows the trajectories learned from different numbers $N$ of basis functions, and Fig. 2b shows the root mean square error (RMSE) between the learned trajectory and the target trajectory. The representation ability is weak when $N$ is small, but grows extremely fast as $N$ increases. The trajectory learned by 10 basis functions, shown by the brown line in Fig. 2a, coincides almost exactly with the target, shown by the dashed blue line, and the RMSE is within 0.003.
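The dependence of the fitting error on $N$ can be explored with a small sketch. The Fourier coefficients below are illustrative assumptions, not the values used in the simulation, so the resulting RMSE values are not expected to match Fig. 2; the helper names are hypothetical.

```python
import numpy as np

def normalized_von_mises(z, n_basis, h=0.05):
    z = np.asarray(z, dtype=float)
    centers = np.arange(n_basis) / n_basis
    b = np.exp((np.cos(2.0 * np.pi * (z[None, :] - centers[:, None])) - 1.0) / h)
    return b / b.sum(axis=0, keepdims=True)

def fit_rmse(q_ref, z, n_basis, lam=0.01, h=0.05):
    """Fit q_ref by ridge regression on n_basis functions and return the RMSE."""
    Phi = normalized_von_mises(z, n_basis, h)
    w = np.linalg.solve(Phi @ Phi.T + lam * np.eye(n_basis), Phi @ q_ref)
    return float(np.sqrt(np.mean((Phi.T @ w - q_ref) ** 2)))

z = np.linspace(0.0, 1.0, 150)
# Second-order Fourier series with assumed amplitudes and phase offset.
q_ref = 0.40 * np.sin(2.0 * np.pi * z) + 0.10 * np.sin(4.0 * np.pi * z + 0.5)

rmse = {n: fit_rmse(q_ref, z, n) for n in (3, 5, 10)}
```

Printing `rmse` gives the fitting error for each basis count, mirroring the comparison made in Fig. 2b.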

Fig. 2 Trajectory learning by ProMP under basis functions of different numbers: (a) learning curve; (b) learning cost

To reveal the representation ability of ProMP in this case, the commonly used trajectory representation DMP was also adopted to learn the reference trajectory for comparison. The performance of DMP learning under different numbers of basis functions is shown in Fig. 3.

Fig. 3 Trajectory learning by DMP under basis functions of different numbers: (a) learning curve; (b) learning cost

For DMP, the RMSE was not as large as that of ProMP at the beginning, but as the number of functions increased, the improvement in RMSE was very small. Even with 10 basis functions, at which point ProMP had completely converged, the trajectory of DMP was far from the target trajectory. The final convergence curve was still clearly separate from the target trajectory, with the final RMSE being around 0.03 rad, which is 10 times that of ProMP. Using LWR to learn the weights $\omega$ of DMP, it was almost impossible to achieve the same RMSE as ProMP for this kind of trajectory based on our simulation. To some extent, needing fewer basis functions indicates a stronger representative ability of the approach. Besides, the computational cost grows in proportion to the number of basis functions, so fewer basis functions favor the real-time performance of the strategy. Therefore, ProMP achieved a better performance by representing a trajectory with great accuracy and efficiency. In addition, the superior performance of ProMP over DMP was confirmed in a study involving stroke-based movements (Paraschos et al., 2013).

3.1.2 Trajectory adaptation

For online motion generation of the exoskeleton, powerful representation ability is essential, but not sufficient. The representative tool must also be adaptive and able to reproduce a new trajectory precisely and as quickly as possible. We conducted a simulation to demonstrate the online adaptation and updating of the proposed method, ProMP combined with PI$^{\mathrm{BB}}$ (ProMP-PI$^{\mathrm{BB}}$). The performance of the baseline DMP combined with PI$^2$ (DMP-PI$^2$) in the same situation was also evaluated. The adopted number of basis functions of ProMP was 10 according to Section 3.1.1. To ensure that the initial trajectories of DMP and ProMP were as similar as possible, the basis function number of DMP needed to be 150. It was assumed that Eq. (17) was the current trajectory, and that the target trajectory was similar, but had different frequencies and amplitudes:

The initial value of $\omega$ corresponded to the current trajectory, and was updated every gait cycle. The time steps were normalized to 150 based on the time interval of each gait cycle. During learning, $K = 50$ roll-outs were performed for one update. Fig. 4 shows the trajectory adaptation processes of ProMP-PI$^{\mathrm{BB}}$ and DMP-PI$^2$. For brevity, only a few representative time nodes are shown. The trajectory updated by ProMP-PI$^{\mathrm{BB}}$ was very close to the target trajectory by only the 5th update, while the DMP-PI$^2$ trajectory was still close to its starting point. The ProMP-PI$^{\mathrm{BB}}$ trajectory matched the target closely from the 15th update. The fitting process of DMP-PI$^2$ proceeded at a constant speed. In the end, DMP-PI$^2$ needed 20 updates to converge to the target, and its final trajectory still had visible misfits compared to that of ProMP-PI$^{\mathrm{BB}}$.

Fig. 4 Trajectory adaptation processes of ProMP-PI$^{\mathrm{BB}}$ and DMP-PI$^2$

To fully evaluate the adaptation efficiency, the learning costs are shown in Fig. 5. The convergence speed of ProMP-PI$^{\mathrm{BB}}$ was distinctly higher than that of DMP-PI$^2$. The trajectory of ProMP-PI$^{\mathrm{BB}}$ took only about 10 updates to converge. However, DMP-PI$^2$ took at least 20 updates to reach a low cost, and the final cost was also worse than that of ProMP-PI$^{\mathrm{BB}}$. This confirmed that ProMP-PI$^{\mathrm{BB}}$ outperforms DMP-PI$^2$ in terms of convergence speed and final cost. Furthermore, DMP-PI$^2$ had a disadvantage in terms of computation time per update. First, the programming logic (i.e., the complexity of the code) of ProMP is simple, so it reduces the computational complexity. Second, DMP needs many more basis functions to achieve the same performance, which costs more time. Therefore, in principle, the calculation time of DMP for each update is much longer than that of ProMP. The same holds for PI$^{\mathrm{BB}}$ relative to PI$^2$, because PI$^{\mathrm{BB}}$ is essentially a simplification of PI$^2$. In the simulation, the ProMP-PI$^{\mathrm{BB}}$ update time was about 0.46 s with 10 basis functions. With DMP-PI$^2$, the update time was 1.86 s for 10 basis functions, and 10.14 s for 150 basis functions.

Fig. 5 Learning costs during trajectory adaptation of ProMP-PI$^{\mathrm{BB}}$ and DMP-PI$^2$

3.2 Experiments on lower limb exoskeleton

3.2.1 Hardware

In this subsection, we describe experiments implemented on a lower limb exoskeleton system, HEXO, developed in our lab. HEXO is an anthropomorphic device, which has similar DOFs to the human lower limb. Fig. 6 shows the main components of HEXO. The backpack was equipped with an Advanced RISC Machines (ARM) control panel, power supply, and data acquisition card. There were four active DOFs for hip and knee flexion/extension. The actuation system was powered by a brushless DC motor. An incremental encoder was integrated into the motor. The motor was combined with a harmonic drive with a ratio of 1:100 in the hip joint, and 1:80 in the knee joint. Lower limb motion was measured by an inertial measurement unit (IMU). Torque sensors were placed at the joints. Three six-axis force sensors (SFSs) were installed at the back and in the sensing shoes, between the wearer and the exoskeleton, to perceive the human-robot interaction force. All sensor data were transmitted to the ARM panel through a controller area network (CAN) bus, whose transmission rate was up to 1 Mb/s.

Fig. 6 The hardware system of the HEXO exoskeleton

3.2.2 Experimental protocol

As shown in Fig. 1, the motion trajectory first needed to be learned offline before the online experiment was carried out on HEXO. Three voluntary subjects 1, 2, and 3, whose characteristics are listed in Table 2, participated in the data acquisition. Subjects 1, 2, and 3 were asked to perform level walking on a treadmill at their normal speed. The exoskeleton HEXO was working in zero-force mode with no enabled torque assistance, to obtain the most natural gait of the subjects when wearing the exoskeleton.

Table 2 Detailed information of the six subjects

In both offline and online experiments, trajectories of all four joints of the HEXO were generated simultaneously. Only the left leg data are shown in all figures of this paper, because the properties of the two legs are similar.

3.2.3 Motion initial learning

The trajectory data were first obtained, and the next step was to represent and learn the trajectory using ProMP. The simulation in Section 3.1.1 verified that ProMP is a powerful representative tool, but another beneficial property of ProMP is that it can concurrently activate multiple primitives, i.e., learn multiple trajectories. Fig. 7 shows the walking trajectories of subjects 1, 2, and 3, cut and normalized according to the gait cycle. Figs. 7a and 7b show the mean and covariance of the hip and knee data, respectively. The general trend of the curve was the same for each joint, but the shape of the curve differed, even when the subjects had similar heights and weights. The red areas of Figs. 7c and 7d show the trajectories learned by ProMP from all three subjects, and contain all the possibilities. The red line can be regarded as the average of all acquired trajectories, so it is more representative than any single one. Besides, the more trajectories learned, the more general the reference trajectory.

Fig. 7 Motion initial learning by ProMP from subjects 1, 2, and 3: (a) mean and covariance of the hip trajectory data; (b) mean and covariance of the knee trajectory data; (c) hip trajectory learned by ProMP; (d) knee trajectory learned by ProMP

3.2.4 Motion online adaptation

The trajectory learned offline is regarded as the reference trajectory when working online. The exoskeleton took several gait cycles to learn the optimal parameters $\omega$ based on the initial one, which made the trajectory cost $R$ converge. The experiment was implemented to test the effect of online adaptation of the proposed method. We included baseline DMP-PI$^2$ for comparison. Previous results could not be used for comparison because of different criteria and experimental conditions, so we reproduced the DMP-PI$^2$ on our platform with the same conditions as in our proposed ProMP-PI$^{\mathrm{BB}}$. To avoid too much calculation time, 20 commonly used basis functions were selected for DMP.

In this experiment, there were three new subjects, 4, 5, and 6, as listed in Table 2. To evaluate the effectiveness of the method, subjects with different characteristics were deliberately selected for validation. For example, subject 5 was a female of smaller stature and subject 6 was much older than the other subjects. The subjects were also asked to deliberately change their speeds several times when performing level walking, to test the adaptability of the method to different speeds.

Fig. 8 shows the online trajectories generated by ProMP-PI$^{\mathrm{BB}}$ and DMP-PI$^2$, and the actual trajectory for subject 4, that is, the process of motion adaptation. Comparing ProMP-PI$^{\mathrm{BB}}$ with DMP-PI$^2$, the initial error of the first step before learning was almost the same. However, the trajectory generated by ProMP-PI$^{\mathrm{BB}}$ converged to the desired trajectory after about the fourth step for both the knee and hip joints. DMP-PI$^2$ did not converge until the sixth step for the hip and the seventh for the knee, and the fitting of the trajectory was not good for the hip.

Motion adaptation also includes temporal modulation. Temporal modulation is a valuable property as it enables the motion model to be applied to walking when the speed changes. After all, it is inevitable that speed changes during human walking. The speed of the first two steps was very stable and the curve fitted well (Fig. 9). The speed was slightly lower from the third step and could be adjusted immediately. The gait changed slightly in the fifth step, the stride became larger, and the next step was adjusted to this quickly, as shown by the red curve. Starting from the seventh step, the speed increased, and the generated trajectory was constantly adjusted. When the speed started to stabilize after the ninth step, the trajectory was almost stable. Therefore, the trajectory can be adapted quickly when gait or walking speed changes.

Fig. 8 Motion adaptation process of ProMP-PI$^{\mathrm{BB}}$ and DMP-PI$^2$: (a) hip of ProMP-PI$^{\mathrm{BB}}$; (b) hip of DMP-PI$^2$; (c) knee of ProMP-PI$^{\mathrm{BB}}$; (d) knee of DMP-PI$^2$

Fig. 9 Trajectory adaptation performance of the proposed ProMP-PI$^{\mathrm{BB}}$ when the walking speed changes

Table 3 summarizes the experiment results of the three subjects for DMP-PI$^2$ and ProMP-PI$^{\mathrm{BB}}$. It shows the RMSEs before adaptation (the first step) and after adaptation, the convergence step, and the improvement rate of the proposed method. Before updating, the RMSEs of ProMP were lower than those of DMP. At the end of the adaptation, the final RMSEs of ProMP-PI$^{\mathrm{BB}}$ were also smaller than those of DMP-PI$^2$. The average improvement for the three subjects was 15.49%, indicating that the proposed strategy achieved better performance. Although the final errors of the two methods were both small, even a small mismatch between the desired trajectory and the generated trajectory can cause large human-robot interaction resistance when the exoskeleton is working. Therefore, any improvement that can reduce the error is valuable. In addition, for ProMP and ProMP-PI$^{\mathrm{BB}}$, the error convergence was not so obvious. This was mainly because the initial reference trajectory learned by ProMP was already a general trajectory, so there was no need for much adjustment during trajectory adaptation. Besides, no two steps of a person during walking are exactly the same, so the error of the generated trajectory must fluctuate, even after convergence. Fig. 8 shows that the converged trajectory was very similar to the desired trajectory, but there were still slight mismatches.

Table 3 The adaptation experiment results (hip/knee) of ProMP-PI$^{\mathrm{BB}}$ and DMP-PI$^2$ for three subjects

Most importantly, the convergence time was consistent across all experimental subjects. With ProMP-PI$^{\mathrm{BB}}$ the trajectory converged at around the fourth gait cycle, but DMP-PI$^2$ needed seven or more cycles. Whether for the hip joint or knee joint, the proposed method could generate a trajectory suitable for the current wearer in only three or four steps. Furthermore, the trajectory generation errors of the knee joint were a bit larger and more unstable than those of the hip joint, because the movement of the knee joint is more complicated.

The effect of trajectory generation was also affected by the characteristics of the subjects. Among the three subjects, 4, 5, and 6, the performance of subject 5 was the worst. The reason may be that subject 5 was a female who had the smallest height and weight, very different from subjects 1, 2, and 3, whose data were used for offline learning. The initial error of the first step of subject 6 was the lowest, which is reasonable as his physical characteristics were the closest to those of the three offline subjects. However, the error of subject 6 was the most unstable, perhaps because of his unstable gait.

The experiment results showed that our proposed method features faster convergence and a smaller cost compared with the baseline DMP-PI$^2$. Moreover, as stated in Section 3.1, ProMP-PI$^{\mathrm{BB}}$ has a much lower computational cost for each update. Overall, the proposed motion learning scheme is a reliable high-level approach for exoskeleton control. It generates trajectories in real time, in parallel with human intention, and can quickly react to different subjects and variable situations.

In this paper, we propose a novel motion learning scheme to generate a motion trajectory online for lower limb exoskeletons. There are two complementary aspects of this novel scheme: motion generation and motion adaptation. For motion generation, the motion is modeled by ProMP with offline initial learning using pre-collected trajectories. For motion adaptation, the motion model based on ProMP can be further learned and updated online using the black-box optimization PI$^{\mathrm{BB}}$. This is the first time that ProMP has been adopted to model motion for an exoskeleton. The simulation and experiment results showed that this motion learning can generate a trajectory online in parallel with human intention quickly and accurately, and most importantly, the learning speed is much higher than those of the existing methods. The experiments verified that the proposed strategy has a better performance than the existing popular strategies, with both a higher convergence rate and a lower final cost. Therefore, the exoskeleton with the proposed motion learning is able to adapt to different wearers and variable environments in a timely manner. The human-exoskeleton system can work together faster and more consistently, with better human-robot interaction. The combination of ProMP and PI$^{\mathrm{BB}}$ produces an even better effect.

In the future, the motion learning will be tested under assistance mode to complete the exoskeleton function. The appropriate control method and corresponding results will be analyzed in detail. For preliminary testing of the effect of the proposed method, the motion modes tested in this study involved only ground-level walking. In the future, all basic rhythmic locomotion modes in daily living will be included, such as stair ascent, stair descent, ramp ascent, and ramp descent.

Contributors

Jiaqi WANG conducted the research and drafted the paper. Yongzhuo GAO, Dongmei WU, and Wei DONG revised and finalized the paper.

Compliance with ethics guidelines

Jiaqi WANG, Yongzhuo GAO, Dongmei WU, and Wei DONG declare that they have no conflict of interest.

Data availability

The data that support the findings of this study are available from the corresponding author upon reasonable request.
