An Open Access Dataset for Supervised Machine Learning to Estimate Gait Biomechanical Characteristics
Gait datasets in the literature are limited because motion capture systems are complex and not widely accessible. Moreover, the existing datasets comprise only young adults. This paper presents a comprehensive dataset of human gait that contains the kinematics and muscle activity of the lower limb, captured using electromyography (EMG) and inertial measurement units (IMU). The data were collected from 65 male and female participants aged 19 to 73 years. A case study applies two supervised machine learning models, a Feedforward Neural Network (FNN) and a Long Short-Term Memory (LSTM) network, to the dataset to demonstrate its feasibility for estimating the dynamics of human gait, particularly lower-extremity muscle activity. The models were tested on unseen data and on another online dataset. The results showed that the LSTM outperformed the FNN. The LSTM models achieved a root mean square error (RMSE) below 11%, a correlation coefficient (r) above 90%, and peak timing differences below 10% when predicting EMG signals in the test dataset. The dataset is expected to accelerate the adoption of supervised machine learning in clinical and rehabilitation settings, particularly for gait analysis.
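The three evaluation metrics named above (RMSE, correlation coefficient, and peak timing difference) can be illustrated with a minimal sketch. The abstract does not state how the percentages are normalized, so the choices here are assumptions: RMSE is expressed as a percentage of the measured signal's range, r is the Pearson coefficient, and the peak timing difference is the offset between the two signal maxima as a percentage of the gait cycle length. The function name `evaluate_emg_prediction` and the synthetic signals are hypothetical and are not taken from the dataset.

```python
import numpy as np


def evaluate_emg_prediction(measured, predicted):
    """Compare a predicted EMG envelope with a measured one over one gait cycle.

    Returns (rmse_pct, r, peak_diff_pct). The normalizations are assumptions
    made for illustration, not necessarily those used in the paper.
    """
    measured = np.asarray(measured, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    if measured.ndim != 1 or measured.shape != predicted.shape:
        raise ValueError("signals must be 1-D arrays of equal length")

    signal_range = measured.max() - measured.min()
    if signal_range == 0:
        raise ValueError("measured signal is constant; range normalization undefined")

    # RMSE as a percentage of the measured signal's range (assumed normalization).
    rmse = np.sqrt(np.mean((measured - predicted) ** 2))
    rmse_pct = 100.0 * rmse / signal_range

    # Pearson correlation coefficient between the two signals.
    r = np.corrcoef(measured, predicted)[0, 1]

    # Offset between the maxima, as a percentage of the cycle length (assumed definition).
    peak_offset = abs(int(np.argmax(measured)) - int(np.argmax(predicted)))
    peak_diff_pct = 100.0 * peak_offset / measured.size

    return rmse_pct, r, peak_diff_pct


# Synthetic example: a smooth activation burst and a copy delayed by 5 samples.
t = np.linspace(0.0, 1.0, 100, endpoint=False)  # one gait cycle, 100 samples
measured = np.sin(np.pi * t) ** 2
predicted = np.roll(measured, 5)
rmse_pct, r, peak_diff_pct = evaluate_emg_prediction(measured, predicted)
print(f"RMSE: {rmse_pct:.1f}%  r: {r:.3f}  peak timing difference: {peak_diff_pct:.1f}%")
```

Normalizing RMSE by the signal range keeps the value comparable across muscles with different activation amplitudes; other normalizations, such as by the peak or the mean of the measured signal, would be equally plausible readings of a percentage RMSE.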