pip install -r requirements.txt
pip install -e .
We aim to classify users by their keyboard typing rhythm on a particular sentence. To this end, we collect the hold time, key interval, and key code for each user and classify the users with LSTM and Transformer models. We obtain the best results by training Transformer models with varying dropout rates and ensembling them.
We treat key intervals above 20 seconds as anomalies, so we remove the data samples that contain them. We then plot the histogram and notice that the distribution is right-skewed. Therefore, we apply the transformation 1/(x+0.1) to bring the data closer to a normal distribution, followed by standardization.
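The three steps described here (outlier removal, the 1/(x+0.1) transform, standardization) can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function name `preprocess_key_intervals`, its parameters, and the sample values are hypothetical, and it assumes intervals are non-negative and measured in seconds.

```python
import numpy as np

def preprocess_key_intervals(samples, max_interval=20.0, eps=0.1):
    """Drop anomalous samples, unskew with 1/(x + eps), then standardize.

    `samples` is a list of per-sample key-interval sequences (seconds).
    Assumes all intervals are non-negative, so x + eps is never zero.
    """
    # Remove every sample that contains an interval above the threshold.
    kept = [np.asarray(s, dtype=float) for s in samples
            if np.max(s) <= max_interval]
    # Reciprocal transform: compresses the long right tail.
    transformed = [1.0 / (s + eps) for s in kept]
    # Standardize with statistics pooled over all remaining values.
    pooled = np.concatenate(transformed)
    mean, std = pooled.mean(), pooled.std()
    return [(t - mean) / std for t in transformed]

# Made-up data: the second sample contains a 25-second pause and is dropped.
samples = [[0.1, 0.3, 0.2], [0.5, 25.0, 0.1], [0.4, 0.2, 0.9]]
processed = preprocess_key_intervals(samples)
```

In a train/test setting, the mean and standard deviation would normally be computed on the training split only and reused for the test split; the sketch pools everything for brevity.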
As above, we plot the histogram. Here the distribution is already approximately normal, so we apply only standardization.
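The decision between "unskew first" and "standardize only" is made here by looking at a histogram. As a hedged complement, sample skewness gives a number for the same judgment. The helpers `sample_skewness` and `standardize` below are hypothetical names introduced for illustration, and the input values are made up.

```python
import numpy as np

def sample_skewness(values):
    """Third standardized moment: about 0 if symmetric, positive if right-skewed."""
    x = np.asarray(values, dtype=float)
    centered = x - x.mean()
    return np.mean(centered ** 3) / np.std(x) ** 3

def standardize(values):
    """Rescale to zero mean and unit variance."""
    x = np.asarray(values, dtype=float)
    return (x - x.mean()) / x.std()

symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]      # skewness is zero
right_skewed = [1.0, 1.0, 1.0, 1.0, 10.0]  # skewness is positive

skew_symmetric = sample_skewness(symmetric)
skew_right = sample_skewness(right_skewed)
standardized = standardize(symmetric)
```

A feature with skewness near zero can go straight to `standardize`, while a clearly positive value suggests applying an unskewing transform first.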
We transform the data below
df1 = pd.DataFrame(
    {
        'user_id': user_id[user],
        'data_id': i,
        'seq': seq,
        'key_interval': key_interval,
        'key_code': key_code,
        'hold_time': hold_time[:, 1]
    }
)
into a one-hot encoded representation.
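A minimal sketch of such an encoding with pandas is shown here. It assumes that `key_code` is the column being one-hot encoded, which the text does not state explicitly; the row values and the `key` column prefix are made up for illustration.

```python
import pandas as pd

# Hypothetical miniature of the frame built above; all values are invented.
df1 = pd.DataFrame({
    'user_id': [0, 0, 0],
    'data_id': [0, 0, 0],
    'seq': [0, 1, 2],
    'key_interval': [0.00, 0.21, 0.15],
    'key_code': [72, 73, 72],
    'hold_time': [0.09, 0.11, 0.08],
})

# Assumption: key_code is the categorical column to expand.
# Each distinct key code becomes its own 0/1 column (key_72, key_73, ...).
encoded = pd.get_dummies(df1, columns=['key_code'], prefix='key', dtype=int)
```

Since every sample must end up with the same columns, the set of key codes would in practice need to be fixed across the whole dataset, for example by encoding the concatenated data in one call.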
Without unskewing and standardization, training does not proceed properly.
[Test] Loss: 5.1195, Acc: 0.2785
[Test] Loss: 0.6280, Acc: 0.8481
[Test] Loss: 0.1697, Acc: 0.9494
[Test] Loss: 0.0405, Acc: 0.9873
[Test] Loss: 0.0056, Acc: 1.0000
In our experiments, the ensemble of Transformer models classified the users without any errors.
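The combining rule for the ensemble is not stated here. One common choice is soft voting, where the class probabilities of the individual models are averaged before taking the argmax. The sketch below assumes that rule; `ensemble_predict` and the logits are hypothetical and stand in for the outputs of Transformer models trained with different dropout rates.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def ensemble_predict(logits_per_model):
    """Average class probabilities over models and return predicted class ids."""
    probs = np.stack([softmax(np.asarray(l, dtype=float))
                      for l in logits_per_model])
    # Mean over the model axis, then pick the most probable class per sample.
    return probs.mean(axis=0).argmax(axis=-1)

# Made-up logits from three models for 2 samples and 3 users.
logits_per_model = [
    [[2.0, 0.5, 0.1], [0.2, 0.3, 1.5]],
    [[1.5, 1.6, 0.0], [0.1, 0.2, 2.0]],  # disagrees on the first sample
    [[2.2, 0.4, 0.3], [0.0, 1.0, 1.2]],
]
predictions = ensemble_predict(logits_per_model)
```

In this example the second model narrowly prefers user 1 for the first sample, but the averaged probabilities still favor user 0, which is the kind of individual error an ensemble can absorb.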
The design elements of this project were adapted from the work of keytracer.








