Performance of different KNN models in prediction english language readability

Altay O.

Date

2022

Authors

Altay O.

Publisher

Institute of Electrical and Electronics Engineers Inc.

Abstract

Assessing the readability of English, a universal language, is important for matching readers at different reading levels with texts suited to their level. Presenting readers with texts at their own level helps them develop their learning, comprehension, and reading capacities. In this study, a data set collected from BBC news was used to predict the readability of English sentences. The data set consists of 17,724 different sentences. Several k-nearest neighbor (KNN) models were compared: basic KNN, two different weighted KNN models, and KNN-based random subspace ensembles. The KNN-based random subspace ensemble outperformed the other KNN models, achieving an accuracy of 0.9749 and an F1-score of 0.9692. © 2022 IEEE.
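
The three model families named in the abstract can be illustrated with a minimal, standard-library-only Python sketch. This is not the author's implementation. The abstract does not state the features, the weighting schemes, the value of k, or the ensemble settings, so everything below is an assumption made for illustration: the three toy features (sentence length in words, mean word length, mean syllables per word), the "easy"/"hard" labels, inverse-distance weighting as one possible weighted KNN, and the ensemble size and subspace size.

```python
import math
import random
from collections import Counter, defaultdict


def knn_predict(train_X, train_y, x, k=3, weighted=False, features=None):
    """Classify x from its k nearest training points (Euclidean distance).

    weighted=False: plain majority vote (basic KNN).
    weighted=True:  each neighbor votes with weight 1/distance
                    (one common weighting; an assumption here).
    features:       optional list of feature indices to use (a subspace).
    """
    idx = list(features) if features is not None else list(range(len(x)))
    dists = []
    for row, label in zip(train_X, train_y):
        d = math.sqrt(sum((row[j] - x[j]) ** 2 for j in idx))
        dists.append((d, label))
    dists.sort(key=lambda pair: pair[0])

    votes = defaultdict(float)
    for d, label in dists[:k]:
        # Small epsilon avoids division by zero for exact matches.
        votes[label] += 1.0 / (d + 1e-9) if weighted else 1.0
    return max(votes, key=votes.get)


def random_subspace_predict(train_X, train_y, x, k=3,
                            n_learners=10, subspace_size=2, seed=0):
    """Random subspace ensemble: each KNN learner sees a random subset
    of the features, and the ensemble takes a majority vote."""
    rng = random.Random(seed)
    ballots = Counter()
    for _ in range(n_learners):
        feats = rng.sample(range(len(x)), subspace_size)
        ballots[knn_predict(train_X, train_y, x, k, features=feats)] += 1
    return ballots.most_common(1)[0][0]


# Hypothetical toy data: [words per sentence, mean word length,
# mean syllables per word]. Not taken from the paper's BBC data set.
train_X = [
    [5, 3.8, 1.2], [6, 4.0, 1.3], [7, 4.1, 1.2], [8, 4.2, 1.4],
    [22, 5.6, 1.9], [25, 5.9, 2.0], [28, 6.1, 2.1], [30, 6.0, 2.2],
]
train_y = ["easy"] * 4 + ["hard"] * 4

short_sentence = [6, 4.0, 1.3]
long_sentence = [27, 6.0, 2.0]

basic = knn_predict(train_X, train_y, short_sentence)
weighted = knn_predict(train_X, train_y, long_sentence, weighted=True)
ensemble = random_subspace_predict(train_X, train_y, long_sentence)
print(basic, weighted, ensemble)
```

On real data the features would normally be scaled before computing distances, and k, the weighting, and the subspace size would be tuned on held-out data; none of those settings are given in the abstract.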