MetricHunter: A software metric dataset generator utilizing SourceMonitor upon public GitHub repositories
Date
2023
Abstract
Version control systems are now widely mined to obtain software metric datasets, and machine learning is applied to these datasets to predict various aspects of software, including quality monitoring and influence analysis. However, constructing a metric dataset is challenging, and the content of the dataset may affect the success of learning-based models. In this study, we propose a dataset construction tool, MetricHunter, which produces platform- or language-specific datasets that can be used to predict the features of newly created software. The tool is developed in C# and uses an established metric-gathering tool, SourceMonitor, together with the GitHub REST API for public repositories. A user can thus construct a suitable dataset from a graphical user interface by simply specifying the programming language or target platform. The tool's outputs on a set of repositories are validated by inspecting the automatically generated attribute values and comparing them with the measurements of metric-gathering tools and with the GitHub metric values. © 2023 The Author(s)
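As an illustration of the language-based repository selection described in the abstract, the following minimal sketch builds a request URL for the repository search endpoint of the GitHub REST API. It is not taken from MetricHunter, which is written in C#; the function name `build_search_url` and its parameters are hypothetical, and the sketch only constructs the URL without sending a request.

```python
from urllib.parse import urlencode

# Repository search endpoint of the GitHub REST API.
GITHUB_SEARCH_URL = "https://api.github.com/search/repositories"


def build_search_url(language, min_stars=0, per_page=30, page=1):
    """Build a search URL for public repositories in a given language.

    Hypothetical helper for illustration only; it does not reflect how
    MetricHunter itself selects repositories.
    """
    if not language:
        raise ValueError("language must be a non-empty string")

    # Search qualifiers are space-separated inside the single `q` parameter.
    qualifiers = [f"language:{language}"]
    if min_stars > 0:
        qualifiers.append(f"stars:>={min_stars}")

    params = {
        "q": " ".join(qualifiers),
        "sort": "stars",
        "order": "desc",
        "per_page": per_page,
        "page": page,
    }
    # urlencode percent-escapes the qualifiers so the URL is safe to send.
    return f"{GITHUB_SEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_search_url("csharp", min_stars=100, per_page=10))
```

Each repository returned by such a query could then be cloned and passed to a metric-gathering tool such as SourceMonitor; that step is outside the scope of this sketch.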
Keywords
Application programming interfaces (API), Computer software selection and evaluation, Control systems, Graphical user interfaces, Information management, Learning systems, C++ programming, Dataset construction, Git version control system, Influence analysis, Learning Based Models, Machine-learning, Quality monitoring, Software metrics, Software Quality, Version control system, Quality control