Extraction of latent variables from Big Data in institutional research of pedagogical engineers as an element of monitoring the professional competence of specialists
Abstract. The article describes a method for "compressing" information by extracting latent variables from Big Data survey results. A distinctive feature of the data is that it contains several types of answers, including answers in free-text form, which rules out existing methods that assess respondents' answers only on a Likert scale. The described methodology makes it possible to transform any database of survey results into a form suitable for statistical processing and to compress the original data array to a size that allows comparative analysis of survey results over a long period of time. A three-level model of the student satisfaction index (IUS) has been built. The first-level latent variables of this model detail the composition of the second-level latent variables, and the second-level latent variables determine the IUS index. The second-level latent variables are attitudes, educational process, relationships, and conditions. The first-level latent variables are attitude to the academy, attitude to the specialty, content of educational programs, organization of the educational process, practical training, information services, relationships with the administration, relationships with teachers and staff, relationships with students, living conditions, food, and leisure. At the data filtering stage, questions unrelated to the research topic and questions that appeared in the questionnaires only occasionally rather than in every survey were excluded from the database. At the data transformation stage, the response texts were converted into numerical form suitable for processing. For this transformation, all questions in the questionnaires were classified into three types: dichotomous questions, Likert-scale questions, and questions with free-form answers.
The article presents the results of testing the methodology by constructing a student satisfaction index. Data analysis confirmed the validity and reliability of the survey results.
Keywords: institutional research, student surveys, student satisfaction index, latent variable, question scales, digitization of the questionnaire, checking the compatibility of answers, validity and reliability of the survey.
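
The filtering, digitization, and aggregation stages summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the 0 to 1 rescaling, the free-text codebook, the use of simple means for aggregation, and the grouping of first-level variables under second-level ones (inferred only from the order in which they are listed) are all assumptions made for illustration.

```python
from statistics import mean

# ASSUMPTION: grouping inferred from the order of the lists in the abstract;
# the article itself may assign first-level variables differently.
MODEL = {
    "attitudes": ["attitude to the academy", "attitude to the specialty"],
    "educational process": [
        "content of educational programs",
        "organization of the educational process",
        "practical training",
        "information services",
    ],
    "relationships": [
        "relationships with the administration",
        "relationships with teachers and staff",
        "relationships with students",
    ],
    "conditions": ["living conditions", "food", "leisure"],
}


def constant_questions(waves):
    """Filtering stage: keep only question ids present in every survey wave."""
    return set.intersection(*(set(wave) for wave in waves))


def digitize_dichotomous(answer):
    """Map a yes/no answer to 1.0 or 0.0 (hypothetical coding)."""
    return {"yes": 1.0, "no": 0.0}[answer.strip().lower()]


def digitize_likert(answer, points=5):
    """Rescale a Likert answer 1..points to the range 0..1 (assumed scaling)."""
    return (int(answer) - 1) / (points - 1)


def digitize_free_text(answer, codebook):
    """Look up a free-form answer in a manually built codebook.

    Returns None when the answer has not been coded (hypothetical behaviour).
    """
    return codebook.get(answer.strip().lower())


def satisfaction_index(first_level):
    """Aggregate first-level scores into second-level scores and one index.

    ASSUMPTION: unweighted means at both levels; the article may use
    a different aggregation rule.
    """
    second_level = {
        name: mean(first_level[child] for child in children)
        for name, children in MODEL.items()
    }
    return mean(second_level.values()), second_level
```

For example, digitize_likert("4") rescales the fourth point of a five-point scale to 0.75, and satisfaction_index returns both the overall index and the four second-level scores, so each level of the model can be compared across survey years.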