Identification, structuring and analysis of the flow of topics of scientific knowledge: Computational methods based on structures of academic genealogy
Contemporary society is experiencing the so-called Knowledge Era, in which the production, sharing, and accessing of information have never been so useful. In this context, structuring academic-scientific knowledge provides the means to organize formal knowledge into classes, enabling its management and contributing to its dissemination. However, formal models of knowledge classification do not make it possible to observe the relations between different categories, nor do they help to identify the flow of recognition among the members of scientific communities. We therefore seek to develop a computational method for structuring scientific knowledge that takes into account the hierarchical structure provided by Academic Genealogy (AG). The methods considered in this work include (i) collecting genealogical data, (ii) structuring the AG graph, (iii) prospecting biographical records of academics, (iv) inferring the topics in which these academics worked from their biographies, (v) propagating the topics through the AG graph, and (vi) structuring the topic graphs. The methods were applied in a case study in which we considered (i) the records of doctorate holders in mathematics and related areas available from the Mathematics Genealogy Project platform, (ii) a set of biographies of these mathematicians available from Wikipedia, and (iii) a dictionary of topics built from a glossary of areas. The topic graphs resulting from the case study are presented as objects of study covering structurally distinct approaches, each accompanied by its descriptive analysis. We believe that identifying and studying the flow of scientific knowledge among generations of researchers is an important task that remains unexplored by the scientific community owing to the lack of data sets. We believe that this work will allow the definition of a new computational method for the study of the flow of scientific knowledge.
The relevance of the work lies in the possibility of discovering new information that helps to identify, structure, and analyze the paths of science.
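To make steps (iv) to (vi) of the method concrete, the following is a minimal Python sketch on toy data. It is not the implementation used in this work: the names (ADVISES, BIOGRAPHIES, TOPIC_DICTIONARY, infer_topics, propagate_topics, build_topic_graph), the keyword-matching inference, and the propagation rule (an academic with no inferred topic inherits the topics of his or her advisors) are all assumptions made for illustration only.

```python
from collections import Counter, defaultdict, deque

# Hypothetical toy AG graph: advisor -> list of students.
ADVISES = {"A": ["B", "C"], "B": ["D"], "C": [], "D": []}

# Hypothetical biographies; B and D have none.
BIOGRAPHIES = {
    "A": "Worked on number theory and algebra.",
    "C": "Known for contributions to topology.",
}

# Hypothetical dictionary of topics: topic -> keywords.
TOPIC_DICTIONARY = {
    "Number theory": ["number theory"],
    "Algebra": ["algebra"],
    "Topology": ["topology"],
}


def infer_topics(biography, dictionary):
    """Step (iv), assumed rule: a topic applies if any keyword occurs in the text."""
    text = biography.lower()
    return {
        topic
        for topic, keywords in dictionary.items()
        if any(keyword in text for keyword in keywords)
    }


def propagate_topics(advises, seed_topics):
    """Step (v), assumed rule: academics without topics inherit their advisors' topics."""
    indegree = defaultdict(int)
    for advisor, students in advises.items():
        indegree.setdefault(advisor, 0)
        for student in students:
            indegree[student] += 1

    topics = {node: set(seed_topics.get(node, set())) for node in indegree}
    inherited = defaultdict(set)

    # Visit advisors before students (topological order of the AG graph).
    queue = deque(node for node, degree in indegree.items() if degree == 0)
    while queue:
        advisor = queue.popleft()
        if not topics[advisor]:
            topics[advisor] = set(inherited[advisor])
        for student in advises.get(advisor, []):
            inherited[student] |= topics[advisor]
            indegree[student] -= 1
            if indegree[student] == 0:
                queue.append(student)
    return topics


def build_topic_graph(advises, topics):
    """Step (vi): count advisor-to-student transitions between pairs of topics."""
    flow = Counter()
    for advisor, students in advises.items():
        for student in students:
            for source in topics[advisor]:
                for target in topics[student]:
                    flow[(source, target)] += 1
    return flow


seeds = {name: infer_topics(text, TOPIC_DICTIONARY) for name, text in BIOGRAPHIES.items()}
topics = propagate_topics(ADVISES, seeds)
topic_graph = build_topic_graph(ADVISES, topics)
```

In this sketch, each edge of the resulting topic graph is weighted by the number of advisor-student relationships that link the two topics. Other propagation rules, such as weighting inherited topics by genealogical distance, would fit the same structure.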