
Thematic clusters represented by higher ranks in taxonomy of the field

Speaker:

Boris Mirkin, Professor in the School of Applied Mathematics and Information Science, University – Higher School of Economics, Moscow, RF

Abstract

A two-stage generalization method is described. It operates over a set of concepts C organized into a hierarchical structure (a taxonomy) and a set of entities I, each described by a fuzzy subset of those concepts in C that are leaves of the taxonomy. Examples of entities in I: members of a university department conducting research over concepts from C; papers from a journal publishing research related to C; university course module synopses related to the taxonomy.
The first stage builds fuzzy clusters of the taxonomy leaves to represent the entity set. For this purpose, a fuzzy clustering method has been developed by combining the additive and spectral approaches. An additional advantage of the method is that the number of clusters can be determined in the course of sequential extraction of clusters. At the second stage, each cluster is generalized over the structure of the taxonomy in terms of "head subject" nodes on the upper layers of the taxonomy, accompanied by their "gaps" and "offshoots". The criterion of the method is to minimize the total penalty, a weighted sum of the numbers of annotations (head subjects, gaps, and offshoots) introduced over the taxonomy. The method is intended for an integral description of the activities of an organization. Real-world case studies are presented. Joint work with S. Nascimento, T. Fenner, L.M. Pereira, E. Chernyak, O. Chugunova, and J. Askarova.
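
To make the second-stage criterion concrete, here is a minimal Python sketch of penalty-minimizing lifting on a tree. It is an illustration under simplifying assumptions, not the speaker's implementation: the cluster is taken as a crisp set of leaves rather than a fuzzy one, a "gap" is counted as a maximal node under a head subject whose subtree contains no cluster leaves, and an "offshoot" is a cluster leaf not covered by any head subject. The function name lift, the dictionary encoding of the taxonomy, and the weights head_w, gap_w, and off_w are all hypothetical.

```python
from typing import Dict, List, Set, Tuple

Tree = Dict[str, List[str]]  # hypothetical encoding: node -> list of children


def lift(tree: Tree, root: str, cluster: Set[str],
         head_w: float = 1.0, gap_w: float = 0.5, off_w: float = 0.8):
    """Return (penalty, heads, gaps, offshoots) for a crisp leaf cluster.

    Assumption: penalty = head_w * #heads + gap_w * #gaps + off_w * #offshoots.
    """

    def touches(v: str) -> bool:
        # True if the subtree rooted at v contains at least one cluster leaf.
        kids = tree.get(v, [])
        return v in cluster if not kids else any(touches(k) for k in kids)

    def covered(v: str) -> Tuple[float, List[str]]:
        # Gap cost and gap nodes inside v when v is, or lies under, a head subject.
        if not touches(v):
            return gap_w, [v]  # one maximal gap at v
        cost, gaps = 0.0, []
        for k in tree.get(v, []):
            c, g = covered(k)
            cost, gaps = cost + c, gaps + g
        return cost, gaps

    def best(v: str):
        # Cheapest annotation of v's subtree when no ancestor is a head subject.
        if not touches(v):
            return 0.0, [], [], []
        kids = tree.get(v, [])
        if not kids:
            return off_w, [], [], [v]  # uncovered cluster leaf = offshoot
        # Option 1: leave v unmarked and solve each child independently.
        split = [0.0, [], [], []]
        for k in kids:
            for i, part in enumerate(best(k)):
                split[i] = split[i] + part  # adds costs, concatenates lists
        # Option 2: make v a head subject and pay for the gaps below it.
        gap_cost, gaps = covered(v)
        as_head = (head_w + gap_cost, [v], gaps, [])
        return as_head if as_head[0] < split[0] else tuple(split)

    return best(root)


# Toy taxonomy with two branches of three leaves each (illustrative data only).
toy = {"root": ["A", "B"], "A": ["a1", "a2", "a3"], "B": ["b1", "b2", "b3"]}
penalty, heads, gaps, offshoots = lift(toy, "root", {"a1", "a2", "b1"})
print(penalty, heads, gaps, offshoots)
```

The sketch recomputes subtree membership on every call, which is acceptable for a toy taxonomy; a practical version would cache those results and would need to carry fuzzy membership values through the recursion.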