OPTIMAL ALGORITHM OF LINGUISTIC INDEXATION
DOI:
https://doi.org/10.32782/folium/2025.6.6
Keywords:
linguistic indexation, semantics and syntax, comparative analysis, language system, applied linguistics and corpus, sentence structure, mediatext
Abstract
One of the core features of the development of linguistics in the 21st century is the emergence of large volumes of documents, publications, and other information sources that need to be sorted and unified. It was during this period that the first information retrieval systems were developed.
At first, such searches were carried out exclusively by hand. However, the rapid development of the computer industry and the subsequent automation of these processes contributed significantly to the digitization of textual information and, consequently, to the development of automatic information retrieval systems.
The article presents a comprehensive overview of the phenomenon of linguistic indexation, including the challenges posed by unstructured textual data in the digital era. Highlighting the need to improve the information search process, the article addresses the low efficiency of existing information analysis systems, which is caused mainly by uncontrolled information overload. To pursue the key objective of the article, the authors provide a detailed outline of existing systems for automatic language analysis in order to identify their main features and the gaps that remain to be addressed.
In developing the methodology for an optimal linguistic indexation algorithm, the authors analyze and integrate all levels of language analysis: morphological, syntactic, and semantic.
The authors propose a structured multi-step approach to enhance the quality of automated text analysis, encompassing grammatical parsing, morphological tagging, syntactic-semantic dependency analysis, and semantic modeling. The findings suggest that this approach improves the accuracy of information analysis and helps structure the information ecosystem more effectively.