Abstract
The Internet has become a natural communication platform for modern society. Web archives, which began capturing and preserving changing web content in the 1990s, have thus become key sources for research into the recent past. Analysis of their data is complicated by factors such as researchers' insufficient technical competencies, the need for computing resources, and legal restrictions. One way to meet users' needs is to develop tools and research interfaces that allow researchers to work with the data without requiring advanced extraction skills, thereby opening the archives to a wider research community. This study addresses the issue of access to archived web data, surveys efforts to formulate a theoretical and methodological framework, and proposes a design for data access and processing. The design is applied in a unique research interface for extracting large-scale data from web archives, using advanced machine learning for the generation and categorization of text outputs.