API Reference
polonaexplorer.explorer module
Explorer for the Polona corpus.
Finds files and text regions containing target words and generates dataframes for use in the plotter.
- class polonaexplorer.explorer.PolonaExplorer(targetwords: list, data_path: str, out_path: str, metadata_file_path: str, part: str = 'region')
Bases: object
Explore the polona2 corpus in METS/MODS format.
- generate_dataframe() → Path
Generate a dataframe of all page texts found.
Uses the metadata file to add publication date, title, place, and other fields for each periodical found. ID is the original identifier from the polona2 archive. Fragments contains the text that the original archive identified as matching.
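A minimal usage sketch for the signature above. The target words and all paths are hypothetical placeholders, and the helper name run_explorer is introduced here for illustration only. The import is guarded so the snippet still runs where the package is not installed.

```python
from pathlib import Path
from typing import Optional

# Hypothetical placeholder values; only the parameter names and the
# default part='region' come from the documented signature.
EXPLORER_KWARGS = {
    "targetwords": ["kolej", "telegraf"],
    "data_path": "data/polona2",
    "out_path": "output",
    "metadata_file_path": "data/metadata.csv",
    "part": "region",
}


def run_explorer(kwargs: dict) -> Optional[Path]:
    """Construct the explorer and return what generate_dataframe() reports.

    Returns None when polonaexplorer cannot be imported.
    """
    try:
        from polonaexplorer.explorer import PolonaExplorer
    except ImportError:
        return None
    explorer = PolonaExplorer(**kwargs)
    # Documented return type is Path.
    return explorer.generate_dataframe()


if __name__ == "__main__":
    result = run_explorer(EXPLORER_KWARGS)
    print(result if result is not None else "polonaexplorer is not installed")
```

Passing the arguments by keyword keeps the call readable, since four of the five parameters are plain strings or lists that are easy to confuse positionally.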
polonaexplorer.plotter module
Generates topic model maps for text containing target words.
- class polonaexplorer.plotter.Plotter(data_path: str, out_path: str, year_range: tuple[int, int], embedding_model: str = 'google/embeddinggemma-300m', topicname_llm_model: str = 'llama4:scout')
Bases: object
Run topic model and plotting.
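A hedged construction sketch for Plotter. The paths and the year range are hypothetical, and the two model names simply restate the documented defaults. No Plotter methods are listed in this reference, so the sketch stops at construction. The helper name make_plotter is illustrative only.

```python
# Hypothetical placeholder values for data_path, out_path and year_range;
# the two model names are the documented defaults, spelled out explicitly.
PLOTTER_KWARGS = {
    "data_path": "output",
    "out_path": "plots",
    "year_range": (1850, 1900),
    "embedding_model": "google/embeddinggemma-300m",
    "topicname_llm_model": "llama4:scout",
}


def make_plotter(kwargs: dict):
    """Construct a Plotter, or return None if polonaexplorer is unavailable."""
    try:
        from polonaexplorer.plotter import Plotter
    except ImportError:
        return None
    return Plotter(**kwargs)


if __name__ == "__main__":
    plotter = make_plotter(PLOTTER_KWARGS)
    print("constructed" if plotter is not None else "polonaexplorer is not installed")
```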