Emesal proto

Origins of Emesal // University of Helsinki

Analysis
Morphology: Distribution of nominal morphemes over EG and ES vocabulary
Predictability by word: Concordances, 3D scatterplots and statistics on word embeddings
Predictability by text: Heatmaps, Oracc links
More info: Information about the process and data

Acknowledgements: Niek Veldhuis, Steve Tinney, Noah Kröll, Sebastian Fink, Krister Lindén (PI)

generated with emesal_vectors.py -- asahala 2022




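def demo():
    """Run the prediction and statistics steps on 2nd-millennium texts."""
    PREFIX = '2nd_mill_'  # passed to every step; batch_process takes it as filename
    threshold = 10
    #purge(PREFIX)

    window = 10  # prediction context window
    vector_window = 3  # vector window
    split_lines = True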

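    # Keep 2nd-millennium texts with more than 30 words, an Emesal ratio
    # of at least 0.1 and a lacuna ratio of at most 0.33.
    dataset = [text for text in EmesalFinder.find_texts()
               if text.millennium == '2nd' and
               text.word_count > 30 and
               text.emesal_ratio >= 0.1 and
               text.lacuna_ratio <= 0.33]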

    # Disabled in this run:
    #batch_process(n=5, threshold=threshold, data=dataset, filename=PREFIX,
    #              vector_window=vector_window, window=window,
    #              split_lines=split_lines)
    #general_vectors_makedata(PREFIX, vector_window=vector_window)
    #general_vectors_similarities(PREFIX)
    predict_emesal(PREFIX, dataset, window=window)
    generate_statistics(PREFIX, dataset, threshold)
    generate_morphology_table(PREFIX, dataset, threshold)

if __name__ == '__main__':
    demo()
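
The dataset filter in demo() can be tried without the corpus. The following is a minimal sketch, not the project's own code: the Text dataclass is a hypothetical stand-in for whatever EmesalFinder.find_texts() yields, select_texts and its parameter names are made up for illustration, the meaning given to emesal_ratio and lacuna_ratio is an assumption, and the sample records are invented.

```python
from dataclasses import dataclass


@dataclass
class Text:
    """Hypothetical stand-in for the objects yielded by EmesalFinder.find_texts()."""
    name: str
    millennium: str
    word_count: int
    emesal_ratio: float  # assumed: share of Emesal words in the text
    lacuna_ratio: float  # assumed: share of broken or missing words


def select_texts(texts, millennium='2nd', min_words=30,
                 min_emesal=0.1, max_lacuna=0.33):
    """Apply the same four checks as the list comprehension in demo()."""
    return [t for t in texts
            if t.millennium == millennium
            and t.word_count > min_words
            and t.emesal_ratio >= min_emesal
            and t.lacuna_ratio <= max_lacuna]


# Invented sample records, for illustration only.
SAMPLE = [
    Text('A', '2nd', 120, 0.25, 0.10),  # passes every check
    Text('B', '1st', 500, 0.40, 0.05),  # wrong millennium
    Text('C', '2nd', 30, 0.50, 0.00),   # word_count must exceed 30
    Text('D', '2nd', 80, 0.05, 0.10),   # Emesal ratio too low
    Text('E', '2nd', 80, 0.10, 0.33),   # ratio bounds are inclusive, passes
    Text('F', '2nd', 80, 0.30, 0.50),   # lacuna ratio too high
]

selected = select_texts(SAMPLE)
print([t.name for t in selected])  # prints ['A', 'E']
```

Text C shows that the word-count bound is strict (exactly 30 words is rejected), while text E shows that both ratio bounds are inclusive.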