polifonia-lexicon-re-engineering

This repository collects the outcomes of the effort for re-engineering Polifonia Lexicon and adding it to Framester.

Polifonia Lexicon Re-engineering for enriching Framester

This repository contains the code and output produced to re-engineer the Polifonia Lexicon and prepare it for import into Framester in RDF format.

Repository Structure

The repository is structured as follows:

input/: This folder contains the input data required for the lexicon re-engineering process. It includes the following subfolders:
- excel/: Contains the lexicon excel files on which the annotators worked. The 'initial' ones are the ones before the work of the annotators began. The 'final' ones are the ones after the work of annotators ended.
- csv_to_compare/: Contains the CSV files extrapolated from the excel and used for comparing the initial and final versions of the lexicon, to separate the lexicon entries. They are created using 'data_preparation.py'.
- csv_for_rdf/: Contains the CSV files generated from the Polifonia Lexicon, which are then transformed into RDF following Framester's schema. They are created using 'data_preparation.py'.
- csv_for_wn_alignment: Contains the merged CSV files generated from finding best matches for the Polifonia Lexicon in Wordnet.
output/: This folder contains the output generated by the re-engineering process. It includes the following subfolder:
- rdf/: Contains the RDF files generated from the CSVs using the lexicon_to_framester.py script.
schema/: This folder contains the project (.drawio file) for manipulating the graphical diagrams of the re-engineered Polifonia Lexicon's schema and the OWL ontology file(s) derived from the re-engineered Polifonia Lexicon's RDFs.
script/: This folder contains the scripts required for the re-engineering process. It includes the following files:
- data_preparation.py: A script for transforming the Polifonia Lexicon files into CSV format.
- lexicon_to_framester.py: A script for converting the CSVs created from the Polifonia Lexicon into RDF data according to the Framester schema.
- lexicon_kg_to_onto.py: A script for converting Polifonia Lexicon's RDF files into an OWL ontology ready to be imported in Protegé for further elaboration.
- lexicon_to_wordnet.ipynb: A notebook for finding best matches for Wordnet synsets for the Polifonia Lexicon.
- enrich_input_file_EN_with_wordnet_synset_synsets.py: A script for enriching the CSVs created from the Polifonia Lexicon with the Wordnet synsets found in lexicon_to_wordnet.ipynb.
- wordnet_utils.py: Contains utility functions for working with Wordnet.

Name		Name	Last commit message	Last commit date
Latest commit History 71 Commits
.idea		.idea
input		input
output		output
schema		schema
script		script
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

polifonia-lexicon-re-engineering

Polifonia Lexicon Re-engineering for enriching Framester

Repository Structure

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Uh oh!

Folders and files

Latest commit

History

Repository files navigation

polifonia-lexicon-re-engineering

Polifonia Lexicon Re-engineering for enriching Framester

Repository Structure

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages