Dataset Persistent ID
|
doi:10.12763/MLCFIE |
Publication Date
|
2022-03-16 |
Title
| Siganalogies - morphological analogies from Sigmorphon 2016 and 2019 |
Alternative URL
| https://github.com/EMarquer/siganalogies |
Author
| Marquer, Esteban (LORIA) - ORCID: 0000-0003-2315-7732
Couceiro, Miguel (LORIA) - ORCID: 0000-0003-2316-7623
Safa Alsaidi (Université de Lorraine) - ORCID: 0000-0002-4132-1068
Amandine Decker (Université de Lorraine) - ORCID: 0000-0001-6773-9983 |
Contact
|
Marquer, Esteban (LORIA)
Couceiro, Miguel (LORIA) |
Description
| The siganalogies dataset contains morphological analogies built upon Sigmorphon 2016 and Sigmorphon 2019, for use with PyTorch. An analogical proportion is a 4-ary relation written A:B::C:D, which reads "A is to B as C is to D". This dataset covers morphological analogies, i.e., analogies between character strings in which the transformations between the objects correspond to morphological transformations of words (e.g., conjugation or declension). Here A, B, C, and D are words; an example in English is "dog : dogs :: cat : cats". The dataset contains: (i) a copy of Sigmorphon 2019 and Sigmorphon 2016 extended with Japanese data, (ii) serialized objects, one for each language, containing the indices of the analogies and other relevant data, and (iii) the code necessary to manipulate the dataset and serialized data. |
Subject
| Arts and Humanities; Computer and Information Science |
Keyword
| Morphology (linguistics) (UNESCO) http://vocabularies.unesco.org/browser/thesaurus/en/
Multilingualism (UNESCO) http://vocabularies.unesco.org/browser/thesaurus/en/
Analogy |
Related Publication
| Alsaidi, Safa & Decker, Amandine & Lay, Puthineath & Marquer, Esteban & Murena, Pierre-Alexandre & Couceiro, Miguel. (2021). On the Transferability of Neural Models of Morphological Analogies. AIMLAI 2021 - workshop on Advances in Interpretable Machine Learning and Artificial Intelligence hal-03313591 https://hal.univ-lorraine.fr/LORIA-NLPKD/hal-03313591v1
Alsaidi, Safa & Decker, Amandine & Lay, Puthineath & Marquer, Esteban & Murena, Pierre-Alexandre & Couceiro, Miguel. (2021). A Neural Approach for Detecting Morphological Analogies. DSAA 2021 - 8th IEEE International Conference on Data Science and Advanced Analytics, p. 1-10 hal-03313556 https://hal.inria.fr/hal-03313556 |
Language
| Albanian; Arabic; Armenian; Bashkir; Basque; Belarusian; Bengali, Bangla; Bulgarian; Czech; Danish; Dutch; English; Estonian; Finnish; French; Georgian; German; Greek (modern); Hebrew (modern); Hindi; Hungarian; Irish; Italian; Japanese; Kannada; Latin; Maltese; Navajo, Navaho; Persian (Farsi); Polish; Portuguese; Romanian; Russian; Sanskrit (Saṁskṛta); Slovak; Slovene; Spanish, Castilian; Swahili; Turkish; Urdu; Uzbek; Welsh; Zulu |
Contributor
| Project Leader : Miguel Couceiro
Project Leader : Esteban Marquer
Project Member : Amandine Decker
Project Member : Safa Alsaidi
Project Member : Puthineath Lay
Project Member : Pierre-Alexandre Murena |
Depositor
| Marquer, Esteban |
Deposit Date
| 2022-03-14 |
Kind of Data
| Software; Text |
Other Kind of Data
| PyTorch serialized objects; Python code |
Software
| Python, Version: 3.8
PyTorch, Version: 1.10 |
Related Material
| https://github.com/EMarquer/siganalogies/blob/main/siganalogies_description.pdf |
Related Datasets
| https://github.com/ryancotterell/sigmorphon2016; https://github.com/sigmorphon/2019 |
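
The Description field defines an analogical proportion A:B::C:D and says that the serialized objects store the indices of the analogies. The sketch below illustrates one way such index-based analogies could work; it is not the siganalogies code. It assumes (this is not stated in the record) that an analogy is formed by pairing two entries that carry the same morphological feature tag. The toy entries and the names `ENTRIES`, `analogy_indices`, `to_analogy`, and `format_analogy` are hypothetical.

```python
from collections import defaultdict
from itertools import product

# Toy entries in a Sigmorphon-like layout: (lemma, inflected form, feature tag).
# Hypothetical data for illustration only; not read from the dataset files.
ENTRIES = [
    ("dog", "dogs", "N;PL"),
    ("cat", "cats", "N;PL"),
    ("walk", "walked", "V;PST"),
    ("jump", "jumped", "V;PST"),
]


def analogy_indices(entries):
    """Return ordered (i, j) index pairs of entries sharing a feature tag.

    Assumption: two entries with the same tag form an analogy A:B::C:D.
    """
    by_feature = defaultdict(list)
    for index, (_, _, feature) in enumerate(entries):
        by_feature[feature].append(index)
    pairs = []
    for indices in by_feature.values():
        # Ordered pairs, including (i, i), which gives the trivial A:B::A:B.
        pairs.extend(product(indices, repeat=2))
    return pairs


def to_analogy(entries, pair):
    """Expand an index pair into the four words (A, B, C, D)."""
    i, j = pair
    a, b, _ = entries[i]
    c, d, _ = entries[j]
    return a, b, c, d


def format_analogy(a, b, c, d):
    """Render four words in the A : B :: C : D notation."""
    return f"{a} : {b} :: {c} : {d}"


if __name__ == "__main__":
    for pair in analogy_indices(ENTRIES):
        print(pair, format_analogy(*to_analogy(ENTRIES, pair)))
```

Storing only index pairs, as in this sketch, keeps the serialized data small because the four words can be rebuilt from the underlying entries on demand. For the actual loading interface, refer to the code in the repository listed under Alternative URL.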