featuredb

Category: Database systems
Development tool: HTML
File size: 0KB
Downloads: 0
Upload date: 2023-08-02 16:36:12
Uploader: sh-1993
Description: WIBARAB is a project in the field of Arabic dialectology. It consists of various regional sub-projects (four PhD projects) and a large database about bedouin-type dialects of Arabic. The Feature Database will be the main point of integrating the results of the sub-projects. In this repository we collect the primary data of the database in TEI/XML.

File list:
.DS_Store (6148, 2023-12-18)
010_manannot/ (0, 2023-12-18)
010_manannot/fLib.xml (17227, 2023-12-18)
010_manannot/features/ (0, 2023-12-18)
010_manannot/features/feature_AKL_3sgm_ipfv.xml (157309, 2023-12-18)
010_manannot/features/feature_AKL_3sgm_pfv.xml (152640, 2023-12-18)
010_manannot/features/feature_IMP_2sg_m_3weak.xml (166755, 2023-12-18)
010_manannot/features/feature_IPFV_2sg_f.xml (191374, 2023-12-18)
010_manannot/features/feature_IPFV_3pl_m_c.xml (208341, 2023-12-18)
010_manannot/features/feature_IPFV_3sg_m_1w.xml (218057, 2023-12-18)
010_manannot/features/feature_JAA_3pl_mc.xml (127010, 2023-12-18)
010_manannot/features/feature_JAA_3sg_f.xml (145073, 2023-12-18)
010_manannot/features/feature_JAA_3sg_m.xml (144124, 2023-12-18)
010_manannot/features/feature_PFV_3pl_m_c.xml (203620, 2023-12-18)
010_manannot/features/feature_PFV_3pl_m_c_3wy.xml (270021, 2023-12-18)
010_manannot/features/feature_PFV_3sg_f.xml (220511, 2023-12-18)
010_manannot/features/feature_PFV_3sg_f_3wy.xml (254811, 2023-12-18)
010_manannot/features/feature_apophonic_passive.xml (102245, 2023-12-18)
010_manannot/features/feature_bound_pronoun_1ps_pl.xml (130173, 2023-12-18)
010_manannot/features/feature_bound_pronoun_1ps_sg_after_consonant.xml (135816, 2023-12-18)
010_manannot/features/feature_bound_pronoun_1ps_sg_after_vowel.xml (133443, 2023-12-18)
010_manannot/features/feature_bound_pronoun_2ps_pl_c.xml (71544, 2023-12-18)
010_manannot/features/feature_bound_pronoun_2ps_pl_f.xml (78827, 2023-12-18)
010_manannot/features/feature_bound_pronoun_2ps_pl_m.xml (87754, 2023-12-18)
010_manannot/features/feature_bound_pronoun_2ps_sg_c_after_consonant.xml (37225, 2023-12-18)
010_manannot/features/feature_bound_pronoun_2ps_sg_c_after_vowel.xml (34204, 2023-12-18)
010_manannot/features/feature_bound_pronoun_2ps_sg_f_after_consonants.xml (135203, 2023-12-18)
010_manannot/features/feature_bound_pronoun_2ps_sg_f_after_vowel.xml (112977, 2023-12-18)
010_manannot/features/feature_bound_pronoun_2ps_sg_m_after_consonant.xml (125806, 2023-12-18)
010_manannot/features/feature_bound_pronoun_2ps_sg_m_after_vowel.xml (95112, 2023-12-18)
010_manannot/features/feature_bound_pronoun_3ps_pl_c.xml (82254, 2023-12-18)
010_manannot/features/feature_bound_pronoun_3ps_pl_f.xml (81462, 2023-12-18)
010_manannot/features/feature_bound_pronoun_3ps_pl_m.xml (97884, 2023-12-18)
010_manannot/features/feature_bound_pronoun_3ps_sg_c.xml (21589, 2023-12-18)
... ...

# WIBARAB feature database

## About WIBARAB

WIBARAB is a project in the field of Arabic dialectology. It consists of various regional sub-projects (four PhD projects) and a large database about bedouin-type dialects of Arabic. The *Feature Database* will be the main point of integrating the results of the sub-projects. In this repository we collect the primary data of the database in TEI/XML.

Principal Investigator: Stephan Procházka (University of Vienna)

National Cooperation Partner: Charly Mörth (Austrian Academy of Sciences)

For more information, contact us at [wibarab@oeaw.ac.at](mailto:wibarab@oeaw.ac.at) or follow us on [Twitter](https://twitter.com/wibarab).

## Status of the data

**THIS IS PRELIMINARY DATA AND COPYRIGHTED MATERIAL!**

If you want to use any material in this repository, please contact us at [wibarab@oeaw.ac.at](mailto:wibarab@oeaw.ac.at). This will change at the end of the project.

## Directory Structure

| Directory | Content | Remarks |
| --------------------- | -------------------------- | -------------- |
| `001_src` | Original sources | Any external source data coming to the project |
| `082_scripts_xsl` | XSLT scripts | Various XSLT scripts to convert the data |
| `102_derived_TEI` | TEI-XML documents | TEI documents derived from an automated conversion process (from `001_src` or elsewhere) |
| `010_manannot` | Manually annotated TEI-XML documents | TEI documents which are manually annotated / curated / edited. *Automated processes are not expected to write into this directory.* We want to make sure that a human curator has validated the data in this directory and that nothing manually curated is overwritten by some script. |
| `802_tei_odd` | TEI customization (ODD) | This is the source of truth for the WIBARAB FeatureDB Schema and the HTML documentation generated from it. |
| `804_xsd` | XML Schemas | These are derived from the ODD in `802_tei_odd`. Each version of the schema should bear its number in the file name. |
| `850_docs` | Documentation | Further data documentation, encoding guidelines etc. |

## Schema Development

At this point, the model of the *WIBARAB Feature Database* schema is still evolving to a certain extent while new data is being added and existing data is being curated. To make sure that the transition from one version of the schema to the next happens in a structured manner, we set up the following rules:

* Any development of the schema is done in `802_tei_odd/featuredb.odd`. This file might also contain unpublished, unfinished, backwards-incompatible changes not reflected in any derived schema or documentation.
* **Naming conventions:** We follow the [Semantic Versioning Best Practices 2.0.0](https://semver.org/) which, applied to our case, boil down to the following principles:
  * If a change potentially makes documents invalid which were previously valid, it is a new MAJOR version (i.e. increment the first number).
  * If a change does not break the validity of existing documents (e.g. in that it only adds optional elements or attributes or adds a significant portion of prose to the documentation), it is a new MINOR version (i.e. increment the second number).
  * If a change in the schema is merely a bug fix (typo etc.) or a minor addition to the documentation (change in wording, added examples etc.), it is a new PATCH version (i.e. increment the third number).

### Schema release workflow

When a new version of the schema is to be released:

* In the ODD document:
  * Update `@n` on `` to only contain the exact version number (e.g. `2.1.3b`).
  * Change `` to include the version number. These elements are treated only as labels and can thus include human-readable additions (e.g. `Version 2.1.3 Beta`).
  * Add a `` element with your editor ID and the current date, setting `@status="published"`. Ideally add a `` with all the changes you made in the ODD.
  * *Do not change the filename of the ODD document.*
* In oXygen:
  * Generate the XSD schema from the ODD by right-clicking on `802_tei_odd/featuredb.odd` and selecting `Transform > Transform with > TEI ODD to XML Schema`. The resulting files are placed into a new directory `802_tei_odd/out`.
  * Create a new subfolder named `{versionnumber}` in `804_xsd/`, e.g. `804_xsd/2.1.3b/`, and move the files from `802_tei_odd/out` to that folder.
  * Generate the HTML documentation and place it under `850_docs/featuredb_{versionnumber}.html`.
  * Afterwards delete `802_tei_odd/out`.
* Write a conversion script to transform documents from the previous schema version to the current one.
  * Important: make sure that the conversion script updates the `@xsi:schemaLocation` in the migrated document instance.
  * Place the XSLT script under `082_scripts_xsl/migrations` and name it `migrate_to_{versionnumber}.xsl` (e.g. `migrations/migrate_to_1.0.0b.xsl`).
  * Run the conversion script on the `oddtest.xml` document in `802_tei_odd` and check that it produces the expected results.
* Apply the conversion script to the files in `010_manannot`. They should be output to `102_derived_TEI`.
* Commit all changes to git and add a `tag` named after the schema version number.
* Curators have to check the converted TEI documents and move them from `102_derived_TEI` to `010_manannot` to approve the change.

## About this file

This README file has a long-winded and dark history of editing. If you dare, you can check it out [here](https://github.com/wibarab/featuredb/commits/e5d4a768a1702403e8772a0085a3ac2c66c0cf3f/README.md).
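
The versioning rules and file-naming conventions described under Schema Development can be sketched in a few lines of Python. This is only an illustration and not part of the repository: the names `SchemaVersion`, `parse_version`, `bump` and `release_paths`, and the change labels `"breaking"`, `"compatible"` and `"fix"`, are hypothetical. Resetting the lower numbers to zero follows Semantic Versioning 2.0.0; dropping a suffix such as `b` on a bump is an assumption, since the README does not say how suffixes are handled.

```python
import re
from typing import Dict, NamedTuple


class SchemaVersion(NamedTuple):
    """A schema version such as 2.1.3 or 2.1.3b (hypothetical helper type)."""
    major: int
    minor: int
    patch: int
    suffix: str = ""  # e.g. "b" in "2.1.3b"; treated as a free-form marker

    def __str__(self) -> str:
        return f"{self.major}.{self.minor}.{self.patch}{self.suffix}"


_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)([A-Za-z]*)$")


def parse_version(text: str) -> SchemaVersion:
    """Parse strings like '2.1.3' or '1.0.0b'."""
    match = _VERSION_RE.match(text)
    if match is None:
        raise ValueError(f"not a schema version: {text!r}")
    major, minor, patch, suffix = match.groups()
    return SchemaVersion(int(major), int(minor), int(patch), suffix)


def bump(version: SchemaVersion, change: str) -> SchemaVersion:
    """Return the next version for a given kind of change.

    'breaking'   -> previously valid documents may become invalid (MAJOR)
    'compatible' -> existing documents stay valid (MINOR)
    'fix'        -> bug fix or small documentation change (PATCH)
    Assumption: any suffix is dropped when the version is bumped.
    """
    if change == "breaking":
        return SchemaVersion(version.major + 1, 0, 0)
    if change == "compatible":
        return SchemaVersion(version.major, version.minor + 1, 0)
    if change == "fix":
        return SchemaVersion(version.major, version.minor, version.patch + 1)
    raise ValueError(f"unknown kind of change: {change!r}")


def release_paths(version: SchemaVersion) -> Dict[str, str]:
    """Locations named in the release workflow for one schema version."""
    v = str(version)
    return {
        "xsd_dir": f"804_xsd/{v}/",
        "html_docs": f"850_docs/featuredb_{v}.html",
        "migration": f"082_scripts_xsl/migrations/migrate_to_{v}.xsl",
    }
```

For example, `str(bump(parse_version("2.1.3"), "compatible"))` gives `"2.2.0"`, and `release_paths(parse_version("2.1.3b"))["xsd_dir"]` gives `"804_xsd/2.1.3b/"`, matching the example folder name in the workflow.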
