/cnk/work/skoumal/INFRA/SYNv14(chomsky:)/mnt/ada/data/SYNv13202512, 18. 12. 10:53

| Name | Size | Files | Machine | CPU | Start | End | Duration | final_corr | Note |
|---|---|---|---|---|---|---|---|---|---|
| SYNv8_a-i_ | 6.1 GB | 27,564 | lovelace | 100 | 12-08 11:32:10 | 12-12 06:51:52 | 91.5 h | 12-18 20:47 | |
| SYNv8_j-l | 1.8 GB | 14,821 | grimm | 40 | 12-11 14:11:20 | 12-13 09:17:05 | 43 h | 12-18 18:21 | |
| SYNv8_m-o | 6.0 GB | 17,301 | jakobson | 80 | 12-10 15:19:25 | 12-13 04:30:42 | 63+1.5 h | 12-19 17:13 | after restart |
| SYNv8_p-z | 4.4 GB | 27,986 | lovelace2 | 60 | 12-10 17:25:27 | 12-13 15:44:53 | 70.5 h | 12-19 16:52 | |
| NEWTON2015 | 1.1 GB | 6,331 | grimm | 45 | 12-05 01:14:03 | 12-06 03:45:22 | 26.5 h | 12-19 14:33 | |
| NEWTON2016 | 1009 MB | 6,234 | sag | 10 | 12-05 01:21:55 | 12-07 17:48:50 | 40.5 h | 12-19 13:43 | |
| NEWTON2017 | 872 MB | 6,198 | jakobson | 80 | 12-05 01:21:13 | 12-05 15:46:23 | 14.5 h | 12-19 13:52 | |
| NEWTON2018 | 1.3 GB | 9,976 | lovelace2 | 50 | 12-05 01:26:48 | 12-06 03:04:35 | 25.5 h | 12-19 14:11 | |
| NEWTON2019 | 804 MB | 6,308 | grimm | 45 | 12-08 15:57:26 | 12-09 11:54:16 | 20 h | 12-19 15:20 | |
| NEWTON2020 | 832 MB | 7,136 | lovelace2 | 80 | 12-08 16:09:53 | 12-09 12:44:48 | 20.5 h | 12-19 14:50 | |
| NEWTON2021 | 692 MB | 6,807 | sag | 10 | 12-08 16:03:03 | 12-10 17:05:22 | 49 h | 12-19 17:48 | |
| NEWTON2022 | 731 MB | 6,564 | jakobson | 100 | 12-08 16:04:53 | 12-09 06:30:54 | 14.5 h | 12-19 16:36 | |
| NEWTON2023 | 654 MB | 6,235 | lovelace | 100 | 12-12 15:39:27 | 12-13 07:21:02 | 16 h | 12-19 16:42 | |
| NEWTON2024 | 609 MB | 5,941 | sag | 10 | 12-12 16:48:10 | 12-14 07:51:03 | 39 h | 12-19 17:28 | |
| SYN2020 | 261 MB | 1,621 | sag | 10 | 12-11 14:59:40 | 12-12 10:19:57 | 19.5 h | 12-19 17:34 | |
| SYN2025 | 488 MB | 3,747 | grimm | 45 | 12-02 12:26:11 | 12-03 01:21:12 | 7 h | 12-19 17:39 | fix on grimm |
| SYN2025-p | 6.9 MB | 305 | lovelace | 100 | 12-17 11:14:11 | 12-17 11:54:24 | 0.5 h | 12-19 00:07 | |

| Name | Machine | CPU | End | tar.gz |
|---|---|---|---|---|
| SYNv8_a-i_ | lovelace2 | 100 | Jan 08 23:29 | chomsky |
| SYNv8_j-l | jakobson | 100 | Jan 09 01:52 | jakobson |
| SYNv8_m-o | lovelace2 | 45 | Jan 09 03:48 | lovelace2 |
| SYNv8_p-z | lovelace2 | 100 | Jan 09 00:47 | sag |
| SYN2020 | lovelace2 | 100 | Jan 09 01:37 | jakobson |
| SYN2025 | lovelace2 | 100 | Jan 09 01:41 | jakobson |
| SYN2025-p | lovelace2 | 100 | Jan 09 01:43 | sag |
| NEWTON2015 | lovelace2 | 100 | Jan 09 01:59 | sag |
| NEWTON2016 | lovelace2 | 100 | Jan 09 02:19 | jakobson |
| NEWTON2017 | lovelace2 | 100 | Jan 09 02:40 | sag |
| NEWTON2018 | lovelace2 | 100 | Jan 09 03:03 | jakobson |
| NEWTON2019 | grimm | 45 | Jan 09 04:39 | grimm |
| NEWTON2020 | lovelace2 | 100 | Jan 09 04:18 | sag |
| NEWTON2021 | lovelace2 | 100 | Jan 09 04:44 | lovelace |
| NEWTON2022 | lovelace2 | 100 | Jan 09 04:52 | sag |
| NEWTON2023 | jakobson | 100 | Jan 09 05:12 | jakobson |
| NEWTON2024 | lovelace2 | 100 | Jan 09 04:59 | lovelace |
/home/skoumal/cnk-work/INFRA/OPRAVA
in-utf8, except for SYN2025 and SYN2025-p
cd /home/skoumal/cnk-work/INFRA/OPRAVA/<korpus>
screen process_text.sh -v -tvrbtg8 -p<num>
cd .../<korpus>
diffys -w200 -r vert-vrbtg8/ ../../SYNv14/<korpus>/vert-vrbtg8/ | grep -v "^diff -y" | cut -f2 | cut -f1 -d' ' | sort -u > ../<korpus>-diff.txt
spoutaný nespoutaný AA
dbalý nedbalý AA
volný nevolný AA
otesaný neotesaný AA
pokrytý nepokrytý AA
pozorovaný nepozorovaný AA
uvěřitelně neuvěřitelně Dg
vázaný nevázaný AA
zvyklý nezvyklý AA
zúčastněný nezúčastněný AA
zřízený nezřízený AA
and we grep out the files affected by the fix
find_negation_v14.sh
repair_negation_v14.sh
The results are in the mwe_out-prod-c2-corr/ directory of each corpus.
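As a rough illustration of the "grep out the affected files" step, the sketch below lists every file under a directory that contains one of the negated lemmas. It is not the content of find_negation_v14.sh, which is not shown here. The function name `list_affected` and the pairs-file layout (the triples listed above, one per line) are assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical helper, NOT the real find_negation_v14.sh.
# Usage: list_affected PAIRS_FILE VERT_DIR
# PAIRS_FILE is assumed to hold "<lemma> <negated lemma> <tag>" per line.
list_affected() {
    pairs=$1
    dir=$2
    patterns=$(mktemp) || return 1
    # Column 2 is the negated lemma we are looking for.
    awk 'NF >= 2 { print $2 }' "$pairs" > "$patterns"
    # -r recurse, -l print file names only, -w whole words, -F fixed strings
    grep -rlwF -f "$patterns" "$dir" | sort
    rm -f "$patterns"
}
```

A call such as `list_affected pairs.txt vert-vrbtg8/` (both names are placeholders) prints one affected file per line.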
mv mwe_out-prod-c2 mwe_out-prod-c2.sav
mv mwe_out-prod-c2-corr mwe_out-prod-c2
mv vert-mwe-corr vert-mwe-corr.sav
mkdir vert-mwe-corr
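The directory swap is easy to get wrong on a second run: if `mwe_out-prod-c2.sav` already exists, `mv` moves the directory into it instead of renaming it. A minimal guard could look like the sketch below. The function name `promote_corrected` is made up for this example and only covers the `NAME` / `NAME-corr` / `NAME.sav` pattern.

```shell
#!/bin/sh
# Hypothetical helper: promote NAME-corr to NAME and keep the old NAME as NAME.sav.
# Refuses to run when the backup already exists, so a repeated call cannot
# nest directories or clobber the saved copy.
promote_corrected() {
    name=$1
    if [ -e "$name.sav" ]; then
        echo "refusing: $name.sav already exists" >&2
        return 1
    fi
    if [ ! -d "$name-corr" ]; then
        echo "refusing: $name-corr is missing" >&2
        return 1
    fi
    mv "$name" "$name.sav" && mv "$name-corr" "$name"
}
```

With this in place, the first two `mv` commands would become `promote_corrected mwe_out-prod-c2`.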
and run the check, which also generates the .ann.xml files
cd .../<korpus>
check-mwe-corpus.sh -p100 -v
vert-rules0-frazrl-rules-mdita-sublm-agr:
cd <korpus>
mv vert-rules0-frazrl-rules-mdita-sublm-agr vert-rules0-frazrl-rules-mdita-sublm-agr.sav
mkdir vert-rules0-frazrl-rules-mdita-sublm-agr
cd mwe_out-prod-c2.sav
ls -S *.txt | parallel -j100 "echo {}; cut -f1-6 {} > ../vert-rules0-frazrl-rules-mdita-sublm-agr/{}"
cd ../mwe_out-prod-c2
ls -S *.txt | parallel -j100 "echo {}; cut -f1-6 {} > ../vert-rules0-frazrl-rules-mdita-sublm-agr/{}"
or we run the script
mwe_new_input.sh
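For reference, the manual steps above (take the first six columns of every saved file, then let the corrected files overwrite them) can be written without GNU parallel. This is a sequential sketch only. It is not the content of mwe_new_input.sh, and the function name `rebuild_input` is an assumption.

```shell
#!/bin/sh
# Hypothetical sequential equivalent of the two parallel/cut passes.
# Usage: rebuild_input OUT_DIR SRC_DIR...
# Files from later SRC_DIRs overwrite same-named files from earlier ones.
rebuild_input() {
    out=$1
    shift
    mkdir -p "$out" || return 1
    for src in "$@"; do
        for f in "$src"/*.txt; do
            [ -e "$f" ] || continue   # the glob matched nothing
            # Keep only tab-separated columns 1-6.
            cut -f1-6 "$f" > "$out/$(basename "$f")" || return 1
        done
    done
}
```

Run from the corpus directory, this would be `rebuild_input vert-rules0-frazrl-rules-mdita-sublm-agr mwe_out-prod-c2.sav mwe_out-prod-c2`. It is slower than the parallel version, but the order of overwriting is explicit.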
screen mwe_annotate_v14.sh
cd mwe_out-prod-c2
mkdir ../mwe_out-prod-c2-opr
ls -S | parallel -j45 "perl -pe 's/([\t\|])-(..____)/\1g\2/g' {} > ../mwe_out-prod-c2-opr/{}"
Fortunately, Přemek fixed it.
screen check-mwe-corpus.sh -p100 -v