Blame SPECS/tesseract-tessdata.spec

#global commit 7274cfad453d770f36b53ec5a2294ddd6d905703
#global shortcommit %(c=%{commit}; echo ${c:0:7})

#global pre beta.1

Name:          tesseract-tessdata
Version:       4.1.0
Release:       3%{?pre:.%pre}%{?commit:.git%{shortcommit}}%{?dist}
Summary:       Trained models for the Tesseract Open Source OCR Engine
BuildArch:     noarch

License:       ASL 2.0
URL:           https://github.com/tesseract-ocr/tessdata_fast
%if 0%{?commit:1}
Source0:       https://github.com/tesseract-ocr/tessdata_fast/archive/%{commit}/tessdata_fast-%{shortcommit}.tar.gz
%else
Source0:       https://github.com/tesseract-ocr/tessdata_fast/archive/%{version}%{?pre:-%pre}/tessdata_fast-%{version}%{?pre:-%pre}.tar.gz
%endif


%description
This package contains fast integer versions of trained models for the Tesseract
Open Source OCR Engine.

These models only work with the LSTM OCR engine of Tesseract 4.


%package        doc
Summary:        Documentation for %{name}

%description    doc
The %{name}-doc package contains the documentation for %{name}.


%package -n tesseract-osd
Summary:       Orientation & Script Detection Data for tesseract
BuildArch:     noarch
Requires:      tesseract
Requires:      %{name}-doc = %{version}-%{release}

%description -n tesseract-osd
Orientation & Script Detection data for the Tesseract Open Source OCR Engine.


%package -n tesseract-equ
Summary:       Equation traineddata for tesseract
BuildArch:     noarch
Requires:      tesseract
Requires:      %{name}-doc = %{version}-%{release}

%description -n tesseract-equ
Data for processing images of mathematics with the Tesseract Open Source OCR Engine.


# define lang_subpkg macro
# m: 3-letter macrolanguage code
# l: langcode used in Provides and Supplements tags
# n: language name
# -m and -n are required for subpackages, -l is optional
#
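# Illustrative expansion (a hand-written sketch, not generated output).
# The call
#   %%lang_subpkg -m deu -l de -n German
# declares, among other tags, roughly the following:
#   %%package -n tesseract-langpack-deu
#   Summary:       German language data for tesseract-tessdata
#   Provides:      tesseract-tessdata-langpack-de = %%{version}-%%{release}
#   Supplements:   (tesseract and langpacks-de)
#   %%files -n tesseract-langpack-deu
#   %%{_datadir}/tesseract/tessdata/deu.*
# Without -l, the Provides and Supplements tags are omitted.
#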
%define lang_subpkg(l:m:n:) \
%define macrolang %{-m:%{-m*}}%{!-m:%{error:3 letter Language code not defined}} \
%define langcode %{-l:%{-l*}}%{!-l:%{error:Language code not defined}} \
%define langname %{-n:%{-n*}}%{!-n:%{error:Language name not defined}} \
\
%package -n tesseract-langpack-%{macrolang}\
Summary:       %{langname} language data for %{name}\
BuildArch:     noarch\
Requires:      tesseract\
Requires:      %{name}-doc = %{version}-%{release}\
%{-l:Provides:      %{name}-langpack-%{langcode} = %{version}-%{release}\
Supplements:   (tesseract and langpacks-%{langcode})}\
\
%description -n tesseract-langpack-%{macrolang}\
This package contains the fast integer version of the %{langname} language \
trained models for the Tesseract Open Source OCR Engine.\
\
%files -n tesseract-langpack-%{macrolang}\
%{_datadir}/tesseract/tessdata/%{macrolang}.*

# define script_subpkg macro
# s: script name
# n: package name
#
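# Illustrative expansion (a hand-written sketch, not generated output).
# The call
#   %%script_subpkg -n HanS_vert -s %%{quote:Han (Simplified, Vertical)}
# declares a subpackage named tesseract-script-hans_vert (the -n value is
# lowercased for the package name only) whose file list is roughly:
#   %%dir %%{_datadir}/tesseract/tessdata/script/
#   %%{_datadir}/tesseract/tessdata/script/HanS_vert.*
#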
%define script_subpkg(s:n:) \
%define scriptname %{-s:%{-s*}}%{!-s:%{error:Script name not defined}} \
%define filename %{-n:%{-n*}}%{!-n:%{error:Package name not defined}} \
%define pkgname %(echo %filename | tr '[:upper:]' '[:lower:]') \
\
%package -n tesseract-script-%{pkgname}\
Summary:       %{scriptname} script data for %{name}\
BuildArch:     noarch\
Requires:      tesseract\
Requires:      %{name}-doc = %{version}-%{release}\
\
%description -n tesseract-script-%{pkgname}\
This package contains the fast integer version of the %{scriptname} script \
trained models for the Tesseract Open Source OCR Engine.\
\
%files -n tesseract-script-%{pkgname}\
%dir %{_datadir}/tesseract/tessdata/script/\
%{_datadir}/tesseract/tessdata/script/%{filename}.*

# see https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes
# and https://en.wikipedia.org/wiki/List_of_ISO_639-2_codes
%lang_subpkg -m afr -l af -n Afrikaans
%lang_subpkg -m amh -l am -n Amharic
%lang_subpkg -m ara -l ar -n Arabic
%lang_subpkg -m asm -l as -n Assamese
%lang_subpkg -m aze -l az -n Azerbaijani
%lang_subpkg -m aze_cyrl -n %{quote:Azerbaijani (Cyrillic)}
%lang_subpkg -m bel -l bel -n Belarusian
%lang_subpkg -m ben -l bn -n Bengali
%lang_subpkg -m bod -l bo -n %{quote:Tibetan (Standard)}
%lang_subpkg -m bos -l bs -n Bosnian
%lang_subpkg -m bre -l br -n Breton
%lang_subpkg -m bul -l bg -n Bulgarian
%lang_subpkg -m cat -l ca -n Catalan
%lang_subpkg -m ceb -n Cebuano
%lang_subpkg -m ces -l cs -n Czech
%lang_subpkg -m chi_sim -l zh_CN -n %{quote:Chinese (Simplified)}
%lang_subpkg -m chi_sim_vert -l zh_CN -n %{quote:Chinese (Simplified, Vertical)}
%lang_subpkg -m chi_tra -l zh_TW -n %{quote:Chinese (Traditional)}
%lang_subpkg -m chi_tra_vert -l zh_TW -n %{quote:Chinese (Traditional, Vertical)}
%lang_subpkg -m chr -n Cherokee
%lang_subpkg -m cos -l co -n Corsican
%lang_subpkg -m cym -l cy -n Welsh
%lang_subpkg -m dan -l da -n Danish
%lang_subpkg -m deu -l de -n German
%lang_subpkg -m div -l dv -n %{quote:Dhivehi; Maldivian}
%lang_subpkg -m dzo -n Dzongkha
%lang_subpkg -m ell -l el -n Greek
%lang_subpkg -m eng -n English
%lang_subpkg -m enm -n %{quote:Middle English (1100-1500)}
%lang_subpkg -m epo -l eo -n Esperanto
%lang_subpkg -m est -l et -n Estonian
%lang_subpkg -m eus -l eu -n Basque
%lang_subpkg -m fao -l fo -n %{quote:Faroese}
%lang_subpkg -m fas -l fa -n %{quote:Persian (Farsi)}
%lang_subpkg -m fil -n %{quote:Filipino; Pilipino}
%lang_subpkg -m fin -l fi -n Finnish
%lang_subpkg -m fra -l fr -n French
%lang_subpkg -m frk -n Fraktur
%lang_subpkg -m frm -n %{quote:Middle French (ca. 1400-1600)}
%lang_subpkg -m fry -l fy -n %{quote:Western Frisian}
%lang_subpkg -m gla -l gd -n %{quote:Gaelic; Scottish Gaelic}
%lang_subpkg -m gle -l ga -n Irish
%lang_subpkg -m glg -l gl -n Galician
%lang_subpkg -m grc -n %{quote:Ancient Greek}
%lang_subpkg -m guj -l gu -n Gujarati
%lang_subpkg -m hat -l ht -n Haitian
%lang_subpkg -m heb -l he -n Hebrew
%lang_subpkg -m hin -l hi -n Hindi
%lang_subpkg -m hrv -l hr -n Croatian
%lang_subpkg -m hun -l hu -n Hungarian
%lang_subpkg -m hye -l hy -n Armenian
%lang_subpkg -m iku -l iu -n Inuktitut
%lang_subpkg -m ind -l id -n Indonesian
%lang_subpkg -m isl -l is -n Icelandic
%lang_subpkg -m ita -l it -n Italian
%lang_subpkg -m ita_old -n %{quote:Italian (Old)}
%lang_subpkg -m jav -l jav -n Javanese
%lang_subpkg -m jpn -l ja -n Japanese
%lang_subpkg -m jpn_vert -l ja -n %{quote:Japanese (Vertical)}
%lang_subpkg -m kan -l kn -n Kannada
%lang_subpkg -m kat -l ka -n Georgian
%lang_subpkg -m kat_old -n %{quote:Georgian (Old)}
%lang_subpkg -m kaz -l kk -n Kazakh
%lang_subpkg -m khm -l km -n Khmer
%lang_subpkg -m kir -l ky -n Kyrgyz
%lang_subpkg -m kor -l ko -n Korean
%lang_subpkg -m kor_vert -l ko -n %{quote:Korean (Vertical)}
%lang_subpkg -m kmr -l ku -n Kurmanji
%lang_subpkg -m lao -l lo -n Lao
%lang_subpkg -m lat -l lat -n Latin
%lang_subpkg -m lav -l lv -n Latvian
%lang_subpkg -m lit -l lt -n Lithuanian
%lang_subpkg -m ltz -l lb -n Luxembourgish
%lang_subpkg -m mal -l ml -n Malayalam
%lang_subpkg -m mar -l mr -n Marathi
%lang_subpkg -m mkd -l mk -n Macedonian
%lang_subpkg -m mlt -l mt -n Maltese
%lang_subpkg -m mon -l mn -n Mongolian
%lang_subpkg -m mri -l mi -n Maori
%lang_subpkg -m msa -l ms -n Malay
%lang_subpkg -m mya -l my -n Burmese
%lang_subpkg -m nep -l ne -n Nepali
%lang_subpkg -m nld -l nl -n Dutch
%lang_subpkg -m nor -l no -n Norwegian
%lang_subpkg -m oci -l oc -n Occitan
%lang_subpkg -m ori -l or -n Oriya
%lang_subpkg -m pan -l pa -n Panjabi
%lang_subpkg -m pol -l pl -n Polish
%lang_subpkg -m por -l pt -n Portuguese
%lang_subpkg -m pus -l ps -n Pashto
%lang_subpkg -m que -l qu -n Quechuan
%lang_subpkg -m ron -l ro -n Romanian
%lang_subpkg -m rus -l ru -n Russian
%lang_subpkg -m san -l sa -n Sanskrit
%lang_subpkg -m sin -l si -n Sinhala
%lang_subpkg -m slk -l sk -n Slovakian
%lang_subpkg -m slv -l sl -n Slovenian
%lang_subpkg -m snd -l sd -n Sindhi
%lang_subpkg -m spa -l es -n Spanish
%lang_subpkg -m spa_old -n %{quote:Spanish (Old)}
%lang_subpkg -m sqi -l sq -n Albanian
%lang_subpkg -m srp -l sr -n Serbian
%lang_subpkg -m srp_latn -n %{quote:Serbian (Latin)}
%lang_subpkg -m sun -l su -n Sundanese
%lang_subpkg -m swa -l sw -n Swahili
%lang_subpkg -m swe -l sv -n Swedish
%lang_subpkg -m syr -l ar_SY -n Syriac
%lang_subpkg -m tam -l ta -n Tamil
%lang_subpkg -m tat -l tt -n Tatar
%lang_subpkg -m tel -l te -n Telugu
%lang_subpkg -m tgk -l tg -n Tajik
%lang_subpkg -m tha -l th -n Thai
%lang_subpkg -m tir -l ti -n Tigrinya
%lang_subpkg -m ton -l to -n Tongan
%lang_subpkg -m tur -l tr -n Turkish
%lang_subpkg -m uig -l ug -n Uyghur
%lang_subpkg -m ukr -l uk -n Ukrainian
%lang_subpkg -m urd -l ur -n Urdu
%lang_subpkg -m uzb -l uz -n Uzbek
%lang_subpkg -m uzb_cyrl -n %{quote:Uzbek (Cyrillic)}
%lang_subpkg -m vie -l vi -n Vietnamese
%lang_subpkg -m yid -l yi -n Yiddish
%lang_subpkg -m yor -l yo -n Yoruba

%script_subpkg -n Arabic -s Arabic
%script_subpkg -n Armenian -s Armenian
%script_subpkg -n Bengali -s Bengali
%script_subpkg -n Canadian_Aboriginal -s %{quote:Canadian (Aboriginal)}
%script_subpkg -n Cherokee -s Cherokee
%script_subpkg -n Cyrillic -s Cyrillic
%script_subpkg -n Devanagari -s Devanagari
%script_subpkg -n Ethiopic -s Ethiopic
%script_subpkg -n Fraktur -s Fraktur
%script_subpkg -n Georgian -s Georgian
%script_subpkg -n Greek -s Greek
%script_subpkg -n Gujarati -s Gujarati
%script_subpkg -n Gurmukhi -s Gurmukhi
%script_subpkg -n HanS -s %{quote:Han (Simplified)}
%script_subpkg -n HanS_vert -s %{quote:Han (Simplified, Vertical)}
%script_subpkg -n HanT -s %{quote:Han (Traditional)}
%script_subpkg -n HanT_vert -s %{quote:Han (Traditional, Vertical)}
%script_subpkg -n Hangul -s Hangul
%script_subpkg -n Hangul_vert -s %{quote:Hangul (Vertical)}
%script_subpkg -n Hebrew -s Hebrew
%script_subpkg -n Japanese -s Japanese
%script_subpkg -n Japanese_vert -s %{quote:Japanese (Vertical)}
%script_subpkg -n Kannada -s Kannada
%script_subpkg -n Khmer -s Khmer
%script_subpkg -n Lao -s Lao
%script_subpkg -n Latin -s Latin
%script_subpkg -n Malayalam -s Malayalam
%script_subpkg -n Myanmar -s Myanmar
%script_subpkg -n Oriya -s Oriya
%script_subpkg -n Sinhala -s Sinhala
%script_subpkg -n Syriac -s Syriac
%script_subpkg -n Tamil -s Tamil
%script_subpkg -n Telugu -s Telugu
%script_subpkg -n Thaana -s Thaana
%script_subpkg -n Thai -s Thai
%script_subpkg -n Tibetan -s Tibetan
%script_subpkg -n Vietnamese -s Vietnamese


%prep
%if 0%{?commit:1}
%autosetup -p1 -n tessdata_fast-%{commit}
%else
%autosetup -p1 -n tessdata_fast-%{version}%{?pre:-%pre}
%endif


%build
# Nothing to build


%install
mkdir -p %{buildroot}/%{_datadir}/tesseract/tessdata/
cp -a * %{buildroot}/%{_datadir}/tesseract/tessdata/

# Install these through %%license and %%doc
rm -f %{buildroot}/%{_datadir}/tesseract/tessdata/LICENSE
rm -f %{buildroot}/%{_datadir}/tesseract/tessdata/README.md

# https://github.com/tesseract-ocr/tessdata_fast/issues/27
rm -f %{buildroot}/%{_datadir}/tesseract/tessdata/configs
rm -f %{buildroot}/%{_datadir}/tesseract/tessdata/pdf.ttf



%files doc
%license LICENSE
%doc README.md

%files -n tesseract-osd
%{_datadir}/tesseract/tessdata/osd.traineddata

%files -n tesseract-equ
%{_datadir}/tesseract/tessdata/equ.traineddata


%changelog
* Tue Aug 10 2021 Mohan Boddu <mboddu@redhat.com> - 4.1.0-3
- Rebuilt for IMA sigs, glibc 2.34, aarch64 flags
  Related: rhbz#1991688

* Fri Apr 16 2021 Mohan Boddu <mboddu@redhat.com> - 4.1.0-2
- Rebuilt for RHEL 9 BETA on Apr 15th 2021. Related: rhbz#1947937

* Wed Feb 17 2021 Sandro Mani <manisandro@gmail.com> - 4.1.0-1
- Update to 4.1.0

* Wed Jan 27 2021 Fedora Release Engineering <releng@fedoraproject.org> - 4.0.0-10
- Rebuilt for https://fedoraproject.org/wiki/Fedora_34_Mass_Rebuild

* Tue Sep 29 2020 Sandro Mani <manisandro@gmail.com> - 4.0.0-9
- Fix supplements

* Wed Jul 29 2020 Fedora Release Engineering <releng@fedoraproject.org> - 4.0.0-8
- Rebuilt for https://fedoraproject.org/wiki/Fedora_33_Mass_Rebuild

* Fri Jan 31 2020 Fedora Release Engineering <releng@fedoraproject.org> - 4.0.0-7
- Rebuilt for https://fedoraproject.org/wiki/Fedora_32_Mass_Rebuild

* Sat Jul 27 2019 Fedora Release Engineering <releng@fedoraproject.org> - 4.0.0-6
- Rebuilt for https://fedoraproject.org/wiki/Fedora_31_Mass_Rebuild

* Wed Jul 17 2019 Sandro Mani <manisandro@gmail.com> - 4.0.0-5
- Improve subpackage descriptions
- Make script subpackages own the script directory
- Bump release to -5

* Wed Jul 17 2019 Sandro Mani <manisandro@gmail.com> - 4.0.0-2
- Make all langpack / script subpackages require tesseract for tessdata dir ownership
- Fix tesseract-osd requires
- Fix typo cirilic -> cyrillic

* Tue Jul 16 2019 Sandro Mani <manisandro@gmail.com> - 4.0.0-1
- Initial package split from the tesseract package