tiny-lang-detect

Generate tiny models for language detection  https://p.ce9e.org/tiny-lang-detect/demo/
git clone https://git.ce9e.org/tiny-lang-detect.git

commit
5698b03845de35c9d8a99ede35bc7a090d5f40d2
parent
60da7d4b74c54273652a4bbdb84408bb9f055886
Author
Tobias Bengfort <tobias.bengfort@posteo.de>
Date
2025-05-06 06:29
include example model in README

Diffstat

M README.md 35 +++++++++++++++++++++++++++++++++++

1 files changed, 35 insertions, 0 deletions


diff --git a/README.md b/README.md

@@ -12,10 +12,45 @@ Example usage:
   12    12 
   13    13 ```sh
   14    14 $ ./download_data.sh
   -1    15 $ python gen_model.py en de -n 10
   15    16 $ python gen_model.py en de -n 10 > en_de.json
   16    17 $ python test.py en_de.json
   17    18 overall correctness 96.3% (1000)
   18    19 ```
   19    20 
   -1    21 A model might look like this:
   -1    22 
   -1    23 ```json
   -1    24 {
   -1    25   "ngrams": ["ei", "en", " t", "ch", " th", "er", "en ", "a", "e", "o"],
   -1    26   "freq": {
   -1    27     "en": [
   -1    28       0.0008847549205632559,
   -1    29       0.007865767270512856,
   -1    30       0.01639325502081986,
   -1    31       0.0035863210810589343,
   -1    32       0.016136794462706813,
   -1    33       0.01354675763365741,
   -1    34       0.002292672343996773,
   -1    35       0.0897445255534594,
   -1    36       0.10672365966622427,
   -1    37       0.07156346253706898
   -1    38     ],
   -1    39     "de": [
   -1    40       0.015897498950157848,
   -1    41       0.023261162650169673,
   -1    42       0.0005690935513966353,
   -1    43       0.019468205994060662,
   -1    44       0.00021883618283788822,
   -1    45       0.02992300137058795,
   -1    46       0.02022536188476834,
   -1    47       0.057449835679986086,
   -1    48       0.14656171354570646,
   -1    49       0.031128414709526073
   -1    50     ]
   -1    51   }
   -1    52 }
   -1    53 ```
   -1    54 
   20    55 For examples how to use a model to classify languages, see `test.py` and
   21    56 `demo/demo.js`.
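
As a rough illustration of how a model of this shape could be used, the sketch below counts how often each entry of `ngrams` occurs in a text and picks the language whose `freq` values give those counts the highest log-likelihood. The scoring rule, the lowercasing step, and the helper names `count_ngrams` and `classify` are assumptions made for this example; they are not taken from `test.py` or `demo/demo.js`, which may work differently. The embedded model is the example from the README with values rounded to five decimal places.

```python
import json
import math

# Example model from the README, values rounded for brevity.
MODEL = json.loads("""
{
  "ngrams": ["ei", "en", " t", "ch", " th", "er", "en ", "a", "e", "o"],
  "freq": {
    "en": [0.00088, 0.00787, 0.01639, 0.00359, 0.01614,
           0.01355, 0.00229, 0.08974, 0.10672, 0.07156],
    "de": [0.01590, 0.02326, 0.00057, 0.01947, 0.00022,
           0.02992, 0.02023, 0.05745, 0.14656, 0.03113]
  }
}
""")


def count_ngrams(text, ngrams):
    """Count overlapping occurrences of each model n-gram in the text."""
    text = text.lower()  # assumption: the model was built from lowercased text
    return [
        sum(1 for i in range(len(text) - len(ng) + 1) if text.startswith(ng, i))
        for ng in ngrams
    ]


def classify(text, model):
    """Return the language with the highest log-likelihood score.

    Hypothetical scoring rule: sum of count * log(frequency) per n-gram.
    """
    counts = count_ngrams(text, model["ngrams"])
    scores = {
        lang: sum(c * math.log(max(f, 1e-9)) for c, f in zip(counts, freqs))
        for lang, freqs in model["freq"].items()
    }
    return max(scores, key=scores.get)


print(classify("the weather is nice there", MODEL))  # en
print(classify("ich gehe einen weg", MODEL))         # de
```

With only ten n-grams, very short inputs may contain none of them; in that case all scores are zero and this sketch simply returns the first language in the model.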