diff --git a/.gitignore b/.gitignore
index 6aedb482aaf032249e159194ad6a44b9b5f0dc0c..e15d1dda0e7894302f582be2551e1f8b06703805 100644
--- a/.gitignore
+++ b/.gitignore
@@ -1,5 +1,6 @@
 data/words/
 data/words.txt
+data/corpus.txt
 src/__pycache__/
 model/checkpoint
 model/snapshot-*
diff --git a/README.md b/README.md
index c7f072614ea0e7e464a31165fab7f2368d76f21c..070fc079545995d05ab52edb0a2dfd0ebe7f1d3f 100644
--- a/README.md
+++ b/README.md
@@ -3,7 +3,7 @@
 Handwritten Text Recognition (HTR) system implemented with TensorFlow (TF) and trained on the IAM off-line HTR dataset.
 This Neural Network (NN) model recognizes the text contained in the images of segmented words as shown in the illustration below.
 As these word-images are smaller than images of complete text-lines, the NN can be kept small and training on the CPU is feasible.
-2/3 of the words from the validation-set are correctly recognized and the character error rate is around 13%.
+3/4 of the words from the validation-set are correctly recognized and the character error rate is around 10%.
 I will give some hints how to extend the model in case you need larger input-images (e.g. to recognize text-lines) or want better recognition accuracy.
@@ -20,10 +20,10 @@ The input image and the expected output is shown below.
 ```
 > python main.py
-Validation character error rate of saved model: 13.956289%
-Init with stored values from ../model/snapshot-32
+Validation character error rate of saved model: 10.624916%
+Init with stored values from ../model/snapshot-38
 Recognized: "little"
-Probability: 0.86143184
+Probability: 0.96625507
 ```
 
 Tested with:
@@ -63,7 +63,7 @@ The dictionary is created (in training and validation mode) by using all words c
 Further, the (manually created) list of word-characters can be found in the file `model/wordCharList.txt`.
 Beam width is set to 50 to conform with the beam width of vanilla beam search decoding.
 
-Using this configuration, a character error rate of 10% and a word accuracy of 81% is achieved.
+Using this configuration, a character error rate of 8% and a word accuracy of 84% is achieved.
 
 ## Train model
@@ -143,7 +143,7 @@ The illustration below gives an overview of the NN (green: operations, pink: dat
 ### Improve accuracy
 
-Around 68% of the words from the IAM dataset are correctly recognized by the NN when using vanilla beam search decoding.
+74% of the words from the IAM dataset are correctly recognized by the NN when using vanilla beam search decoding.
 If you need a better accuracy, here are some ideas how to improve it \[2\]:
 
 * Data augmentation: increase dataset-size by applying further (random) transformations to the input images. At the moment, only random distortions are performed.
diff --git a/model/accuracy.txt b/model/accuracy.txt
index 4b45119c28a4848d9a5aae2f5ca47eacc4db772f..8cc6f94bcaf3f3606ca98a685ff64bd9047f1093 100644
--- a/model/accuracy.txt
+++ b/model/accuracy.txt
@@ -1 +1 @@
-Validation character error rate of saved model: 13.956289%
\ No newline at end of file
+Validation character error rate of saved model: 10.624916%
\ No newline at end of file