I've spent roughly four hours training my glyph recognition algorithm, and I think it identifies glyphs correctly about 80-90% of the time. Right now, I can process a single line of text in under a second, but it takes me about 15-30 seconds to manually verify the transcription and fix any errors that crop up. That seems pretty fast, but when you multiply it out by 4285 lines, it comes to about 25 hours of manual work (anywhere from 18 to 36 hours, depending on how long each line really takes). I need to pare that down, because it'll take me forever to scrape together 25 hours of free time.
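For anyone who wants to check the arithmetic, here is a quick back-of-the-envelope sketch. The function name `manual_hours` is made up for illustration; the only inputs are the figures quoted above (4285 lines, 15-30 seconds of manual checking per line).

```python
def manual_hours(lines, seconds_per_line):
    """Hours of manual verification for a given line count and per-line time."""
    return lines * seconds_per_line / 3600

LINES = 4285  # total lines to transcribe (figure from the post)

low = manual_hours(LINES, 15)   # best case: 15 seconds per line
high = manual_hours(LINES, 30)  # worst case: 30 seconds per line

print(f"{low:.1f} to {high:.1f} hours")  # prints "17.9 to 35.7 hours"
```

The "about 25 hours" figure sits inside that range. The same function also shows why shaving even a few seconds off the per-line check matters: every second saved per line is worth a bit more than an hour overall.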
A lot of this project has involved dividing labor between me and the machine: making the most of what the machine can do without my intervention, and making the best use of my feedback on good and bad matches. The code has been very fluid but very stable. It's basically organized around a powerful set of core functionality, with the simplest and most ergonomic user interface I can manage for each task.