Hello HN! I became frustrated with the unpredictable/poor match quality and opaque “relevance scores” of existing fuzzy and fulltext search libs, so I tried something different and this is the result. The main selling point is the result quality / ordering, with best-in-class memory overhead and excellent performance being bonuses. The API is pretty stable at this point, but I'm looking for feedback before committing to 1.0.

TL;DR: The test corpus is a 4MB JSON file with 162k words/phrases, so give it a second for the initial download. You can also drag/drop your own text/JSON corpus into the UI to try it against your own dataset.

Live demo/compare with a few other libs (there are many more in the codebase, in various states of completion, WIP): https://leeoniya.github.io/uFuzzy/demos/compare.html?libs=uF…

In isolation for perf assessment: https://leeoniya.github.io/uFuzzy/demos/compare.html?libs=uF…

To increase fuzziness and get broader results, try setting intraMax=1 (core) and enabling outOfOrder (userland): https://leeoniya.github.io/uFuzzy/demos/compare.html?libs=uF…

Also play with the sortPreset selector to swap out the default Array.sort() for one in userland that prioritizes typeahead-ness (the result set remains identical; only the ordering changes).

Still TODO:

– Example of stripping diacritics
– Example of using non-latin charsets
– Example of prefix-caching to improve typeahead perf even further
– Example of poor man’s document search (matching multiple object properties)
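
Until those examples land, here is a rough, library-independent sketch of the first and third items. None of it is uFuzzy's API: stripDiacritics, substringFilter and makePrefixCachedSearch are hypothetical names, and the plain substring matcher only stands in for a real fuzzy filter. The prefix cache assumes that every match for a longer needle is also a match for its prefix. That holds for substring matching, but you'd need to check it for whatever matcher and options you actually use.

```javascript
// Hypothetical helper: decompose accented characters (NFD), then drop the
// combining marks in U+0300..U+036F, so "é" becomes "e".
function stripDiacritics(str) {
  return str.normalize('NFD').replace(/[\u0300-\u036f]/g, '');
}

// Hypothetical stand-in for a real matcher: returns the indices of haystack
// items that contain the needle (case-insensitive). When `idxs` is given,
// only those indices are scanned.
function substringFilter(haystack, needle, idxs) {
  const n = needle.toLowerCase();
  const candidates = idxs ?? haystack.keys();
  const out = [];
  for (const i of candidates) {
    if (haystack[i].toLowerCase().includes(n)) out.push(i);
  }
  return out;
}

// Hypothetical prefix cache for typeahead: if the new needle extends the
// previous one, re-scan only the previous matches instead of the full list.
// ASSUMPTION: matches for a needle are a subset of matches for its prefix.
function makePrefixCachedSearch(haystack, filter) {
  let prevNeedle = null;
  let prevIdxs = null;

  return function search(needle) {
    const canNarrow =
      prevNeedle !== null &&
      prevNeedle.length > 0 &&
      needle.startsWith(prevNeedle);

    const idxs = filter(haystack, needle, canNarrow ? prevIdxs : null);

    prevNeedle = needle;
    prevIdxs = idxs;
    return idxs;
  };
}

// Usage: normalize the corpus once up front, and each needle as it is typed.
const corpus = ['Café au lait', 'cafeteria', 'Crème brûlée', 'carafe'];
const normalized = corpus.map(stripDiacritics);
const search = makePrefixCachedSearch(normalized, substringFilter);

const hits = search(stripDiacritics('café')); // full scan
const narrowed = search(stripDiacritics('cafét')); // scans only `hits`
console.log(hits.map(i => corpus[i]), narrowed.map(i => corpus[i]));
```

The returned indices point back into the original corpus, so results can be displayed with their diacritics intact even though matching ran on the stripped copies.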

That’s all, thanks!
Story Published at: September 30, 2022 at 03:44PM
