Let me describe my problem. There is an input string, and a table containing many thousands of strings. I am looking for best way to search for the most similar* strings to the input string. The search should return a list of ~10 suggested strings, sorted by degree of similarity. Strings also have numerical weights (popularity) associated with them in database, in another column, so the ones with higher weights should have higher chance of appearing in results, if possible.
What is the best library to achieve this? I am looking for something similar to Elasticsearch, I guess. I don't have much experience with these kinds of libraries, so I would need something easy to include in my project and preferably open-source. I am using Python (Flask and SQLAlchemy) and Postgresql, but could also use e.g. Node.js, if needed.
*I also want to clarify what kind of similarity I am looking for. Ideally, it would be semantic similarity, but lexical similarity is fine as well. I would be happy with anything that works okay, is easy to implement, and is as scalable and performant as possible.
Example input sentence:
- I don't like cangaroos.
Example suggestions from the database:
- Cangaroos are not my favorite.
- Cangaroos are evil.
- I once had a cangaroo. Never again.
These suggestions should appear first because 'cangaroo' is not a frequent word in my database, so any string with the word 'cangaroo' should have a high chance appearing in results. It is probably much harder to detect 'don't like', so that part is completely optional for me.
P.s. Could PostgreSQL's full text search do something like this?