How do I do a fuzzy search? If I have a table of full names, I'd like the user to be able to do a search and find the record, "Charles Montgomery Burns" with "Monty Burns" or "Montgomry" (mispelling).
Every major web site does this kind of thing (Amazon, Google, etc).
Someone suggested SOUNDEX, but this really doesn't fit the bill. Misspellings often don't use the same sound signature as the originals. Plus, that doesn't handle multi-word searchable texts very well.
Others have suggested tries or suffix trees. If I went this route, wouldn't I have to preload all data out of the database and into this custom structure upon app startup? Is there any way around that? Also, this solution seems like it would require a lot of dev time (building a custom suffix tree with fuzzy lookup capabilities).
Is there a commonly known and acceptable solution to this?
(sorry, also posted to MySQL group; I'm using both databases so a solution in either would be satisfactory)I did something recently where I used DIFFERENCE (I know you already said soundex is bad) on first names and lastnames with a user defined function that weighted results with a point system. An exact match on strings got a lot of points (say 200) a Difference of 4 recieved 100, a Difference of 3 recieved 75 etc... and then I ordered the results by the score, last name and first name. Works pretty well. I would post it but it is part someone else's code who could not get it to work until I fixed it for him.
Subscribe to:
Post Comments (Atom)
No comments:
Post a Comment