PHP / SQL - Improving search feature / fuzzy search

full-text search mysql w3schools
mysql full text search vs elasticsearch
mysql full-text search multiple columns
mysql full-text search performance
fuzzy logic search in php
mysql full-text search email address
mysql search query
php full-text search

I am trying to create a product search for my site, where a user can search for products in multiple languages and (hopefully) get fuzzy search results if there is no exact match.

  • I have a pro_search table that has columns id, pro_id, en, de, es, fr, it.
  • The pro_id column refers to the id of the products in their own table.
  • The en, de, es, fr, it columns have the translated meta of each product in various languages.
  • The meta is just keywords seperated by spaces
  • $term is the search term.
  • $lang refers to the users chosen language

So first I do a basic 'LIKE' SQL query to see if there are matches, if there are no results from this I query all the products and create an array sorted by their similarity using the similar_text() function

For example I search 'shirt' this is fine if the meta for this product just includes the word 'shirt', but if the meta includes 'blue branded tshirt' this is being more descriptive and gives the user a chance to search by brand but means that the search will more than likely go fuzzy rather than be found with a LIKE SQL query.

This is kind of working but I was wondering how this could be improved, is there a better way of searching or how do people normally do it? Should I be splitting the meta into each individual keywords and try to see how many words match rather than matching the term to the whole meta?

    $ids = [];

    $params = ['%'.$term.'%'];
    $sql = "SELECT * FROM pro_search WHERE $lang LIKE ?";
    $stmt = DB::run($sql,$params);

    $count = $stmt->rowCount();
    if($count > 0){

        // product search
        while ($row = $stmt->fetch(PDO::FETCH_ASSOC)){
            $id = $row["pro_id"];
            array_push($ids,$id);
        }
        show_products($ids);

    }else{

        // product fuzzy search
        $sql = "SELECT * FROM pro_search";
        $stmt = DB::run($sql);
        while ($row = $stmt->fetch(PDO::FETCH_ASSOC)){
            $id = $row["pro_id"];
            $result = $row[$lang];
            similar_text($term,$result,$similarity);
            $similar_array[$similarity][] = $id;
        }

        $closest_match = array_keys($similar_array);
        rsort($closest_match);
        $match_count = count($closest_match);

        for($i=0; $i<$match_count; $i++){
            foreach($similar_array[$closest_match[$i]] as $id){
                array_push($ids,$id);
            }
        }
        show_products($ids);
    }

I have asked similar questions before and people have pointed me to different ways of comparing the term against the meta (such as levenshtein), but everything I've seen has been comparing two simple words (like apples and oranges) and this just isn't good enough for a real life application with thousands of products and a user could search for literally anything (as in $term='literally anything';)

Key Questions:

  • Should my meta have just the product name or multiple relevant keywords (too many keywords means an individual word is less similar to the whole)?
  • If I have multiple keywords in the meta should I be taking each individual keyword and comparing it against the search term?
  • Also would it be possible to maybe have negative keywords for individual products.

How To Improve Database Searches with Full-Text Search in , Now that you can use FTS functions in SQL queries to find the rows relevant to a search input, you can make those results more relevant. Using Azure Search, I'm struggling to combine fuzzy search with custom scoring profile. If I search without fuzziness, the score is calculated properly. If I add fuzziness to search query it looks like my custom rules are not applied. Example scenario (api-version=2015-02-28) Create new Index (C#

How do I do a fuzzy match of company names in MYSQL with PHP , In this article, we will learn about SQL fuzzy match logic in SQL Server using Master Data MySQL Fuzzy Text Searching Using the SOUNDEX Function, You can How To Improve Database Searches with Full-Text Search in , Full-text� PHP MySQL Ajax Live Search. In this tutorial you'll learn how to create a live MySQL database search feature using PHP and Ajax. Ajax Live Database Search. You can create a simple live database search functionality utilizing the Ajax and PHP, where the search results will be displayed as you start typing some character in search input box.

You can use SOUNDEX in sql

SELECT * FROM users 
           WHERE SOUNDEX(job) 
LIKE CONCAT('%',SUBSTRING(SOUNDEX('Manual worker'),2),'%');

And things like Manual worka will work. You just need to ajust the value (currently 2) to match your need.

I see that you already tried the Levenshtein algorithm but you should have an eye on this adaptation (which is also compatible with UTF-8 strings)

For my case, the soundex was more efficient, it will depend on how your users will interact with your application.


But as said in the comment, third parties like ElasticSearch or Algolia can be much more efficient.

For ma part I never used it because the company do not allow us to use third parties softwares. That's why I tried both Levensthein and Soundex

Create an easy fuzzy/loose search with Laravel 5 - DEV, I give you my tips to enable fuzzy (or loose) search. Tagged with webdev, php, laravel, search. This is a great feature that let you retrieve results that partly match So the trick here is to use the LIKE SQL operator to do this. One improvement would be to allow spaces in the search, so that routes like:. The Azure search api offers a fuzzy paramter for suggestions. like this: https://blssuggestions.search.windows.net/indexes/cities/docs/suggest?api-version=2015-02-28

TomLingham/Laravel-Searchy: Laravel Searchy makes , Laravel Searchy makes user driven searching easy with fuzzy search, basic config/app.php file if you want to have quick access to it in your application: There is an SQL file with a suitable function in the res folder - feel free to use If you would like to improve on the code that is here, feel free to submit a pull request. @BrainzJnr you need to create the database in mysql or if the case is different check your permission settings but if that doesn't work just make a database with another name of your choice and modify the mysqli codes in order to fit your database. by just changing any where you see brainztech to the name of the database you created. if that doesnt work then go on some research on youtube on

Fuzzy search mysql, The online food ordering develop in PHP MySQL using HTML, java Script, CSS, J Query Posted by: Andrew Zdybak Fuzzy Matching String Function. Full-Text Search Functions CONTAINSTABLE (Transact-SQL) Returns a table of zero, need only code in PHP to implement AJAX, Feedback will be good to improve this� Approximate/fuzzy string search in PHP. This PHP class, approximate-search.php, provides non-exact text search (often called fuzzy search or approximate matching).

Mysql fuzzy search, Fuzzy searching inside a MySQL DB If I have a search box on my page which The IBM� Netezza�SQL language supports two fuzzy string search functions: need only code in PHP to implement AJAX, Feedback will be good to improve� How to create search option or feature using PHP Previous Next In this example i am going to explain how to create search option or feature using PHP and MySQLi .In this example we search the students data by roll no.

Comments
  • Have you thought about using a database only for this requirement? Elasticsearch might be able to handle all this with higher performance than writing custom code. Having multiple concurrent visits on that search will otherwise result in really high loading times
  • "How do people normally do it?" - the best choice is to build an Elastic index, it has a lot of features you simply cannot achieve in SQL. We've tried SQL in our project, but struggled with typos, aliases, rankings, etc, Elastic has built-in features for natural language processing.
  • I've never used elastic search before but will look into it although I don't like having too many dependencies
  • Which RDBMS are you using? Some engines offer functionalities that can help you out if you don't want to pull in and pay for 3rd parties. SQL Server has Full-Text Search, so does PostgreSQL, and MySQL.
  • I am using mySQL, thanks I will look Full-Text Search as well. I will probably end up testing a combination of things
  • Hi, thanks for your answer, this sounds great in theory but I'm not sure how well it would work in practice. For example, if I add terms to the meta based on the click through rate, I might end up getting bad spellings etc being added to the meta, this will just mean that the original exact search is more likely to go fuzzy because there's so many random misspellings in the meta. Or how would you do the initial search to begin with? Like I am comparing a term against a, comma, seperated, list, of, keywords is there a better way to compare the term against individual keywords
  • actually misspellings are a good example. misspelled items I think should go in meta. Your objective is not to be a 5th grade English teacher but that you provide results to customers efficiently. So if enough people misspell the same way - you have helped all of them. I would not even consider this as "meta-data" - more like search-data. meta-data can be more formal categorization of the product. Just see google, amazon, etc. how misspelled searches work quite well too.
  • Thanks again for your comment, but how would you do the initial search then if what i'm searching contains multiple misspellings? A SQL LIKE query will return less actual products the more extra terms there are in the DB. As in for the initial search before it goes fuzzy?
  • If your users agree with this "poor man AI approach", then say that misspelled searches will become valid "over time organically". All inventory systems have some categorization of items. Your organization probably has that too - so your default lookup can be simply exact searches on those. So no misspelled items are ok on day 1. As a person like me starts using the system your "alternate spellings" and association with the products will start growing along with "confidence level".
  • How would you go about storing the search terms and what product the user clicks on. All I can think of is like an associative array where each word is added to the product when they click on it from a search and then a score added to each word depending on how many times it gets clicked, but this brings me back to my original issue of when i do a new search I need to query all the products to get all of the arrays of terms that have matched the products and that include that word and then order them by the score which is not going to be really efficient
  • I'm not sure you get what I'm asking, I've looked into levenshtein and soundex previously but that article on soundex is like what I said just comparing two simple words (SOUNDEX('Sure') ,SOUNDEX('Shore')). Say I have products: manual worker, electric worker, manual gearbox, instruction manual Each of these has meta which includes that title and other relevant keywords. I want a user to be able to search for manual or worker in a different language and when it is misspelled.
  • That's what I tried to explain. By combining Levenshtein and Soundex this could work with your use case. But maybe you will have too much false positive. But if you have the possibility, you will have better result by using Algolia or 3rd parties like that. Sorry if my answer was not clear
  • Sorry, yeah it's not a very clear question to begin with but don't know how I can be clearer. I'm not sure about comparing an individual term against an array of keywords or against all keywords as a string in such a way that I get the best results without completely slowing down my site