📜 ⬆️ ⬇️

Sphinx for ASP.NET via jTemplates


We have a hobby - to develop an online store selling beverages and food products in bulk.
Our products appear by attracting suppliers and placing their products in the store.
Customers - owners of restaurants and cafes who buy goods in bulk with the next day delivery.
When the number of positions on the goods exceeded 20 thousand, the search through like in MS SQL became too wrong, especially when the suppliers loaded the goods with errors or the names of the goods were Latin / Cyrillic. After a month of various tweaks in the search procedure with latin-cyrilic-latin conversion, correcting grammatical errors, we eventually realized that this was a dead-end path of search development.

Finding a solution


Frowning eyebrows, we decided to peep how these problems are solved in other projects, say on the same wikimart. To our envy, the search worked well for them, even correcting our mistakes in the words of the goods. For example, at the request of Coco Cola, we could find both Coca Cola and Cocain. What kind of DBMS they have such a magical one, we exclaimed.
After a brief search on the Internet technical solutions, we realized that we need the FullText Search Engine. Having flown in the clouds, that we can, probably very soon, realize a “search for people”, and even as a free pie, we may have facets filters, we began to look for what to do with it.
And as it turned out, FullText Search is in MS SQL 2008 Advanced Services already built into our DBMS! Having picked it up in MS SQL for a week and not finding a free pie in the form of facets, we came across an article about magic Lucene.NET , Solr , Sphinx .

Engine selection


After small tests of the engines above, we selected Sphinx by the following criteria:
  1. Works under Microsoft Windows
  2. There is developer support and a big FAQ
  3. Works with MS SQL
  4. There is a ready-made adapter for .NET to communicate with the engine
  5. There are facets

To business


Sphinx configuration

Our config , in which, in fact, nothing special.
The use of morphologies stem_enru, soundex, metaphone (what it is, is well described here , thanks to Puma for that).
Connecting to MS SQL and using View for products that Sphinx periodically pulls. But we went a little further and expanded the Sphinx field to search not only for products, but also for brands, suppliers and categories.

Working with Sphinx in ASP.NET

To work with Sphinx, we use open source Sphinx.Client .
Our SphinxHelper helper class for working with Sphinx via Sphinx.Client.
SphinxHelper
using System; using System.Collections.Generic; using System.Linq; using System.Text; using Sphinx.Client.Connections; using Sphinx.Client.Commands.Search; using System.Collections; using Sphinx.Client.Commands.Collections; using Sphinx.Client.Commands.Attributes.Filters; namespace Project.Helpers { public class SphinxHelper { private static ConnectionBase CreateConnection() { PersistentTcpConnection p_connection = new PersistentTcpConnection("127.0.0.1", 9312); p_connection.ConnectionTimeout = 10000; return p_connection; } public static IList<SearchQueryResult> Query(string queryText, string indexes, int limitPerIndex) { return Query("", queryText, indexes, null, limitPerIndex, 0, MatchMode.Extended2, MatchRankMode.WordCount, ResultsSortMode.Extended, "@weight DESC", ""); } public static IList<SearchQueryResult> Query(string select, string match, string indexes, AttributeFilterList filters, int pageSize, int offset, MatchMode matchMode, MatchRankMode rankingMode, ResultsSortMode sortMode, string sortBy, string groupBy) { IEnumerable<string> p_idxArray = indexes.Split(','); SearchQuery p_query = null; pageSize = pageSize <= 0 ? 99999999 : pageSize; IList<SearchQueryResult> p_ret = new System.Collections.Generic.List<SearchQueryResult>(); using (ConnectionBase connection = CreateConnection()) { SearchCommand p_search = new SearchCommand(connection); foreach (string p_idx in p_idxArray) { p_query = new SearchQuery(match, p_idx); p_query.Select = select; p_query.MatchMode = matchMode; p_query.RankingMode = rankingMode; p_query.SortMode = sortMode; p_query.SortBy = sortBy; if (!String.IsNullOrEmpty(groupBy)) { p_query.GroupBy = groupBy; p_query.GroupSort = sortBy; p_query.GroupFunc = ResultsGroupFunction.Attribute; if (!String.IsNullOrEmpty(sortBy)) { p_query.SortBy = string.Empty; } } p_query.Limit = pageSize; p_query.Offset = offset; //   ,   if (filters != null && filters.Count > 0) foreach (AttributeFilterBase p_filter in filters) p_query.AttributeFilters.Add(p_filter); p_query.Select = select; //  ,     //search.QueryList.Add(p_query); p_search.QueryList.Clear(); p_search.QueryList.Add(p_query); p_search.Execute(); foreach (SearchQueryResult p_result in p_search.Result.QueryResults) p_ret.Add(p_result); } return p_ret; } } } } 


From SearchQueryResult to the user's browser

When generating a search page with brands and products, we use jTemplates , which receives data from the web service, which in turn pulls SphinxHelper.
Generate search page
  //   //brandId -    //sellerId -    //categoryId -    //specIds -  facets //searchText -    this.GetGroups = function (brandId, sellerId, categoryId, specIds, searchText) { var waiter = $('#waiter_' + brandId); waiter.css({ visibility: 'visible' }); var brandGroups = $('#brandGroups_' + brandId); brandGroups.attr('loaded', true); $.ajax({ type: "POST", context: { brandId: brandId, sellerId: sellerId, categoryId: categoryId, specIds: specIds, searchText: searchText }, url: currentHost() + "WebServices/Products.asmx/GetBrandGroups", data: "{brandID:'" + brandId + "',sellerID:'" + sellerId + "', categoryID:'" + categoryId + "', specIds:'" + specIds + "',searchText:'" + searchText + "'}", contentType: "application/json; charset=utf-8", dataType: "json", success: brandsInRow2.GroupCallSuccess, error: brandsInRow2.GroupCallError }); } this.GroupCallError = function (request, status, error) { alert(request.responseText); } this.GroupCallSuccess = function (data, status) { var data_decoded = $.parseJSON(data.d); var brandGroups = $('#brandGroups_' + this.brandId); brandGroups.setTemplate($("#templateProducts").html()); brandGroups.setParam('GetProductPriceActuality', brandsInRow2.GetProductPriceActuality); brandGroups.setParam('GetProductPriceActuality1', brandsInRow2.GetProductPriceActuality1); brandGroups.setParam('GetSpecDescription', brandsInRow2.GetSpecDescription); brandGroups.setParam('GetBrandPriceName', brandsInRow2.GetBrandPriceName); brandGroups.setParam('GetOrderProductFrameLink', brandsInRow2.GetOrderProductFrameLink); brandGroups.setParam('GetSellerInfoFrameLink', brandsInRow2.GetSellerInfoFrameLink); brandGroups.setParam('GetMessageSendFrameLink', brandsInRow2.GetMessageSendFrameLink); brandGroups.processTemplate(data_decoded); brandGroups.css({ display: 'block' }); //   brandsInRow2.InitTips(); var waiter = $('#waiter_' + this.brandId); waiter.css({ visibility: 'hidden' }); } this.GetLinkOfferName = function (prodCnt, sellersCnt) { var p_offers = prodCnt + ' ' + formatToRussian1(prodCnt, ""); if (sellersCnt > 1) p_offers = p_offers + ' ' + formatToRussian(sellersCnt, ""); return p_offers; } this.GetBrandCountName = function (brandsCnt) { return brandsCnt.toString() + ' ' + formatToRussian(brandsCnt, ""); } this.GetBrandPriceName = function (minPrice, maxPrice) { if (minPrice == maxPrice) return formatPrice(minPrice); return ' ' + formatPrice(minPrice) + '  ' + formatPrice(maxPrice) } this.GetProductPriceActuality = function (product) { return product.DaysUpdated > 30 ? "      ,       " + product.CompanyName + "." : ""; } this.GetProductPriceActuality1 = function (product) { return product.DaysUpdated > 30 ? "" : "hidden"; } this.GetSpecDescription = function (specs) { var escaped = specs; var findReplace = [[/&/g, "&"], [/</g, "<"], [/>/g, ">"], [/"/g, '"'], [/'/g, "'"]] for (var item in findReplace) escaped = escaped.replace(findReplace[item][0], findReplace[item][1]); return escaped; } this.GetOrderProductFrameLink = function (productId) { return currentHost() + "OrderProductFrame.aspx?ProductID=" + productId; } this.GetSellerInfoFrameLink = function (sellerId) { return currentHost() + "SellerShortInfoFrame.aspx?SellerID=" + sellerId; } this.GetMessageSendFrameLink = function (productId) { return currentHost() + "MessageSendFrame.aspx?ProductID=" + productId; } 


It was also very easy to implement AutoComplete for the search string, which also receives data from the web service.
Autocomplete
 function setAutoComplete(s) { var elem = $("#searchTextBox"); elem.autocomplete({ minLength: 2, source: function (request, response) { $.ajax({ type: "POST", url: currentHost() + "WebServices/Common.asmx/GetSearchComplete", data: "{searchTerm:'" + elem.val() + "'}", contentType: "application/json; charset=utf-8", success: function (msg) { if (msg.d != "") response($.parseJSON(msg.d)); else response('') } }); }, select: function (event, ui) { elem.addClass('ui-autocomplete-loading'); window.location.href = ui.item.linkUrl; elem.selected = ui.item; return false; }, dataType: "json" }) .data("autocomplete")._renderItem = function (ul, item) { return $("<li></li>") .data("item.autocomplete", item) .append("<a href='" + item.linkUrl + "'><ul class='search-complete'><li class='pict'><img src='" + item.pictUrl + "'/></li><li class='name'>" + item.label + "</li></ul></a>") .appendTo(ul); }; elem.keydown(function (e) { if (e.keyCode == 13) { if (typeof (elem.selected) == 'undefined') { elem.addClass('ui-autocomplete-loading'); SearchClick(); //     e.preventDefault(); } } }); } 


From the transition to Sphinx, we got the following patties:
  1. Human search
  2. Generation of search pages - on the client
  3. Unloading database

I can not give accurate measurements of the time of page generation before and with the help of Sphinx, because they forgot to write down what happened before. But I will say that now we have more than 20 thousand products in our database and everything works just very quickly (generation less than a second).
')
To maintain the index of products up to date (and they are current in the database), we will use a delta index , updated every 10 minutes. In the meantime, our main index is completely rebuilt in 5 seconds every half hour. Unloading MS SQL helped avoid table locks during long queries that occur when a supplier imports a price list.

PS Links to the project, like, cleaned up, as it will not survive habraeffekt.

Source: https://habr.com/ru/post/208236/


All Articles