How to Use a Trie to Design and Implement Google-Style Input Suggestions
Source | 搜索技術 (Search Technology)
Editor | 小白
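Both Google and Baidu offer input suggestions that help you enter what you are looking for quickly and accurately.
For example, typing "五一" (May 1) brings up suggestions such as "五一勞動節" (May 1 Labor Day).
So how can we implement input suggestions like Google's?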
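Analyzing the requirements for input suggestions
When the user types a leading term A, we want to suggest all highly relevant terms that have A as a prefix. This is prefix matching. A trie, also called a prefix tree, is an ordered search tree in which keys that share a prefix share the same path from the root, so it is well suited to implementing input suggestions.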
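To make the requirement concrete, prefix matching can be written as a plain linear scan. This is only a baseline sketch for comparison, not the approach this article takes, and the function name `prefix_matches` is hypothetical. It inspects every word in the vocabulary on each lookup, whereas a trie only needs to walk the characters of the prefix before it reaches the matching subtree.

```python
def prefix_matches(words, prefix):
    # Baseline (assumption, not the article's method): a linear scan
    # that keeps every word starting with the given prefix.
    return [w for w in words if w.startswith(prefix)]


words = ["五一", "五一勞動節", "五一假期", "五花肉", "五行"]
print(prefix_matches(words, "五一"))
```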
Below, using Python 3 as an example, we build an input suggestion service with a trie.
    # Python 3 program demonstrating autocomplete with a trie.
    # Note: this is a basic trie implementation, not the most optimized one.


    class TrieNode:
        def __init__(self):
            # Maps each character to the child node for that character.
            self.children = {}
            # True if a key ends at this node.
            self.last = False


    class Trie:
        def __init__(self):
            self.root = TrieNode()
            # Suggestions collected by the most recent lookup.
            self.word_list = []

        def formTrie(self, keys):
            # Build the trie from the given keys. Keys that share a
            # prefix share the nodes for that prefix.
            for key in keys:
                self.insert(key)

        def insert(self, key):
            # Insert a key, creating nodes as needed. If the key is a
            # prefix of an existing key, only the end-of-key flag is set.
            node = self.root
            for a in key:
                if a not in node.children:
                    node.children[a] = TrieNode()
                node = node.children[a]
            node.last = True

        def search(self, key):
            # Return True only if the whole key is stored in the trie.
            node = self.root
            for a in key:
                if a not in node.children:
                    return False
                node = node.children[a]
            return node.last

        def suggestionsRec(self, node, word):
            # Depth-first traversal that collects every complete key
            # stored at or below the given node.
            if node.last:
                self.word_list.append(word)
            for a, n in node.children.items():
                self.suggestionsRec(n, word + a)

        def printAutoSuggestions(self, key):
            # Print every stored key that starts with the given prefix.
            # Returns 0 if no key has this prefix, -1 if the prefix is
            # itself a key with nothing longer below it, and 1 otherwise.
            node = self.root
            for a in key:
                if a not in node.children:
                    return 0
                node = node.children[a]
            if node.last and not node.children:
                return -1
            # Reset so that repeated lookups do not accumulate results.
            self.word_list = []
            self.suggestionsRec(node, key)
            for s in self.word_list:
                print(s)
            return 1


    # Driver code
    # Keys used to build the trie.
    keys = ["五一", "五一勞動節", "五一放假安排", "五一勞動節圖片",
            "五一勞動節圖片 2020", "五一勞動節快樂", "五一晚會", "五一假期",
            "五一快樂", "五一節快樂", "五花肉", "五行", "五行相生"]
    # Prefix to autocomplete.
    key = "五一"

    t = Trie()
    t.formTrie(keys)
    comp = t.printAutoSuggestions(key)

    if comp == -1:
        print("No other strings found with this prefix\n")
    elif comp == 0:
        print("No string found with this prefix\n")

    # This code is contributed by amurdia

For the input "五一", the program prints every stored key that starts with "五一".
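All the suggestions are produced, but their order differs somewhat from Google's. What can we do about that?
The data source for input suggestions is usually the log of queries that users have entered. The number of times each query was entered is counted so that suggestions can be ranked by popularity.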
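The counting step can be sketched with `collections.Counter` from the Python standard library. This is a minimal sketch under assumptions: the log is taken to be an in-memory list with one entry per search, the sample data is made up for illustration, the helper name `count_queries` is hypothetical, and stripping whitespace is an assumed normalization that a real log pipeline may handle differently.

```python
from collections import Counter

# Hypothetical raw query log: one entry per search a user performed.
query_log = [
    "五一勞動節", "五一假期", "五一勞動節", "五一", "五一假期", "五一勞動節",
]


def count_queries(log):
    # Assumed normalization: strip surrounding whitespace, drop empty entries.
    cleaned = (q.strip() for q in log)
    return Counter(q for q in cleaned if q)


counts = count_queries(query_log)

# most_common() yields (query, count) pairs, most frequent first.
for query, n in counts.most_common():
    print(f"{query},{n}")
```

Each printed line has the form `query,count`, which is one simple way to carry the popularity of a query alongside its text.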
Now let's add counts to the logged queries to simulate Google's behavior.
Sample queries from the log, with their counts:
五一勞動節 10
五一勞動節圖片 9
五一假期 8
五一勞動節快樂 7
五一放假安排 6
五一晚會 5
五一 4
五一快樂 3
五一勞動節圖片 2020 2
五一節快樂 1

Adjust the suggestion code so that it supports query counts:
    # Python 3 program demonstrating autocomplete with a trie,
    # with suggestions ranked by query count.
    # Note: this is a basic trie implementation, not the most optimized one.
    import operator


    class TrieNode:
        def __init__(self):
            # Maps each character to the child node for that character.
            self.children = {}
            # True if a key ends at this node.
            self.last = False


    class Trie:
        def __init__(self):
            self.root = TrieNode()
            # Maps each suggested query to its count for the most recent lookup.
            self.word_list = {}

        def formTrie(self, keys):
            # Build the trie from the given keys. Keys that share a
            # prefix share the nodes for that prefix.
            for key in keys:
                self.insert(key)

        def insert(self, key):
            # Insert a key, creating nodes as needed. If the key is a
            # prefix of an existing key, only the end-of-key flag is set.
            node = self.root
            for a in key:
                if a not in node.children:
                    node.children[a] = TrieNode()
                node = node.children[a]
            node.last = True

        def search(self, key):
            # Return True only if the whole key is stored in the trie.
            node = self.root
            for a in key:
                if a not in node.children:
                    return False
                node = node.children[a]
            return node.last

        def suggestionsRec(self, node, word):
            # Depth-first traversal. Each stored key has the form
            # "query,count"; split off the count and record it for the
            # query (0 if the key carries no count).
            if node.last:
                ll = word.rsplit(',', 1)
                if len(ll) >= 2:
                    self.word_list[ll[0]] = int(ll[1])
                else:
                    self.word_list[ll[0]] = 0
            for a, n in node.children.items():
                self.suggestionsRec(n, word + a)

        def printAutoSuggestions(self, key):
            # Print every stored query that starts with the given prefix,
            # most frequent first.
            # Returns 0 if no key has this prefix, -1 if the prefix is
            # itself a key with nothing longer below it, and 1 otherwise.
            node = self.root
            for a in key:
                if a not in node.children:
                    return 0
                node = node.children[a]
            if node.last and not node.children:
                return -1
            # Reset so that repeated lookups do not accumulate results.
            self.word_list = {}
            self.suggestionsRec(node, key)
            # Sort by count in descending order.
            ranked = sorted(self.word_list.items(),
                            key=operator.itemgetter(1), reverse=True)
            for s, _ in ranked:
                print(s)
            return 1


    # Driver code
    # Keys used to build the trie, each in the form "query,count".
    keys = ["五一,4", "五一勞動節,10", "五一放假安排,6", "五一勞動節圖片,9",
            "五一勞動節圖片 2020,2", "五一勞動節快樂,7", "五一晚會,5",
            "五一假期,8", "五一快樂,3", "五一節快樂,1", "五花肉,0", "五行,0",
            "五行相生,0"]
    # Prefix to autocomplete.
    key = "五一"

    t = Trie()
    t.formTrie(keys)
    comp = t.printAutoSuggestions(key)

    if comp == -1:
        print("No other strings found with this prefix\n")
    elif comp == 0:
        print("No string found with this prefix\n")

    # This code is contributed by amurdia

The suggestions are now printed in the same order as Google's.
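The code above carries each count inside the key as `query,count`, so the comma and the digits become trie nodes too. One possible alternative, sketched here under assumptions rather than taken from the original code, is to store the count on the node where a query ends and return only the top `k` suggestions with `heapq.nlargest`. The names `CountedTrieNode`, `CountedTrie`, `suggest`, and the parameter `k` are hypothetical.

```python
import heapq


class CountedTrieNode:
    def __init__(self):
        self.children = {}
        # None means no query ends at this node.
        self.count = None


class CountedTrie:
    def __init__(self):
        self.root = CountedTrieNode()

    def insert(self, query, count):
        # Walk or create one node per character, then store the count
        # on the node where the query ends.
        node = self.root
        for ch in query:
            if ch not in node.children:
                node.children[ch] = CountedTrieNode()
            node = node.children[ch]
        node.count = count

    def _collect(self, node, word, out):
        # Depth-first walk gathering (count, query) for each stored query.
        if node.count is not None:
            out.append((node.count, word))
        for ch, child in node.children.items():
            self._collect(child, word + ch, out)

    def suggest(self, prefix, k=10):
        # Return up to k stored queries starting with prefix,
        # most frequent first; an unknown prefix gives an empty list.
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        found = []
        self._collect(node, prefix, found)
        top = heapq.nlargest(k, found, key=lambda item: item[0])
        return [word for _, word in top]


trie = CountedTrie()
for query, count in [("五一", 4), ("五一勞動節", 10), ("五一假期", 8), ("五行", 0)]:
    trie.insert(query, count)
print(trie.suggest("五一", k=2))
```

Keeping the count out of the key means a query may itself contain a comma, and limiting the result to `k` entries avoids sorting every match when only a few suggestions are shown.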
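Summary:
This article showed how to use a trie to implement Google-style input suggestions. Are there other approaches besides a trie? Is there another kind of index used in search that can implement input suggestions just as well?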