Fixing Aiml Recognition of Chinese Text That Contains English (Letters, Special Symbols)

Published: 2023/12/31

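aiml recognizes pure English input without problems, but it fails on sentences that mix Chinese with English letters. The main cause is that a space is inserted between every Chinese character and every letter, so the input no longer matches the patterns (titles) in the corpus and no answer is found.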
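To make the mismatch concrete, here is a minimal sketch that does not use the aiml package at all. `space_out` is a hypothetical stand-in for the per-character spacing described above, the pattern string is an invented example, and plain string equality stands in for the real pattern matcher.

```python
def space_out(text: str) -> str:
    """Insert a space between every character.

    Hypothetical stand-in for the spacing step described in the post;
    it is not a function of the aiml package.
    """
    return " ".join(text)


# Invented example: a stored title that mixes Chinese and Latin letters.
pattern = "如何重置WIFI密码"
user_input = "如何重置wifi密码"

spaced = space_out(user_input.upper())

print(spaced)                          # the input after spacing
print(spaced == pattern)               # spaced input vs. stored title
print(user_input.upper() == pattern)   # unspaced input vs. stored title
```

In this sketch the spaced input differs from the stored title only by the inserted spaces, which is the kind of mismatch the post attributes the failure to.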

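Many write-ups online fix this by rewriting the _check_contain_english method of the Kernel class inside the aiml package. That works, but it is inconvenient: every time the program is deployed somewhere new, the installed package has to be patched again, which is not a wise approach. And whenever the environment is upgraded, the problem comes back.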

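Since we cannot change others, we change ourselves: it is enough to modify our own program. Create a class that subclasses aiml.Kernel and override what is needed. I overrode two methods, learn and respond, so that they no longer add the spaces. Adapt this to your own situation: my project only uses aiml to retrieve answers for an intelligent customer-service system, and the conversations never consist entirely of English (semantic recognition, entity extraction and so on are handled by other NLP models). Here is the code: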

#!/usr/bin/env python
# -*- coding: UTF-8 -*-
import glob
import os
import sys
import time
import xml.sax

import aiml
from aiml import Utils
from aiml.Kernel import create_parser


class myAiml(aiml.Kernel):
    def __init__(self):
        super(myAiml, self).__init__()

    def learn(self, filename):
        """Load and learn the contents of the specified AIML file.

        If filename includes wildcard characters, all matching files
        will be loaded and learned.
        """
        for f in glob.glob(filename):
            if self._verboseMode:
                print("Loading %s..." % f, end="")
            # time.clock() was removed in Python 3.8; perf_counter() replaces it.
            start = time.perf_counter()
            # Load and parse the AIML file.
            parser = create_parser()
            handler = parser.getContentHandler()
            handler.setEncoding(self._textEncoding)
            try:
                parser.parse(f)
            except xml.sax.SAXParseException as msg:
                err = "\nFATAL PARSE ERROR in file %s:\n%s\n" % (f, msg)
                sys.stderr.write(err)
                continue
            # Store the pattern/template pairs in the PatternMgr.
            # Use the matched file's extension rather than the glob pattern's.
            em_ext = os.path.splitext(f)[1]
            for key, tem in handler.categories.items():
                new_key = key
                if key and key[0] and key[1] and key[2] and em_ext == '.aiml':
                    if self._check_contain_english(key[0]):
                        new_key = (key[0].upper(), key[1], key[2])
                    else:
                        # Empty separator: no space is inserted between characters.
                        new_key = (''.join(key[0]), key[1], key[2])
                self._brain.add(new_key, tem)
            # Parsing was successful.
            if self._verboseMode:
                print("done (%.2f seconds)" % (time.perf_counter() - start))

    def respond(self, input_, sessionID=aiml.Kernel._globalSessionID):
        """Return the Kernel's response to the input string."""
        if len(input_) == 0:
            return u""

        # Decode the input (assumed to be an encoded string) into a unicode
        # string. Note that if encoding is False, this will be a no-op.
        try:
            input_ = self._cod.dec(input_)
        except UnicodeError:
            pass
        except AttributeError:
            pass

        # Prevent other threads from stomping all over us.
        self._respondLock.acquire()
        try:
            # Add the session, if it doesn't already exist.
            self._addSession(sessionID)

            # Split the input into discrete sentences.
            sentences = Utils.sentences(input_)
            finalResponse = u""
            for s in sentences:
                if not self._check_contain_english(s):
                    # Empty separator: no space is inserted between characters.
                    s = ''.join(s)

                # Add the input to the history list before fetching the
                # response, so that <input/> tags work properly.
                inputHistory = self.getPredicate(self._inputHistory, sessionID)
                inputHistory.append(s)
                while len(inputHistory) > self._maxHistorySize:
                    inputHistory.pop(0)
                self.setPredicate(self._inputHistory, inputHistory, sessionID)

                # Fetch the response.
                response = self._respond(s, sessionID)

                # Add the data from this exchange to the history lists.
                outputHistory = self.getPredicate(self._outputHistory, sessionID)
                outputHistory.append(response)
                while len(outputHistory) > self._maxHistorySize:
                    outputHistory.pop(0)
                self.setPredicate(self._outputHistory, outputHistory, sessionID)

                # Append this response to the final response.
                finalResponse += (response + u" ")
            finalResponse = finalResponse.strip()

            assert len(self.getPredicate(self._inputStack, sessionID)) == 0

            # Return, encoding the string into the I/O encoding.
            return self._cod.enc(finalResponse)
        finally:
            # Release the lock.
            self._respondLock.release()

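The main change in the code above is that the spaces are gone: in learn and respond the characters are joined with an empty string, ''.join(...), so no space is inserted. With that, sentences that mix Chinese and English can be recognized.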
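The key handling in learn can be sketched on its own, without the aiml package. In this simplified sketch, `contains_english` is a hypothetical helper that only looks for ASCII letters (the package's real `_check_contain_english` may behave differently), `normalize_key` is an invented name, and the file-extension and empty-field checks from learn are left out.

```python
import re


def contains_english(text: str) -> bool:
    """Hypothetical helper: True if text has at least one ASCII letter."""
    return re.search(r"[A-Za-z]", text) is not None


def normalize_key(key: tuple) -> tuple:
    """Mirror the branch in learn(): upper-case patterns that contain
    English letters and leave every other pattern as it is."""
    pattern, that, topic = key
    if contains_english(pattern):
        return (pattern.upper(), that, topic)
    # Joining a str with an empty separator yields an equal str,
    # so no spaces are added between the characters.
    return ("".join(pattern), that, topic)


print(normalize_key(("如何重置wifi密码", "*", "*")))
print(normalize_key(("你好", "*", "*")))
```

Upper-casing only the patterns that contain letters keeps them comparable with upper-cased input, while purely Chinese patterns pass through unchanged.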

Usage:

from .myAiml import myAiml

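self.__alice__ = myAiml()  # create the alice bot object
self.__alice__.learn('startup.xml')  # load startup.xml
self.__alice__.respond('這里是目錄')  # load the corpus in the directory ('這里是目錄', "this is the directory", is the author's placeholder input)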

Call it exactly as you would the stock kernel.
