Developing an Android Speech Recognition and Semantic Understanding App in Kotlin (olamisdk)
If you repost this article, please credit the original CSDN post: http://blog.csdn.net/ls0609/article/details/75084994
This article builds a speech recognition app for Android in Kotlin, using olamisdk from the OLAMI open platform.
1. Introduction to Kotlin
Kotlin is a JVM-based programming language created by JetBrains. IntelliJ is JetBrains' flagship product, and Android Studio is built on top of IntelliJ. Kotlin is an object-oriented language that incorporates many ideas from functional programming. It is named after an island (Котлин). It is statically typed and targets the JVM, Android, JavaScript in the browser, and native machine code. It is 100% interoperable with Java and Android. Kotlin was designed to supply the modern language features that Java lacks and to simplify code considerably, so that developers write as little boilerplate as possible.

2. A quick comparison of Kotlin, Java and Swift
1. Printing Hello,World!
    Java:   System.out.println("Hello,World!");
    Kotlin: println("Hello,World!")
    Swift:  print("Hello,World!")
2. Variables and constants
    Java:
        int mVariable = 10;
        mVariable = 20;
        static final int mConstant = 10;
    Kotlin:
        var mVariable = 10
        mVariable = 20
        val mConstant = 10
    Swift:
        var mVariable = 10
        mVariable = 20
        let mConstant = 10
    Swift and Kotlin are both more concise than Java here, and Kotlin looks a lot like Swift.

3. Type conversion
    Swift:
        let label = "Hello world "
        let width = 80
        let widthLabel = label + String(width)
    Kotlin:
        val label = "Hello world "
        val width = 80
        val widthLabel = label + width

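The Kotlin line above works because `String.plus` accepts any value on its right-hand side. A minimal sketch of the equivalent spellings follows; the function names `widthLabels` and `toLong` are made up for illustration and are not part of the article's demo.

```kotlin
// Builds the same label from a String and an Int in three equivalent ways.
fun widthLabels(label: String, width: Int): List<String> {
    val concatenated = label + width         // String.plus accepts any value
    val explicit = label + width.toString()  // explicit conversion, like the Swift line
    val template = "$label$width"            // string template
    return listOf(concatenated, explicit, template)
}

// Numeric types are never widened implicitly; the conversion must be written out.
fun toLong(width: Int): Long = width.toLong()
```

Calling `widthLabels("Hello world ", 80)` yields three identical strings. Note that the reverse order, `width + label`, does not compile, because `Int.plus` has no overload that takes a `String`.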
4. Arrays
    Swift:
        var tempList = ["one", "two", "three"]
        tempList[1] = "zero"
    Kotlin:
        val tempList = arrayOf("one", "two", "three")
        tempList[1] = "zero"

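In the Kotlin version the array is declared with `val` yet its element can still be replaced, because `val` fixes the reference rather than the contents. The sketch below illustrates this, and also shows `mutableListOf` as the closer match to a Swift array that needs to grow. The function names are hypothetical.

```kotlin
// Replaces one element of a fixed-size array in place.
fun replaceInArray(): List<String> {
    val tempList = arrayOf("one", "two", "three")
    tempList[1] = "zero"   // allowed: `val` fixes the reference, not the contents
    return tempList.toList()
}

// A MutableList can also grow, which an Array cannot.
fun replaceAndAppend(): List<String> {
    val tempList = mutableListOf("one", "two", "three")
    tempList[1] = "zero"
    tempList.add("four")
    return tempList
}
```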
5. Functions
    Swift:
        func greet(_ name: String, _ day: String) -> String {
            return "Hello \(name), today is \(day)."
        }
        greet("Bob", "Tuesday")
    Kotlin:
        fun greet(name: String, day: String): String {
            return "Hello $name, today is $day."
        }
        greet("Bob", "Tuesday")

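The Kotlin function above can be shortened further. The following sketch rewrites it as a single-expression function and adds a default parameter; the `greeting` parameter is an addition for illustration only and does not appear in the original comparison.

```kotlin
// Single-expression function; `greeting` has a default value and may be omitted.
fun greet(name: String, day: String, greeting: String = "Hello"): String =
    "$greeting $name, today is $day."
```

`greet("Bob", "Tuesday")` returns "Hello Bob, today is Tuesday.", while named arguments allow any order: `greet(day = "Friday", name = "Alice", greeting = "Hi")` returns "Hi Alice, today is Friday.".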
6. Declaring and using classes
    Swift:
        Declaration:
            class Shape {
                var numberOfSides = 0
                func simpleDescription() -> String {
                    return "A shape with \(numberOfSides) sides."
                }
            }
        Usage:
            var shape = Shape()
            shape.numberOfSides = 7
            var shapeDescription = shape.simpleDescription()
    Kotlin:
        Declaration:
            class Shape {
                var numberOfSides = 0
                fun simpleDescription() = "A shape with $numberOfSides sides."
            }
        Usage:
            var shape = Shape()
            shape.numberOfSides = 7
            var shapeDescription = shape.simpleDescription()

As these examples show, Kotlin and Swift are very similar. Both have the traits of a modern language: they are more streamlined than a high-level language like Java and read closer to natural language.

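Kotlin can compress the class comparison above a little further with a primary constructor. This is a sketch of that variant, not the form used in the demo; `describeHeptagon` is a hypothetical helper.

```kotlin
// The primary constructor declares the property and gives it a default value.
class Shape(var numberOfSides: Int = 0) {
    fun simpleDescription() = "A shape with $numberOfSides sides."
}

// Creates and describes a seven-sided shape without a separate assignment.
fun describeHeptagon(): String = Shape(numberOfSides = 7).simpleDescription()
```

`Shape()` still works and leaves `numberOfSides` at 0, so the usage shown in the comparison stays valid.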
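3. Development environment
This article uses Android Studio 2.0. Launch Android Studio, choose Plugins from the Configure drop-down menu, type Kotlin into the search box, and find the "Kotlin" plugin in the result list.
[Figure: the Plugins entry in the Configure drop-down menu]
[Figure: the plugin search results before the Kotlin plugin is installed]
Click Install on the right, then restart Android Studio once the installation finishes.
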
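4. Creating the Android project
You can create a new Android project and an activity in Android Studio exactly as before. This article uses a finished demo for illustration.
[Figure: the demo project in Android Studio]
Select the two Java files MainActivity and MessageConst, open the Code menu in the menu bar, and choose "Convert Java File to Kotlin File".
[Figure: the "Convert Java File to Kotlin File" menu entry]
The IDE performs the conversion automatically and produces MainActivity.kt and MessageConst.kt. Open MainActivity.kt; a banner at the top of the editor reports "Kotlin not configured". Click the Configure button and the IDE sets everything up for you.
Copy the two .kt files into the src/main/kotlin directory (the directory registered under sourceSets in the module's Gradle file).
[Figure: the two .kt files in the Kotlin source directory]
The converted files may contain some syntax errors, which have to be fixed by hand according to Kotlin's syntax.
With the environment configured, let's see what changed in the Gradle files.
The project-level Gradle file looks like this:
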
buildscript {
    ext.kotlin_version = '1.1.3-2'
    repositories {
        jcenter()
    }
    dependencies {
        classpath 'com.android.tools.build:gradle:2.0.0'
        // the Kotlin plugin dependency is new here
        classpath "org.jetbrains.kotlin:kotlin-gradle-plugin:$kotlin_version"
    }
}

allprojects {
    repositories {
        jcenter()
    }
}

Now the Gradle file of a module:

apply plugin: 'com.android.application'
apply plugin: 'kotlin-android' // this plugin declaration is new

android {
    compileSdkVersion 14
    buildToolsVersion "24.0.0"

    defaultConfig {
        applicationId "com.olami"
        minSdkVersion 8
        targetSdkVersion 14
    }

    buildTypes {
        release {
            minifyEnabled false
            proguardFiles getDefaultProguardFile('proguard-android.txt'), 'proguard-rules.txt'
        }
    }
    sourceSets {
        main.java.srcDirs += 'src/main/kotlin' // the generated *.kt files must be copied into this directory
    }
}

dependencies {
    compile 'com.android.support:support-v4:18.0.0'
    compile files('libs/voicesdk_android.jar')
    compile "org.jetbrains.kotlin:kotlin-stdlib:$kotlin_version" // the Kotlin standard library dependency is new
}
repositories {
    mavenCentral()
}

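As shown above, if you create a Kotlin project without going through the conversion step, you have to add the extra Gradle entries listed above yourself.

5. The olami speech recognition app
First, a screenshot of the app after a recognition:
[Figure: the app showing a recognition result]
In MainActivity.kt:
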
override fun onCreate(savedInstanceState: Bundle?) {
    super.onCreate(savedInstanceState)
    setContentView(R.layout.activity_main)

    initHandler() // initialize the handler that processes messages

    initView() // initialize the views, e.g. the button that starts recording

    initViaVoiceRecognizerListener() // initialize the recognition callback that reports recording state and results

    init() // initialize the speech recognizer
}

fun init() {
    initHandler()
    // create the olami speech recognizer
    mOlamiVoiceRecognizer = OlamiVoiceRecognizer(this@MainActivity)
    val telephonyManager = this.getSystemService(
            Context.TELEPHONY_SERVICE) as TelephonyManager
    val imei = telephonyManager.deviceId

    // set null if you do not want to notify olami server.
    mOlamiVoiceRecognizer!!.init(imei)

    // set the callback used to update the UI with recording state, data, etc.
    mOlamiVoiceRecognizer!!.setListener(mOlamiVoiceRecognizerListener)

    // set the supported language; use Simplified Chinese by default
    mOlamiVoiceRecognizer!!.setLocalization(
            OlamiVoiceRecognizer.LANGUAGE_SIMPLIFIED_CHINESE)

    // app key: generated when you register the app on the olami website
    // api: pass "asr" directly; it identifies the speech recognition type
    // secret: generated when you register the app on the olami website
    mOlamiVoiceRecognizer!!.setAuthorization("51a4bb56ba954655a4fc834bfdc46af1",
            "asr", "68bff251789b426896e70e888f919a6d", "nli")

    // trailing-silence timeout that ends a recording; 2000 ms is recommended
    mOlamiVoiceRecognizer!!.setVADTailTimeout(2000)

    // latitude and longitude; pass 0 if you do not want to upload location data
    mOlamiVoiceRecognizer!!.setLatitudeAndLongitude(
            31.155364678184498, 121.34882432933009)
}

The code is fairly simple: tapping the start button starts recording, the results arrive through the OlamiVoiceRecognizerListener callbacks, and those callbacks send messages to the handler to update the UI.
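The repeated `!!` in the initialization code is most likely a by-product of the automatic Java-to-Kotlin conversion. As a hedged sketch of how such code can be tidied up by hand, the example below uses the safe-call operator `?.` and `apply`. `FakeRecognizer` and `RecognizerHolder` are hypothetical stand-ins that keep the sketch runnable without the SDK; they are not olamisdk classes.

```kotlin
// Hypothetical stand-in for the SDK recognizer; it only records the calls it receives.
class FakeRecognizer {
    val calls = mutableListOf<String>()
    fun setVADTailTimeout(ms: Int) { calls.add("timeout=$ms") }
    fun start() { calls.add("start") }
}

class RecognizerHolder {
    // Nullable field, the shape the converter typically produces for a Java field.
    var recognizer: FakeRecognizer? = null

    // `?.` skips the call when the field is null instead of throwing like `!!`.
    fun startIfReady() {
        recognizer?.start()
    }

    // `apply` configures the new object once, without repeating the field name.
    fun init() {
        recognizer = FakeRecognizer().apply {
            setVADTailTimeout(2000)
        }
    }
}
```

With this shape, a pattern such as `if (x != null) x!!.start()` collapses to `x?.start()`, and calling `startIfReady()` before `init()` is simply a no-op.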
Next, the view initialization code, to see how it differs from the Java way of writing it:

private fun initView() {
    mBtnStart = findViewById(R.id.btn_start) as Button
    mBtnStop = findViewById(R.id.btn_stop) as Button
    mBtnCancel = findViewById(R.id.btn_cancel) as Button
    mBtnSend = findViewById(R.id.btn_send) as Button
    mInputTextView = findViewById(R.id.tv_inputText) as TextView
    mEditText = findViewById(R.id.et_content) as EditText
    mTextView = findViewById(R.id.tv_result) as TextView
    mTextViewVolume = findViewById(R.id.tv_volume) as TextView

    mBtnStart!!.setOnClickListener {
        if (mOlamiVoiceRecognizer != null)
            mOlamiVoiceRecognizer!!.start()
    }

    mBtnStop!!.setOnClickListener {
        if (mOlamiVoiceRecognizer != null)
            mOlamiVoiceRecognizer!!.stop()
        mBtnStart!!.text = "Start"
        Log.i("led", "MusicActivity mBtnStop onclick Start")
    }

    mBtnCancel!!.setOnClickListener {
        if (mOlamiVoiceRecognizer != null)
            mOlamiVoiceRecognizer!!.cancel()
    }

    mBtnSend!!.setOnClickListener {
        if (mOlamiVoiceRecognizer != null)
            mOlamiVoiceRecognizer!!.sendText(mEditText!!.text.toString())
        mInputTextView!!.text = "Input: " + mEditText!!.text
    }
}

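The code already reads more concisely than the Java version.
The two assignments below have the same effect. The second one sets the text directly through the view ids, which is much shorter than before. (Accessing views by id like this presumably relies on the synthetic properties of the Kotlin Android Extensions plugin, which is not among the Gradle changes listed earlier.)
    mInputTextView!!.text = "Input: " + mEditText!!.text
    tv_inputText.text = "Input: " + et_content.text

Now the handler: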
private fun initHandler() {
    mHandler = object : Handler() {
        override fun handleMessage(msg: Message) {
            when (msg.what) {
                MessageConst.CLIENT_ACTION_START_RECORED -> mBtnStart!!.text = "Recording"
                MessageConst.CLIENT_ACTION_STOP_RECORED -> mBtnStart!!.text = "Recognizing"
                MessageConst.CLIENT_ACTION_CANCEL_RECORED -> {
                    mBtnStart!!.text = "Start"
                    mTextView!!.text = "Cancelled"
                }
                MessageConst.CLIENT_ACTION_ON_ERROR -> {
                    mTextView!!.text = "Error code: " + msg.arg1
                    mBtnStart!!.text = "Start"
                }
                MessageConst.CLIENT_ACTION_UPDATA_VOLUME -> mTextViewVolume!!.text = "Volume: " + msg.arg1
                MessageConst.SERVER_ACTION_RETURN_RESULT -> {
                    if (msg.obj != null)
                        mTextView!!.text = "Server returned: " + msg.obj.toString()
                    mBtnStart!!.text = "Start"
                    try {
                        // parse the JSON returned by the server to recover the recognized input
                        val message = msg.obj as String
                        var input: String? = null
                        val jsonObject = JSONObject(message)
                        val jArrayNli =
                                jsonObject.optJSONObject("data").optJSONArray("nli")
                        val jObj = jArrayNli.optJSONObject(0)
                        var jArraySemantic: JSONArray? = null
                        if (message.contains("semantic")) {
                            jArraySemantic = jObj.getJSONArray("semantic")
                            input =
                                    jArraySemantic!!.optJSONObject(0).optString("input")
                        } else {
                            input = jsonObject.optJSONObject("data")
                                    .optJSONObject("asr").optString("result")
                        }
                        if (input != null)
                            mInputTextView!!.text = "Input: " + input
                    } catch (e: Exception) {
                        e.printStackTrace()
                    }
                }
            }
        }
    }
}

The original switch/case has become a when block. The code is shorter, closer to a modern language, and easier to follow.
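Beyond replacing switch, `when` can also be used as an expression that returns a value. The following is a minimal sketch of that form; the `DemoMessages` codes are hypothetical, since the article does not show the actual values in MessageConst.

```kotlin
// Hypothetical message codes; the real MessageConst values are not shown in the article.
object DemoMessages {
    val START_RECORD = 1
    val STOP_RECORD = 2
    val CANCEL_RECORD = 3
    val ON_ERROR = 4
}

// `when` as an expression: each branch yields the new button label,
// and null means "leave the label unchanged".
fun buttonLabel(what: Int): String? = when (what) {
    DemoMessages.START_RECORD -> "Recording"
    DemoMessages.STOP_RECORD -> "Recognizing"
    DemoMessages.CANCEL_RECORD, DemoMessages.ON_ERROR -> "Start"  // several values share a branch
    else -> null
}
```

Unlike a Java switch, there is no fall-through and no `break`; values that share a result are simply listed in the same branch.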
In the MessageConst.SERVER_ACTION_RETURN_RESULT branch above, the handler receives the result returned by the server and then does a simple parse of the semantics. A sample response:

{
    "data": {
        "asr": {
            "result": "我要听三国演义",
            "speech_status": 0,
            "final": true,
            "status": 0
        },
        "nli": [
            {
                "desc_obj": {
                    "result": "正在努力搜索中,请稍等",
                    "status": 0
                },
                "semantic": [
                    {
                        "app": "musiccontrol",
                        "input": "我要听三国演义",
                        "slots": [
                            {
                                "name": "songname",
                                "value": "三国演义"
                            }
                        ],
                        "modifier": [
                            "play"
                        ],
                        "customer": "58df512384ae11f0bb7b487e"
                    }
                ],
                "type": "musiccontrol"
            }
        ]
    },
    "status": "ok"
}

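1) The type parsed from nli is musiccontrol. This is the app type returned by the grammar. The online audiobook demo only cares about the musiccontrol app type and ignores all others.
2) The text transcribed from the user's speech is read from result in asr.
3) In semantic inside nli, the input value is what the user said, the same as result in asr.
modifier represents the requested action. Here it is play, a request to start playback. The data in slots says that the song name is 三国演义 (Romance of the Three Kingdoms).
So the action is play and the content is the song named 三国演义. In this demo, the call
mBookUtil.searchBookAndPlay(songName,0,0); first runs a search and, once a result is found, sends a play message to request playback. That completes the flow for "我要听三国演义" ("I want to listen to Romance of the Three Kingdoms").
This part is the semantic parsing used in the online audiobook app. For details, see the blog post: http://blog.csdn.net/ls0609/article/details/71519203
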
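The decision described in the three points above can be sketched in plain Kotlin once the JSON has been parsed. The `Slot` and `Semantic` classes and the `songToPlay` function are hypothetical names introduced for illustration; the demo itself works on `JSONObject` values and calls `mBookUtil.searchBookAndPlay`.

```kotlin
// Hypothetical model of one entry in the "semantic" array of the sample response.
data class Slot(val name: String, val value: String)

data class Semantic(
    val app: String,
    val input: String,
    val modifier: List<String>,
    val slots: List<Slot>
)

// Returns the song to play, or null when the result is not a "play" request
// for the musiccontrol app (every other app type is ignored, as in the demo).
fun songToPlay(semantic: Semantic): String? {
    if (semantic.app != "musiccontrol") return null
    if ("play" !in semantic.modifier) return null
    return semantic.slots.firstOrNull { it.name == "songname" }?.value
}
```

For the sample response, `songToPlay` returns "三国演义", which is the value the demo would pass on as `songName`.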
6. Code download
Implementing Android speech recognition and semantic understanding in Kotlin

7. Related posts
Online audiobook by voice: http://blog.csdn.net/ls0609/article/details/71519203
Voice bookkeeping demo: http://blog.csdn.net/ls0609/article/details/72765789
Web speech recognition and semantic understanding with olamisdk in JavaScript (speex compression): http://blog.csdn.net/ls0609/article/details/73920229
Introduction to writing grammars on the olami open platform: http://blog.csdn.net/ls0609/article/details/71624340
Official introduction to olami open platform grammars: https://cn.olami.ai/wiki/?mp=nli&content=nli2.html

Reposted from: https://blog.51cto.com/ls0609/1954834