
04. Elasticsearch Get operations

Published: 2024/2/28 | Programming Q&A | 豆豆

Contents

- 1. Introduction to the Get API
- 2. Realtime
- 3. Source filtering
- 4. Stored Fields
- 5. Retrieving only the _source field
- 6. Routing
- 7. Preference
- 8. Refresh
- 9. Distributed
- 10. Versioning support

1. Introduction to the Get API

The get API retrieves a JSON document from an index by its id. The example below gets the JSON document with id 1 from the index named twitter, through the _doc endpoint.

curl -XGET 'http://localhost:9200/twitter/_doc/1'

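The result of the above get operation is:

{
  "_index" : "twitter",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 1,
  "found" : true,
  "_source" : {
    "user" : "kimchy",
    "postDate" : "2009-11-15T14:12:12",
    "message" : "trying out Elasticsearch"
  }
}

The result includes the _index, _type, _id and _version of the document we wanted to retrieve, along with the actual _source of the document if it could be found (as indicated by the found field in the response).

The API also lets you check whether a document exists by using HEAD. A 200 status means the document exists and a 404 means it does not. For example: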

curl -I 'http://localhost:9200/twitter/_doc/1'
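
The same two calls can also be issued from a script. What follows is a minimal Python sketch using only the standard library. It assumes an unsecured node at http://localhost:9200, as in the curl examples, and the helper names (doc_url, get_document, document_exists) are made up for illustration.

```python
import json
from urllib.error import HTTPError
from urllib.request import Request, urlopen

BASE_URL = "http://localhost:9200"  # assumption: local, unsecured node


def doc_url(index, doc_id, base_url=BASE_URL):
    """Build the URL of a single document under the _doc endpoint."""
    return f"{base_url}/{index}/_doc/{doc_id}"


def get_document(index, doc_id):
    """GET the document; return the parsed JSON body, or None on a 404."""
    try:
        with urlopen(Request(doc_url(index, doc_id), method="GET")) as resp:
            return json.load(resp)
    except HTTPError as err:
        if err.code == 404:
            return None
        raise


def document_exists(index, doc_id):
    """HEAD the document; 200 means it exists, 404 means it does not."""
    try:
        with urlopen(Request(doc_url(index, doc_id), method="HEAD")) as resp:
            return resp.status == 200
    except HTTPError as err:
        if err.code == 404:
            return False
        raise
```

With a node running, get_document("twitter", 1) would return the parsed response body and document_exists("twitter", 1) a boolean. Nothing is sent until one of the two functions is called.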

2. Realtime

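By default, the get API is realtime and is not affected by the refresh rate of the index (the point at which data becomes visible to search). If a document has been updated but not yet refreshed, the get API issues a refresh call in place to make the document visible. This also makes other documents changed since the last refresh visible. To disable realtime GET, set the realtime parameter to false.

Queries through the regular _search API, by contrast, are not realtime: only documents that have already been refreshed can be found by a search.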
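
The difference between a realtime get and a search can be pictured with a small toy model. This is only a sketch of the visible behavior described above, not of how Elasticsearch is implemented; the ToyShard class and its two dictionaries are assumptions made for the illustration.

```python
class ToyShard:
    """Toy model: writes land in a buffer; a refresh makes them searchable."""

    def __init__(self):
        self.buffer = {}      # indexed but not yet refreshed
        self.searchable = {}  # visible to search

    def index(self, doc_id, source):
        self.buffer[doc_id] = source

    def refresh(self):
        self.searchable.update(self.buffer)
        self.buffer.clear()

    def get(self, doc_id, realtime=True):
        if realtime and doc_id in self.buffer:
            # A pending change forces a refresh, which also exposes
            # every other buffered document.
            self.refresh()
        return self.searchable.get(doc_id)

    def search(self, predicate):
        return [src for src in self.searchable.values() if predicate(src)]


shard = ToyShard()
shard.index("1", {"user": "kimchy"})
shard.index("2", {"user": "other"})
print(shard.search(lambda src: True))       # [] : nothing refreshed yet
print(shard.get("1", realtime=False))       # None
print(shard.get("1"))                       # {'user': 'kimchy'}
print(len(shard.search(lambda src: True)))  # 2 : document 2 became visible too
```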

3. Source filtering

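By default, the get operation returns the contents of the _source field, unless you use the stored_fields parameter or the _source field is disabled. You can turn off _source retrieval by passing _source=false: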

curl -XGET 'http://localhost:9200/twitter/_doc/1?_source=false'

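If you only need one or two fields from the complete _source, you can use the _source_includes and _source_excludes parameters to include or filter out particular parts. This is especially useful for large documents, where partial retrieval saves network overhead. Both parameters accept a comma-separated list of fields or wildcard expressions. For example:

curl -XGET 'http://localhost:9200/twitter/_doc/1?_source_includes=*.id&_source_excludes=entities'

If you only want to specify includes, you can use a shorter notation: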

curl -XGET 'http://localhost:9200/twitter/_doc/1?_source=*.id,retweeted'
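
When these URLs are assembled in code, it helps to build the query string in one place. The sketch below uses only the Python standard library; get_url and _encode are hypothetical helper names, the base URL is the one from the curl examples, and parameter names are passed through unchanged (the plural names shown are the ones used with the _source endpoint in section 5).

```python
from urllib.parse import urlencode


def _encode(value):
    """Render a parameter value the way the query string expects it."""
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, (list, tuple)):
        return ",".join(value)
    return str(value)


def get_url(index, doc_id, base_url="http://localhost:9200", **params):
    """Build the URL for a get request with optional query parameters."""
    url = f"{base_url}/{index}/_doc/{doc_id}"
    if params:
        query = {key: _encode(value) for key, value in params.items()}
        # Keep "*" and "," readable instead of percent-encoding them.
        url += "?" + urlencode(query, safe="*,")
    return url


print(get_url("twitter", 1, _source=False))
print(get_url("twitter", 1, _source_includes="*.id", _source_excludes="entities"))
print(get_url("twitter", 1, _source=["*.id", "retweeted"]))
```

The three calls print the same URLs as the three curl commands in this section.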

4. Stored Fields

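The get operation lets you specify a set of stored fields to return by passing the stored_fields parameter. Requested fields that are not stored are ignored. Consider the following mapping:

PUT twitter
{
  "mappings": {
    "properties": {
      "counter": { "type": "integer", "store": false },
      "tags": { "type": "keyword", "store": true }
    }
  }
}

Now we can add a document:

PUT twitter/_doc/1
{
  "counter" : 1,
  "tags" : ["red"]
}

Then try to retrieve it: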

GET twitter/_doc/1?stored_fields=tags,counter

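The result of the above get operation is:

{
  "_index": "twitter",
  "_type": "_doc",
  "_id": "1",
  "_version": 1,
  "found": true,
  "fields": {
    "tags": ["red"]
  }
}

Field values fetched from the document itself are always returned as an array. Because the counter field is not stored, the get request simply ignores it when looking up the stored_fields.

Metadata fields such as _routing can also be retrieved:

PUT twitter/_doc/2?routing=user1
{
  "counter" : 1,
  "tags" : ["white"]
}

GET twitter/_doc/2?routing=user1&stored_fields=tags,counter

The result of the above get operation is:

{
  "_index": "twitter",
  "_type": "_doc",
  "_id": "2",
  "_version": 1,
  "_routing": "user1",
  "found": true,
  "fields": {
    "tags": ["white"]
  }
}

Only leaf fields can be returned through the stored_fields option. Object fields cannot be returned, and a request for one will fail.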
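
The two rules for stored fields (fields that are not stored are skipped, and values always come back as arrays) can be sketched in a few lines of Python. The function name stored_fields_view is hypothetical and the code only mimics the shape of the response; it does not talk to Elasticsearch.

```python
def stored_fields_view(properties, source, requested):
    """Return the "fields" part of a get response in this toy model."""
    fields = {}
    for name in requested:
        if not properties.get(name, {}).get("store", False):
            continue  # not stored (or unknown): silently ignored
        if name not in source:
            continue  # the document has no value for this field
        value = source[name]
        # Values are always reported as a list, even for a single value.
        fields[name] = value if isinstance(value, list) else [value]
    return fields


properties = {
    "counter": {"type": "integer", "store": False},
    "tags": {"type": "keyword", "store": True},
}
source = {"counter": 1, "tags": ["red"]}
print(stored_fields_view(properties, source, ["tags", "counter"]))
# {'tags': ['red']}
```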

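5. Retrieving only the _source field

Use the {index}/_source/{id} endpoint to get just the _source field of a document, without any additional content around it:

GET twitter/_source/1

The same source filtering parameters can be used to control which parts of the _source are returned:

GET twitter/_source/1/?_source_includes=*.id&_source_excludes=entities

There is also a HEAD variant of the _source endpoint for testing whether a document's _source exists:

HEAD twitter/_source/1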

6. Routing

If routing was controlled when the document was indexed, the same routing value must also be provided in order to get the document. For example:

curl -XGET 'http://localhost:9200/twitter/_doc/1?routing=kimchy'

The above gets the tweet with id 1, routed based on the user. Note that issuing a get without the correct routing will cause the document not to be fetched.
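
Why a missing routing value hides the document can be shown with a toy index. This sketch assumes five shards and uses CRC32 as a stand-in hash (Elasticsearch uses its own hash function); ToyIndex and shard_for are made-up names. The one property it relies on is that, without an explicit routing value, the document id is used as the routing value.

```python
import zlib

NUM_SHARDS = 5  # assumption for the sketch


def shard_for(routing_value, num_shards=NUM_SHARDS):
    """Map a routing value to a shard number (CRC32 is a stand-in hash)."""
    return zlib.crc32(str(routing_value).encode("utf-8")) % num_shards


class ToyIndex:
    def __init__(self, num_shards=NUM_SHARDS):
        self.shards = [{} for _ in range(num_shards)]

    def _shard(self, doc_id, routing):
        # Without an explicit routing value, the id itself is used.
        key = routing if routing is not None else doc_id
        return self.shards[shard_for(key, len(self.shards))]

    def index(self, doc_id, source, routing=None):
        self._shard(doc_id, routing)[doc_id] = source

    def get(self, doc_id, routing=None):
        # Only the shard selected by the routing value is consulted.
        return self._shard(doc_id, routing).get(doc_id)


idx = ToyIndex()
idx.index("1", {"user": "kimchy"}, routing="kimchy")
print(idx.get("1", routing="kimchy"))  # the document
print(idx.get("1"))  # found only if "1" and "kimchy" hash to the same shard
```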

7. Preference

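Controls which shard copies are preferred for executing the get request. By default, the operation is randomized between the shard copies.

The preference can be set to:

_primary: The operation is executed only on the primary shard.

_local: The operation prefers to be executed on a locally allocated shard if possible.

Custom (string) value: A custom value is used to guarantee that the same shard copies are used for the same custom value. This helps avoid "jumping values" when hitting different shard copies that are in different refresh states. A sample value can be something like the web session id or the user name.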
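
The effect of a custom preference value can be sketched as follows. This is a toy model only: it treats the first entry of the list as the primary, uses CRC32 as a stand-in hash, and leaves out _local because it depends on which node receives the request. The name pick_copy is hypothetical.

```python
import random
import zlib


def pick_copy(copies, preference=None):
    """Pick the shard copy that serves a get request in this toy model.

    copies[0] is treated as the primary (an assumption of the sketch).
    """
    if preference is None:
        return random.choice(copies)  # default: any copy
    if preference == "_primary":
        return copies[0]
    # Custom string: the same value always maps to the same copy.
    return copies[zlib.crc32(preference.encode("utf-8")) % len(copies)]


copies = ["primary", "replica-1", "replica-2"]
first = pick_copy(copies, "session-42")
print(all(pick_copy(copies, "session-42") == first for _ in range(10)))  # True
```

Because requests with the same session id always reach the same copy, a user does not see a document appear and disappear while the copies are in different refresh states.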

8. Refresh

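The refresh parameter can be set to true in order to refresh the relevant shard before the get operation and make it searchable. Set it to true only after careful thought and after verifying that it does not put a heavy load on the system (it also slows down indexing).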

9. Distributed

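The get operation is hashed to a specific shard id. It is then redirected to one of the copies in that shard id's replication group, which returns the result. The replication group consists of the primary shard and its replicas for that shard id. This means that the more replicas there are, the better GET scales.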

10. Versioning support

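You can use the version parameter to retrieve the document only if its current version is equal to the specified one. If the versions differ, the request fails with a version conflict. This behavior is the same for all version types (internal, external).

Internally, Elasticsearch marks the old document as deleted and adds an entirely new document. The old version of the document does not disappear immediately, although you will not be able to access it. Elasticsearch cleans up deleted documents in the background as you continue to index more data.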
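
Both points (the version check on get, and old copies that linger until a background cleanup) can be illustrated with a toy store. This is a sketch of the described behavior under simplifying assumptions, not of Elasticsearch's storage engine; ToyVersionedStore, VersionConflict and cleanup are invented names.

```python
class VersionConflict(Exception):
    """Raised when the requested version is not the current one."""


class ToyVersionedStore:
    def __init__(self):
        self.current = {}     # doc_id -> (version, source)
        self.tombstones = []  # old copies marked deleted, awaiting cleanup

    def index(self, doc_id, source):
        version = 1
        if doc_id in self.current:
            old_version, old_source = self.current[doc_id]
            # The old copy is only marked as deleted, not removed yet.
            self.tombstones.append((doc_id, old_version, old_source))
            version = old_version + 1
        self.current[doc_id] = (version, source)
        return version

    def get(self, doc_id, version=None):
        current_version, source = self.current[doc_id]
        if version is not None and version != current_version:
            raise VersionConflict(
                f"current version is {current_version}, requested {version}"
            )
        return {"_id": doc_id, "_version": current_version, "_source": source}

    def cleanup(self):
        """Stand-in for the background cleanup: drop old copies for good."""
        self.tombstones.clear()


store = ToyVersionedStore()
store.index("1", {"message": "first"})
store.index("1", {"message": "second"})
print(store.get("1", version=2)["_source"])  # {'message': 'second'}
print(len(store.tombstones))                 # 1 : the old copy is still on disk
```

Asking for version=1 at this point raises VersionConflict, even though the old copy has not been cleaned up yet.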
