news 2025/7/1 14:21:03

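I. Getting to know Elasticsearch

1. About ES

① Elasticsearch is a powerful open-source search engine that helps find the needed content quickly in massive amounts of data.

② Elasticsearch combined with Kibana, Logstash, and Beats forms the Elastic Stack (ELK), which is widely used for log analysis, real-time monitoring, and similar fields.

③ Elasticsearch is the core of the Elastic Stack and is responsible for storing, searching, and analyzing data.

(2) Differences between Lucene and Elasticsearch

Lucene is a search engine library written in Java.

Advantages of Lucene:

① Easy to extend

② High performance (based on the inverted index)

Disadvantages of Lucene:

① Limited to Java development

② Steep learning curve

③ No support for horizontal scaling

Compared with Lucene, Elasticsearch has the following advantages:

① Distributed, so it can scale horizontally

② Provides a RESTful interface that can be called from any language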

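2. Inverted index

Traditional databases (such as MySQL) use a forward index: a partial-match search has to scan the table row by row, which is slow.

Elasticsearch uses an inverted index: it builds a new table made up of two parts, and a search takes two steps, first looking up the terms and then looking up the documents.

Document: each piece of data is a document.

Term: a word produced by splitting a document according to its meaning.

An inverted index contains two parts:

Term Dictionary: records all terms and the relationship between each term and its Posting List; the terms are themselves indexed to speed up lookups and inserts.

Posting List: records the ids of the documents that contain the term, the term frequency, the positions of the term in the document, and other information.

        Document id: used to fetch the document quickly

        Term frequency (TF): the number of times the term appears in the document, used for scoring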
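The two-part structure described above can be sketched in plain Java. This is only an illustration and not how Lucene stores its index: the class name `InvertedIndexSketch`, the whitespace tokenizer, and the sample documents are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of a term dictionary plus posting lists.
class InvertedIndexSketch {

    // One posting: which document contains the term, and how often (TF).
    static final class Posting {
        final int docId;
        final int termFrequency;

        Posting(int docId, int termFrequency) {
            this.docId = docId;
            this.termFrequency = termFrequency;
        }
    }

    // Term dictionary: term -> posting list. A TreeMap keeps the terms sorted.
    static Map<String, List<Posting>> build(Map<Integer, String> documents) {
        Map<String, List<Posting>> index = new TreeMap<>();
        for (Map.Entry<Integer, String> doc : documents.entrySet()) {
            // Assumption: terms are separated by whitespace (a real analyzer does more).
            Map<String, Integer> counts = new LinkedHashMap<>();
            for (String term : doc.getValue().toLowerCase().split("\\s+")) {
                if (!term.isEmpty()) {
                    counts.merge(term, 1, Integer::sum);
                }
            }
            for (Map.Entry<String, Integer> count : counts.entrySet()) {
                index.computeIfAbsent(count.getKey(), k -> new ArrayList<>())
                     .add(new Posting(doc.getKey(), count.getValue()));
            }
        }
        return index;
    }

    // Step 1: look up the term. Step 2: collect the ids of the matching documents.
    static List<Integer> search(Map<String, List<Posting>> index, String term) {
        List<Integer> ids = new ArrayList<>();
        List<Posting> postings = index.get(term.toLowerCase());
        if (postings != null) {
            for (Posting posting : postings) {
                ids.add(posting.docId);
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        Map<Integer, String> docs = new LinkedHashMap<>();
        docs.put(1, "xiaomi phone");
        docs.put(2, "huawei phone charger");
        docs.put(3, "xiaomi band xiaomi charger");
        Map<String, List<Posting>> index = build(docs);
        System.out.println(search(index, "xiaomi")); // prints [1, 3]
    }
}
```

The lookup never scans the documents themselves: it goes from the term to its posting list and only then to the document ids, which is the "terms first, documents second" order described above.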

3. Some ES concepts

(1) ES compared with MySQL

(2) Architecture

MySQL: good at transactional operations; it can ensure data safety and consistency.

Elasticsearch: good at searching, analyzing, and computing over massive amounts of data.

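4. Installing ES and Kibana

(1) Deploying a single-node ES

(2) Deploying Kibana

Kibana provides a visual interface for Elasticsearch.

(3) Installing the IK analyzer

  1) What an analyzer does

① Splits documents into terms when the inverted index is built

② Splits the user's input into terms at search time

  2) Syntax for testing the default analyzer:

Test in Kibana's DevTools: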

POST /_analyze
{
  "analyzer": "standard",
  "text": "床前明月光，疑是地上霜！"
}

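① POST: the request method

② /_analyze: the request path; the address of the ES server is omitted here because Kibana fills it in for us

③ Request parameters, in JSON style:

        analyzer: the analyzer type, here the default standard analyzer

        text: the content to analyze

By default the text is split one character at a time, which works poorly for Chinese, so the IK analyzer is used instead.

  3) The IK analyzer has two modes:

ik_smart: fewest splits, coarse-grained

ik_max_word: finest splits, fine-grained

In general, to improve search quality the two modes are used together: ik_max_word at index time to produce as many terms as possible, and ik_smart at search time to make matching as precise as possible.

  4) Extending the IK analyzer dictionary

To extend the IK analyzer's dictionary, edit the IkAnalyzer.cfg.xml file in the config directory under the IK analyzer directory: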

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
    <comment>IK Analyzer extension configuration</comment>
    <!-- Users can configure their own extension dictionary here *** add the extension dictionary -->
    <entry key="ext_dict">ext.dic</entry>
</properties>

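Then add the words you want to extend to the file named ext.dic.

  5) Stop words

Register the stop-word dictionary in the same IkAnalyzer.cfg.xml file, then add the words you want to filter out to the stopword.dic file: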

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
    <comment>IK Analyzer extension configuration</comment>
    <!-- Users can configure their own extension dictionary here -->
    <entry key="ext_dict">ext.dic</entry>
    <!-- Users can configure their own extension stop-word dictionary here *** add the stop-word dictionary -->
    <entry key="ext_stopwords">stopword.dic</entry>
</properties>

(4) Deploying an ES cluster

This is done directly with docker-compose.

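II. Index operations

1. Mapping properties

(1) A mapping is a constraint on the documents in an index. Common mapping properties include:

① type: the field's data type. Common simple types are:

String: text (analyzable text), keyword (exact values such as brand, country, IP address)

Numeric: long, integer, short, byte, double, float

Boolean: boolean

Date: date

Object: object

② index: whether to index the field, default true

③ analyzer: which analyzer to use

④ properties: the sub-fields of this field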

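2. CRUD on indexes

(1) Creating an index

In ES, indexes and documents are operated on through RESTful requests, and the request body is written in DSL.

The DSL syntax for creating an index and its mapping is as follows:

PUT /index_name
{
  "mappings": {
    "properties": {
      "field_name":  { "type": "text", "analyzer": "ik_smart" },
      "field_name2": { "type": "keyword", "index": false },
      "field_name3": {
        "properties": {
          "sub_field": { "type": "keyword" }
        }
      }
      // ... omitted
    }
  }
}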

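(2) Viewing an index

GET /index_name

(3) Modifying an index

Once an index and its mapping have been created, the existing fields cannot be modified, but new fields can be added. The syntax is:

PUT /index_name/_mapping
{
  "properties": {
    "new_field_name": { "type": "integer" }
  }
}

(4) Deleting an index

DELETE /index_name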

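III. Document operations

1. Adding a document

POST /index_name/_doc/document_id
{
  "field1": "value1",
  "field2": "value2",
  "field3": {
    "sub_property1": "value3",
    "sub_property2": "value4"
  }
  // ...
}

2. Getting a document

GET /index_name/_doc/document_id

3. Deleting a document

DELETE /index_name/_doc/document_id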

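4. Updating a document

(1) Full update

Deletes the old document and adds a new one.

In essence, the document with the given id is deleted and a new document with the same id is added.

PUT /index_name/_doc/document_id
{
  "field1": "value1",
  "field2": "value2"
  // ... omitted
}

(2) Partial update

Modifies only the values of the specified fields.

POST /index_name/_update/document_id
{
  "doc": {
    "field_name": "new value"
  }
}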
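The difference between the two kinds of update can be modelled with an in-memory map. The sketch below does not talk to ES at all; `UpdateSemanticsSketch` and its methods are hypothetical names, and the merge is shallow, so it only illustrates which fields survive each kind of update.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for an index: document id -> document fields.
class UpdateSemanticsSketch {

    private final Map<String, Map<String, Object>> store = new HashMap<>();

    // Models PUT /index_name/_doc/id: the stored document is replaced as a whole,
    // so fields that are missing from the new document disappear.
    void put(String id, Map<String, Object> document) {
        store.put(id, new HashMap<>(document));
    }

    // Models POST /index_name/_update/id: only the given fields change.
    // Assumption: a shallow merge is enough for this illustration.
    void update(String id, Map<String, Object> partial) {
        Map<String, Object> existing = store.get(id);
        if (existing == null) {
            throw new IllegalStateException("document " + id + " does not exist");
        }
        existing.putAll(partial);
    }

    Map<String, Object> get(String id) {
        return store.get(id);
    }

    public static void main(String[] args) {
        UpdateSemanticsSketch index = new UpdateSemanticsSketch();

        Map<String, Object> jack = new HashMap<>();
        jack.put("name", "Jack");
        jack.put("age", 21);
        index.put("1", jack);

        Map<String, Object> agePatch = new HashMap<>();
        agePatch.put("age", 18);
        index.update("1", agePatch);
        // Partial update: "name" is still there, "age" is now 18.
        System.out.println(index.get("1").containsKey("name")); // prints true

        index.put("1", agePatch);
        // Full update: the document now holds only "age".
        System.out.println(index.get("1").containsKey("name")); // prints false
    }
}
```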

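5. Dynamic Mapping

When we insert a document into ES and a field in the document has no corresponding mapping, ES sets the mapping for that field automatically:

JSON type	Elasticsearch type
String	① date-format string: mapped as date ② ordinary string: mapped as text, with a keyword sub-field added
Boolean	boolean
Floating-point number	float
Integer	long
Nested object	object, with properties added
Array	determined by the first non-null element in the array
Null	ignored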
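The rules in the table can be read as a small decision function. The Java sketch below mirrors them for values that have already been parsed from JSON; it is a simplification rather than ES's own logic, the class and method names are hypothetical, and date detection is reduced to plain `yyyy-MM-dd` strings as an assumption.

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of the dynamic mapping table.
class DynamicMappingSketch {

    // Returns the type the table assigns to a parsed JSON value.
    static String inferType(Object value) {
        if (value == null) {
            return "ignored";
        }
        if (value instanceof Boolean) {
            return "boolean";
        }
        if (value instanceof Integer || value instanceof Long) {
            return "long";
        }
        if (value instanceof Float || value instanceof Double) {
            return "float";
        }
        if (value instanceof String) {
            return looksLikeDate((String) value) ? "date" : "text with keyword sub-field";
        }
        if (value instanceof Map) {
            return "object";
        }
        if (value instanceof List) {
            // An array takes the type of its first non-null element.
            for (Object element : (List<?>) value) {
                if (element != null) {
                    return inferType(element);
                }
            }
            return "ignored";
        }
        throw new IllegalArgumentException("unsupported value: " + value);
    }

    // Assumption: only ISO dates such as 2021-05-01 count as dates in this sketch.
    static boolean looksLikeDate(String text) {
        try {
            LocalDate.parse(text);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(inferType("2021-05-01"));                // prints date
        System.out.println(inferType(21));                          // prints long
        System.out.println(inferType(Arrays.asList(null, 3.5, 4))); // prints float
    }
}
```

Because the inferred type may not be the one you want (for example long where integer would do), it is usually safer to declare the mapping explicitly for fields you care about.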

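IV. Operating on indexes with RestClient

ES provides official clients for several languages. The Java client used here is the RestHighLevelClient.

In essence, the client assembles DSL statements and sends them to ES as HTTP requests.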

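1. Creating an index

(1) Import the database data

(2) Analyze the data structure

Questions to consider for the mapping:

Field name, data type, whether the field takes part in search, whether it is analyzed, and if so, which analyzer to use.

(3) Initializing the JavaRestClient

① Add the dependency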

<dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>elasticsearch-rest-high-level-client</artifactId>
</dependency>

<properties>
    <java.version>1.8</java.version>
    <elasticsearch.version>7.12.1</elasticsearch.version>
</properties>

② Initialize

RestHighLevelClient client = new RestHighLevelClient(
        RestClient.builder(HttpHost.create("http://192.168.150.101:9200"))
);

(4) Code for creating the index

@Test
void testCreateHotelIndex() throws IOException {
    // 1. Create the request object (org.elasticsearch.client.indices.CreateIndexRequest)
    CreateIndexRequest request = new CreateIndexRequest("hotel");
    // 2. Request body: MAPPING_TEMPLATE is a static String constant holding the DSL that creates the index
    request.source(MAPPING_TEMPLATE, XContentType.JSON);
    // 3. Send the request; indices() returns an object that exposes all index-level operations
    client.indices().create(request, RequestOptions.DEFAULT);
}
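The test above refers to a `MAPPING_TEMPLATE` constant without showing it. A minimal sketch of what such a constant could look like follows. The field names (`id`, `name`, `brand`, `price`) and the class name are hypothetical, since the actual hotel table is not shown here; the `name` field pairs ik_max_word at index time with ik_smart at search time, as recommended in the IK section, and assumes the IK plugin is installed.

```java
// Hypothetical holder for the index-creation DSL; field names are illustrative only.
class HotelConstantsSketch {

    // String concatenation is used instead of text blocks to stay compatible with Java 8.
    static final String MAPPING_TEMPLATE =
            "{\n" +
            "  \"mappings\": {\n" +
            "    \"properties\": {\n" +
            "      \"id\":    { \"type\": \"keyword\" },\n" +
            "      \"name\":  { \"type\": \"text\", \"analyzer\": \"ik_max_word\", \"search_analyzer\": \"ik_smart\" },\n" +
            "      \"brand\": { \"type\": \"keyword\" },\n" +
            "      \"price\": { \"type\": \"integer\" }\n" +
            "    }\n" +
            "  }\n" +
            "}";

    public static void main(String[] args) {
        // Print the DSL so it can be pasted into Kibana's DevTools after "PUT /hotel".
        System.out.println(MAPPING_TEMPLATE);
    }
}
```

Keeping the DSL in one constant means the same text can be tested in Kibana first and then reused unchanged from Java.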

2. Code for deleting the index

@Test
void testDeleteHotelIndex() throws IOException {
    // 1. Create the request object
    DeleteIndexRequest request = new DeleteIndexRequest("hotel");
    // 2. Send the request
    client.indices().delete(request, RequestOptions.DEFAULT);
}

3. Checking whether the index exists

@Test
void testExistsHotelIndex() throws IOException {
    // 1. Create the request object (org.elasticsearch.client.indices.GetIndexRequest)
    GetIndexRequest request = new GetIndexRequest("hotel");
    // 2. Send the request
    boolean exists = client.indices().exists(request, RequestOptions.DEFAULT);
    // 3. Print the result
    System.out.println(exists);
}

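V. Operating on documents with RestClient

1. Initialization

import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.junit.jupiter.api.AfterEach;
import org.junit.jupiter.api.BeforeEach;

import java.io.IOException;

public class ElasticsearchDocumentTest {
    // The client shared by all tests
    private RestHighLevelClient client;

    @BeforeEach
    void setUp() {
        client = new RestHighLevelClient(RestClient.builder(
                HttpHost.create("http://192.168.150.101:9200")
        ));
    }

    @AfterEach
    void tearDown() throws IOException {
        // Release the underlying HTTP connections
        client.close();
    }
}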

2. Adding a document

@Test
void testIndexDocument() throws IOException {
    // 1. Create the request object
    IndexRequest request = new IndexRequest("indexName").id("1");
    // 2. Prepare the JSON document
    request.source("{\"name\": \"Jack\", \"age\": 21}", XContentType.JSON);
    // 3. Send the request
    client.index(request, RequestOptions.DEFAULT);
}

3. Getting a document

@Test
void testGetDocumentById() throws IOException {
    // 1. Create the request object
    GetRequest request = new GetRequest("indexName", "1");
    // 2. Send the request and get the response
    GetResponse response = client.get(request, RequestOptions.DEFAULT);
    // 3. Parse the result
    String json = response.getSourceAsString();
    System.out.println(json);
}

4. Updating a document

@Test
void testUpdateDocumentById() throws IOException {
    // 1. Create the request object
    UpdateRequest request = new UpdateRequest("indexName", "1");
    // 2. Prepare the parameters; every two arguments form one key/value pair
    request.doc("age", 18, "name", "Rose");
    // 3. Update the document
    client.update(request, RequestOptions.DEFAULT);
}

5. Deleting a document

@Test
void testDeleteDocumentById() throws IOException {
    // 1. Create the request object
    DeleteRequest request = new DeleteRequest("indexName", "1");
    // 2. Delete the document
    client.delete(request, RequestOptions.DEFAULT);
}

6. Bulk-importing documents

@Test
void testBulk() throws IOException {
    // 1. Create the bulk request
    BulkRequest request = new BulkRequest();
    // 2. Add the requests to submit in one batch: here, two index requests.
    //    "json source" and "json source2" are placeholders for each document's JSON string.
    request.add(new IndexRequest("hotel").id("101").source("json source", XContentType.JSON));
    request.add(new IndexRequest("hotel").id("102").source("json source2", XContentType.JSON));
    // 3. Send the bulk request
    client.bulk(request, RequestOptions.DEFAULT);
}
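On the wire, a bulk request is sent to the `_bulk` endpoint as newline-delimited JSON: one action line followed by one source line per document. The sketch below builds such a body by hand to show the shape of what the client sends. `BulkBodySketch` and `buildBulkBody` are hypothetical names, the sample documents are made up, and the code assumes ids and index names need no JSON escaping.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of the newline-delimited body of POST /_bulk.
class BulkBodySketch {

    // Builds one "index" action per entry of jsonById (document id -> single-line JSON).
    static String buildBulkBody(String index, Map<String, String> jsonById) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> entry : jsonById.entrySet()) {
            // Action line: which index and id the next line belongs to.
            body.append("{\"index\":{\"_index\":\"").append(index)
                .append("\",\"_id\":\"").append(entry.getKey()).append("\"}}\n");
            // Source line: the document itself; every line, including the last, ends with \n.
            body.append(entry.getValue()).append('\n');
        }
        return body.toString();
    }

    public static void main(String[] args) {
        Map<String, String> docs = new LinkedHashMap<>();
        docs.put("101", "{\"name\":\"Hotel A\"}");
        docs.put("102", "{\"name\":\"Hotel B\"}");
        System.out.print(buildBulkBody("hotel", docs));
    }
}
```

In practice the documents would come from the database imported earlier, each converted to a JSON string and added in a loop, so that one HTTP round trip carries many documents.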
