Requirements
The product team wants users to be able to upload text documents such as PDF, Word, and TXT files, then fuzzy-search file records by attachment name or by file content, and view the file content online.
I. Environment
Project development environment:
Admin backend: Spring Boot + MyBatis-Plus + MySQL + Elasticsearch
Search engine: Elasticsearch 7.9.3 + Kibana (graphical UI)
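II. Implementation
1. Setting up the environment
Installing ES and Kibana is not covered here; there are plenty of guides online.
Setting up the backend application is not covered either, with one important caveat: the version of the Java client library used to connect to ES must match the ES server version, otherwise you will run into all kinds of problems.
2. Extracting file content
Step 1: To have ES extract the text content of attachments, first install a plugin: the Ingest Attachment Processor Plugin.
This is only one content extraction plugin; others exist as well, for example OCR plugins. Look them up if you are interested.
The Ingest Attachment Processor Plugin is a text extraction plugin. It builds on Elasticsearch's ingest node feature and provides the key preprocessor, attachment.
To install it, run the following command in the bin directory of the ES installation:
elasticsearch-plugin install ingest-attachment
Because our ES runs in Docker, we have to enter the ES container and install the plugin from its bin directory: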
[root@iZuf63d0pqnjrga4pi18udZ plugins]# docker exec -it es bash
[root@elasticsearch elasticsearch]# ls
LICENSE.txt NOTICE.txt README.asciidoc bin config data jdk lib logs modules plugins
[root@elasticsearch elasticsearch]# cd bin/
[root@elasticsearch bin]# ls
elasticsearch elasticsearch-certutil elasticsearch-croneval elasticsearch-env-from-file elasticsearch-migrate elasticsearch-plugin elasticsearch-setup-passwords elasticsearch-sql-cli elasticsearch-syskeygen x-pack-env x-pack-watcher-env
elasticsearch-certgen elasticsearch-cli elasticsearch-env elasticsearch-keystore elasticsearch-node elasticsearch-saml-metadata elasticsearch-shard elasticsearch-sql-cli-7.9.3.jar elasticsearch-users x-pack-security-env
[root@elasticsearch bin]# elasticsearch-plugin install ingest-attachment
-> Installing ingest-attachment
-> Downloading ingest-attachment from elastic
[=================================================] 100%
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@ WARNING: plugin requires additional permissions @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
* java.lang.RuntimePermission accessClassInPackage.sun.java2d.cmm.kcms
* java.lang.RuntimePermission accessDeclaredMembers
* java.lang.RuntimePermission getClassLoader
* java.lang.reflect.ReflectPermission suppressAccessChecks
* java.security.SecurityPermission createAccessControlContext
* java.security.SecurityPermission insertProvider
* java.security.SecurityPermission putProviderProperty.BC
See http://docs.oracle.com/javase/8/docs/technotes/guides/security/permissions.html
for descriptions of what these permissions allow and the associated risks.

Continue with installation? [y/N]y
-> Installed ingest-attachment
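Once "Installed" is displayed, the installation is complete. Restart ES afterwards, otherwise step 2 will fail.
Step 2: Create a text extraction pipeline
The pipeline converts an uploaded attachment into text. It handles Word, PDF, and TXT; I have not tried Excel, but it should work too. The Java code later refers to this pipeline by the name attachment, so in Kibana the body below would be sent with a request such as PUT _ingest/pipeline/attachment.
{
  "description": "Extract attachment information",
  "processors": [
    {
      "attachment": {
        "field": "content",
        "ignore_missing": true
      }
    },
    {
      "remove": {
        "field": "content"
      }
    }
  ]
}
The attachment processor reads the Base64 string in the content field and writes the extracted text to attachment.content; the remove processor then drops the original Base64 field so it is not stored.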
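The pipeline can also be registered from Java instead of Kibana. The following is a minimal sketch using only the JDK's java.net.http client (Java 11+). The class name PipelineRequestSketch and the base URL are assumptions for illustration, and the pipeline name attachment is taken from the setPipeline("attachment") call in the service code. The sketch only builds the request; a cluster with security enabled would also need credentials.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;

// Hypothetical helper, not part of the original project.
final class PipelineRequestSketch {

    // Pipeline body: extract text from "content", then drop the Base64 source field.
    static final String PIPELINE_JSON =
        "{\"description\":\"Extract attachment information\","
      + "\"processors\":["
      + "{\"attachment\":{\"field\":\"content\",\"ignore_missing\":true}},"
      + "{\"remove\":{\"field\":\"content\"}}]}";

    // Builds (but does not send) PUT {baseUrl}/_ingest/pipeline/{pipelineName}.
    static HttpRequest build(String baseUrl, String pipelineName) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/_ingest/pipeline/" + pipelineName))
            .timeout(Duration.ofSeconds(5))
            .header("Content-Type", "application/json")
            .PUT(HttpRequest.BodyPublishers.ofString(PIPELINE_JSON))
            .build();
    }

    public static void main(String[] args) {
        HttpRequest request = build("http://127.0.0.1:9200", "attachment");
        System.out.println(request.method() + " " + request.uri());
        // To send it: HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())
    }
}
```

Keeping the request construction separate from sending makes it easy to check the URL and body before touching a real cluster.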
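Step 3: Define the index that stores the content
The Java code below uses the index name fileinfo.
{
  "mappings": {
    "properties": {
      "id": { "type": "keyword" },
      "fileName": { "type": "text", "analyzer": "my_ana" },
      "contentType": { "type": "text", "analyzer": "my_ana" },
      "fileUrl": { "type": "text" },
      "attachment": {
        "properties": {
          "content": { "type": "text", "analyzer": "my_ana" }
        }
      }
    }
  },
  "settings": {
    "analysis": {
      "filter": {
        "jieba_stop": { "type": "stop", "stopwords_path": "stopword/stopwords.txt" },
        "jieba_synonym": { "type": "synonym", "synonyms_path": "synonym/synonyms.txt" }
      },
      "analyzer": {
        "my_ana": {
          "tokenizer": "jieba_index",
          "filter": ["lowercase", "jieba_stop", "jieba_synonym"]
        }
      }
    }
  }
}
mappings: defines the format of the stored fields.
settings: the index configuration. Here it defines a custom analyzer built on the jieba tokenizer, which is not built into ES and has to be installed as a separate analysis plugin.
Note: content search runs against the attachment.content field. Be sure to configure a word-segmenting analyzer for it; without one, content searches will not find the expected matches.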
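Step 4: Test
Index a test document through the pipeline, for example with a request such as POST fileinfo/_doc?pipeline=attachment (assuming the index and pipeline names used by the Java code below):
{
  "id": "1",
  "name": "進口紅酒",
  "filetype": "pdf",
  "contenttype": "文章",
  "content": "文章內容"
}
The content value shown here is a placeholder: for a real test, the attachment must be converted to Base64 and that string placed in content. Note also that the field names in this sample (name, filetype, contenttype) differ from those in the mapping (fileName, fileType, contentType); use the mapping's names if the document should be searchable through the API shown later.
Online file-to-Base64 converter: https://www.zhangxinxu.com/sp/base64.html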
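The Base64 string can also be produced locally instead of with an online converter. Below is a minimal sketch using only the JDK; the class name Base64FileSketch and the temporary sample file are made up for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;

// Hypothetical helper, not part of the original project.
final class Base64FileSketch {

    // Encodes raw bytes with the standard (non-URL-safe) Base64 alphabet.
    static String encode(byte[] raw) {
        return Base64.getEncoder().encodeToString(raw);
    }

    // Reads the whole file into memory and encodes it; fine for small attachments.
    static String encodeFile(Path file) throws IOException {
        return encode(Files.readAllBytes(file));
    }

    public static void main(String[] args) throws IOException {
        Path sample = Files.createTempFile("sample", ".txt");
        try {
            Files.write(sample, "hello".getBytes(StandardCharsets.UTF_8));
            System.out.println(encodeFile(sample)); // prints "aGVsbG8="
        } finally {
            Files.delete(sample);
        }
    }
}
```

The resulting string is what goes into the content field of the test document.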
Query the files that were just uploaded:
{
  "took": 861,
  "timed_out": false,
  "_shards": { "total": 1, "successful": 1, "skipped": 0, "failed": 0 },
  "hits": {
    "total": { "value": 5, "relation": "eq" },
    "max_score": 1.0,
    "hits": [
      {
        "_index": "fileinfo",
        "_type": "_doc",
        "_id": "lkPEgYIBz3NlBKQzXYX9",
        "_score": 1.0,
        "_source": {
          "fileName": "測試_20220809164145A002.docx",
          "updateTime": 1660034506000,
          "attachment": {
            "date": "2022-08-09T01:38:00Z",
            "content_type": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
            "author": "DELL",
            "language": "lt",
            "content": "內容",
            "content_length": 2572
          },
          "createTime": 1660034506000,
          "fileUrl": "http://localhost:8092/fileInfo/profile/upload/fileInfo/2022/08/09/測試_20220809164145A002.docx",
          "id": 1306333192,
          "contentType": "文章",
          "fileType": "docx"
        }
      },
      {
        "_index": "fileinfo",
        "_type": "_doc",
        "_id": "mUPHgYIBz3NlBKQzwIVW",
        "_score": 1.0,
        "_source": {
          "fileName": "測試_20220809164527A001.docx",
          "updateTime": 1660034728000,
          "attachment": {
            "date": "2022-08-09T01:38:00Z",
            "content_type": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
            "author": "DELL",
            "language": "lt",
            "content": "內容",
            "content_length": 2572
          },
          "createTime": 1660034728000,
          "fileUrl": "http://localhost:8092/fileInfo/profile/upload/fileInfo/2022/08/09/測試_20220809164527A001.docx",
          "id": 1306333193,
          "contentType": "文章",
          "fileType": "docx"
        }
      },
      {
        "_index": "fileinfo",
        "_type": "_doc",
        "_id": "JDqshoIBbkTNu1UgkzFK",
        "_score": 1.0,
        "_source": {
          "fileName": "txt測試_20220810153351A001.txt",
          "updateTime": 1660116831000,
          "attachment": {
            "content_type": "text/plain; charset=UTF-8",
            "language": "lt",
            "content": "內容",
            "content_length": 804
          },
          "createTime": 1660116831000,
          "fileUrl": "http://localhost:8092/fileInfo/profile/upload/fileInfo/2022/08/10/txt測試_20220810153351A001.txt",
          "id": 1306333194,
          "contentType": "告示",
          "fileType": "txt"
        }
      }
    ]
  }
}
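After calling the upload endpoint, we can see that the text content has been extracted into ES. From here on, the content can be searched with word segmentation and the matches highlighted.
III. Code
The code works as follows: upload the file and store the attachment information and upload URL in the database; call ES to extract the text content and store it under the corresponding index; expose a full-text search API for the mini program that offers keyword suggestions based on file names, fuzzy full-text matching on file name and content, and highlighting of the matched terms. The code follows.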
yml configuration file:
# Data source configuration
spring:
  # Service module
  devtools:
    restart:
      # Hot-reload switch
      enabled: true
  # Search engine
  elasticsearch:
    rest:
      url: 127.0.0.1
      uris: 127.0.0.1:9200
      connection-timeout: 1000
      read-timeout: 3000
      username: elastic
      password: 123456
ElasticsearchConfig (connection configuration)
package com.yj.rselasticsearch.domain.config;

import org.apache.http.HttpHost;
import org.apache.http.auth.AuthScope;
import org.apache.http.auth.UsernamePasswordCredentials;
import org.apache.http.impl.client.BasicCredentialsProvider;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

import java.time.Duration;

@Configuration
public class ElasticsearchConfig {

    @Value("${spring.elasticsearch.rest.url}")
    private String edUrl;

    @Value("${spring.elasticsearch.rest.username}")
    private String userName;

    @Value("${spring.elasticsearch.rest.password}")
    private String password;

    @Bean
    public RestHighLevelClient restHighLevelClient() {
        // Set the username and password for the connection
        final BasicCredentialsProvider credentialsProvider = new BasicCredentialsProvider();
        credentialsProvider.setCredentials(AuthScope.ANY, new UsernamePasswordCredentials(userName, password));
        RestHighLevelClient client = new RestHighLevelClient(
            RestClient.builder(new HttpHost(edUrl, 9200, "http"))
                .setHttpClientConfigCallback(httpClientBuilder -> {
                    httpClientBuilder.disableAuthCaching();
                    // Keep pooled connections alive; without this, the first request
                    // after ES had been idle for a while used to time out
                    httpClientBuilder.setKeepAliveStrategy(
                        (response, context) -> Duration.ofMinutes(5).toMillis());
                    return httpClientBuilder.setDefaultCredentialsProvider(credentialsProvider);
                }));
        return client;
    }
}
File upload: save the file information and extract the content into ES
Entity class FileInfo
package com.yj.common.core.domain.entity;

import com.baomidou.mybatisplus.annotation.TableField;
import lombok.Getter;
import lombok.Setter;
import org.springframework.data.elasticsearch.annotations.Document;
import org.springframework.data.elasticsearch.annotations.Field;
import org.springframework.data.elasticsearch.annotations.FieldType;

import java.util.Date;

@Setter
@Getter
@Document(indexName = "fileinfo", createIndex = false)
public class FileInfo {

    /** Primary key */
    @Field(name = "id", type = FieldType.Integer)
    private Integer id;

    /** File name */
    @Field(name = "fileName", type = FieldType.Text, analyzer = "jieba_index", searchAnalyzer = "jieba_index")
    private String fileName;

    /** File type (extension) */
    @Field(name = "fileType", type = FieldType.Keyword)
    private String fileType;

    /** Content type */
    @Field(name = "contentType", type = FieldType.Text)
    private String contentType;

    /** Attachment content; exists only in ES, not in the database table */
    @Field(name = "attachment.content", type = FieldType.Text, analyzer = "jieba_index", searchAnalyzer = "jieba_index")
    @TableField(exist = false)
    private String content;

    /** File URL */
    @Field(name = "fileUrl", type = FieldType.Text)
    private String fileUrl;

    /** Creation time */
    private Date createTime;

    /** Update time */
    private Date updateTime;
}
Controller class
package com.yj.rselasticsearch.controller;

import com.yj.common.core.controller.BaseController;
import com.yj.common.core.domain.AjaxResult;
import com.yj.common.core.domain.entity.FileInfo;
import com.yj.rselasticsearch.service.FileInfoService;
import org.springframework.web.bind.annotation.*;
import org.springframework.web.multipart.MultipartFile;

import javax.annotation.Resource;

/**
 * Controller for the file_info table
 *
 * @author xxxxx
 */
@RestController
@RequestMapping("/fileInfo")
public class FileInfoController extends BaseController {

    /** Service object */
    @Resource
    private FileInfoService fileInfoService;

    @PutMapping("uploadFile")
    public AjaxResult uploadFile(String contentType, MultipartFile file) {
        return fileInfoService.uploadFileInfo(contentType, file);
    }
}
serviceImpl implementation class
package com.yj.rselasticsearch.service.impl;

import com.alibaba.fastjson.JSON;
import com.baomidou.mybatisplus.core.conditions.query.LambdaQueryWrapper;
import com.yj.common.config.RuoYiConfig;
import com.yj.common.core.domain.AjaxResult;
import com.yj.common.utils.FastUtils;
import com.yj.common.utils.StringUtils;
import com.yj.common.utils.file.FileUploadUtils;
import com.yj.common.utils.file.FileUtils;
import com.yj.framework.config.ServerConfig;
import lombok.extern.slf4j.Slf4j;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.action.index.IndexResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.common.xcontent.XContentType;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.data.elasticsearch.core.ElasticsearchRestTemplate;
import org.springframework.stereotype.Service;
import javax.annotation.Resource;
import com.yj.common.core.domain.entity.FileInfo;
import com.yj.rselasticsearch.mapper.FileInfoMapper;
import com.yj.rselasticsearch.service.FileInfoService;
import org.springframework.web.multipart.MultipartFile;

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.Base64;
import com.yj.common.exception.ServiceException;

@Service
@Slf4j
public class FileInfoServiceImpl implements FileInfoService {

    @Resource
    private ServerConfig serverConfig;

    @Autowired
    @Qualifier("restHighLevelClient")
    private RestHighLevelClient client;

    @Resource
    private FileInfoMapper fileInfoMapper;

    /**
     * Upload a file, extract its content, and index it in ES
     *
     * @param contentType content type tag
     * @param file        uploaded file
     * @return the saved file information
     */
    @Override
    public AjaxResult uploadFileInfo(String contentType, MultipartFile file) {
        if (FastUtils.checkNullOrEmpty(contentType, file)) {
            return AjaxResult.error("Request parameters must not be empty");
        }
        try {
            // Upload directory
            String filePath = RuoYiConfig.getUploadPath() + "/fileInfo";
            FileInfo fileInfo = new FileInfo();
            // Upload and get the new file name back
            String fileName = FileUploadUtils.upload(filePath, file);
            // File extension without the dot
            String extension = fileName.substring(fileName.lastIndexOf(".") + 1);
            File files = File.createTempFile(fileName, "." + extension);
            file.transferTo(files);
            String url = serverConfig.getUrl() + "/fileInfo" + fileName;
            fileInfo.setFileName(FileUtils.getName(fileName));
            fileInfo.setFileType(extension);
            fileInfo.setFileUrl(url);
            fileInfo.setContentType(contentType);
            int result = fileInfoMapper.insertSelective(fileInfo);
            if (result > 0) {
                fileInfo = fileInfoMapper.selectOne(
                    new LambdaQueryWrapper<FileInfo>().eq(FileInfo::getFileUrl, fileInfo.getFileUrl()));
                byte[] bytes = getContent(files);
                String base64 = Base64.getEncoder().encodeToString(bytes);
                fileInfo.setContent(base64);
                IndexRequest indexRequest = new IndexRequest("fileinfo");
                // Index the document through the attachment pipeline so the text is extracted
                indexRequest.source(JSON.toJSONString(fileInfo), XContentType.JSON);
                indexRequest.setPipeline("attachment");
                IndexResponse indexResponse = client.index(indexRequest, RequestOptions.DEFAULT);
                log.info("indexResponse:" + indexResponse);
            }
            return AjaxResult.success(fileInfo);
        } catch (Exception e) {
            return AjaxResult.error(e.getMessage());
        }
    }

    /**
     * Read a file fully into a byte array (it is Base64-encoded by the caller)
     *
     * @param file file to read
     * @return the file's bytes
     * @throws IOException if reading fails
     */
    private byte[] getContent(File file) throws IOException {
        long fileSize = file.length();
        if (fileSize > Integer.MAX_VALUE) {
            // Fail loudly instead of returning null, which the caller would dereference
            throw new ServiceException("File too big: " + file.getName());
        }
        byte[] buffer = new byte[(int) fileSize];
        // try-with-resources closes the stream even if reading throws
        try (FileInputStream fi = new FileInputStream(file)) {
            int offset = 0;
            int numRead;
            while (offset < buffer.length
                    && (numRead = fi.read(buffer, offset, buffer.length - offset)) >= 0) {
                offset += numRead;
            }
            // Make sure all data was read
            if (offset != buffer.length) {
                throw new ServiceException("Could not completely read file " + file.getName());
            }
        }
        return buffer;
    }
}
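The upload code derives the file type with substring(lastIndexOf(".") + 1). For a name without a dot, lastIndexOf returns -1 and the whole name comes back as the "extension". The following is a minimal sketch of a stricter helper; the class and method names are hypothetical and not part of the project.

```java
import java.util.Locale;

// Hypothetical helper, not part of the original project.
final class FileExtensionSketch {

    // Returns the lower-cased extension without the dot, or "" when the last
    // path segment has no extension (no dot, leading dot only, or trailing dot).
    static String extensionOf(String fileName) {
        int slash = Math.max(fileName.lastIndexOf('/'), fileName.lastIndexOf('\\'));
        int dot = fileName.lastIndexOf('.');
        // The dot must come after the first character of the last path segment
        // and must not be the final character.
        if (dot <= slash + 1 || dot == fileName.length() - 1) {
            return "";
        }
        return fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(extensionOf("/2022/08/10/report.PDF")); // prints "pdf"
        System.out.println(extensionOf("README").isEmpty());       // prints "true"
    }
}
```

Lower-casing matters here because fileType is mapped as a keyword field in the entity, so "PDF" and "pdf" would otherwise be treated as different values.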
Highlighted full-text search
Request parameter class WarningInfoDto
package com.yj.rselasticsearch.domain.dto;

import io.swagger.annotations.ApiModel;
import io.swagger.annotations.ApiModelProperty;
import lombok.Data;

import java.util.List;

/**
 * Data transfer object for front-end requests
 *
 * @author luoY
 */
@Data
@ApiModel(value = "WarningInfoDto", description = "Warning information")
public class WarningInfoDto {

    /** Page number */
    @ApiModelProperty("Page number")
    private Integer pageIndex;

    /** Items per page */
    @ApiModelProperty("Items per page")
    private Integer pageSize;

    /** Search keyword */
    @ApiModelProperty("Search keyword")
    private String keyword;

    /** Content types */
    private List<String> contentType;

    /** User phone number */
    private String phone;
}
Controller class
package com.yj.rselasticsearch.controller;

import com.baomidou.mybatisplus.core.metadata.IPage;
import com.yj.common.core.controller.BaseController;
import com.yj.common.core.domain.AjaxResult;
import com.yj.common.core.domain.entity.FileInfo;
import com.yj.common.core.domain.entity.WarningInfo;
import com.yj.rselasticsearch.service.ElasticsearchService;
import com.yj.rselasticsearch.service.WarningInfoService;
import io.swagger.annotations.Api;
import io.swagger.annotations.ApiImplicitParam;
import io.swagger.annotations.ApiImplicitParams;
import io.swagger.annotations.ApiOperation;
import org.springframework.web.bind.annotation.*;
import com.yj.rselasticsearch.domain.dto.WarningInfoDto;

import javax.annotation.Resource;
import javax.servlet.http.HttpServletRequest;
import java.util.List;

/**
 * ES search engine
 *
 * @author luoy
 */
@Api("Search engine")
@RestController
@RequestMapping("es")
public class ElasticsearchController extends BaseController {

    @Resource
    private ElasticsearchService elasticsearchService;

    /**
     * Keyword suggestions
     *
     * @param warningInfoDto request parameters
     * @return suggested file names
     */
    @ApiOperation("Keyword suggestions")
    @ApiImplicitParams({
        @ApiImplicitParam(name = "contentType", value = "Document type", required = true, dataType = "String", dataTypeClass = String.class),
        @ApiImplicitParam(name = "keyword", value = "Keyword", required = true, dataType = "String", dataTypeClass = String.class)
    })
    @PostMapping("getAssociationalWordDoc")
    public AjaxResult getAssociationalWordDoc(@RequestBody WarningInfoDto warningInfoDto, HttpServletRequest request) {
        List<String> words = elasticsearchService.getAssociationalWordOther(warningInfoDto, request);
        return AjaxResult.success(words);
    }

    /**
     * Paged full-text search with highlighting
     *
     * @param warningInfoDto request parameters
     * @return one page of matching files
     */
    @ApiOperation("Paged full-text search with highlighting")
    @ApiImplicitParams({
        @ApiImplicitParam(name = "keyword", value = "Keyword", required = true, dataType = "String", dataTypeClass = String.class),
        @ApiImplicitParam(name = "pageIndex", value = "Page number", required = true, dataType = "Integer", dataTypeClass = Integer.class),
        @ApiImplicitParam(name = "pageSize", value = "Page size", required = true, dataType = "Integer", dataTypeClass = Integer.class),
        @ApiImplicitParam(name = "contentType", value = "Document type", required = true, dataType = "String", dataTypeClass = String.class)
    })
    @PostMapping("queryHighLightWordDoc")
    public AjaxResult queryHighLightWordDoc(@RequestBody WarningInfoDto warningInfoDto, HttpServletRequest request) {
        IPage<FileInfo> warningInfoListPage = elasticsearchService.queryHighLightWordOther(warningInfoDto, request);
        return AjaxResult.success(warningInfoListPage);
    }
}
serviceImpl implementation class
package com.yj.rselasticsearch.service.impl;

import com.alibaba.fastjson.JSON;
import com.baomidou.mybatisplus.core.conditions.query.LambdaQueryWrapper;
import com.baomidou.mybatisplus.core.metadata.IPage;
import com.baomidou.mybatisplus.extension.plugins.pagination.Page;
import com.yj.common.constant.DataConstants;
import com.yj.common.constant.HttpStatus;
import com.yj.common.core.domain.entity.FileInfo;
import com.yj.common.core.domain.entity.WarningInfo;
import com.yj.common.core.domain.entity.WhiteList;
import com.yj.common.core.redis.RedisCache;
import com.yj.common.exception.ServiceException;
import com.yj.common.utils.FastUtils;
import com.yj.rselasticsearch.domain.dto.RetrievalRecordDto;
import com.yj.rselasticsearch.domain.dto.WarningInfoDto;
import com.yj.rselasticsearch.domain.vo.MemberVo;
import com.yj.rselasticsearch.service.*;
import lombok.extern.slf4j.Slf4j;
import org.elasticsearch.action.bulk.BulkRequest;
import org.elasticsearch.action.bulk.BulkResponse;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.common.xcontent.XContentType;
import org.elasticsearch.index.query.BoolQueryBuilder;
import org.elasticsearch.index.query.Operator;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.fetch.subphase.highlight.HighlightBuilder;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.data.domain.PageRequest;
import org.springframework.data.domain.Pageable;
import org.springframework.data.elasticsearch.core.ElasticsearchRestTemplate;
import org.springframework.data.elasticsearch.core.SearchHits;
import org.springframework.data.elasticsearch.core.query.*;
import org.springframework.stereotype.Service;

import javax.annotation.Resource;
import javax.servlet.http.HttpServletRequest;
import java.util.*;
import java.util.stream.Collectors;

@Service
@Slf4j
public class ElasticsearchServiceImpl implements ElasticsearchService {

    @Resource
    private WhiteListService whiteListService;

    @Autowired
    @Qualifier("restHighLevelClient")
    private RestHighLevelClient client;

    /** Spring Data template used to run the searches below */
    @Resource
    private ElasticsearchRestTemplate elasticsearchRestTemplate;

    @Autowired
    private RedisCache redisCache;

    @Resource
    private TokenService tokenService;

    /**
     * Keyword suggestions for documents (suggest file names from the text in the search box)
     *
     * @param warningInfoDto request parameters
     * @return up to 9 distinct highlighted file names
     */
    @Override
    public List<String> getAssociationalWordOther(WarningInfoDto warningInfoDto, HttpServletRequest request) {
        // Field to search. With a must clause present, should clauses are optional
        // by default, so require at least one of them to match.
        BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery()
            .should(QueryBuilders.matchBoolPrefixQuery("fileName", warningInfoDto.getKeyword()))
            .minimumShouldMatch(1);
        // Filter by contentType tag
        boolQueryBuilder.must(QueryBuilders.termsQuery("contentType", warningInfoDto.getContentType()));
        // Build the highlighted query
        NativeSearchQuery searchQuery = new NativeSearchQueryBuilder()
            .withQuery(boolQueryBuilder)
            .withHighlightFields(new HighlightBuilder.Field("fileName"))
            .withHighlightBuilder(new HighlightBuilder().preTags("<span style='color:red'>").postTags("</span>"))
            .build();
        // Run the query
        SearchHits<FileInfo> search;
        try {
            search = elasticsearchRestTemplate.search(searchQuery, FileInfo.class);
        } catch (Exception ex) {
            log.error("Keyword suggestion query failed", ex);
            throw new ServiceException(String.format("Operation failed, please contact the administrator! %s", ex.getMessage()));
        }
        // Collect the highlighted file names
        List<String> resultList = new LinkedList<>();
        for (org.springframework.data.elasticsearch.core.SearchHit<FileInfo> searchHit : search.getSearchHits()) {
            Map<String, List<String>> highlightFields = searchHit.getHighlightFields();
            List<String> fileNameHighlights = highlightFields.get("fileName");
            if (fileNameHighlights != null && !fileNameHighlights.isEmpty()) {
                resultList.add(fileNameHighlights.get(0));
            }
        }
        // De-duplicate first, then keep at most 9 suggestions
        return resultList.stream().distinct().limit(9).collect(Collectors.toList());
    }

    /**
     * Highlighted full-text search over other document types
     *
     * @param warningInfoDto request parameters
     * @param request        HTTP request
     * @return one page of matching files
     */
    @Override
    public IPage<FileInfo> queryHighLightWordOther(WarningInfoDto warningInfoDto, HttpServletRequest request) {
        // Paging (Spring Data pages are zero-based)
        Pageable pageable = PageRequest.of(warningInfoDto.getPageIndex() - 1, warningInfoDto.getPageSize());
        // Fields to search: full-text match of the keyword against fileName and attachment.content
        BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery()
            .should(QueryBuilders.matchBoolPrefixQuery("fileName", warningInfoDto.getKeyword()))
            .should(QueryBuilders.matchBoolPrefixQuery("attachment.content", warningInfoDto.getKeyword()))
            .minimumShouldMatch(1);
        // Filter by contentType tag
        boolQueryBuilder.must(QueryBuilders.termsQuery("contentType", warningInfoDto.getContentType()));
        // Build the highlighted query
        NativeSearchQuery searchQuery = new NativeSearchQueryBuilder()
            .withQuery(boolQueryBuilder)
            .withPageable(pageable)
            .withHighlightFields(new HighlightBuilder.Field("fileName"), new HighlightBuilder.Field("attachment.content"))
            .withHighlightBuilder(new HighlightBuilder().preTags("<span style='color:red'>").postTags("</span>"))
            .build();
        // Run the query
        SearchHits<FileInfo> search;
        try {
            search = elasticsearchRestTemplate.search(searchQuery, FileInfo.class);
        } catch (Exception ex) {
            log.error("Highlighted search failed", ex);
            throw new ServiceException(String.format("Operation failed, please contact the administrator! %s", ex.getMessage()));
        }
        // Collect the result entities
        List<FileInfo> resultList = new LinkedList<>();
        for (org.springframework.data.elasticsearch.core.SearchHit<FileInfo> searchHit : search.getSearchHits()) {
            Map<String, List<String>> highlightFields = searchHit.getHighlightFields();
            FileInfo hit = searchHit.getContent();
            // Replace the plain values with the highlighted fragments where available;
            // highlights are keyed by the full field path
            List<String> fileNameHighlights = highlightFields.get("fileName");
            if (fileNameHighlights != null && !fileNameHighlights.isEmpty()) {
                hit.setFileName(fileNameHighlights.get(0));
            }
            List<String> contentHighlights = highlightFields.get("attachment.content");
            if (contentHighlights != null && !contentHighlights.isEmpty()) {
                hit.setContent(contentHighlights.get(0));
            }
            resultList.add(hit);
        }
        // Build the page object by hand
        IPage<FileInfo> warningInfoIPage = new Page<>();
        warningInfoIPage.setTotal(search.getTotalHits());
        warningInfoIPage.setRecords(resultList);
        warningInfoIPage.setCurrent(warningInfoDto.getPageIndex());
        warningInfoIPage.setSize(warningInfoDto.getPageSize());
        // Page count = ceil(total / pageSize)
        long pageSize = warningInfoDto.getPageSize();
        warningInfoIPage.setPages((warningInfoIPage.getTotal() + pageSize - 1) / pageSize);
        return warningInfoIPage;
    }
}
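The number of pages reported to the client is the total hit count divided by the page size, rounded up; a modulo gives the size of the last partial page instead and only happens to agree in cases such as 1 hit with a page size of 10. A minimal sketch with a hypothetical helper class:

```java
// Hypothetical helper, not part of the original project.
final class PageMathSketch {

    // Ceiling division: how many pages are needed to show totalHits items.
    static long totalPages(long totalHits, long pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        return (totalHits + pageSize - 1) / pageSize;
    }

    public static void main(String[] args) {
        System.out.println(totalPages(1, 10));  // prints "1"
        System.out.println(totalPages(25, 10)); // prints "3"
        System.out.println(25 % 10);            // prints "5", the remainder, not the page count
    }
}
```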
Testing the code:
-- Request JSON
{
  "keyword": "全庫備份",
  "contentType": ["告示"],
  "pageIndex": 1,
  "pageSize": 10
}
-- Response
{
  "msg": "操作成功",
  "code": 200,
  "data": {
    "records": [
      {
        "id": 1306333194,
        "fileName": "txt測試_20220810153351A001.txt",
        "fileType": "txt",
        "contentType": "告示",
        "content": "?\t秒級快速<span style='color:red'>備份</span>\r\n不論多大的數據量,<span style='color:red'>全庫</span><span style='color:red'>備份</span>只需30秒,而且<span style='color:red'>備份過程</span>不會對數據庫加鎖,對應用程序幾乎無影響,全天24小時均可進行<span style='color:red'>備份</span>。",
        "fileUrl": "http://localhost:8092/fileInfo/profile/upload/fileInfo/2022/08/10/txt測試_20220810153351A001.txt",
        "createTime": "2022-08-10T15:33:51.000+08:00",
        "updateTime": "2022-08-10T15:33:51.000+08:00"
      }
    ],
    "total": 1,
    "size": 10,
    "current": 1,
    "orders": [],
    "optimizeCountSql": true,
    "searchCount": true,
    "countId": null,
    "maxLimit": null,
    "pages": 1
  }
}
The keyword 全庫備份 ("full database backup") is split into terms by the analyzer, and the response returns the content that matches those terms with each matched term wrapped in the highlight tags.
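When mapping hits back to entities, each field is set to its first highlight fragment if one exists and to the stored value otherwise. As far as I know, the highlight map is keyed by the full field path, so the content fragments appear under attachment.content rather than content. A minimal sketch of that lookup, with a hypothetical class name and sample data:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the original project.
final class HighlightSketch {

    // Returns the first highlight fragment for the field, or the fallback
    // when the field was not highlighted at all.
    static String highlightOrDefault(Map<String, List<String>> highlights, String field, String fallback) {
        List<String> fragments = highlights.get(field);
        if (fragments == null || fragments.isEmpty()) {
            return fallback;
        }
        return fragments.get(0);
    }

    public static void main(String[] args) {
        Map<String, List<String>> highlights = Collections.singletonMap(
            "attachment.content",
            Arrays.asList("<span style='color:red'>backup</span> takes 30 seconds"));
        // Highlighted field: the fragment is used
        System.out.println(highlightOrDefault(highlights, "attachment.content", "plain text"));
        // Field without highlights: the stored value is kept
        System.out.println(highlightOrDefault(highlights, "fileName", "report.txt"));
    }
}
```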