Tag: ik中文分词 (IK Chinese word segmentation)

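elasticsearch, Spring, September 21, 2019

springboot2.1.8+elasticsearch7.3.2 (Part 4): querying documents

My demo project can be downloaded from: https://gitee.com/zhuhongliang/soringboot_elastic_search_732.git

Following the approach from the previous posts, create one more entity class, Article2. Apart from the name, its members are identical to those of Article.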

public class Article2 {
    private Integer id;
    private String title;
    private String content;
    private Date create_time;
    // getters & setters omitted
}

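Using the methods written in the previous posts, create an index for this entity class and bulk-create some documents.
Now let's run some queries. Here is the code.
Service: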

/**
 * Runs a multi_match query.
 *
 * @param clasz     entity class whose index is searched; null searches every index
 * @param value     the text to search for
 * @param start     offset of the first hit to return
 * @param size      maximum number of hits to return
 * @param fieldName the fields to search
 * @return the matching hits, or an empty array if the request fails
 */
public SearchHit[] searchDoc(Class<?> clasz, String value, int start, int size, String... fieldName){

    SearchRequest searchRequest;
    if (clasz == null) {
        // no index given: the request targets every index
        searchRequest = new SearchRequest();
    }else{
        // the index name is the lower-cased simple class name
        searchRequest = new SearchRequest(clasz.getSimpleName().toLowerCase());
    }
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    log.info("fieldName.length:{},fieldName:{}",fieldName.length,fieldName);
    QueryBuilder queryBuilder = QueryBuilders.multiMatchQuery(value, fieldName);
    searchSourceBuilder.query(queryBuilder);
    searchSourceBuilder.from(start);
    searchSourceBuilder.size(size);
    searchRequest.source(searchSourceBuilder);
    return search2(searchRequest);

}
public SearchHit[] search2(SearchRequest searchRequest){
    try {
        SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
        return searchResponse.getHits().getHits();
    } catch (IOException e) {
        log.error("search failed", e);
    }
    // return an empty array rather than null so callers can iterate safely
    return new SearchHit[0];
}
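
To make the two derived inputs of searchDoc concrete, here is a minimal, dependency-free sketch of how the index name and the from offset come about. SearchParams, indicesFor and fromOffset are hypothetical names introduced only for illustration; they are not part of the demo project or of the Elasticsearch client.

```java
import java.util.Locale;

// Hypothetical helper, not part of the demo project: mirrors how searchDoc
// turns a class into an index name and a 1-based page into a "from" offset.
final class SearchParams {

    // Stand-in for the article's entity class.
    static final class Article2 { }

    // Index name convention used in the article: the lower-cased simple class name.
    // A null class means "no index given", represented here as an empty array.
    static String[] indicesFor(Class<?> clazz) {
        if (clazz == null) {
            return new String[0];
        }
        return new String[] { clazz.getSimpleName().toLowerCase(Locale.ROOT) };
    }

    // Converts a 1-based page number into the zero-based offset passed to from().
    static int fromOffset(int page, int size) {
        if (page < 1 || size < 0) {
            throw new IllegalArgumentException("page must be >= 1 and size >= 0");
        }
        return (page - 1) * size;
    }
}
```

With page = 1 and size = 10, as in the test below, the offset is 0; page 3 would start at offset 20.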

Test method:

@Test
public void searchDoc(){
    int page =1;
    int size =10;
    // JSONObject.toJSONString is applied to a String here, so the source is quoted and escaped once more in the log
    System.out.println("==============================no index specified, one field===============================================");
    SearchHit[] hits = commodityServiceImpl.searchDoc(null, "春晚", (page-1)*size, size, "title");
    for(SearchHit hit:hits){
        log.info("index:{},value:{}",hit.getIndex(),JSONObject.toJSONString(hit.getSourceAsString()));
    }
    System.out.println("\n\n==============================no index specified, multiple fields===============================================");
    hits = commodityServiceImpl.searchDoc(null, "春晚", (page-1)*size, size, "title","content");
    for(SearchHit hit:hits){
        log.info("index:{},value:{}",hit.getIndex(),JSONObject.toJSONString(hit.getSourceAsString()));
    }
    System.out.println("\n\n==============================index specified, one field===============================================");
    hits = commodityServiceImpl.searchDoc(Article2.class, "春晚", (page-1)*size, size, "title");
    for(SearchHit hit:hits){
        log.info("index:{},value:{}",hit.getIndex(),JSONObject.toJSONString(hit.getSourceAsString()));
    }
    System.out.println("\n\n==============================index specified, multiple fields===============================================");
    hits = commodityServiceImpl.searchDoc(Article2.class, "春晚", (page-1)*size, size, "title","content");
    for(SearchHit hit:hits){
        log.info("index:{},value:{}",hit.getIndex(),JSONObject.toJSONString(hit.getSourceAsString()));
    }
}

Output:

==============================no index specified, one field===============================================
2019-09-21 15:16:55.053  INFO 228236 --- [           main] c.z.e.service.impl.ServiceImpl           : fieldName.length:1,fieldName:[title]
2019-09-21 15:16:58.136  INFO 228236 --- [           main] c.z.e.ElasticsearchDemoApplicationTests  : index:article,value:"{\"content\":\"风住尘香花已尽,日晚倦梳头。物是人非事事休,欲语泪先流。\\n闻说双溪春尚好,也拟泛轻舟。只恐双溪舴艋舟,载不动许多愁。测试寻寻觅觅咏梅.测试咏梅\",\"create_time\":1568997067851,\"title\":\"武陵春·春晚\"}"
2019-09-21 15:16:58.136  INFO 228236 --- [           main] c.z.e.ElasticsearchDemoApplicationTests  : index:article2,value:"{\"content\":\"风住尘香花已尽,日晚倦梳头。物是人非事事休,欲语泪先流。\\n闻说双溪春尚好,也拟泛轻舟。只恐双溪舴艋舟,载不动许多愁。测试寻寻觅觅咏梅.测试咏梅\",\"create_time\":1569050046334,\"title\":\"武陵春·春晚\"}"


==============================no index specified, multiple fields===============================================
2019-09-21 15:16:58.136  INFO 228236 --- [           main] c.z.e.service.impl.ServiceImpl           : fieldName.length:2,fieldName:[title, content]
2019-09-21 15:16:58.147  INFO 228236 --- [           main] c.z.e.ElasticsearchDemoApplicationTests  : index:article,value:"{\"content\":\"风住尘香花已尽,日晚倦梳头。物是人非事事休,欲语泪先流。\\n闻说双溪春尚好,也拟泛轻舟。只恐双溪舴艋舟,载不动许多愁。测试寻寻觅觅咏梅.测试咏梅\",\"create_time\":1568997067851,\"title\":\"武陵春·春晚\"}"
2019-09-21 15:16:58.147  INFO 228236 --- [           main] c.z.e.ElasticsearchDemoApplicationTests  : index:article2,value:"{\"content\":\"风住尘香花已尽,日晚倦梳头。物是人非事事休,欲语泪先流。\\n闻说双溪春尚好,也拟泛轻舟。只恐双溪舴艋舟,载不动许多愁。测试寻寻觅觅咏梅.测试咏梅\",\"create_time\":1569050046334,\"title\":\"武陵春·春晚\"}"
2019-09-21 15:16:58.147  INFO 228236 --- [           main] c.z.e.ElasticsearchDemoApplicationTests  : index:article,value:"{\"content\":\"寻寻觅觅,冷冷清清,凄凄惨惨戚戚。乍暖还寒时候,最难将息。三杯两盏淡酒,怎敌他、晚来风急?雁过也,正伤心,却是旧时相识。\\n满地黄花堆积,憔悴损,如今有谁堪摘?守着窗儿,独自怎生得黑?梧桐更兼细雨,到黄昏、点点滴滴。这次第,怎一个愁字了得!(守着窗儿 一作:守著窗儿)测试春晚\",\"create_time\":1568997067851,\"title\":\"声声慢·寻寻觅觅\"}"
2019-09-21 15:16:58.148  INFO 228236 --- [           main] c.z.e.ElasticsearchDemoApplicationTests  : index:article2,value:"{\"content\":\"寻寻觅觅,冷冷清清,凄凄惨惨戚戚。乍暖还寒时候,最难将息。三杯两盏淡酒,怎敌他、晚来风急?雁过也,正伤心,却是旧时相识。\\n满地黄花堆积,憔悴损,如今有谁堪摘?守着窗儿,独自怎生得黑?梧桐更兼细雨,到黄昏、点点滴滴。这次第,怎一个愁字了得!(守着窗儿 一作:守著窗儿)测试春晚\",\"create_time\":1569050046334,\"title\":\"声声慢·寻寻觅觅\"}"


==============================index specified, one field===============================================
2019-09-21 15:16:58.149  INFO 228236 --- [           main] c.z.e.service.impl.ServiceImpl           : fieldName.length:1,fieldName:[title]
2019-09-21 15:16:58.154  INFO 228236 --- [           main] c.z.e.ElasticsearchDemoApplicationTests  : index:article2,value:"{\"content\":\"风住尘香花已尽,日晚倦梳头。物是人非事事休,欲语泪先流。\\n闻说双溪春尚好,也拟泛轻舟。只恐双溪舴艋舟,载不动许多愁。测试寻寻觅觅咏梅.测试咏梅\",\"create_time\":1569050046334,\"title\":\"武陵春·春晚\"}"


==============================index specified, multiple fields===============================================
2019-09-21 15:16:58.154  INFO 228236 --- [           main] c.z.e.service.impl.ServiceImpl           : fieldName.length:2,fieldName:[title, content]
2019-09-21 15:16:58.158  INFO 228236 --- [           main] c.z.e.ElasticsearchDemoApplicationTests  : index:article2,value:"{\"content\":\"风住尘香花已尽,日晚倦梳头。物是人非事事休,欲语泪先流。\\n闻说双溪春尚好,也拟泛轻舟。只恐双溪舴艋舟,载不动许多愁。测试寻寻觅觅咏梅.测试咏梅\",\"create_time\":1569050046334,\"title\":\"武陵春·春晚\"}"
2019-09-21 15:16:58.158  INFO 228236 --- [           main] c.z.e.ElasticsearchDemoApplicationTests  : index:article2,value:"{\"content\":\"寻寻觅觅,冷冷清清,凄凄惨惨戚戚。乍暖还寒时候,最难将息。三杯两盏淡酒,怎敌他、晚来风急?雁过也,正伤心,却是旧时相识。\\n满地黄花堆积,憔悴损,如今有谁堪摘?守着窗儿,独自怎生得黑?梧桐更兼细雨,到黄昏、点点滴滴。这次第,怎一个愁字了得!(守着窗儿 一作:守著窗儿)测试春晚\",\"create_time\":1569050046334,\"title\":\"声声慢·寻寻觅觅\"}"
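
The create_time values in the output are plain numbers because the Date member was stored as epoch milliseconds. The sketch below shows how such a value can be read back with java.time. CreateTime is a hypothetical helper, and the zone "Asia/Shanghai" in the usage note is an assumption about where the log was produced, not something the post states.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZonedDateTime;

// Hypothetical helper: interprets a create_time value from the log output.
final class CreateTime {

    // Epoch milliseconds to a UTC instant.
    static Instant toInstant(long epochMillis) {
        return Instant.ofEpochMilli(epochMillis);
    }

    // The same moment expressed in a given time zone.
    static ZonedDateTime inZone(long epochMillis, String zoneId) {
        return toInstant(epochMillis).atZone(ZoneId.of(zoneId));
    }
}
```

For example, CreateTime.toInstant(1569050046334L) is 2019-09-21T07:14:06.334Z; under the assumed zone Asia/Shanghai that is 15:14 local time, a couple of minutes before the queries in the log ran.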

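The log shows that a query with no index specified searches across all indices in the cluster, and that a query can be told to match against several fields at once. It is that simple.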
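
For readers who want to try the same search from Kibana Dev Tools or curl, the sketch below builds a minimal request body that roughly corresponds to the multiMatchQuery call with paging. MultiMatchBody is a hypothetical helper written only for illustration. The Java client serializes additional default options, so this shows just the essential part and is not the exact body the client sends.

```java
import java.util.StringJoiner;

// Hypothetical helper: builds a minimal multi_match search body with paging.
final class MultiMatchBody {

    // Quotes a value as a JSON string. Only backslashes and double quotes are
    // escaped, which is enough for plain search terms and field names.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    static String build(String value, int from, int size, String... fields) {
        StringJoiner fieldList = new StringJoiner(",", "[", "]");
        for (String field : fields) {
            fieldList.add(quote(field));
        }
        return "{\"from\":" + from
                + ",\"size\":" + size
                + ",\"query\":{\"multi_match\":{\"query\":" + quote(value)
                + ",\"fields\":" + fieldList + "}}}";
    }

    // Search path: "/_search" covers all indices, "/<index>/_search" only one.
    static String path(String index) {
        return index == null ? "/_search" : "/" + index + "/_search";
    }
}
```

Sending the result of build("春晚", 0, 10, "title", "content") to path(null) corresponds to the "no index specified, multiple fields" case; path("article2") restricts the search to one index.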


Author: 朱宏亮
elasticsearch, Spring, September 21, 2019

springboot2.1.8+elasticsearch7.3.2 (Part 2): pom.xml, using the ik Chinese analysis plugin, and index-creation code

My demo project can be downloaded from: https://gitee.com/zhuhongliang/soringboot_elastic_search_732.git

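Download elasticsearch-analysis-ik-7.3.2.zip from: https://github.com/medcl/elasticsearch-analysis-ik/releases
In the \elasticsearch-7.3.2\plugins directory, create a folder named ik and extract the entire zip into it.
List the installed plugins with this command: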

PS C:\data\elasticsearch-7.3.2\bin> .\elasticsearch-plugin.bat list
ik

Then restart elasticsearch.
Among the many lines of startup output you should find:

[2019-09-20T23:52:09,550][INFO ][o.e.p.PluginsService     ] [node-1] loaded plugin [analysis-ik]

This means ik was installed successfully.
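
Beyond the startup log, one way to confirm that the analyzer actually works is to call the _analyze API with analyzer ik_max_word. The following is a minimal sketch using only the JDK. IkAnalyzeCheck is a hypothetical class, and the base URL is assumed to be the address configured in this series (http://192.168.10.1:9200).

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

// Hypothetical helper: asks Elasticsearch to tokenize a text with a given analyzer.
final class IkAnalyzeCheck {

    // Escapes backslashes and double quotes; sufficient for plain text
    // without control characters.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Request body for POST /_analyze.
    static String body(String analyzer, String text) {
        return "{\"analyzer\":\"" + escape(analyzer) + "\",\"text\":\"" + escape(text) + "\"}";
    }

    // Sends the body to <baseUrl>/_analyze and returns the raw JSON response.
    static String analyze(String baseUrl, String analyzer, String text) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) new URL(baseUrl + "/_analyze").openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/json");
        conn.setDoOutput(true);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(body(analyzer, text).getBytes(StandardCharsets.UTF_8));
        }
        try (InputStream in = conn.getInputStream();
             Scanner scanner = new Scanner(in, "UTF-8").useDelimiter("\\A")) {
            return scanner.hasNext() ? scanner.next() : "";
        }
    }
}
```

A call such as IkAnalyzeCheck.analyze("http://192.168.10.1:9200", "ik_max_word", "武陵春") should return a JSON list of tokens if the plugin is loaded; if the analyzer is unknown, the server answers with an error status and getInputStream throws an IOException.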

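Project files:

When pom.xml depends on elasticsearch-rest-high-level-client, the lower-version internal dependencies it would otherwise pull in need to be excluded and then declared explicitly at 7.3.2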

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>2.1.8.RELEASE</version>
        <relativePath/> <!-- lookup parent from repository -->
    </parent>
    <groupId>com.zhl</groupId>
    <artifactId>elasticsearch_demo</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <name>elasticsearch_demo</name>
    <description>Demo project for Spring Boot</description>

    <properties>
        <java.version>1.8</java.version>
        <spring.data.elasticsearch.version>3.1.10.RELEASE</spring.data.elasticsearch.version>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.elasticsearch.client</groupId>
            <artifactId>elasticsearch-rest-high-level-client</artifactId>
            <version>7.3.2</version>
            <exclusions>
                <!-- the internal dependency versions resolved here are too old, so exclude them -->
                <exclusion>
                    <groupId>org.elasticsearch</groupId>
                    <artifactId>elasticsearch</artifactId>
                </exclusion>
                <exclusion>
                    <groupId>org.elasticsearch.client</groupId>
                    <artifactId>elasticsearch-rest-client</artifactId>
                </exclusion>
            </exclusions>
        </dependency>
        <dependency>
            <groupId>org.elasticsearch.client</groupId>
            <artifactId>elasticsearch-rest-client</artifactId>
            <version>7.3.2</version>
        </dependency>
        <dependency>
            <groupId>org.elasticsearch</groupId>
            <artifactId>elasticsearch</artifactId>
            <version>7.3.2</version>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>

        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.projectlombok</groupId>
            <artifactId>lombok</artifactId>
            <optional>true</optional>
        </dependency>
        <dependency>
            <groupId>com.alibaba</groupId>
            <artifactId>fastjson</artifactId>
            <version>1.2.60</version>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
            </plugin>
        </plugins>
    </build>

</project>

Now for the Spring code. Create the entity class Article

public class Article {
    private Integer id;
    private String title;
    private String content;
    private Date create_time;
    // getters & setters omitted
}

Create a RestHighLevelClient Bean; all later operations use this Bean

@SpringBootApplication
public class ElasticsearchDemoApplication {

    public static void main(String[] args) {
        SpringApplication.run(ElasticsearchDemoApplication.class, args);
    }

    @Bean
    public RestHighLevelClient client() {
        RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(new HttpHost("192.168.10.1", 9200, "http")));
        return client;
    }

}

The code below creates an index and sets the ik Chinese analyzer on the chosen fields

    @Autowired
    RestHighLevelClient client;

    /**
     * Creates an index.
     * Adds the ik Chinese analyzer to the String fields of the entity class; the other fields
     * are mapped automatically by type once data is actually added.
     * @param clasz the entity class; its lower-cased simple name becomes the index name
     * @return true if the index creation was acknowledged
     */
    public Boolean createIndex2(Class<?> clasz){
        try {
            CreateIndexRequest index = new CreateIndexRequest(clasz.getSimpleName().toLowerCase());

            XContentBuilder builder = JsonXContent.contentBuilder();
            builder.startObject()
                    .startObject("mappings")
                    .startObject("properties");
            Field[] fields = clasz.getDeclaredFields();
            for(Field field : fields){
                // only String fields get the ik Chinese analyzer
                if(field.getType() == String.class){
                    builder.startObject(field.getName().toLowerCase())
                                .field("type","text")
                                .field("index",true)
                                .field("analyzer","ik_max_word")
                            .endObject();
                }
            }
            // close "properties" and "mappings"
            builder.endObject().endObject();
            builder.startObject("settings")
                    // number of shards
                    .field("number_of_shards",1)
                    // number of replicas; set to 0 on a single machine
                    .field("number_of_replicas",0)
                    .endObject()
                    .endObject();

            index.source(builder);
            CreateIndexResponse response = client.indices().create(index,RequestOptions.DEFAULT);
            log.info(JSONObject.toJSONString(response));
            return response.isAcknowledged();
        } catch (IOException e) {
            log.error("failed to create index", e);
        }
        return false;
    }
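
To see what createIndex2 sends without a running cluster, the sketch below reproduces only the reflection step with the JDK: it picks the String fields of a class and prints the resulting "mappings" JSON. IkMappingSketch and its nested Article stand-in are hypothetical, the "index" flag and the "settings" object are left out for brevity, and field names are sorted purely to make the output deterministic.

```java
import java.lang.reflect.Field;
import java.util.Date;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: shows which fields receive the ik analyzer and the
// mapping JSON that results. It does not talk to Elasticsearch.
final class IkMappingSketch {

    // Stand-in with the same members as the article's entity class.
    static final class Article {
        Integer id;
        String title;
        String content;
        Date create_time;
    }

    // Names of the String fields, sorted because getDeclaredFields()
    // does not guarantee any particular order.
    static Set<String> stringFields(Class<?> clazz) {
        Set<String> names = new TreeSet<>();
        for (Field field : clazz.getDeclaredFields()) {
            if (!field.isSynthetic() && field.getType() == String.class) {
                names.add(field.getName());
            }
        }
        return names;
    }

    // Builds {"mappings":{"properties":{...}}} with one text field per String member.
    static String mappingJson(Class<?> clazz, String analyzer) {
        StringBuilder json = new StringBuilder("{\"mappings\":{\"properties\":{");
        String separator = "";
        for (String name : stringFields(clazz)) {
            json.append(separator)
                .append('"').append(name)
                .append("\":{\"type\":\"text\",\"analyzer\":\"")
                .append(analyzer).append("\"}");
            separator = ",";
        }
        return json.append("}}}").toString();
    }
}
```

For the Article stand-in only content and title are selected; id and create_time are skipped, which matches the statement that non-String fields are mapped later when documents are added.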

    /**
     * Checks whether an index exists.
     * @param index the index name (lower-cased before use)
     * @return true if the index exists; false if it does not or the check fails
     */
    public Boolean exist(String index){
        GetIndexRequest request = new GetIndexRequest(index.toLowerCase());
        try {
            return client.indices().exists(request, RequestOptions.DEFAULT);
        } catch (IOException e) {
            log.error("failed to check whether the index exists", e);
        }
        // fall back to false so callers can unbox the result safely
        return false;
    }

    /**
     * Deletes an index.
     * @param index the index name (lower-cased before use)
     */
    public void delete(String index){
        DeleteIndexRequest request = new DeleteIndexRequest(index.toLowerCase());
        try {
            AcknowledgedResponse deleteIndexResponse = client.indices().delete(request, RequestOptions.DEFAULT);
            boolean acknowledged = deleteIndexResponse.isAcknowledged();
            log.info("index deleted: {}", acknowledged);
        } catch (IOException e) {
            log.error("failed to delete index", e);
        }
    }


    /**
     * Tests index creation
     */
    @Test
    public void indexCreate(){
        Boolean exist = commodityServiceImpl.exist(Article.class.getSimpleName());
        if(!exist){
            log.info("a new index needs to be created");
            Boolean result = commodityServiceImpl.createIndex2(Article.class);
            log.info("index creation result: {}",result);
            return;
        }
        log.info("the index already exists, nothing to create");
    }

    /**
     * Deletes the index
     */
    @Test
    public void indexDelete(){
        commodityServiceImpl.delete(Article.class.getSimpleName());
    }

Run the test code, then open kibana at http://localhost:5601/ to view the created index:

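article is the lower-cased name of the entity class created earlier. In the mapping, the content and title fields are the two String members of Article.class and are set to use the ik analyzer; fields that were not configured are added automatically once data is inserted.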

Besides calling the API from code, you can also issue the create-mapping request from Dev Tools in kibana:

PUT article
{
  "mappings" : {
      "properties" : {
        "content" : {
          "type" : "text",
          "analyzer":"ik_max_word"
        },
        "title" : {
          "type" : "text",
          "analyzer":"ik_max_word"
        },
        "year" : {
          "type" : "long"
        }
      }
  }
}

Use PUT followed by the index name, and adjust the JSON to match your actual entity class.
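elasticsearch, Spring, September 20, 2019

springboot2.1.8+elasticsearch7.3.2 (Part 1): downloading and installing elasticsearch + kibana

My demo project can be downloaded from: https://gitee.com/zhuhongliang/soringboot_elastic_search_732.git

I am on Windows 10 and downloaded elasticsearch from the official site: https://www.elastic.co/cn/start (kibana was downloaded at the same time).
After extracting, open the config file elasticsearch-7.3.2\config\elasticsearch.yml. Set the network.host property to the IP of the machine that runs elasticsearch, then uncomment the cluster.name property and give it any name you like; I used: application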

# ---------------------------------- Cluster -----------------------------------
#
# Use a descriptive name for your cluster:
#
cluster.name: application
#
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
#
node.name: node-1
#
# ---------------------------------- Network -----------------------------------
#
# Set the bind address to a specific IP (IPv4 or IPv6):
# IP of the machine that runs elasticsearch
network.host: 192.168.10.1
#
# Set a custom port for HTTP:
# default is 9200
#http.port: 9200
#
# For more information, consult the network module documentation.
#

Then start it with elasticsearch.bat.
Output like the following means it is working:

PS C:\data\elasticsearch-7.3.2\bin> .\elasticsearch.bat
OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.
[2019-09-18T21:50:48,471][INFO ][o.e.e.NodeEnvironment    ] [node-1] using [1] data paths, mounts [[Windows (C:)]], net usable_space [70gb], net total_space [237.2gb], types [NTFS]
[2019-09-18T21:50:48,475][INFO ][o.e.e.NodeEnvironment    ] [node-1] heap size [989.8mb], compressed ordinary object pointers [true]
[2019-09-18T21:50:48,763][INFO ][o.e.n.Node               ] [node-1] node name [node-1], node ID [3yVr9onNQku6gNesOurCwg], cluster name [application]
[2019-09-18T21:50:48,764][INFO ][o.e.n.Node               ] [node-1] version[7.3.2], pid[38752], build[default/zip/1c1faf1/2019-09-06T14:40:30.409026Z], OS[Windows 10/10.0/amd64], JVM[Oracle Corporation/OpenJDK 64-Bit Server VM/12.0.2/12.0.2+10]
[2019-09-18T21:50:48,764][INFO ][o.e.n.Node               ] [node-1] JVM home [C:\data\elasticsearch-7.3.2\jdk]
[2019-09-18T21:50:48,765][INFO ][o.e.n.Node               ] [node-1] JVM arguments [-Xms1g, -Xmx1g, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -Des.networkaddress.cache.ttl=60, -Des.networkaddress.cache.negative.ttl=10, -XX:+AlwaysPreTouch, -Xss1m, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djna.nosys=true, -XX:-OmitStackTraceInFastThrow, -Dio.netty.noUnsafe=true, -Dio.netty.noKeySetOptimization=true, -Dio.netty.recycler.maxCapacityPerThread=0, -Dlog4j.shutdownHookEnabled=false, -Dlog4j2.disable.jmx=true, -Djava.io.tmpdir=C:\Users\zhuho\AppData\Local\Temp\elasticsearch, -XX:+HeapDumpOnOutOfMemoryError, -XX:HeapDumpPath=data, -XX:ErrorFile=logs/hs_err_pid%p.log, -Xlog:gc*,gc+age=trace,safepoint:file=logs/gc.log:utctime,pid,tags:filecount=32,filesize=64m, -Djava.locale.providers=COMPAT, -Dio.netty.allocator.type=unpooled, -XX:MaxDirectMemorySize=536870912, -Delasticsearch, -Des.path.home=C:\data\elasticsearch-7.3.2, -Des.path.conf=C:\data\elasticsearch-7.3.2\config, -Des.distribution.flavor=default, -Des.distribution.type=zip, -Des.bundled_jdk=true]
……

My machine's IP is 192.168.10.1, so I open http://192.168.10.1:9200/ in a browser.
The browser shows:

{
  "name" : "node-1",
  "cluster_name" : "application",
  "cluster_uuid" : "Y8Hd00TLQJ-ZlLSAb3RtoQ",
  "version" : {
    "number" : "7.3.2",
    "build_flavor" : "default",
    "build_type" : "zip",
    "build_hash" : "1c1faf1",
    "build_date" : "2019-09-06T14:40:30.409026Z",
    "build_snapshot" : false,
    "lucene_version" : "8.1.0",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}

elasticsearch is now up and running.
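
Because the client dependencies used later in this series are pinned to 7.3.2, it can be handy to check the version the node reports in this response. The sketch below pulls the version number out of the JSON with a regular expression. NodeInfo is a hypothetical helper, and a regex is only a shortcut that assumes the response layout shown above; a real JSON parser would be the more robust choice.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: extracts version.number from the root endpoint response.
final class NodeInfo {

    // Matches "number" : "<value>" with optional whitespace around the colon.
    private static final Pattern VERSION_NUMBER =
            Pattern.compile("\"number\"\\s*:\\s*\"([^\"]+)\"");

    // Returns the reported version, or an empty Optional if the field is missing.
    static Optional<String> versionNumber(String rootResponseJson) {
        Matcher matcher = VERSION_NUMBER.matcher(rootResponseJson);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }
}
```

Applied to the response above, versionNumber yields 7.3.2, which can then be compared with the version declared in pom.xml.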

Next, start kibana.
First edit kibana's config file, kibana-7.3.2-windows-x86_64\config\kibana.yml,
and set elasticsearch.hosts to the address elasticsearch was configured with:

# The URLs of the Elasticsearch instances to use for all your queries.
elasticsearch.hosts: ["http://192.168.10.1:9200"]

Start it with: kibana-7.3.2-windows-x86_64\bin\kibana.bat

Once startup has finished, open http://localhost:5601 to reach the kibana console.

Contact email: zhuhongliang.king@qq.com.