A Simple Elasticsearch Study, Part 8: A Deeper Look (2)

2. Boolean query

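 ● Description: a bool query is a combination of one or more query clauses
            ○ There are 4 clause types in total. 2 of them affect scoring (must, should) and 2 do not (filter, must_not)
● Relevance is not exclusive to full-text search. It also applies to yes | no clauses: the more clauses match, the higher the relevance score.

     If several query clauses are merged into one compound query, such as a bool query,

     the score computed by each clause is merged into the overall relevance score.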

3. Boolean query syntax

#Basic syntax
POST /products/_search
{
  "query": {
    "bool" : {
      "must" : {
        "term" : { "price" : "30" }
      },
      "filter": {
        "term" : { "avaliable" : "true" }
      },
      "must_not" : {
        "range" : {
          "price" : { "lte" : 10 }
        }
      },
      "should" : [
        { "term" : { "productID.keyword" : "JODL-X-1937-#pV7" } },
        { "term" : { "productID.keyword" : "XHDK-A-1293-#fJ3" } }
      ],
      "minimum_should_match" :1
    }
  }
}

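● The clauses can appear in any order
● Multiple queries can be nested
● If your bool query has no must (or filter) clause, at least one of the should clauses must match.

[In the example above, should is an array, and "minimum_should_match": 1 requires at least one of its clauses to match.]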
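The matching rule for should clauses can be sketched in a few lines of Python. This is a toy model, not Elasticsearch code: the function name bool_matches and its boolean inputs are made up for illustration. It assumes the documented default, where minimum_should_match is 1 when the bool query has should clauses but no must or filter clause, and 0 otherwise.

```python
def bool_matches(must, filter_, must_not, should, minimum_should_match=None):
    """Each argument is a list of booleans: did that clause match the document?"""
    if minimum_should_match is None:
        # Assumed default: should becomes mandatory only when nothing else is required.
        minimum_should_match = 1 if should and not (must or filter_) else 0
    return (
        all(must)
        and all(filter_)
        and not any(must_not)
        and sum(should) >= minimum_should_match
    )

# Only should clauses, none of them hit: the document is not returned.
only_should_none_hit = bool_matches([], [], [], [False, False])
# A must clause hits: should is now optional and only affects the score.
must_hit_should_miss = bool_matches([True], [], [], [False, False])
# Explicit minimum_should_match: 1 makes should mandatory again.
forced_should = bool_matches([True], [], [], [False, False], minimum_should_match=1)

print(only_should_none_hit, must_hit_should_miss, forced_should)
```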

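4. Solving the structured-query problem of "contains rather than equals"

#Change the data model by adding a field. This works around the fact that a term query on an array field matches by containment, not by exact equality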
POST /newmovies/_bulk
{ "index": { "_id": 1 }}
{ "title" : "Father of the Bridge Part II","year":1995, "genre":"Comedy","genre_count":1 }
{ "index": { "_id": 2 }}
{ "title" : "Dave","year":1993,"genre":["Comedy","Romance"],"genre_count":2 }

#must: the clauses contribute to the score
POST /newmovies/_search
{
  "query": {
    "bool": {
      "must": [
        {"term": {"genre.keyword": {"value": "Comedy"}}},
        {"term": {"genre_count": {"value": 1}}}

      ]
    }
  }
}

Adding the genre_count field and combining both conditions in a bool query solves the problem.
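Why the extra field is needed can be seen in a small Python model of the two documents indexed above. This is only a sketch: term_matches is a hypothetical helper that imitates how a term query treats a multi-valued field (it matches if any value equals the term), and it is not part of any Elasticsearch client.

```python
# Toy copies of the two documents from the _bulk request above.
movies = [
    {"_id": 1, "genre": ["Comedy"], "genre_count": 1},
    {"_id": 2, "genre": ["Comedy", "Romance"], "genre_count": 2},
]

def term_matches(doc, field, value):
    """Imitates a term query: true if ANY value of the field equals the term."""
    values = doc[field] if isinstance(doc[field], list) else [doc[field]]
    return value in values

# A term query on genre alone means "contains Comedy".
contains_comedy = [d["_id"] for d in movies if term_matches(d, "genre", "Comedy")]

# Adding the genre_count condition turns it into "is exactly Comedy".
only_comedy = [
    d["_id"]
    for d in movies
    if term_matches(d, "genre", "Comedy") and term_matches(d, "genre_count", 1)
]

print(contains_comedy)  # [1, 2]
print(only_comedy)      # [1]
```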

5. Filter Context: does not affect scoring

#Filter: the clauses are not scored, so the _score of every hit is 0
POST /newmovies/_search
{
  "query": {
    "bool": {
      "filter": [
        {"term": {"genre.keyword": {"value": "Comedy"}}},
        {"term": {"genre_count": {"value": 1}}}
        ]

    }
  }
}


#Filtering Context
POST _search
{
  "query": {
    "bool" : {

      "filter": {
        "term" : { "avaliable" : "true" }
      },
      "must_not" : {
        "range" : {
          "price" : { "lte" : 10 }
        }
      }
    }
  }
}
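The difference between the two contexts can be sketched as a toy scoring function in Python. The name bool_score and the per-clause scores 0.5 and 1.0 are made-up values for illustration, and the model ignores minimum_should_match. It only shows that must and should clauses add to the score while filter and must_not clauses decide matching without contributing anything.

```python
def bool_score(must_scores=(), should_scores=(), filter_hits=(), must_not_hits=()):
    """Return the toy score of a matching document, or None if it does not match.

    must_scores / should_scores: per-clause score, None when the clause missed.
    filter_hits / must_not_hits: booleans only, because these clauses are never scored.
    """
    if any(s is None for s in must_scores):
        return None
    if not all(filter_hits) or any(must_not_hits):
        return None
    return sum(must_scores) + sum(s for s in should_scores if s is not None)

# The same two conditions, once in query context and once in filter context.
as_must = bool_score(must_scores=[0.5, 1.0])
as_filter = bool_score(filter_hits=[True, True])

print(as_must)    # 1.5
print(as_filter)  # 0
```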

6. Query Context: affects scoring

#Query Context
DELETE products
POST /products/_bulk
{ "index": { "_id": 1 }}
{ "price" : 10,"avaliable":true,"date":"2018-01-01", "productID" : "XHDK-A-1293-#fJ3" }
{ "index": { "_id": 2 }}
{ "price" : 20,"avaliable":true,"date":"2019-01-01", "productID" : "KDKE-B-9947-#kL5" }
{ "index": { "_id": 3 }}
{ "price" : 30,"avaliable":true, "productID" : "JODL-X-1937-#pV7" }
{ "index": { "_id": 4 }}
{ "price" : 30,"avaliable":false, "productID" : "QQPX-R-3956-#aD8" }


POST /products/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "term": {
            "productID.keyword": {
              "value": "JODL-X-1937-#pV7"}}
        },
        {"term": {"avaliable": {"value": true}}
        }
      ]
    }
  }
}

#Nesting: implements "should not" logic
POST /products/_search
{
  "query": {
    "bool": {
      "must": {
        "term": {
          "price": "30"
        }
      },
      "should": [
        {
          "bool": {
            "must_not": {
              "term": {
                "avaliable": "false"
              }
            }
          }
        }
      ],
      "minimum_should_match": 1
    }
  }
}
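Which of the four products survive the nested query can be checked by hand with a short Python model. This is a sketch of the boolean logic only, not of scoring: the matches function is hypothetical, and the documents are trimmed copies of the ones indexed above (the field name avaliable is kept as it is spelled in the index).

```python
products = [
    {"_id": 1, "price": 10, "avaliable": True},
    {"_id": 2, "price": 20, "avaliable": True},
    {"_id": 3, "price": 30, "avaliable": True},
    {"_id": 4, "price": 30, "avaliable": False},
]

def matches(doc):
    # Outer must clause: price has to be 30.
    must = doc["price"] == 30
    # Inner bool with only must_not: hits every document whose avaliable is not false.
    inner_bool = doc["avaliable"] is not False
    should_hits = [inner_bool]
    # minimum_should_match: 1 makes the single should clause mandatory.
    return must and sum(should_hits) >= 1

hits = [d["_id"] for d in products if matches(d)]
print(hits)  # [3]
```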

7. The structure of a query affects relevance scoring

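When brown and red are nested together in their own bool clause, their combined weight only equals the weight of quick or dog individually.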

8. Controlling field boosting

#Change the boost values and you will see the order of the results change
POST blogs/_search
{
  "query": {
    "bool": {
      "should": [
        {"match": {
          "title": {
            "query": "apple,ipad",
            "boost": 1.3
          }
        }},

        {"match": {
          "content": {
            "query": "apple,ipad",
            "boost":1.2
          }
        }}
      ]
    }
  }
}
DELETE news
POST /news/_bulk
{ "index": { "_id": 1 }}
{ "content":"Apple Mac" }
{ "index": { "_id": 2 }}
{ "content":"Apple iPad" }
{ "index": { "_id": 3 }}
{ "content":"Apple employee like Apple Pie and Apple Juice" }

#Plain query: the document about apple pie and apple juice (food) ranks first, ahead of the Apple products
POST news/_search
{
  "query": {
    "bool": {
      "must": {
        "match":{"content":"apple"}
      }
    }
  }
}

#must_not puts the Apple products first, but the document about apple pie is no longer returned at all, which is not what we want
POST news/_search
{
  "query": {
    "bool": {
      "must": {
        "match":{"content":"apple"}
      },
      "must_not": {
        "match":{"content":"pie"}
      }
    }
  }
}

9. Boosting Query

#Use a boosting query to show the Apple products first and the food document after them
POST news/_search
{
  "query": {
    "boosting": {
      "positive": {
        "match": {
          "content": "apple"
        }
      },
      "negative": {
        "match": {
          "content": "pie"
        }
      },
      "negative_boost": 0.5
    }
  }
}
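The effect of negative_boost can be sketched with simple arithmetic in Python. The positive scores below are made-up values chosen so that the food document starts on top, as in the plain match query above, and boosting_score is a hypothetical helper. It models the documented behaviour that a document matching the negative query keeps its place in the results but has its score multiplied by negative_boost.

```python
def boosting_score(positive_score, negative_matches, negative_boost):
    """Toy score of a document that already matched the positive query."""
    if negative_matches:
        return positive_score * negative_boost
    return positive_score

# name -> (hypothetical positive score, does content match "pie"?)
docs = {
    "Apple Mac": (0.30, False),
    "Apple iPad": (0.30, False),
    "Apple employee like Apple Pie and Apple Juice": (0.40, True),
}

ranked = sorted(
    docs,
    key=lambda name: boosting_score(*docs[name], negative_boost=0.5),
    reverse=True,
)
for name in ranked:
    print(name)
```

All three documents are still returned. Only the order changes, which is the difference from the must_not version.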

https://www.elastic.co/guide/en/elasticsearch/reference/current/query-filter-context.html
https://www.elastic.co/guide/en/elasticsearch/reference/7.1/query-dsl-boosting-query.html

II. Single-string, multi-field queries: Dis Max Query

https://www.elastic.co/guide/en/elasticsearch/reference/7.1/query-dsl-dis-max-query.html

1. A single-string query example

#Insert document 1
PUT /blogs/_doc/1
{
    "title": "Quick brown rabbits",
    "body":  "Brown rabbits are commonly seen."
}

#Insert document 2
PUT /blogs/_doc/2
{
    "title": "Keeping pets healthy",
    "body":  "My quick brown fox eats rabbits on a regular basis."
}

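#Search for "Brown fox" with should
#should simply adds up the scores of the sub-queries, so document 1 (brown in both fields) can outrank document 2 (brown fox together in body)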
POST /blogs/_search
{
    "query": {
        "bool": {
            "should": [
                { "match": { "title": "Brown fox" }},
                { "match": { "body":  "Brown fox" }}
            ]
        }
    }
}

2. Disjunction Max Query

#Disjunction Max Query: the score of the best-matching sub-query becomes the score of the document
POST blogs/_search
{
    "query": {
        "dis_max": {
            "queries": [
                { "match": { "title": "Brown fox" }},
                { "match": { "body":  "Brown fox" }}
            ]
        }
    }
}
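The difference between summing the sub-query scores and taking the best one can be shown with made-up numbers in Python. The per-field scores below are hypothetical and were chosen only to reproduce the situation described above; bool_should and dis_max are toy functions, not Elasticsearch calls.

```python
# Hypothetical per-field scores for the query "Brown fox" (None = field did not match).
field_scores = {
    1: {"title": 0.5, "body": 0.25},    # "brown" appears in both fields
    2: {"title": None, "body": 0.625},  # "brown" and "fox" both appear, but only in body
}

def bool_should(scores):
    """bool/should: add up the scores of every matching sub-query."""
    return sum(s for s in scores.values() if s is not None)

def dis_max(scores):
    """dis_max: keep only the score of the best-matching sub-query."""
    return max(s for s in scores.values() if s is not None)

rank_by_sum = sorted(field_scores, key=lambda i: bool_should(field_scores[i]), reverse=True)
rank_by_max = sorted(field_scores, key=lambda i: dis_max(field_scores[i]), reverse=True)

print(rank_by_sum)  # [1, 2]
print(rank_by_max)  # [2, 1]
```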

3. Tuning best-field matching

#What if the scores are the same?
POST blogs/_search
{
    "query": {
        "dis_max": {
            "queries": [
                { "match": { "title": "Quick pets" }},
                { "match": { "body":  "Quick pets" }}
            ]
        }
    }
}

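#If the scores are the same, use tie_breaker: the scores of the other matching sub-queries are multiplied by tie_breaker and added to the best score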
POST blogs/_search
{
    "query": {
        "dis_max": {
            "queries": [
                { "match": { "title": "Quick pets" }},
                { "match": { "body":  "Quick pets" }}
            ],
            "tie_breaker": 0.2
        }
    }
}
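How tie_breaker separates two documents with the same best score can be sketched in Python. The scores are made-up values for the "Quick pets" query, and the dis_max function below is a toy model of the documented formula: the best sub-query score plus tie_breaker times the sum of the other matching sub-query scores.

```python
def dis_max(scores, tie_breaker=0.0):
    """scores: per-sub-query scores, None when the sub-query did not match."""
    matched = [s for s in scores if s is not None]
    best = max(matched)
    others = sum(matched) - best
    return best + tie_breaker * others

# Hypothetical [title, body] scores.
doc1 = [0.5, None]  # title matches "quick", body matches nothing
doc2 = [0.5, 0.25]  # title matches "pets", body matches "quick"

# Without tie_breaker both documents get the same score.
print(dis_max(doc1), dis_max(doc2))

# With tie_breaker the document that matches in more fields ranks higher.
print(dis_max(doc1, tie_breaker=0.2), dis_max(doc2, tie_breaker=0.2))
```

A tie_breaker of 0 gives the plain dis_max behaviour, and a value of 1 makes the score the same as summing all matching sub-queries.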