ElasticSearch in Practice: series index
Content | Link |
---|---|
[Part 1] ElasticSearch in Practice: requirements analysis and data generation | https://zhenghuisheng.blog.csdn.net/article/details/149178534 |
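If you repost this article, please include the link: https://blog.csdn.net/zhenghuishengq/article/details/149178534
I. [ElasticSearch in Practice] Requirements analysis and data generation
To get more fluent with Elasticsearch, covering its syntax, internals, and use in real business development, this series studies ES in depth through hands-on practice.
1. Requirements analysis
1.1 Business analysis
Suppose I need to build a simple matchmaking platform. It involves filtering users by basic information such as sex, age, height, weight, education level, hometown, and city of work. The series uses this requirement to study ES in depth and to get familiar with its query syntax and how it works.
Based on these requirements, ES needs to store the fields below. The index is named user; an index is roughly analogous to a table in MySQL.
- When defining the mapping, fields holding simple numeric values can use the corresponding numeric type, such as Long or Integer;
- If a field needs exact-match queries, map it as keyword: the value is not analyzed, so it can be looked up exactly with a term query;
- If a field is mapped as text, its value is split into tokens by the configured analyzer, so it should later be queried with match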
@Data
@Document(indexName = "user")
public class UserEO {
@Id
@Field(type = FieldType.Long)
private Long id;
@Field(type = FieldType.Keyword)
private String nickName;
/**
* Sex: 1 = male, 0 = female
*/
@Field(type = FieldType.Integer)
private Integer sex;
/**
* Birth year
*/
@Field(type = FieldType.Integer)
private Integer birthYear;
/**
* Birth month
*/
@Field(type = FieldType.Integer)
private Integer birthMonth;
/**
* Birth day
*/
@Field(type = FieldType.Integer)
private Integer birthDay;
/**
* Height
*/
@Field(type = FieldType.Integer)
private Integer height;
/**
* Weight
*/
@Field(type = FieldType.Integer)
private Integer weight;
/**
* Education level: 3 = below junior college, 4 = junior college, 5 = bachelor's degree, 6 = master's degree, 7 = doctorate
*/
@Field(type = FieldType.Integer)
private Integer eduLevel;
/**
* Residence: province
*/
@Field(type = FieldType.Keyword)
private String liveProvince;
/**
* Residence: city
*/
@Field(type = FieldType.Keyword)
private String liveCity;
/**
* Hometown: province
*/
@Field(type = FieldType.Keyword)
private String regProvince;
/**
* Hometown: city
*/
@Field(type = FieldType.Keyword)
private String regCity;
/**
* Deletion flag: 0 = not deleted, 1 = deleted
*/
@Field(type = FieldType.Integer)
private Integer delFlag;
}
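The difference between keyword and text fields described above can be illustrated with a toy model. This is only a sketch: the class and method names are hypothetical, and the whitespace-and-lowercase tokenizer is a stand-in for a real analyzer, not Elasticsearch's implementation.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Toy model of keyword vs. text indexing; not Elasticsearch's real analyzer. */
public class FieldIndexingSketch {

    /** keyword: the whole value is stored as one term, unchanged. */
    static Set<String> indexAsKeyword(String value) {
        return Collections.singleton(value);
    }

    /** text: the value is split into lowercased tokens (a stand-in for an analyzer). */
    static Set<String> indexAsText(String value) {
        return new HashSet<>(Arrays.asList(value.toLowerCase(Locale.ROOT).split("\\s+")));
    }

    /** term query: the input is compared with the indexed terms as-is. */
    static boolean termQuery(Set<String> indexedTerms, String input) {
        return indexedTerms.contains(input);
    }

    /** match query: the input is analyzed first; any shared token counts as a hit. */
    static boolean matchQuery(Set<String> indexedTerms, String input) {
        for (String token : indexAsText(input)) {
            if (indexedTerms.contains(token)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String city = "New York";
        System.out.println(termQuery(indexAsKeyword(city), "New York")); // true
        System.out.println(termQuery(indexAsText(city), "New York"));    // false: only tokens are indexed
        System.out.println(matchQuery(indexAsText(city), "york"));       // true
    }
}
```

The second call shows why a term query against a text field often finds nothing: the index holds the tokens, not the original string.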
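1.2 Data analysis
In real development, the data in ES is often synchronized from a MySQL database through the canal middleware: canal poses as a replica of the MySQL primary, listens to the primary's binlog, and forwards the changes. To practice ES syntax first, this article instead writes data into ES manually from a Spring Boot project, starting with 100,000 documents.
The user data is generated by hand: nickname and sex are random, the birth year falls in 1990-2010, month and day are random, height is 150-185, weight is 100-160, education level ranges from below junior college to doctorate (3-7), provinces cover the whole country, and cities are the provincial capitals. All of these ranges can be adjusted.
1.3 Features to implement
Goal: fast data queries, with weight-based scoring so that the best-matching users are surfaced first
- Query dynamically for the users someone is looking for, such as opposite-sex users in the same city with a high education level, with additional filters on height and weight;
- Surface high-quality users first: for example, opposite-sex users in the same city, users of a similar age, and users with the same or a higher education level rank higher
- Return the results the user asked for with low latency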
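The idea of weight-based scoring in the list above can be sketched in plain Java. The weights, the age-gap threshold, and all names here are hypothetical and only illustrate the ranking rule; in Elasticsearch itself this kind of ranking would normally be expressed in the query (for example with a function_score query) rather than in application code.

```java
/** Illustrative match score; weights and thresholds are assumptions, not values from the article. */
public class MatchScoreSketch {
    static final double SAME_CITY_WEIGHT = 3.0;  // hypothetical weight
    static final double CLOSE_AGE_WEIGHT = 2.0;  // hypothetical weight
    static final double EDU_WEIGHT = 1.0;        // hypothetical weight
    static final int MAX_AGE_GAP_YEARS = 3;      // hypothetical definition of "similar age"

    /** Adds one weight per satisfied preference: same city, similar age, same or higher education. */
    static double score(String myCity, int myBirthYear, int myEduLevel,
                        String candidateCity, int candidateBirthYear, int candidateEduLevel) {
        double score = 0.0;
        if (myCity.equals(candidateCity)) {
            score += SAME_CITY_WEIGHT;
        }
        if (Math.abs(myBirthYear - candidateBirthYear) <= MAX_AGE_GAP_YEARS) {
            score += CLOSE_AGE_WEIGHT;
        }
        if (candidateEduLevel >= myEduLevel) {
            score += EDU_WEIGHT;
        }
        return score;
    }

    public static void main(String[] args) {
        // Same city, one year apart, higher education level: all three weights apply.
        System.out.println(score("Guangzhou", 1998, 5, "Guangzhou", 1999, 6)); // 6.0
        // Different city, eight years apart, lower education level: no weight applies.
        System.out.println(score("Guangzhou", 1998, 5, "Chengdu", 1990, 4));   // 0.0
    }
}
```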
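2. Data generation code
The data is generated by a thread pool that performs many batch inserts
2.1 Basic configuration
The detailed code follows, starting with the core dependency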
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-data-elasticsearch</artifactId>
<version>2.7.10</version> <!-- choose the version that matches your Spring Boot version -->
</dependency>
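Next is the YAML configuration: put the hostname and port in application.yml so they are managed in one place
es:
  param:
    connect:
      hostname: xx.xx.xx.xx
      port: 9200
The properties class bound to this configuration is as follows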
@Component
@ConfigurationProperties(prefix = "es.param.connect")
@Data
public class EsConnectProperties {
private String hostname;
private Integer port;
}
Then configure the ES connection class, which registers the ES client as a bean in the Spring container
@Configuration
@Slf4j
public class ElasticSearchConfig {
public static final RequestOptions COMMON_OPTIONS;
static {
RequestOptions.Builder builder = RequestOptions.DEFAULT.toBuilder();
COMMON_OPTIONS = builder.build();
}
private final EsConnectProperties esConnectProperties;
public ElasticSearchConfig(EsConnectProperties esConnectProperties) {
this.esConnectProperties = esConnectProperties;
}
@Bean
public RestHighLevelClient esRestClient() {
log.info("ES client configuration loaded: {}:{}", esConnectProperties.getHostname(), esConnectProperties.getPort());
// Initialize the client builder
RestClientBuilder builder = RestClient.builder(new HttpHost(esConnectProperties.getHostname(), esConnectProperties.getPort()));
builder.setRequestConfigCallback(requestConfigBuilder ->
requestConfigBuilder.setConnectTimeout(5000).setSocketTimeout(60000));
builder.setHttpClientConfigCallback(httpClientBuilder ->
httpClientBuilder.setMaxConnTotal(100).setMaxConnPerRoute(20));
return new RestHighLevelClient(builder);
}
}
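2.2 Thread pool and task configuration
Define a custom thread pool sized for IO-intensive work (a maximum pool size of 2N, since bulk inserts spend most of their time waiting on the network), with a bounded linked blocking queue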
@Slf4j
public class ThreadPoolUtil {
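/**
* IO-intensive: a maximum pool size of 2N lets the CPU switch between waiting threads;
* keep the core pool size at or below 2N, leaving some headroom
* CPU-intensive: a maximum pool size of N or N+1; N makes full use of the CPU, and the extra thread keeps the CPU busy when a thread stalls on a page fault;
* keep the core pool size at or below N+1
* When to use a thread pool: 1. each task is short; 2. there are many tasks to process
*/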
private static ThreadPoolExecutor pool = null;
public static synchronized ThreadPoolExecutor getThreadPool() {
if (pool == null) {
// Number of CPU cores available to the JVM
int cpuNum = Runtime.getRuntime().availableProcessors();
log.info("Available CPU cores: {}", cpuNum);
int maximumPoolSize = cpuNum * 2;
pool = new ThreadPoolExecutor(
maximumPoolSize - 2,
maximumPoolSize,
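5L, // keep-alive time for idle non-core threads: 5 s
TimeUnit.SECONDS,
new LinkedBlockingQueue<>(50), // bounded linked queue with capacity 50
Executors.defaultThreadFactory(), // default thread factory
new ThreadPoolExecutor.AbortPolicy()); // default policy: throws RejectedExecutionException when the pool and queue are full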
}
return pool;
}
}
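The pool above accepts only a limited number of pending tasks: with AbortPolicy, once every thread is busy and the queue of 50 is full, further submissions throw RejectedExecutionException. The sketch below is a self-contained illustration rather than the article's code. It counts how many long-running tasks a pool of the same shape accepts; the class and method names are hypothetical, and the tasks block on a latch as a stand-in for slow inserts.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Counts how many blocking tasks a bounded pool accepts before AbortPolicy rejects the rest. */
public class PoolSaturationSketch {

    static int countAccepted(int maxThreads, int queueCapacity, int tasks) {
        // Same shape as the pool above: core = max - 2, bounded queue, AbortPolicy.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                maxThreads - 2, maxThreads, 5L, TimeUnit.SECONDS,
                new LinkedBlockingQueue<>(queueCapacity),
                new ThreadPoolExecutor.AbortPolicy());
        CountDownLatch release = new CountDownLatch(1);
        int accepted = 0;
        for (int i = 0; i < tasks; i++) {
            try {
                pool.execute(() -> {
                    try {
                        release.await(); // simulate a task that has not finished yet
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
                accepted++;
            } catch (RejectedExecutionException e) {
                // All threads are busy and the queue is full: the task is dropped.
            }
        }
        release.countDown(); // let the accepted tasks finish
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return accepted;
    }

    public static void main(String[] args) {
        // 4 threads + 50 queued tasks = 54 accepted out of 1000.
        System.out.println(countAccepted(4, 50, 1000)); // 54
    }
}
```

If tasks are submitted faster than they complete, one option is ThreadPoolExecutor.CallerRunsPolicy, which makes the submitting thread run the rejected task itself and so slows submission down instead of dropping work.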
Define the task. It simply implements Runnable and contains the logic that populates every field
@Slf4j
public class UserSaveTask implements Runnable {
private final UserRepository userRepository;
public UserSaveTask(UserRepository userRepository) {
this.userRepository = userRepository;
}
/**
* Each task batch-inserts 100 documents; 1,000 such tasks produce 100,000 documents in total
*/
@Override
public void run() {
List<UserEO> list = new ArrayList<>();
// 100 documents per batch
log.info("Start inserting data...");
for (int i = 0; i < 100; i++) {
list.add(buildUserBaseInfo());
}
userRepository.saveAll(list);
log.info("Finished inserting data...");
}
/**
* Builds a user with random basic information
* @return a populated UserEO
*/
public UserEO buildUserBaseInfo() {
UserEO user = new UserEO();
// Set the user id (snowflake algorithm)
user.setId(IdUtil.getSnowflakeNextId());
user.setNickName("用户" + getRandomString(6));
// Set the sex: 0 or 1
user.setSex(ThreadLocalRandom.current().nextInt(0, 2));
// Build the birth year, month, and day
int year = randBetween(1990, 2010);
int month = randBetween(1, 12);
int day = getRandomDay(year, month);
user.setBirthYear(year);
user.setBirthMonth(month);
user.setBirthDay(day);
// Set height and weight
user.setHeight(randBetween(150, 185));
user.setWeight(randBetween(100, 160));
user.setEduLevel(randBetween(3, 7)); // below junior college (3) to doctorate (7)
// Province and city of residence
String[] live = CityUtil.getRandomCity();
user.setLiveProvince(live[0]);
user.setLiveCity(live[1]);
// Hometown province and city
String[] reg = CityUtil.getRandomCity();
user.setRegProvince(reg[0]);
user.setRegCity(reg[1]);
// Not deleted by default
user.setDelFlag(0);
return user;
}
private static String getRandomString(int length) {
String chars = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";
StringBuilder sb = new StringBuilder(length);
ThreadLocalRandom r = ThreadLocalRandom.current();
for (int i = 0; i < length; i++) {
sb.append(chars.charAt(r.nextInt(chars.length())));
}
return sb.toString();
}
private static int randBetween(int start, int end) {
return ThreadLocalRandom.current().nextInt(start, end + 1);
}
private static int getRandomDay(int year, int month) {
// Pick a day between 1 and the last day of the given month
return randBetween(1, LocalDate.of(year, month, 1).lengthOfMonth());
}
}
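The city utility class follows. It only needs each province and its capital city; the list also includes the four municipalities and the two special administrative regions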
public class CityUtil {
private static final List<String[]> PROVINCE_AND_CITY_LIST = Arrays.asList(
new String[]{"北京市", "北京市"},
new String[]{"天津市", "天津市"},
new String[]{"上海市", "上海市"},
new String[]{"重庆市", "重庆市"},
new String[]{"河北省", "石家庄市"},
new String[]{"山西省", "太原市"},
new String[]{"辽宁省", "沈阳市"},
new String[]{"吉林省", "长春市"},
new String[]{"黑龙江省", "哈尔滨市"},
new String[]{"江苏省", "南京市"},
new String[]{"浙江省", "杭州市"},
new String[]{"安徽省", "合肥市"},
new String[]{"福建省", "福州市"},
new String[]{"江西省", "南昌市"},
new String[]{"山东省", "济南市"},
new String[]{"河南省", "郑州市"},
new String[]{"湖北省", "武汉市"},
new String[]{"湖南省", "长沙市"},
new String[]{"广东省", "广州市"},
new String[]{"海南省", "海口市"},
new String[]{"四川省", "成都市"},
new String[]{"贵州省", "贵阳市"},
new String[]{"云南省", "昆明市"},
new String[]{"陕西省", "西安市"},
new String[]{"甘肃省", "兰州市"},
new String[]{"青海省", "西宁市"},
new String[]{"台湾省", "台北市"},
new String[]{"内蒙古自治区", "呼和浩特市"},
new String[]{"广西壮族自治区", "南宁市"},
new String[]{"西藏自治区", "拉萨市"},
new String[]{"宁夏回族自治区", "银川市"},
new String[]{"新疆维吾尔自治区", "乌鲁木齐市"},
new String[]{"香港特别行政区", "香港"},
new String[]{"澳门特别行政区", "澳门"}
);
public static String[] getRandomCity() {
return PROVINCE_AND_CITY_LIST.get(ThreadLocalRandom.current().nextInt(PROVINCE_AND_CITY_LIST.size()));
}
}
2.3 Inserting the data
Define the UserRepository interface, annotated with @Repository
@Repository
public interface UserRepository extends ElasticsearchRepository<UserEO, Long> {
}
Next, define a UserMatchService interface, starting with a single insert method
public interface UserMatchService {
AjaxResult matchSave();
}
Then implement the interface and its method, submitting 1,000 tasks to the thread pool in a loop
/**
*
* @Author zhenghuisheng
* @Date:2025/6/23 15:50
*/
@Service
public class UserMatchServiceImpl implements UserMatchService {
@Resource
private UserRepository userRepository;
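// Obtain the shared thread pool
ThreadPoolExecutor threadPool = ThreadPoolUtil.getThreadPool();
/**
* Generates 100,000 users in batches through the thread pool
* @return the result returned once all tasks have been submitted
*/
@Override
public AjaxResult matchSave() {
for (int i = 0; i < 1000; i++) {
// Submit one task (100 users per task)
threadPool.submit(new UserSaveTask(userRepository));
}
return AjaxResult.success("Data generation tasks submitted");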
}
}
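Note that matchSave returns as soon as the tasks have been submitted, so the response alone does not confirm that every document was written. If the caller should wait for completion, one option is to keep a future per task and join them all. A minimal sketch, with a counter standing in for userRepository.saveAll and hypothetical names throughout:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

/** Waits for every batch task to finish before reporting the total. */
public class AwaitBatchSketch {

    static int runAndAwait(int taskCount, int batchSize) {
        // A fixed pool with an unbounded queue, used here only to keep the sketch simple.
        ExecutorService pool = Executors.newFixedThreadPool(4);
        AtomicInteger inserted = new AtomicInteger();
        try {
            List<CompletableFuture<Void>> futures = new ArrayList<>();
            for (int i = 0; i < taskCount; i++) {
                // Stand-in for "build batchSize users and save them".
                futures.add(CompletableFuture.runAsync(() -> inserted.addAndGet(batchSize), pool));
            }
            // Block until every task has completed.
            CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
            return inserted.get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runAndAwait(1000, 100)); // 100000
    }
}
```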
Finally, add the controller
@RestController
@RequestMapping("/es/user")
public class UserMatchController {
@Resource
private UserMatchService userMatchService;
@GetMapping("/matchSave")
public AjaxResult matchSave() {
return userMatchService.matchSave();
}
}
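3. Viewing the data in Kibana
After starting the project and calling the endpoint above, check the data in the index. The total document count can be queried as follows
GET /user/_count
Next, look at the mapping, that is, the data type of each field. The output below shows long and text with a keyword sub-field, which are the types Elasticsearch's dynamic mapping infers, rather than the integer and keyword types declared on UserEO. This suggests the index was created implicitly by the first write instead of from the annotations, so a term query on a string field would need to target its keyword sub-field (for example liveCity.keyword)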
GET /user/_mapping
{
"user" : {
"mappings" : {
"properties" : {
"_class" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"birthDay" : {
"type" : "long"
},
"birthMonth" : {
"type" : "long"
},
"birthYear" : {
"type" : "long"
},
"delFlag" : {
"type" : "long"
},
"eduLevel" : {
"type" : "long"
},
"height" : {
"type" : "long"
},
"id" : {
"type" : "long"
},
"liveCity" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"liveProvince" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"nickName" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"regCity" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"regProvince" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"sex" : {
"type" : "long"
},
"weight" : {
"type" : "long"
}
}
}
}
}
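Query the data with pagination. Note that from is a zero-based offset, so from=1 skips the first hit and returns the five hits after it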
GET /user/_search?from=1&size=5
{
"query": {
"match_all": {}
}
}
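The response is as follows. hits.total reports 10000 with relation gte because, by default, Elasticsearch counts hits exactly only up to 10,000; use the _count request above for the exact total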
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 10000,
"relation" : "gte"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "user",
"_type" : "_doc",
"_id" : "1937084380165083277",
"_score" : 1.0,
"_source" : {
"_class" : "com.zhs.elasticsearch.match.eo.UserEO",
"id" : 1937084380165083277,
"nickName" : "用户Rxo729",
"sex" : 0,
"birthYear" : 1998,
"birthMonth" : 6,
"birthDay" : 16,
"height" : 179,
"weight" : 153,
"eduLevel" : 4,
"liveProvince" : "河北省",
"liveCity" : "石家庄市",
"regProvince" : "辽宁省",
"regCity" : "沈阳市",
"delFlag" : 0
}
},
{
"_index" : "user",
"_type" : "_doc",
"_id" : "1937084380165083281",
"_score" : 1.0,
"_source" : {
"_class" : "com.zhs.elasticsearch.match.eo.UserEO",
"id" : 1937084380165083281,
"nickName" : "用户pLNM3B",
"sex" : 0,
"birthYear" : 2007,
"birthMonth" : 7,
"birthDay" : 14,
"height" : 172,
"weight" : 131,
"eduLevel" : 7,
"liveProvince" : "西藏自治区",
"liveCity" : "拉萨市",
"regProvince" : "内蒙古自治区",
"regCity" : "呼和浩特市",
"delFlag" : 0
}
},
{
"_index" : "user",
"_type" : "_doc",
"_id" : "1937084380165083286",
"_score" : 1.0,
"_source" : {
"_class" : "com.zhs.elasticsearch.match.eo.UserEO",
"id" : 1937084380165083286,
"nickName" : "用户yupBE5",
"sex" : 0,
"birthYear" : 1999,
"birthMonth" : 10,
"birthDay" : 29,
"height" : 166,
"weight" : 140,
"eduLevel" : 7,
"liveProvince" : "贵州省",
"liveCity" : "贵阳市",
"regProvince" : "澳门特别行政区",
"regCity" : "澳门",
"delFlag" : 0
}
},
{
"_index" : "user",
"_type" : "_doc",
"_id" : "1937084380165083290",
"_score" : 1.0,
"_source" : {
"_class" : "com.zhs.elasticsearch.match.eo.UserEO",
"id" : 1937084380165083290,
"nickName" : "用户fTGRMJ",
"sex" : 1,
"birthYear" : 2003,
"birthMonth" : 7,
"birthDay" : 9,
"height" : 182,
"weight" : 128,
"eduLevel" : 6,
"liveProvince" : "海南省",
"liveCity" : "海口市",
"regProvince" : "辽宁省",
"regCity" : "沈阳市",
"delFlag" : 0
}
},
{
"_index" : "user",
"_type" : "_doc",
"_id" : "1937084380165083295",
"_score" : 1.0,
"_source" : {
"_class" : "com.zhs.elasticsearch.match.eo.UserEO",
"id" : 1937084380165083295,
"nickName" : "用户v6ZwfS",
"sex" : 0,
"birthYear" : 1995,
"birthMonth" : 12,
"birthDay" : 11,
"height" : 173,
"weight" : 140,
"eduLevel" : 5,
"liveProvince" : "湖南省",
"liveCity" : "长沙市",
"regProvince" : "江苏省",
"regCity" : "南京市",
"delFlag" : 0
}
}
]
}
}
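The from and size parameters used in the query above can be derived from a page number. The helper below is a hypothetical sketch (the class and method names are not from the article). It also guards against Elasticsearch's default index.max_result_window of 10,000, beyond which from/size paging is rejected.

```java
/** Converts a 1-based page number into the zero-based `from` offset used by from/size paging. */
public class PagingSketch {
    // Default value of index.max_result_window in Elasticsearch.
    static final int MAX_RESULT_WINDOW = 10_000;

    static int fromOffset(int pageNumber, int pageSize) {
        if (pageNumber < 1 || pageSize < 1) {
            throw new IllegalArgumentException("pageNumber and pageSize must be >= 1");
        }
        long from = (long) (pageNumber - 1) * pageSize;
        if (from + pageSize > MAX_RESULT_WINDOW) {
            throw new IllegalArgumentException("from + size exceeds the result window");
        }
        return (int) from;
    }

    public static void main(String[] args) {
        System.out.println(fromOffset(1, 5));    // 0: the first page starts at the first hit
        System.out.println(fromOffset(2, 5));    // 5
        System.out.println(fromOffset(2000, 5)); // 9995: the last page inside the default window
    }
}
```

With this convention, the first page of five users would be requested as from=0&size=5.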
At this point the data has been generated successfully
The full code is available on gitee: https://gitee.com/zhenghuisheng/elasticsearch_study