前言:
clickhouse的安装见
https://blog.csdn.net/weixin_41463944/article/details/142534145?spm=1001.2014.3001.5501
1、需要在路径/etc/clickhouse-serve下定义*_dictionary.xml结尾的配置文件
2、需要准备organization.csv和sales.csv本地测试文件(注意不是windows下的xlsx表格,是linux下的文件)
2.1 organization.csv文件
1,"a0001","研发部"
2,"a0002","产品部"
3,"a0003","数据部"
4,"a0004","测试部"
5,"a0005","运维部"
6,"a0006","规划部",
7,"a0007","市场部"
2.2 sales.csv文件
1,2016-01-01,2017-01-10,100
2,2016-05-01,2017-07-01,200
3,2014-03-05,2018-01-20,300
4,2018-08-01,2019-10-01,400
5,2017-03-01,2017-06-01,500
6,2017-04-09,2018-05-30,600
7,2018-06-01,2019-01-25,700
8,2019-08-01,2019-12-12,800
3、关于每个xml字典文件的格式
<?xml version="1.0"?>
<dictionaries>
<dictionary>
<name>字典名称(全局唯一)</name>
<source>
数据源配置
</source>
<layout>
字典数据在内存中的格式
</layout>
<structure>
字典的数据结构
</structure>
<lifetime>
字典的自动更新频率
<min>300</min>
<max>360</max>
</lifetime>
</dictionary>
</dictionaries>
4、关于扩展字典的几种类型配置
一共7种类型,其中flat、hashed、sparse_hashed、range_hashed、cache的主键为数值型UInt64类型的,其他类型如complex_key_hashed、complex_key_cache为复合类型的key
4.1)flat类型的字典,在/etc/clickhouse-serve目录下新建test_flat_dictionary.xml
<?xml version="1.0"?>
<dictionaries>
<dictionary>
<name>test_flat_dict</name>
<source>
<file>
<path>/home/clickhouse-server/etc/clickhouse-server/dictionary/organization.csv</path>
<format>CSV</format>
</file>
</source>
<layout>
<flat/>
</layout>
<structure>
<id>
<name>id</name>
</id>
<attribute>
<name>code</name>
<type>String</type>
<null_value></null_value>
</attribute>
<attribute>
<name>name</name>
<type>String</type>
<null_value></null_value>
</attribute>
</structure>
<lifetime>
<min>300</min>
<max>360</max>
</lifetime>
</dictionary>
</dictionaries>
4.2)hashed类型的字典,在/etc/clickhouse-serve目录下新建test_hashed_dictionary.xml
<?xml version="1.0"?>
<dictionaries>
<dictionary>
<name>test_hashed_dict</name>
<source>
<file>
<path>/home/clickhouse-server/etc/clickhouse-server/dictionary/organization.csv</path>
<format>CSV</format>
</file>
</source>
<layout>
<hashed/>
</layout>
<structure>
<id>
<name>id</name>
</id>
<attribute>
<name>code</name>
<type>String</type>
<null_value></null_value>
</attribute>
<attribute>
<name>name</name>
<type>String</type>
<null_value></null_value>
</attribute>
</structure>
<lifetime>
<min>300</min>
<max>360</max>
</lifetime>
</dictionary>
</dictionaries>
4.3)complex_key_hashed类型的字典,在/etc/clickhouse-serve目录下新建complex_key_hashed_dictionary.xml
<?xml version="1.0"?>
<dictionaries>
<dictionary>
<name>test_complex_key_hashed_dict</name>
<source>
<file>
<path>/home/clickhouse-server/etc/clickhouse-server/dictionary/organization.csv</path>
<format>CSV</format>
</file>
</source>
<layout>
<complex_key_hashed/>
</layout>
<structure>
<key>
<attribute>
<name>id</name>
<type>UInt64</type>
</attribute>
<attribute>
<name>code</name>
<type>String</type>
</attribute>
</key>
<attribute>
<name>name</name>
<type>String</type>
<null_value></null_value>
</attribute>
</structure>
<lifetime>
<min>300</min>
<max>360</max>
</lifetime>
</dictionary>
</dictionaries>
4.4)range_hashed类型的字典,在/etc/clickhouse-serve目录下新建test_range_hashed_dictionary.xml
<?xml version="1.0"?>
<dictionaries>
<dictionary>
<name>test_range_hashed_dict</name>
<source>
<file>
<path>/home/clickhouse-server/etc/clickhouse-server/dictionary/sales.csv</path>
<format>CSV</format>
</file>
</source>
<layout>
<range_hashed/>
</layout>
<structure>
<id>
<name>id</name>
</id>
<range_min>
<name>start</name>
</range_min>
<range_max>
<name>end</name>
</range_max>
<attribute>
<name>price</name>
<type>Float32</type>
<null_value></null_value>
</attribute>
</structure>
<lifetime>
<min>300</min>
<max>360</max>
</lifetime>
</dictionary>
</dictionaries>
5、关于字典的数据源
5.1 文件类型,格式通常为CSV或者TabSeparated
1)本地文件
<source>
<file>
<path>/home/clickhouse-server/etc/clickhouse-server/dictionary/organization.csv</path>
<format>CSV</format>
</file>
</source>
2)可执行文件
<source>
<executable>
<command>/home/clickhouse-server/etc/clickhouse-server/dictionary/organization.csv</command>
<format>CSV</format>
</executable>
</source>
3)远程文件
<source>
<http>
<url>xxxxx.csv</url>
<format>CSV</format>
</http>
</source>
5.2 数据库类型的数据源
1)mysql
<source>
<mysql>
<port>3306</port>
<user>xxxx</user>
<password>xxxx</password>
<replica>
<host>localhost</host>
<priority>1</priority>
</replica>
<db>db_mybatis</db>
<table>userList</table>
</mysql>
</source>
其中<mysql>
标签的下一级可添加 <where>is_deleted=0</where>
做为筛选条件
2)clickhouse
<source>
<clickhouse>
<host>localhost</host>
<port>9000</port>
<user>xxxx</user>
<password>xxxxx</password>
<db>my_database</db>
<table>test_table</table>
</clickhouse>
</source>
6、关于DDL语句
dictGet(‘字典名称’,‘字典的具体字段’,‘key键值’),如果字典为复合key的话,则在字典函数中的key用元组tuple(‘a’,‘b’,‘c’)表示
7、关于手动刷新数据字典
7.1 )全局字典刷新
system reload dictionaries
7.2)刷新单个字典(字典名称不加单引号)
system reload dictionary test_mysql_dict