Upgrading TiSpark v2.4.x to TiSpark v2.5.x

Reader submission, 2022-10-01


Author: 边城元元

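1. Background

When installing TiDB v6.0 and deploying a TiSpark cluster by scaling out with TiUP, the highest version available is TiSpark v2.4.1; the latest release, TiSpark v2.5.1, is not offered. In addition, TiSpark v2.5.0 and later implement part of the authentication and authorization functionality.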

This article tries out two things:

Upgrading TiSpark v2.4.1 to TiSpark v2.5.1
The authentication and authorization features of TiSpark v2.5.1

2. Preparing the environment

2.1 Installing Cluster111 (v6.0.0)

2.1.1 Cluster111 topology

# cluster111.yml
server_configs:
  tidb:
    log.slow-threshold: 300
    binlog.enable: false
    binlog.ignore-error: false
  tikv:
    readpool.storage.use-unified-pool: false
    readpool.coprocessor.use-unified-pool: true
  pd:
    schedule.leader-schedule-limit: 4
    schedule.region-schedule-limit: 2048
    schedule.replica-schedule-limit: 64
    replication.location-labels:
      - host
pd_servers:
  - host: 10.0.2.15
    # ssh_port: 22
    # name: "pd-1"
    client_port: 2379
    # peer_port: 2380
tidb_servers:
  - host: 10.0.2.15
tikv_servers:
  - host: 10.0.2.15
    # ssh_port: 22
    port: 20160
    status_port: 20180
    config:
      server.grpc-concurrency: 4
monitoring_servers:
  - host: 10.0.2.15
grafana_servers:
  - host: 10.0.2.15
alertmanager_servers:
  - host: 10.0.2.15

2.1.2 Installing Cluster111

# Install TiUP
curl --proto '=--tlsv1.2 -sSf | sh
source /root/.bash_profile
tiup update cluster
tiup cluster list
# Check the environment configuration and try to fix any problems
tiup cluster check ./cluster111.yml --user root -p --apply
# Deploy cluster111
tiup cluster deploy cluster111 v6.0.0 ./cluster111.yml --user root -p
# Start the cluster
tiup cluster start cluster111
tiup cluster display cluster111

2.2 TiSpark v2.4.1

2.2.1 Topology

# cluster111-v6.0.0-tispark.yml
tispark_masters:
  - host: 10.0.2.15
    ssh_port: 22
    port: 7077
# NOTE: multiple worker nodes on the same host is not supported by Spark
tispark_workers:
  - host: 10.0.2.15

2.2.2 Installing TiSpark

Install OpenJDK 8 (omitted here), then install TiSpark by scaling out the cluster:

tiup cluster scale-out cluster111 ./cluster111-v6.0.0-tispark.yml -uroot -p

2.3 Testing Spark v2.4.3 Standalone

Add the following settings to spark-defaults.conf:

# SQL extension class
spark.sql.extensions org.apache.spark.sql.TiExtensions
# Master node
spark.master spark://10.0.2.15:7077
# PD nodes; separate multiple PD addresses with commas, e.g. 10.16.20.1:2379,10.16.20.2:2379,10.16.20.3:2379
spark.tispark.pd.addresses 10.0.2.15:2379
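Settings like these are easy to append twice when editing by hand. The following is a minimal sketch of an idempotent helper, not something from the original procedure: the function name set_conf is hypothetical, and it assumes the whitespace-separated "key value" line format used above.

```shell
#!/usr/bin/env bash
# set_conf FILE KEY VALUE
# Hypothetical helper: drops any existing line whose first field is KEY,
# then appends "KEY VALUE", so repeated calls leave exactly one entry.
set_conf() {
  local file="$1" key="$2" value="$3" tmp
  touch "$file"
  tmp="$(mktemp)"
  # Comment lines start with "#", so they never match KEY and are kept.
  awk -v k="$key" '$1 != k' "$file" > "$tmp"
  printf '%s %s\n' "$key" "$value" >> "$tmp"
  # Write back through the original file to keep its owner and permissions.
  cat "$tmp" > "$file"
  rm -f "$tmp"
}
```

For example, set_conf /tidb-deploy/tispark-master-7077/conf/spark-defaults.conf spark.tispark.pd.addresses 10.0.2.15:2379 would set the PD address; note that an updated key moves to the end of the file.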

Start the Spark cluster

/tidb-deploy/tispark-master-7077/sbin/start-all.sh

Start spark-shell

# Start spark-shell
/tidb-deploy/tispark-master-7077/bin/spark-shell
# Then run
spark.sql("select ti_version()").collect

Start spark-sql

# Start spark-sql
/tidb-deploy/tispark-master-7077/bin/spark-sql
# Then run
select ti_version();

3. Upgrading TiSpark

3.1 Software for the upgrade

# - Spark v3.1.3
curl -L "-O spark-3.1.3-bin-hadoop3.2.tgz
# - TiSpark v2.5.1
curl -L "-O tispark-assembly-3.1-2.5.1.jar

3.2 Backup

\cp -rf /tidb-deploy/tispark-master-7077 /tidb-deploy/tispark-master-7077-bak2.4.1

3.3 Upgrade

# Replace Spark
mkdir -p /usr/local0/webserver/tispark && tar -zxvf spark-3.1.3-bin-hadoop3.2.tgz -C /usr/local0/webserver/tispark/
# Note: if /tidb-deploy/tispark-master-7077 still exists at this point, mv puts the new directory inside it instead of replacing it
mv /usr/local0/webserver/tispark/spark-3.1.3-bin-hadoop3.2 /tidb-deploy/tispark-master-7077
chown -R tidb:tidb /tidb-deploy/tispark-master-7077
# Replace the TiSpark jar
cp -rf tispark-assembly-3.1-2.5.1.jar /tidb-deploy/tispark-master-7077/jars/
# Restore the configuration files from the backup
cp -rf /tidb-deploy/tispark-master-7077-bak2.4.1/conf/* /tidb-deploy/tispark-master-7077/conf/
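The replacement step can be wrapped so that the old tree is renamed before the new one is moved into place. This is only a sketch under assumptions: the function name swap_dir is hypothetical, and it refuses to overwrite an existing backup rather than guessing what to do with it.

```shell
#!/usr/bin/env bash
# swap_dir NEW_DIR LIVE_DIR BACKUP_DIR
# Hypothetical helper: renames LIVE_DIR to BACKUP_DIR (if LIVE_DIR exists),
# then moves NEW_DIR to the LIVE_DIR path.
swap_dir() {
  local new="$1" live="$2" backup="$3"
  [ -d "$new" ] || { echo "missing new directory: $new" >&2; return 1; }
  # Never clobber an earlier backup.
  [ -e "$backup" ] && { echo "backup already exists: $backup" >&2; return 1; }
  if [ -d "$live" ]; then
    mv "$live" "$backup" || return 1
  fi
  mv "$new" "$live"
}
```

With paths from this article, a call could look like swap_dir /usr/local0/webserver/tispark/spark-3.1.3-bin-hadoop3.2 /tidb-deploy/tispark-master-7077 /tidb-deploy/tispark-master-7077-old, where the last path is an arbitrary name that must not exist yet.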

3.4 Testing

Start the Spark cluster

/tidb-deploy/tispark-master-7077/sbin/start-all.sh

Start spark-shell

# Start spark-shell
/tidb-deploy/tispark-master-7077/bin/spark-shell
# Then run
spark.sql("select ti_version()").collect

Start spark-sql

# Start spark-sql
/tidb-deploy/tispark-master-7077/bin/spark-sql
# Then run
select ti_version();
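To compare the reported version before and after the upgrade without opening an interactive session, the same query can be passed through the -e option of the Spark SQL CLI. A small sketch; the wrapper name ti_version and its argument (the Spark home directory) are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# ti_version SPARK_HOME
# Hypothetical wrapper: runs the version query non-interactively
# through SPARK_HOME/bin/spark-sql -e.
ti_version() {
  local bin="$1/bin/spark-sql"
  [ -x "$bin" ] || { echo "spark-sql not found: $bin" >&2; return 1; }
  "$bin" -e "select ti_version();"
}
```

Running ti_version /tidb-deploy/tispark-master-7077 and ti_version /tidb-deploy/tispark-master-7077-bak2.4.1 would then query through the new and the backed-up installation respectively.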

4. Testing TiSpark v2.5.1 authentication

Reference: authorization and authentication through TiDB server (TiSpark documentation). Prerequisites:

The database's user account must have the PROCESS privilege.
TiSpark version >= 2.5.0
Spark version = 3.0.x or 3.1.x
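The two version prerequisites can be checked mechanically before changing any configuration. A minimal sketch, assuming GNU sort with the -V (version sort) option is available; the function names version_ge and auth_supported are hypothetical.

```shell
#!/usr/bin/env bash
# version_ge A B: succeeds when dotted version A is >= B.
# Relies on "sort -V" ordering the smaller version first.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# auth_supported TISPARK_VERSION SPARK_VERSION
# Succeeds when TiSpark >= 2.5.0 and Spark is 3.0.x or 3.1.x.
auth_supported() {
  local tispark="$1" spark="$2"
  version_ge "$tispark" 2.5.0 || return 1
  case "$spark" in
    3.0.*|3.1.*) return 0 ;;
    *) return 1 ;;
  esac
}
```

For the setup in this article, auth_supported 2.5.1 3.1.3 succeeds, while the pre-upgrade combination auth_supported 2.4.1 2.4.3 fails.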

4.1 Adding settings to spark-defaults.conf

spark.sql.tidb.addr 10.0.2.15
spark.sql.tidb.port 4000
spark.sql.tidb.user root
spark.sql.tidb.password abc
# Must config in conf file
spark.sql.auth.enable true
# in seconds. Values range from 5 to 3600
spark.sql.tidb.auth.refreshInterval 30
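A quick sanity check of these settings can catch a missing key or an out-of-range refresh interval before spark-sql is started. The sketch below is an illustration only: conf_get and check_auth_conf are hypothetical names, the file is assumed to use whitespace-separated "key value" lines, and an unset refresh interval is simply accepted.

```shell
#!/usr/bin/env bash
# conf_get FILE KEY: prints the value from the last line whose first field is KEY.
conf_get() {
  awk -v k="$2" '$1 == k { v = $2 } END { print v }' "$1"
}

# check_auth_conf FILE
# Verifies that the connection keys are present, that authentication is
# enabled, and that a configured refresh interval lies within 5..3600 seconds.
check_auth_conf() {
  local file="$1" key interval
  for key in spark.sql.tidb.addr spark.sql.tidb.port spark.sql.tidb.user; do
    [ -n "$(conf_get "$file" "$key")" ] || { echo "missing $key" >&2; return 1; }
  done
  [ "$(conf_get "$file" spark.sql.auth.enable)" = "true" ] \
    || { echo "spark.sql.auth.enable is not true" >&2; return 1; }
  interval="$(conf_get "$file" spark.sql.tidb.auth.refreshInterval)"
  if [ -n "$interval" ]; then
    case "$interval" in
      *[!0-9]*) echo "refreshInterval is not an integer" >&2; return 1 ;;
    esac
    [ "$interval" -ge 5 ] && [ "$interval" -le 3600 ] \
      || { echo "refreshInterval out of range" >&2; return 1; }
  fi
}
```

The password is deliberately not checked, because an empty value is a legitimate setting.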

4.2 Configuring a wrong password

# This password is wrong
spark.sql.tidb.password abc

After starting spark-sql, executing a SQL statement reports an error.

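4.3 Correcting the password

# Empty password
spark.sql.tidb.password
# With the setting below, the credentials are refreshed every 30 seconds (the new spark.sql.tidb.password only applies to newly connected spark-sql sessions)
spark.sql.tidb.auth.refreshInterval 30

Start spark-sql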

/tidb-deploy/tispark-master-7077/bin/spark-sql

use tidb_catalog;
show databases;

select 'CUSTOMER' tablename, count(*) ct from tidb_catalog.TPCH_001.CUSTOMER union all
select 'NATION' tablename, count(*) ct from tidb_catalog.TPCH_001.NATION union all
select 'REGION' tablename, count(*) ct from tidb_catalog.TPCH_001.REGION union all
select 'PART' tablename, count(*) ct from tidb_catalog.TPCH_001.PART union all
select 'SUPPLIER' tablename, count(*) ct from tidb_catalog.TPCH_001.SUPPLIER union all
select 'PARTSUPP' tablename, count(*) ct from tidb_catalog.TPCH_001.PARTSUPP union all
select 'ORDERS' tablename, count(*) ct from tidb_catalog.TPCH_001.ORDERS union all
select 'LINEITEM' tablename, count(*) ct from tidb_catalog.TPCH_001.LINEITEM
order by ct desc;
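The row-count query repeats one pattern per table, so it can also be generated from a table list. A sketch with a hypothetical function name count_query; it assumes the catalog is called tidb_catalog, as in the statements above.

```shell
#!/usr/bin/env bash
# count_query DB TABLE...
# Hypothetical helper: prints a single "union all" query that counts the
# rows of every listed table in tidb_catalog.DB, largest table first.
count_query() {
  local db="$1" sep="" t
  shift
  for t in "$@"; do
    printf "%sselect '%s' tablename, count(*) ct from tidb_catalog.%s.%s" "$sep" "$t" "$db" "$t"
    sep=" union all "
  done
  printf ' order by ct desc;\n'
}
```

The generated text can be pasted into spark-sql, for example the output of count_query TPCH_001 CUSTOMER NATION REGION PART SUPPLIER PARTSUPP ORDERS LINEITEM.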

4.4 Configuring the password in the SparkSession

spark.sqlContext.setConf("spark.sql.tidb.addr", your_tidb_server_address)
spark.sqlContext.setConf("spark.sql.tidb.port", your_tidb_server_port)
spark.sqlContext.setConf("spark.sql.tidb.user", your_tidb_server_user)
spark.sqlContext.setConf("spark.sql.tidb.password", your_tidb_server_password)

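4.5 Limitations

It cannot work together with data sources other than TiDB.
Role-based privileges are not supported.
The TiDB Data Source API, for example TiBatchWrite, is not supported.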

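5. Summary

This article walked through upgrading to TiSpark v2.5.1 in a setup where tiup list tispark --all does not offer TiSpark v2.5.x, and tried out the authentication feature newly supported in TiSpark v2.5.x.

Thanks!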

References

2.4.5) to TiSpark 2.5.0 (Spark 3.0.X/3.1.X) migration in practice

https://github.com/pingcap/tispark/blob/master/docs/authorization_userguide.md
