當前位置：首頁 > 运维知识 > 数据库 >内容正文

数据库

Mysql的分库分表(基于shardingsphere)

發布時間：2025/3/15 数据库 28 豆豆

生活随笔收集整理的這篇文章主要介紹了 Mysql的分库分表(基于shardingsphere) 小編覺得挺不錯的,現在分享給大家,幫大家做個參考.

一、名詞解釋

庫：database；表：table；分庫分表：sharding

二、數據庫架構演變

剛開始我們只用單機數據庫就夠了，隨后面對越來越多的請求，我們將數據庫的寫操作和讀操作進行分離，使用多個從庫副本（Slaver Replication）負責讀，使用主庫（Master）負責寫，從庫從主庫同步更新數據，保持數據一致。架構上就是數據庫主從同步。從庫可以水平擴展，所以更多的讀請求不成問題。但是當用戶量級上來后，寫請求越來越多，該怎么辦？加一個Master是不能解決問題的，因為數據要保存一致性，寫操作需要2個master之間同步，相當于是重復了，而且更加復雜。這時就需要用到分庫分表（sharding），對寫操作進行切分。

三、分庫分表前的問題

任何問題都是太大或者太小的問題，我們這里面對的數據量太大的問題。

用戶請求量太大
因為單服務器TPS，內存，IO都是有限的。解決方法：分散請求到多個服務器上；其實用戶請求和執行一個sql查詢是本質是一樣的，都是請求一個資源，只是用戶請求還會經過網關，路由，http服務器等。

單庫太大
單個數據庫處理能力有限；單庫所在服務器上磁盤空間不足；單庫上操作的IO瓶頸解決方法：切分成更多更小的庫

單表太大
CRUD都成問題；索引膨脹，查詢超時解決方法：切分成多個數據集更小的表。

四、分庫分表的方式方法

一般就是垂直切分和水平切分，這是一種結果集描述的切分方式，是物理空間上的切分。我們從面臨的問題，開始解決，闡述：首先是用戶請求量太大，我們就堆機器搞定（這不是本文重點）。然后是單個庫太大，這時我們要看是因為表多而導致數據多，還是因為單張表里面的數據多。如果是因為表多而數據多，使用垂直切分，根據業務切分成不同的庫。如果是因為單張表的數據量太大，這時要用水平切分，即把表的數據按某種規則切分成多張表，甚至多個庫上的多張表。分庫分表的順序應該是先垂直分，后水平分。因為垂直分更簡單，更符合我們處理現實世界問題的方式。

1. 垂直拆分

垂直分表
也就是“大表拆小表”，基于列字段進行的。一般是表中的字段較多，將不常用的，數據較大，長度較長（比如text類型字段）的拆分到“擴展表“。一般是針對那種幾百列的大表，也避免查詢時，數據量太大造成的“跨頁”問題。

垂直分庫（就是每個業務專門建庫，不要把多個業務的表糅合在一個庫里）
垂直分庫針對的是一個系統中的不同業務進行拆分，比如用戶User一個庫，商品Producet一個庫，訂單Order一個庫。切分后，要放在多個服務器上，而不是一個服務器上。為什么？我們想象一下，一個購物網站對外提供服務，會有用戶，商品，訂單等的CRUD。沒拆分之前，全部都是落到單一的庫上的，這會讓數據庫的單庫處理能力成為瓶頸。按垂直分庫后，如果還是放在一個數據庫服務器上，隨著用戶量增大，這會讓單個數據庫的處理能力成為瓶頸，還有單個服務器的磁盤空間，內存，tps等非常吃緊。所以我們要拆分到多個服務器上，這樣上面的問題都解決了，以后也不會面對單機資源問題。數據庫業務層面的拆分，和服務的“治理”，“降級”機制類似，也能對不同業務的數據分別的進行管理，維護，監控，擴展等。數據庫往往最容易成為應用系統的瓶頸，而數據庫本身屬于“有狀態”的，相對于Web和應用服務器來講，是比較難實現“橫向擴展”的。數據庫的連接資源比較寶貴且單機處理能力也有限，在高并發場景下，垂直分庫一定程度上能夠突破IO、連接數及單機硬件資源的瓶頸。

水平拆分

水平分表
針對數據量巨大的單張表（比如訂單表），按照某種規則（RANGE,HASH取模等），切分到多張表里面去。但是這些表還是在同一個庫中，所以庫級別的數據庫操作還是有IO瓶頸。不建議采用。
水平分庫分表
將單張表的數據切分到多個服務器上去，每個服務器具有相應的庫與表，只是表中數據集合不同。水平分庫分表能夠有效的緩解單機和單庫的性能瓶頸和壓力，突破IO、連接數、硬件資源等的瓶頸。
水平分庫分表切分規則
- RANGE
  從0到10000一個表，10001到20000一個表；
- HASH取模
  一個商場系統，一般都是將用戶，訂單作為主表，然后將和它們相關的作為附表，這樣不會造成跨庫事務之類的問題。取用戶id，然后hash取模，分配到不同的數據庫上。
- 地理區域
  比如按照華東，華南，華北這樣來區分業務，七牛云應該就是如此。
- 時間
  按照時間切分，就是將6個月前，甚至一年前的數據切出去放到另外的一張表，因為隨著時間流逝，這些表的數據被查詢的概率變小，所以沒必要和“熱數據”放在一起，這個也是“冷熱數據分離”。

水平分庫
水平分表（這些表還是在同一個庫中，所以庫級別的數據庫操作還是有IO瓶頸）

注意：隨著數據庫數據量增加，不要馬上考慮做水平切分，首先考慮緩存處理，讀寫分離，使用索引等等方式，如果這些方式不能根本解決問題了，再考慮做水平分庫和水平分表

五、分庫分表后面臨的問題

事務支持
分庫分表后，就成了分布式事務了。如果依賴數據庫本身的分布式事務管理功能去執行事務，將付出高昂的性能代價；如果由應用程序去協助控制，形成程序邏輯上的事務，又會造成編程方面的負擔。

多庫(多數據源問題)結果集合并（group by，order by）

跨庫join(跨節點連接查詢問題（分頁、排序）)
分庫分表后表之間的關聯操作將受到限制，我們無法join位于不同分庫的表，也無法join分表粒度不同的表，結果原本一次查詢能夠完成的業務，可能需要多次查詢才能完成。粗略的解決方法：全局表：基礎數據，所有庫都拷貝一份。字段冗余：這樣有些字段就不用join去查詢了。系統層組裝：分別查詢出所有，然后組裝起來，較復雜。

六、Mysql分庫分表例子(shardingsphere)

shardingsphere官網

水平分表

數據庫創建兩張一樣結構的表

application.properties配置

# shardingjdbc分片策略 # 配置數據源，給數據源起名稱 spring.shardingsphere.datasource.names=m1 # 一個實體類對應兩張表，覆蓋 spring.main.allow-bean-definition-overriding=true #配置數據源具體內容，包含連接池，驅動，地址，用戶名和密碼 spring.shardingsphere.datasource.m1.type=com.alibaba.druid.pool.DruidDataSource spring.shardingsphere.datasource.m1.driver-class-name=com.mysql.cj.jdbc.Driver spring.shardingsphere.datasource.m1.url=jdbc:mysql://localhost:63306/sharding?useUnicode=true&useJDBCCompliantTimezoneShift=true&useLegacyDatetimeCode=false&serverTimezone=GMT%2B8&useSSL=true&characterEncoding=UTF-8&rewriteBatchedStatements=true spring.shardingsphere.datasource.m1.username=root spring.shardingsphere.datasource.m1.password=root#指定course表分布情況，配置表在哪個數據庫里面，表名稱都是什么 m1.course_1 , m1.course_2 spring.shardingsphere.sharding.tables.course.actual-data-nodes=m1.course_$->{1..2}# 指定course表里面主鍵cid 生成策略 SNOWFLAKE spring.shardingsphere.sharding.tables.course.key-generator.column=cid spring.shardingsphere.sharding.tables.course.key-generator.type=SNOWFLAKE# 指定分片策略約定cid值偶數添加到course_1表，如果cid是奇數添加到course_2表 spring.shardingsphere.sharding.tables.course.table-strategy.inline.sharding-column=cid spring.shardingsphere.sharding.tables.course.table-strategy.inline.algorithm-expression=course_$->{cid % 2 + 1}# 打開sql輸出日志 spring.shardingsphere.props.sql.show=true

測試：

@SpringBootTest class ShardingApplicationTests {@AutowiredCourseMapper courseMapper;@Testvoid contextLoads() {for (long i = 0 ; i<20; i++){courseMapper.insert(new Course(null,String.valueOf(i+1),i,String.valueOf(i)));}} }

course_1

course_2

水平分庫分表
分別創建sharding_1庫，sharding_2庫，每個庫下再創建course_1表，course_2表，course_3表

application.properties配置

# shardingjdbc分片策略 # 配置數據源，給數據源起名稱, # 水平分庫，配置兩個數據源(分別對應shardin_g1,sharding_2) spring.shardingsphere.datasource.names=m1,m2#配置第一個數據源具體內容，包含連接池，驅動，地址，用戶名和密碼 spring.shardingsphere.datasource.m1.type=com.alibaba.druid.pool.DruidDataSource spring.shardingsphere.datasource.m1.driver-class-name=com.mysql.cj.jdbc.Driver spring.shardingsphere.datasource.m1.url=jdbc:mysql://localhost:63306/sharding_1?useUnicode=true&useJDBCCompliantTimezoneShift=true&useLegacyDatetimeCode=false&serverTimezone=GMT%2B8&useSSL=true&characterEncoding=UTF-8&rewriteBatchedStatements=true spring.shardingsphere.datasource.m1.username=root spring.shardingsphere.datasource.m1.password=root#配置第二個數據源具體內容，包含連接池，驅動，地址，用戶名和密碼 spring.shardingsphere.datasource.m2.type=com.alibaba.druid.pool.DruidDataSource spring.shardingsphere.datasource.m2.driver-class-name=com.mysql.cj.jdbc.Driver spring.shardingsphere.datasource.m2.url=jdbc:mysql://localhost:63306/sharding_2?useUnicode=true&useJDBCCompliantTimezoneShift=true&useLegacyDatetimeCode=false&serverTimezone=GMT%2B8&useSSL=true&characterEncoding=UTF-8&rewriteBatchedStatements=true spring.shardingsphere.datasource.m1.username=root spring.shardingsphere.datasource.m1.password=root#指定數據庫分布情況，數據庫里面表分布情況 # m1 m2 course_1 course_2 course_3 spring.shardingsphere.sharding.tables.course.actual-data-nodes=m$->{1..2}.course_$->{1..3}# 指定course表里面主鍵cid 生成策略 SNOWFLAKE spring.shardingsphere.sharding.tables.course.key-generator.column=cid spring.shardingsphere.sharding.tables.course.key-generator.type=SNOWFLAKE# 指定表分片策略約定cid%3=0加到course_1表，如果cid%3=1添加到course_2表, 如果cid%3=2添加到course_3表 spring.shardingsphere.sharding.tables.course.table-strategy.inline.sharding-column=cid spring.shardingsphere.sharding.tables.course.table-strategy.inline.algorithm-expression=course_$->{cid % 3 + 1}# 指定庫分片策略約定user_id是偶數添加m1，是奇數添加m2 spring.shardingsphere.sharding.tables.course.database-strategy.inline..sharding-column=user_id spring.shardingsphere.sharding.tables.course.database-strategy.inline.algorithm-expression=m$->{user_id % 2 + 1}# 打開sql輸出日志 spring.shardingsphere.props.sql.show=true

測試：

這里只展示userid為偶數的數據（完美實現分庫分表）

測試獲取一條數據

@Testvoid get(){Course course = courseMapper.selectById(631501645238239232L);}

可以看到，因為無法確定該數據是在哪個庫（因為我們查詢的時候并沒有指定分庫時用的userid），所以shardingproxy就到兩個庫里找，但是根據cid，shardingproxy能確定在course_3庫，所以，只在course_3里查找。
參考文章

公共表
公共表屬于系統中數據量較小，變動少，而且屬于高頻聯合查詢的依賴表。參數表、數據字典表等屬于此類型。可以將這類表在每個數據庫都保存一份，所有更新操作都同時發送到所有分庫執行。

創建dict表

添加以下配置：

# 添加公共表 spring.shardingsphere.sharding.broadcast-tables=dict spring.shardingsphere.sharding.tables.dict.key-generator.column=dict_id spring.shardingsphere.sharding.tables.dict.key-generator.type=SNOWFLAKE

測試

@Testvoid insertCommon(){dictMapper.insert(new Dict(null,"1","1","1"));}

結果可以看到，兩個數據源中都插了該信息

刪除的時候也是全部刪除

面對日益增加的系統訪問量，數據庫的吞吐量面臨著巨大瓶頸。對于同一時刻有大量并發讀操作和較少寫操作類型的應用系統來說，將數據庫拆分為主庫和從庫，主庫負責處理事務性的增刪改操作，從庫負責處理查詢操作，能夠有效的避免由數據更新導致的行鎖，使得整個系統的查詢性能得到極大的改善

可以看到，讀寫分離需要配合數據庫的主從復制，而Sharding-JDBC讀寫分離則是根據SQL語義的分析，將讀操作（select）和寫操作(insert update)分別路由至主庫與從庫，它提供透明化的讀寫分離，讓使用方盡量使用一個數據庫一樣使用主從數據庫集群。

在這里我部署了兩臺mysql從機，和一臺mysql主機，主機負責寫，兩臺從機同步主機數據并負責讀。

結構如下：

數據分離+讀寫分離的配置文件

# shardingjdbc分片策略 # 配置數據源，給數據源起名稱 # 水平分庫 spring.shardingsphere.datasource.names=m1,m2,s1,s2,s3,s4##配置數據源具體內容，包含連接池，驅動，地址，用戶名和密碼 spring.shardingsphere.datasource.m1.type=com.alibaba.druid.pool.DruidDataSource spring.shardingsphere.datasource.m1.driver-class-name=com.mysql.cj.jdbc.Driver spring.shardingsphere.datasource.m1.url=jdbc:mysql://ip:13306/sharding_1?useUnicode=true&useJDBCCompliantTimezoneShift=true&useLegacyDatetimeCode=false&serverTimezone=GMT%2B8&useSSL=true&characterEncoding=UTF-8&rewriteBatchedStatements=true spring.shardingsphere.datasource.m1.username=root spring.shardingsphere.datasource.m1.password=rootspring.shardingsphere.datasource.m2.type=com.alibaba.druid.pool.DruidDataSource spring.shardingsphere.datasource.m2.driver-class-name=com.mysql.cj.jdbc.Driver spring.shardingsphere.datasource.m2.url=jdbc:mysql://ip:13306/sharding_2?useUnicode=true&useJDBCCompliantTimezoneShift=true&useLegacyDatetimeCode=false&serverTimezone=GMT%2B8&useSSL=true&characterEncoding=UTF-8&rewriteBatchedStatements=true spring.shardingsphere.datasource.m2.username=root spring.shardingsphere.datasource.m2.password=rootspring.shardingsphere.datasource.s1.type=com.alibaba.druid.pool.DruidDataSource spring.shardingsphere.datasource.s1.driver-class-name=com.mysql.cj.jdbc.Driver spring.shardingsphere.datasource.s1.url=jdbc:mysql://ip:23306/sharding_1?useUnicode=true&useJDBCCompliantTimezoneShift=true&useLegacyDatetimeCode=false&serverTimezone=GMT%2B8&useSSL=true&characterEncoding=UTF-8&rewriteBatchedStatements=true spring.shardingsphere.datasource.s1.username=root spring.shardingsphere.datasource.s1.password=rootspring.shardingsphere.datasource.s2.type=com.alibaba.druid.pool.DruidDataSource spring.shardingsphere.datasource.s2.driver-class-name=com.mysql.cj.jdbc.Driver spring.shardingsphere.datasource.s2.url=jdbc:mysql://ip:23306/sharding_2?useUnicode=true&useJDBCCompliantTimezoneShift=true&useLegacyDatetimeCode=false&serverTimezone=GMT%2B8&useSSL=true&characterEncoding=UTF-8&rewriteBatchedStatements=true spring.shardingsphere.datasource.s2.username=root spring.shardingsphere.datasource.s2.password=rootspring.shardingsphere.datasource.s3.type=com.alibaba.druid.pool.DruidDataSource spring.shardingsphere.datasource.s3.driver-class-name=com.mysql.cj.jdbc.Driver spring.shardingsphere.datasource.s3.url=jdbc:mysql://ip:33306/sharding_1?useUnicode=true&useJDBCCompliantTimezoneShift=true&useLegacyDatetimeCode=false&serverTimezone=GMT%2B8&useSSL=true&characterEncoding=UTF-8&rewriteBatchedStatements=true spring.shardingsphere.datasource.s3.username=root spring.shardingsphere.datasource.s3.password=rootspring.shardingsphere.datasource.s4.type=com.alibaba.druid.pool.DruidDataSource spring.shardingsphere.datasource.s4.driver-class-name=com.mysql.cj.jdbc.Driver spring.shardingsphere.datasource.s4.url=jdbc:mysql://ip:33306/sharding_2?useUnicode=true&useJDBCCompliantTimezoneShift=true&useLegacyDatetimeCode=false&serverTimezone=GMT%2B8&useSSL=true&characterEncoding=UTF-8&rewriteBatchedStatements=true spring.shardingsphere.datasource.s4.username=root spring.shardingsphere.datasource.s4.password=root#指定數據庫分布情況，數據庫里面表分布情況 # m1 m2 course_1 course_2 course_3 spring.shardingsphere.sharding.tables.course.actual-data-nodes=m$->{1..2}.course_$->{1..3} # 指定course表里面主鍵cid及其生成策略:SNOWFLAKE spring.shardingsphere.sharding.tables.course.key-generator.column=cid spring.shardingsphere.sharding.tables.course.key-generator.type=SNOWFLAKE# 指定分片策略約定cid%3=0寫入course_1表，cid%3=1寫入course_2表，cid%3=2寫入course_3表 spring.shardingsphere.sharding.tables.course.table-strategy.inline.sharding-column=cid spring.shardingsphere.sharding.tables.course.table-strategy.inline.algorithm-expression=course_$->{cid % 3 + 1}# 指定數據庫分片策略約定user_id是偶數添加m1，是奇數添加m2 spring.shardingsphere.sharding.tables.course.database-strategy.inline..sharding-column=user_id spring.shardingsphere.sharding.tables.course.database-strategy.inline.algorithm-expression=m$->{user_id % 2 + 1}spring.main.allow-bean-definition-overriding=true# 指定公共表 spring.shardingsphere.sharding.broadcast-tables=dict spring.shardingsphere.sharding.tables.dict.key-generator.column=dict_id spring.shardingsphere.sharding.tables.dict.key-generator.type=SNOWFLAKE# 綁定主從庫關系 spring.shardingsphere.sharding.master-slave-rules.m1.master-data-source-name=m1 spring.shardingsphere.sharding.master-slave-rules.m1.slave-data-source-names=s1,s3 spring.shardingsphere.sharding.master-slave-rules.m1.load-balance-algorithm-type=ROUND_ROBIN spring.shardingsphere.sharding.master-slave-rules.m2.master-data-source-name=m2 spring.shardingsphere.sharding.master-slave-rules.m2.slave-data-source-names=s2,s4# 打開sql輸出日志 spring.shardingsphere.props.sql.show=true

查詢：

@Testvoid get(){LambdaQueryWrapper<Course> queryWrapper=new LambdaQueryWrapper();queryWrapper.eq(Course::getUserId, 43);System.out.println(courseMapper.selectOne(queryWrapper));}

可以看到，由于是使用user_id進行查詢的，sharding根據43自然知道只需要在user_id為單數的那個庫查找即可，而且由于查詢是輪詢的，所以可以看到兩個從庫被輪詢查找，但是每次查找的表是不一樣的

插入

@Testvoid contextLoads() {for (long i = 102 ; i<112; i++){Course course=new Course(null,String.valueOf(i),i+1,String.valueOf(i));courseMapper.insert(course);}}

可以看到，只在m1，m2之間輪詢插入數據，而不會在從庫(s1,s2,s3,s4)進行插入

更詳細的配置文件參考官網

總結

以上是生活随笔為你收集整理的Mysql的分库分表(基于shardingsphere)的全部內容，希望文章能夠幫你解決所遇到的問題。

如果覺得生活随笔網站內容還不錯，歡迎將生活随笔推薦給好友。