MySQL - table partitioning vs. "manual" table partitioning

I have the following choice:


I have a huge table (9999999999999 rows), let's call it tableHuge, and I would like to split it into multiple tables (to optimize queries). This table contains dates (days of the month), and most queries use a specified month as the search key in the SELECT. This leads me to the following choices:


Choice one: Split the table into multiple tables, using the month as the 'tail' of the name (like lessHugeTable_01, lessHugeTable_02, etc.). Then my app can take care of accessing the table it needs. The main downside is losing the ability to join in cases that span more than one month (or having to join with a union... well... complications).


Choice two: Use table partitioning.


Since I have never used partitioning before (so I don't have the knowledge to compare), I would like some advice on how to do it, with pros and cons if possible (except obvious things like "if your manually partitioned table gets broken you lose only that data, while with a partitioned table you lose all the data").


Thank you for your time.


The answer here is really "depends".


More specifically it depends on the nature of your data, what accesses your data and how that data is accessed.


From the sound of it, you might be best off with a table partitioned by year and month. I am making the wild assumption here that you will need to access older data less frequently (or never) and hence will be able to archive it off to keep data volumes down in your main table (like I said, "depends"!).
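To make "partitioned by year and month" a bit more concrete, here is a minimal sketch of what such a table could look like. Only the name tableHuge comes from the question; the columns (id, event_date, payload), the partition names and the 2013 dates are assumptions made up for illustration, so adjust them to your real schema.

```sql
-- Hypothetical schema: only the table name comes from the question.
CREATE TABLE tableHuge (
    id         BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    event_date DATE NOT NULL,
    payload    VARCHAR(255),
    -- MySQL requires the partitioning column to be part of every
    -- unique key on the table, including the primary key.
    PRIMARY KEY (id, event_date)
)
PARTITION BY RANGE (TO_DAYS(event_date)) (
    -- One partition per calendar month (year and month together).
    PARTITION p2013_01 VALUES LESS THAN (TO_DAYS('2013-02-01')),
    PARTITION p2013_02 VALUES LESS THAN (TO_DAYS('2013-03-01')),
    PARTITION p2013_03 VALUES LESS THAN (TO_DAYS('2013-04-01')),
    -- Catch-all so that rows with later dates are not rejected.
    PARTITION p_future VALUES LESS THAN MAXVALUE
);
```

With a layout like this, new months would have to be added over time (for example by reorganizing the catch-all partition), which is part of the housekeeping you take on with this route.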


If your table is, and always will be, accessed by one application alone, into which you can build logic to handle your 'tail' naming convention, then you might want to go down the multiple-tables route.


Here is how I see pros and cons stacking up:


Multiple Tables Pros

  1. Smaller individual table if just selecting data for a single month

  2. Errr. I can only think of one actually

Multiple Tables Cons

  1. Difficulty in querying/updating multi-month datasets

  2. What happens if you get data from January in the February table? "But it will never happen!" Really? Really?!

  3. If multiple applications need to access these tables, then they will all need to have your 'tail' naming convention logic in place, i.e. knowing that lessHugeTable_02 has data from February in it.
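To illustrate the multi-month pain, here is a rough sketch of a query spanning mid-January to mid-February with the manual tables. The table names are the ones from the question; the columns (id, event_date, payload) and the 2013 dates are assumptions for the sake of the example.

```sql
-- Assumed columns (not from the question): id, event_date, payload.
-- Rows from 15 January up to (not including) 15 February live in two
-- tables, so the query has to be stitched together by hand.
SELECT id, event_date, payload
FROM lessHugeTable_01
WHERE event_date >= '2013-01-15' AND event_date < '2013-02-01'
UNION ALL
SELECT id, event_date, payload
FROM lessHugeTable_02
WHERE event_date >= '2013-02-01' AND event_date < '2013-02-15';

-- A check for the "January data in the February table" worry:
-- count the rows in the February table that are not from February.
SELECT COUNT(*) AS stray_rows
FROM lessHugeTable_02
WHERE MONTH(event_date) <> 2;
```

Every extra month in the range means another branch in the UNION, and every application touching these tables has to build the same query.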

and now partitioning:


Partitioning Tables Pros

  1. You're letting MySQL take care of splitting your data for you, so no "this month = this table" logic is needed in your application

  2. No risk of January data getting into the February table

  3. Joining becomes easier since you have a single logical (if not physical) table

  4. If you are using MySQL 5.5 or newer then you can truncate individual partitions. Really handy for any housekeeping you might want to do
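As a rough sketch of what these pros look like in practice, assume tableHuge is partitioned by month on a DATE column called event_date, with partitions named like p2013_01. The column name, the partition names and the dates are all hypothetical, not something from the question.

```sql
-- Assumes tableHuge is partitioned by month on event_date, with
-- partitions named like p2013_01 (all hypothetical names).

-- A multi-month report is one ordinary query: no UNION and no
-- table-name logic in the application.
SELECT COUNT(*) AS row_count
FROM tableHuge
WHERE event_date >= '2013-01-01'
  AND event_date <  '2013-03-01';

-- Housekeeping (MySQL 5.5 or newer): empty one month's partition
-- while keeping the partition definition itself in place.
ALTER TABLE tableHuge TRUNCATE PARTITION p2013_01;
```

Truncating a partition removes all of its rows, so archive anything you still need before running it.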

Partitioning Tables Cons

  1. Potentially you have a much, much larger dataset to query. If you run a query that spans multiple partitions (or one that MySQL cannot narrow down to a few partitions), then it will probably take a while. Choose your partition key wisely!

  2. Probably more, but I am running out of time and wild assumptions!
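One way to check whether a query is hitting a single partition or all of them is to look at its execution plan. The sketch below again assumes a hypothetical tableHuge partitioned by month on a DATE column event_date; neither the column nor the dates come from the question.

```sql
-- On MySQL 5.5/5.6 write EXPLAIN PARTITIONS instead; from 5.7 on,
-- plain EXPLAIN already includes a "partitions" column.

-- Range condition directly on the partitioning column: MySQL can
-- restrict the scan to the matching partition(s).
EXPLAIN
SELECT COUNT(*) FROM tableHuge
WHERE event_date >= '2013-02-01' AND event_date < '2013-03-01';

-- Column wrapped in a function: partitions typically cannot be ruled
-- out, so expect every partition to be listed.
EXPLAIN
SELECT COUNT(*) FROM tableHuge
WHERE MONTH(event_date) = 2;
```

If the partitions column lists more partitions than you expected for your common queries, that is a hint that the partition key or the way the queries filter on it needs another look.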

PS There is a good answer on some of the points here




