Here is my situation:
这是我的情况:
- I have 4 queries running on a php dashboard showing asynchronous data.
- All 4 queries are fired simultaneously and on demand throughout the day.
- Each query joins around 3-5 tables (applying unique where conditions depending on the query).
- All 4 queries share 2 or 3 of the same tables...thus there is a lot of repeated code between the 4 queries.
- Each table can have up to 6 columns, hundreds of millions of rows, and up to 5gigs of data total
- Queries should return in under 30 seconds when NOT relying on cache, and a couple seconds or less when they can use the cache
我在php仪表板上运行了4个查询,显示异步数据。
所有4个查询都会在一天内同时按需激活。
每个查询加入3-5个表(根据查询在条件中应用唯一)。
所有4个查询共享2个或3个相同的表...因此在4个查询之间存在大量重复的代码。
每个表最多可包含6列,数亿行和最多5 GB的数据
不依赖缓存时,查询应该在30秒内返回,而当它们可以使用缓存时,查询应该在几秒或更短的时间内返回
My question is:
我的问题是:
Would it be possible/beneficial to apply inheritance to optimize queries? I could have a view that is a query for the 2-3 common tables all 4 of my main queries depend on, and then have 4 smaller queries that just apply their unique extra tables and conditions?
应用继承来优化查询是否可能/有益?我可以有一个视图,它是对我所有4个主要查询所依赖的2-3个公共表的查询,然后有4个较小的查询只应用它们唯一的额外表和条件?
Pros:
- It would be easier to maintain the queries down the road because I could just update the one view instead of repeatedly making changes in all 4 queries
- It might be faster because that one stored procedure/view would end up being cached most of the time
由于我可以只更新一个视图而不是在所有4个查询中重复进行更改,因此更容易维护查询
它可能更快,因为一个存储过程/视图最终会在大多数时间被缓存
Cons:
- Now instead of running 4 queries I would essentially be running 8 queries (since every time you run a query that calls the view/stored procedure you have to run that one too)
- Using the view might cause me to pull tons of extra data for every query
现在不是运行4个查询,而是基本上运行8个查询(因为每次运行调用视图/存储过程的查询时,您也必须运行该查询)
使用该视图可能会导致我为每个查询提取大量额外数据
Here's an example of two similar queries:
以下是两个类似查询的示例:
SELECT
SUM(visitor_cart_items.quantity * (TRIM(REPLACE(visitor_cart_items.price, '$', '')) + 0.0))
FROM
(SELECT DISTINCT
visitor_cart_items.id,
visitor_cart_items.price,
visitor_cart_items.quantity
FROM
messages_sent
JOIN session_guid ON session_guid.visitor_forms_id = messages_sent.visitor_forms_id
JOIN session_guid sessions ON sessions.session_guid = session_guid.session_guid
JOIN visitor_cart_items ON visitor_cart_items.cart_id = sessions.visitor_forms_id
WHERE
sessions.status_type_id = 1
AND sessions.website_id = 7
AND messages_sent.status_id = 1
AND sessions.updated_timestamp BETWEEN '2015-05-01' AND '2015-05-08') visitor_cart_items;
and
SELECT
COUNT(DISTINCT sessions.visitor_forms_id) all_saves
FROM
messages_sent
JOIN session_guid ON session_guid.visitor_forms_id = messages_sent.visitor_forms_id
JOIN session_guid sessions ON sessions.session_guid = session_guid.session_guid
JOIN visitor_forms ON visitor_forms.id = sessions.visitor_forms_id
WHERE
sessions.status_type_id = 1
AND sessions.website_id = 7
AND visitor_forms.form_type_id = 1 #1 for forms, 2 for carts
AND sessions.updated_timestamp BETWEEN '2015-05-01' AND '2015-05-08';
1 个解决方案
#1
My experience has been that trying to apply inheritance/OO style thinking at the database level is a very bad thing to do.
我的经验是尝试在数据库级别应用继承/ OO风格思考是一件非常糟糕的事情。
In general, simpler views tend to be OK... the query optimizers on different DBMS's are good at 'boiling down' the views and giving you good performance.
一般来说,更简单的视图往往是正常的......不同DBMS上的查询优化器擅长“煮沸”视图并为您提供良好的性能。
However, when you get more complicated and start using views with aggregate functions, user-defined functions and other errata to achieve code reuse, it usually comes at huge I/O and query plan expense. We went down that road and had to rip out those things and retreat. It's just not worth it in my opinion. Code duplication is a better problem to have than a horrendous query plan.
但是,当您变得更复杂并开始使用具有聚合函数,用户定义函数和其他勘误的视图来实现代码重用时,它通常会带来巨大的I / O和查询计划开销。我们走了那条路,不得不撕掉那些东西然后撤退。在我看来,这是不值得的。代码重复比一个可怕的查询计划更好。
#1
My experience has been that trying to apply inheritance/OO style thinking at the database level is a very bad thing to do.
我的经验是尝试在数据库级别应用继承/ OO风格思考是一件非常糟糕的事情。
In general, simpler views tend to be OK... the query optimizers on different DBMS's are good at 'boiling down' the views and giving you good performance.
一般来说,更简单的视图往往是正常的......不同DBMS上的查询优化器擅长“煮沸”视图并为您提供良好的性能。
However, when you get more complicated and start using views with aggregate functions, user-defined functions and other errata to achieve code reuse, it usually comes at huge I/O and query plan expense. We went down that road and had to rip out those things and retreat. It's just not worth it in my opinion. Code duplication is a better problem to have than a horrendous query plan.
但是,当您变得更复杂并开始使用具有聚合函数,用户定义函数和其他勘误的视图来实现代码重用时,它通常会带来巨大的I / O和查询计划开销。我们走了那条路,不得不撕掉那些东西然后撤退。在我看来,这是不值得的。代码重复比一个可怕的查询计划更好。