Consider the following postgres (version 9.4) database:
考虑以下postgres(版本9.4)数据库:
testbase=# select * from employee;
id | name
----+----------------------------------
1 | johnson, jack
2 | jackson, john
(2 rows)
testbase=# select * from worklog;
id | activity | employee | time
----+----------------------------------+----------+----------------------------
1 | department alpha | 1 | 2018-01-27 20:32:16.512677
2 | department beta | 1 | 2018-01-27 20:32:18.112356
5 | break | 1 | 2018-01-27 20:32:22.255563
3 | department gamma | 2 | 2018-01-27 20:32:20.073173
4 | department gamma | 2 | 2018-01-27 20:32:21.05962
(5 rows)
The column 'name' in table 'employee' is of type character(32)
and unique, the column 'employee' in 'worklog' references 'id' from the table 'employee'. The column 'id' is the primary key in either table.
表'employee'中的列'name'的类型为character(32)且唯一,'worklog'中的'employee'列引用'employee'表中的'id'。列'id'是两个表中的主键。
I can see all activities from a certain employee by issuing:
我可以通过签发来查看某个员工的所有活动:
testbase=# select * from worklog where employee=(select id from employee where name='johnson, jack');
id | activity | employee | time
----+----------------------------------+----------+----------------------------
1 | department alpha | 1 | 2018-01-27 20:32:16.512677
2 | department beta | 1 | 2018-01-27 20:32:18.112356
5 | break | 1 | 2018-01-27 20:32:22.255563
(3 rows)
I would rather like to simplify the query to
我宁愿简化查询
testbase=# select * from worklog where employee='johnson, jack';
For this I would change 'employee' to type character(32)
in 'worklog' and declare 'name' as primary key in table 'employee'. Column 'employee' in 'worklog' would, of course, reference 'name' from table 'employee'.
为此,我将'employee'更改为'worklog'中的字符(32),并将'name'声明为表'employee'中的主键。当然,'worklog'中的'employee'列会引用表'employee'中的'name'。
My question:
Will every new row in 'worklog' require additional 32 bytes for name of the 'employee' or will postgres internally just keep a pointer to the foreign field without duplicating the name for every new row?
'worklog'中的每一行都需要额外的32字节用于'employee'的名称,或者postgres内部只保留一个指向外部字段的指针而不重复每个新行的名称?
I suppose that the answer for my question is somewhere in the documentation but I could not find it. It would be very helpful if someone could provide an according link.
我想我的问题的答案在文档的某处,但我找不到它。如果有人可以提供相应的链接,将会非常有帮助。
PS: I did find this thread, however, there was no link to some official documentation. The behaviour might also have changed, since the thread is now over seven years old.
PS:我确实找到了这个帖子,但是,没有链接到一些官方文档。行为也可能已经改变,因为该线程已经超过七年了。
1 个解决方案
#1
2
Postgres will store the data that you tell it to store. There are some new databases that will do compression under the hood -- and Postgres might have features to enable that (I do not know all Postgres features).
Postgres将存储您告诉它存储的数据。有一些新的数据库将在引擎盖下进行压缩 - 而Postgres可能具有启用它的功能(我不知道Postgres的所有功能)。
But, you shouldn't do this. Integer primary keys are more efficient than strings for three reasons:
但是,你不应该这样做。整数主键比字符串更有效,原因有三:
- They are fixed length in bytes.
- They are shorter.
- Collations are not an issue.
它们是固定长度的字节数。
它们更短。
整理不是问题。
Stick with your original query, but write it using a join
:
坚持原始查询,但使用连接写入:
select wl.*
from worklog wl join
employee e
on wl.employee = e.id
where e.name = 'johnson, jack';
I suggest this because this is more consistent with how SQL works and makes it easier to choose multiple employees.
我建议这样做,因为这与SQL的工作方式更加一致,并且更容易选择多个员工。
If you want to see the name and not the id, create a view (say v_worklog
) and add in the employee name.
如果要查看名称而不是ID,请创建一个视图(例如v_worklog)并添加员工姓名。
#1
2
Postgres will store the data that you tell it to store. There are some new databases that will do compression under the hood -- and Postgres might have features to enable that (I do not know all Postgres features).
Postgres将存储您告诉它存储的数据。有一些新的数据库将在引擎盖下进行压缩 - 而Postgres可能具有启用它的功能(我不知道Postgres的所有功能)。
But, you shouldn't do this. Integer primary keys are more efficient than strings for three reasons:
但是,你不应该这样做。整数主键比字符串更有效,原因有三:
- They are fixed length in bytes.
- They are shorter.
- Collations are not an issue.
它们是固定长度的字节数。
它们更短。
整理不是问题。
Stick with your original query, but write it using a join
:
坚持原始查询,但使用连接写入:
select wl.*
from worklog wl join
employee e
on wl.employee = e.id
where e.name = 'johnson, jack';
I suggest this because this is more consistent with how SQL works and makes it easier to choose multiple employees.
我建议这样做,因为这与SQL的工作方式更加一致,并且更容易选择多个员工。
If you want to see the name and not the id, create a view (say v_worklog
) and add in the employee name.
如果要查看名称而不是ID,请创建一个视图(例如v_worklog)并添加员工姓名。