Inserting rows into a table that has a relationship with another table

In my database schema I have an entity that is identified. The identifier can be reused and thus there is a one-to-many relation with the entity. Example: A person can have a nickname. Nicknames are not unique and can be shared amongst many people. So the schema might look like:

PERSON
id
name
nickname_id

NICKNAME
id
name

The issue is that when inserting a new person, I have to first query NICKNAME to see if the nickname exists. If it doesn't then I have to create a row in NICKNAME. When inserting many persons, this can be slow as each person insertion results in a query to NICKNAME.

I could optimize large insertions by first querying Nickname for all the nicknames. JPA query language:

SELECT n FROM NICKNAME n WHERE n.name IN ('Krusty', 'Doppy', 'Flash', etc)

And then create the new nicknames as necessary, followed by setting nickname_id on the persons.

This complicates the software a bit, as it has to temporarily store nicknames in memory. Furthermore, some databases limit the number of parameters in an IN clause (for SQL Server it is 2100 or so), so I have to perform multiple queries.
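
For what it's worth, the chunked lookup described above might look roughly like the sketch below. It uses Python's built-in sqlite3 module with an in-memory database purely so it can be run as-is; the function name `resolve_nickname_ids`, the `chunk_size` parameter, the lowercase table names and the sample people are all made up for illustration and are not part of the original setup.

```python
import sqlite3

def resolve_nickname_ids(conn, names, chunk_size=500):
    """Return {nickname: id}, creating rows for nicknames that do not exist yet."""
    wanted = sorted(set(names))
    ids = {}
    # Look up existing nicknames in chunks so that no single IN list
    # exceeds the database's parameter limit.
    for start in range(0, len(wanted), chunk_size):
        chunk = wanted[start:start + chunk_size]
        placeholders = ",".join("?" * len(chunk))
        rows = conn.execute(
            f"SELECT name, id FROM nickname WHERE name IN ({placeholders})",
            chunk,
        )
        ids.update(rows.fetchall())
    # Create only the nicknames that were not found.
    for name in wanted:
        if name not in ids:
            cur = conn.execute("INSERT INTO nickname (name) VALUES (?)", (name,))
            ids[name] = cur.lastrowid
    return ids

# Hypothetical schema and data, mirroring PERSON / NICKNAME from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nickname (id INTEGER PRIMARY KEY, name TEXT UNIQUE)")
conn.execute(
    "CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, "
    "nickname_id INTEGER REFERENCES nickname(id))"
)
conn.execute("INSERT INTO nickname (name) VALUES ('Krusty')")

people = [("Herschel", "Krusty"), ("Barry", "Flash"),
          ("Wally", "Flash"), ("Dana", "Doppy")]
ids = resolve_nickname_ids(conn, [nick for _, nick in people], chunk_size=2)
conn.executemany(
    "INSERT INTO person (name, nickname_id) VALUES (?, ?)",
    [(name, ids[nick]) for name, nick in people],
)
```

The number of round trips grows with the number of distinct nicknames divided by the chunk size, rather than with the number of persons.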

I'm curious how others deal with this issue. More specifically, when a database is normalized and an entity has a relationship with another, inserting a new entity basically means having to check the other entity. For large inserts this can be slow unless the operation is lifted into the code domain. Is there some way to automatically insert the related table rows?

FYI, I'm using Hibernate's implementation of JPA.

4 solutions

#1

I'm not sure if an ORM can handle this, but in straight SQL you could:

Create a table of name/nickname pairs,

INSERT INTO NicknameTable SELECT Nickname FROM temp WHERE Nickname NOT IN (SELECT Nickname FROM NicknameTable)

Insert into main table knowing the Nickname exists.
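
The three steps above can be sketched end to end as follows. This is only an illustration using Python's sqlite3 with an in-memory database; the staging table is called `staging` here (the answer calls it temp), and the column names and sample rows are assumptions. DISTINCT is added so that a nickname appearing twice in the staging data is inserted only once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE nickname (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT,
                         nickname_id INTEGER REFERENCES nickname(id));
    CREATE TEMP TABLE staging (name TEXT, nickname TEXT);
    INSERT INTO nickname (name) VALUES ('Krusty');
""")

# Step 1: load the name/nickname pairs into the staging table.
pairs = [("Herschel", "Krusty"), ("Barry", "Flash"), ("Wally", "Flash")]
conn.executemany("INSERT INTO staging (name, nickname) VALUES (?, ?)", pairs)

# Step 2: add only the nicknames that are not already present.
conn.execute("""
    INSERT INTO nickname (name)
    SELECT DISTINCT s.nickname FROM staging s
    WHERE s.nickname NOT IN (SELECT n.name FROM nickname n)
""")

# Step 3: every nickname now exists, so a join resolves the ids.
conn.execute("""
    INSERT INTO person (name, nickname_id)
    SELECT s.name, n.id
    FROM staging s JOIN nickname n ON n.name = s.nickname
""")
```

The whole batch takes a fixed number of statements regardless of how many persons are loaded, which is the point of doing it in set-based SQL.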

In your example, you could just have a NULLable nickname column without another table, unless a person can have more than one nickname.

#2

Truthfully? I'd make nickname a varchar column in the Person table, and forget about the Nickname table. Nickname is an attribute of a person, not a separate entity.

Is this a simplified example, and your 'identifiers' really do benefit from the entity-relationships?

edit: Okay, understood this is just an artificial example. The question is a good one, because it comes up often enough.

MySQL supports a form of INSERT statement with an optional "...ON DUPLICATE KEY UPDATE..." clause. This is a MySQL extension rather than standard SQL, so support varies by database brand; other databases offer their own equivalents, such as PostgreSQL's INSERT ... ON CONFLICT. If you add a UNIQUE constraint to the identifier name in the Nickname table, a duplicate entry will invoke the UPDATE part of the clause (you can do a dummy update instead of changing anything).

CREATE TABLE Nickname (
  id SERIAL PRIMARY KEY,
  name VARCHAR(20) UNIQUE
);

INSERT INTO Nickname (name) VALUES ('Bill')
  ON DUPLICATE KEY UPDATE name = name;
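
To try the same idea without a MySQL server, here is a rough sketch in Python's sqlite3. Note that it swaps in SQLite's INSERT OR IGNORE, which is a close analogue of the dummy-update trick but not the same statement; the helper name `nickname_id` is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nickname (id INTEGER PRIMARY KEY, name TEXT UNIQUE)")

def nickname_id(conn, name):
    """Insert the nickname if it is missing, then return its id."""
    # With the UNIQUE constraint in place, a duplicate insert becomes a no-op.
    conn.execute("INSERT OR IGNORE INTO nickname (name) VALUES (?)", (name,))
    row = conn.execute(
        "SELECT id FROM nickname WHERE name = ?", (name,)
    ).fetchone()
    return row[0]

first = nickname_id(conn, "Bill")
second = nickname_id(conn, "Bill")  # duplicate: no new row is created
```

Both calls return the same id, so the caller never needs to know whether the nickname existed beforehand.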

#3

INSERT INTO Person(Name, NicknameID)
    VALUES(:name, (SELECT id FROM Nickname WHERE Name = :nickname))

If the INSERT fails because the nickname doesn't exist (the subquery then yields NULL, which is rejected only if NicknameID is declared NOT NULL), insert the nickname and then the person record.
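
A minimal sketch of that try-then-fall-back flow, using Python's sqlite3 so it can be run directly. It assumes NicknameID is declared NOT NULL, since otherwise the first INSERT would quietly store a NULL instead of failing; `add_person` and the sample names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE nickname (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT,
                         nickname_id INTEGER NOT NULL REFERENCES nickname(id));
""")

INSERT_PERSON = """
    INSERT INTO person (name, nickname_id)
    VALUES (:name, (SELECT id FROM nickname WHERE name = :nickname))
"""

def add_person(conn, name, nickname):
    params = {"name": name, "nickname": nickname}
    try:
        # Optimistic path: succeeds whenever the nickname already exists.
        conn.execute(INSERT_PERSON, params)
    except sqlite3.IntegrityError:
        # The subquery returned NULL, so the NOT NULL constraint rejected the row.
        conn.execute("INSERT INTO nickname (name) VALUES (?)", (nickname,))
        conn.execute(INSERT_PERSON, params)

add_person(conn, "Barry", "Flash")  # nickname missing: takes the fallback path
add_person(conn, "Wally", "Flash")  # nickname present: a single INSERT
```

This costs one statement per person when the nickname is already known and three when it is not, so it pays off when most nicknames are reused.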

I'm assuming that :name and :nickname identify host variables containing the user's name and nickname, and that the person.id column will be assigned a value automatically when it is omitted from the SQL. Adapt to suit your circumstances.

If you think most nicknames will in fact be unique, you could simply attempt to insert the nickname unconditionally, but ignore the error that occurs if the nickname already exists.
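
The unconditional variant could be sketched like this, again with Python's sqlite3 for convenience. It assumes a UNIQUE constraint on the nickname name, since that is what produces the error being ignored; `ensure_nickname` is a made-up helper name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nickname (id INTEGER PRIMARY KEY, name TEXT UNIQUE)")

def ensure_nickname(conn, name):
    """Try the insert; report whether a new row was created."""
    try:
        conn.execute("INSERT INTO nickname (name) VALUES (?)", (name,))
        return True
    except sqlite3.IntegrityError:
        # UNIQUE violation: the nickname already exists, which is fine here.
        return False

created = [ensure_nickname(conn, n) for n in ("Flash", "Krusty", "Flash")]
```

One caveat: IntegrityError also covers other constraint failures, so in a real schema it may be worth checking which constraint was violated before ignoring the error.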

#4

Alternatively, perhaps a 'MERGE' statement could help? It offers the option of inserting a new value or updating an existing value. Syntax and support vary by DB, but it is possibly more common than the 'ON DUPLICATE' option.
