Google Cloud Datastore: storing only unique entities

I am trying to learn NoSQL with Google Datastore but I am running into a problem with uniqueness.

Consider an ecommerce store that has categories and products.

You do not want two products of the same SKU in the database.

So I insert an entity with JSON:

{"sku": 1234, "product_name": "Test product"}

And it shows up with two fields. But then I can do that again and I have two or more identical products.

How do you avoid this? Can you make the sku field unique?

Do I need to do a query before insert?

The same issue arises with categories. Should I just use one entity for ALL my categories and structure it in my JSON?

What is a good common practice here?

3 solutions

#1

Create a new kind called 'sku'. When you create a new product, you'll want to do a transactional insert of both the product entity and the sku entity.

For example, let's say you want to add a new product of kind product with the id abc:

"product/abc" = {"sku": 1234, "product_name": "Test product"}

To ensure uniqueness on the property "sku", you'll always want to insert an entity with the kind name sku and the id that equals the property's value:

"sku/1234" = {"created": "2017-05-11"}

The above example entity has a property for created date - just something optional I threw in as part of the example.

Now, as long as you insert both of these as part of the same transaction, you will be ensuring that the "sku" property has a unique value. This works because:

An insert (unlike an upsert) fails if the sku entity for that number already exists.

The transaction ensures that writing the product entity (with the sku value) and the sku entity is atomic, so if the sku isn't unique, writing the sku entity will fail, causing the product entity write to fail as well.
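
The two reasons above can be illustrated with a small in-memory model. This is only a sketch of the semantics, not Datastore client code: InMemoryStore, insert_all, AlreadyExists and add_product are hypothetical names made up for the example, and a real client library exposes inserts and transactions through its own API.

```python
class AlreadyExists(Exception):
    """Raised when an insert targets a key that is already stored."""


class InMemoryStore:
    """Toy stand-in for a datastore: maps (kind, id) keys to entities."""

    def __init__(self):
        self.entities = {}

    def insert_all(self, writes):
        """Atomically insert every (key, entity) pair, or none of them."""
        # Validate first so that a failure leaves the store untouched.
        seen = set()
        for key, _ in writes:
            if key in self.entities or key in seen:
                raise AlreadyExists(key)
            seen.add(key)
        # Every key is free, so commit the whole batch.
        for key, entity in writes:
            self.entities[key] = entity


def add_product(store, product_id, sku, product_name):
    """Insert a product together with its sku marker entity."""
    store.insert_all([
        (("product", product_id), {"sku": sku, "product_name": product_name}),
        (("sku", sku), {"created": "2017-05-11"}),
    ])


store = InMemoryStore()
add_product(store, "abc", 1234, "Test product")

try:
    # Different product id, same sku: the sku marker entity collides.
    add_product(store, "xyz", 1234, "Another product")
    duplicate_rejected = False
except AlreadyExists:
    duplicate_rejected = True

print(duplicate_rejected)                    # True
print(("product", "xyz") in store.entities)  # False
```

The second product never reaches the store because the failed sku insert aborts the whole batch, which is the role the transaction plays in the answer.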

#2

You can use "sku" as an "id" (if it's a number) or "name" (if it's a string) for your entity, instead of storing "sku" as a property. Then it's guaranteed to be unique as it becomes part of the unique entity key.
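
A minimal sketch of what keying by sku buys you, assuming a plain dict stands in for the product kind (products, upsert and insert are hypothetical helpers, not a Datastore API). One thing to keep in mind: a plain put on an existing key replaces the stored entity instead of failing, so you never get duplicates, but rejecting a second write still takes an insert or an existence check inside a transaction.

```python
# Hypothetical in-memory stand-in for the "product" kind: sku -> entity.
products = {}


def upsert(sku, entity):
    """Write the entity under its sku, replacing any existing one."""
    products[sku] = entity


def insert(sku, entity):
    """Write the entity only if nothing is stored under this sku yet."""
    if sku in products:
        raise KeyError("sku %r already exists" % (sku,))
    products[sku] = entity


upsert(1234, {"product_name": "Test product"})
upsert(1234, {"product_name": "Renamed product"})  # replaces, no duplicate
print(len(products))  # 1

try:
    insert(1234, {"product_name": "Third attempt"})
    rejected = False
except KeyError:
    rejected = True
print(rejected)  # True
```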

#3

Data modeling is a big subject, but IMO there are two approaches you can choose from. This is more fundamental than specific to your question, but it should give you some ideas.

The first approach – storing a reference as a property

Think of it as a product that contains product variants ...

This approach is much the same as in the RDBMS world. You create products separately, and each product variant holds a reference to its product, similar to how foreign keys work in relational databases. So the product variant entities get a new property that contains a reference to the product they belong to. That product property actually contains the key of an entity of the Product kind. If that sounds confusing, here is how you can dissect it. I will use Python as an example:

from google.appengine.ext import ndb

# product model
class Product(ndb.Model):
    name = ndb.StringProperty()

# product variant model
class ProductVariant(ndb.Model):
    name = ndb.StringProperty()
    price = ndb.IntegerProperty()
    # key of the Product this variant belongs to
    product = ndb.KeyProperty(kind=Product)

hugoboss = Product(name="Hugo Boss", key=ndb.Key(Product, 'hugoboss'))
gap = Product(name="Gap", key=ndb.Key(Product, 'gap'))

pants1 = ProductVariant(name="Black pants", price=300, product=hugoboss.key)
pants2 = ProductVariant(name="Grey pants", price=200, product=hugoboss.key)
tshirt = ProductVariant(name="White graphic tshirt", price=10, product=gap.key)

# store the products as well as their variants
hugoboss.put()
gap.put()
pants1.put()
pants2.put()
tshirt.put()

# give me all variants that reference the hugoboss product
for pants in ProductVariant.query(ProductVariant.product == hugoboss.key).fetch(10):
    print(pants.name)

# You should get something like:
# Black pants
# Grey pants

The second approach – a product within the key

To take full advantage of it, you need to know how Bigtable (Datastore is built on top of Bigtable) sorts row keys and how data is organized around them. If you want a deep dive, there is a great paper: Bigtable: A Distributed Storage System for Structured Data.

from google.appengine.ext import ndb

# product model
class Product(ndb.Model):
    name = ndb.StringProperty()

# product variant model
class ProductVariant(ndb.Model):
    name = ndb.StringProperty()
    price = ndb.IntegerProperty()

# keys of the parent products
hugoboss = ndb.Key(Product, 'hugoboss')
gap = ndb.Key(Product, 'gap')

Product(name="Hugo Boss", key=hugoboss).put()
Product(name="Gap", key=gap).put()

# the parent key becomes part of each variant's own key
pants1 = ProductVariant(name="Black pants", price=300, parent=hugoboss)
pants2 = ProductVariant(name="Grey pants", price=200, parent=hugoboss)
tshirt = ProductVariant(name="White graphic tshirt", price=10, parent=gap)

pants1.put()
pants2.put()
tshirt.put()

# give me all variants whose ancestor is the hugoboss product
for pants in ProductVariant.query(ancestor=hugoboss).fetch(10):
    print(pants.name)

# You should get something like:
# Black pants
# Grey pants
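
To picture why sorted keys matter here, the sketch below models an ancestor lookup as a prefix scan over a sorted list of key paths. It is a deliberately simplified illustration, not how Datastore is implemented: the rows list, the descendants helper and the numeric variant ids 1 to 3 are all assumptions made up for the example.

```python
import bisect

# Each key is a tuple of path elements: (kind, id, kind, id, ...).
# Sorting keeps every child directly after its parent.
rows = sorted([
    (("Product", "gap"), "Gap"),
    (("Product", "gap", "ProductVariant", 3), "White graphic tshirt"),
    (("Product", "hugoboss"), "Hugo Boss"),
    (("Product", "hugoboss", "ProductVariant", 1), "Black pants"),
    (("Product", "hugoboss", "ProductVariant", 2), "Grey pants"),
])


def descendants(ancestor):
    """Return the values stored under keys that extend the ancestor key."""
    keys = [key for key, _ in rows]
    # Jump to the first row after the ancestor itself.
    start = bisect.bisect_right(keys, ancestor)
    result = []
    for key, value in rows[start:]:
        # Stop as soon as the key no longer starts with the ancestor path.
        if key[:len(ancestor)] != ancestor:
            break
        result.append(value)
    return result


print(descendants(("Product", "hugoboss")))  # ['Black pants', 'Grey pants']
print(descendants(("Product", "gap")))       # ['White graphic tshirt']
```

Because all children of a parent sit next to each other, the lookup reads one contiguous range instead of scanning every variant.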

The second approach is very powerful! I hope this helps.
