需要关于MMORPG数据模型设计、数据库访问和无堆栈python的建议

I'm engaged in developing a turn-based casual MMORPG game server.

我正在开发一个基于回合的休闲MMORPG游戏服务器。

The low level engine(NOT written by us) which handle networking, multi-threading, timer, inter-server communication, main game loop etc, was written by C++. The high level game logic was written by Python.

处理网络、多线程、定时器、服务器间通信、主游戏循环等的底层引擎(不是我们自己编写的)是c++编写的。高级的游戏逻辑是由Python编写的。

My question is about the data model design in our game.

我的问题是关于我们游戏中的数据模型设计。

At first we simply try to load all data of a player into RAM and a shared data cache server when client login and schedule a timer periodically flush data into data cache server and data cache server will persist data into database.

首先，当客户端登录并调度计时器将数据定期刷新到数据缓存服务器时，我们尝试将播放器的所有数据加载到RAM和共享数据缓存服务器，而数据缓存服务器将数据持久化到数据库中。

But we found this approach has some problems

但我们发现这种方法存在一些问题

1) Some data needs to be saved or checked instantly, such as quest progress, level up, item & money gain etc.

1)一些数据需要即时保存或查询，如任务进度，升级，物品和金钱收益等。

2) According to game logic, sometimes we need to query some offline player's data.

2)根据游戏逻辑，有时需要查询一些离线玩家的数据。

3) Some global game world data needs to be shared between different game instances which may be running on a different host or a different process on the same host. This is the main reason we need a data cache server sits between game logic server and database.

3)某些全局游戏世界数据需要在不同的游戏实例之间共享，这些实例可能在不同的主机上运行，也可能在同一主机上运行不同的进程。这是我们需要一个数据缓存服务器位于游戏逻辑服务器和数据库之间的主要原因。

4) Player needs freely switch between game instances.

4)玩家需要在游戏实例之间*切换。

Below is the difficulty we encountered in the past:

以下是我们在过去遇到的困难:

1) All data access operation should be asynchronized to avoid network I/O blocking the main game logic thread. We have to send message to database or cache server and then handle data reply message in callback function and continue proceed game logic. It quickly become painful to write some moderate complex game logic that needs to talk several times with db and the game logic is scattered in many callback functions makes it hard to understand and maintain.

1)对所有数据访问操作进行异步处理，避免网络I/O阻塞游戏主逻辑线程。我们必须向数据库或缓存服务器发送消息，然后在回调函数中处理数据应答消息并继续进行游戏逻辑。编写一些需要多次与db对话的中等复杂的游戏逻辑会很快变得很痛苦，并且游戏逻辑分散在许多回调函数中，这使得理解和维护变得很困难。

2) The ad-hoc data cache server makes things more complex, we hard to maintain data consistence and effectively update/load/refresh data.

2) ad hoc数据缓存服务器使事情变得更加复杂，难以保持数据一致性，有效更新/加载/刷新数据。

3) In-game data query is inefficient and cumbersome, game logic need to query many information such as inventory, item info, avatar state etc. Some transaction machanism is also needed, for example, if one step failed the entire operation should be rollback. We try to design a good data model system in RAM, building a lot of complex indexs to ease numerous information query, adding transaction support etc. Quickly I realized what we are building is a in-memory database system, we are reinventing the wheel...

3)游戏内数据查询效率低、繁琐，游戏逻辑需要查询很多信息，如库存、物品信息、头像状态等，还需要一些交易机制，如一步失败，整个操作需要回滚。我们尝试在RAM中设计一个好的数据模型系统，构建大量复杂的索引来简化大量的信息查询，增加事务支持等。

Finally I turn to the stackless python, we removed the cache server. All data are saved in database. Game logic server directly query database. With stackless python's micro tasklet and channel, we can write game logic in a synchronized way. It is far more easy to write and understand and productivity greatly improved.

最后我转向无堆栈的python，我们删除了缓存服务器。所有数据都保存在数据库中。游戏逻辑服务器直接查询数据库。通过stackless python的微微任务和通道，我们可以以一种同步的方式编写游戏逻辑。写作和理解要容易得多，生产率也大大提高。

In fact, the underlying DB access is also asynchronized: One client tasklet issue request to another dedicate DB I/O worker thread and the tasklet is blocked on a channel, but the entire main game logic is not blocked, other client's tasklet will be scheduled and run freely. When DB data reply the blocked tasklet will be waken up and continue to run on the 'break point'(continuation?).

实际上，底层的DB访问也是异步的:一个客户机微线程向另一个专用的DB I/O工作线程发出请求，并且微线程在通道上被阻塞，但是整个游戏逻辑没有被阻塞，其他客户机的微线程将被*调度和运行。当DB数据回复被阻塞的微线程时，将被唤醒并在“断点”(continuation?)上继续运行。

With above design, I have some questions:

以上设计，我有一些问题:

1) The DB access will be more frequently than previous cached solution, does the DB can support high frequent query/update operation? Does some mature cache solution such as redis, memcached is needed in near future?

1) DB访问将比之前缓存的解决方案更频繁，DB是否支持高频繁的查询/更新操作?是否需要一些成熟的缓存解决方案，如redis, memcached ?

2) Are there any serious pitfalls in my design? Can you guys give me some better suggestions, especially on in-game data management pattern.

我的设计有什么严重的缺陷吗?你们能给我一些更好的建议吗，特别是关于游戏中的数据管理模式。

Any suggestion would be appreciated, thanks.

如有任何建议将不胜感激，谢谢。

2 个解决方案

#1

I've worked with one MMO engine that operated in a somewhat similar fashion. It was written in Java, however, not Python.

我曾经使用过一种类似的MMO引擎。但是，它是用Java编写的，而不是Python。

With regards to your first set of points:

关于你的第一套观点:

1) async db access We actually went the other route, and avoided having a “main game logic thread.” All game logic tasks were spawned as new threads. The overhead of thread creation and destruction was completely lost in the noise floor compared to I/O. This also preserved the semantics of having each “task” as a reasonably straightforward method, instead of the maddening chain of callbacks that one otherwise ends up with (although there were still cases of this.) It also meant that all game code had to be concurrent, and we grew increasingly reliant upon immutable data objects with timestamps.

1)异步db访问我们实际上走了另一条路，并且避免了有一个“主要的游戏逻辑线程”。“所有的游戏逻辑任务都被衍生为新的线程。与I/O相比，在噪声层中，线程创建和销毁的开销完全丢失。这也保留了将每个“任务”作为一种相当简单的方法的语义，而不是让人抓狂的回调链(尽管仍然存在这种情况)。它还意味着所有的游戏代码都必须是并发的，我们越来越依赖具有时间戳的不可变数据对象。

2) ad-hoc cache We employed a lot of WeakReference objects (I believe Python has a similar concept?), and also made use of a split between the data objects, e.g. “Player”, and the “loader” (actually database access methods) e.g. “PlayerSQLLoader;” the instances kept a pointer to their Loader, and the Loaders were called by a global “factory” class that would handle cache lookups versus network or SQL loads. Every “Setter” method in a data class would call the method changed, which was an inherited boilerplate for myLoader.changed (this);

2)临时缓存我们雇佣很多WeakReference对象(我相信Python也有类似的概念?),并利用数据对象之间的分裂,例如“玩家”,和“机”(实际上是数据库访问方法)如“PlayerSQLLoader;“实例保存一个指向他们的装载机,加载器是由一个全球性“工厂”类,调用处理缓存或SQL查询和网络负载。数据类中的每个“Setter”方法都会调用已更改的方法，这是myLoader的继承样板。改变(这);

In order to handle loading objects from other active servers, we employed “proxy” objects that used the same data class (again, say, “Player,”) but the Loader class we associated was a network proxy that would (synchronously, but over gigabit local network) update the “master” copy of that object on another server; in turn, the “master” copy would call changed itself.

为了处理从其他活动服务器加载对象,我们使用“代理”的对象使用相同的数据类(再一次,说,“球员,”)但我们相关的类加载器是一个网络代理,(同步,但在千兆本地网络)更新的“主”副本,另一个服务器上的对象;反过来，“主”副本将调用changed本身。

Our SQL UPDATE logic had a timer. If the backend database had received an UPDATE of the object within the last ($n) seconds (we typically kept this around 5), it would instead add the object to a “dirty list.” A background timer task would periodically wake and attempt to flush any objects still on the “dirty list” to the database backend asynchronously.

我们的SQL更新逻辑有一个计时器。如果后端数据库在最后($n)秒内收到了对象的更新(通常是5秒)，那么它会将对象添加到“脏列表”中。后台计时器任务会定期唤醒并尝试将“脏列表”上的任何对象异步刷新到数据库后端。

Since the global factory maintained WeakReferences to all in-core objects, and would look for a single instantiated copy of a given game object on any live server, we would never attempt to instantiate a second copy of one game object backed by a single DB record, so the fact that the in-RAM state of the game might differ from the SQL image of it for up to 5 or 10 seconds at a time was inconsequential.

自全球工厂维护weakreference核心内的所有对象,并将寻找一个给定游戏的副本实例化对象在任何生活服务器,我们不会试图实例化一个游戏对象的第二个副本由一个单一的数据库记录,所以游戏的事实在ram中状态可能不同于它的SQL形象长达5或10秒一次是无关紧要的。

Our entire SQL system ran in RAM (yes, a lot of RAM) as a mirror to another server who tried valiantly to write to disc. (That poor machine burned out RAID drives on average of once every 3-4 months due to “old age.” RAID is good.)

我们的整个SQL系统运行在RAM(是的，很多RAM)中，作为另一个服务器的镜像，该服务器试图勇敢地向磁盘写入数据。(由于“年老”，那台可怜的机器平均每3-4个月就烧掉一次RAID驱动器。“RAID是好的。)

Notably, the objects had to be flushed to database when being removed from cache, e.g. due to exceeding the cache RAM allowance.

值得注意的是，当被从缓存中删除时，对象必须被刷新到数据库，例如由于超过缓存RAM的限制。

3) in-memory database … I hadn't run across this precise situation. We did have “transaction-like” logic, but it all occurred on the level of Java getters/setters.

3)内存数据库……我没有遇到过这种情况。我们确实有“事务类似”的逻辑，但它都发生在Java getter /setter级别。

And, in regards to your latter points:

关于后几点:

1) Yes, PostgreSQL and MySQL in particular deal well with this, particularly when you use a RAMdisk mirror of the database to attempt to minimize actual HDD wear and tear. In my experience, MMO's do tend to hammer the database more than is strictly necessary, however. Our “5 second rule”* was built specifically to avoid having to solve the problem “correctly.” Each of our setters would call changed. In our usage pattern, we found that an object typically had either 1 field changed, and then no activity for some time, or else had a “storm” of updates happen, where many fields changed in a row. Building proper transactions or so (e.g. informing the object that it was about to accept many writes, and should wait for a moment before saving itself to the DB) would have involved more planning, logic, and major rewrites of the system; so, instead, we bypassed the situation.

1)是的，PostgreSQL和MySQL处理得特别好，特别是当您使用数据库的RAMdisk镜像以尽量减少HDD的实际磨损时。然而，根据我的经验，MMO对数据库的攻击往往超过了严格要求。我们的“5秒规则”*是专门为避免必须正确地解决问题而构建的。“我们的每一个定居者都会打电话改变。在我们的使用模式中，我们发现一个对象通常有一个字段发生了变化，并且在一段时间内没有活动，或者出现了更新的“风暴”，其中许多字段都发生了变化。构建适当的事务(例如，通知对象它将接受许多写操作，并且应该等待片刻，然后将自己保存到数据库中)将涉及更多的系统规划、逻辑和重大重写;所以，我们绕过了这个问题。

2) Well, there's my design above :-)

嗯，以上是我的设计

In point of fact, the MMO engine I'm presently working on uses even more reliance upon in-RAM SQL databases, and (I hope) will be doing so a bit better. However, that system is being built using an Entity-Component-System model, rather than the OOP model that I described above.

事实上，我目前正在开发的MMO引擎更依赖于ram中的SQL数据库，(我希望)这样做会更好一些。然而，这个系统是使用实体-组件-系统模型构建的，而不是我上面描述的OOP模型。

If you already are based on an OOP model, shifting to ECS is a pretty paradigm shift and, if you can make OOP work for your purposes, it's probably better to stick with what your team already knows.

如果您已经基于OOP模型，那么转向ECS是一种很好的范式转换，并且如果您能够使OOP为您的目的工作，那么最好坚持您的团队已经知道的内容。

*- “the 5 second rule” is a colloquial US “folk belief” that after dropping food on the floor, it's still OK to eat it if you pick it up within 5 seconds.

*-“五秒规则”是美国的一个口语化的“民间信仰”，当你把食物掉在地上之后，如果你在5秒内把食物捡起来的话，还是可以吃的。

#2

It's difficult to comment on the entire design/datamodel without greater understanding of the software, but it sounds like your application could benefit from an in-memory database.* Backing up such databases to disk is (relatively speaking) a cheap operation. I've found that it is generally faster to:

如果不深入了解软件，就很难对整个设计/数据模型进行评论，但您的应用程序似乎可以从内存中数据库中获益。将此类数据库备份到磁盘(相对而言)是一种廉价的操作。我发现，通常来说:

A) Create an in-memory database, create a table, insert a million** rows into the given table, and then back-up the entire database to disk

A)创建一个内存中的数据库，创建一个表，在给定的表中插入一百万**行，然后将整个数据库备份到磁盘

than

比

B) Insert a million** rows into a table in a disk-bound database.

B)向磁盘绑定数据库中的表插入一百万**行。

Obviously, single record insertions/updates/deletions also run faster in-memory. I've had success using JavaDB/Apache Derby for in-memory databases.

显然，单记录插入/更新/删除也在内存中运行得更快。我已经成功地将JavaDB/Apache Derby用于内存中数据库。

*Note that the database need not be embedded in your game server. **A million may not be an ideal size for this example.

注意，数据库不需要嵌入到游戏服务器中。**百万可能不是这个例子的理想尺寸。

#1