Preface: I'm attemping to use the repository pattern in a MVC architecture with relational databases.
前言:我正尝试在具有关系数据库的MVC体系结构中使用存储库模式。
I've recently started learning TDD in PHP, and I'm realizing that my database is coupled much too closely with the rest of my application. I've read about repositories, and using an IoC container to "inject" it into my controllers. Very cool stuff. But now have some practical questions about repository design. Consider the follow example.
我最近开始学习PHP中的TDD,我意识到我的数据库与我的应用程序的其他部分耦合得太紧密了。我读过有关存储库的文章,并使用IoC容器将其“注入”到控制器中。非常酷的东西。但是现在有一些关于存储库设计的实际问题。考虑下例。
<?php
class DbUserRepository implements UserRepositoryInterface
{
protected $db;
public function __construct($db)
{
$this->db = $db;
}
public function findAll()
{
}
public function findById($id)
{
}
public function findByName($name)
{
}
public function create($user)
{
}
public function remove($user)
{
}
public function update($user)
{
}
}
Issue #1: Too many fields
All of these find methods use a select all fields (SELECT *
) approach. However, in my apps I'm always trying to limit the number of fields I get, as this often adds overhead and slows things down. For those using this pattern, how do you deal with this?
所有这些查找方法都使用select All字段(select *)方法。然而,在我的应用程序中,我总是试图限制我得到的字段的数量,因为这通常会增加开销并降低速度。对于使用这种模式的人,您如何处理这种情况?
Issue #2: Too many methods
While this class looks nice right now, I know that in a real world app I need a lot more methods. For example:
虽然这个类现在看起来不错,但我知道在现实应用程序中我需要更多的方法。例如:
- findAllByNameAndStatus
- findAllByNameAndStatus
- findAllInCountry
- findAllInCountry
- findAllWithEmailAddressSet
- findAllWithEmailAddressSet
- findAllByAgeAndGender
- findAllByAgeAndGender
- findAllByAgeAndGenderOrderByAge
- findAllByAgeAndGenderOrderByAge
- Etc.
- 等。
As you can see, there could be very, very long list of possible methods. And then if you add in the field selection issue above, the problem worsens. In the past I'd normally just put all this logic right in my controller:
如你所见,可能的方法有很长一长串。然后如果你加上上面的字段选择问题,这个问题就会恶化。在过去,我通常只是把所有的逻辑放在我的控制器里:
<?php
class MyController
{
public function users()
{
$users = User::select('name, email, status')->byCountry('Canada')->orderBy('name')->rows()
return View::make('users', array('users' => $users))
}
}
With my repository approach, I don't want to end up with this:
使用我的存储库方法,我不想以以下方式结束:
<?php
class MyController
{
public function users()
{
$users = $this->repo->get_first_name_last_name_email_username_status_by_country_order_by_name('Canada');
return View::make('users', array('users' => $users))
}
}
Issue #3: Impossible to match an interface
I see the benefit in using interfaces for repositories, so I can swap out my implementation (for testing purposes or other). My understanding of interfaces is that they define a contract that an implementation must follow. This is great until you start adding additional methods to your repositories like findAllInCountry()
. Now I need to update my interface to also have this method, otherwise other implementations may not have it, and that could break my application. By this feels insane...a case of the tail wagging the dog.
我看到了为存储库使用接口的好处,因此我可以交换我的实现(用于测试目的或其他目的)。我对接口的理解是,它们定义了一个实现必须遵循的契约。这很好,直到您开始向您的存储库(如findAllInCountry())添加其他方法为止。现在我需要更新我的接口,使它也具有这个方法,否则其他实现可能没有它,这可能会破坏我的应用程序。这个感觉疯了…尾巴摇着狗的情形。
Specification Pattern?
This leads me to believe that repository should only have a fixed number of methods (like save()
, remove()
, find()
, findAll()
, etc). But then how do I run specific lookups? I've heard of the Specification Pattern, but it seems to me that this only reduces an entire set of records (via IsSatisfiedBy()
), which clearly has major performance issues if you're pulling from a database.
这使我相信存储库应该只有固定数量的方法(如save()、remove()、find()、findAll()等)。但是我如何运行特定的查找呢?我听说过规范模式,但在我看来,这只减少了一组记录(通过IsSatisfiedBy())),如果从数据库中提取,这组记录显然存在重大性能问题。
Help?
Clearly I need to rethink things a little when working with repositories. Can anyone enlighten on how this is best handled?
显然,在使用存储库时,我需要重新考虑一下。有人能告诉我如何处理这个问题吗?
6 个解决方案
#1
160
I thought I'd take a crack at answering my own question. What follows is just one way of solving the issues 1-3 in my original question.
我想我可以试着回答我自己的问题。下面是解决问题1-3的一种方法。
Disclaimer: I may not always use the right terms when describing patterns or techniques. Sorry for that.
免责声明:在描述模式或技术时,我可能不会总是使用正确的术语。对不起。
The Goals:
- Create a complete example of a basic controller for viewing and editing
Users
. - 创建一个用于查看和编辑用户的基本控制器的完整示例。
- All code must be fully testable and mockable.
- 所有代码必须是完全可测试和可模拟的。
- The controller should have no idea where the data is stored (meaning it can be changed).
- 控制器应该不知道数据存储在哪里(意味着可以更改数据)。
- Example to show a SQL implementation (most common).
- 示例显示SQL实现(最常见)。
- For maximum performance, controllers should only receive the data they need—no extra fields.
- 为了获得最大的性能,控制器应该只接收它们需要的数据——不需要额外的字段。
- Implementation should leverage some type of data mapper for ease of development.
- 实现应该利用某种类型的数据映射器来简化开发。
- Implementation should have the ability to perform complex data lookups.
- 实现应该具有执行复杂数据查找的能力。
The Solution
I'm splitting my persistent storage (database) interaction into two categories: R (Read) and CUD (Create, Update, Delete). My experience has been that reads are really what causes an application to slow down. And while data manipulation (CUD) is actually slower, it happens much less frequently, and is therefore much less of a concern.
我将持久存储(数据库)交互分为两类:R (Read)和CUD(创建、更新、删除)。我的经验是,读取是导致应用程序变慢的真正原因。虽然数据操作(CUD)实际上比较慢,但发生的频率要低得多,因此也就不那么令人担心了。
CUD (Create, Update, Delete) is easy. This will involve working with actual models, which are then passed to my Repositories
for persistence. Note, my repositories will still provide a Read method, but simply for object creation, not display. More on that later.
CUD(创建、更新、删除)很简单。这将涉及使用实际的模型,然后将模型传递给我的存储库以实现持久性。注意,我的存储库仍将提供读取方法,但仅用于对象创建,而不是显示。稍后将进行更详细的讨论。
R (Read) is not so easy. No models here, just value objects. Use arrays if you prefer. These objects may represent a single model or a blend of many models, anything really. These are not very interesting on their own, but how they are generated is. I'm using what I'm calling Query Objects
.
读书并不是那么容易。这里没有模型,只有值对象。如果愿意,可以使用数组。这些对象可以表示单个模型或多个模型的混合,任何东西。它们本身并不是很有趣,但是它们是如何产生的。我使用的是查询对象。
The Code:
User Model
Let's start simple with our basic user model. Note that there is no ORM extending or database stuff at all. Just pure model glory. Add your getters, setters, validation, whatever.
让我们从最基本的用户模型开始。注意,根本没有ORM扩展或数据库之类的东西。单纯的模型的荣耀。添加getter、setter、验证等等。
class User
{
public $id;
public $first_name;
public $last_name;
public $gender;
public $email;
public $password;
}
Repository Interface
Before I create my user repository, I want to create my repository interface. This will define the "contract" that repositories must follow in order to be used by my controller. Remember, my controller will not know where the data is actually stored.
在创建用户存储库之前,我想要创建存储库接口。这将定义存储库必须遵循的“契约”,以便由我的控制器使用。记住,我的控制器不会知道数据实际存储在哪里。
Note that my repositories will only every contain these three methods. The save()
method is responsible for both creating and updating users, simply depending on whether or not the user object has an id set.
注意,我的存储库将只包含这三个方法。save()方法负责创建和更新用户,这取决于用户对象是否具有id集。
interface UserRepositoryInterface
{
public function find($id);
public function save(User $user);
public function remove(User $user);
}
SQL Repository Implementation
Now to create my implementation of the interface. As mentioned, my example was going to be with an SQL database. Note the use of a data mapper to prevent having to write repetitive SQL queries.
现在创建接口的实现。如前所述,我的示例将使用SQL数据库。注意数据映射器的使用,以避免必须编写重复的SQL查询。
class SQLUserRepository implements UserRepositoryInterface
{
protected $db;
public function __construct(Database $db)
{
$this->db = $db;
}
public function find($id)
{
// Find a record with the id = $id
// from the 'users' table
// and return it as a User object
return $this->db->find($id, 'users', 'User');
}
public function save(User $user)
{
// Insert or update the $user
// in the 'users' table
$this->db->save($user, 'users');
}
public function remove(User $user)
{
// Remove the $user
// from the 'users' table
$this->db->remove($user, 'users');
}
}
Query Object Interface
Now with CUD (Create, Update, Delete) taken care of by our repository, we can focus on the R (Read). Query objects are simply an encapsulation of some type of data lookup logic. They are not query builders. By abstracting it like our repository we can change it's implementation and test it easier. An example of a Query Object might be an AllUsersQuery
or AllActiveUsersQuery
, or even MostCommonUserFirstNames
.
现在,有了库管理的CUD(创建、更新、删除),我们就可以关注R(读取)了。查询对象只是某种类型的数据查找逻辑的封装。它们不是查询生成器。通过像我们的存储库那样抽象它,我们可以更改它的实现并更容易地测试它。查询对象的一个示例可能是一个AllUsersQuery或AllActiveUsersQuery,甚至是最常见的用户名。
You may be thinking "can't I just create methods in my repositories for those queries?" Yes, but here is why I'm not doing this:
您可能会想“我不能在我的存储库中为这些查询创建方法吗?”是的,但我不这么做的原因是:
- My repositories are meant for working with model objects. In a real world app, why would I ever need to get the
password
field if I'm looking to list all my users? - 我的存储库是用来处理模型对象的。在现实世界的应用程序中,如果我要列出所有的用户,为什么还需要输入密码字段?
- Repositories are often model specific, yet queries often involve more than one model. So what repository do you put your method in?
- 存储库通常是特定于模型的,但是查询通常涉及多个模型。那么,你把你的方法放在什么存储库中呢?
- This keeps my repositories very simple—not an bloated class of methods.
- 这使我的存储库非常简单——不是一种臃肿的方法。
- All queries are now organized into their own classes.
- 所有查询现在都被组织到它们自己的类中。
- Really, at this point, repositories exist simply to abstract my database layer.
- 实际上,在这一点上,存储库的存在只是为了抽象我的数据库层。
For my example I'll create a query object to lookup "AllUsers". Here is the interface:
在我的示例中,我将创建一个查询对象来查找“AllUsers”。这是接口:
interface AllUsersQueryInterface
{
public function fetch($fields);
}
Query Object Implementation
This is where we can use a data mapper again to help speed up development. Notice that I am allowing one tweak to the returned dataset—the fields. This is about as far as I want to go with manipulating the performed query. Remember, my query objects are not query builders. They simply perform a specific query. However, since I know that I'll probably be using this one a lot, in a number of different situations, I'm giving myself the ability to specify the fields. I never want to return fields I don't need!
在这里,我们可以再次使用数据映射器来帮助加速开发。注意,我允许对返回的数据集——字段进行一次调整。这就是我想要操作执行的查询的地方。记住,我的查询对象不是查询生成器。它们只是执行一个特定的查询。但是,由于我知道我可能会经常使用这个,在许多不同的情况下,我给自己指定字段的能力。我不想返回我不需要的字段!
class AllUsersQuery implements AllUsersQueryInterface
{
protected $db;
public function __construct(Database $db)
{
$this->db = $db;
}
public function fetch($fields)
{
return $this->db->select($fields)->from('users')->orderBy('last_name, first_name')->rows();
}
}
Before moving on to the controller, I want to show another example to illustrate how powerful this is. Maybe I have a reporting engine and need to create a report for AllOverdueAccounts
. This could be tricky with my data mapper, and I may want to write some actual SQL
in this situation. No problem, here is what this query object could look like:
在继续讨论控制器之前,我想展示另一个示例来说明它的强大功能。也许我有一个报告引擎,需要为AllOverdueAccounts创建一个报告。对于我的数据映射器来说,这可能有点棘手,在这种情况下,我可能需要编写一些实际的SQL。没问题,这里是这个查询对象的样子:
class AllOverdueAccountsQuery implements AllOverdueAccountsQueryInterface
{
protected $db;
public function __construct(Database $db)
{
$this->db = $db;
}
public function fetch()
{
return $this->db->query($this->sql())->rows();
}
public function sql()
{
return "SELECT...";
}
}
This nicely keeps all my logic for this report in one class, and it's easy to test. I can mock it to my hearts content, or even use a different implementation entirely.
这很好地将我的报告逻辑保存在一个类中,并且很容易测试。我可以将它模拟到我的心里,甚至完全使用不同的实现。
The Controller
Now the fun part—bringing all the pieces together. Note that I am using dependency injection. Typically dependencies are injected into the constructor, but I actually prefer to inject them right into my controller methods (routes). This minimizes the controller's object graph, and I actually find it more legible. Note, if you don't like this approach, just use the traditional constructor method.
现在有趣的部分是把所有的部分组合在一起。注意,我正在使用依赖项注入。通常依赖项被注入到构造函数中,但实际上我更喜欢将它们注入到我的控制器方法(路由)中。这最小化了控制器的对象图,实际上我发现它更清晰。注意,如果不喜欢这种方法,只需使用传统的构造函数方法。
class UsersController
{
public function index(AllUsersQueryInterface $query)
{
// Fetch user data
$users = $query->fetch(['first_name', 'last_name', 'email']);
// Return view
return Response::view('all_users.php', ['users' => $users]);
}
public function add()
{
return Response::view('add_user.php');
}
public function insert(UserRepositoryInterface $repository)
{
// Create new user model
$user = new User;
$user->first_name = $_POST['first_name'];
$user->last_name = $_POST['last_name'];
$user->gender = $_POST['gender'];
$user->email = $_POST['email'];
// Save the new user
$repository->save($user);
// Return the id
return Response::json(['id' => $user->id]);
}
public function view(SpecificUserQueryInterface $query, $id)
{
// Load user data
if (!$user = $query->fetch($id, ['first_name', 'last_name', 'gender', 'email'])) {
return Response::notFound();
}
// Return view
return Response::view('view_user.php', ['user' => $user]);
}
public function edit(SpecificUserQueryInterface $query, $id)
{
// Load user data
if (!$user = $query->fetch($id, ['first_name', 'last_name', 'gender', 'email'])) {
return Response::notFound();
}
// Return view
return Response::view('edit_user.php', ['user' => $user]);
}
public function update(UserRepositoryInterface $repository)
{
// Load user model
if (!$user = $repository->find($id)) {
return Response::notFound();
}
// Update the user
$user->first_name = $_POST['first_name'];
$user->last_name = $_POST['last_name'];
$user->gender = $_POST['gender'];
$user->email = $_POST['email'];
// Save the user
$repository->save($user);
// Return success
return true;
}
public function delete(UserRepositoryInterface $repository)
{
// Load user model
if (!$user = $repository->find($id)) {
return Response::notFound();
}
// Delete the user
$repository->delete($user);
// Return success
return true;
}
}
Final Thoughts:
The important things to note here are that when I'm modifying (creating, updating or deleting) entities, I'm working with real model objects, and performing the persistance through my repositories.
这里需要注意的重要事项是,当我修改(创建、更新或删除)实体时,我使用的是真实的模型对象,并通过我的存储库执行持久化。
However, when I'm displaying (selecting data and sending it to the views) I'm not working with model objects, but rather plain old value objects. I only select the fields I need, and it's designed so I can maximum my data lookup performance.
但是,当我显示(选择数据并将其发送到视图)时,我不是在处理模型对象,而是在处理普通的旧值对象。我只选择我需要的字段,它的设计使我可以最大限度地提高数据查找性能。
My repositories stay very clean, and instead this "mess" is organized into my model queries.
我的存储库保持非常干净,相反,这种“混乱”被组织到我的模型查询中。
I use a data mapper to help with development, as it's just ridiculous to write repetitive SQL for common tasks. However, you absolutely can write SQL where needed (complicated queries, reporting, etc.). And when you do, it's nicely tucked away into a properly named class.
我使用数据映射器来帮助开发,因为为常见任务编写重复的SQL是非常可笑的。但是,您绝对可以在需要的地方编写SQL(复杂的查询、报告等)。当你这样做的时候,它就会很好地隐藏在一个正确命名的类中。
I'd love to hear your take on my approach!
我很想听听你对我方法的看法!
July 2015 Update:
2015年7月更新:
I've been asked in the comments where I ended up with all this. Well, not that far off actually. Truthfully, I still don't really like repositories. I find them overkill for basic lookups (especially if you're already using an ORM), and messy when working with more complicated queries.
我在评论中被问到这些问题。好吧,其实没那么远。说实话,我还是不喜欢仓库。我发现它们对于基本查找(特别是当您已经使用ORM)和处理更复杂的查询时都非常混乱。
I generally work with an ActiveRecord style ORM, so most often I'll just reference those models directly throughout my application. However, in situations where I have more complex queries, I'll use query objects to make these more reusable. I should also note that I always inject my models into my methods, making them easier to mock in my tests.
我通常使用ActiveRecord风格的ORM,所以大多数情况下,我将在整个应用程序中直接引用这些模型。但是,在遇到更复杂的查询时,我将使用查询对象使这些查询更可重用。我还应该注意,我总是将模型注入到我的方法中,使它们更容易在测试中被模仿。
#2
38
Based on my experience, here are some answers to your questions:
根据我的经验,这里有一些问题的答案:
Q: How do we deal with bringing back fields we don't need?
问:我们如何处理把我们不需要的土地带回来?
A: From my experience this really boils down to dealing with complete entities versus ad-hoc queries.
答:根据我的经验,这实际上可以归结为处理完整的实体和临时查询。
A complete entity is something like a User
object. It has properties and methods, etc. It's a first class citizen in your codebase.
一个完整的实体就像一个用户对象。它有属性和方法等等。它是你代码库中的一级公民。
An ad-hoc query returns some data, but we don't know anything beyond that. As the data gets passed around the application, it is done so without context. Is it a User
? A User
with some Order
information attached? We don't really know.
一个特别的查询返回一些数据,但是除了这些我们什么都不知道。当数据在应用程序中传递时,它是在没有上下文的情况下完成的。这是一个用户吗?一个有订单信息的用户?我们真的不知道。
I prefer working with full entities.
我更喜欢与完整的实体一起工作。
You are right that you will often bring back data you won't use, but you can address this in various ways:
你是对的,你经常会带回一些你不会用到的数据,但是你可以用不同的方式来解决这个问题:
- Aggressively cache the entities so you only pay the read price once from the database.
- 积极地缓存实体,以便只从数据库中读取一次。
- Spend more time modeling your entities so they have good distinctions between them. (Consider splitting a large entity into two smaller entities, etc.)
- 花更多的时间为实体建模,以便它们之间有很好的区别。(考虑把一个大的实体分成两个小的实体,等等。)
- Consider having multiple versions of entities. You can have a
User
for the back end and maybe aUserSmall
for AJAX calls. One might have 10 properties and one has 3 properties. - 考虑拥有多个实体版本。您可以为后端提供一个用户,也可以为AJAX调用提供一个UserSmall。一个可能有10个属性,一个有3个属性。
The downsides of working with ad-hoc queries:
处理特别查询的缺点:
- You end up with essentially the same data across many queries. For example, with a
User
, you'll end up writing essentially the sameselect *
for many calls. One call will get 8 of 10 fields, one will get 5 of 10, one will get 7 of 10. Why not replace all with one call that gets 10 out of 10? The reason this is bad is that it is murder to re-factor/test/mock. - 在许多查询中,最终得到的是相同的数据。例如,对于用户,您最终将为许多调用编写本质上相同的select *。一个调用会得到10个字段中的8个,一个会得到5个10,一个会得到7个10。为什么不把所有的都换成一个10次的电话呢?这很糟糕的原因是重构/测试/模拟是谋杀。
- It becomes very hard to reason at a high level about your code over time. Instead of statements like "Why is the
User
so slow?" you end up tracking down one-off queries and so bug fixes tend to be small and localized. - 随着时间的推移,很难对代码进行高水平的推理。而不是像“为什么用户这么慢?”这样的语句,您最终会跟踪一次性的查询,因此错误修复往往是小型的、本地化的。
- It's really hard to replace the underlying technology. If you store everything in MySQL now and want to move to MongoDB, it's a lot harder to replace 100 ad-hoc calls than it is a handful of entities.
- 真的很难取代基础技术。如果您现在将所有东西都存储在MySQL中,并希望迁移到MongoDB,那么要替换100个临时调用要比替换为数不多的实体要困难得多。
Q: I will have too many methods in my repository.
问:我的资源库中将有太多的方法。
A: I haven't really seen any way around this other than consolidating calls. The method calls in your repository really map to features in your application. The more features, the more data specific calls. You can push back on features and try to merge similar calls into one.
A:除了合并电话之外,我还没见过别的办法。存储库中的方法调用实际上映射到应用程序中的特性。特性越多,特定于数据的调用就越多。您可以回推特性,并尝试将类似的调用合并为一个。
The complexity at the end of the day has to exist somewhere. With a repository pattern we've pushed it into the repository interface instead of maybe making a bunch of stored procedures.
一天结束时的复杂性必须存在于某处。使用存储库模式,我们将它推入存储库接口,而不是生成一堆存储过程。
Sometimes I have to tell myself, "Well it had to give somewhere! There are no silver bullets."
有时我不得不对自己说:“它总得有个归宿吧!”没有银弹。”
#3
11
I use the following interfaces:
我使用以下接口:
-
Repository
- loads, inserts, updates and deletes entities - 存储库——加载、插入、更新和删除实体
-
Selector
- finds entities based on filters, in a repository - 选择器——在存储库中基于过滤器查找实体
-
Filter
- encapsulates the filtering logic - 过滤器——封装过滤逻辑
My Repository
is database agnostic; in fact it doesn't specify any persistence; it could be anything: SQL database, xml file, remote service, an alien from outer space etc. For searching capabilities, the Repository
constructs an Selector
which can be filtered, LIMIT
-ed, sorted and counted. In the end, the selector fetches one or more Entities
from the persistence.
我的存储库是数据库无关的;事实上,它没有指定任何持久性;它可以是任何东西:SQL数据库、xml文件、远程服务、来自外部空间的异类等等。最后,选择器从持久性中获取一个或多个实体。
Here is some sample code:
以下是一些示例代码:
<?php
interface Repository
{
public function addEntity(Entity $entity);
public function updateEntity(Entity $entity);
public function removeEntity(Entity $entity);
/**
* @return Entity
*/
public function loadEntity($entityId);
public function factoryEntitySelector():Selector
}
interface Selector extends \Countable
{
public function count();
/**
* @return Entity[]
*/
public function fetchEntities();
/**
* @return Entity
*/
public function fetchEntity();
public function limit(...$limit);
public function filter(Filter $filter);
public function orderBy($column, $ascending = true);
public function removeFilter($filterName);
}
interface Filter
{
public function getFilterName();
}
Then, one implementation:
然后,一个实现:
class SqlEntityRepository
{
...
public function factoryEntitySelector()
{
return new SqlSelector($this);
}
...
}
class SqlSelector implements Selector
{
...
private function adaptFilter(Filter $filter):SqlQueryFilter
{
return (new SqlSelectorFilterAdapter())->adaptFilter($filter);
}
...
}
class SqlSelectorFilterAdapter
{
public function adaptFilter(Filter $filter):SqlQueryFilter
{
$concreteClass = (new StringRebaser(
'Filter\\', 'SqlQueryFilter\\'))
->rebase(get_class($filter));
return new $concreteClass($filter);
}
}
The ideea is that the generic Selector
uses Filter
but the implementation SqlSelector
uses SqlFilter
; the SqlSelectorFilterAdapter
adapts a generic Filter
to a concrete SqlFilter
.
ideea是泛型选择器使用过滤器,而实现SqlSelector使用SqlFilter;SqlSelectorFilterAdapter将通用过滤器改编为具体的SqlFilter。
The client code creates Filter
objects (that are generic filters) but in the concrete implementation of the selector those filters are transformed in SQL filters.
客户端代码创建过滤器对象(通用过滤器),但是在选择器的具体实现中,这些过滤器在SQL过滤器中被转换。
Other selector implementations, like InMemorySelector
, transform from Filter
to InMemoryFilter
using their specific InMemorySelectorFilterAdapter
; so, every selector implementation comes with its own filter adapter.
其他选择器实现,比如InMemorySelector,使用特定的InMemorySelectorFilterAdapter从过滤器转换到InMemoryFilter;因此,每个选择器实现都有自己的筛选器适配器。
Using this strategy my client code (in the bussines layer) doesn't care about a specific repository or selector implementation.
使用此策略,我的客户机代码(在bussines层中)不关心特定的存储库或选择器实现。
/** @var Repository $repository*/
$selector = $repository->factoryEntitySelector();
$selector->filter(new AttributeEquals('activated', 1))->limit(2)->orderBy('username');
$activatedUserCount = $selector->count(); // evaluates to 100, ignores the limit()
$activatedUsers = $selector->fetchEntities();
P.S. This is a simplification of my real code
这是我的真实代码的简化
#4
3
I can only comment on the way we (at my company) deal with this. First of all performance is not too much of an issue for us, but having clean/proper code is.
我只能评论我们(在我的公司)处理这件事的方式。首先,对于我们来说,性能并不是问题,但是拥有干净/正确的代码就足够了。
First of all we define Models such as a UserModel
that uses an ORM to create UserEntity
objects. When a UserEntity
is loaded from a model all fields are loaded. For fields referencing foreign entities we use the appropriate foreign model to create the respective entities. For those entities the data will be loaded ondemand. Now your initial reaction might be ...???...!!! let me give you an example a bit of an example:
首先,我们定义模型,比如使用ORM创建UserEntity对象的UserModel。当从模型加载UserEntity时,将加载所有字段。对于引用外部实体的字段,我们使用适当的外部模型来创建各自的实体。对于这些实体,数据将按需加载。现在你的第一反应可能是…?? !我举个例子
class UserEntity extends PersistentEntity
{
public function getOrders()
{
$this->getField('orders'); //OrderModel creates OrderEntities with only the ID's set
}
}
class UserModel {
protected $orm;
public function findUsers(IGetOptions $options = null)
{
return $orm->getAllEntities(/*...*/); // Orm creates a list of UserEntities
}
}
class OrderEntity extends PersistentEntity {} // user your imagination
class OrderModel
{
public function findOrdersById(array $ids, IGetOptions $options = null)
{
//...
}
}
In our case $db
is an ORM that is able to load entities. The model instructs the ORM to load a set of entities of a specific type. The ORM contains a mapping and uses that to inject all the fields for that entity in to the entity. For foreign fields however only the id's of those objects are loaded. In this case the OrderModel
creates OrderEntity
s with only the id's of the referenced orders. When PersistentEntity::getField
gets called by the OrderEntity
the entity instructs it's model to lazy load all the fields into the OrderEntity
s. All the OrderEntity
s associated with one UserEntity are treated as one result-set and will be loaded at once.
在我们的案例中,$db是一个能够装载实体的ORM。模型指示ORM加载一组特定类型的实体。ORM包含一个映射,并使用它将该实体的所有字段注入到该实体中。但是对于外字段,只加载这些对象的id。在这种情况下,OrderModel只使用被引用订单的id来创建OrderEntitys。当OrderEntity调用PersistentEntity: getField时,实体指示它的模型将所有字段延迟加载到OrderEntitys中。与一个UserEntity关联的所有OrderEntitys都被视为一个结果集,并将立即加载。
The magic here is that our model and ORM inject all data into the entities and that entities merely provide wrapper functions for the generic getField
method supplied by PersistentEntity
. To summarize we always load all the fields, but fields referencing a foreign entity are loaded when necessary. Just loading a bunch of fields is not really a performance issue. Load all possible foreign entities however would be a HUGE performance decrease.
这里的神奇之处在于,我们的模型和ORM将所有数据注入实体,实体仅仅为PersistentEntity提供的通用getField方法提供包装函数。总而言之,我们总是加载所有字段,但必要时将加载引用外部实体的字段。仅仅加载一堆字段并不是真正的性能问题。然而,加载所有可能的外国实体将是一个巨大的性能下降。
Now on to loading a specific set of users, based on a where clause. We provide an object oriented package of classes that allow you to specify simple expression that can be glued together. In the example code I named it GetOptions
. It's a wrapper for all possible options for a select query. It contains a collection of where clauses, a group by clause and everything else. Our where clauses are quite complicated but you could obviously make a simpler version easily.
现在,根据where子句加载一组特定的用户。我们提供了一个面向对象的类包,它允许您指定可以粘在一起的简单表达式。在示例代码中,我将它命名为GetOptions。它是select查询的所有可能选项的包装器。它包含where子句、group by子句和其他所有内容的集合。我们的where子句非常复杂,但是显然您可以轻松地生成一个更简单的版本。
$objOptions->getConditionHolder()->addConditionBind(
new ConditionBind(
new Condition('orderProduct.product', ICondition::OPERATOR_IS, $argObjProduct)
)
);
A simplest version of this system would be to pass the WHERE part of the query as a string directly to the model.
这个系统最简单的版本是将查询的WHERE部分作为字符串直接传递给模型。
I'm sorry for this quite complicated response. I tried to summarize our framework as quickly and clear as possible. If you have any additional questions feel free to ask them and I'll update my answer.
很抱歉这么复杂的回答。我试着尽可能快而清晰地总结我们的框架。如果你有任何问题,请随时提出来,我将更新我的答案。
EDIT: Additionally if you really don't want to load some fields right away you could specify a lazy loading option in your ORM mapping. Because all fields are eventually loaded through the getField
method you could load some fields last minute when that method is called. This is not a very big problem in PHP, but I would not recommend for other systems.
编辑:另外,如果您真的不想立即加载某些字段,您可以在ORM映射中指定一个延迟加载选项。因为所有字段最终都是通过getField方法加载的,所以在调用该方法时,您可以在最后时刻加载一些字段。这在PHP中不是一个很大的问题,但是我不推荐其他系统。
#5
3
These are some different solutions I've seen. There are pros and cons to each of them, but it is for you to decide.
这些是我见过的不同的解。它们各有利弊,但由你来决定。
Issue #1: Too many fields
This is an important aspect especially when you take in to account Index-Only Scans. I see two solutions to dealing with this problem. You can update your functions to take in an optional array parameter that would contain a list of a columns to return. If this parameter is empty you'd return all of the columns in the query. This can be a little weird; based off the parameter you could retrieve an object or an array. You could also duplicate all of your functions so that you have two distinct functions that run the same query, but one returns an array of columns and the other returns an object.
这是一个很重要的方面,尤其是考虑到只使用索引的扫描时。我看到了解决这个问题的两种方法。您可以更新函数,以接受一个可选的数组参数,该参数将包含要返回的列的列表。如果该参数为空,则返回查询中的所有列。这可能有点奇怪;根据参数可以检索对象或数组。您还可以复制所有的函数,以便有两个不同的函数运行相同的查询,但一个返回列数组,另一个返回对象。
public function findColumnsById($id, array $columns = array()){
if (empty($columns)) {
// use *
}
}
public function findById($id) {
$data = $this->findColumnsById($id);
}
Issue #2: Too many methods
I briefly worked with Propel ORM a year ago and this is based off what I can remember from that experience. Propel has the option to generate its class structure based off the existing database schema. It creates two objects for each table. The first object is a long list of access function similar to what you have currently listed; findByAttribute($attribute_value)
. The next object inherits from this first object. You can update this child object to build in your more complex getter functions.
一年前,我曾短暂地与Propel ORM合作过,这是基于我对那段经历的记忆。Propel具有基于现有数据库模式生成类结构的选项。它为每个表创建两个对象。第一个对象是一长串访问函数列表,类似于您当前列出的;findByAttribute(attribute_value美元)。下一个对象从第一个对象继承。您可以更新这个子对象以构建更复杂的getter函数。
Another solution would be using __call()
to map non defined functions to something actionable. Your __call
method would be would be able to parse the findById and findByName into different queries.
另一种解决方案是使用__call()将未定义的函数映射到可操作的东西。您的__call方法将能够将findById和findByName解析为不同的查询。
public function __call($function, $arguments) {
if (strpos($function, 'findBy') === 0) {
$parameter = substr($function, 6, strlen($function));
// SELECT * FROM $this->table_name WHERE $parameter = $arguments[0]
}
}
I hope this helps at least some what.
我希望这至少能有所帮助。
#6
3
I'll add a bit on this as I am currently trying to grasp all of this myself.
我将对此进行补充,因为我目前正试图自己掌握所有这些。
#1 and 2
This is a perfect place for your ORM to do the heavy lifting. If you are using a model that implements some kind of ORM, you can just use it's methods to take care of these things. Make your own orderBy functions that implement the Eloquent methods if you need to. Using Eloquent for instance:
这是一个完美的地方,你的ORM做重型起重。如果您正在使用实现某种ORM的模型,您可以使用它的方法来处理这些事情。如果需要,您可以创建自己的orderBy函数来实现有说服力的方法。用雄辩的例如:
class DbUserRepository implements UserRepositoryInterface
{
public function findAll()
{
return User::all();
}
public function get(Array $columns)
{
return User::select($columns);
}
What you seem to be looking for is an ORM. No reason your Repository can't be based around one. This would require User extend eloquent, but I personally don't see that as a problem.
你似乎在寻找的是一个ORM。没有理由你的存储库不能基于一个。这将需要用户进行有说服力的扩展,但我个人并不认为这是一个问题。
If you do however want to avoid an ORM, you would then have to "roll your own" to get what you're looking for.
如果你真的想避免ORM,那么你就必须“自己动手”才能得到你想要的东西。
#3
Interfaces aren't supposed be hard and fast requirements. Something can implement an interface and add to it. What it can't do is fail to implement a required function of that interface. You can also extend interfaces like classes to keep things DRY.
接口不应该是硬性和快速的需求。一些东西可以实现一个接口并添加它。它不能做的是无法实现该接口所需的函数。您还可以扩展诸如类之类的接口来保持事物的干涩。
That said, I'm just starting to get a grasp, but these realizations have helped me.
也就是说,我刚刚开始有所领悟,但这些认识帮助了我。
#1
160
I thought I'd take a crack at answering my own question. What follows is just one way of solving the issues 1-3 in my original question.
我想我可以试着回答我自己的问题。下面是解决问题1-3的一种方法。
Disclaimer: I may not always use the right terms when describing patterns or techniques. Sorry for that.
免责声明:在描述模式或技术时,我可能不会总是使用正确的术语。对不起。
The Goals:
- Create a complete example of a basic controller for viewing and editing
Users
. - 创建一个用于查看和编辑用户的基本控制器的完整示例。
- All code must be fully testable and mockable.
- 所有代码必须是完全可测试和可模拟的。
- The controller should have no idea where the data is stored (meaning it can be changed).
- 控制器应该不知道数据存储在哪里(意味着可以更改数据)。
- Example to show a SQL implementation (most common).
- 示例显示SQL实现(最常见)。
- For maximum performance, controllers should only receive the data they need—no extra fields.
- 为了获得最大的性能,控制器应该只接收它们需要的数据——不需要额外的字段。
- Implementation should leverage some type of data mapper for ease of development.
- 实现应该利用某种类型的数据映射器来简化开发。
- Implementation should have the ability to perform complex data lookups.
- 实现应该具有执行复杂数据查找的能力。
The Solution
I'm splitting my persistent storage (database) interaction into two categories: R (Read) and CUD (Create, Update, Delete). My experience has been that reads are really what causes an application to slow down. And while data manipulation (CUD) is actually slower, it happens much less frequently, and is therefore much less of a concern.
我将持久存储(数据库)交互分为两类:R (Read)和CUD(创建、更新、删除)。我的经验是,读取是导致应用程序变慢的真正原因。虽然数据操作(CUD)实际上比较慢,但发生的频率要低得多,因此也就不那么令人担心了。
CUD (Create, Update, Delete) is easy. This will involve working with actual models, which are then passed to my Repositories
for persistence. Note, my repositories will still provide a Read method, but simply for object creation, not display. More on that later.
CUD(创建、更新、删除)很简单。这将涉及使用实际的模型,然后将模型传递给我的存储库以实现持久性。注意,我的存储库仍将提供读取方法,但仅用于对象创建,而不是显示。稍后将进行更详细的讨论。
R (Read) is not so easy. No models here, just value objects. Use arrays if you prefer. These objects may represent a single model or a blend of many models, anything really. These are not very interesting on their own, but how they are generated is. I'm using what I'm calling Query Objects
.
读书并不是那么容易。这里没有模型,只有值对象。如果愿意,可以使用数组。这些对象可以表示单个模型或多个模型的混合,任何东西。它们本身并不是很有趣,但是它们是如何产生的。我使用的是查询对象。
The Code:
User Model
Let's start simple with our basic user model. Note that there is no ORM extending or database stuff at all. Just pure model glory. Add your getters, setters, validation, whatever.
让我们从最基本的用户模型开始。注意,根本没有ORM扩展或数据库之类的东西。单纯的模型的荣耀。添加getter、setter、验证等等。
class User
{
public $id;
public $first_name;
public $last_name;
public $gender;
public $email;
public $password;
}
Repository Interface
Before I create my user repository, I want to create my repository interface. This will define the "contract" that repositories must follow in order to be used by my controller. Remember, my controller will not know where the data is actually stored.
在创建用户存储库之前,我想要创建存储库接口。这将定义存储库必须遵循的“契约”,以便由我的控制器使用。记住,我的控制器不会知道数据实际存储在哪里。
Note that my repositories will only every contain these three methods. The save()
method is responsible for both creating and updating users, simply depending on whether or not the user object has an id set.
注意,我的存储库将只包含这三个方法。save()方法负责创建和更新用户,这取决于用户对象是否具有id集。
interface UserRepositoryInterface
{
public function find($id);
public function save(User $user);
public function remove(User $user);
}
SQL Repository Implementation
Now to create my implementation of the interface. As mentioned, my example was going to be with an SQL database. Note the use of a data mapper to prevent having to write repetitive SQL queries.
现在创建接口的实现。如前所述,我的示例将使用SQL数据库。注意数据映射器的使用,以避免必须编写重复的SQL查询。
class SQLUserRepository implements UserRepositoryInterface
{
protected $db;
public function __construct(Database $db)
{
$this->db = $db;
}
public function find($id)
{
// Find a record with the id = $id
// from the 'users' table
// and return it as a User object
return $this->db->find($id, 'users', 'User');
}
public function save(User $user)
{
// Insert or update the $user
// in the 'users' table
$this->db->save($user, 'users');
}
public function remove(User $user)
{
// Remove the $user
// from the 'users' table
$this->db->remove($user, 'users');
}
}
Query Object Interface
Now with CUD (Create, Update, Delete) taken care of by our repository, we can focus on the R (Read). Query objects are simply an encapsulation of some type of data lookup logic. They are not query builders. By abstracting it like our repository we can change it's implementation and test it easier. An example of a Query Object might be an AllUsersQuery
or AllActiveUsersQuery
, or even MostCommonUserFirstNames
.
现在,有了库管理的CUD(创建、更新、删除),我们就可以关注R(读取)了。查询对象只是某种类型的数据查找逻辑的封装。它们不是查询生成器。通过像我们的存储库那样抽象它,我们可以更改它的实现并更容易地测试它。查询对象的一个示例可能是一个AllUsersQuery或AllActiveUsersQuery,甚至是最常见的用户名。
You may be thinking "can't I just create methods in my repositories for those queries?" Yes, but here is why I'm not doing this:
您可能会想“我不能在我的存储库中为这些查询创建方法吗?”是的,但我不这么做的原因是:
- My repositories are meant for working with model objects. In a real world app, why would I ever need to get the
password
field if I'm looking to list all my users? - 我的存储库是用来处理模型对象的。在现实世界的应用程序中,如果我要列出所有的用户,为什么还需要输入密码字段?
- Repositories are often model specific, yet queries often involve more than one model. So what repository do you put your method in?
- 存储库通常是特定于模型的,但是查询通常涉及多个模型。那么,你把你的方法放在什么存储库中呢?
- This keeps my repositories very simple—not an bloated class of methods.
- 这使我的存储库非常简单——不是一种臃肿的方法。
- All queries are now organized into their own classes.
- 所有查询现在都被组织到它们自己的类中。
- Really, at this point, repositories exist simply to abstract my database layer.
- 实际上,在这一点上,存储库的存在只是为了抽象我的数据库层。
For my example I'll create a query object to lookup "AllUsers". Here is the interface:
在我的示例中,我将创建一个查询对象来查找“AllUsers”。这是接口:
interface AllUsersQueryInterface
{
public function fetch($fields);
}
Query Object Implementation
This is where we can use a data mapper again to help speed up development. Notice that I am allowing one tweak to the returned dataset—the fields. This is about as far as I want to go with manipulating the performed query. Remember, my query objects are not query builders. They simply perform a specific query. However, since I know that I'll probably be using this one a lot, in a number of different situations, I'm giving myself the ability to specify the fields. I never want to return fields I don't need!
在这里,我们可以再次使用数据映射器来帮助加速开发。注意,我允许对返回的数据集——字段进行一次调整。这就是我想要操作执行的查询的地方。记住,我的查询对象不是查询生成器。它们只是执行一个特定的查询。但是,由于我知道我可能会经常使用这个,在许多不同的情况下,我给自己指定字段的能力。我不想返回我不需要的字段!
class AllUsersQuery implements AllUsersQueryInterface
{
protected $db;
public function __construct(Database $db)
{
$this->db = $db;
}
public function fetch($fields)
{
return $this->db->select($fields)->from('users')->orderBy('last_name, first_name')->rows();
}
}
Before moving on to the controller, I want to show another example to illustrate how powerful this is. Maybe I have a reporting engine and need to create a report for AllOverdueAccounts
. This could be tricky with my data mapper, and I may want to write some actual SQL
in this situation. No problem, here is what this query object could look like:
在继续讨论控制器之前,我想展示另一个示例来说明它的强大功能。也许我有一个报告引擎,需要为AllOverdueAccounts创建一个报告。对于我的数据映射器来说,这可能有点棘手,在这种情况下,我可能需要编写一些实际的SQL。没问题,这里是这个查询对象的样子:
class AllOverdueAccountsQuery implements AllOverdueAccountsQueryInterface
{
protected $db;
public function __construct(Database $db)
{
$this->db = $db;
}
public function fetch()
{
return $this->db->query($this->sql())->rows();
}
public function sql()
{
return "SELECT...";
}
}
This nicely keeps all my logic for this report in one class, and it's easy to test. I can mock it to my hearts content, or even use a different implementation entirely.
这很好地将我的报告逻辑保存在一个类中,并且很容易测试。我可以将它模拟到我的心里,甚至完全使用不同的实现。
The Controller
Now the fun part—bringing all the pieces together. Note that I am using dependency injection. Typically dependencies are injected into the constructor, but I actually prefer to inject them right into my controller methods (routes). This minimizes the controller's object graph, and I actually find it more legible. Note, if you don't like this approach, just use the traditional constructor method.
现在有趣的部分是把所有的部分组合在一起。注意,我正在使用依赖项注入。通常依赖项被注入到构造函数中,但实际上我更喜欢将它们注入到我的控制器方法(路由)中。这最小化了控制器的对象图,实际上我发现它更清晰。注意,如果不喜欢这种方法,只需使用传统的构造函数方法。
class UsersController
{
public function index(AllUsersQueryInterface $query)
{
// Fetch user data
$users = $query->fetch(['first_name', 'last_name', 'email']);
// Return view
return Response::view('all_users.php', ['users' => $users]);
}
public function add()
{
return Response::view('add_user.php');
}
public function insert(UserRepositoryInterface $repository)
{
// Create new user model
$user = new User;
$user->first_name = $_POST['first_name'];
$user->last_name = $_POST['last_name'];
$user->gender = $_POST['gender'];
$user->email = $_POST['email'];
// Save the new user
$repository->save($user);
// Return the id
return Response::json(['id' => $user->id]);
}
public function view(SpecificUserQueryInterface $query, $id)
{
// Load user data
if (!$user = $query->fetch($id, ['first_name', 'last_name', 'gender', 'email'])) {
return Response::notFound();
}
// Return view
return Response::view('view_user.php', ['user' => $user]);
}
public function edit(SpecificUserQueryInterface $query, $id)
{
// Load user data
if (!$user = $query->fetch($id, ['first_name', 'last_name', 'gender', 'email'])) {
return Response::notFound();
}
// Return view
return Response::view('edit_user.php', ['user' => $user]);
}
public function update(UserRepositoryInterface $repository)
{
// Load user model
if (!$user = $repository->find($id)) {
return Response::notFound();
}
// Update the user
$user->first_name = $_POST['first_name'];
$user->last_name = $_POST['last_name'];
$user->gender = $_POST['gender'];
$user->email = $_POST['email'];
// Save the user
$repository->save($user);
// Return success
return true;
}
public function delete(UserRepositoryInterface $repository)
{
// Load user model
if (!$user = $repository->find($id)) {
return Response::notFound();
}
// Delete the user
$repository->delete($user);
// Return success
return true;
}
}
Final Thoughts:
The important things to note here are that when I'm modifying (creating, updating or deleting) entities, I'm working with real model objects, and performing the persistance through my repositories.
这里需要注意的重要事项是,当我修改(创建、更新或删除)实体时,我使用的是真实的模型对象,并通过我的存储库执行持久化。
However, when I'm displaying (selecting data and sending it to the views) I'm not working with model objects, but rather plain old value objects. I only select the fields I need, and it's designed so I can maximum my data lookup performance.
但是,当我显示(选择数据并将其发送到视图)时,我不是在处理模型对象,而是在处理普通的旧值对象。我只选择我需要的字段,它的设计使我可以最大限度地提高数据查找性能。
My repositories stay very clean, and instead this "mess" is organized into my model queries.
我的存储库保持非常干净,相反,这种“混乱”被组织到我的模型查询中。
I use a data mapper to help with development, as it's just ridiculous to write repetitive SQL for common tasks. However, you absolutely can write SQL where needed (complicated queries, reporting, etc.). And when you do, it's nicely tucked away into a properly named class.
我使用数据映射器来帮助开发,因为为常见任务编写重复的SQL是非常可笑的。但是,您绝对可以在需要的地方编写SQL(复杂的查询、报告等)。当你这样做的时候,它就会很好地隐藏在一个正确命名的类中。
I'd love to hear your take on my approach!
我很想听听你对我方法的看法!
July 2015 Update:
2015年7月更新:
I've been asked in the comments where I ended up with all this. Well, not that far off actually. Truthfully, I still don't really like repositories. I find them overkill for basic lookups (especially if you're already using an ORM), and messy when working with more complicated queries.
我在评论中被问到这些问题。好吧,其实没那么远。说实话,我还是不喜欢仓库。我发现它们对于基本查找(特别是当您已经使用ORM)和处理更复杂的查询时都非常混乱。
I generally work with an ActiveRecord style ORM, so most often I'll just reference those models directly throughout my application. However, in situations where I have more complex queries, I'll use query objects to make these more reusable. I should also note that I always inject my models into my methods, making them easier to mock in my tests.
我通常使用ActiveRecord风格的ORM,所以大多数情况下,我将在整个应用程序中直接引用这些模型。但是,在遇到更复杂的查询时,我将使用查询对象使这些查询更可重用。我还应该注意,我总是将模型注入到我的方法中,使它们更容易在测试中被模仿。
#2
38
Based on my experience, here are some answers to your questions:
根据我的经验,这里有一些问题的答案:
Q: How do we deal with bringing back fields we don't need?
问:我们如何处理把我们不需要的土地带回来?
A: From my experience this really boils down to dealing with complete entities versus ad-hoc queries.
答:根据我的经验,这实际上可以归结为处理完整的实体和临时查询。
A complete entity is something like a User
object. It has properties and methods, etc. It's a first class citizen in your codebase.
一个完整的实体就像一个用户对象。它有属性和方法等等。它是你代码库中的一级公民。
An ad-hoc query returns some data, but we don't know anything beyond that. As the data gets passed around the application, it is done so without context. Is it a User
? A User
with some Order
information attached? We don't really know.
一个特别的查询返回一些数据,但是除了这些我们什么都不知道。当数据在应用程序中传递时,它是在没有上下文的情况下完成的。这是一个用户吗?一个有订单信息的用户?我们真的不知道。
I prefer working with full entities.
我更喜欢与完整的实体一起工作。
You are right that you will often bring back data you won't use, but you can address this in various ways:
你是对的,你经常会带回一些你不会用到的数据,但是你可以用不同的方式来解决这个问题:
- Aggressively cache the entities so you only pay the read price once from the database.
- 积极地缓存实体,以便只从数据库中读取一次。
- Spend more time modeling your entities so they have good distinctions between them. (Consider splitting a large entity into two smaller entities, etc.)
- 花更多的时间为实体建模,以便它们之间有很好的区别。(考虑把一个大的实体分成两个小的实体,等等。)
- Consider having multiple versions of entities. You can have a
User
for the back end and maybe aUserSmall
for AJAX calls. One might have 10 properties and one has 3 properties. - 考虑拥有多个实体版本。您可以为后端提供一个用户,也可以为AJAX调用提供一个UserSmall。一个可能有10个属性,一个有3个属性。
The downsides of working with ad-hoc queries:
处理特别查询的缺点:
- You end up with essentially the same data across many queries. For example, with a
User
, you'll end up writing essentially the sameselect *
for many calls. One call will get 8 of 10 fields, one will get 5 of 10, one will get 7 of 10. Why not replace all with one call that gets 10 out of 10? The reason this is bad is that it is murder to re-factor/test/mock. - 在许多查询中,最终得到的是相同的数据。例如,对于用户,您最终将为许多调用编写本质上相同的select *。一个调用会得到10个字段中的8个,一个会得到5个10,一个会得到7个10。为什么不把所有的都换成一个10次的电话呢?这很糟糕的原因是重构/测试/模拟是谋杀。
- It becomes very hard to reason at a high level about your code over time. Instead of statements like "Why is the
User
so slow?" you end up tracking down one-off queries and so bug fixes tend to be small and localized. - 随着时间的推移,很难对代码进行高水平的推理。而不是像“为什么用户这么慢?”这样的语句,您最终会跟踪一次性的查询,因此错误修复往往是小型的、本地化的。
- It's really hard to replace the underlying technology. If you store everything in MySQL now and want to move to MongoDB, it's a lot harder to replace 100 ad-hoc calls than it is a handful of entities.
- 真的很难取代基础技术。如果您现在将所有东西都存储在MySQL中,并希望迁移到MongoDB,那么要替换100个临时调用要比替换为数不多的实体要困难得多。
Q: I will have too many methods in my repository.
问:我的资源库中将有太多的方法。
A: I haven't really seen any way around this other than consolidating calls. The method calls in your repository really map to features in your application. The more features, the more data specific calls. You can push back on features and try to merge similar calls into one.
A:除了合并电话之外,我还没见过别的办法。存储库中的方法调用实际上映射到应用程序中的特性。特性越多,特定于数据的调用就越多。您可以回推特性,并尝试将类似的调用合并为一个。
The complexity at the end of the day has to exist somewhere. With a repository pattern we've pushed it into the repository interface instead of maybe making a bunch of stored procedures.
一天结束时的复杂性必须存在于某处。使用存储库模式,我们将它推入存储库接口,而不是生成一堆存储过程。
Sometimes I have to tell myself, "Well it had to give somewhere! There are no silver bullets."
有时我不得不对自己说:“它总得有个归宿吧!”没有银弹。”
#3
11
I use the following interfaces:
我使用以下接口:
-
Repository
- loads, inserts, updates and deletes entities - 存储库——加载、插入、更新和删除实体
-
Selector
- finds entities based on filters, in a repository - 选择器——在存储库中基于过滤器查找实体
-
Filter
- encapsulates the filtering logic - 过滤器——封装过滤逻辑
My Repository
is database agnostic; in fact it doesn't specify any persistence; it could be anything: SQL database, xml file, remote service, an alien from outer space etc. For searching capabilities, the Repository
constructs an Selector
which can be filtered, LIMIT
-ed, sorted and counted. In the end, the selector fetches one or more Entities
from the persistence.
我的存储库是数据库无关的;事实上,它没有指定任何持久性;它可以是任何东西:SQL数据库、xml文件、远程服务、来自外部空间的异类等等。最后,选择器从持久性中获取一个或多个实体。
Here is some sample code:
以下是一些示例代码:
<?php
interface Repository
{
public function addEntity(Entity $entity);
public function updateEntity(Entity $entity);
public function removeEntity(Entity $entity);
/**
* @return Entity
*/
public function loadEntity($entityId);
public function factoryEntitySelector():Selector
}
interface Selector extends \Countable
{
public function count();
/**
* @return Entity[]
*/
public function fetchEntities();
/**
* @return Entity
*/
public function fetchEntity();
public function limit(...$limit);
public function filter(Filter $filter);
public function orderBy($column, $ascending = true);
public function removeFilter($filterName);
}
interface Filter
{
public function getFilterName();
}
Then, one implementation:
然后,一个实现:
class SqlEntityRepository
{
...
public function factoryEntitySelector()
{
return new SqlSelector($this);
}
...
}
class SqlSelector implements Selector
{
...
private function adaptFilter(Filter $filter):SqlQueryFilter
{
return (new SqlSelectorFilterAdapter())->adaptFilter($filter);
}
...
}
class SqlSelectorFilterAdapter
{
public function adaptFilter(Filter $filter):SqlQueryFilter
{
$concreteClass = (new StringRebaser(
'Filter\\', 'SqlQueryFilter\\'))
->rebase(get_class($filter));
return new $concreteClass($filter);
}
}
The ideea is that the generic Selector
uses Filter
but the implementation SqlSelector
uses SqlFilter
; the SqlSelectorFilterAdapter
adapts a generic Filter
to a concrete SqlFilter
.
ideea是泛型选择器使用过滤器,而实现SqlSelector使用SqlFilter;SqlSelectorFilterAdapter将通用过滤器改编为具体的SqlFilter。
The client code creates Filter
objects (that are generic filters) but in the concrete implementation of the selector those filters are transformed in SQL filters.
客户端代码创建过滤器对象(通用过滤器),但是在选择器的具体实现中,这些过滤器在SQL过滤器中被转换。
Other selector implementations, like InMemorySelector
, transform from Filter
to InMemoryFilter
using their specific InMemorySelectorFilterAdapter
; so, every selector implementation comes with its own filter adapter.
其他选择器实现,比如InMemorySelector,使用特定的InMemorySelectorFilterAdapter从过滤器转换到InMemoryFilter;因此,每个选择器实现都有自己的筛选器适配器。
Using this strategy my client code (in the bussines layer) doesn't care about a specific repository or selector implementation.
使用此策略,我的客户机代码(在bussines层中)不关心特定的存储库或选择器实现。
/** @var Repository $repository*/
$selector = $repository->factoryEntitySelector();
$selector->filter(new AttributeEquals('activated', 1))->limit(2)->orderBy('username');
$activatedUserCount = $selector->count(); // evaluates to 100, ignores the limit()
$activatedUsers = $selector->fetchEntities();
P.S. This is a simplification of my real code
这是我的真实代码的简化
#4
3
I can only comment on the way we (at my company) deal with this. First of all performance is not too much of an issue for us, but having clean/proper code is.
我只能评论我们(在我的公司)处理这件事的方式。首先,对于我们来说,性能并不是问题,但是拥有干净/正确的代码就足够了。
First of all we define Models such as a UserModel
that uses an ORM to create UserEntity
objects. When a UserEntity
is loaded from a model all fields are loaded. For fields referencing foreign entities we use the appropriate foreign model to create the respective entities. For those entities the data will be loaded ondemand. Now your initial reaction might be ...???...!!! let me give you an example a bit of an example:
首先,我们定义模型,比如使用ORM创建UserEntity对象的UserModel。当从模型加载UserEntity时,将加载所有字段。对于引用外部实体的字段,我们使用适当的外部模型来创建各自的实体。对于这些实体,数据将按需加载。现在你的第一反应可能是…?? !我举个例子
class UserEntity extends PersistentEntity
{
public function getOrders()
{
$this->getField('orders'); //OrderModel creates OrderEntities with only the ID's set
}
}
class UserModel {
protected $orm;
public function findUsers(IGetOptions $options = null)
{
return $orm->getAllEntities(/*...*/); // Orm creates a list of UserEntities
}
}
class OrderEntity extends PersistentEntity {} // user your imagination
class OrderModel
{
public function findOrdersById(array $ids, IGetOptions $options = null)
{
//...
}
}
In our case $db
is an ORM that is able to load entities. The model instructs the ORM to load a set of entities of a specific type. The ORM contains a mapping and uses that to inject all the fields for that entity in to the entity. For foreign fields however only the id's of those objects are loaded. In this case the OrderModel
creates OrderEntity
s with only the id's of the referenced orders. When PersistentEntity::getField
gets called by the OrderEntity
the entity instructs it's model to lazy load all the fields into the OrderEntity
s. All the OrderEntity
s associated with one UserEntity are treated as one result-set and will be loaded at once.
在我们的案例中,$db是一个能够装载实体的ORM。模型指示ORM加载一组特定类型的实体。ORM包含一个映射,并使用它将该实体的所有字段注入到该实体中。但是对于外字段,只加载这些对象的id。在这种情况下,OrderModel只使用被引用订单的id来创建OrderEntitys。当OrderEntity调用PersistentEntity: getField时,实体指示它的模型将所有字段延迟加载到OrderEntitys中。与一个UserEntity关联的所有OrderEntitys都被视为一个结果集,并将立即加载。
The magic here is that our model and ORM inject all data into the entities and that entities merely provide wrapper functions for the generic getField
method supplied by PersistentEntity
. To summarize we always load all the fields, but fields referencing a foreign entity are loaded when necessary. Just loading a bunch of fields is not really a performance issue. Load all possible foreign entities however would be a HUGE performance decrease.
这里的神奇之处在于,我们的模型和ORM将所有数据注入实体,实体仅仅为PersistentEntity提供的通用getField方法提供包装函数。总而言之,我们总是加载所有字段,但必要时将加载引用外部实体的字段。仅仅加载一堆字段并不是真正的性能问题。然而,加载所有可能的外国实体将是一个巨大的性能下降。
Now on to loading a specific set of users, based on a where clause. We provide an object oriented package of classes that allow you to specify simple expression that can be glued together. In the example code I named it GetOptions
. It's a wrapper for all possible options for a select query. It contains a collection of where clauses, a group by clause and everything else. Our where clauses are quite complicated but you could obviously make a simpler version easily.
现在,根据where子句加载一组特定的用户。我们提供了一个面向对象的类包,它允许您指定可以粘在一起的简单表达式。在示例代码中,我将它命名为GetOptions。它是select查询的所有可能选项的包装器。它包含where子句、group by子句和其他所有内容的集合。我们的where子句非常复杂,但是显然您可以轻松地生成一个更简单的版本。
$objOptions->getConditionHolder()->addConditionBind(
new ConditionBind(
new Condition('orderProduct.product', ICondition::OPERATOR_IS, $argObjProduct)
)
);
A simplest version of this system would be to pass the WHERE part of the query as a string directly to the model.
这个系统最简单的版本是将查询的WHERE部分作为字符串直接传递给模型。
I'm sorry for this quite complicated response. I tried to summarize our framework as quickly and clear as possible. If you have any additional questions feel free to ask them and I'll update my answer.
很抱歉这么复杂的回答。我试着尽可能快而清晰地总结我们的框架。如果你有任何问题,请随时提出来,我将更新我的答案。
EDIT: Additionally if you really don't want to load some fields right away you could specify a lazy loading option in your ORM mapping. Because all fields are eventually loaded through the getField
method you could load some fields last minute when that method is called. This is not a very big problem in PHP, but I would not recommend for other systems.
编辑:另外,如果您真的不想立即加载某些字段,您可以在ORM映射中指定一个延迟加载选项。因为所有字段最终都是通过getField方法加载的,所以在调用该方法时,您可以在最后时刻加载一些字段。这在PHP中不是一个很大的问题,但是我不推荐其他系统。
#5
3
These are some different solutions I've seen. There are pros and cons to each of them, but it is for you to decide.
这些是我见过的不同的解。它们各有利弊,但由你来决定。
Issue #1: Too many fields
This is an important aspect especially when you take in to account Index-Only Scans. I see two solutions to dealing with this problem. You can update your functions to take in an optional array parameter that would contain a list of a columns to return. If this parameter is empty you'd return all of the columns in the query. This can be a little weird; based off the parameter you could retrieve an object or an array. You could also duplicate all of your functions so that you have two distinct functions that run the same query, but one returns an array of columns and the other returns an object.
这是一个很重要的方面,尤其是考虑到只使用索引的扫描时。我看到了解决这个问题的两种方法。您可以更新函数,以接受一个可选的数组参数,该参数将包含要返回的列的列表。如果该参数为空,则返回查询中的所有列。这可能有点奇怪;根据参数可以检索对象或数组。您还可以复制所有的函数,以便有两个不同的函数运行相同的查询,但一个返回列数组,另一个返回对象。
public function findColumnsById($id, array $columns = array()){
if (empty($columns)) {
// use *
}
}
public function findById($id) {
$data = $this->findColumnsById($id);
}
Issue #2: Too many methods
I briefly worked with Propel ORM a year ago and this is based off what I can remember from that experience. Propel has the option to generate its class structure based off the existing database schema. It creates two objects for each table. The first object is a long list of access function similar to what you have currently listed; findByAttribute($attribute_value)
. The next object inherits from this first object. You can update this child object to build in your more complex getter functions.
一年前,我曾短暂地与Propel ORM合作过,这是基于我对那段经历的记忆。Propel具有基于现有数据库模式生成类结构的选项。它为每个表创建两个对象。第一个对象是一长串访问函数列表,类似于您当前列出的;findByAttribute(attribute_value美元)。下一个对象从第一个对象继承。您可以更新这个子对象以构建更复杂的getter函数。
Another solution would be using __call()
to map non defined functions to something actionable. Your __call
method would be would be able to parse the findById and findByName into different queries.
另一种解决方案是使用__call()将未定义的函数映射到可操作的东西。您的__call方法将能够将findById和findByName解析为不同的查询。
public function __call($function, $arguments) {
if (strpos($function, 'findBy') === 0) {
$parameter = substr($function, 6, strlen($function));
// SELECT * FROM $this->table_name WHERE $parameter = $arguments[0]
}
}
I hope this helps at least some what.
我希望这至少能有所帮助。
#6
3
I'll add a bit on this as I am currently trying to grasp all of this myself.
我将对此进行补充,因为我目前正试图自己掌握所有这些。
#1 and 2
This is a perfect place for your ORM to do the heavy lifting. If you are using a model that implements some kind of ORM, you can just use it's methods to take care of these things. Make your own orderBy functions that implement the Eloquent methods if you need to. Using Eloquent for instance:
这是一个完美的地方,你的ORM做重型起重。如果您正在使用实现某种ORM的模型,您可以使用它的方法来处理这些事情。如果需要,您可以创建自己的orderBy函数来实现有说服力的方法。用雄辩的例如:
class DbUserRepository implements UserRepositoryInterface
{
public function findAll()
{
return User::all();
}
public function get(Array $columns)
{
return User::select($columns);
}
What you seem to be looking for is an ORM. No reason your Repository can't be based around one. This would require User extend eloquent, but I personally don't see that as a problem.
你似乎在寻找的是一个ORM。没有理由你的存储库不能基于一个。这将需要用户进行有说服力的扩展,但我个人并不认为这是一个问题。
If you do however want to avoid an ORM, you would then have to "roll your own" to get what you're looking for.
如果你真的想避免ORM,那么你就必须“自己动手”才能得到你想要的东西。
#3
Interfaces aren't supposed be hard and fast requirements. Something can implement an interface and add to it. What it can't do is fail to implement a required function of that interface. You can also extend interfaces like classes to keep things DRY.
接口不应该是硬性和快速的需求。一些东西可以实现一个接口并添加它。它不能做的是无法实现该接口所需的函数。您还可以扩展诸如类之类的接口来保持事物的干涩。
That said, I'm just starting to get a grasp, but these realizations have helped me.
也就是说,我刚刚开始有所领悟,但这些认识帮助了我。