[转载] ORMs under the hood

时间:2023-01-27 08:46:55

原文: http://www.vertabelo.com/blog/technical-articles/orms-under-the-hood

It often happens that if something is loved, it is also hated with the same power. The idea of object relational mapping fits into this concept perfectly. You will definitely come across many opposite points and fierce discussions as well as unwavering advocates and haters. So if you have ever asked whether to use ORM or not, the answer “it depends” will not be enough.

What is an object-relational mapping (ORM)?

Object-relational mapping refers to synchronization between two different representations of data. From one side, there is a relational database, where the data is represented in terms of tuples, grouped into relations, while in application we manipulate objects.

[转载] ORMs under the hood

[转载] ORMs under the hood

Most contemporary applications are built in object oriented technologies like Java or C# and use mostly relational databases to store data. Since it is definitely more convenient when data retrieved from the database can be accessed in the form of objects, there are also many implementations of ORM for many languages and frameworks (current list is quite huge ).

There is a big controversy connected to this topic which forces us to ask: What exactly are the benefits of an ORM and what are its detriments? Wading through different blog posts and discussions I will start from the opinion of Megan Bowra-Dean (@megahbite):

They’re great 99% of the time in MVC frameworks but that other 1% causes a lot of headaches if you don’t know SQL.

It’s said that ORM allows a developer to completely separate themselves from the database and almost forget about it (apart from situations when database changes need to be performed). Indeed a large number of developers are not very passionate about SQL and database concepts, and they live with the illusion that ORM will handle the job for them. Consequently the database sinks into oblivion till some problem appears. Neglecting the database while using an ORM is a big mistake and causes many problems.

Why SQL is so important while using an ORM?

[转载] ORMs under the hood

[转载] ORMs under the hood

Generally, there are two approaches that are followed. The first one is “Domain model centric” in which the code drives database design, while the second one, “Relational model centric”, places database design as the first step followed by the code. Both approaches have their pros and cons, but the second one is earning more advocates, mostly due to the database-first approach and requirements that are impossible to avoid like knowing about the principles of relational database and SQL.

So, what are the most common dangers associated with using ORM and not knowing SQL?

Danger 1: Not knowing database data types. This usually involves using too-general data types like ‘text’ type in PostgreSQL for most of the columns. This type is introduced to store strings of unlimited length what is identified with CLOB type. Multiple columns with this type cause an enormous performance challenge.

Danger 2: Not knowing constraints. This approach is quite often seen in many database designs. As joins become the most vulnerable case, in this approach they are overused which is extremely inefficient.

Danger 3: Not knowing basic database concepts : for example, not knowing about indexes. Without these smart structures, retrieving data becomes a performance nightmare. A system with data stored in a database with no indexes is a sign of imminent death for the application.

These three points are some of the many things the beginner developer need to know. A strong enthusiast of SQL and the creator of jOOQ (SQL library for Java) – Lukas Eder (@lukaseder), confirms a painful truth that many developers don’t know SQL:

We’re just not exposed to SQL nowadays. But consider this: We developers (or our customers) are paying millions of dollars every year to Oracle, Microsoft, IBM, SAP for their excellent RDBMS, only to ignore 90% of their database features and to perform a little bit of CRUD and a little bit of ACID with ORMs like Hibernate. We have forgotten about why those RDBMS were so expensive in the first place.
Source: “NoSQL? No, SQL! – How to Calculate Running Totals”

Use only SQL or stay with ORM?

The thesis of the article “What ORMs have taught me: just learn SQL” by Geoff Wozniak:

Attribute creep and excessive use of foreign keys shows me that in order to use ORMs effectively, you still need to know SQL. My contention with ORMs is that, if you need to know SQL, just use SQL since it prevents the need to know how non-SQL gets translated to SQL.

This article caused an avalanche of replies and contradictory discussions. Most of them were focusing on abandoning ORM and instead using only SQL or staying with it and suffering from learning both the ORM and SQL. It still appears to be an unresolved problem.

One of the first-class benefits of ORM is its ease-of-use and the fact that it requires fewer lines of code. It definitely makes development quicker but only in case you’re familiar with a particular ORM. For others, ORM is a waste of time spent resolving persistence related issues. Not knowing it imposes some learning time and cumbersome debugging as a consequence of lack of experience. Indeed, with an ORM, you have no control over what is happening “under the hood.” Sometimes there are many configurations that are difficult to manage efficiently, many weird behaviors and in many cases problems with generating an optimized SQL query with an ORM. And here a common, popular story gets repeated: an ORM presents nice development initially and then there are problems when tracking ORM related bugs and inefficiencies. The statement of fellow ORM developer Gavin King – the creator of Hibernate rings true.

[转载] ORMs under the hood

At this point I could come to the conclusion that ORM is not a universal tool and that all the haters are justified, because ORMs were designed this way on purpose, like the concept of ‘leaky abstraction.’ In his comments on Reddit, Gavin King explains this concept as:

the leakiness of the ORM abstraction is a feature, not a bug. It’s meant to be that way, because both object model and the relational model are valid ways of looking at the data, and both are useful

systems like Hibernate are intentionally designed as leaky abstractions so that it’s possible to easily mix in native SQL where necessary

Not only Gavin, but many others that have problems with ORMs say that it is connected to not enough knowledge about SQL, relational databases and the relational model instead of the way that ORM works with SQL. The commonly cited example of a developer with “a single object” with 600 attributes and 14 joins perfectly proves this phenomenon.

How does ORM work?

The main postulate that characterizes ORM is that it encapsulates database interaction inside an object. One part of the object keeps the data and the second one knows how to deal with the data and transfers it to the relational database (this functionality is implemented inside the ORM engine).

ORMs design patterns

According to Martin Fowler (@martinfowler), there are two patterns that were adopted in different object relational mappers.

The first one is Active Record and it is “An object that wraps a row in a database table or view, encapsulates the database access, and adds domain logic on that data.” It means that a record on a table in a relational database is represented as an object.

[转载] ORMs under the hood

Source: http://www.martinfowler.com/eaaCatalog/activeRecord.html

The second one is a Data Mapper.

According to the official definition, it is “A layer of Mappers that moves data between objects and a database while keeping them independent of each other and the mapper itself.” That means that there is a special layer that separates in-memory objects from the database and its main responsibility is to transfer data between those two layers.

[转载] ORMs under the hood

Source: http://martinfowler.com/eaaCatalog/dataMapper.html

There are a lot discussions about the supremacy of one over the other. Generally speaking, the Data Mapper pattern lacks the efficiencies when it comes to CRUD operations. In comparison to the Active Record pattern, the representation of an object is not necessarily a record in the database. The strongest feature of the Data Mapper design is the one API for various data stores (relational and NO SQL databases: external data stores accessed through REST, file based data storage or schema free databases).

Popular ORMs

Here is the list of some popular ORMs for the three leading programming languages: Python, Java and PHP.

ORMs using Active Record pattern:

Python   Java   PHP
   

ORMs using Data Mapper pattern:

Python   Java   PHP
   

In brief: advantages and disadvantages of using ORM

Let’s gather together the most important points for and against ORM.

FOR:

  • let the developer think in terms of objects rather than tables
  • no need to write SQL code
  • many advanced features like lazy loading
  • database independent: No need to write code specific to a particular database
  • reduces code and allows developers to focus on their business logic, rather than complex database queries
  • various ORMs provide a rich query interface

AGAINST:

  • complex (because they handle a bidirectional mapping). Their complexity implies a grueling learning curve – they have a special query language which developers have to learn
  • provides only a leaky abstraction over a relational data store
  • usually systems using an ORM perform badly (due to naive interactions with the underlying database)
  • ORM, by adding a layer of abstraction, speeds up the development but adds overhead to the application

[转载] ORMs under the hood

Depending on your problem domain, the cons may outweigh the pros and so you might just want to stick with straight SQL. It’s important to know when to use and when not to use ORM.

SAY YES TO ORM:

  • it is intended for OLTP applications

SAY NO TO ORM:

  • not intended for batch processing
  • not recommended where there is a need to perform huge analysis

ORMs by example

Considering the below simple model, let’s see how some example usage looks in top ORMs in Python, Java and PHP:

↑ Click on a logo to preview the model in Vertabelo | Download the model as a png file

The ddl for this example looks as follows:

-- tables
-- Table: client
CREATE TABLE client (
    id int  NOT NULL,
    full_name varchar(255)  NOT NULL,
    email varchar(255)  NOT NULL,
    CONSTRAINT client_pk PRIMARY KEY (id)
);
 
-- Table: purchase
CREATE TABLE purchase (
    id int  NOT NULL,
    purchase_no char(12)  NOT NULL,
    client_id int  NOT NULL,
    CONSTRAINT purchase_pk PRIMARY KEY (id)
);
 
-- foreign keys
-- Reference:  client_purchase (table: purchase)
 
 
ALTER TABLE purchase ADD CONSTRAINT client_purchase
    FOREIGN KEY (client_id)
    REFERENCES client (id)
    NOT DEFERRABLE
    INITIALLY IMMEDIATE
;

Python

Let’s start from one of the most-used Python ORM – SQLAlchemy. This framework provides an ORM with an adopted Data Mapper pattern. It contains two parts. The first one is ORM, while the second one is the Core. The Core is a fully featured SQL abstraction toolkit with SQL Expression Language that enable you to write SQL statements via generative Python expressions. Advanced ORM features like unit of work (confines all the database manipulation code to a specific database session that controls the life cycles of every object in that session. Session here is a beefed up version of database transaction), in-memory collections, eager loading of collections via joins and secondary subselects, and other optimizations allow SQLAlchemy’s ORM to emit efficient queries, select rows from not only tables, but also joins and other select statements and compose it to various Python structures.

In order to map tables to Python classes we have to define subclasses of declarative_base(). If you are using Vertabelo as a database modeling tool, you can provide some automation and generate it using open-source script hosted on Github, that takes as an argument the model structure in Vertabelo XML (you can generate it by clicking on XML button in the application). Follow full tutorial here or you can simply write the classes by hand.

The model looks as follows:

import sqlalchemy
from sqlalchemy import create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import Column, Integer, BigInteger,String, ForeignKey, Unicode, Binary, LargeBinary, Time, DateTime, Date, Text, Boolean
from sqlalchemy.orm import relationship, backref, deferred
from sqlalchemy.orm import sessionmaker
 
 
Base = declarative_base()
 
class Purchase (Base):
    __tablename__ = "purchase"
    id = Column('id', Integer, primary_key = True)
    purchase_no = Column('purchase_no', Unicode)
    client_id = Column('client_id', Integer, ForeignKey('client.id'))
 
    client = relationship('Client', foreign_keys=client_id)
 
class Client (Base):
    __tablename__ = "client"
    id = Column('id', Integer, primary_key = True)
    full_name = Column('full_name', Unicode)
    email = Column('email', Unicode)

To carry out a small investigation of this ORM we will create tables in SQLite database from the generated Python classes and try some Python expressions to perform some insert.

Run the interactive Python console and create a database engine for our database session.

>>> from sqlalchemy import create_engine
>>> engine = create_engine('sqlite:///')

Import classes from generated output.py file

>>> from output import *

Create a database session and database tables for classes defined as Client and Purchase. The lines provided below will create a session object which will be bound to the sqlite database. All of the database manipulation code and objects will be attached to a database session.

>>> from sqlalchemy.orm import sessionmaker
>>> session = sessionmaker()
>>> session.configure(bind=engine)
>>> Base.metadata.create_all(engine)

Any changes performed on a database are done in the context of a session. Multiple changes can be grouped in a session via session.add(object). To register all the changes in a database, issue a statementdb.session.commit().

Inserting data can be performed in three steps.

Create a session object

  1. Create a Python object of Client and Purchase classes
  2. Add it to the session
  3. Commit the session
>>> s = session()
>>> c = Client(full_name=’client1’, email=’client1@gmail.com’)
>>> p = Purchase(purchase_no=u’abcd123’, client_id=c.id)
>>> s.add(c)
>>> s.add(p)
>>> s.commit()

To retrieve the database objects, we call query() and filter() methods from the databasesession object.

>>> c = s.query(Client).filter(Client.full_name == 'client1').one()
 
>>>  c.full_name
 u'client1'
 
>>> c.email
 u'client1@gmail.com'
 
>>> p = s.query(Purchase).filter(Purchase.client_id == c.id).one()
>>> p.purchase_no
 u'abcd123'

Python frameworks to use with SQLAlchemy:

Java

jOOQ – is a database-mapping tool that implements the Active Record pattern, and is commonly considered a “happy medium between ORMs and JDBC”, because it embraces SQL rather than trying to hide it. jOOQ focuses on a relational model centric approach and assumes that database design should drive Java code. One of the features is possibility to construct queries that has similar structure to SQL.

Via jOOQ you can generate Java classes that model tables, records, sequences, stored procedures and many more. There are three ways to generate the jOOQ classes. You can generate jOOQ classse from the existing database or using Vertabelo-jOOQ integration. Having already prepared the database structure in Vertabeloyou can export it as a Vertabelo XML file and add as a property in the codegen configuration file (Full tutorialhere) or you can download the model directly from the cloud by providing the Vertabelo API token (model identifier) to the codegen file (full tutorial here). More information can be found here.

For example, an insert for our client-purchase example can be visualized in the code as follows:

// Create a new record
ClientRecord client = create.newRecord(CLIENT);
 
// Insert the record: INSERT INTO CLIENT VALUES (‘John’, ‘john@gmail.com’)
client.setFullName(‘John’);
client.setEmail(‘john@gmail.com’);
client.store();

Or we can loop over the clients returned from SELECT statements and delete those whose full_name is ‘John’.

for (PurchaseRecord purchase : create .selectFrom(PURCHASE)
   .where(PURCHASE.CLIENT_ID.in(
       select(CLIENT.ID)
       .from(CLIENT)
       .where(CLIENT.FULL_NAME.equal("John"))
)) {
purchase.delete();
}

PHP

Propel – is a PHP ORM with implemented Active Record pattern. It describes the model structure (tables, columns, relationships) in an XML file called the schema and generates classes based on the schema definition of tables.

The schema.xml can be generated by using a vertabeloPropel script. For more details follow this tutorial.

After generation, the structure of the database schema looks like the following:

<?xml version="1.0"?>
<database name="my_db" defaultIdMethod="native">
  <table name="purchase">
    <column name="id" type="INTEGER" sqlType="int" required="true" primaryKey="true"/>
    <column name="purchase_no" type="CHAR" size="12" sqlType="char(12)" required="true"/>
    <column name="client_id" type="INTEGER" sqlType="int" required="true"/>
    <foreign-key foreignTable="client" name="client_purchase" onDelete="none" onUpdate="none">
      <reference foreign="id" local="client_id"/>
    </foreign-key>
  </table>
  <table name="client">
    <column name="id" type="INTEGER" sqlType="int" required="true" primaryKey="true"/>
    <column name="full_name" type="VARCHAR" size="255" sqlType="varchar(255)" required="true"/>
    <column name="email" type="VARCHAR" size="255" sqlType="varchar(255)" required="true"/>
  </table>
</database>

Having described the structure, we can generate the SQL code, by issuing:

$ propel sql:build

and then generate model classes, by running:

$ propel model:build

Adding new data in Propel can be performed in few steps

  1. Create a Propel object
  2. Define a value for each column by using setXXX() method
  3. Call save() method.
$client = new Client();
$client->setFullName('client1');
$client->setEmail('client1@gmail.com');
$client->save();

To read a row from a database, we use generated Query object and generated findPK() method. Reading object properties is done via calling the getter on the column.

$firstClient = ClientQuery::create()->findPK(1);
echo $firstClient->getFullName();

where line $firstClient = ClientQuery::create()->findPK(1); issues a simple SQL query.

SELECT client.id, client.full_name, client.email
FROM client
WHERE client.id = 1
LIMIT 1;

Yes, the problem of Object Relational Mapping is definitely a hard one. Most of the discussion still keeps revolving around the thoughts: “I would like to have something that lacks typical Hibernate or another ORM defects, but I don’t have a viable replacement. I am not planning to write all the code in pure SQL/JDBC, therefore I will stick to what I know.” The problem of Object Relational Mapping is still unresolved and discussions about it still become a battlefield. How do you handle the ORM problem? When do you say YES or NO to using ORM?

[转载] ORMs under the hood的更多相关文章

  1. python 常用模块(转载)

    转载地址:http://codeweblog.com/python-%e5%b8%b8%e7%94%a8%e6%a8%a1%e5%9d%97/ adodb:我们领导推荐的数据库连接组件bsddb3:B ...

  2. 【转载】LINUX 和 WINDOWS 内核的区别

    LINUX 和 WINDOWS 内核的区别 [声明:欢迎转载,转载请注明出自CU ACCESSORY http://linux.chinaunix.net/bbs/thread-1153868-1-1 ...

  3. Valgrind使用-转载

    完全转载的,仅当学习使用 对应链接地址:http://www.gjprj.cn/news1.asp?id=2862 另见valgrind的使用简介,http://blog.csdn.net/sduli ...

  4. Crystal Clear Applied&colon; The Seven Properties of Running an Agile Project &lpar;转载&rpar;

    作者Alistair Cockburn, Crystal Clear的7个成功要素,写得挺好. 敏捷方法的关注点,大家可以参考,太激动所以转载了. 原文:http://www.informit.com ...

  5. RTP与RTCP协议介绍(转载)

    RTSP发起/终结流媒体.RTP传输流媒体数据 .RTCP对RTP进行控制,同步.RTP中没有连接的概念,本身并不能为按序传输数据包提供可靠的保证,也不提供流量控制和拥塞控制,这些都由RTCP来负责完 ...

  6. 《Walking the callstack(转载)》

    本文转载自:https://www.codeproject.com/articles/11132/walking-the-callstack Download demo project with so ...

  7. &lbrack;转载&rsqb;MVVM模式原理分析及实践

    没有找到很好的MVVM模式介绍文章,简单找了一篇,分享一下.MVVM实现了UI\UE设计师(Expression Blend 4设计界面)和软件工程师的合理分工,在SilverLight.WPF.Wi ...

  8. &lbrack;转载&rsqb;&colon;STM32为什么必须先配置时钟再配置GPIO

    转载来源 :http://blog.csdn.net/fushiqianxun/article/details/7926442 [原创]:我来添两句,就是很多同学(包括我)之前搞低端单片机,到了stm ...

  9. &lbrack;转载&rsqb;从MyEclipse到IntelliJ IDEA-让你摆脱鼠标,全键盘操作

    从MyEclipse转战到IntelliJ IDEA的经历 注转载址:http://blog.csdn.net/luoweifu/article/details/13985835 我一个朋友写了一篇“ ...

随机推荐

  1. javascript之Object&period;defineProperty的奥妙

    直切主题 今天遇到一个这样的功能: 写一个函数,该函数传递两个参数,第一个参数为返回对象的总数据量,第二个参数为初始化对象的数据.如: var o = obj (4, {name: 'xu', age ...

  2. 当一个页面出现多个checkbox全选时的处理

    HTML: <input type="checkbox" onclick="boxOnclick(this,'some1')">全选一 <in ...

  3. 移动设备和SharePoint 2013 - 第3部分:推送通知

    博客地址:http://blog.csdn.net/foxdave 原文地址 在该系列文章中,作者展示了SharePoint 2013最显著的新功能概观--对移动设备的支持. 该系列文章: 移动设备和 ...

  4. cpu affinity &lpar;亲和性&rpar;

    来源:http://www.ibm.com/developerworks/cn/linux/l-affinity.html#download 管理处理器的亲和性(affinity) 为什么(3 个原因 ...

  5. Linux云计算运维-MySQL

    0.建初心 优秀DBA的素质 1.人品,不做某些事情2.严谨,运行命令前深思熟虑,三思而后行,即使是依据select3.细心,严格按照步骤一步一步执行,减少出错4.心态,遇到灾难,首先要稳住,不慌张, ...

  6. JAVA 根据设置的概率生成随机数

    import java.util.ArrayList; import java.util.HashMap; import java.util.List; import java.util.Map; i ...

  7. href&equals;&quot&semi;javascript&colon;void&lpar;0&rpar;&quot&semi; 的用法

    href=”javascript:void(0);”这个的含义是,让超链接去执行一个js函数,而不是去跳转到一个地址,而void(0)表示一个空的方法,也就是不执行js函数. 为什么要使用href=” ...

  8. C语言 &&num;183&semi; 成绩查询系统

    抱歉,昨天忘了往博客上更新,今天补上. 成绩查询系统 分值: 21 数学老师小y 想写一个成绩查询系统,包含如下指令: insert [name] [score],向系统中插入一条信息,表示名字为na ...

  9. switch case &colon;在JDK 7中,又加入了对String类型的支持,从此不用再写If-Else来判断字符串了

    switch的case语句可以处理int,short,byte,char类型的值, 因为short,byte,char都会转换成int进行处理,这一点也可以从生成的字节码看出. char a = 'e ...

  10. Oracle笔记之用户管理

    1. 创建用户 创建用户使用create user语句,需要DBA权限: CREATE USER tom IDENTIFIED BY mot; 2. 更改用户密码 修改别人的密码需要DBA权限,或者a ...