Python DB-API 2.0规范
PEP: | 249 |
---|---|
标题: | Python Database API Specification v2.0 |
版本: | 83893e13db91 |
修改日期: | 2008-03-03 12:37:19 +0000 (Mon, 03 Mar 2008) |
作者: | Marc-André Lemburg <mal at lemburg.com> |
翻译: | 佣工7001 |
讨论邮件: | <db-sig at python.org> |
状态: | 最终 |
类别: | 信息 |
创建: | |
历史: | |
替代: |
248 |
译者注:PEP是Python Enhancement Proposals的缩写,意为Python扩展协议
简介:
定义本API的目的在于鼓励促进所有用于访问数据库的Python模块相互之间的一致性。为了做到这点,我们希望实现一个一致性的指引,以便实现更易于理解的模块,
更加通用的易于在不同数据库间移植的代码,和Python中更广泛的数据库访问手段。
本接口规格由下面几个方面构成:
* 模块接口
* 数据库连接对象
* 游标对象
* Type Objects and Constructors
* 给模块作者的实现提示
* 可选的DB API扩展
* 可选的错误处理扩展
* 可选的两阶段提交(Two-Phase Commit)扩展
* 常见问题
* 从版本1.0到2.0的主要变化
* 遗留问题(Open Issues)
* 脚注(Footnotes)
* 鸣谢(Acknowledgments)
对于本规范的意见和为题可以去讨论列表the SIG for Database Interfacing with Python
(db-sig@python.org)。
更多关于Python书库接口的信息可以访问 http://www.python.org/topics/database/.
本文档描述了Python DB-API2.0规范和一些常见可选扩展。上一个版本1.0
仍然可以在这里PEP 248获得。鼓励程序包的作者们使用上一版的规范来实现,可以作为新版本的基础版本。
模块接口(Module Interface):
数据库的访问是通过连接对象(connection objects)来实现的。 程序模块中必须提供以下形式的连接对象构造函数: connect(parameters...) 数据库连接对象的构造函数,返回值为Connection对象实例。 由于目的数据库不同,函数接收数量不等的一些参数。[1] 模块中必须定义下面这些模块级的变量: apilevel 字符串常量,表明支持的DB API版本。 目前只允许取值'1.0'和'2.0'。 如果没有定义本常量,默认为DB-API 1.0。 threadsafety 整数常量,表明模块支持的线程安全级别,可能的值为: 0 线程不安全,线程不能共享模块。 1 线程可以共享模块,但是不能共享连接对象(connections)。 2 线程可以共享模块和连接对象。 3 线程安全,线程间可以共享模块、连接对象以及游标对象(module,connections,cursors)。 上文*享的意思是两个线程在没有使用互斥信号(mutex semaphore)锁的情况下, 同时使用一个资源。 要注意的是,你并不总能使用互斥信号来确保一个外部资源线程安全,这是因为外部资源很有可能依赖于一 个全局变量或是其他的外部资源,然而这些是你不能控制的。 paramstyle 字符串常量声明模块使用的SQL语句中的参数引出方式。可能的取值如下[2]: 'qmark' 问号方式,例如:'...WHERE name=?' 'numeric' 序数方式,例如:'...WHERE name=:1' 'named' 命名方式,例如:'...WHERE name=:name' 'format' 通用方式(ANSI C printf format codes)例如:'...WHERE name=%s' 'pyformat' python扩展方式(Python extended format codes),例如:'...WHERE name=%(name)s' 模块中应该按照下面所阐述的错误类别和层次关系来处理各种错误信息: Warning 当有严重警告时触发,例如插入数据是被截断等等。必须是Python StandardError的子类(定义于exceptions模块中)。 Error 这应该是警告以外所有其他错误类的基类。你可以使用这个类在单一 的‘except’语句中捕捉所有的错误。警告(Warnings)不应认为是错误, 因此不应该以此类作为基类,而只应该以Python StandardError作为基类。 InterfaceError 当有数据库接口模块本身的错误(而不是数据库的错误)发生时触发。 必须是Error的子类。 DatabaseError 和数据库有关的错误发生时触发。 必须是Error的子类。 DataError 当有数据处理时的错误发生时触发,例如:除零错误,数据超范围等等。 必须是DatabaseError的子类。 OperationalError 指非用户控制的,而是操作数据库时发生的错误。例如:连接意外断开、 数据库名未找到、事务处理失败、内存分配错误等等操作数据库是发生的错误。 必须是DatabaseError的子类。 IntegrityError 完整性相关的错误,例如外键检查失败等。必须是DatabaseError子类。 InternalError 数据库的内部错误,例如游标(cursor)失效了、事务同步失败等等。 必须是DatabaseError子类。 ProgrammingError 程序错误,例如数据表(table)没找到或已存在、SQL语句语法错误、 参数数量错误等等。必须是DatabaseError的子类。 NotSupportedError 不支持错误,指使用了数据库不支持的函数或API等。例如在连接对象上 使用.rollback()函数,然而数据库并不支持事务或者事务已关闭。 必须是DatabaseError的子类。 下面是错误类的层次关系: StandardError |__Warning |__Error |__InterfaceError |__DatabaseError |__DataError |__OperationalError |__IntegrityError |__InternalError |__ProgrammingError |__NotSupportedError 注:这些例外(Exceptions)的值(错误信息)并没有明确的定义,但是它们应该能够给用户指出足够好的错误提示。
连接对象(Connection Objects):
连接对象应该具有下面的方法:
.close()
马上关闭数据连接(而不是当__del__方法被调用的时候)。
连接应该此后变得不可用,再次访问本连接对象应该触发
一个错误(Error或其子类),同样所有使用本连接对象的游标(cursor)
对象,也会导致例外发生。
需要注意的是,在关闭连接对象之前,没有首先提交对数据库的改变
将会导致一个隐含的回滚动作(rollback),这将丢弃之前的数据改变操作。
.commit()
提交任何挂起的事务到数据库中。
需要注意的是,如果数据库支持自动提交(auto-commit),必须
在初始化时关闭。一般会有一个接口函数关闭此特性。
不支持事务的数据库也应该实现此方法,只需什么都不做。
.rollback()
由于并非所有数据库都支持事务,此方法是可选的。[3]
对于支持事务的数据库,调用此方法将导致数据库回滚到事务
开始时的状态。关闭数据库连接之前没有明确调用commit()提交
数据更新,将隐含导致rollback()被执行。
.cursor()
方法返回给定连接上建立的游标对象(Cursor Object)。如果数据库
没有提供对应的游标对象,那么将由程序来模拟实现游标功能。[4]
Cursor Objects:
.execute*() 游标对象表示数据库游标,游标用来管理获取结果操作的上下文对象。 同一个连接对象创建的游标对象不是彼此隔离的,也就是说一个游标对象 对数据库造成的变化将会对于其他游标对象立即可见。而不同的连接对象 创建的游标,则可能是隔离的也可能是非隔离的,这取决于数据库对事务 的支持的实现(参见连接对象的.rollback()和.commit()方法)。 游标对象应具有以下的方法和属性: .description 这是一个只读属性,是7个项目组成的tulip的序列。 每个tulip包含描述一个结果集中的列的信息描述: (name, type_code, display_size, internal_size, precision, scale, null_ok) 其中,前两个项目(name and type_code)时必须的,其他的五项是可选的。 如果没有意义可以设置为None。 对于没有返回结果集的操作或者游标对象还没有执行过任何.execute*()的操作 本属性可以为空(None)。 type_code的含义可以比对下面Type对象的描述。 .rowcount 这是一个只读属性,描述的是最后一次数据库操作影响的数据行数 (执行.execute系列方法)。 可以是数据查询语句(DQL),比如 'select'等返回的数据行,也可以是数据操纵语句(DML )比如'update' 和'insert'语句 等所影响的数据行。 如果还没有执行过任何语句,或者操作本身影响的函数由于数据访问接口的原因不能检测到。 则本属性的值为-1 [7] 注:将来的版本有可能重新定义后一种情况,使其取值为空(None)而不是-1。 .callproc(procname[,parameters]) (由于并非每种数据库都支持存储过程,此方法是可选的 [3]) 调用数据库存储过程时,首先必须给出存储过程的名字,其次,对于存储过程需要的 每一个参数都必须依次给出。调用结果按照调用时的次序,输入型参数(Input parameters) 原样不动,输出型和输入输出二合一型参数可能会被新的内容代替。 存储过程也很可能以数据结果集作为返回结果,此时就要用标准的fech系列方法来 获取结果了(.fetch*() methods)。 .close() 立即关闭游标(不论 __del__方法是否已调用)。从此刻开始游标对象就变得不可用了。 任何试图访问此游标对象的方法或属性的动作都将导致一个错误Error或其子类被抛出。 .execute(operation[,parameters]) 准备和执行数据库操作(查询或其他命令)。所提供参数将会被绑定 到语句中的变量。变量的定义和数据库模块有关。(请参见模块的 paramstyle属性的描述)。[5] 游标对象将会保留这个操作的引用,如果一个后续的相同的操作被调用, 游标对象将会以此来进行优化。当有相同的操作调用(不同的参数变量被传递) 时,这是最为有效的优化。 一项数据库操作,为了获得最大的执行效率,最好先期使用方法.setinputsizes() 来 指定参数的类型和大小。执行时实际给出的参数和预定义的不同也是合法的,模块的实现 需要容忍这个问题,即使以效率的损失为代价。 参数可以以tuples的tuple或list的形式提供,例如当在一次调用中插入多行数据。但是 这种调用应被认为是抛弃的不建议使用,应该使用专用的方法.executemany() 。 没有对返回值进行明确界定。 .executemany(operation,seq_of_parameters) 准备数据库操作(查询或其他命令),然后以序列的序列形式的函数 来执行该操作。 模块开发这可以*选择是转化为一系列的.execute() 方法调用,还是以 数组操作的形式,以便使数据库把这个序列的操作作为一个整体。 使用此方法,可能产生一个或多个由未知的行为构成的结果集。 建议模块作者(而不是要求)当检测到一次调用已经产生结果集时抛出例外。 对于.execute()方法的描述同样可适于此方法。 返回值未定义。 .fetchone() 从一查询结果集中获取下一行数据,返回值为一个值的序列,如果没有更多数据 了则返回None。[6] 如果上次的.execute系列方法的调用没有生成任何结果集()或还没有进行任何数据 库操作的调用,则调用此方法将抛出例外(Error或其子类)。 .fetchmany([size=cursor.arraysize]) Fetch the next set of rows of a query result, returning a sequence of sequences (e.g. a list of tuples). An empty sequence is returned when no more rows are available. The number of rows to fetch per call is specified by the parameter. If it is not given, the cursor's arraysize determines the number of rows to be fetched. The method should try to fetch as many rows as indicated by the size parameter. If this is not possible due to the specified number of rows not being available, fewer rows may be returned. An Error (or subclass) exception is raised if the previous call to .execute*() did not produce any result set or no call was issued yet. Note there are performance considerations involved with the size parameter. For optimal performance, it is usually best to use the arraysize attribute. If the size parameter is used, then it is best for it to retain the same value from one .fetchmany() call to the next. .fetchall() Fetch all (remaining) rows of a query result, returning them as a sequence of sequences (e.g. a list of tuples). Note that the cursor's arraysize attribute can affect the performance of this operation. An Error (or subclass) exception is raised if the previous call to .execute*() did not produce any result set or no call was issued yet. .nextset() (This method is optional since not all databases support multiple result sets. [3]) This method will make the cursor skip to the next available set, discarding any remaining rows from the current set. If there are no more sets, the method returns None. Otherwise, it returns a true value and subsequent calls to the fetch methods will return rows from the next result set. An Error (or subclass) exception is raised if the previous call to .execute*() did not produce any result set or no call was issued yet. .arraysize This read/write attribute specifies the number of rows to fetch at a time with .fetchmany(). It defaults to 1 meaning to fetch a single row at a time. Implementations must observe this value with respect to the .fetchmany() method, but are free to interact with the database a single row at a time. It may also be used in the implementation of .executemany(). .setinputsizes(sizes) This can be used before a call to .execute*() to predefine memory areas for the operation's parameters. sizes is specified as a sequence -- one item for each input parameter. The item should be a Type Object that corresponds to the input that will be used, or it should be an integer specifying the maximum length of a string parameter. If the item is None, then no predefined memory area will be reserved for that column (this is useful to avoid predefined areas for large inputs). This method would be used before the .execute*() method is invoked. Implementations are free to have this method do nothing and users are free to not use it. .setoutputsize(size[,column]) Set a column buffer size for fetches of large columns (e.g. LONGs, BLOBs, etc.). The column is specified as an index into the result sequence. Not specifying the column will set the default size for all large columns in the cursor. This method would be used before the .execute*() method is invoked. Implementations are free to have this method do nothing and users are free to not use it.
数据类型对象及构造(Type Objects and Constructors):
Many databases need to have the input in a particular format for
binding to an operation's input parameters. For example, if an
input is destined for a DATE column, then it must be bound to the
database in a particular string format. Similar problems exist
for "Row ID" columns or large binary items (e.g. blobs or RAW
columns). This presents problems for Python since the parameters
to the .execute*() method are untyped. When the database module
sees a Python string object, it doesn't know if it should be bound
as a simple CHAR column, as a raw BINARY item, or as a DATE.
To overcome this problem, a module must provide the constructors
defined below to create objects that can hold special values.
When passed to the cursor methods, the module can then detect the
proper type of the input parameter and bind it accordingly.
A Cursor Object's description attribute returns information about
each of the result columns of a query. The type_code must compare
equal to one of Type Objects defined below. Type Objects may be
equal to more than one type code (e.g. DATETIME could be equal to
the type codes for date, time and timestamp columns; see the
Implementation Hints below for details).
The module exports the following constructors and singletons:
Date(year,month,day)
This function constructs an object holding a date value.
Time(hour,minute,second)
This function constructs an object holding a time value.
Timestamp(year,month,day,hour,minute,second)
This function constructs an object holding a time stamp
value.
DateFromTicks(ticks)
This function constructs an object holding a date value
from the given ticks value (number of seconds since the
epoch; see the documentation of the standard Python time
module for details).
TimeFromTicks(ticks)
This function constructs an object holding a time value
from the given ticks value (number of seconds since the
epoch; see the documentation of the standard Python time
module for details).
TimestampFromTicks(ticks)
This function constructs an object holding a time stamp
value from the given ticks value (number of seconds since
the epoch; see the documentation of the standard Python
time module for details).
Binary(string)
This function constructs an object capable of holding a
binary (long) string value.
STRING
This type object is used to describe columns in a database
that are string-based (e.g. CHAR).
BINARY
This type object is used to describe (long) binary columns
in a database (e.g. LONG, RAW, BLOBs).
NUMBER
This type object is used to describe numeric columns in a
database.
DATETIME
This type object is used to describe date/time columns in
a database.
ROWID
This type object is used to describe the "Row ID" column
in a database.
SQL NULL values are represented by the Python None singleton on
input and output.
Note: Usage of Unix ticks for database interfacing can cause
troubles because of the limited date range they cover.
给模块作者的实现方法示意(Implementation Hints for Module Authors):
* 时间/日期对象(Date/time objects)可以用Python的datetime模块中
的对象来实现(Python 2.3版本开始提供,2.4版本开始提供C版本API),
或者使用mxDateTime包中对象(可供Python1.5.2版本以上使用)。
它们都提供有足够的构造方法,使用方法(在Python中和C中使用都可以)。
* 下面是一个Unix下基于ticks构造为通用的date/time对象的代理示例:
import time def DateFromTicks(ticks): return Date(*time.localtime(ticks)[:3]) def TimeFromTicks(ticks): return Time(*time.localtime(ticks)[3:6]) def TimestampFromTicks(ticks): return Timestamp(*time.localtime(ticks)[:6])
* The preferred object type for Binary objects are the buffer types available in standard Python starting with version 1.5.2. Please see the Python documentation for details. For information about the C interface have a look at Include/bufferobject.h and Objects/bufferobject.c in the Python source distribution. * This Python class allows implementing the above type objects even though the description type code field yields multiple values for on type object: class DBAPITypeObject: def __init__(self,*values): self.values = values def __cmp__(self,other): if other in self.values: return 0 if other < self.values: return 1 else: return -1 The resulting type object compares equal to all values passed to the constructor. * Here is a snippet of Python code that implements the exception hierarchy defined above: import exceptions class Error(exceptions.StandardError): pass class Warning(exceptions.StandardError): pass class InterfaceError(Error): pass class DatabaseError(Error): pass class InternalError(DatabaseError): pass class OperationalError(DatabaseError): pass class ProgrammingError(DatabaseError): pass class IntegrityError(DatabaseError): pass class DataError(DatabaseError): pass class NotSupportedError(DatabaseError): pass In C you can use the PyErr_NewException(fullname, base, NULL) API to create the exception objects.
可选的DB API扩展(Optional DB API Extensions):
在 DB API 2.0生命期内,模块作者经常扩展他们的实现提供超越规范要求的
内容,为了增强兼容性和指明清晰的升级到将来DB API版本的道路,本章节
定义了一系列的通用对于DB API2.0核心规范内容的扩展。
对于所有的DB API可选规范,数据库模块作者可*选择是否实现这些附加的
方法和属性,因此使用它们很有可能导致抛出一个AttributeError或者是
NotSupportedError,因为这些方法属性是否支持智能在运行时做检查。
因此,已建议在用户使用这些扩展内容时会得到警告消息(Python warnings)
为了使这项特性好用,警告消息必须标准化以便可以屏蔽它们。这些标准的信息
下面称之为“警告信息”("Warning Message")。
Cursor Attribute .rownumber
This read-only attribute should provide the current 0-based
index of the cursor in the result set or None if the index
cannot be determined.
The index can be seen as index of the cursor in a sequence
(the result set). The next fetch operation will fetch the row
indexed by .rownumber in that sequence.
警告信息(Warning Message): "DB-API extension cursor.rownumber used"
Connection Attributes .Error, .ProgrammingError, etc.
All exception classes defined by the DB API standard should be
exposed on the Connection objects as attributes (in addition
to being available at module scope).
These attributes simplify error handling in multi-connection
environments.
警告信息(Warning Message): "DB-API extension connection.<exception> used"
Cursor Attributes .connection
This read-only attribute return a reference to the Connection
object on which the cursor was created.
The attribute simplifies writing polymorph code in
multi-connection environments.
Warning Message: "DB-API extension cursor.connection used"
Cursor Method .scroll(value[,mode='relative'])
Scroll the cursor in the result set to a new position according
to mode.
If mode is 'relative' (default), value is taken as offset to
the current position in the result set, if set to 'absolute',
value states an absolute target position.
An IndexError should be raised in case a scroll operation would
leave the result set. In this case, the cursor position is left
undefined (ideal would be to not move the cursor at all).
Note: This method should use native scrollable cursors, if
available , or revert to an emulation for forward-only
scrollable cursors. The method may raise NotSupportedErrors to
signal that a specific operation is not supported by the
database (e.g. backward scrolling).
警告信息(Warning Message): "DB-API extension cursor.scroll() used"
Cursor Attribute .messages
This is a Python list object to which the interface appends
tuples (exception class, exception value) for all messages
which the interfaces receives from the underlying database for
this cursor.
The list is cleared by all standard cursor methods calls (prior
to executing the call) except for the .fetch*() calls
automatically to avoid excessive memory usage and can also be
cleared by executing "del cursor.messages[:]".
All error and warning messages generated by the database are
placed into this list, so checking the list allows the user to
verify correct operation of the method calls.
The aim of this attribute is to eliminate the need for a
Warning exception which often causes problems (some warnings
really only have informational character).
警告信息(Warning Message): "DB-API extension cursor.messages used"
Connection Attribute .messages
Same as cursor.messages except that the messages in the list
are connection oriented.
The list is cleared automatically by all standard connection
methods calls (prior to executing the call) to avoid excessive
memory usage and can also be cleared by executing "del
connection.messages[:]".
警告信息(Warning Message):"DB-API extension connection.messages used"
Cursor Method .next()
Return the next row from the currently executing SQL statement
using the same semantics as .fetchone(). A StopIteration
exception is raised when the result set is exhausted for Python
versions 2.2 and later. Previous versions don't have the
StopIteration exception and so the method should raise an
IndexError instead.
警告信息(Warning Message):"DB-API extension cursor.next() used"
Cursor Method .__iter__()
Return self to make cursors compatible to the iteration
protocol [8].
警告信息(Warning Message):"DB-API extension cursor.__iter__() used"
Cursor Attribute .lastrowid
This read-only attribute provides the rowid of the last
modified row (most databases return a rowid only when a single
INSERT operation is performed). If the operation does not set
a rowid or if the database does not support rowids, this
attribute should be set to None.
The semantics of .lastrowid are undefined in case the last
executed statement modified more than one row, e.g. when
using INSERT with .executemany().
警告信息(Warning Message): "DB-API extension cursor.lastrowid used"
可选的错误处理扩展(Optional Error Handling Extensions):
The core DB API specification only introduces a set of exceptions
which can be raised to report errors to the user. In some cases,
exceptions may be too disruptive for the flow of a program or even
render execution impossible.
For these cases and in order to simplify error handling when
dealing with databases, database module authors may choose to
implement user defineable error handlers. This section describes a
standard way of defining these error handlers.
Cursor/Connection Attribute .errorhandler
Read/write attribute which references an error handler to call
in case an error condition is met.
The handler must be a Python callable taking the following
arguments:
errorhandler(connection, cursor, errorclass, errorvalue)
where connection is a reference to the connection on which the
cursor operates, cursor a reference to the cursor (or None in
case the error does not apply to a cursor), errorclass is an
error class which to instantiate using errorvalue as
construction argument.
The standard error handler should add the error information to
the appropriate .messages attribute (connection.messages or
cursor.messages) and raise the exception defined by the given
errorclass and errorvalue parameters.
If no errorhandler is set (the attribute is None), the
standard error handling scheme as outlined above, should be
applied.
Warning Message: "DB-API extension .errorhandler used"
Cursors should inherit the .errorhandler setting from their
connection objects at cursor creation time.
可选两阶段提交扩展(Optional Two-Phase Commit Extensions):
Many databases have support for two-phase commit (TPC) which
allows managing transactions across multiple database connections
and other resources.
If a database backend provides support for two-phase commit and
the database module author wishes to expose this support, the
following API should be implemented. NotSupportedError should be
raised, if the database backend support for two-phase commit
can only be checked at run-time.
TPC Transaction IDs
As many databases follow the XA specification, transaction IDs
are formed from three components:
* a format ID
* a global transaction ID
* a branch qualifier
For a particular global transaction, the first two components
should be the same for all resources. Each resource in the
global transaction should be assigned a different branch
qualifier.
The various components must satisfy the following criteria:
* format ID: a non-negative 32-bit integer.
* global transaction ID and branch qualifier: byte strings no
longer than 64 characters.
Transaction IDs are created with the .xid() connection method:
.xid(format_id, global_transaction_id, branch_qualifier)
Returns a transaction ID object suitable for passing to the
.tpc_*() methods of this connection.
If the database connection does not support TPC, a
NotSupportedError is raised.
The type of the object returned by .xid() is not defined, but
it must provide sequence behaviour, allowing access to the
three components. A conforming database module could choose
to represent transaction IDs with tuples rather than a custom
object.
TPC Connection Methods
.tpc_begin(xid)
Begins a TPC transaction with the given transaction ID xid.
This method should be called outside of a transaction
(i.e. nothing may have executed since the last .commit() or
.rollback()).
Furthermore, it is an error to call .commit() or .rollback()
within the TPC transaction. A ProgrammingError is raised, if
the application calls .commit() or .rollback() during an
active TPC transaction.
If the database connection does not support TPC, a
NotSupportedError is raised.
.tpc_prepare()
Performs the first phase of a transaction started with
.tpc_begin(). A ProgrammingError should be raised if this
method outside of a TPC transaction.
After calling .tpc_prepare(), no statements can be executed
until tpc_commit() or tpc_rollback() have been called.
.tpc_commit([xid])
When called with no arguments, .tpc_commit() commits a TPC
transaction previously prepared with .tpc_prepare().
If .tpc_commit() is called prior to .tpc_prepare(), a single
phase commit is performed. A transaction manager may choose
to do this if only a single resource is participating in the
global transaction.
When called with a transaction ID xid, the database commits
the given transaction. If an invalid transaction ID is
provided, a ProgrammingError will be raised. This form should
be called outside of a transaction, and is intended for use in
recovery.
On return, the TPC transaction is ended.
.tpc_rollback([xid])
When called with no arguments, .tpc_rollback() rolls back a
TPC transaction. It may be called before or after
.tpc_prepare().
When called with a transaction ID xid, it rolls back the given
transaction. If an invalid transaction ID is provided, a
ProgrammingError is raised. This form should be called
outside of a transaction, and is intended for use in recovery.
On return, the TPC transaction is ended.
.tpc_recover()
Returns a list of pending transaction IDs suitable for use
with .tpc_commit(xid) or .tpc_rollback(xid).
If the database does not support transaction recovery, it may
return an empty list or raise NotSupportedError.
常见问题(Frequently Asked Questions):
在论坛中经常看到关于DB API规范的重复性的问题。本节包括了一些人们常问的问题。
Question:
当我使用.fetch*()之类的函数获取结果时,如何获取到一个字典形式的结果集而不是tuples。
Answer:
有几个可用工具来解决这个问题。多数都是利用了游标对象的.description
属性作为基础来实现数据行的字典。
注意,之所以没有扩展DB API规范来支持.fetch系列方法来返回字典,是因为
这种方法有几个弊端。
* 一些数据库及服务不支持区分字段名的大小写,或者自动把字段名转化为大
写或小写。
* 查询所生成的结果集中的字段不一定是表的字段名,并且数据库经常为这些列
使用自己的方法来为这些字段生成名字。
因此,要做到在不同的数据库中,都通过使用字典键值来分访问字段值,并且做到可移植的是
不可能的。
1.0到2.0的主要改变(Major Changes from Version 1.0 to Version 2.0):
Python DB API 2.0相对于1.0来说引入了几个很重大的改变。由于其中一些
变动会导致已有的基于DB API 1.0的脚本不能运行。因此做了主版本号的改变,
升级为DB-API 2.0规范来反映这些变化。
下面这些是从1.0 到2.0最重要的改变:
* 不在需要单独的dbi模块,而是直接打包进数据库访问模块当中。
* 日期/时间类型添加了新的构造,RAW类型改名为BINARY。结果集中应该
覆盖现代SQL数据库中的基本数据类型。
* 为了更好的数据库绑定,添加了新的常量(apilevel, threadlevel, paramstyle)
和方法(.executemany(), .nextset())。
* 明确定义了需要用来访问存储过程的的方法.callproc()。
* 方法.execute()的返回值定义有所改变。前期版本中,返回值定义是基于
所执行的SQL语句类型的(这经常比较难以实现)-- 下载它没有了明确的
定义;代替它的是用户应该访问更适合的.rowcount属性。模块作者可以仍然
返回旧式的定义值,但是规范中不再会有明确定义,而且应该认为是取决于
不同的数据访问模块的。
* 例外的类在新的规范中有统一明确的定义。模块作者可以任意的以继承类的
形式来扩展新规范中所定义例外的层次。
DB API 2.0规范的追加扩展规范:
* 定义了附加的可选的对核心DB-API功能的扩展功能集。
遗留问题(Open Issues):
尽管2.0版本阐明了许多1.0版本遗留的问题 ,但是仍有一些遗留问题留待以后的版本来
实现解决:
* Define a useful return value for .nextset() for the case where
a new result set is available.
* Integrate the decimal module Decimal object for use as
loss-less monetary and decimal interchange format.
脚注(Footnotes):
[1] 作为实现准则,连接对象the connection constructor parameters should be
implemented as keyword parameters for more intuitive use and
follow this order of parameters:
dsn Data source name as string
user User name as string (optional)
password Password as string (optional)
host Hostname (optional)
database Database name (optional)
E.g. a connect could look like this:
connect(dsn='myhost:MYDB',user='guido',password='234$')
[2] Module implementors should prefer 'numeric', 'named' or
'pyformat' over the other formats because these offer more
clarity and flexibility.
[3] If the database does not support the functionality required
by the method, the interface should throw an exception in
case the method is used.
The preferred approach is to not implement the method and
thus have Python generate an AttributeError in
case the method is requested. This allows the programmer to
check for database capabilities using the standard
hasattr() function.
For some dynamically configured interfaces it may not be
appropriate to require dynamically making the method
available. These interfaces should then raise a
NotSupportedError to indicate the non-ability
to perform the roll back when the method is invoked.
[4] a database interface may choose to support named cursors by
allowing a string argument to the method. This feature is
not part of the specification, since it complicates
semantics of the .fetch*() methods.
[5] The module will use the __getitem__ method of the parameters
object to map either positions (integers) or names (strings)
to parameter values. This allows for both sequences and
mappings to be used as input.
The term "bound" refers to the process of binding an input
value to a database execution buffer. In practical terms,
this means that the input value is directly used as a value
in the operation. The client should not be required to
"escape" the value so that it can be used -- the value
should be equal to the actual database value.
[6] Note that the interface may implement row fetching using
arrays and other optimizations. It is not
guaranteed that a call to this method will only move the
associated cursor forward by one row.
[7] The rowcount attribute may be coded in a way that updates
its value dynamically. This can be useful for databases that
return usable rowcount values only after the first call to
a .fetch*() method.
[8] Implementation Note: Python C extensions will have to
implement the tp_iter slot on the cursor object instead of the
.__iter__() method.
鸣谢(Acknowledgements):
非常感谢Andrew Kuchling,是他把Python Database API Specification 2.0由
原始的HTML格式转为了PEP格式。
非常感谢James Henstridge领导两阶段提交扩展API的讨论并最终使其标准化。