Further work on adjusting attribute, method and parameter names to be
consistent and to comply with PEP 8 naming guidelines; also adjust implementation of #385 (originally done in pull request #549) to use the parameter name `bypass_decode` instead of `bypassencoding`.
parent
ab6e6f06ef
commit
96f938286d
@ -52,7 +52,7 @@ Cursor Object
|
||||
The DB API definition does not define this attribute.
|
||||
|
||||
|
||||
.. method:: Cursor.arrayvar(data_type, value, [size])
|
||||
.. method:: Cursor.arrayvar(typ, value, [size])
|
||||
|
||||
Create an array variable associated with the cursor of the given type and
|
||||
size and return a :ref:`variable object <varobj>`. The value is either an
|
||||
@ -587,19 +587,19 @@ Cursor Object
|
||||
The DB API definition does not define this attribute.
|
||||
|
||||
|
||||
.. method:: Cursor.var(dataType, [size, arraysize, inconverter, outconverter, \
|
||||
typename, encodingErrors, bypassencoding])
|
||||
.. method:: Cursor.var(typ, [size, arraysize, inconverter, outconverter, \
|
||||
typename, encoding_errors, bypass_decode])
|
||||
|
||||
Create a variable with the specified characteristics. This method was
|
||||
designed for use with PL/SQL in/out variables where the length or type
|
||||
cannot be determined automatically from the Python object passed in or for
|
||||
use in input and output type handlers defined on cursors or connections.
|
||||
|
||||
The dataType parameter specifies the type of data that should be stored in
|
||||
the variable. This should be one of the
|
||||
:ref:`database type constants <dbtypes>`, :ref:`DB API constants <types>`,
|
||||
an object type returned from the method :meth:`Connection.gettype()` or one
|
||||
of the following Python types:
|
||||
The typ parameter specifies the type of data that should be stored in the
|
||||
variable. This should be one of the :ref:`database type constants
|
||||
<dbtypes>`, :ref:`DB API constants <types>`, an object type returned from
|
||||
the method :meth:`Connection.gettype()` or one of the following Python
|
||||
types:
|
||||
|
||||
.. list-table::
|
||||
:header-rows: 1
|
||||
@ -642,17 +642,29 @@ Cursor Object
|
||||
specified when using type :data:`cx_Oracle.OBJECT` unless the type object
|
||||
was passed directly as the first parameter.
|
||||
|
||||
The encodingErrors parameter specifies what should happen when decoding
|
||||
The encoding_errors parameter specifies what should happen when decoding
|
||||
byte strings fetched from the database into strings. It should be one of
|
||||
the values noted in the builtin
|
||||
`decode <https://docs.python.org/3/library/stdtypes.html#bytes.decode>`__
|
||||
function.
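As a rough illustration of the error handlers that `encoding_errors` refers to, the sketch below calls `bytes.decode()` directly, with no database involved. The sample bytes are an assumption chosen for the example: "Fiancé" encoded as Latin-1, which is not valid UTF-8.

```python
# Illustration only: the error handlers accepted by bytes.decode().
# The sample value is hypothetical and not taken from any database.
raw = b"Fianc\xe9"  # "Fiancé" encoded as Latin-1; invalid as UTF-8

# errors="strict" is the default and raises on invalid input
try:
    raw.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# "replace" substitutes U+FFFD for the invalid byte
replaced = raw.decode("utf-8", errors="replace")

# "ignore" silently drops the invalid byte
ignored = raw.decode("utf-8", errors="ignore")
```

Both non-strict handlers lose information, which is why the raw bytes can be more useful when the goal is to repair the data.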
|
||||
|
||||
The bypassencoding parameter, if specified, should be passed as
|
||||
boolean. This feature allows results of database types CHAR, NCHAR,
|
||||
LONG_STRING, NSTRING, STRING to be returned raw meaning cx_Oracle
|
||||
won't do any decoding conversion. See
|
||||
:ref:`Fetching raw data <fetching-raw-data>` for more information.
|
||||
The bypass_decode parameter, if specified, should be passed as a
|
||||
boolean value. Passing a `True` value causes values of database types
|
||||
:data:`~cx_Oracle.DB_TYPE_VARCHAR`, :data:`~cx_Oracle.DB_TYPE_CHAR`,
|
||||
:data:`~cx_Oracle.DB_TYPE_NVARCHAR`, :data:`~cx_Oracle.DB_TYPE_NCHAR` and
|
||||
:data:`~cx_Oracle.DB_TYPE_LONG` to be returned as `bytes` instead of `str`,
|
||||
meaning that cx_Oracle doesn't do any decoding. See :ref:`Fetching raw
|
||||
data <fetching-raw-data>` for more information.
|
||||
|
||||
.. versionadded:: 8.2
|
||||
|
||||
The parameter `bypass_decode` was added.
|
||||
|
||||
.. versionchanged:: 8.2
|
||||
|
||||
For consistency and compliance with the PEP 8 naming style, the
|
||||
parameter `encodingErrors` was renamed to `encoding_errors`. The old
|
||||
name will continue to work as a keyword parameter for a period of time.
|
||||
|
||||
.. note::
|
||||
|
||||
|
||||
@ -68,6 +68,8 @@ if applicable. The most recent deprecations are listed first.
|
||||
- Replace with parameter name `keyword_parameters`
|
||||
* - `keywordParameters` parameter to :meth:`Cursor.callproc()`
|
||||
- Replace with parameter name `keyword_parameters`
|
||||
* - `encodingErrors` parameter to :meth:`Cursor.var()`
|
||||
- Replace with parameter name `encoding_errors`
|
||||
* - `Cursor.fetchraw()`
|
||||
- Replace with :meth:`Cursor.fetchmany()`
|
||||
* - `Queue.deqMany`
|
||||
|
||||
@ -26,6 +26,12 @@ Version 8.2 (TBD)
|
||||
:meth:`cx_Oracle.SessionPool()` in order to permit specifying the size of
|
||||
the statement cache during the creation of pools and standalone
|
||||
connections.
|
||||
#) Added parameter `bypass_decode` to :meth:`Cursor.var()` in order to allow
|
||||
the `decode` step to be bypassed when converting data from Oracle Database
|
||||
into Python strings
|
||||
(`issue 385 <https://github.com/oracle/python-cx_Oracle/issues/385>`__).
|
||||
Initial work was done in `PR 549
|
||||
<https://github.com/oracle/python-cx_Oracle/pull/549>`__.
|
||||
#) Threaded mode is now always enabled when creating connection pools with
|
||||
:meth:`cx_Oracle.SessionPool()`. Any `threaded` parameter value is ignored.
|
||||
#) Eliminated a memory leak when calling :meth:`SodaOperation.filter()` with a
|
||||
|
||||
@ -288,7 +288,7 @@ or the value ``None``. The value ``None`` indicates that the default type
|
||||
should be used.
|
||||
|
||||
Examples of output handlers are shown in :ref:`numberprecision`,
|
||||
:ref:`directlobs` and :ref:`fetching-raw-data`. Also see samples such as `samples/TypeHandlers.py
|
||||
:ref:`directlobs` and :ref:`fetching-raw-data`. Also see samples such as `samples/type_handlers.py
|
||||
<https://github.com/oracle/python-cx_Oracle/blob/master/samples/type_handlers.py>`__
|
||||
|
||||
.. _numberprecision:
|
||||
@ -347,82 +347,73 @@ See `samples/return_numbers_as_decimals.py
|
||||
.. _fetching-raw-data:
|
||||
|
||||
Fetching Raw Data
|
||||
---------------------
|
||||
-----------------
|
||||
|
||||
Sometimes cx_Oracle may have problems converting data to unicode and you may
|
||||
want to inspect the problem closer rather than auto-fix it using the
|
||||
encodingerrors parameter. This may be useful when a database contains
|
||||
records or fields that are in a wrong encoding altogether.
|
||||
Sometimes cx_Oracle may have problems converting data stored in the database to
|
||||
Python strings. This can occur if the data stored in the database doesn't match
|
||||
the character set defined by the database. The `encoding_errors` parameter to
|
||||
:meth:`Cursor.var()` permits the data to be returned with some invalid data
|
||||
replaced, but for additional control the parameter `bypass_decode` can be set
|
||||
to `True` and cx_Oracle will bypass the decode step and return `bytes` instead
|
||||
of `str` for data stored in the database as strings. The data can then be
|
||||
examined and corrected as required. This approach should only be used for
|
||||
troubleshooting and correcting invalid data, not for general use!
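Once the values arrive as `bytes`, one possible way to examine them is to try a list of candidate encodings in order. This is a minimal sketch and not part of cx_Oracle: the helper name `decode_with_fallback` and the candidate list are assumptions made for illustration.

```python
def decode_with_fallback(raw, encodings=("utf-8", "windows-1252")):
    """Return (text, encoding) for the first candidate that decodes raw.

    Hypothetical helper: the candidate encodings are an assumption and
    should be replaced by whatever encodings are plausible for the data.
    """
    for encoding in encodings:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings could decode the data")


# the same text stored in two different encodings
as_utf8 = "cafetière".encode("utf-8")
as_cp1252 = "cafetière".encode("windows-1252")

utf8_result = decode_with_fallback(as_utf8)
cp1252_result = decode_with_fallback(as_cp1252)
```

The order of the candidates matters: a permissive single-byte encoding will accept almost any input, so stricter encodings such as UTF-8 should be tried first.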
|
||||
|
||||
It is not recommended to use mixed encodings in databases.
|
||||
This functionality is aimed at troubleshooting databases
|
||||
that have inconsistent encodings for external reasons.
|
||||
|
||||
For these cases, you can pass in the in additional keyword argument
|
||||
``bypassencoding = True`` into :meth:`Cursor.var()`. This needs
|
||||
to be used in combination with :ref:`outputtypehandlers`
|
||||
The following sample demonstrates how to use this feature:
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
#defining output type handlers method
|
||||
def ConvertStringToBytes(cursor, name, defaultType, size, precision, scale):
|
||||
if defaultType == cx_Oracle.STRING:
|
||||
return cursor.var(str, arraysize=cursor.arraysize, bypassencoding = True)
|
||||
# define output type handler
|
||||
def return_strings_as_bytes(cursor, name, default_type, size,
|
||||
precision, scale):
|
||||
if default_type == cx_Oracle.DB_TYPE_VARCHAR:
|
||||
return cursor.var(str, arraysize=cursor.arraysize,
|
||||
bypass_decode=True)
|
||||
|
||||
#set cursor outputtypehandler to the method above
|
||||
cursor = connection.cursor()
|
||||
ursor.outputtypehandler = ConvertStringToBytes
|
||||
# set output type handler on cursor before fetching data
|
||||
with connection.cursor() as cursor:
|
||||
cursor.outputtypehandler = return_strings_as_bytes
|
||||
cursor.execute("select content, charset from SomeTable")
|
||||
data = cursor.fetchall()
|
||||
|
||||
This will produce output as::
|
||||
|
||||
[(b'Fianc\xc3\xa9', b'UTF-8')]
|
||||
|
||||
|
||||
This will allow you to receive data as raw bytes.
|
||||
Note that last \xc3\xa9 is é in UTF-8. Since this is valid UTF-8 you can then
|
||||
perform a decode on the data (the part that was bypassed):
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
statement = cursor.execute("select content, charset from SomeTable")
|
||||
data = statement.fetchall()
|
||||
value = data[0][0].decode("UTF-8")
|
||||
|
||||
This will return the value "Fiancé".
|
||||
|
||||
This will produce output as:
|
||||
If you want to save ``b'Fianc\xc3\xa9'`` into the database directly without
|
||||
using a Python string, you will need to create a variable using
|
||||
:meth:`Cursor.var()` that specifies the type as
|
||||
:data:`~cx_Oracle.DB_TYPE_VARCHAR` (otherwise the value will be treated as
|
||||
:data:`~cx_Oracle.DB_TYPE_RAW`). The following sample demonstrates this:
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
[(b'Fianc\xc3\xa9', b'UTF-8')]
|
||||
|
||||
|
||||
Note that last \xc3\xa9 is é in UTF-8. Then in you can do following:
|
||||
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
import codecs
|
||||
# data = [(b'Fianc\xc3\xa9', b'UTF-8')]
|
||||
unicodecontent = data[0][0].decode(data[0][1].decode()) # Assuming your charset encoding is UTF-8
|
||||
|
||||
|
||||
This will revert it back to "Fiancé".
|
||||
|
||||
If you want to save ``b'Fianc\xc3\xa9'`` to database you will need to create
|
||||
:meth:`Cursor.var()` that will tell cx_Oracle that the value is indeed
|
||||
intended as a string:
|
||||
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
connection = cx_Oracle.connect("hr", userpwd, "dbhost.example.com/orclpdb1")
|
||||
cursor = connection.cursor()
|
||||
cursorvariable = cursor.var(cx_Oracle.STRING)
|
||||
cursorvariable.setvalue(0, "Fiancé".encode("UTF-8")) # b'Fianc\xc4\x9b'
|
||||
cursor.execute("update SomeTable set SomeColumn = :param where id = 1", param=cursorvariable)
|
||||
|
||||
|
||||
At that point, the bytes will be assumed to be in the correct encoding and should insert as you expect.
|
||||
with cx_Oracle.connect(user="hr", password=userpwd,
|
||||
dsn="dbhost.example.com/orclpdb1") as conn:
|
||||
with conn.cursor() as cursor:
|
||||
var = cursor.var(cx_Oracle.DB_TYPE_VARCHAR)
|
||||
var.setvalue(0, b"Fianc\xc3\xa9")
|
||||
cursor.execute("""
|
||||
update SomeTable set
|
||||
SomeColumn = :param
|
||||
where id = 1""",
|
||||
param=var)
|
||||
|
||||
.. warning::
|
||||
This functionality is "as-is": when saving strings like this,
|
||||
the bytes will be assumed to be in the correct encoding and will
|
||||
insert like that. Proper encoding is the responsibility of the user and
|
||||
no correctness of any data in the database can be assumed
|
||||
to exist by itself.
|
||||
|
||||
The database will assume that the bytes provided are in the character set
|
||||
expected by the database so only use this for troubleshooting or as
|
||||
directed.
|
||||
|
||||
|
||||
.. _outconverters:
|
||||
|
||||
@ -1,75 +0,0 @@
|
||||
# -*- coding: utf-8 -*-
|
||||
import cx_Oracle
|
||||
import sample_env
|
||||
|
||||
"The test below verifies that the option to work around saving and reading of inconsistent encodings works"
|
||||
|
||||
def ConvertStringToBytes(cursor, name, defaultType, size, precision, scale):
|
||||
if defaultType == cx_Oracle.STRING:
|
||||
return cursor.var(str, arraysize=cursor.arraysize, bypassencoding = True)
|
||||
|
||||
connection = cx_Oracle.connect(sample_env.get_main_connect_string())
|
||||
cursor = connection.cursor()
|
||||
|
||||
cursor.outputtypehandler = ConvertStringToBytes
|
||||
|
||||
sql = 'create table EncodingExperiment (content varchar2(100), encoding varchar2(15))'
|
||||
|
||||
print('Creating experiment table')
|
||||
try:
|
||||
cursor.execute(sql)
|
||||
print('Success, will attempt to add records')
|
||||
except Exception as err:
|
||||
# table already exists
|
||||
print('%s\n%s'%(err, 'EncodingExperiment table exists... Will attempt to add records'))
|
||||
|
||||
# variable that we will test encodings against
|
||||
unicode_string = 'I bought a cafetière on the Champs-Élysées'
|
||||
|
||||
# First test
|
||||
windows_1252_encoded = unicode_string.encode('windows-1252')
|
||||
# Second test
|
||||
utf8_encoded = unicode_string.encode('utf-8')
|
||||
|
||||
sqlparameters = [(windows_1252_encoded, 'windows-1252'), (utf8_encoded, 'utf-8')]
|
||||
|
||||
sql = 'insert into EncodingExperiment (content, encoding) values (:content, :encoding)'
|
||||
|
||||
# cx_Oracle string variable in which we will store byte value and insert it as such
|
||||
content_variable = cursor.var(cx_Oracle.STRING)
|
||||
|
||||
print('Adding records to the table: "EncodingExperiment"')
|
||||
for sqlparameter in sqlparameters:
|
||||
content, encoding = sqlparameter
|
||||
# setting content_variable value to a byte value and instert it as such
|
||||
content_variable.setvalue(0, content)
|
||||
cursor.execute(sql, content=content_variable, encoding=encoding)
|
||||
|
||||
sql = 'select * from EncodingExperiment'
|
||||
|
||||
print('Fetching records from table EncodingExperiment')
|
||||
result = cursor.execute(sql).fetchall()
|
||||
|
||||
for dataset in result:
|
||||
content, encoding = dataset[0], dataset[1].decode()
|
||||
decodedcontent = content.decode(encoding)
|
||||
print('Is "%s" == "%s" ?\nResult: %s, (decoded from: %s)'%(decodedcontent, unicode_string, decodedcontent == unicode_string, encoding))
|
||||
|
||||
print('Finished testing, will attempt to drop the table "EncodingExperiment"')
|
||||
# drop table after finished testing
|
||||
sql = 'drop table EncodingExperiment'
|
||||
try:
|
||||
cursor.execute(sql)
|
||||
print('Successfully droped table "EncodingExperiment" from database.')
|
||||
except Exception as err:
|
||||
print('Failed to drop table from the database, info: %s'%err)
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
49
samples/query_strings_as_bytes.py
Normal file
@ -0,0 +1,49 @@
|
||||
#------------------------------------------------------------------------------
|
||||
# Copyright (c) 2021, Oracle and/or its affiliates. All rights reserved.
|
||||
#------------------------------------------------------------------------------
|
||||
|
||||
#------------------------------------------------------------------------------
|
||||
# query_strings_as_bytes.py
|
||||
#
|
||||
# Demonstrates how to query strings as bytes (bypassing decoding of the bytes
|
||||
# into a Python string). This can be useful when attempting to fetch data that
|
||||
# was stored in the database in the wrong encoding.
|
||||
#
|
||||
# This script requires cx_Oracle 8.2 and higher.
|
||||
#------------------------------------------------------------------------------
|
||||
|
||||
import cx_Oracle as oracledb
|
||||
import sample_env
|
||||
|
||||
STRING_VAL = 'I bought a cafetière on the Champs-Élysées'
|
||||
|
||||
def return_strings_as_bytes(cursor, name, default_type, size, precision,
|
||||
scale):
|
||||
if default_type == oracledb.DB_TYPE_VARCHAR:
|
||||
return cursor.var(str, arraysize=cursor.arraysize, bypass_decode=True)
|
||||
|
||||
with oracledb.connect(sample_env.get_main_connect_string()) as conn:
|
||||
|
||||
# truncate table and populate with our data of choice
|
||||
with conn.cursor() as cursor:
|
||||
cursor.execute("truncate table TestTempTable")
|
||||
cursor.execute("insert into TestTempTable values (1, :val)",
|
||||
val=STRING_VAL)
|
||||
conn.commit()
|
||||
|
||||
# fetch the data normally and show that it is returned as a string
|
||||
with conn.cursor() as cursor:
|
||||
cursor.execute("select IntCol, StringCol from TestTempTable")
|
||||
print("Data fetched using normal technique:")
|
||||
for row in cursor:
|
||||
print(row)
|
||||
print()
|
||||
|
||||
# fetch the data, bypassing the decode and show that it is returned as
|
||||
# bytes
|
||||
with conn.cursor() as cursor:
|
||||
cursor.outputtypehandler = return_strings_as_bytes
|
||||
cursor.execute("select IntCol, StringCol from TestTempTable")
|
||||
print("Data fetched using bypass decode technique:")
|
||||
for row in cursor:
|
||||
print(row)
|
||||
@ -1809,27 +1809,39 @@ static PyObject *cxoCursor_setOutputSize(cxoCursor *cursor, PyObject *args)
|
||||
static PyObject *cxoCursor_var(cxoCursor *cursor, PyObject *args,
|
||||
PyObject *keywordArgs)
|
||||
{
|
||||
static char *keywordList[] = { "type", "size", "arraysize",
|
||||
"inconverter", "outconverter", "typename", "encodingErrors", "bypassencoding",
|
||||
NULL };
|
||||
static char *keywordList[] = { "typ", "size", "arraysize", "inconverter",
|
||||
"outconverter", "typename", "encoding_errors", "bypass_decode",
|
||||
"encodingErrors", NULL };
|
||||
Py_ssize_t encodingErrorsLength, encodingErrorsDeprecatedLength;
|
||||
const char *encodingErrors, *encodingErrorsDeprecated;
|
||||
PyObject *inConverter, *outConverter, *typeNameObj;
|
||||
Py_ssize_t encodingErrorsLength;
|
||||
int size, arraySize, bypassDecode;
|
||||
cxoTransformNum transformNum;
|
||||
const char *encodingErrors;
|
||||
cxoObjectType *objType;
|
||||
int size, arraySize, bypassEncoding;
|
||||
PyObject *type;
|
||||
cxoVar *var;
|
||||
|
||||
// parse arguments
|
||||
size = bypassEncoding = 0;
|
||||
encodingErrors = NULL;
|
||||
size = bypassDecode = 0;
|
||||
arraySize = cursor->bindArraySize;
|
||||
encodingErrors = encodingErrorsDeprecated = NULL;
|
||||
inConverter = outConverter = typeNameObj = NULL;
|
||||
if (!PyArg_ParseTupleAndKeywords(args, keywordArgs, "O|iiOOOz#p",
|
||||
if (!PyArg_ParseTupleAndKeywords(args, keywordArgs, "O|iiOOOz#pz#",
|
||||
keywordList, &type, &size, &arraySize, &inConverter, &outConverter,
|
||||
&typeNameObj, &encodingErrors, &encodingErrorsLength, &bypassEncoding))
|
||||
&typeNameObj, &encodingErrors, &encodingErrorsLength,
|
||||
&bypassDecode, &encodingErrorsDeprecated,
|
||||
&encodingErrorsDeprecatedLength))
|
||||
return NULL;
|
||||
if (encodingErrorsDeprecated) {
|
||||
if (encodingErrors) {
|
||||
cxoError_raiseFromString(cxoProgrammingErrorException,
|
||||
"encoding_errors and encodingErrors cannot both be "
|
||||
"specified");
|
||||
return NULL;
|
||||
}
|
||||
encodingErrors = encodingErrorsDeprecated;
|
||||
encodingErrorsLength = encodingErrorsDeprecatedLength;
|
||||
}
|
||||
|
||||
// determine the type of variable
|
||||
if (cxoTransform_getNumFromType(type, &transformNum, &objType) < 0)
|
||||
@ -1861,10 +1873,9 @@ static PyObject *cxoCursor_var(cxoCursor *cursor, PyObject *args,
|
||||
strcpy((char*) var->encodingErrors, encodingErrors);
|
||||
}
|
||||
|
||||
// Flag that manually changes transform type to bytes
|
||||
if (bypassEncoding) {
|
||||
// if the decode step is to be bypassed, use the binary transform instead
|
||||
if (bypassDecode)
|
||||
var->transformNum = CXO_TRANSFORM_BINARY;
|
||||
}
|
||||
|
||||
return (PyObject*) var;
|
||||
}
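The handling of the deprecated `encodingErrors` keyword in the C code above can be sketched in Python as follows. This only mirrors the control flow: the function name `resolve_encoding_errors` is hypothetical, and `ValueError` stands in for the programming error exception raised by the C implementation.

```python
def resolve_encoding_errors(encoding_errors=None, encodingErrors=None):
    """Return the effective value, accepting the deprecated keyword.

    Hypothetical sketch of the alias logic; ValueError is a stand-in for
    the exception type used by the real implementation.
    """
    if encodingErrors is not None:
        if encoding_errors is not None:
            raise ValueError(
                "encoding_errors and encodingErrors cannot both be specified"
            )
        # fall back to the value supplied under the old name
        encoding_errors = encodingErrors
    return encoding_errors


new_style = resolve_encoding_errors(encoding_errors="replace")
old_style = resolve_encoding_errors(encodingErrors="ignore")
```

Accepting the old name as an extra trailing keyword keeps existing callers working while making it an error to pass both spellings at once.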
|
||||
|
||||