Skip to content

Problem with Cyrillic encoding for complex objects in thin mode #473

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
dotX12 opened this issue Mar 18, 2025 · 3 comments
Closed

Problem with Cyrillic encoding for complex objects in thin mode #473

dotX12 opened this issue Mar 18, 2025 · 3 comments
Labels
bug Something isn't working

Comments

@dotX12
Copy link

dotX12 commented Mar 18, 2025

  1. What versions are you using?

    Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
    platform.platform: macOS-15.3.1-arm64-arm-64bit
    sys.maxsize > 2**32: True
    platform.python_version: 3.12.9

  1. Is it an error or a hang or a crash?
    No

  2. What error(s) or behavior you are seeing?
    Incorrect serialization or deserialization of complex objects that contain Cyrillic characters.
    Objects after writing do not look like Cyrillic in the database.
    At the same time, if the procedure accepts simple types - for example, a string - as input, then writing Cyrillic into this string is done correctly, and I see real Cyrillic in the database.
    The problem occurs in the thin client.

  1. Does your application call init_oracle_client()?
    No
  1. Include a runnable Python script that shows the problem.
CREATE TABLE UNI_API.MY_TABLE (
  ID NUMBER GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
  TEXT_VALUE VARCHAR2(4000)
);

CREATE OR REPLACE TYPE UNI_API.MY_COMPLEX_TYPE AS OBJECT (
  text_value   VARCHAR2(4000),
  number_value NUMBER
);

CREATE OR REPLACE PROCEDURE UNI_API.INSERT_TEXT_EXAMPLE_RETURN_2 (
  p_data      IN UNI_API.MY_COMPLEX_TYPE,
  p_ret_text  OUT VARCHAR2,
  p_ret_id    OUT NUMBER
) AS
BEGIN
  INSERT INTO UNI_API.MY_TABLE(TEXT_VALUE)
    VALUES(p_data.NUMBER_VALUE || ' ' || p_data.TEXT_VALUE)
    RETURNING TEXT_VALUE, ID INTO p_ret_text, p_ret_id;

  COMMIT;
END INSERT_TEXT_EXAMPLE_RETURN_2;
import oracledb
from secrett import SomeSecret

secrets = SomeSecret()

connection = oracledb.connect(
    user=secrets.DB_USER,
    password=secrets.DB_PASSWORD,
    dsn=secrets.DSN,
)

cursor = connection.cursor()

complex_type = connection.gettype("UNI_API.MY_COMPLEX_TYPE")

my_obj = complex_type.newobject()
my_obj.TEXT_VALUE = "SOME CYRILLIC TEXT, ПРИВЕТ МИР"
my_obj.NUMBER_VALUE = 123


ret_text = cursor.var(oracledb.STRING)
ret_id = cursor.var(oracledb.NUMBER)


cursor.callproc("UNI_API.INSERT_TEXT_EXAMPLE_RETURN_2", [my_obj, ret_text, ret_id])


print("Response text:", ret_text.getvalue())
print("Response ID:", ret_id.getvalue())

stmt = "SELECT * FROM UNI_API.MY_TABLE WHERE ID = :id"
id_value = ret_id.getvalue()
res = cursor.execute(stmt, id=id_value)
for row in res:
    print(row)


cursor.close()
connection.close()

Response:

Response text: 123 SOME CYRILLIC TEXT, ПР�ВЕТ М�Р
Response ID: 35
(35, '123 SOME CYRILLIC TEXT, РџР\xa0Р\x98ВЕТ РњР\x98Р\xa0')
Response text: 123 Лпфпофц опфцлп флцвофц вцф двфцдвдцф воцф овфцол влцф
Response ID: 36
(36, '123 Лпфпофц опфцлп флцвофц вцф двфцдвдцф воцф овфцол влцф')

Through datagrip I execute a query to see what is recorded in the database - and there are also characters that do not look like Cyrillic.

SELECT * FROM MY_TABLE;
ID TEXT_VALUE
41 123 тестовая строка
42 123 привет мир
43 123 как дела
@dotX12 dotX12 added the bug Something isn't working label Mar 18, 2025
@dotX12 dotX12 changed the title Problem with Cyrillic encoding for complex objects. Problem with Cyrillic encoding for complex objects Mar 18, 2025
@dotX12 dotX12 changed the title Problem with Cyrillic encoding for complex objects Problem with Cyrillic encoding for complex objects in thin mode Mar 18, 2025
@anthony-tuininga
Copy link
Member

I tried this and didn't have any difficulty. I suspect, however, that this is related to #371. Can you clarify what your database character set is?

@dotX12
Copy link
Author

dotX12 commented Mar 18, 2025

@anthony-tuininga

SELECT value AS db_charset
FROM nls_database_parameters
WHERE parameter = 'NLS_CHARACTERSET';

CL8MSWIN1251

SELECT value AS db_ncharset
FROM nls_database_parameters
WHERE parameter = 'NLS_NCHAR_CHARACTERSET';
AL16UTF16

@anthony-tuininga
Copy link
Member

Thanks for confirming. This is indeed a duplicate of #371. The encoding of an object uses the database encoding regardless of the setting of the client encoding. Changing the database character set to AL32UTF8 will resolve this issue. I am still considering other possible workarounds, however!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants