TinyMCE cannot handle Unicode character `ř`

Summary

TinyMCE cannot handle the Unicode character ř.

Steps to reproduce

Use the character ř on an HTML page and save the page → OK
Switch to the TinyMCE editor → the character ř is shown
Save the page again → MODX error message

Observed behavior

[2023-02-07 09:45:46] (ERROR @ /usr/www/users/micsvm/mics303/core/vendor/xpdo/xpdo/src/xPDO/Om/xPDOObject.php : 1447) Error 22007 executing statement:
UPDATE modx_site_content SET deleted = 0,content = '
[[$mics_header_de]]
\r\n<div id="content">\r\n
Konzerte
\r\n
Dvo<span class="box">řák \r\n
',editedon = 1675759546 WHERE id = 7
Array
(
[0] => 22007
[1] => 1366
[2] => Incorrect string value: '\xC5\x99</sp...' for column micsvm_db1.modx_site_content.content at row 1
)

Expected behavior

The page should save without an error.

Environment

Tested with MODX 2.8.2 and 3.0.3
Database type: mysql
Database version: 10.5.18-MariaDB-0+deb11u1
Database charset: utf8
Browser: Firefox 109.0.1

Which charset do you use exactly? MySQL has utf8 charsets (an alias for utf8mb3, limited to 3 bytes per character, so not full UTF-8) and utf8mb4 charsets (full UTF-8).

Maybe you need a utf8mb4 charset to store this character.

(At least when I tested it with utf8mb4, I couldn’t reproduce your problem.)
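
A quick way to see what the character needs is to look at its bytes. The sketch below (Python, with a hypothetical helper named `describe`) shows that ř is U+0159 and encodes to the two bytes C5 99 in UTF-8, the same bytes named in the error message. MySQL's utf8 (utf8mb3) stores up to three bytes per character, so a two-byte character would normally fit, whereas a column in a non-Unicode charset such as latin1 (which MySQL treats as cp1252) could not hold it. Which charset the content column actually had is not shown in this thread, so that part is an assumption.

```python
def describe(ch: str) -> dict:
    """Report how a single character encodes (illustrative helper, not part of MODX)."""
    utf8 = ch.encode("utf-8")
    try:
        ch.encode("cp1252")  # cp1252 stands in for MySQL's latin1 here
        fits_cp1252 = True
    except UnicodeEncodeError:
        fits_cp1252 = False
    return {
        "codepoint": f"U+{ord(ch):04X}",
        "utf8_hex": " ".join(f"{b:02X}" for b in utf8),
        "utf8_len": len(utf8),
        "fits_3_byte_utf8": len(utf8) <= 3,  # the limit of utf8mb3
        "fits_cp1252": fits_cp1252,
    }


print(describe("\u0159"))  # \u0159 is ř
```

If the byte count were the only problem, utf8mb3 would be enough for ř. Converting to utf8mb4 is still the safer target because it also covers four-byte characters such as emoji.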

My database uses utf8mb4 in many places, but not for character_set_server and collation_server.
Is this something I can fix myself, or do I have to ask my provider's support?
And if I can fix it myself: how?

I do not have access to the MySQL configuration file (/etc/my.cnf).

The important part is the collation (and with it the character set) used for the database table modx_site_content, or more specifically, for the column content of that table.


It’s possible to change the character set with SQL or for example with phpMyAdmin.

I don’t know if it is a good idea to change the charset of a table that already contains data. The existing data in the table could end up wrong after the change.
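
One known way the data can end up wrong: if a column is declared as latin1 but the application has been writing UTF-8 bytes into it, a conversion transcodes those bytes as if they were latin1 text, and the result is double-encoded. The Python sketch below only simulates that effect on a string. It does not touch a database, cp1252 stands in for MySQL's latin1, and the helper names are made up for illustration. Whether this applies to a given table depends on how the data was written, so taking a backup before converting is the safe route.

```python
def misread_as_cp1252(text: str) -> str:
    """Simulate UTF-8 bytes being interpreted as cp1252 text (mojibake)."""
    return text.encode("utf-8").decode("cp1252")


def repair(mojibake: str) -> str:
    """Undo the misreading by reversing the two steps."""
    return mojibake.encode("cp1252").decode("utf-8")


broken = misread_as_cp1252("Dvořák")
restored = repair(broken)
```

Here `broken` is "DvoÅ™Ã¡k" and `restored` is "Dvořák" again. Text that looks like the former after a conversion is a sign that the column's declared charset did not match the bytes stored in it.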


Alternatively, you could also try to change the TinyMCE configuration so that the entity &rcaron; doesn’t get replaced with the raw character ř at all.

I believe the TinyMCE settings entities and entity_encoding are used for this.

The TinyMCE RTE extra has the system settings tinymcerte.external_config and tinymcerte.settings to change the default settings.


Or, if nothing else works, you could write a custom MODX plugin that runs on the event “OnBeforeDocFormSave” and replaces ř with &rcaron; in the content again.
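
The replacement such a plugin would perform is plain string substitution. A real MODX plugin would be written in PHP against the MODX API. The following is only a Python sketch of the logic, and the function name and the mapping table are assumptions made for illustration.

```python
# Hypothetical mapping; extend it with any other characters the column cannot store.
ENTITY_MAP = {
    "ř": "&rcaron;",
    "Ř": "&Rcaron;",
}


def to_entities(content: str) -> str:
    """Replace each mapped character with its named HTML entity."""
    return content.translate(str.maketrans(ENTITY_MAP))
```

For example, `to_entities("Dvořák")` returns `Dvo&rcaron;ák`. The á is left alone because it is not in the mapping.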

Eureka!

I ran this command in phpMyAdmin - and it works!
ALTER TABLE modx_site_content CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

Thanks a lot for your prompt and efficient help!

There’s another method here if you run into trouble later. You’d need to modify the script to use the utf8mb4 settings.

I have installed PHP on my PC (version 7.4.33).
Your script, cdc.config.php, and the saved database are in the same directory.
I’m getting the following error message:

PS E:\Christoph\MicsWeb\cdc> php .\cdc.php

Warning: "continue" targeting switch is equivalent to "break". Did you mean to use "continue 2"? in E:\Christoph\MicsWeb\cdc\cdc.php on line 418

Fatal error: Uncaught Error: Call to undefined function mysqli_connect() in E:\Christoph\MicsWeb\cdc\cdc.php:194
Stack trace:
#0 {main}
thrown in E:\Christoph\MicsWeb\cdc\cdc.php on line 194

That suggests that the MySQLi (MySQL Improved) extension is not enabled in your PHP configuration. If you enable it in php.ini and then restart your web server, that should eliminate the error. The warning above it can be ignored.

I installed XAMPP, created the database, and imported the backup.
I’m sure the credentials were copied correctly to the config file.
Nevertheless, my computer is blocking access to the database.

Fatal error: Uncaught mysqli_sql_exception: Access denied for user 'micsvm_1'@'localhost' (using password: YES) in C:\xampp\htdocs\cdc\cdc.php:194
Stack trace:
#0 C:\xampp\htdocs\cdc\cdc.php(194): mysqli_connect('localhost', 'micsvm_1', Object(SensitiveParameterValue), 'micsvm_db1')
#1 {main}
thrown in C:\xampp\htdocs\cdc\cdc.php on line 194

Double check the config file. That error almost always means a credential problem.

Log in to phpMyAdmin and check the User accounts tab to make sure the user you’re referencing has ALL PRIVILEGES for the host localhost.

Got it - extra space in username. Thanks for your help!

I compared the exported databases before and after applying your script.
The only differences are the timestamp and the tool version (the exports ran on different computers).

Interesting. I wonder if that CONVERT TO CHARACTER SET command existed when I wrote the script. It would have saved me a ton of work and probably run faster. The script was written a very long time ago – 2010 or maybe even earlier.