Showing posts with label tricks.

Wednesday, April 22, 2015

Forcing MySQL schema to use UTF-8

Being lazy enough, I'd like to be able to control my SQL schema (where I store all my data) through a few parameters. The Flask and SQLAlchemy tutorial says I must create the schema first, after which it can be managed by the extensions. Meanwhile, one of the biggest problems until today was that retrieved data wasn't sorted correctly. I mean, it really wasn't sorted correctly. Today I figured out this was caused by the CHARACTER SET settings in my database. I should mention that SQLAlchemy performs an on-the-fly conversion of received data from latin1 to utf8, so my initial idea was to understand how to avoid this behaviour, which (IMHO) was the reason for the improper sorting. Outside of Python the data looked like this:

mysql> SELECT id, name FROM divisions ORDER BY name;

+----+-------------------------------------------------------+
| id | name                                                  |
+----+-------------------------------------------------------+
|  4 | Clinic No.4                                           |
|  6 | Абсолютно нова клініка              |
|  5 | Нова клініка                                |
|  7 | Ще одна клініка                          |
|  1 | global                                                |
+----+-------------------------------------------------------+

Well, that is not the sorting I wanted, at a minimum because "Clinic No.4" and "global" should sit together, and sorting should ignore letter case. For the latter there is a COLLATE clause that should solve the problem. But it didn't (because of the encoding problem). So, to get the proper encoding, there were a few promising options:
  • edit the MySQL config so it starts processing the data correctly (done at the DB level);
  • somehow force the connection to convert data on the fly (done at the connection level);
  • tell the database how the data should be stored, so I don't have to think about it at all.
After some reading I finally figured out the correct way to force MySQL to process Unicode data properly.

First, I should stress that I use the pymysql dialect for the connection, since it targets 100% compatibility. To force the client side of the connection to use Unicode, add "?charset=utf8" to the connection string, so the target URL looks like: mysql+pymysql://username:password@localhost/db_name?charset=utf8.
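One caveat worth noting: if the password contains special characters, it must be URL-escaped before being embedded in the connection string. A minimal sketch (the credentials below are made up, and I use Python 3's urllib.parse here):

```python
from urllib.parse import quote_plus

# Hypothetical credentials; quote_plus escapes characters like '@' and ':'
# that would otherwise break parsing of the URL.
user, password, host, db = "username", "p@ss:word", "localhost", "db_name"
dsn = "mysql+pymysql://{}:{}@{}/{}?charset=utf8".format(
    user, quote_plus(password), host, db)
# dsn == "mysql+pymysql://username:p%40ss%3Aword@localhost/db_name?charset=utf8"
```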

Second, we should enable Unicode conversion globally. For this, pass convert_unicode=True and encoding="utf8" as parameters to sqlalchemy.create_engine().

Third, we should create the MySQL schema with the proper character set and collation. As my DB is named clinic, the SQL statement is:
CREATE SCHEMA clinic DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_unicode_ci;
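To double-check the result, MySQL can report a schema's defaults from information_schema (a sketch; the schema name clinic comes from the statement above):

```sql
-- Verify the schema's default character set and collation
SELECT default_character_set_name, default_collation_name
FROM information_schema.SCHEMATA
WHERE schema_name = 'clinic';
```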

In the end, I decided to create a default schema with the simple statement CREATE SCHEMA clinic; and to tune it during DB initialization from the app, by running the following code:

from sqlalchemy import create_engine
import re

engine = create_engine(
    "mysql+pymysql://username:password@localhost/db_name?charset=utf8",
    encoding="utf8", convert_unicode=True)
SCHEMA = "db_name"

# Extract the schema's default charset from SHOW CREATE SCHEMA output
charset_pattern = re.compile(r"DEFAULT CHARACTER SET (?P<charset>[A-Za-z0-9]*?) ", re.I)

connection = engine.connect()
result = connection.execute("SHOW CREATE SCHEMA " + SCHEMA)
for row in result:
    charset = charset_pattern.search(str(row))
    if charset and "utf" not in charset.group("charset"):
        connection.execute("ALTER SCHEMA " + SCHEMA +
                           " DEFAULT CHARACTER SET 'utf8'"
                           " DEFAULT COLLATE 'utf8_unicode_ci'")

Here we check whether our schema's encoding is UTF-8 and set it to the correct value if it is not.
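The regular expression above can be checked in isolation, without a live database. The row below is fabricated to imitate what SHOW CREATE SCHEMA returns for a latin1 schema:

```python
import re

# Same pattern as in the code above, with the named group "charset"
charset_pattern = re.compile(r"DEFAULT CHARACTER SET (?P<charset>[A-Za-z0-9]*?) ", re.I)

# A made-up result row, shaped like a SHOW CREATE SCHEMA result tuple
sample_row = ("clinic",
              "CREATE DATABASE `clinic` /*!40100 DEFAULT CHARACTER SET latin1 */")
match = charset_pattern.search(str(sample_row))
# match.group("charset") == "latin1", so the ALTER SCHEMA branch would run
```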
And here are the results:

mysql> SELECT * FROM divisions ORDER BY name ASC;
+----+-----------------------------+---------+
| id | name                        | builtin |
+----+-----------------------------+---------+
|  5 | Booble                      |       0 |
|  6 | Bubble                      |       0 |
|  1 | global                      |       1 |
|  4 | Друга клініка               |       0 |
|  3 | Перша клініка               |       0 |
|  2 | Типова клініка              |       1 |
+----+-----------------------------+---------+
6 rows in set (0.00 sec)

mysql> SELECT * FROM divisions ORDER BY name DESC;
+----+-----------------------------+---------+
| id | name                        | builtin |
+----+-----------------------------+---------+
|  2 | Типова клініка              |       1 |
|  3 | Перша клініка               |       0 |
|  4 | Друга клініка               |       0 |
|  1 | global                      |       1 |
|  6 | Bubble                      |       0 |
|  5 | Booble                      |       0 |
+----+-----------------------------+---------+
6 rows in set (0.00 sec)

Wednesday, October 29, 2014

Limiting chrome's memory usage per tab

Over the last two weeks I observed that one of my favorite games, eRepublik, uses an abnormally huge amount of memory if you leave Chrome open for a few hours. Eventually the page becomes unresponsive, and you may observe per-page usage of around 1-2 GB of memory. In the example below, my Chrome decided to occupy a somewhat smaller amount of RAM after just 15 minutes of browsing the site.
Top of my Chrome's task manager with the "memory leak"
Naturally, such a situation made neither me nor other people around the world happy. Some of them suggested they would like a way to limit Chrome's memory usage. The reason for this strange behaviour is a design decision in Chrome that keeps most of a tab's elements in memory until you navigate to another website. I perfectly understand Googlers' desire to make everything smooth and fast, but the present situation looks quite dangerous for people who use web pages as their workplace, or who need to refresh pages from time to time.
Most of the existing solutions you can find on the web limit Chrome's overall memory usage by suspending tabs, by "parking" them, and so on. Yes, in the general case this helps, since Chrome uses a separate process for each tab, so limiting the number of tabs improves the overall picture. But when a single web page consumes more and more memory, this is not a solution; nor is it one when you need to keep working with the site and your goal is to grab new information rather than revisit old pages. The perfect solution would be for Chrome itself to limit its internal memory per page, say to the last 2 or 3 complete snapshots; until then, the only options are to close the memory-hungry tab and open a new one, or to kill it from Chrome's task manager.
The only solution that addresses this problem in the way described above is a Python script from superuser.com. It is a bit old, so I modified it to work with Python 2.7.8 (the default on my system) and psutil 2.1.1 (not included in the Python distribution, so you have to install it yourself).
Its main feature is that it kills ANY tab that exceeds 800 MB of real memory. You can set your own value if this one is too small for you.
The main disadvantage is that it really does kill ANY such tab, and sometimes it also kills other tabs assigned to the same renderer, so you will lose all your input forms. Therefore I recommend using this script only if you really need this feature. Otherwise, write to Google HQ and ask them to add it to the browser.
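The core idea of the script can be sketched as follows. This is not the original code, just an illustration assuming psutil's process API (process_iter, name, memory_info, kill); the threshold check is factored out so it can be tuned:

```python
MEMORY_LIMIT_MB = 800  # same threshold as the script; tune to taste

def exceeds_limit(rss_bytes, limit_mb=MEMORY_LIMIT_MB):
    """True if a process's resident memory is over the limit."""
    return rss_bytes > limit_mb * 1024 * 1024

def kill_hungry_chrome_tabs():
    import psutil  # third-party; install it yourself
    for proc in psutil.process_iter():
        try:
            if ("chrome" in proc.name().lower()
                    and exceeds_limit(proc.memory_info().rss)):
                proc.kill()  # the tab turns into Chrome's crash page
        except psutil.NoSuchProcess:
            pass  # process ended while we were inspecting it
```

Run kill_hungry_chrome_tabs() periodically (e.g. from cron) to keep runaway tabs in check.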
Source code for the solution is on github.