Показ дописів із міткою Pet Project. Показати всі дописи
Показ дописів із міткою Pet Project. Показати всі дописи

понеділок, 1 лютого 2016 р.

Accessing MS Access from python with Cyrillic table names

My today's issue is quite interesting and is related to the data migration from MS Access to some regular (and more widely-used) database, i.e. MySQL. So we have as input a database, created in the MS Access 2000 (supposed) with the Cyrillic tables, that contains near 5k of sensible records. And expected result should be any DB that is supported by SQL Alchemy (right now it is MySQL, but I believe it will be easy to migrate to any other). And here is the most complex part: "How"?

Firstly I needed some way to provide ODBC  support for my Linux (Windows users/developers are more lucky at this point) as well as driver for access to MDB files. These packages provide such support for Debian:
  • unixodbc
  • odbcinst
  • libodbc1
  • libiodbc2
  • odbc-mdbtools
I got this list of packages from amirkdv, who implemented ODBC module for PHP. [Note: later I found a plenty of other docs about this subject, so I'm not good at Googl'ing at all]. Among other drivers there was also a web-page for unixodbc.org and Easysoft's page, who asked near 500 pounds for their library. Maybe it worth that money, but I need such access for one-time operation, so that's not my case right now.

So final command on my Debian system looked like:
$sudo apt-get install unixodbc odbcinst odbc-mdbtools

Files /etc/odbc.ini and /etc/odbcinst.ini were created after the installation and the latter one contained following lines:
[MDBTools]
Description=MDBTools Driver
Driver=libmdbodbc.so
Setup=libmdbodbc.so
FileUsage=1
UsageCount=1

It was enough for my case. The line "[MDBTools]" in the mentioned above file states for the name of the used driver, that should be indicated for granting access to the database.

At this point system's preparation is over, and it is a time, to look around about libraries that should provide access to the DB from the python. I discovered few of them:
  • pyodbc - outdated and I discovered this a bit later;
  • pypyodbc;
  • sqlalchemy-access - outdated as well.
Unfortunately, pyodbc failed to connect to the database, so I've been forced to use pypyodbc as the only possible way at that moment. Being inspired by usage examples for pypyodbc I finished with something like this: 

import pypyodbc
from os.path import abspath

drv ='MDBTools'
db_file = abspath('./pl.mdb') # Here "pl.mdb" is my input file
con = pypyodbc.connect("DRIVER={};DBQ={}".format(drv, db_file))
cur = con.cursor()

tables = list(cur.tables())

# Get list of tables from the file
for tbl in tables:
  print tbl


And got something like that:
('', '', '\xd0\x92\xd1\x83\xd0\xbb\xd0\xb8\xd1\x86\xd1\x96\xd0\x9b\xd1\x8c\xd0\xb2\xd0\xbe\xd0\xb2\xd0\xb0', 'TABLE', '')
('', '', '\xd0\x9d\xd0\xb0\xd1\x81\xd0\x9f\xd1\x83\xd0\xbd\xd0\xba\xd1\x82\xd0\xb8\xd0\x9b\xd0\x9e', 'TABLE', '')
('', '', '\xd0\xa2\xd0\xb8\xd0\xbf\xd1\x8b', 'TABLE', '')
('', '', '\xd0\xa2\xd0\xbe\xd0\xb2\xd0\xb0\xd1\x80\xd1\x8b', 'TABLE', '')
('', '', '\xd0\xa4\xd1\x96\xd1\x80\xd0\xbc\xd0\xb8\xd0\x9f\xd0\xbe\xd1\x81\xd1\x82\xd0\xb0\xd1\x87\xd0\xb0\xd0\xbb\xd1\x8c\xd0\xbd\xd0\xb8\xd0\xba\xd0\xb8', 'TABLE', '')

Here I should shout "Hooray! I did it!", but unfortunately, there is no reason for joy. As you may notice, tables are named by Cyrillic names, so now I should deal with them in some way.
The only solution that came into my mind was to call SQL statement as Unicode string, with the table's name encoded as regular string, i.e.:

sql_statement = u'SELECT * FROM "\xd0\x92\xd1\x83\xd0\xbb\xd0\xb8\xd1\x86\xd1\x96\xd0\x9b\xd1\x8c\xd0\xb2\xd0\xbe\xd0\xb2\xd0\xb0"' 
res = cur.execute(sql_statement).fetchall()
for item in res:
  print item

So, I got a list of tuples like this one (stripped version), from which records can be converted into regular Unocode.
('\xd0\x94\xd0\xb6\xd0\xb5\xd1\x80\xd0\xb5\xd0\xbb\xd1\x8c\xd0\xbd\xd0\xb0',)
('\xd0\x94\xd0\xb7\xd0\xb8\xd0\xbd\xd0\xb4\xd1\x80\xd0\xb8 \xd0\x84.',)
('\xd0\x94\xd0\xb8\xd0\xb2\xd1\x96\xd0\xb7\xd1\x96\xd0\xb9\xd0\xbd\xd0\xb0',)
('\xd0\x94\xd0\xb8\xd0\xba\xd1\x82\xd0\xbe\xd0\xb2\xd0\xb0',)
('\xd0\x94\xd0\xb8\xd1\x82\xd1\x8f\xd1\x87\xd0\xb0',)


The converted version of the records is:

Джерельна
Дзиндри Є.
Дивізійна
Диктова
Дитяча

And now it is a time for "Hooray!!!!"


неділя, 14 червня 2015 р.

Processing flashed messages for different POST requests.

While been working on my project, I want to provide user with feedback messages. This is important i.e. if user tries to do something that is prohibbited, or allowed, and must be confirmed with the appropriate message. Trying to keep everything as simple as possible, I'm using two decorators, one of which checks if user logged in and another checks is user can access given page. Thus my typical method looks like:
@my_blueprint.route('URL_Path')
@requireslogin
@can_access(roles=['administration', 'administrator'])
def method():
   pass

In future I'll try to re-map everything from DB, so it will be more dynamic and more complex, but now it is OK with me. The problem I got today is about accessing flashed messages from POST request, which have been set in can_access().
def can_access(roles=[]):
    def decorator(f):
        @wraps(f)
        def check_access(*args, **kwargs):
            user_rec = db_session.query(User).\
                filter(User.username == session["username"].lower()).\
                first()
            my_role = db_session.query(Role).filter(Role.id == user_rec.role).\
                first()
            logger.debug(my_role)
            if my_role.name.lower() not in roles and my_role.name != 'global':
                logger.debug("Access not allowed for this role.")
                flash(errors.get("access_denied"), "error")
                return  redirect(request.headers.get("Referer", None) or url_for('central'))
            return f(*args, **kwargs)
        return check_access
    return decorator
So the general idea was to turn user back to the previous page OR central (if no previous page) if he doesn't have access to the requested page. While I accessed page using GET requests it seems like everything has been worked OK, but with the POST requests no error message has been displayed. Firstly I thought the problem is that flash() sends signal that lives till the end of the current request, so I decided to create some global variable in the session, that will contain such errors. Indeed it wasn't the case, so I stucked with the problem if I need to re-write each of my methods in order to process the newly-found error. That was not acceptable, so there should be another way. Checking log I figured out there is double call to the same page:
13 06 2015 09:30:25 DEBUG: <Role(id='3', name='administrator')>
13 06 2015 09:30:25 DEBUG: Access not allowed for this role.
13 06 2015 09:30:26 INFO: 127.0.0.1 - - [13/Jun/2015 21:30:26] "POST /customers/delete/3 HTTP/1.1" 302 -
13 06 2015 09:30:26 DEBUG: Calling list_customers
13 06 2015 09:30:26 DEBUG: GET params: 'name': None , 'page': None, 'rps': None
13 06 2015 09:30:26 DEBUG: Retrieving customers
13 06 2015 09:30:26 INFO: 127.0.0.1 - - [13/Jun/2015 21:30:26] "GET /customers/ HTTP/1.1" 200 -
13 06 2015 09:30:26 DEBUG: Calling list_customers
13 06 2015 09:30:26 DEBUG: GET params: 'name': None , 'page': None, 'rps': None
13 06 2015 09:30:26 DEBUG: Retrieving customers
13 06 2015 09:30:26 INFO: 127.0.0.1 - - [13/Jun/2015 21:30:26] "GET /customers/ HTTP/1.1" 200 -

That was not I wanted to have, so another guess was I just pop-up messages on the first request, and on the second I get nothing. And I don't see changes, because there is no delay between these requests. Such idea is correct, because each template has get_flashed_messages() method to display errors. So next step was to figure out where such update happens. I believed this has some connection to my Javascript, so I decided to re-check application's logic. In case of GET request situation is quite simple:
  • user asks for some page with GET request;
  • if page is accessible, server renders it and send it back to the user;
  • if page is not accessible, then server redirects user to its previous or initial page;
  • on the previous or initial page server once again tests if user can login and if user can get access to the page;
  • because this page is accessible by user, it is generated and returned to the user;
  • browser displays data.
As for the POST requests, I use them for 2 different cases: to process forms and to process individual requests like activate/deactivate items, delete items etc. Form processing is similar to the GET requests, so the above case is valid here. But individual requests need another processing logic. These requests are generating by JavaScript code, that expects JSON string as response, and looks like:
result = {
        "performed": False,
        "msg": ""
    }
where result["performed"] contains True if request has been completed by server correctly and request["msg"] contains any support message. There can be other fields as well, but these are mandatory, so tracking if request contains form or not, can be performed by testing "Content-Type" of HTTP header. So, decorator "can_access" needs following changes:
def can_access(roles=[]):
    def decorator(f):
        @wraps(f)
        def check_access(*args, **kwargs):
            user_rec = db_session.query(User).\
                filter(User.username == session["username"].lower()).\
                first()
            my_role = db_session.query(Role).filter(Role.id == user_rec.role).\
                first()
            logger.debug(my_role)
            if my_role.name.lower() not in roles and my_role.name != 'global':
                logger.debug("Access not allowed for this role.")
                flash(errors.get("access_denied"), "error")
                if request.method == "POST" and \
                        "text/plain" in request.headers["Content-Type"]:
                    result = {
                        "performed": False,
                        "msg": errors.get("access_denied")
                    }
                    return jsonify(result)
                return  redirect(request.headers.get("Referer", None) or url_for('central'))
            return f(*args, **kwargs)
        return check_access
    return decorator
From this point GET and POST requests process flashed messages correctly. Any programming, architecture, pythonizing and other tips are welcomed. :)

середа, 22 квітня 2015 р.

Forcing MySQL schema to use UTF-8

Being enough lazy, I'd like to have a possibility to control my SQL schema (where I store all my data), under some parameters. The tutorial for Flask and SQLAlchemy says I must create schema and afterwards it can be managed by extensions. Meantime, one of the biggest problem until today was that received data didn't sorted correctly. I mean they really didn't sort correctly. Today I figured out, this has been caused by CHARACTER SET settings in my Database. I should mention, that SQL Alchemy performs on the fly transformation of received data from latin1 to utf8. So my initial idea was to understand how to avoid such behaviour, and (IMHO) this was the reason of improper sorting. So outside of the Python the data has been looked like:

mysql> SELECT id, name FROM divisions ORDER BY name;

+----+-------------------------------------------------------+
| id | name                                                  |
+----+-------------------------------------------------------+
|  4 | Clinic No.4                                           |
|  6 | Абсолютно нова клініка              |
|  5 | Нова клініка                                |
|  7 | Ще одна клініка                          |
|  1 | global                                                |
+----+-------------------------------------------------------+

Well it is not the sorting I wanted, at least because "Clinic No.4" and "global" should be together, and sorting should ignore letter cases. For the last thing, there is a collate call, that should solve my problem. But it didn't (because of the encoding problem). So in order to get proper encoding there were some prosper ways:
  • edit MySQL config so it started to process data correctly (can be done on DB level);
  • force somehow data connection to process data on-the-fly (can be done on connection level);
  • trying to tell database how data should be stored, so I didn't think about them.
After some readings finally I figured out the correct way for forcing MySQL to process Unicode data correctly.

Firstly I should stress I use pymysql dialect for connection cause it is targets 100% compatibility. So, correct forcing connection to use Unicode on client side is to add "?charset=utf8" to the connection string. So our target will look like: mysql+pymysql://username:password@localhost/db_name?charset=utf8.

Secondly, we should set unicode conversion globally. For this add convertunicode=True and encoding=utf8 as params to sqlalchemy.createengine().

Thirdly we should create MySQL schema with the proper character set and collation. As my DB is entitled clinic, so SQL statement is:
CREATE SCHEMA clinic DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_global_ci;

Indeed, I decided to create default schema, by simple statement CREATE SCHEMA clinic; and tune it on initialization DB from the app, by running the following code:

from sqlalchemy import create_engine

engine = create_engine("mysql+pymysql://username:password@localhost/db_name?charset=utf8", encoding=utf8, convert_unicode=True)
SCHEMA = "db_name"

connection = engine.connect()
result = connection.execute("SHOW CREATE SCHEMA " + SCHEMA)
for key in result:
    import re
    charset_pattern = re.compile("DEFAULT CHARACTER SET (?P[A-Za-z0-9]*?) ", re.I)
    charset = re.search(charset_pattern, str(key))
    if charset and 'utf' not in charset.group("charset"):
        connection.execute("ALTER SCHEMA " + SCHEMA +
                           " DEFAULT CHARACTER SET 'utf8'"
                           " DEFAULT COLLATE 'utf8_unicode_ci'")

Here we check if our Schema encoding is UTF8, and set it to correct value if it is not.
And here are the results:

mysql> SELECT * FROM divisions ORDER BY name ASC;
+----+-----------------------------+---------+
| id | name                        | builtin |
+----+-----------------------------+---------+
|  5 | Booble                      |       0 |
|  6 | Bubble                      |       0 |
|  1 | global                      |       1 |
|  4 | Друга клініка               |       0 |
|  3 | Перша клініка               |       0 |
|  2 | Типова клініка              |       1 |
+----+-----------------------------+---------+
6 rows in set (0.00 sec)

mysql> SELECT * FROM divisions ORDER BY name DESC;
+----+-----------------------------+---------+
| id | name                        | builtin |
+----+-----------------------------+---------+
|  2 | Типова клініка              |       1 |
|  3 | Перша клініка               |       0 |
|  4 | Друга клініка               |       0 |
|  1 | global                      |       1 |
|  6 | Bubble                      |       0 |
|  5 | Booble                      |       0 |
+----+-----------------------------+---------+
6 rows in set (0.00 sec)

вівторок, 21 квітня 2015 р.

Intro to my pet project

They say, each old or new programmer must have its own pet project, where he can study some new tricks. After a year, I finally, decided to start my own project: tiny CMS for managing small-size clinic. Before that I tried myself with Java, C and C++. About a year and a half ago I decided to switch myself to Python so I finished some Python courses (indeed there were two of them: Google's Python Class and Introduction to Interactive Programming on Coursera), and now I'n trying to improve my skills in python using Flask micro-framework, MySQL and some other simple stuff for building web-pages in order to get more comfortable with these technologies.

I have to say, that indeed, I am "working" on this project near 4 monthes. And under "working" I mean it is not so regular as I'd like it to be, but I'm trying to work on it each day for 1-2 hours in order to make some progress.

So, what are the objectives for it?

  • Study more deeply Python
  • Study SQL, and related Python-library: SQL Alchemy.
  • Study new (at least for me) technologies like: HTML, CSS (incl. CSS3), JavaScript (Ajax), and jQuery
  • Study Web-oriented Python framework Flask and its extensions.
  • Study some version control system (currently it is git, but I also liked mercurial).

I have no scope to make any deep research in Web or Computer design, and I believe that good design can be made by good designer, but as the present stage it is not so important. Right now I'm re-writing my present code (that I store in private archive at bitbucket) in order to allow people to use it from different places.

In this blog under the category "Pet Project", I'm going to leave some notes that may be usefull for other people, and also describe my own experience until this project will be released.