miniBB ® Support Forums

Username cut off on UTF-8 forum. Why?

 
Author Dimedrol
Partaker
#1 | Posted: 19 Aug 2009 16:36 
Hello!
I've got a fully UTF-8 forum.
I even set the MySQL table fields to the "utf-8" character set.

So, I've got one problem:
when a user posts a message and enters a Cyrillic (Russian) username, it gets cut off at the 7th UTF-8 symbol.
It becomes too short. I see only 7 Russian symbols.
At first I thought it happened somewhere in bb_func_txt.php, in function convEnt($str),
but it seems not.

So, what happens to my username? (It was an unregistered user who simply entered a username.)

Author Paul
Lead Developer 
#2 | Posted: 20 Aug 2009 02:31 
I think the code for that is defined in bb_func_login.php.
Look for the line that says

$user_usr=wrapText(15,$user_usr);

So guest names are cropped at 15 symbols (so as not to break the layout).
For Cyrillic text, UTF-8 uses 2 bytes per character, so cropping at 15 bytes leaves 7.5 actual characters. The last character may simply not be displayed because it is incomplete, which is why you get the impression that the string is cut off at the 7th character.
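
The byte arithmetic can be checked with a small sketch. It is written in Python rather than the forum's PHP, and `crop_bytes` is a hypothetical stand-in for a byte-counting crop, not miniBB's actual `wrapText`. It assumes the crop keeps the first 15 bytes and that the incomplete trailing character is simply dropped on display.

```python
def crop_bytes(text: str, limit: int) -> str:
    """Keep the first `limit` bytes of the UTF-8 form of `text`.

    An incomplete character left at the end is dropped, which mimics
    a browser not displaying half of a 2-byte Cyrillic letter.
    """
    raw = text.encode("utf-8")[:limit]
    return raw.decode("utf-8", errors="ignore")


def crop_chars(text: str, limit: int) -> str:
    """Keep the first `limit` characters, regardless of their byte length."""
    return text[:limit]


name = "Владимирович"  # hypothetical guest name: 12 Cyrillic letters, 24 bytes

print(len(name), len(name.encode("utf-8")))  # 12 24
print(crop_bytes(name, 15))                  # Владими
print(crop_chars(name, 15))                  # Владимирович
```

Counting characters instead of bytes keeps the whole name. In PHP, the mbstring function `mb_substr` counts characters in this way; whether it can be dropped into `wrapText` unchanged is not something this thread confirms.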

Our recommendation, if you run forums in one language, is always to use the original "native" encoding for that language. First of all, it takes half the space; second, UTF-8 actually comes with a lot of problems of its own. Most software developers just choose it by default to cover as much of the market as possible, but technically it's the wrong solution.

Author Dimedrol
Partaker
#3 | Posted: 20 Aug 2009 03:41 
Paul, thanks for the quick answer (as usual!)

Yes! This was exactly what I needed.

About UTF-8... I cannot agree with you 100% ;-)
I'm your Latvian neighbour, and we always need to keep in mind that there are 2 language groups of people here, Latvian and Russian. I think you know this situation very well.

About "a lot of problems": I'd say these are not problems of UTF-8 itself but problems of PHP, which is not UTF-8 compatible.
I've already replaced your standard PHP functions with UTF-8-compatible ones in some important places.

Doubled (not always!) disk space for the DB is, I think, not a problem nowadays.
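
The "not always" can be illustrated with a rough sketch. It is in Python, not the forum's PHP; `size_ratio` and the sample strings are made up for illustration. It compares the byte size of the same text in UTF-8 and in windows-1251.

```python
def size_ratio(text: str) -> float:
    """UTF-8 byte count divided by windows-1251 byte count for the same text."""
    return len(text.encode("utf-8")) / len(text.encode("cp1251"))


pure = "Привет"                 # Cyrillic letters only
mixed = "Привет, miniBB 2009!"  # Cyrillic plus ASCII letters, digits, punctuation

print(size_ratio(pure))   # 2.0
print(size_ratio(mixed))  # 1.3
```

Only the Cyrillic letters double in size. Spaces, digits, punctuation and Latin letters stay at 1 byte each in both encodings, so real forum text lands somewhere between 1x and 2x.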

Actually, I had far fewer problems with MySQL databases in UTF-8 format when migrating them from one hosting provider to another.
With a "cyrillic"-only MySQL DB some years ago, migrating it to another host (with a different default charset installed on the server) was a nightmare.

BTW, thanks for the help!

Author Paul
Lead Developer 
#4 | Posted: 20 Aug 2009 07:21 
Well, speaking of Latvian pages, it's of course a different situation; I know it perfectly well. Yes, UTF-8 is a lifesaver if you run forums in multiple languages at once. I meant single-language forums (as I mentioned).

On the subject of half the space, let's also not forget the traffic... of course disk space doesn't matter much, but the traffic doubles too. Sometimes that's critical... especially if you have, let's say, 300 visitors per second ;-)

Anyway, it doesn't really matter whether you choose UTF-8 or not. I just want to say that in about 80% of cases UTF-8 is not the most suitable solution. BTW, PHP6 is planned to work with UTF-8 texts. I don't know when it will actually happen :-) but it would solve many problems.

Regarding migration problems: if you know how to do it properly, it turns out to be very easy. The main problem here comes from the mysqldump utility, which (for some time now) has exported all databases in UTF-8 by default. As a result, the original encoding may be lost.

When setting up the site, you just need to know precisely how to set up the database in the proper encoding. So if you run the site in windows-1251, the database and all related client/server connections should be set to this mode on the server (cp1251). The same applies to any other specific non-Latin encoding. When you transfer the database, you must pass the proper encoding flag to mysqldump or whatever export tool you use. Likewise, if you run the site in UTF-8, then the database itself and all client/server connections must be set up in this mode. But as far as I know, setting up the whole server in UTF-8 is a painful task. Not every admin is even expected to do so.
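
What goes wrong when one link in that chain uses a different charset can be shown with a minimal sketch. It is in Python and does not touch MySQL at all; `read_as` is a hypothetical helper that stands in for a client or export tool interpreting stored bytes under some charset.

```python
def read_as(stored: bytes, charset: str) -> str:
    """Interpret stored bytes under the given charset, as a client would."""
    return stored.decode(charset)


text = "Привет"
stored = text.encode("cp1251")  # bytes as a windows-1251 site would store them

print(read_as(stored, "cp1251"))   # Привет
print(read_as(stored, "latin-1"))  # Ïðèâåò

# Treating the same bytes as UTF-8 fails outright, because they are not valid UTF-8.
try:
    read_as(stored, "utf-8")
except UnicodeDecodeError:
    print("not valid UTF-8")
```

The stored bytes are fine in every case; only the charset assumed by the reader differs. For exports, mysqldump has a `--default-character-set` option for stating the charset explicitly; which value is right depends on how the database and its connections were set up.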

The real nightmare here is MySQL itself. Not a thing for newbies, I must say...
