Old 2016-08-29, 21:36   #1126
Madpoo
Serpentine Vermin Jar
 
 
Jul 2014

327410 Posts

Quote:
Originally Posted by retina View Post
Actually I think it is a sorting thing. The only purpose I know of for informing the DB that some data are in specific text formats is to sort things "logically" so that letters with accents will sort close to the same letter without an accent, and other similar lexicographic reasons. For all other purposes, like display and comparison for example, simple byte data is just fine.
Collations are also a factor here (they control how things are sorted, case/accent sensitivity, etc.)

My day job may involve working with data in Cyrillic, Polish, CJK, etc., and those DBs in particular have their collation set to an appropriate one. I have a lot of "fun" when transferring data across DBs and matching it up, since I have to add some type of COLLATE (kind of like a CAST) to avoid the mismatched-collation errors that inevitably arise.
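To make the COLLATE idea concrete, here is a rough sketch using Python's built-in sqlite3 module as a stand-in. This is an assumption for illustration only: SQLite quietly picks a collation where SQL Server would raise a conflict error, and the table names `a` and `b` and the helper `match_count` are made up. The point is just that an explicit COLLATE on the comparison decides how the two columns get matched.

```python
import sqlite3

# In-memory database with two hypothetical tables whose columns
# were declared with different collations.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (name TEXT COLLATE BINARY)")
conn.execute("CREATE TABLE b (name TEXT COLLATE NOCASE)")
conn.executemany("INSERT INTO a VALUES (?)", [("Alice",), ("bob",)])
conn.executemany("INSERT INTO b VALUES (?)", [("alice",), ("BOB",)])


def match_count(collation):
    """Count joined rows when the comparison is forced to one collation."""
    # Only pass the trusted, built-in names BINARY or NOCASE here;
    # the collation name is interpolated directly into the SQL text.
    sql = (
        "SELECT COUNT(*) FROM a JOIN b "
        f"ON a.name = b.name COLLATE {collation}"
    )
    return conn.execute(sql).fetchone()[0]


print(match_count("NOCASE"))  # case-insensitive comparison
print(match_count("BINARY"))  # byte-for-byte comparison
```

With the explicit COLLATE NOCASE both rows match; with COLLATE BINARY none do, since the stored values differ only in case.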

I once knew a fella who insisted all DBs use binary collation. At the time we all thought he was pretty out there for that idea, and we actually had to do a project to switch to SQL_Latin1... but when dealing with multiple languages and character sets, there's actually a better argument for binary as the default, as long as you don't mind specifying which collation you want whenever you sort your rows, because binary ordering on its own just won't cut it there.
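As a small illustration of why plain binary ordering falls short for sorting, here is a Python sketch comparing code-point order with a crude accent-folding key. The `accent_folded` helper is hypothetical and only approximates what a real linguistic collation does (it is not how any particular DB implements it); the word list is made up.

```python
import unicodedata

words = ["zebra", "école", "eclair"]


def accent_folded(text):
    """Crude sort key: decompose, drop combining accents, ignore case."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


# "Binary" order: plain code-point comparison, so "é" (U+00E9) lands after "z".
binary_order = sorted(words)

# "Logical" order: accented letters sort next to their unaccented cousins.
logical_order = sorted(words, key=accent_folded)

print(binary_order)
print(logical_order)
```

In code-point order "école" ends up after "zebra", while the folded key puts it right after "eclair", which is the kind of result you'd have to request explicitly if the DB default were binary.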