I wanted to ask when `COLLATE SQL_Latin1_General_CP1_CI_AS` is used and what its main purpose is, since I do not understand it very well and I have seen it used in `WHERE` clauses.
Collations in SQL Server
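A collation (the `COLLATE` clause in T-SQL; "intercalación" in Spanish) is, according to the documentation, the set of rules that governs how character data is sorted and compared, including case and accent sensitivity.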
Collation applies at several levels: there is a default collation for the SQL Server instance, a default collation for each database and, finally, each column of a character data type, such as `char` or `varchar`, has its own collation. Columns created without specifying a collation use the database's default collation. Databases created without specifying a collation use the default collation of the instance.
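As a small sketch of how these levels can be inspected (it uses only built-in functions and catalog views, and should be run in the database you are interested in):

```sql
-- Default collation of the SQL Server instance.
SELECT SERVERPROPERTY('Collation') AS InstanceCollation;

-- Default collation of the current database.
SELECT DATABASEPROPERTYEX(DB_NAME(), 'Collation') AS DatabaseCollation;

-- Collation of every character column in the current database.
-- collation_name is NULL for non-character columns, so those are filtered out.
SELECT OBJECT_NAME(c.object_id) AS TableName,
       c.name                   AS ColumnName,
       c.collation_name         AS ColumnCollation
FROM sys.columns AS c
WHERE c.collation_name IS NOT NULL
ORDER BY TableName, ColumnName;
```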
In practical terms, the collation determines when two strings match in an equality or inequality comparison, as well as how a result set is ordered.
First, we create a table with three columns, each with a different collation, and insert some data into it.
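A minimal sketch of such a table follows. The table name `dbo.ApellidosDemo`, the column names, the sample rows and the choice of the `Modern_Spanish_*` collations are all assumptions made for illustration, not the original listing:

```sql
-- Hypothetical demo table: the same surname stored under three collations.
CREATE TABLE dbo.ApellidosDemo (
    ApellidoCIAS nvarchar(50) COLLATE Modern_Spanish_CI_AS, -- case-insensitive, accent-sensitive
    ApellidoCIAI nvarchar(50) COLLATE Modern_Spanish_CI_AI, -- case-insensitive, accent-insensitive
    ApellidoCSAS nvarchar(50) COLLATE Modern_Spanish_CS_AS  -- case-sensitive, accent-sensitive
);

-- Each row repeats one spelling in all three columns, so the only
-- difference between the columns is the collation.
INSERT INTO dbo.ApellidosDemo (ApellidoCIAS, ApellidoCIAI, ApellidoCSAS)
VALUES (N'Hernández', N'Hernández', N'Hernández'),
       (N'hernández', N'hernández', N'hernández'),
       (N'Hernandez', N'Hernandez', N'Hernandez'),
       (N'hernandez', N'hernandez', N'hernandez');
```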
Comparisons with constant strings
Now, let's run some queries and see their results:
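One way to sketch such queries, assuming a hypothetical table `dbo.ApellidosDemo` that stores the four spellings of the surname in three columns with `Modern_Spanish_CI_AS`, `Modern_Spanish_CI_AI` and `Modern_Spanish_CS_AS` collations (names, data and collations are assumptions). The guard at the top creates the table only if it is missing, so the snippet runs on its own:

```sql
-- Create the hypothetical demo table only if it does not exist yet.
IF OBJECT_ID(N'dbo.ApellidosDemo', N'U') IS NULL
BEGIN
    CREATE TABLE dbo.ApellidosDemo (
        ApellidoCIAS nvarchar(50) COLLATE Modern_Spanish_CI_AS,
        ApellidoCIAI nvarchar(50) COLLATE Modern_Spanish_CI_AI,
        ApellidoCSAS nvarchar(50) COLLATE Modern_Spanish_CS_AS
    );

    INSERT INTO dbo.ApellidosDemo (ApellidoCIAS, ApellidoCIAI, ApellidoCSAS)
    VALUES (N'Hernández', N'Hernández', N'Hernández'),
           (N'hernández', N'hernández', N'hernández'),
           (N'Hernandez', N'Hernandez', N'Hernandez'),
           (N'hernandez', N'hernandez', N'hernandez');
END;

-- The same condition against each column; only the column collation differs.
SELECT COUNT(*) AS MatchesCIAS FROM dbo.ApellidosDemo WHERE ApellidoCIAS = N'Hernández';
SELECT COUNT(*) AS MatchesCIAI FROM dbo.ApellidosDemo WHERE ApellidoCIAI = N'Hernández';
SELECT COUNT(*) AS MatchesCSAS FROM dbo.ApellidosDemo WHERE ApellidoCSAS = N'Hernández';
```

With these four sample rows, the case-insensitive, accent-sensitive column should match the two accented spellings, the fully insensitive column all four, and the fully sensitive column only the exact spelling.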
As you can see, the same equality condition applied to the different columns returns different results: 2, 4, and 1 rows, respectively, match the string 'Hernández'.
CI/CS AI/AS
This is because some collations are case sensitive and others are case insensitive (CS = Case Sensitive, CI = Case Insensitive). In a case-insensitive collation, `A` is equal to `a`, while in a case-sensitive one it is not.
Some collations are also accent sensitive (AS = Accent Sensitive, AI = Accent Insensitive). In an accent-insensitive collation, `á` is equal to `a`, while in an accent-sensitive one they are not equal.
All the possible combinations exist: CI_AI, CI_AS, CS_AI, CS_AS. In CI_AI, `á` is equal to `A`; in CS_AS, `á` is only equal to itself.
Comparisons between columns
When you try to compare two columns that have different collations, SQL Server raises a collation conflict error.
This is basically saying that by having different rules to determine the match on each of the strings, the engine is unable to decide which rule to apply.
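A minimal sketch that reproduces this kind of failure, using a hypothetical table variable `@Nombres` with two columns of different collations (names, data and collations are assumptions):

```sql
-- Two columns holding similar text under different collations.
DECLARE @Nombres TABLE (
    NombreCIAS nvarchar(50) COLLATE Modern_Spanish_CI_AS,
    NombreCIAI nvarchar(50) COLLATE Modern_Spanish_CI_AI
);

INSERT INTO @Nombres (NombreCIAS, NombreCIAI)
VALUES (N'Hernández', N'hernandez');

-- Expected to fail with error 468, whose text has the form:
--   Cannot resolve the collation conflict between "..." and "..."
--   in the equal to operation.
SELECT *
FROM @Nombres
WHERE NombreCIAS = NombreCIAI;
```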
To solve this, we can explicitly indicate which collation we want to apply to any given column with the syntax `column COLLATE collation_name`. So, for example, we can apply to one column the collation of the other.
Or we could apply to both a collation different from the one they have, as long as we apply the same one to both sides of the equality.
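Both variants can be sketched as follows, again with a hypothetical table variable `@Nombres` whose names, data and collations are assumptions:

```sql
DECLARE @Nombres TABLE (
    NombreCIAS nvarchar(50) COLLATE Modern_Spanish_CI_AS,
    NombreCIAI nvarchar(50) COLLATE Modern_Spanish_CI_AI
);

INSERT INTO @Nombres (NombreCIAS, NombreCIAI)
VALUES (N'Hernández', N'hernandez');

-- Variant 1: give one column the collation of the other.
-- Under CI_AI the two spellings compare as equal, so the row is returned.
SELECT *
FROM @Nombres
WHERE NombreCIAS COLLATE Modern_Spanish_CI_AI = NombreCIAI;

-- Variant 2: apply the same third collation to both sides.
-- Under CS_AS the spellings differ in case and accent, so no row is returned.
SELECT *
FROM @Nombres
WHERE NombreCIAS COLLATE Modern_Spanish_CS_AS = NombreCIAI COLLATE Modern_Spanish_CS_AS;
```

The explicit `COLLATE` only affects how the comparison is evaluated in that query; the columns keep their own collations.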
Ordering
Finally, the ordering I leave to the reader, since this answer is already too long for the Stack Overflow format. Just try `ORDER BY` under the different collations and compare the results.
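As a starting point for that experiment, here is a small sketch with a hypothetical table variable `@Apellidos` (name and sample data are assumptions). The result order is deliberately not listed, since rows that a collation treats as equal can come back in any order:

```sql
DECLARE @Apellidos TABLE (Apellido nvarchar(50));

INSERT INTO @Apellidos (Apellido)
VALUES (N'hernandez'), (N'Hernández'), (N'Hernando'), (N'Hernandez'), (N'hernández');

-- Case and accent are ignored when sorting.
SELECT Apellido FROM @Apellidos ORDER BY Apellido COLLATE Modern_Spanish_CI_AI;

-- Case and accent both take part in the sort order.
SELECT Apellido FROM @Apellidos ORDER BY Apellido COLLATE Modern_Spanish_CS_AS;

-- Binary collation: sorts by code point rather than by linguistic rules.
SELECT Apellido FROM @Apellidos ORDER BY Apellido COLLATE Latin1_General_100_BIN2;
```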
What collations are available
SQL Server comes with a large number of collations, useful for different world regions, languages, and usage patterns. You can see a list of all available collations on your instance with the following query:
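A sketch of such a query, using the built-in `sys.fn_helpcollations()` function; the second statement narrows the list to one family and is only an illustration of filtering:

```sql
-- Every collation the instance supports, with a short description.
SELECT name, description
FROM sys.fn_helpcollations()
ORDER BY name;

-- Only the collations whose name starts with "Modern_Spanish".
SELECT name, description
FROM sys.fn_helpcollations()
WHERE name LIKE N'Modern[_]Spanish%'
ORDER BY name;
```

The `[_]` in the pattern matches a literal underscore, which `LIKE` would otherwise treat as a single-character wildcard.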
Recommendation
For Latin languages in general, the recommendation is to use the `Latin1_General_100*` collations; for Spanish in particular, `Modern_Spanish*`.