Recently, we have written quite often about the QR code and how to work with it in our products.


However, we have barely touched on the technical aspects of using the QR code. So today, let's talk about data encoding in a QR code and the problems that can come with it.

The general algorithm for applying encodings

The QR code supports several modes of recording information. Typically, the mode selection algorithm works as follows: the input data is analyzed to check whether it can be written in one of the compact modes (Numeric, Alphanumeric, Kanji). If it cannot, the Byte mode is used.
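The mode check described above can be sketched in a few lines of Python. This is only an illustration, not the code used in our product: the function names are hypothetical, the character sets and Shift JIS ranges follow the QR code specification, and the sketch picks a single mode for the whole input.

```python
# Illustrative sketch only; choose_mode and _fits_kanji are hypothetical names.
NUMERIC = set("0123456789")
ALPHANUMERIC = set("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:")


def _fits_kanji(text):
    """True if every character is a double-byte Shift JIS value in the Kanji-mode ranges."""
    try:
        raw = text.encode("shift_jis")
    except UnicodeEncodeError:
        return False
    if len(raw) != 2 * len(text):
        return False  # at least one character was encoded as a single byte
    for i in range(0, len(raw), 2):
        value = (raw[i] << 8) | raw[i + 1]
        if not (0x8140 <= value <= 0x9FFC or 0xE040 <= value <= 0xEBBF):
            return False
    return True


def choose_mode(text):
    """Return the most compact mode that can hold the whole text."""
    if text and all(ch in NUMERIC for ch in text):
        return "Numeric"
    if text and all(ch in ALPHANUMERIC for ch in text):
        return "Alphanumeric"
    if text and _fits_kanji(text):
        return "Kanji"
    return "Byte"


for sample in ("0123456789", "HELLO WORLD", "hello"):
    print(sample, "->", choose_mode(sample))
```

Note that lowercase Latin letters are not part of the Alphanumeric set, so even a simple string like "hello" falls through to the Byte mode.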

In the Byte mode, the input data is encoded using one of the encodings in the ECI (Extended Channel Interpretation) list. The algorithm checks each encoding in turn and determines whether it can be used to encode all characters of the input data. If none is suitable, the universal encoding "UTF-8" is used.

This is all done to keep the barcode as small as possible: in UTF-8, every non-ASCII character takes two or more bytes, so UTF-8 usually produces the largest barcode.
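To make the selection and the size difference more tangible, here is a rough Python sketch. It is an assumption-laden illustration rather than our actual implementation: the function name is hypothetical, the order in which encodings are tried simply follows the list in this article, and Python's built-in codecs stand in for the ECI encodings.

```python
# Assumption: the check order simply follows the list in this article.
CANDIDATES = [
    "Cp437", "ISO-8859-1", "ISO-8859-2", "ISO-8859-3", "ISO-8859-4",
    "ISO-8859-5", "ISO-8859-6", "ISO-8859-7", "ISO-8859-8", "ISO-8859-9",
    "ISO-8859-11", "ISO-8859-13", "ISO-8859-15", "Shift_JIS",
    "Windows-1250", "Windows-1251", "Windows-1252", "Windows-1256",
]


def pick_byte_encoding(text):
    """Return (encoding name, encoded bytes) for the first encoding that fits."""
    for name in CANDIDATES:
        try:
            return name, text.encode(name)
        except UnicodeEncodeError:
            continue  # some character is missing in this encoding, try the next one
    # Nothing fits, so fall back to the universal encoding.
    return "UTF-8", text.encode("utf-8")


text = "Привет"
name, data = pick_byte_encoding(text)
print(name, len(data), "bytes")
print("UTF-8", len(text.encode("utf-8")), "bytes")
```

For this six-letter Cyrillic word, a single-byte encoding needs 6 bytes, while UTF-8 needs 12, which is exactly why the single-byte encodings are tried first.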

The list of ECI standard encodings for the QR code that are supported in our product:

  • "Cp437";
  • "ISO-8859-1";
  • "ISO-8859-2";
  • "ISO-8859-3";
  • "ISO-8859-4";
  • "ISO-8859-5";
  • "ISO-8859-6";
  • "ISO-8859-7";
  • "ISO-8859-8";
  • "ISO-8859-9";
  • "ISO-8859-11";
  • "ISO-8859-13";
  • "ISO-8859-15";
  • "Shift_JIS";
  • "Windows-1250";
  • "Windows-1251";
  • "Windows-1252";
  • "Windows-1256";
  • "UTF-8".
Note: the JS version supports fewer encodings (this is due to technical limitations of the platform):

  • "ISO-8859-1";
  • "Windows-1250";
  • "Windows-1251";
  • "Windows-1252";
  • "Windows-1256";
  • "UTF-8".

Incorrect encoding problem

Different barcode readers may support different sets of encodings for decoding QR codes. For example, barcode readers in some countries only support certain encodings most commonly used in that country.

Also, many mobile applications for reading barcodes only support certain encodings, and some applications do not support the ECI standard at all.

Luckily, nowadays, most scanners are able to handle UTF-8 encoding.

Some of the encodings in the list above overlap. For example, both ISO-8859-5 and Windows-1251 contain Cyrillic characters, so which of them is used for a given text depends only on the input data. This matters because some scanners do not work with ISO-8859-5 and require Windows-1251.
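A short Python sketch shows why this matters. It is only an illustration using Python's built-in codecs: the same Cyrillic text produces different bytes in the two encodings, so a reader that assumes the wrong one displays garbage.

```python
text = "Привет"

# The same text, written with two different Cyrillic encodings.
raw_iso = text.encode("iso-8859-5")
raw_win = text.encode("windows-1251")
print(raw_iso.hex(), raw_win.hex())

# A reader that assumes Windows-1251 for ISO-8859-5 data shows the wrong letters.
misread = raw_iso.decode("windows-1251")
print(misread)

# Decoding with the encoding that was actually used restores the text.
print(raw_iso.decode("iso-8859-5"))
```

Both byte strings have the same length, so the barcode size is identical. Only the interpretation on the scanner side differs.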

Solution

If you need to specify the encoding from which the selection starts, use the static property StiOptions.Engine.BarcodeQRCodeDefaultByteModeEncoding.

For example, let's make Windows-1251 the default encoding:

StiOptions.Engine.BarcodeQRCodeDefaultByteModeEncoding = Stimulsoft.Report.BarCodes.StiQRCodeECIMode.Windows_1251;
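
The effect of such a setting can be imitated with a small Python sketch. This is a simplified model, not the product code: the function name, the `preferred` parameter, and the shortened candidate list are hypothetical, and we assume that "starting the selection" with an encoding means trying it before all the others.

```python
# Shortened, hypothetical candidate list, used only for this illustration.
CANDIDATES = ["ISO-8859-1", "ISO-8859-5", "Windows-1251", "Windows-1252"]


def pick_byte_encoding(text, preferred=None):
    """Return the name of the first encoding that can represent the whole text."""
    order = list(CANDIDATES)
    if preferred is not None:
        # Assumption: the preferred encoding is simply tried first.
        order = [preferred] + [name for name in order if name != preferred]
    for name in order:
        try:
            text.encode(name)
            return name
        except UnicodeEncodeError:
            continue
    return "UTF-8"


print(pick_byte_encoding("Привет"))
print(pick_byte_encoding("Привет", preferred="Windows-1251"))
```

Without a preferred encoding, the Cyrillic text ends up in ISO-8859-5 because it comes first in the list. With Windows-1251 set as the starting point, the same text is encoded in Windows-1251.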

The problem of "three extra characters at the start of the barcode"

Sometimes users complain that extra characters appear at the start of the scanned data. This is not a bug but a side effect of how the data is encoded in the barcode. If the input data is encoded using UTF-8, many programs (including our reporting tool) prefix it with a BOM (Byte Order Mark), which helps some applications determine the encoding. However, not all scanners recognize the BOM, and in that case three extra characters (typically displayed as "ï»¿") appear at the beginning of the text.
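
The effect is easy to reproduce outside of any barcode. The Python sketch below is only an illustration: it builds a UTF-8 payload with a BOM and decodes it the way a BOM-unaware reader (assumed here to use Windows-1252) and a BOM-aware reader would.

```python
import codecs

# UTF-8 payload prefixed with the three BOM bytes EF BB BF.
payload = codecs.BOM_UTF8 + "Hello".encode("utf-8")

# A reader that ignores the BOM and assumes a single-byte encoding
# turns each BOM byte into a visible character.
legacy_view = payload.decode("cp1252")

# A BOM-aware reader strips the mark and returns the clean text.
aware_view = payload.decode("utf-8-sig")

print(repr(legacy_view))
print(repr(aware_view))
```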

Solution

To avoid this, set the StiOptions.Engine.BarcodeQRCodeAllowUnicodeBOM option to false so that no BOM is added to the input data.
If you have any questions, please contact us.