bardecode.com
barcode reading software

QR-Codes and UTF-8 encoding

admin | October 2, 2013 | Barcode Specification, Documentation, Knowledge Base, Settings

This article explains issues relating to the use of UTF-8 encoded data in QR-Codes.

There are several ways in which data can be stored in a QR-Code, one of which is called Byte Mode. This mode encodes data as a sequence of 8-bit byte values, in other words an array of integers ranging from 0 to 255. By default these values are interpreted using the ISO/IEC 8859-1 character set, in which each value maps to at most one character, so no more than 256 characters are available. To extend the range of characters it is possible to use an alternative encoding such as UTF-8, which represents each non-ASCII character as a sequence of two to four bytes. However, there are some issues that developers should be aware of.
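As a small illustration of the difference between the two encodings, the following Python sketch prints the byte values that would be stored in Byte Mode for the same text. It uses only Python's built-in codecs ("latin-1" is Python's name for ISO/IEC 8859-1); the variable names are illustrative and nothing here is part of the Softek toolkit.

```python
# Compare the Byte Mode payload for the same character in both encodings.
text = "æ"
latin1_bytes = text.encode("latin-1")  # ISO/IEC 8859-1: one byte per character
utf8_bytes = text.encode("utf-8")      # UTF-8: two bytes for this character

print(latin1_bytes.hex(" "))  # e6
print(utf8_bytes.hex(" "))    # c3 a6

# A character outside ISO/IEC 8859-1, such as the euro sign, cannot be
# stored with the default interpretation at all, but UTF-8 can encode it.
try:
    "€".encode("latin-1")
    euro_fits_latin1 = True
except UnicodeEncodeError:
    euro_fits_latin1 = False

euro_utf8 = "€".encode("utf-8")
print(euro_fits_latin1)    # False
print(euro_utf8.hex(" "))  # e2 82 ac
```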

Auto-detection of UTF-8

A barcode reader does not automatically know that data has been encoded using UTF-8 unless an ECI has been used (see below). However, it can make a guess.

A test can be applied to a sequence of byte values to see whether or not it can be interpreted as UTF-8 data. However, passing this test does not mean that the sequence isn’t also valid ISO/IEC 8859-1 data. For example, the character æ is represented by the UTF-8 byte sequence C3 A6 (hex values), but in ISO/IEC 8859-1 the bytes C3 A6 represent the two characters Ã and ¦. The test therefore relies on the fact that the combination Ã¦ is unlikely to appear in real text.

In practice this test works well but developers should be aware of the possibility of incorrect interpretation of valid ISO/IEC 8859-1 data.
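The kind of test described above can be sketched in a few lines of Python. This is only an illustration of the general heuristic (try a strict UTF-8 decode, fall back to ISO/IEC 8859-1), not the algorithm used inside the Softek toolkit; the function name decode_byte_mode and its auto_utf8 argument are hypothetical.

```python
def decode_byte_mode(payload: bytes, auto_utf8: bool = True) -> str:
    """Hypothetical helper: guess the encoding of a Byte Mode payload."""
    if auto_utf8:
        try:
            # Strict decoding fails on any byte sequence that is not valid UTF-8.
            return payload.decode("utf-8")
        except UnicodeDecodeError:
            pass
    # Fall back to the default interpretation; every byte value decodes.
    return payload.decode("latin-1")


# C3 A6 is valid UTF-8, so the guess is a single character.
print(decode_byte_mode(b"\xc3\xa6"))                   # æ
# With the guess disabled the same bytes give two ISO/IEC 8859-1 characters.
print(decode_byte_mode(b"\xc3\xa6", auto_utf8=False))  # Ã¦
# A lone E6 is not valid UTF-8, so the fallback is used.
print(decode_byte_mode(b"\xe6"))                       # æ
```

The second call shows the ambiguity: if the author of the barcode really meant Ã¦ in ISO/IEC 8859-1, the automatic guess would return æ instead.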

Extended Channel Interpretation (ECI)

A more reliable way to encode UTF-8 data in a QR-Code is to include an ECI block in the data to explicitly inform the reader that the next block of bytes uses UTF-8 rather than the default ISO/IEC 8859-1 encoding. The ECI block should carry the assignment number 000026, which identifies UTF-8. Please refer to section 6.4.2 of ISO/IEC 18004 Second Edition 2006-09-01 for further information.
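For developers generating their own barcodes, the following Python sketch shows how the ECI header bits for assignment number 000026 might be built. It assumes the layout described in ISO/IEC 18004: a 4-bit ECI mode indicator (0111) followed by a designator of one, two or three codewords depending on the size of the assignment number. The function name eci_header_bits is hypothetical, and the standard should be checked before relying on this.

```python
def eci_header_bits(assignment: int) -> str:
    """Hypothetical helper: bit string for an ECI header in a QR-Code."""
    if not 0 <= assignment <= 999999:
        raise ValueError("ECI assignment number must be 0..999999")
    if assignment <= 127:
        designator = format(assignment, "08b")          # 0bbbbbbb
    elif assignment <= 16383:
        designator = "10" + format(assignment, "014b")  # 10bbbbbb bbbbbbbb
    else:
        designator = "110" + format(assignment, "021b")  # 110bbbbb + 2 codewords
    return "0111" + designator  # ECI mode indicator, then the designator


# UTF-8 is assignment number 000026, which fits in a single codeword.
print(eci_header_bits(26))  # 011100011010
```

In a complete symbol this header would be followed by the Byte Mode indicator, the character count and the UTF-8 bytes themselves.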

Versions of SoftekBarcode.dll

Versions of SoftekBarcode.dll starting from 7.5.1.37 support automatic detection of UTF-8 data and ECI/UTF-8 interpretation. An advanced parameter called QRCodeAutoUTF8 controls automatic detection of UTF-8 and can be set to either 1 (True) or 0 (False). If barcodes are using an ECI block then it is safe to set QRCodeAutoUTF8 to 0 (False), which avoids the risk of misinterpreting valid ISO/IEC 8859-1 data as UTF-8. Please contact support@bardecode.com if you require a version with these features.


Copyright © 2021 Softek Software Ltd. All Rights Reserved