Double encoded UTF-8 strings in C#

Posted by admin on November 27, 2012 in BardecodeFiler, Settings, Software Development Kits

This article shows how to convert a string that has been double encoded using UTF-8 back to its original value.

For example, say you have the string MÃ¼ller instead of the string Müller.

How did it happen?

The letter ü is encoded in UTF-8 as 2 bytes: 195 and 188

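If those two bytes are then treated as two separate characters and encoded again, the 195 (which is the character Ã) converts to the two bytes 195 and 131

And the 188 (which is the character ¼) converts to the two bytes 194 and 188. The single letter ü is now stored as four bytes, which display as Ã¼ when decoded once as UTF-8.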
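The byte values above can be checked with a small C# sketch. The class and method names here (DoubleEncodingDemo, Utf8Bytes, MisreadAsLatin1) are made up for illustration and are not part of any Softek toolkit; the sketch simply assumes the second encoding step treated each UTF-8 byte as one ISO-8859-1 character.

```csharp
using System;
using System.Text;

// Hypothetical helper class, for illustration only.
public static class DoubleEncodingDemo
{
    // Returns the UTF-8 bytes of the text as space-separated decimal values.
    public static string Utf8Bytes(string text)
    {
        return string.Join(" ", Encoding.UTF8.GetBytes(text));
    }

    // Simulates the mistake: encode as UTF-8, then read every byte back
    // as if it were a single ISO-8859-1 character.
    public static string MisreadAsLatin1(string text)
    {
        byte[] utf8Bytes = Encoding.UTF8.GetBytes(text);
        return Encoding.GetEncoding("ISO-8859-1").GetString(utf8Bytes);
    }
}
```

Calling Utf8Bytes("\u00FC") (the letter ü) gives "195 188". Passing the result of MisreadAsLatin1("\u00FC") back into Utf8Bytes gives "195 131 194 188", which is the double encoded form described above.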

How can it be converted back to what it should look like?

The following function will convert the double encoded string back to the original value:

private string decodeUTF8String(String utf8Str)
{
    // ISO-8859-1 maps each character U+0000..U+00FF to the single byte with the same value.
    System.Text.Encoding iso = System.Text.Encoding.GetEncoding("ISO-8859-1");
    System.Text.Encoding utf8 = System.Text.Encoding.UTF8;

    // Undo the second encoding: turn each character back into the byte it was misread from.
    byte[] utfBytes = utf8.GetBytes(utf8Str);
    byte[] isoBytes = System.Text.Encoding.Convert(utf8, iso, utfBytes);

    // The recovered bytes are the original UTF-8 data, so decode them once as UTF-8.
    return utf8.GetString(isoBytes);
}
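As a usage sketch, the same repair can be written in one line and tried on the Müller example. The names RepairDemo and Repair are hypothetical. For well-formed input this shorter form should give the same result as the function above, because getting the ISO-8859-1 bytes of the string directly is what the GetBytes and Convert steps do together.

```csharp
using System;
using System.Text;

// Hypothetical helper class, for illustration only.
public static class RepairDemo
{
    // Map each character back to one ISO-8859-1 byte, then decode those bytes once as UTF-8.
    public static string Repair(string doubleEncoded)
    {
        byte[] originalBytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(doubleEncoded);
        return Encoding.UTF8.GetString(originalBytes);
    }
}
```

Repair("M\u00C3\u00BCller") (the string MÃ¼ller) returns "M\u00FCller" (Müller). Note that this only makes sense for strings that really are double encoded. A string that is already correct, such as Müller, would be damaged, because the single byte for ü is not valid UTF-8 on its own and is decoded as a replacement character.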

 

How does this relate to barcodes?

Some PDF-417 barcodes may contain data that has already been encoded using UTF-8. When we read the barcode we encode the data again using UTF-8, giving a double encoded string. In the Win32 DLL interface the work-around is simply to set the Encoding property to 0, but the conversion above is necessary in the .NET interface.
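Because only some barcodes are affected, it may be safer to repair a value only when it actually looks double encoded. The following is a minimal sketch of one possible guard, not part of the toolkit: the names BarcodeTextRepair and RepairIfDoubleEncoded are made up, and the check is a heuristic that assumes the misread bytes were mapped to characters using ISO-8859-1.

```csharp
using System;
using System.Text;

// Hypothetical helper class, for illustration only.
public static class BarcodeTextRepair
{
    // Strict decoder: throws on invalid UTF-8 instead of substituting U+FFFD.
    private static readonly Encoding StrictUtf8 = new UTF8Encoding(false, true);
    private static readonly Encoding Latin1 = Encoding.GetEncoding("ISO-8859-1");

    public static string RepairIfDoubleEncoded(string value)
    {
        bool hasNonAscii = false;
        foreach (char c in value)
        {
            // A character above U+00FF cannot have come from a single misread byte.
            if (c > 0xFF) return value;
            if (c > 0x7F) hasNonAscii = true;
        }

        // Pure ASCII text is identical in both encodings, so there is nothing to repair.
        if (!hasNonAscii) return value;

        try
        {
            // Succeeds only if the characters form a valid UTF-8 byte sequence.
            return StrictUtf8.GetString(Latin1.GetBytes(value));
        }
        catch (DecoderFallbackException)
        {
            // Not valid UTF-8, so assume the value was already correct.
            return value;
        }
    }
}
```

With this guard, MÃ¼ller is converted to Müller, while Müller and plain ASCII values are returned unchanged. A value that legitimately contains a sequence such as Ã¼ would still be converted, so the check is best applied only to barcode types known to carry UTF-8 data.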


  • Copyright © 2023 Softek Software Ltd. All Rights Reserved