{"id":648,"date":"2012-11-27T15:55:52","date_gmt":"2012-11-27T15:55:52","guid":{"rendered":"http:\/\/www.bardecode.com\/newsite\/?p=648"},"modified":"2018-12-17T17:33:41","modified_gmt":"2018-12-17T17:33:41","slug":"how-to-convert-utf-8-data-into-a-string","status":"publish","type":"post","link":"https:\/\/www.bardecode.com\/newsite\/how-to-convert-utf-8-data-into-a-string\/","title":{"rendered":"How to convert UTF-8 data into a String"},"content":{"rendered":"<p>If you have an array of UTF-8 bytes and want to convert them into a String then the following may help&#8230;<\/p>\n<p>As you may know, UTF-8 is a way of encoding every character in the Unicode character set using a variable number of bytes per character. For example, the letter A just needs 1 byte but the character\u00a0\ub070 requires 3 bytes (eb 81 b0).<\/p>\n<p>In VB.Net this is pretty straightforward&#8230;<\/p>\n<p>Start out with the 3 bytes in an array&#8230;<\/p>\n<pre>Dim bytes() As Byte = New Byte() {&amp;HEB, &amp;H81, &amp;HB0}\r\nDim str As String = System.Text.Encoding.UTF8.GetString(bytes)<\/pre>\n<p>And now str contains \ub070<\/p>\n<p>But what if you started out with an IntPtr to an unmanaged C-style string?<\/p>\n<p>In that case you would need to marshal the data into a byte array and then do the above, as in the following function&#8230;<\/p>\n<pre>\u00a0 \u00a0 Public Function ConvertUTF8IntPtrtoString(ByVal ptr As System.IntPtr) As String\r\n\u00a0 \u00a0 \u00a0 \u00a0 ' Count the bytes up to the terminating null\r\n\u00a0 \u00a0 \u00a0 \u00a0 Dim l As Integer = 0\r\n\u00a0 \u00a0 \u00a0 \u00a0 While System.Runtime.InteropServices.Marshal.ReadByte(ptr, l) &lt;&gt; 0\r\n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 l += 1\r\n\u00a0 \u00a0 \u00a0 \u00a0 End While\r\n\u00a0 \u00a0 \u00a0 \u00a0 ' Dim takes the upper bound, so (l - 1) gives exactly l elements\r\n\u00a0 \u00a0 \u00a0 \u00a0 Dim utf8data(l - 1) As Byte\r\n\u00a0 \u00a0 \u00a0 \u00a0 System.Runtime.InteropServices.Marshal.Copy(ptr, utf8data, 0, l)\r\n\u00a0 \u00a0 \u00a0 \u00a0 Return System.Text.Encoding.UTF8.GetString(utf8data)\r\n\u00a0 \u00a0 End Function<\/pre>\n<p>&nbsp;<\/p>\n<p>And the following C++ function will do the same in MFC:<\/p>\n<p>&nbsp;<\/p>\n<pre>int 
CSampleBarcodeReaderDlg::ConvertUTF8Value(LPCSTR in, CString &amp;out)\r\n{\r\n   \/\/ First call returns the required size in wide characters, including the terminating null\r\n   int l = MultiByteToWideChar(CP_UTF8, 0, in, -1, NULL, 0);\r\n   if (l &lt;= 0) { out.Empty(); return 0; }   \/\/ conversion failed\r\n   wchar_t *str = new wchar_t[l];\r\n   int r = MultiByteToWideChar(CP_UTF8, 0, in, -1, str, l);\r\n   out = str;\r\n   delete[] str;   \/\/ array form, to match new[]\r\n   return r;\r\n}<\/pre>\n<p>&nbsp;<\/p>\n<p>A frustrating twist on the above is when you have a representation of UTF-8 already in a String and would like to convert it to a normal String. There are probably smarter ways of doing this, but here&#8217;s a take on it&#8230;<\/p>\n<p>In this example utf8 starts out as a string that happens to contain a representation of UTF-8 data, one character per byte. This is converted, character by character, to a byte array and then back to a String using UTF-8 encoding. In this case str ends up with the value\u00a0?.<\/p>\n<p>&nbsp;<\/p>\n<pre>\u00a0 \u00a0 \u00a0 \u00a0 Dim utf8 As String = \"\u00e7?\u00b3\"\r\n\u00a0 \u00a0 \u00a0 \u00a0 Dim ch() As Char = utf8.ToCharArray()\r\n\u00a0 \u00a0 \u00a0 \u00a0 ' Dim takes the upper bound, so (ch.Length - 1) gives one byte per character\r\n\u00a0 \u00a0 \u00a0 \u00a0 Dim bytes(ch.Length - 1) As Byte\r\n\u00a0 \u00a0 \u00a0 \u00a0 For i As Integer = 0 To (ch.Length - 1)\r\n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 ' ToByte throws an OverflowException if the character is above 255\r\n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 bytes(i) = System.Convert.ToByte(ch(i))\r\n\u00a0 \u00a0 \u00a0 \u00a0 Next\r\n\u00a0 \u00a0 \u00a0 \u00a0 Dim str As String = System.Text.Encoding.UTF8.GetString(bytes)<\/pre>\n","protected":false},"excerpt":{"rendered":"<p>If you have an array of UTF-8 bytes and want to convert them into a String then the following may help&#8230; As you may know, UTF-8 is a way of encoding every character in the Unicode character set using a variable number of bytes per character. 
For example, the letter A just needs 1 byte &hellip; <\/p>\n","protected":false},"author":1,"featured_media":314,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[31],"tags":[19,93,177,131,65,62,178,176],"class_list":["post-648","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-software-development-kits","tag-barcode","tag-convert","tag-data","tag-reader","tag-sdk-2","tag-software","tag-string","tag-utf-8"],"_links":{"self":[{"href":"https:\/\/www.bardecode.com\/newsite\/wp-json\/wp\/v2\/posts\/648"}],"collection":[{"href":"https:\/\/www.bardecode.com\/newsite\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.bardecode.com\/newsite\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.bardecode.com\/newsite\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.bardecode.com\/newsite\/wp-json\/wp\/v2\/comments?post=648"}],"version-history":[{"count":6,"href":"https:\/\/www.bardecode.com\/newsite\/wp-json\/wp\/v2\/posts\/648\/revisions"}],"predecessor-version":[{"id":2425,"href":"https:\/\/www.bardecode.com\/newsite\/wp-json\/wp\/v2\/posts\/648\/revisions\/2425"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.bardecode.com\/newsite\/wp-json\/wp\/v2\/media\/314"}],"wp:attachment":[{"href":"https:\/\/www.bardecode.com\/newsite\/wp-json\/wp\/v2\/media?parent=648"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.bardecode.com\/newsite\/wp-json\/wp\/v2\/categories?post=648"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.bardecode.com\/newsite\/wp-json\/wp\/v2\/tags?post=648"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}