Aug 09, 2010 by johnbentley

Byte Order Mark wrongly reported in HEX edit mode for UTF encodings

issue number: 94
type: bug
status: invalid
priority: n/a
in version: 4.5.4 (2356)
tags:
stars: 0

What steps will reproduce the problem?
* Open or create a file.
* Enter some text.
* Select Format > UTF-8.
* View > Hex Edit Mode
* The first four hex characters display as FFFE, regardless of whether "Ident. Bytes in UTF-8 encoding" is checked in settings.
* FFFE is wrongly displayed for the other UTF encodings as well (UTF-8 and UTF-16 BE). It should be FFFE only for UTF-16 LE, and only if "Ident. Bytes in UTF-8 encoding" is checked.

Open in Hex Editor (which reads bytes as on disk) seems to work fine.

What is the expected output?
"Ident. Bytes in UTF-8 encoding": unchecked + UTF-8 = [No BOM]
"Ident. Bytes in UTF-8 encoding": unchecked + UTF-16 LE = [No BOM]
"Ident. Bytes in UTF-8 encoding": unchecked + UTF-16 BE = [No BOM]

"Ident. Bytes in UTF-8 encoding": checked + UTF-8 = EFBB BF
"Ident. Bytes in UTF-8 encoding": checked + UTF-16 LE = FFFE
"Ident. Bytes in UTF-8 encoding": checked + UTF-16 BE = FEFF

What do you see instead?
"Ident. Bytes in UTF-8 encoding": unchecked + UTF-8 = FFFE
"Ident. Bytes in UTF-8 encoding": unchecked + UTF-16 LE = FFFE
"Ident. Bytes in UTF-8 encoding": unchecked + UTF-16 BE = FFFE

"Ident. Bytes in UTF-8 encoding": checked + UTF-8 = FFFE
"Ident. Bytes in UTF-8 encoding": checked + UTF-16 LE = FFFE
"Ident. Bytes in UTF-8 encoding": checked + UTF-16 BE = FFFE

What's your operating system?
Windows XP Pro, SP3
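
For reference, the byte order marks in the expected-output table above can be checked with a short Python sketch. It uses only the BOM constants from Python's standard codecs module; the helper name expected_prefix and the boolean flag standing in for the "Ident. Bytes in UTF-8 encoding" setting are illustrative assumptions, not anything PSPad exposes.

```python
import codecs

# Standard BOM byte sequences, taken from Python's codecs module.
BOMS = {
    "UTF-8": codecs.BOM_UTF8,
    "UTF-16 LE": codecs.BOM_UTF16_LE,
    "UTF-16 BE": codecs.BOM_UTF16_BE,
}


def expected_prefix(encoding: str, ident_bytes_checked: bool) -> str:
    """Return the hex digits expected at the start of the file.

    ident_bytes_checked is a hypothetical stand-in for the
    "Ident. Bytes in UTF-8 encoding" setting.
    """
    if not ident_bytes_checked:
        return ""  # no BOM expected when the setting is unchecked
    return BOMS[encoding].hex().upper()


for name in BOMS:
    for checked in (False, True):
        state = "checked" if checked else "unchecked"
        print(f"{state} + {name} = {expected_prefix(name, checked) or '[No BOM]'}")
```

With the setting checked, this yields EFBBBF, FFFE and FEFF for UTF-8, UTF-16 LE and UTF-16 BE respectively, matching the expected table.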


Aug 09, 2010 by pspad

When you switch from text to HEX, you will ALWAYS see UTF-16LE encoding, because that is the encoding the text is edited in. You are seeing the in-memory representation.
If you want to see the real file content, open the file directly in the HEX editor.

It's the program's philosophy; you can read about it in the help.
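
The difference between the two views can be sketched in Python. This is only an illustration under stated assumptions: the function names are hypothetical, and the in-memory view is modelled as a UTF-16LE BOM followed by the UTF-16LE bytes of the text, which is consistent with the FFFE reported above but is not confirmed PSPad internals.

```python
import codecs


def on_disk_bytes(text: str, encoding: str = "utf-8") -> bytes:
    """Bytes as written to the file (no BOM), i.e. what opening the
    file directly in a hex editor would show."""
    return text.encode(encoding)


def in_memory_view(text: str) -> bytes:
    """Assumed model of the editing buffer: UTF-16LE BOM followed by
    the text encoded as UTF-16LE, whatever the file's own encoding."""
    return codecs.BOM_UTF16_LE + text.encode("utf-16-le")


text = "Hi"
print(on_disk_bytes(text).hex().upper())   # 4869
print(in_memory_view(text).hex().upper())  # FFFE48006900
```

Under this model the hex view starts with FFFE for every file, because it reflects the buffer rather than the encoding selected in the Format menu.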

Aug 09, 2010 by pspad

  • status changed from new to invalid

Aug 12, 2010 by johnbentley

Jan, continued thanks for your enormous effort in building the best Text Editor in the world.

Given that this is by design (a program philosophy) then I appreciate it is not a bug. It continues, then, as a suggestion.

For I see nothing in PSPad help 4.5.4 on
* Working with PSPad > Hex editor; nor
* Menu Description > Format Menu

that would explain why anyone working on a UTF-8 file (in memory) would want to see a UTF-16LE byte order mark, nor why, when "Ident. Bytes in UTF-8 encoding" is unchecked, anyone would want to see any byte order mark at all (for files under Hex-In-Memory view).

Incidentally, in my previous post I meant to write
<quote>
What do you see instead?
"Ident. Bytes in UTF-8 encoding": unchecked + UTF-8 = [No byte order mark]
"Ident. Bytes in UTF-8 encoding": unchecked + UTF-16 LE = [No byte order mark]
"Ident. Bytes in UTF-8 encoding": unchecked + UTF-16 BE = [No byte order mark]
...
</quote>

Aug 28, 2010 by Orlando

@ johnbentley

This longstanding bug is not acknowledged by the PSPad author.

http://forum.pspad.com/read.php?4,44014
