Byte Order Mark wrongly reported in HEX edit mode for UTF encodings
What steps will reproduce the problem?
* Open or create a file.
* Enter some text.
* Select Format > UTF-8.
* View > Hex Edit Mode
* The first four hex characters display as FFFE, regardless of whether "Ident. Bytes in UTF-8 encoding" in settings is checked.
* FFFE is wrongly displayed for the other UTF encodings selected (UTF-8 and UTF-16 BE). (It should be FFFE only for UTF-16 LE, and only if "Ident. Bytes in UTF-8 encoding" is checked.)
Open in Hex Editor (which reads bytes as on disk) seems to work fine.
What is the expected output?
"Ident. Bytes in UTF-8 encoding": unchecked + UTF-8 = [No BOM]
"Ident. Bytes in UTF-8 encoding": unchecked + UTF-16 LE = [No BOM]
"Ident. Bytes in UTF-8 encoding": unchecked + UTF-16 BE = [No BOM]
"Ident. Bytes in UTF-8 encoding": checked + UTF-8 = EFBB BF
"Ident. Bytes in UTF-8 encoding": checked + UTF-16 LE = FFFE
"Ident. Bytes in UTF-8 encoding": checked + UTF-16 BE = FEFF
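For reference, the expected values above match the standard byte order marks. A minimal Python sketch that reproduces the table follows; the helper name `expected_bom` and the `ident_bytes_checked` flag are hypothetical and only mirror the PSPad setting, they are not part of PSPad itself.

```python
import codecs

# Standard byte order marks as defined in Python's codecs module.
BOMS = {
    "UTF-8": codecs.BOM_UTF8,          # EF BB BF
    "UTF-16 LE": codecs.BOM_UTF16_LE,  # FF FE
    "UTF-16 BE": codecs.BOM_UTF16_BE,  # FE FF
}

def expected_bom(encoding: str, ident_bytes_checked: bool) -> str:
    """Return the hex digits expected at the start of the file.

    `ident_bytes_checked` is a hypothetical stand-in for the
    "Ident. Bytes in UTF-8 encoding" setting; an empty string means no BOM.
    """
    if not ident_bytes_checked:
        return ""
    return BOMS[encoding].hex().upper()

if __name__ == "__main__":
    for checked in (False, True):
        for name in BOMS:
            state = "checked" if checked else "unchecked"
            print(f"{state} + {name} = {expected_bom(name, checked) or '[No BOM]'}")
```

Note that the sketch prints the UTF-8 mark as EFBBBF, without the two-byte grouping a hex view would show.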
What do you see instead?
"Ident. Bytes in UTF-8 encoding": unchecked + UTF-8 = FFFE
"Ident. Bytes in UTF-8 encoding": unchecked + UTF-16 LE = FFFE
"Ident. Bytes in UTF-8 encoding": unchecked + UTF-16 BE = FFFE
"Ident. Bytes in UTF-8 encoding": checked + UTF-8 = FFFE
"Ident. Bytes in UTF-8 encoding": checked + UTF-16 LE = FFFE
"Ident. Bytes in UTF-8 encoding": checked + UTF-16 BE = FFFE
What's your operating system?
Windows XP Pro, SP3
Aug 09, 2010 by pspad
- status changed from new to invalid
Aug 12, 2010 by johnbentley
Jan, continued thanks for your enormous effort in building the best Text Editor in the world.
Given that this is by design (a program philosophy), I appreciate that it is not a bug. It stands, then, as a suggestion.
For I see nothing in PSPad help 4.5.4 on
* Working with PSPad > Hex editor; nor
* Menu Description > Format Menu
that would explain why anyone working on a UTF-8 file (in memory) would want to see a UTF-16 LE byte order mark. Nor, if "Ident. Bytes in UTF-8 encoding" is unchecked, why anyone would want to see any byte order mark at all (for files under Hex-In-Memory view).
Incidentally, in my previous post I meant to write
<quote>
What do you see instead?
"Ident. Bytes in UTF-8 encoding": unchecked + UTF-8 = [No byte order mark]
"Ident. Bytes in UTF-8 encoding": unchecked + UTF-16 LE = [No byte order mark]
"Ident. Bytes in UTF-8 encoding": unchecked + UTF-16 BE = [No byte order mark]
...
</quote>
Aug 28, 2010 by Orlando
@ johnbentley
This longstanding bug is not acknowledged by the PSPad author.
http://forum.pspad.com/read.php?4,44014
Aug 09, 2010 by pspad
When you switch from text to HEX, you will ALWAYS see UTF-16LE encoding, because that is the encoding you edit text in. You see the memory representation.
If you want to see the real file content, open the file directly in the HEX editor.
It's program philosophy; you can read about it in the help.
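The distinction between the in-memory view and the on-disk bytes can be illustrated with a short Python sketch. It assumes, as the reply above describes, that the editor buffer is held as UTF-16 LE with a leading FFFE mark; the sample text and variable names are hypothetical, and this does not reproduce PSPad's actual internals.

```python
import codecs

text = "Hi"  # hypothetical sample content

# What a file saved as UTF-8 without identification bytes contains on disk.
on_disk_utf8 = text.encode("utf-8")

# What a file saved as UTF-8 with identification bytes (BOM) contains on disk.
on_disk_utf8_bom = text.encode("utf-8-sig")

# Assumed in-memory representation: UTF-16 LE preceded by its BOM,
# independent of the encoding chosen for saving.
in_memory = codecs.BOM_UTF16_LE + text.encode("utf-16-le")

if __name__ == "__main__":
    print("on disk, UTF-8, no BOM:  ", on_disk_utf8.hex().upper())
    print("on disk, UTF-8, with BOM:", on_disk_utf8_bom.hex().upper())
    print("in memory (assumed):     ", in_memory.hex().upper())
```

Under that assumption, a hex view of the buffer starts with FFFE whatever the save format, while a hex view of the file starts with the BOM of the chosen encoding or with none at all.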