Thextedit user manual - version 0.1.0

A simple text editor focused on character set encodings.
It is useful for analyzing the binary data behind text: it lets you experiment with codecs (character set encodings), the BOM (Byte Order Mark), and EOL (End Of Line) marks, and convert files from one encoding to another.
It features a basic text editor and a hex viewer that lets you identify which character maps to which byte(s), which may not be straightforward for multi-byte encodings. Powered by the "Qt" widgets toolkit, it supports most existing encodings (from ASCII to Unicode).
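
To illustrate the character-to-byte mapping mentioned above, here is a minimal Python sketch. It is not part of Thextedit (which is built on Qt); the helper name byte_map is hypothetical and it relies only on Python's built-in codecs.

```python
def byte_map(text, encoding):
    """Return (character, hex bytes) pairs for text in the given encoding.

    Hypothetical helper for illustration only, not Thextedit code.
    """
    pairs = []
    for ch in text:
        raw = ch.encode(encoding)  # the bytes this single character maps to
        pairs.append((ch, " ".join(f"{b:02x}" for b in raw)))
    return pairs


if __name__ == "__main__":
    # The same two characters occupy a different number of bytes per encoding.
    for enc in ("latin-1", "utf-8", "utf-16-le"):
        print(enc, byte_map("a\u00e9", enc))  # "\u00e9" is e with acute accent
```

In UTF-8 the accented letter takes two bytes (c3 a9) while in Latin-1 it takes one (e9), which is the kind of difference the hex viewer makes visible.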



Release info

Version: 0.1.0, 10/10/2010

Linux version built with QtCreator 2.0.0 based on Qt 4.7.0 (32 bit)

Windows version built with QtCreator 2.0.0 based on Qt 4.6.3 (32 bit)

This is the first beta version. It may be considered a prototype for exploring the features the editor could offer, but it is still incomplete and has several performance issues. In particular, it lacks large file support (it becomes very slow with files larger than 1 MB).
Planned development:

Install

The latest binary and source packages can be downloaded from: http://sourceforge.net/projects/thext/
Thextedit ships as a single executable and needs no setup: just extract the archive and double-click the executable.
The Linux version is linked against Qt4, which should already be present on your system (otherwise it is readily available in most distributions).
The Windows version is statically linked, so no external library is required.

User Interface reference

Thextedit is a TDI (Tabbed Document Interface) application. Each document consists of a text/hex editor, described in the next section.
Here is a list of the UI commands (menus, toolbars, keyboard shortcuts):

The editor

Plain text view

The plain text view offers a very basic text editor; nevertheless, it is useful for experimenting with various text encodings and for converting text documents.
Please note that the encoding and BOM selections are intentionally left independent, so that commonly seen text decoding errors can be reproduced. The BOM is hidden from the text only if the encoding matches it, as in all Unicode-enabled programs. When a BOM-marked document is converted from the corresponding encoding, the BOM is converted as well; if the encoding does not match the BOM, the conversion produces extravagant (though realistic) results.
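
The effect of a matching versus a mismatched encoding/BOM pair can be reproduced outside the editor. The following is a minimal Python sketch using the standard codecs module; it only approximates the behavior described above and is not how Thextedit itself is implemented. The sample word is an arbitrary choice.

```python
import codecs

text = "caf\u00e9"  # arbitrary sample word ending in e with acute accent

# A UTF-8 document marked with a BOM: ef bb bf 63 61 66 c3 a9
data = codecs.BOM_UTF8 + text.encode("utf-8")

# Encoding matches the BOM: "utf-8-sig" strips the BOM, so it stays hidden.
matched = data.decode("utf-8-sig")

# Encoding does not match the BOM: the BOM bytes show up as three stray
# characters, and the two-byte accented letter turns into two characters.
mismatched = data.decode("latin-1")

# Converting from the matching encoding: the BOM is converted as well.
converted = codecs.BOM_UTF16_LE + matched.encode("utf-16-le")

if __name__ == "__main__":
    print(repr(matched))
    print(repr(mismatched))
    print(converted.hex())
```

The mismatched result is the classic garbled text seen when a UTF-8 file is opened as Latin-1.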


While editing, the status bar shows up-to-date information about the document.
From left to right, the following information is displayed:

Hex view

In hex view the editor is split into the usual 3 areas:

Note that while, as usual, all non-printable characters are rendered with a '.' in the text area, the "extra" bytes of a multi-byte character are rendered with a space rather than a dot, to make the character/byte association easier to recognize.
When the cursor moves in either the text or the binary area, the highlighted selections are kept synchronized, and a different highlight color is used when a multi-byte character is encountered.
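
The rendering rule described above (a dot for non-printable characters, a space for each extra byte of a multi-byte character) can be sketched in a few lines of Python. This is an illustrative approximation under the assumption that each byte gets one text cell; the function name hex_row is hypothetical and is not taken from the Thextedit sources.

```python
def hex_row(text, encoding="utf-8"):
    """Return (hex area, text area) strings for one row of a hex view.

    Hypothetical sketch: one text cell per byte, so both areas stay aligned.
    """
    hex_cells = []
    text_cells = []
    for ch in text:
        raw = ch.encode(encoding)
        hex_cells.extend(f"{b:02x}" for b in raw)
        # First byte: the character itself, or '.' if it is not printable.
        shown = ch if ch.isprintable() else "."
        # Extra bytes of a multi-byte character: a space, not a dot.
        text_cells.append(shown + " " * (len(raw) - 1))
    return " ".join(hex_cells), "".join(text_cells)


if __name__ == "__main__":
    hex_area, text_area = hex_row("a\u00e9\n")
    print(hex_area)
    print(repr(text_area))
```

With this layout the text area has exactly as many cells as there are bytes, which is what makes a synchronized selection between the two areas possible.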

License

Copyright (C) 2010 Attilio Pavone <tilly@utillyty.eu>

Thext is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

Thext is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with thext. If not, see http://www.gnu.org/licenses/.

And here's your copy of the licence: GNU General Public License version 3

Development

Source packages: sourceforge.net/projects/thext
SVN repository: svn co https://thext.svn.sourceforge.net/svnroot/thext thext

Revision history

Version: 0.1.0, 10/10/2010
Built late at night in Macerata...
This is the first public release