Posts Tagged ‘utf-8’

Convert a file to UTF-8-encoded text

December 16, 2017

I wrote a simple script that takes an input file, changes its character encoding to UTF-8, and prints the result to the screen.

It’s actually a wrapper around the Unix commands “file” and “iconv”. The goal was to make its usage as simple as possible. The script is here:


$ input.txt

The program tries to detect the encoding of the input file.
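The idea of wrapping “file” and “iconv” can be sketched in Python as follows. This is not the script from this post, only a minimal illustration under assumptions: the function names (detect_encoding, iconv_command, to_utf8) are made up, and it assumes both commands are installed and on the PATH.

```python
import subprocess


def detect_encoding(path):
    """Return the charset reported by `file`, e.g. 'iso-8859-1' or 'utf-8'."""
    result = subprocess.run(
        ["file", "--brief", "--mime-encoding", path],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def iconv_command(encoding, path):
    """Build the iconv call that converts `path` from `encoding` to UTF-8."""
    return ["iconv", "-f", encoding, "-t", "UTF-8", path]


def to_utf8(path):
    """Detect the encoding of `path` and return its content as UTF-8 text."""
    encoding = detect_encoding(path)
    # `file` reports these values when it cannot identify a text encoding
    if encoding in ("binary", "unknown-8bit"):
        raise ValueError("cannot detect a text encoding for " + path)
    converted = subprocess.run(
        iconv_command(encoding, path), capture_output=True, check=True
    )
    return converted.stdout.decode("utf-8")
```

With these helpers, print(to_utf8("input.txt")) would print the converted text to the screen. Detection by “file” is a guess based on the file's bytes, so the result is only as good as that guess.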


Categories: bash, python

Reading (writing) unicode text from (to) files

August 6, 2015

You want to write some special characters to a file (e.g. f.write("voilá")), but you immediately get a Unicode error in your face.

Instead of messing with the encode and decode methods, use the codecs module.

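import codecs

# "input.txt" and "output.txt" are placeholder file names

# read: the file's bytes are decoded from UTF-8 into unicode text
with codecs.open("input.txt", "r", "utf-8") as f:
    text = f.read()

# write: unicode text is encoded to UTF-8 on its way to the file
with codecs.open("output.txt", "w", "utf-8") as to:
    to.write(text)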

As can be seen, its usage is very similar to the well-known open function.

This tip is from here.
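To see that the text survives the trip, here is a small round-trip sketch. The temporary file and the variable names are only for illustration; the sample string is the one from the example above.

```python
import codecs
import os
import tempfile

text = u"voil\u00e1"  # the same string as u"voilá"

# throwaway file in a fresh temporary directory
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# write: unicode text goes in, UTF-8 bytes land on disk
with codecs.open(path, "w", "utf-8") as to:
    to.write(text)

# read: UTF-8 bytes are decoded back into unicode text
with codecs.open(path, "r", "utf-8") as f:
    round_trip = f.read()

# peek at the raw bytes to confirm the on-disk encoding
with open(path, "rb") as raw:
    raw_bytes = raw.read()
```

Here round_trip equals text, and raw_bytes shows that the single character á is stored as the two bytes 0xC3 0xA1.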

Categories: python

Print unicode text to the terminal

September 2, 2012

I wrote a script in Eclipse-PyDev that prints some text with accented characters to the standard output. It runs fine in the IDE but it breaks in the console:

UnicodeEncodeError: 'ascii' codec can't encode character u'\xf3' in position 11: ordinal not in range(128)
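The same failure can be reproduced without a terminal by encoding accented text to ASCII by hand. A small sketch, with a made-up sample string:

```python
text = u"accented: \u00f3"  # \u00f3 is the character ó (u'\xf3' in the traceback)

try:
    # roughly what printing does when the output encoding is ASCII
    text.encode("ascii")
    error = None
except UnicodeEncodeError as exc:
    error = exc

# UTF-8 can represent ó, so this succeeds
utf8_bytes = text.encode("utf-8")
```

ASCII only covers code points below 128, which is what “ordinal not in range(128)” refers to; ó is code point 243.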

This bugged me for a long time, but I finally found a working solution.

Insert the following in your source code:

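import sys
reload(sys)  # Python 2 only: brings back setdefaultencoding, which site.py removes at startup
sys.setdefaultencoding("utf-8")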

I found this trick here. “This allows you to switch from the default ASCII to other encodings such as UTF-8, which the Python runtime will use whenever it has to decode a string buffer to unicode.”
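The trick quoted above changes a process-wide default. A narrower alternative (not the method from this post) is to wrap the output stream in a writer that encodes to UTF-8 explicitly. A minimal sketch, where io.BytesIO stands in for the terminal's byte stream so the result can be inspected; the names raw and out are illustrative:

```python
import codecs
import io

raw = io.BytesIO()                    # stand-in for a byte-oriented stdout
out = codecs.getwriter("utf-8")(raw)  # encodes unicode text on every write

out.write(u"ez egy pr\u00f3ba\n")     # text containing ó (u'\xf3')
encoded = raw.getvalue()              # the bytes that would reach the terminal
```

In a real script the wrapped stream would be sys.stdout in Python 2, or sys.stdout.buffer in Python 3, instead of raw.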


Categories: python