remove punctuation from a text

February 5, 2017

Problem
You have some text and you want to remove the punctuation from it. Example:

in:
"Hello! It is time to remove punctuations. It is easy, you will see."

out:
"Hello It is time to remove punctuations It is easy you will see"

Solution
Let’s see a Python 3 solution:

>>> import string
>>> tr = str.maketrans("", "", string.punctuation)
>>> s = "Hello! It is time to remove punctuations. It is easy, you will see."
>>> s.translate(tr)
'Hello It is time to remove punctuations It is easy you will see'

Docs: str.maketrans(), str.translate().
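
One caveat: string.punctuation only lists the ASCII punctuation characters, so typographic marks such as ’ survive the translation above. Here is a minimal sketch of a Unicode-aware variant, assuming that "punctuation" means every character whose Unicode category starts with "P". The function name remove_punctuation is made up for this example.

```python
import unicodedata

def remove_punctuation(text):
    """Drop every character whose Unicode category starts with 'P'."""
    return "".join(
        c for c in text
        if not unicodedata.category(c).startswith("P")
    )

print(remove_punctuation("Let’s go, now!"))   # Lets go now
```

This is not a drop-in replacement for the translate() solution: characters such as $, + or ^ are part of string.punctuation, but Unicode classifies them as symbols (category "S"), so this variant keeps them.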

Categories: python

4k input limit in terminal

January 23, 2017

Problem
Today I ran into a strange problem. Take this code:

s = input("text> ")
print(len(s))

If the input is very long, then it is truncated to 4096 characters (I tried it under Linux). The same happens when you do “cat | wc” and paste in a long string. What???

Solution
It turns out that the kernel limits a line of terminal input to 4096 characters in canonical mode (link). But how do you overcome this problem?

0) Well, probably the best way is not to insert such a long string in the terminal. Pass it through a pipe (“cat long.txt | wc” does work) or read it from a file.

But, if you really want to paste in a long string, here is what you can do:

1) With the command “stty -icanon” you can disable the canonical mode. Paste in the string, and then I think it’s a good idea to enable the canonical mode again with “stty icanon” (link).

2) Under Python I found a simple way. Just “import readline” and it solved the issue for me. I tried it with an 11,000-character string and it worked.
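
Options 0) and 2) can be combined in one helper. This is only a sketch: the name read_long_line and its stream parameter are made up for the example, and it assumes that the readline module is available (it usually is under Linux).

```python
import sys

def read_long_line(prompt="text> ", stream=None):
    """Read one line of text, sidestepping the 4k terminal limit."""
    stream = sys.stdin if stream is None else stream
    if stream.isatty():
        # Interactive terminal: importing readline is enough to change
        # how input() reads the line (option 2 above).
        import readline  # noqa: F401
        return input(prompt)
    # Piped or redirected input is not affected by the limit (option 0).
    return stream.readline().rstrip("\n")
```

Calling print(len(read_long_line())) in a script then works both when you paste the text into the terminal and when you run the script as “cat long.txt | python3 script.py”.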

Thanks to #python on IRC for helping to solve this issue.

Categories: python

moving from unipath to pathlib

January 10, 2017

Unipath is a very nice 3rd-party library for an object-oriented approach to Python file/directory operations. Just look at this sane API:

>>> from unipath import Path
>>> p = Path("/usr/lib/python2.5/gopherlib.py")
>>> p.parent
Path("/usr/lib/python2.5")
>>> p.name
Path("gopherlib.py")
>>> p.ext
'.py'
>>> p.stem
Path('gopherlib')
>>> q = Path(p.parent, p.stem + p.ext)
>>> q
Path('/usr/lib/python2.5/gopherlib.py')
>>> q == p
True

However, a very similar module called pathlib landed in the Python 3 standard library (since Python 3.4). It is almost the same as unipath, but since it’s in the standard library, I think I’ll switch to it. It means one less external dependency, which is always a good thing.

Let’s see what it looks like:

>>> from pathlib import Path
>>> p = Path("/usr/lib/python2.5/gopherlib.py")
>>> p.parent
PosixPath('/usr/lib/python2.5')
>>> p.name
'gopherlib.py'
>>> p.suffix    # !!! called suffix, not ext !!!
'.py'
>>> p.stem
'gopherlib'
>>> q = Path(p.parent, p.stem + p.suffix)
>>> q
PosixPath('/usr/lib/python2.5/gopherlib.py')
>>> q == p
True
>>>

There is one important difference, though. Unipath’s Path is a subclass of str, so whenever a function needs a string, you can pass a Path object. That is not true for pathlib’s PosixPath: under Python 3.5, if a function expects a string, you need to convert the PosixPath manually with str().

Example:

>>> import os
>>> from pathlib import Path
>>> p = Path("/usr/lib/python2.5/gopherlib.py")
>>> os.path.exists(p)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.5/genericpath.py", line 19, in exists
    os.stat(path)
TypeError: argument should be string, bytes or integer, not PosixPath
>>> os.path.exists(str(p))    # here
False
>>> 
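
For what it’s worth, Python 3.6 introduced the os.PathLike protocol, and from that version on functions like os.path.exists() and open() accept a Path object directly. Here is a small sketch of the two ways to get the string form. PurePosixPath is used only so that the example gives the same result on every platform.

```python
import os
from pathlib import PurePosixPath

p = PurePosixPath("/usr/lib/python2.5/gopherlib.py")

as_str = str(p)            # works on every Python version that has pathlib
as_fspath = os.fspath(p)   # Python 3.6+: asks the object for its path string

print(as_str)                 # /usr/lib/python2.5/gopherlib.py
print(as_str == as_fspath)    # True
```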

Some other features

>>> from pathlib import Path
>>> Path.home()
PosixPath('/home/jabba')    # Were you also fed up with os.path.expanduser('~') ?
>>> p = Path('/tmp/na.txt')
>>> p.chmod(0o644)
>>> p.exists()
True
>>> p.is_file()
True
>>> p.is_dir()
False
>>> 
>>> p = Path('/tmp/ehh.txt')
>>> p.exists()
False
>>> p.touch()    # At last! We have `touch` in the stdlib!
>>> p.exists()
True

Painless read from / write to file

>>> p = Path('my_text_file')
>>> p.write_text('Text file contents')    # newline is NOT added automatically
18
>>> p.read_text()
'Text file contents'

More details in the official docs.

Categories: python

validate an IP address (either IPv4 or IPv6)

January 9, 2017

Problem
You want to validate an IP address. However, it can be either IPv4 or IPv6.

Solution
Python 3 has a built-in module for this: ipaddress. Example:

>>> import ipaddress
>>> ipaddress.ip_address('192.168.0.1')
IPv4Address('192.168.0.1')
>>> ipaddress.ip_address('2001:db8::')
IPv6Address('2001:db8::')

If the IP is invalid, you get a ValueError exception.
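
If you only need a yes/no answer, you can wrap the call in a small helper. This is a minimal sketch and the name is_valid_ip is made up for the example.

```python
import ipaddress

def is_valid_ip(text):
    """Return True if text is a valid IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(text)
    except ValueError:
        return False
    return True

print(is_valid_ip("192.168.0.1"))   # True
print(is_valid_ip("2001:db8::"))    # True
print(is_valid_ip("999.1.1.1"))     # False
```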

Categories: python

Bash-Utils updated

January 9, 2017

I have several projects on GitHub (link) but some of them are either abandoned or outdated. So I want to review and update all of them.

I started this process with Bash-Utils. The Python 2 codebase was moved entirely to Python 3. The old Python 2 source is tagged and available under the “release” link, but I won’t touch that anymore. Only the current version (Python 3) will be updated. The README file has been converted to Markdown, and the new scripts are documented too.

Today I added a script called “rep.py” that allows you to execute a bash command several times. Example:

$ rep 3 echo hello
hello
hello
hello

It will execute “echo hello” three times.
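
The idea behind such a script fits in a few lines. The sketch below is not the actual source of rep.py, just an illustration: the function name repeat is made up, and unlike the real script it runs the command directly, without going through bash.

```python
import subprocess
import sys

def repeat(count, command):
    """Run command (a list of arguments) count times; return the exit codes."""
    return [subprocess.run(command).returncode for _ in range(count)]

# Portable demo: run a tiny Python one-liner three times.
codes = repeat(3, [sys.executable, "-c", "print('hello')"])
print(codes)
```

Under Linux, repeat(3, ["echo", "hello"]) would correspond to the “rep 3 echo hello” call above.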

Categories: python

update all packages with pip in your virtual environment

January 5, 2017

Problem
You want to update all installed packages in your virtual environment.

Solution

$ pip install pip-review
$ pip-review --local --interactive

Tip from here.

Categories: python

Type text to an application from a script

December 28, 2016

Problem
Today I saw a nice motivational video: Girl does push ups for 100 days time lapse. Great, let’s do the same! I sit in front of my computer several hours a day, so some pushups won’t hurt :) But how to track the days?

I use Trello for some TODO lists, and it allows you to create a checklist. When you type a text and press Enter, a new checklist item is created. But typing “Day 1<Enter>”, “Day 2<Enter>”, … “Day 100<Enter>” is too much, I would die of boredom by the end… How to automate the input?

Solution
Under Linux there is a command called “xdotool” that (among other things) lets you programmatically simulate keyboard input. “xdotool key D” simulates pressing “D”, “xdotool key KP_Enter” is equivalent to pressing Enter, etc.

Here is the script:

#!/usr/bin/env python3
# coding: utf-8

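import subprocess
from time import sleep

PRE_WAIT = 3      # seconds to wait before the typing starts

REPEAT = 100      # number of checklist items to create
WAIT = 0.3        # pause between two items, in seconds

def my_type(text):
    """Type the text into the focused window, one key press at a time."""
    for c in text:
        if c == " ":
            key = "KP_Space"
        elif c == "\n":
            key = "KP_Enter"
        else:
            key = c
        # pass an argument list, so the character is never interpreted by a shell
        subprocess.run(["xdotool", "key", key])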

def main():
    print("You have {} seconds to switch to the application...".format(PRE_WAIT))
    sleep(PRE_WAIT)
    #
    for i in range(1, REPEAT+1):
        text = "Day {}\n".format(i)
        my_type(text)
        print("#", text)
        sleep(WAIT)

##############################################################################

if __name__ == "__main__":
    main()

Create a checklist in Trello, start adding a new item, launch this script and switch back to Trello. The script will automatically create the items for the days.

screenshot


In the screenshot the dates were added manually. As you can see, I could do 20 pushups the very first day. Not bad :)

Update (20170302)
If you want to figure out the key name (keysym) that xdotool expects for a key, start “xev -event keyboard” and press the given key. For instance, if you want xdotool to press “á” for you, xev will tell you that the keysym of “á” is “aacute”, thus the command to generate “á” is “xdotool key aacute”.

To avoid the special key codes, here is another idea: copy the text to the clipboard (see the command xsel for instance), then paste it with xdotool key "shift+Insert".
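
A minimal sketch of the clipboard idea follows. The function name paste_text and its run parameter are made up for the example (run only exists so that the function can be tried without a running X session). It also assumes that the target application pastes the clipboard on Shift+Insert; some applications paste the primary selection instead.

```python
import subprocess

def paste_text(text, run=subprocess.run):
    """Put text on the clipboard with xsel, then paste it with xdotool."""
    # xsel reads the new clipboard content from its standard input
    run(["xsel", "--clipboard", "--input"],
        input=text.encode("utf-8"), check=True)
    # simulate Shift+Insert in the currently focused window
    run(["xdotool", "key", "shift+Insert"], check=True)
```

In the script above, my_type(text) could then be replaced with paste_text(text). Whether the trailing "\n" creates a new checklist item when pasted depends on the application, so you may still need a separate “xdotool key KP_Enter”.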

Categories: python