Posts Tagged ‘cron’

APScheduler examples

August 6, 2013

“Advanced Python Scheduler (APScheduler) is a light but powerful in-process task scheduler that lets you schedule functions (or any other python callables) to be executed at times of your choosing.” (source)

The simplest way to schedule jobs using the built-in triggers is to use one of the shortcut methods provided by the scheduler: date-based, interval-based, or cron-style scheduling.

Let’s see an example of each.

(1) simple date-based scheduling

The official doc. is here. “This is the simplest possible method of scheduling a job. It schedules a job to be executed once at the specified time. This is the in-process equivalent to the UNIX “at” command.”

#!/usr/bin/env python

import sys
from time import sleep
from datetime import datetime   # only needed for the datetime variant below
from apscheduler.scheduler import Scheduler   # APScheduler 2.x API

sched = Scheduler()
sched.start()        # start the scheduler

# define the function that is to be executed
# it will be executed in a thread by the scheduler
def my_job(text):
    print(text)

def main():
    # the run date can be given as a datetime object or as a string
    # job = sched.add_date_job(my_job, datetime(2013, 8, 5, 23, 47, 5), ['text'])
    job = sched.add_date_job(my_job, '2013-08-05 23:47:05', ['text'])
    while True:
        sleep(1)
        sys.stdout.write('.'); sys.stdout.flush()

##############################################################

if __name__ == "__main__":
    main()

Meaning: at the specified date and time, call the function my_job with the parameter “text”. The line with “sched.add_date_job” only registers the task; the script then continues immediately with the next line. If that were the last line, the script would terminate before the job could run, so we need an infinite loop to keep it alive. At the specified time, the registered function is triggered and executed in a thread while the infinite loop keeps running in parallel.
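To see the "register now, run later in a thread" behaviour in isolation, here is a minimal sketch that uses only the standard library's threading.Timer instead of APScheduler. It is not how APScheduler is implemented; the helper name run_once_after and the results list are made up for this illustration, and the code is written for Python 3.

```python
import threading
import time

results = []

def my_job(text):
    # runs in the timer's own thread, not in the main thread
    results.append((text, threading.current_thread().name))

def run_once_after(delay_seconds, func, args):
    """Schedule func(*args) to run once after delay_seconds; return the timer."""
    timer = threading.Timer(delay_seconds, func, args=args)
    timer.daemon = True   # do not keep the process alive just for the timer
    timer.start()
    return timer

timer = run_once_after(0.2, my_job, ['text'])
# the call above returns immediately; the main thread keeps working
for _ in range(5):
    time.sleep(0.1)
print(results)
```

As in the APScheduler version, the main thread has to stay alive long enough for the job to fire; here the short sleep loop plays the role of the infinite loop.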

(2) interval-based scheduling

The official doc. is here. “This method schedules jobs to be run on selected intervals. The execution of the job starts after the given delay, or on start_date if specified. After that, the job will be executed again after the specified delay.”

The skeleton of the script is the same as in the first example, so only the differing lines are shown here.

# from now on, execute my_job every minute
job = sched.add_interval_job(my_job, minutes=1, args=['text'])

# or:

# start at start_date (my_job is called) and then execute my_job every minute
job = sched.add_interval_job(my_job, minutes=1, start_date='2013-08-06 00:09:12', args=['text'])

In the first case: if you launch the script at 09:10:12 (hh:mm:ss), my_job will be called at 09:11:12 for the first time, then at 09:12:12, 09:13:12, etc.

In the second case: you specify when to call my_job for the first time (on August 6, 2013 at 00:09:12), then it will be executed again at 00:10:12, 00:11:12, etc.
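The timing in both cases can be checked with plain datetime arithmetic. The sketch below does not call APScheduler; interval_run_times is a hypothetical helper written for this illustration, and it assumes that start_date, when given, is not earlier than the launch time. It is written for Python 3.

```python
from datetime import datetime, timedelta

def interval_run_times(launch, interval, count, start_date=None):
    """Return the first `count` run times of an interval job.

    Without start_date the first run is one interval after launch;
    with start_date the first run is at start_date itself.
    """
    first = start_date if start_date is not None else launch + interval
    return [first + i * interval for i in range(count)]

every_minute = timedelta(minutes=1)

# first case: launched at 09:10:12, no start_date
launch = datetime(2013, 8, 6, 9, 10, 12)
for t in interval_run_times(launch, every_minute, 3):
    print(t.strftime('%H:%M:%S'))   # 09:11:12, 09:12:12, 09:13:12

# second case: start_date given explicitly
start = datetime(2013, 8, 6, 0, 9, 12)
for t in interval_run_times(datetime(2013, 8, 5, 23, 0, 0), every_minute, 3, start):
    print(t.strftime('%H:%M:%S'))   # 00:09:12, 00:10:12, 00:11:12
```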

(3) cron-style scheduling

The official doc. is here. “This is the most powerful scheduling method available in APScheduler. You can specify a variety of different expressions on each field, and when determining the next execution time, it finds the earliest possible time that satisfies the conditions in every field. This behavior resembles the “Cron” utility found in most UNIX-like operating systems.”

The skeleton of the script is the same as in the first example, so only the differing line is shown here.

job = sched.add_cron_job(my_job, minute="*/15", args=['text'])

The syntax is similar to cron’s syntax. Here is a visual crontab utility called corntab.

The example above means: execute my_job every 15 minutes, on the quarter hours. So, if you launch the script at Xh08 (8 minutes past hour X), it will be executed for the first time at Xh15, then at Xh30, Xh45, (X+1)h00, (X+1)h15, etc.
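To make the "*/15" rule concrete, here is a small sketch that computes the upcoming run times by hand. It does not use APScheduler's own trigger code; next_step_minute_runs is a hypothetical helper, and it assumes that the job fires at second 0 and that only times strictly after the launch moment count. It is written for Python 3.

```python
from datetime import datetime, timedelta

def next_step_minute_runs(now, step, count):
    """Return the next `count` times after `now` whose minute is a multiple of `step`."""
    # start at the beginning of the following minute and walk forward
    candidate = now.replace(second=0, microsecond=0) + timedelta(minutes=1)
    runs = []
    while len(runs) < count:
        if candidate.minute % step == 0:
            runs.append(candidate)
        candidate += timedelta(minutes=1)
    return runs

launch = datetime(2013, 8, 6, 10, 8, 0)   # 8 minutes past the hour
for t in next_step_minute_runs(launch, 15, 5):
    print(t.strftime('%H:%M'))   # 10:15, 10:30, 10:45, 11:00, 11:15
```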

Common

If you want to unregister a task, do this:

sched.unschedule_job(job)

This is why we stored the return value in a variable called “job”.

You can also print the scheduled jobs in a human-readable format. The output includes each job’s next run time, so it’s great for debugging:

job = sched.add_...
sched.print_jobs()

Sample output:

Jobstore default:
    my_job (trigger: date[2013-08-06 23:47:05], next run at: 2013-08-06 23:47:05)

Create a temporary file with unique name

February 19, 2011

Problem

I wanted to download an HTML file with Python, store it in a temporary file, then convert this file to PDF by calling an external program.

Solution #1

#!/usr/bin/env python

import os
import tempfile

temp = tempfile.NamedTemporaryFile(prefix='report_', suffix='.html', dir='/tmp', delete=False)

html_file = temp.name
# same path, with the .html extension replaced by .pdf
pdf_file = os.path.splitext(html_file)[0] + '.pdf'

print(html_file)   # /tmp/report_kWKEp5.html
print(pdf_file)    # /tmp/report_kWKEp5.pdf
# calling of HTML to PDF converter is omitted

See the documentation of tempfile.NamedTemporaryFile here.
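For completeness, here is a sketch of the same idea wrapped into two small functions, including the step of actually writing the HTML into the temporary file. The function names save_html_to_temp and pdf_path_for are made up for this example, the HTML string stands in for the downloaded page, and the code is written for Python 3.

```python
import os
import tempfile

def pdf_path_for(html_path):
    """Return html_path with its extension replaced by .pdf."""
    root, _ext = os.path.splitext(html_path)
    return root + '.pdf'

def save_html_to_temp(html_text, directory=None):
    """Write html_text to a uniquely named .html file and return its path."""
    with tempfile.NamedTemporaryFile(mode='w', prefix='report_', suffix='.html',
                                     dir=directory, delete=False) as temp:
        temp.write(html_text)
    # delete=False keeps the file on disk after the with block closes it
    return temp.name

html_file = save_html_to_temp('<html><body>report</body></html>')
pdf_file = pdf_path_for(html_file)
print(html_file)
print(pdf_file)
os.remove(html_file)   # clean up once the converter has run
```

Because delete=False is used, removing the file afterwards is the caller's job.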

Solution #2 (update 20110303)

I had a problem with the previous solution. It works well from the command line, but when I called the script from crontab, it stopped at the line “tempfile.NamedTemporaryFile”. No exception, nothing… So I had to use a different approach:

from time import time

temp = "report.%.7f.html" % time()
print temp    # report.1299188541.3830960.html

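The function time() returns the current time as a floating-point number of seconds. Two calls made at nearly the same moment could yield the same name, so this approach may not be suitable in a multithreaded environment, but that was not a concern in my case. This version works fine when called from crontab.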
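If collisions between threads or processes are a concern, one possible variation (still without the tempfile module) is to mix the process ID and a random token into the name. This is only a sketch: unique_report_name is a hypothetical helper, it makes clashes very unlikely rather than impossible, and it does not create the file. It is written for Python 3.

```python
import os
import uuid
from time import time

def unique_report_name(prefix='report', suffix='.html'):
    """Build a file name from the timestamp, the process ID and a random token."""
    token = uuid.uuid4().hex[:8]   # 8 random hex digits
    return "%s.%.7f.%d.%s%s" % (prefix, time(), os.getpid(), token, suffix)

print(unique_report_name())
```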

Learn more

Update (20150712): if you need a temp. file name in the current directory:

>>> import tempfile
>>> tempfile.NamedTemporaryFile(dir='.').name
'/home/jabba/tmpKrBzoY'

Update (20150910): if you need a temp. directory:

import tempfile
import shutil

dirpath = tempfile.mkdtemp()    # the temp dir. is created
# ... do stuff with dirpath
shutil.rmtree(dirpath)

This tip is from here.
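On Python 3 (3.2 or later), tempfile.TemporaryDirectory can do the cleanup automatically when used as a context manager. A minimal sketch, where write_in_temp_dir and the file name report.html are made up for the example:

```python
import os
import tempfile

def write_in_temp_dir(text):
    """Write text to a file inside a temp dir; the dir is removed on exit."""
    with tempfile.TemporaryDirectory(prefix='report_') as dirpath:
        path = os.path.join(dirpath, 'report.html')
        with open(path, 'w') as f:
            f.write(text)
        existed = os.path.exists(path)
    # leaving the with block deletes the directory and everything in it
    return dirpath, existed

dirpath, existed = write_in_temp_dir('<html></html>')
print(existed)                   # True
print(os.path.exists(dirpath))   # False
```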
