Posts Tagged ‘news’

News extraction

January 25, 2014

With newspaper, you can do “news extraction, article extraction and content curation in python. Built with multithreading, 10+ languages, NLP, ML, and more!”

I haven’t tried it yet, but if you need a corpus of news articles, this project can help.

Categories: python

Python news in French

January 13, 2014

I just came across the site http://news.humancoders.com , a news aggregator in French. Users can submit and discuss news there. It has a subpage dedicated to Python.



“Human Coders News is a service for sharing the best resources found on the web about a specific topic. You can browse all the news on the home page, or click on a topic to filter.”

Categories: python

Guido leaves Google for Dropbox

December 9, 2012
Categories: python

The News Television Project (HírTV)

October 18, 2010

In this post I describe how to watch news on a Hungarian site. Although the video that we want to play is in Hungarian, you might get some ideas that you can use in a different project.

Project description

I currently live abroad, and sometimes I want to watch the news in my mother tongue. The Hungarian News Television (HírTV) collects its news programs at http://www.hirtv.hu/view/videoview/hirado . Each video there has a URL of the form http://www.hirtv.net/filmek/hirado21/hiradoYYYYMMDD.wmv , where YYYYMMDD is the date (for instance http://www.hirtv.net/filmek/hirado21/hirado20101018.wmv). Instead of starting a web browser, visiting this page and clicking on a link, I want to launch the news video with a Python script.

Difficulty

When the script runs, the news for the current day may not have been uploaded yet, so we need to check whether the URL exists. However, when you request a WMV file that doesn’t exist, the HírTV web server returns an HTML page instead of reporting that the URL is missing. We therefore have to check the Content-Type of the response: text/html means the video is missing, while video/x-ms-wmv means it is available.
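The Content-Type check can be sketched on its own as follows, in Python 3. The helper names is_wmv() and fetch_content_type() are hypothetical, and the HEAD request is an assumption: it avoids downloading the video body, but whether the HírTV server answers HEAD the same way as GET has not been verified here.

```python
import urllib.request

WMV = 'video/x-ms-wmv'

def is_wmv(content_type):
    """Return True if a Content-Type header value names a WMV video."""
    if not content_type:
        return False
    # Drop optional parameters such as '; charset=utf-8' and ignore case.
    media_type = content_type.split(';', 1)[0].strip().lower()
    return media_type == WMV

def fetch_content_type(url, timeout=10):
    """Request only the headers (HEAD) and return the Content-Type value."""
    request = urllib.request.Request(url, method='HEAD')
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.headers.get('Content-Type')
```

Keeping the header parsing in is_wmv() separate from the network call means the decision logic can be checked without contacting the server; a real check would be is_wmv(fetch_content_type(url)).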

Solution

#!/usr/bin/env python3

import datetime
import os
import urllib.request

# Content-Type the server sends for a real video file
WMV  = 'video/x-ms-wmv'

base = 'http://www.hirtv.net/filmek/hirado21/hirado'
ext = '.wmv'

def get_content_type(url):
    # a missing video comes back as an HTML page, so the header tells them apart
    with urllib.request.urlopen(url) as d:
        return d.info()['Content-Type']

def date_to_str(d):
    # (2010, 6, 5) -> '20100605'
    return "%d%02d%02d" % d

def prettify(d):
    # (2010, 6, 5) -> '2010-06-05'
    return "%d-%02d-%02d" % d

def play_video(video_url):
    print("> " + video_url)
    command = 'mplayer %s 1>/dev/null 2>&1' % video_url
    #command = 'vlc %s 1>/dev/null 2>&1' % video_url    # if you prefer VLC
    os.system(command)

# (year, month, day) of the current date
today = datetime.date.today().timetuple()[:3]
video_today = base + date_to_str(today) + ext
if get_content_type(video_today) == WMV:
    play_video(video_today)
else:
    # today's video is not there yet: offer yesterday's instead
    yesterday = (datetime.date.today() - datetime.timedelta(days = 1)).timetuple()[:3]
    video_yesterday = base + date_to_str(yesterday) + ext

    print("The video for today (%s) is not available." % prettify(today))
    val = input( "Do you want to watch the video of yesterday (%s) [y/n]? " % prettify(yesterday) )
    if val == "y":
        if get_content_type(video_yesterday) == WMV:
            play_video(video_yesterday)
        else:
            print("Sorry. The video of yesterday (%s) is not available either." % prettify(yesterday))

First we determine today’s date and use it to build the URL of the video file. If the file really exists (i.e. the Content-Type is correct), we play it with mplayer. If the Content-Type is wrong, today’s video has not been uploaded yet, and we offer to play yesterday’s video instead.
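The fallback from today to yesterday can also be written as a small loop, sketched here in Python 3. The names video_url() and first_available() and the max_days_back parameter are hypothetical, and the availability check is passed in as a function so the logic can be tried without touching the network.

```python
import datetime

BASE = 'http://www.hirtv.net/filmek/hirado21/hirado'
EXT = '.wmv'

def video_url(day):
    """Build the video URL for a datetime.date (YYYYMMDD, zero-padded)."""
    return BASE + day.strftime('%Y%m%d') + EXT

def first_available(start, is_available, max_days_back=1):
    """Return (day, url) for the newest available video, or None.

    Checks `start` first, then up to `max_days_back` earlier days.
    `is_available` is any callable that takes a URL and returns a bool.
    """
    for offset in range(max_days_back + 1):
        day = start - datetime.timedelta(days=offset)
        url = video_url(day)
        if is_available(url):
            return day, url
    return None
```

With max_days_back=1 this mirrors the today-then-yesterday behaviour; a larger value would simply look further back. The interactive [y/n] prompt is left out of the sketch.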

Update (20101107): A bug in date_to_str() and prettify() was fixed. Months and days must be zero-padded, e.g. 6 must become 06. VLC support was also added; it is commented out.
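A small sketch shows why the padding matters. The unpadded format string below is only an assumption about what the earlier version looked like; strftime() is included as a standard-library alternative that pads automatically.

```python
import datetime

d = datetime.date(2010, 6, 5)
parts = d.timetuple()[:3]            # (2010, 6, 5)

unpadded = "%d%d%d" % parts          # assumed form of the earlier, buggy version
padded = "%d%02d%02d" % parts        # zero-padded month and day
via_strftime = d.strftime("%Y%m%d")  # standard-library alternative

print(unpadded)      # 201065
print(padded)        # 20100605
print(via_strftime)  # 20100605
```

Without padding, the generated file name no longer matches the YYYYMMDD pattern of the video URLs, so the lookup would fail for any single-digit month or day.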

Categories: python