Category: programming (Page 6 of 6)

Derelict house in Sintra, Portugal (Photo: Luis).

Python code to simulate the Monty Hall problem

2011-04-24 / Luis

I always struggled to understand the Monty Hall Problem, which popped up again in William Briggs’s site. The program assumes that initially the prize is equally likely to be behind one of the three doors. Once the contestant (player in the program) chooses a door, the game host (who knows where is the prize) opens another door that does not contain the prize. The host then offers the contestant the choice between keeping his original choice or switch to the unopened door.

Initially the contestant has 1/3 probability of choosing the door with the prize and 2/3 of choosing the wrong door. If the contestant has chosen the right door, switching means losing; however, 2/3 of the time Monty Hall is in fact offering the prize.

from __future__ import division
from random import randint

# Number of games and game generation
ngames = 10000

prize = [randint(1,3)*x for x in ngames*[1]]
player = [randint(1,3)*x for x in ngames*[1]]

# Evaluating success
default = 0
monty = 0

for game in range(ngames):
    # Keeping decision
    if prize[game] == player[game]: 
        default = default + 1

    # Taking chance with Monty
    if prize[game] == player[game]:
        # Switching means losing
        pass
        # Switching means winning
    else:
        monty = monty + 1
    
print 'Probability of winning (own door)', default/ngames
print 'Probability of winning (switching doors)', monty/ngames

The final probabilities should be pretty close to 1/3 and 2/3, depending on the number of games (ngames) used in the simulation.

Moka pot on stove starting to produce coffee

Publon: my simple publication Python script

2010-05-21 / Luis

After using PmWiki (PHP wiki software) for nearly five years to maintain my web site I was a bit tired of fiddling with it. I then rolled out the following very simple and crummy Python script. Besides a basic Python distribution it only requires Python Markdown and Pygments.

There are no external templates or any other niceties–although it deals with unicode–but it just works.

#!/usr/bin/python
# coding: utf-8

import codecs, getopt, markdown, os, os.path, sys 

# Some important constants
base_path = '/Users/luis/Dropbox/website/'
site_path = 'http://apiolaza.net/'


# Basic inside-code template, making
# script self-contained
def inline_template():
    template = u'''
    <!DOCTYPE html>
    <html>
    <head>
        <title><!--title--></title>
        <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
        <meta name='author' content='Luis A. Apiolaza' />
        <meta name='keywords' content='<!--tags-->' />
        <link rel='stylesheet' href='http://apiolaza.net/css/mANDo.css' type='text/css' />
        <link rel='stylesheet' href='http://apiolaza.net/css/pygments.css' type='text/css' />
    </head>
    <body>
        <div id='container'>
            <div id='page-header'>
            <h1><a href='http://apiolaza.net/'>Sketchbook</a></h1>
                <div id='page-menu'>
                <ul>
                <li><a href='http://apiolaza.net/ego-sum.html' title='Ego sum qui sum'>about</a></li>
                <li><a href='http://apiolaza.net/asreml-r/' title='asreml-r cookbook'>asreml-r</a></li>
                <li><a href='http://quantumforest.com/' title='Quantum Forest blog'>blog</a></li>
                <li><a href='http://apiolaza.net/code/' title='software'>code</a></li>
                <li><a href='http://apiolaza.net/colophon.html' title='Gory technical details'>colophon</a></li>
                <li><a href='http://apiolaza.net/publications.html'>publications</a></li>
                <li><a href='http://apiolaza.net/writings/'>writings</a></li>
                </ul>
                </div>
            </div>
            <div id='page-meta'>
                <!--navbar-->
                <p>This page was updated on <!--date--> (NZST) and is tagged 
                <!--tags-->.</p>
            </div>
            <div id='page-body'>
                <!--body-->
            </div>
            <div id='footer'>
                <p>All bits sowed, harvested and baked in Christchurch, New 
                Zealand—43º31'S, 172º32'E—by <a href='http://apiolaza.net'>Luis 
                Apiolaza</a> with 
                <a href='http://creativecommons.org/licenses/by-nc-sa/3.0/nz/'>some rights reserved</a>.
                A longer blurb and gory technical details can be found in the 
                <a href='http://apiolaza.net/colophon.html'>colophon</a>.</p>
            </div>
        </div>
    </body>
    </html>
    '''
    return(template)


# In case of using external template file
def read_template(filename = 'page-template.html'):
    try:
        f = codecs.open(filename, mode = 'r', encoding = 'utf-8')
    except IOError:
        print 'Template file not found'
        sys.exit()
    
    text = f.read()
    return(text)


# Obtains body and meta values from single page
def process_page(text):
    # Instance of class Markdown with two extensions
    md = markdown.Markdown(extensions = ['meta', 'codehilite', 'footnotes'])
    body = md.convert(text)
    meta = md.Meta
    return([body, meta])


# Combines body, meta information and template
def build_page(body, meta, template):    
    for key in meta.keys():
        lookfor = '<!--' + key + '-->'
        template = template.replace(lookfor, meta[key][0])

    template = template.replace('<!--body-->', body)
    return(template)


# Creates location bar
def location(filename, base_path, site_path):
    full_name = os.path.abspath(filename)
    base_length = len(base_path)
    nav = u'<p>Document tree: <a href="' + site_path + u'">home</a>'

    if full_name.find(base_path) >=0:
        rel_path = full_name[base_length:-5] + 'html'
        rel_path_parts = rel_path.rsplit('/')
    else:
        print 'File is not in site folder'
        sys.exit()

    for i in range(len(rel_path_parts)):
        nav = nav + u' « ' + u'<a href="' + \
              (site_path + u'/'.join(rel_path_parts[:1+i])) + \
              u'">' + rel_path_parts[i] + u'</a>'

    nav = nav + '</p>'
    return(nav)


# Writes page encoded as utf8
def write_page(page, name):
    (root, ext) = os.path.splitext(name)
    newname = root + '.html'
    f = codecs.open(newname, 'w', encoding = 'utf8')
    f.write(page)
    return()


# Help function
def usage():
    print 'Usage:'
    print 'python publon.py -d file.markdown (draft default) OR'
    print 'python publon.py -p file.markdown (publish)'
    return()


# Dictionary defining Castilian replacements
castilian = {u'á': '&aacute;', u'é': '&eacute;', u'í': '&iacute;',
             u'ó': '&oacute;', u'ú': '&uacute;', u'ñ': '&ntilde', 
             u'¿': '&iquest;', u'¡': '&iexcl;', u'ü': '&uuml;'}


# Convert castilian code to html entities
def convert_castilian(text, dictionary):
    for key in dictionary:
        text = text.replace(key, dictionary[key])

    return(text)


if __name__ == '__main__':
    # Processing command line options and arguments
    try:
        opts, args = getopt.getopt(sys.argv[1:], 'dp', ['draft', 'publish'])
    except  getopt.GetoptError, err:
        print 'Error detected: ', err
        usage()
        sys.exit()

    # Reading file
    filename = args[0]
    try:
        f = codecs.open(filename, mode = 'r', encoding = 'utf-8')
    except IOError:
        print 'Markdown file not found'
        sys.exit()

    text = f.read()
    html, meta = process_page(text)
    template = inline_template()
    navbar = location(filename, base_path, site_path)
    meta[u'navbar'] = [navbar]
    page = build_page(html, meta, template)
    write_page(page, filename)

And that’s it.

Gratuitous picture: cabbage trees, or mushy brain?, in Christchurch (Photo: Luis).

Dynamic Google maps using Python

2010-04-20 / Luis

I originally wrote this code on 2 February 2009, as a quick hack to publish air pollution in Santiago. The air quality stations now present data in a different way, so the mapping doesn’t work, but the general idea is still useful. A barebones version of my Python code to generate the KML file is:

#!/usr/bin/env python
# encoding: utf-8

import urllib, random

# Charting function
def lineChart(data, size = '250x100'):
    baseURL = 'http://chart.apis.google.com/chart?cht=lc&chs='
    baseData = '&chd=t:'
    newData = ','.join(data)
    baseData = baseData + newData
    URL = baseURL + size + baseData    
    return URL

# Reading test data: connecting to server and extracting lines
f = urllib.urlopen('http://gis.someserver.com/TestData.csv')
stations = f.readlines()
kmlBody = ('')

for s in stations:
    data = s.split(',')
    # Generate random data
    a = []
    for r in range(60):
        a.append(str(round(random.gauss(50,10), 1)))

    chart = lineChart(a)

    # data is csv as station name (0), long (1), lat (2), y (3)
    kml = (
        '<Placemark>\n'
        '<name>%s</name>\n'
        '<description>\n'
        '<![CDATA[\n'
        '<p>Value: %s</p>\n'
        '<p><img src="%s" width="250" height="100" /></p>\n'
        ']]>\n'
        '</description>\n'
        '<Point>\n'
        '<coordinates>%f,%f</coordinates>\n'
        '</Point>\n'
        '</Placemark>\n'
        ) %(data[0], data[3], chart, float(data[1]), float(data[2]))

    kmlBody = kmlBody + kml

# Bits and pieces of the KML file
contentType = ('Content-Type: application/vnd.google-earth.kml+xml\n')

kmlHeader = ('<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n'
             '<kml xmlns=\"http://earth.google.com/kml/2.1\">\n'
             '<Document>\n')

kmlFooter = ('</Document>\n'
             '</kml>\n')


print contentType
print kmlHeader
print kmlBody
print kmlFooter

Well, this was not exactly barebones, because we also wanted to generate dynamic graphs for each placemark, in the easiest possible way. My first idea was to use one of the multiple javascript libraries available in the net However, a quick search revealed that KML files do not support javascript in the description tag. That was the time when I remembered playing with Google Charts a long time ago. The lineChart function above is simply a call to create a line chart using the charts API. Because this is a test, I used 60 randomly generated data points, which explains the presence of random as an imported library. The actual code is even larger, because I was (and am) also doing data scrapping from the Chilean meteorological service web pages.

Originally, I did not want to use javascript at all, so inserted the code as a search in maps, generating a link like http://maps.google.com/maps?q=http://some-server.com/AirePuro.py However, I wanted to embed it in a blog post and I was struggling to do it. The solution was to click on the ‘Link’ link in the generated map to copy the ‘Paste HTML to embed in website’ link. This gave an iframe block that can be copied in any page or blog post, like so:

While helping a friend to create another map, we faced the problem that the data set was being updated every five minutes. What is the problem? The map was not being refreshed often enough. I am not sure if the problem was a browser cache or Google Maps, but it could be solved by calling the KML file with a random extra argument (the script does not need take any arguments, so anything after the question mark is ignored). In my case I needed a frequent random argument, so I use the current time (using the date would work for once a day updates). This meant inserting the map using javascript (and using a Google Maps key). The code for a simple page–from the header onwards–would look like:

<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8"/>
<title>A simple dynamic python generated map</title>
<script src="http://maps.google.com/maps?file=api&amp;v=2&amp;key=my_key"
type="text/javascript"></script>
<script type="text/javascript">
    //<![CDATA[ 
    function load() {
      if (GBrowserIsCompatible()) {
        var map = new GMap2(document.getElementById("map"));
        map.setCenter(new GLatLng(-33.458943, -70.658569), 11);
        var pollution = new GGeoXml("http://gis.uncronopio.org/testmapscsv.py?"+
                        (new Date()).getTime());
        map.addOverlay(pollution);
      }
    }
    //]]>
 </script>
 </head>
<body onload="load()" onunload="GUnload()">
<div id="map" style="width:750px;height:600px"></div>
</body>

It was not too bad for mucking around on a Friday in between doing house chores.

Back to Python

2006-10-07 / Luis

I started using Python way back at the end of 1996 or early 1997. I was working in my PhD, for which the first project involved writing some simulations in FORTRAN. Originally I was using FORTRAN90, but then I needed to move my project to a server that had only FORTRAN77, so I was stuck with something that looked—at least to me—really ugly. While I was looking for alternatives (I used Mathematica, Matlab, SAS, Python and ASReml in my PhD), I stumbled on an article by Konrad Hinsen discussing using Python to glue FORTRAN programs. Intrigued, I downloaded Python and ordered a copy of Mark Lutz’s Programming Python (the October 1996 first edition). After reading the book for a while I was hooked on the language.

I used Python in and out for small projects, and later dropped almost all programming (that was not stats) around 2001. I have missed that quite a bit until yesterday when working with a list of words that Orlando is using. We had typed around 450 words in Spanish (and he uses around the same number in English) and I wanted to check if we had repeated words. I downloaded Python, wrote a few lines and presto! We did have around 20 repeated words and it was so nice to be able to write something in Python.

After that I did check a few web pages and I realised that the language has evolved quite nicely (although I rarely use the object oriented stuff) and there are at least two books that I will be browsing soon:

Dive into Python, by Mark Pilgrim and
How to think like a computer scientist: learning with Python, by Allen B. Downey, Jeffrey Elkner and Chris Meyers.

Both books are available as free downloads in a variety of formats, as well as in real old-fashioned paper. I will certainly buy the nicest one in a paper copy.

I forgot to mention that one of the great things about Python was the existence of an excellent set of libraries for matrix operations (at the time was Numpy) that has grown in to a great set of resources for scientific computing called SciPy.