GP standings

Some quick notes on how to get the GP score tables into a useful shape, in order to be able to compute something like the below top 10 by best four results in the first five GPs (using preliminary results for GP 5). So I don’t have to figure this out again next time, and so you can play along at home.

 rank |            name            | points 
------+----------------------------+--------
    1 | Ken Endo                   | 3706.7
    2 | Ulrich Voigt               | 3285.9
    3 | Hideaki Jo                 | 3235.5
    4 | Kota Morinishi             | 3102.1
    5 | Nikola Zivanovic           | 3054.5
    6 | Robert Vollmert            | 2905.2
    7 | Will Blatt                 | 2896.7
    8 | Prasanna  Seshadri         | 2834.7
    9 | Yuki Kawabe                | 2834.5
   10 | Michael Ley                | 2806.7

The scripts used to do this are available here.

First part is getting the HTML tables into a more useful shape; I wanted CSV. Copy and paste into a spreadsheet works, but gets tedious quickly. The following monstrosity does the job for me (using the Mac OS X command line, so your awk might well behave differently in subtle ways). EDIT: small improvements.

$ curl http://gp.worldpuzzle.org/content/preliminary-results-wpf-gp-puzzle-5-competitive-section \
        | egrep '(<td|<tr)' \
        | tr -d '\t\n\r' | tr ',' ';' \
        | sed 's/]*>//g' \
        | awk '
                BEGIN {
                        RS=" *</tr> *"
                        FS=" *</td> *"
                        OFS=","
                }
                /[0-9]\./ {
                        $1 = $1
                        print $0
                }
        '
        > gp5.csv

And do the same for GPs 1 through 4. Then we throw this into postgres.

$ createdb gp
$ psql gp
# create table raw1 (rk text, name text, country text, nick text, points double precision, sent int, ok int, time interval);
CREATE TABLE
# create table raw2 (like raw1);
...
# copy raw1 from '/tmp/gp1.csv' (FORMAT 'csv');
COPY 348
...

Then we combine the raw tables.

# create view combined as select *, 1 as round from raw1 union select *, 2 as round from raw2 union select *, 3 as round from raw3 union select *, 4 as round from raw4 union select *, 5 as round from raw5;

This is enough to get a total top 10.

# select name, sum(points) as points, count(*) as played from combined group by name order by points desc limit 10;
       name       | points | played 
------------------+--------+--------
 Ken Endo         | 4466.2 |      5
 Nikola Zivanovic | 3724.5 |      5
 Hideaki Jo       | 3651.5 |      5
 Kota Morinishi   | 3562.1 |      5
 Yuki Kawabe      | 3447.5 |      5
 Will Blatt       | 3404.7 |      5
 Robert Vollmert  | 3387.2 |      5
 James McGowan    | 3355.1 |      5
 Michael Ley      | 3299.7 |      5
 Ulrich Voigt     | 3285.9 |      4

Eventually, two results will be dropped, so for an intermediate state, let’s drop one.

# create view ranked as select *, rank() over (partition by name order by points desc) as prank from combined;

Now we can get, for example, the current leader’s results ordered by score.

# select round, points, time from ranked where nick = 'EKBM' order by prank;
 round | points |   time   
-------+--------+----------
     5 |    991 | 00:57:36
     3 |  942.3 | 00:50:40
     4 |  901.2 | 00:56:53
     2 |  872.2 | 00:53:53
     1 |  759.5 | 01:18:49

And an overall leaderboard considering the best four results so far.

# create view best4 as select name, sum(points) as points, count(*) as played from ranked where prank <= 4 group by name;
# select rank() over (order by points desc) as rank, name, points, played from best4 order by points desc;
 rank |            name            | points | played 
------+----------------------------+--------+--------
    1 | Ken Endo                   | 3706.7 |      4
    2 | Ulrich Voigt               | 3285.9 |      4
    3 | Hideaki Jo                 | 3235.5 |      4
    4 | Kota Morinishi             | 3102.1 |      4
    5 | Nikola Zivanovic           | 3054.5 |      4
    6 | Robert Vollmert            | 2905.2 |      4
    7 | Will Blatt                 | 2896.7 |      4
    8 | Prasanna  Seshadri         | 2834.7 |      4
    9 | Yuki Kawabe                | 2834.5 |      4
   10 | Michael Ley                | 2806.7 |      4
Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s