Fun with manpages

When reading about a Unix command or C library function it's relatively common to see it suffixed with a number in brackets. This is to make it clear what exactly you're talking about, so if someone is discussing mknod they might write mknod(1) they're talking about the shell command or mknod(2) if they mean the syscall. The number used refers to the manpage section, so to see the manpage for the shell function:

$ man 1 mknod

And to see what the syscall is all about:

$ man 2 mknod

According the man manpage on my system there are eight standard manpage sections in total, with one non-standard section:

  1. Executable programs or shell commands
  2. System calls (functions provided by the kernel)
  3. Library calls (functions within program libraries)
  4. Special files (usually found in /dev)
  5. File formats and conventions eg /etc/passwd
  6. Games
  7. Miscellaneous (including macro packages and conventions), e.g. man(7), groff(7)
  8. System administration commands (usually only for root)
  9. Kernel routines [Non standard]
Since something can be present in more than one section, I wondered which symbol had the most manpages so I wrote a script to look through each of the directories in /usr/share/man/man[1-8], list and parse the gzipped filenames (they're usually named symbol.sectionnumber.gz) and then find out the sections they're all present in:
import os
import re
from collections import defaultdict

manpage_gz_pattern = "(.*)\.\w+.gz"
manpage_dir_base = "/usr/share/man"
manpage_sections = range(1, 9)
manpage_entries = defaultdict(list)

for manpage_section in manpage_sections:
    manpage_section_dir = os.path.join(manpage_dir_base, f"man{str(manpage_section)}")
    manpage_section_contents = os.listdir(manpage_section_dir)

    for manpage_entry_filename in manpage_section_contents:
        gz_entry = re.match(manpage_gz_pattern, manpage_entry_filename)
        manpage_entry = gz_entry.groups()[0] 
        manpage_entries[manpage_entry] += [(manpage_section, manpage_entry_filename)]

for section_count in manpage_sections:
    number_of_manpages = len([ m for m in manpage_entries if len(manpage_entries[m]) == section_count])
    print(f"number of manpages in {section_count} sections: {number_of_manpages}")

The results are:
$ python -i mancount.py
number of manpages in 1 sections: 10763
number of manpages in 2 sections: 107
number of manpages in 3 sections: 7
number of manpages in 4 sections: 0
number of manpages in 5 sections: 0
number of manpages in 6 sections: 0
number of manpages in 7 sections: 0
number of manpages in 8 sections: 1
There's seemingly a clear winner, you can find a single symbol in all eight standard manpage sections. However this is a little misleading because after a bit of inspection this symbol is "intro" - it is not a shell command, syscall, stdlib function, game or anything like that - it's a manpage that describes a bit about each section.

So ignoring intro the most common symbols and their manpage entries are
  • mdoc (mdoc.1.gz, mdoc.5.gz, mdoc.7.gz)
  • locale (locale.1.gz, locale.5.gz, locale.7.gz)
  • hostname (hostname.1.gz, hostname.5.gz, hostname.7.gz)
  • passwd (passwd.1ssl.gz, passwd.1.gz, passwd.5.gz)
  • time (time.2.gz, time.3am.gz, time.7.gz)
  • readdir (readdir.2.gz, readdir.3am.gz, readdir.3.gz)
  • random (random.3.gz, random.4.gz, random.7.gz)
This reveals something else interesting - the section needn't be a number. The two commands are both valid and access completely separate manpages:

$ man 1ssl passwd
$ man 1 passwd

I used to think that each time I opened a manpage I learn something completely new and unexpected - but I never thought I'd find something interesting just by looking at the manpages' gzipped filenames!

F# - Web applications with Angular and Giraffe

tl;dr - If you want a barebones .NET web app written in F# with Angular and Giraffe all hooked up already then clone my giraffe-ng repo on GitHub as a starting point. If you want to know the steps involved in setting this up, then read on.

The Angular CLI is a powerful and easy way to build rich web applications. However in some cases you might want to use something other than the NodeJS backend that it provides by default. This is a little guide to show how an Angular application can be created with a backend powered by F# using the Giraffe framework

To begin with we need to ensure that Angular CLI and .NET Core SDK 2.2 are installed and in our PATH:

$ dotnet --version
2.2.101
$ ng --version
     _                      _                 ____ _     ___
    / \   _ __   __ _ _   _| | __ _ _ __     / ___| |   |_ _|
   / △ \ | '_ \ / _` | | | | |/ _` | '__|   | |   | |    | |
  / ___ \| | | | (_| | |_| | | (_| | |      | |___| |___ | |
 /_/   \_\_| |_|\__, |\__,_|_|\__,_|_|       \____|_____|___|
                |___/
    

Angular CLI: 7.0.4
Node: 9.11.2
OS: linux x64
Angular: 
... 

Package                      Version
------------------------------------------------------
@angular-devkit/architect    0.10.4
@angular-devkit/core         7.0.4
@angular-devkit/schematics   7.0.4
@schematics/angular          7.0.4
@schematics/update           0.10.4
rxjs                         6.3.3
typescript                   3.1.3

First create the folder, the  .NET project and install the Giraffe and Microsoft.AspNetCore.App packages:

  $ mkdir giraffe-ng
  $ cd giraffe-ng 
  $ dotnet new console -lang F#
  $ dotnet add package Giraffe --version 3.4.0
  $ dotnet add package Microsoft.AspNetCore.App --version 2.2.0 

We'll now create a simple landing page using Giraffe's own view engine just to test everything is working.

  let index = 
      html [] [
          head [] [
              title [] [ str "Giraffe!" ]
          ]
          body [] [
              h1 [] [ str "Hello!" ]
              p [] [ str "A test of Giraffe and .NET Core"]
          ]
      ]
  
  let webApp =
      choose [ route "/" >=> (index |> renderHtmlDocument |> htmlString) ]
  
  let configureApp (app : IApplicationBuilder) =
      app.UseGiraffe(webApp)
               
  let configureServices (services : IServiceCollection) =
      services.AddGiraffe() |> ignore
  
  []
  let main _ =
      WebHostBuilder()
          .UseKestrel()
          .Configure(Action configureApp)
          .ConfigureServices(configureServices)
          .Build()
          .Run()

We can test that this works nicely by going to http://localhost:5000 - the reason the port is important will be apparent later:

Looks good, so now we can setup our frontend, so we'll use the Angular CLI:

$ ng new frontend

And we'll check out http://localhost:4200 to make sure it's working:

What we want to do is take the Angular application and serve it with out F#\Giraffe app via the route "/app" - so that viewing http://localhost:5000 will serve the Angular app that was previously served by NodeJS on http://localhost:4200. We'll start off by making sure our application is setup to find all its resources under the /app route.

$ ng build --base-href /app/
We'll next change our app so that it'll serve this route:
let configureApp (app : IApplicationBuilder) =
  app.UseStaticFiles(
      StaticFileOptions(
              FileProvider = new PhysicalFileProvider(
                      Path.Combine(Directory.GetCurrentDirectory(), "frontend", "dist")),
                      RequestPath = PathString("/app")))
      .UseGiraffe(webApp)

We now need to make sure the files in the frontend/dist directory is copied to the .NET application's bin directory when it's built by adding the following to our .fsproj file:
  <ItemGroup>
    <Content Include="frontend/dist/*.*">
      <CopyToOutputDirectory>Always</CopyToOutputDirectory>
    </Content>
  </ItemGroup>

Next we can change the definition of our index view so that our Angular scripts and application root are loaded:

  let ngApp = tag "app-root"  [] []
  
  let ngScripts = 
      [ "runtime.js"; "polyfills.js"; "styles.js"; "vendor.js"; "main.js" ]
      |> List.map (function js -> script [ _type "text/javascript"; _src js ] [] )
  
  let index = 
      html [ _lang "en" ] [
          head [] [
              meta [ _charset "utf-8"; _name "viewport"; _content "width=device-width, initial-scale=1" ] 
              title [] [ str "Angular + Giraffe" ]
              ``base`` [ _href "/app/" ]
              link [ _rel "icon"; _type "icon"; _href "favicon.ico" ]
          ]
  
          body [] 
              (ngApp :: ngScripts)
      ]

Now if we run the Giraffe app and navigate to http://localhost:5000 we should see our Angular application:

And there we have it - we set up an application with an F# and Giraffe powered backend and an Angular 7 frontend. Extending this to have the Angular app fetch data from a Giraffe service is pretty straight-forward. As an example if we want our page to display a list of users it has requested from the backend, we can add the following to our Program.fs:

type User = { login:string; email:string }

let userList:(User list) = [ 
  { login = "sean"; email = "sean@example.com" }
  { login = "lucka"; email = "lucka@example.com" }
  { login = "alfie"; email = "alfie@example.com" }
  { login = "ivy"; email = "ivy@example.com" }
]

Then we can add a new “users” route to our webapp to serve this data:

let webApp =
  choose [
      route "/"       >=> (index |> renderHtmlDocument |> htmlString) 
      route "/users" >=> (json userList)
  ]

Then in our main app.component.ts we'll add a type to represent the User

interface User {
    login: string;
    email: string;
}

And we can now modify our AppComponent so that it hits this service

  export class AppComponent implements OnInit {
    title = 'frontend';
  
    public users: User[] = [];
  
    constructor(private http: HttpClient) {
    }
  
    ngOnInit(): void {
      this.http.get('/users').subscribe(users => {
        this.users = users;
      });
    }
  }

And we can modify the component's template to render this

<ul>
    <li *ngFor="let user of users">
      <a href="mailto:{{user.email}}">{{user.login}}</a>
    </li>
</ul>
So there we are, an Angular frontend served by Giraffe and F#! Just for reference the completed app is available on GitHub under my "giraffe-ng" repo.


F# - Polymorphic parameter overflow

I was mucking around in F# over lunch and started thinking about how it assigns names to type variables (see "Automatic Generalization" of the Type Inference page of the F# docs for more info). By way of introduction, let's create a function foo that takes a single parameter without an explicitly defined type and see what its signature looks like:
> let foo x = x;;
val foo : x:'a -> 'a
So the parameter x is assigned the type variable 'a - which makes sense, the first unknown type gets named after the first letter of the alphabet. And of course it follows that if we have a function with two parameters the second type is called ...
> let foo x y = (x,y);;
val foo : x:'a -> y:'b -> 'a * 'b
... 'b! Ok now what happens when we've got a function with a ridiculous amount of parameters? We'll run out of lower-case letters eventually. When that happens do start using upper-case letters? Non-ASCII? Something else? I quickly hacked together a python script to generate functions with arbitrary parameters, created one with 40 and pasted it into the F# REPL and ...
> let foo x0 x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11 x12 x13 x14 x15 x16 x17 x18 x19 x20 x21 x22 x23 x24 x25 x26 x27 x28 x29 x30 x31 x32 x33 x34 x35 x36 x37 x38 x39 = (x0,x1,x2,x- 3,x4,x5,x6,x7,x8,x9,x10,x11,x12,x13,x14,x15,x16,x17,x18,x19,x20,x21,x22,x23,x24,x25,x26,x27,x28,x29,x30,x31,x32,x33,x34,x35,x36,x37,x38,x39);;
val foo :
  x0:'a ->
    x1:'b ->
      x2:'c ->
        x3:'d ->
          x4:'e ->
            x5:'f ->
              x6:'g ->
                x7:'h ->
                  x8:'i ->
                    x9:'j ->
                      x10:'k ->
                        x11:'l ->
                          x12:'m ->
                            x13:'n ->
                              x14:'o ->
                                x15:'p ->
                                  x16:'q ->
                                    x17:'r ->
                                      x18:'s ->
                                        x19:'t ->
                                          x20:'a1 ->
                                            x21:'a2 ->
                                              x22:'a3 ->
                                                x23:'a4 ->
                                                  x24:'a5 ->
                                                    x25:'a6 ->
                                                      x26:'a7 ->
                                                        x27:'a8 ->
                                                          x28:'a9 ->
                                                            x29:'a10 ->
                                                              x30:'a11 ->
                                                                x31:'a12 ->
                                                                  x32:'a13 ->
                                                                    x33:'a14 ->
                                                                      x34:'a15 ->
                                                                        x35:'a16 ->
                                                                          x36:'a17 ->
                                                                            x37:'a18 ->
                                                                              x38:'a19 ->
                                                                                x39:'a20 ->
                                                                                  'a *
                                                                                  'b *
                                                                                  'c *
                                                                                  'd *
                                                                                  'e *
                                                                                  'f *
                                                                                  'g *
                                                                                  'h *
                                                                                  'i *
                                                                                  'j *
                                                                                  'k *
                                                                                  'l *
                                                                                  'm *
                                                                                  'n *
                                                                                  'o *
                                                                                  'p *
                                                                                  'q *
                                                                                  'r *
                                                                                  's *
                                                                                  't *
                                                                                  'a1 *
                                                                                  'a2 *
                                                                                  'a3 *
                                                                                  'a4 *
                                                                                  'a5 *
                                                                                  'a6 *
                                                                                  'a7 *
                                                                                  'a8 *
                                                                                  'a9 *
                                                                                  'a10 *
                                                                                  'a11 *
                                                                                  'a12 *
                                                                                  'a13 *
                                                                                  'a14 *
                                                                                  'a15 *
                                                                                  'a16 *
                                                                                  'a17 *
                                                                                  'a18 *
                                                                                  'a19 *
                                                                                  'a20
So 'a up to 't ... but then numbered 'as after that. I've no idea why it stops at 't, then just counts up from 'a. Interestingly this is the case with F# under Mono on Linux and under the official MS runtime on VS2017 on Windows. OCaml, which F# is heavily influenced by, does not exhibit this behaviour - it refers to the unknown types as a..z, a1..z1, a2...z2, etc.

That's reasonbaly interesting and all, but since I had already written a script to generate these F# files I figured I'd keep trying with more and more parameters to see what happens. After adding a hundred parameters at a time I soon hit an interesting warning between 400 and 500 params:

Warning: line too long, ignoring some characters
- 
It turned out we hit some line-length restriction in either the parser or the readline equivalent that fsi.exe uses. A bit of experimentation later I found that when the function had 429 parameters it parsed just fine ... but 430 parameters caused the warning.

429 parameters: 4093 characters
430 parameters: 4104 characters

Minor sidebar - I was going to just say we hit a restriction where 4096 bytes is the max line length, but obviously 1 character isn't necessarily 1 byte if we're using some Unicode representation, and a nice way to check that we're using Unicode is to punch in identifiers that would definitely be in different encodings in an ASCII representation - I used Georgian and Czech characters:
> let სამი = 3;;
val სამი : int = 3

> let čtyři = 4;;
val čtyři : int = 4

> სამი + čtyři;;
val it : int = 7
When I changed my program to start each parameter with the Georgian character "ა" (i.e. let bar ა0 ა1... etc) which cannot be represented in a single byte in UTF-8 we hit the error after only 162 parameters. So it's not a character limit, but a buffer of bytes that we filled - and the size of the buffer is ... uh 4096 bytes.

Since I wanted to pull this thread until I hit some logical conclusion I tweaked my program to split the insanely long lines so they wouldn't hit this error and continued to add more parameters. The next issue I encountered was after trying a function with a couple thousand parameters, where I encountered the following error:
  x1328,
  ^^^^^

/home/sean/stdin(6636,1): error FS0039: The value or constructor 'x1328' is not defined. Maybe you want one of the following:
   x132
   x138
   x1238
   x128
   x1321
So it seems we've bumped up into another limit - somehow 1327 parameters is ok but F# loses track of the 1328th one, thinking it doesn't exist. Is this a bug? Probably, there should maybe be a more helpful error message. Is it important? Probably not, if your code contains a function with upwards of a thousand parameters then this is the least of your problems.

Debian - Building XMMS 1.2.11 on a modern linux system

Like most people I've recently been consuming all my media via paid streaming services like Netflix and iTunes. The other day however I needed to play an MP3 on my laptop running Debian and instinctively wanted to reach for xmms. Sadly nowadays the original xmms isn't available on Debian, only an "xmms2" package which is much newer and was reworked into some client/server model. I don't really want to figure out how to configure this correctly, to the extent that I was willing to build the original xmms from source ...

Trying the naive "./configure && make && sudo make install" method doesn't go very well when running Debian stretch:
sean@seoul ~/d/s/xmms-1.2.11> ./configure
checking build system type... x86_64-unknown-linux-gnu

MANY LINES LATER

*** The glib-config script installed by GLIB could not be found
*** If GLIB was installed in PREFIX, make sure PREFIX/bin is in
*** your path, or set the GLIB_CONFIG environment variable to the
*** full path to glib-config.
configure: error: *** GLIB >= 1.2.2 not installed - please install first ***
As it turns out I wasn't able to find a pre-built version of GLIB 1.x or (a subsequent dependency) GTK 1.x, I found some sources (GLIB 1.2 and GTK+ 1.2) but these were hitting an error when running ./configure which indicated that the CPU wasn't supported. These libraries pre-date the x86-64 era so my processor wasn't recognised. The fix was to simply drop in a newer config.sub. There was one more issue with the G_GNUC_PRETTY_FUNCTION macro but I resolved that too - I put them onto GitHub as glib-1.x and gtk-1.x in case anyone else wants to use this. Installing them is easy:
$ git clone https://github.com/smcl/gtk-1.x
$ cd gtk-1.x
$ ./configure --prefix=/usr && make
$ sudo make install

$ git clone https://github.com/smcl/glib-1.x
$ cd glib-1.x
$ ./configure --prefix=/usr && make
$ sudo make install
Once these are in place we can grab the "latest" old XMMS sources from xmms.org and build those:
$ curl -LO http://www.xmms.org/files/1.2.x/xmms-1.2.11.tar.gz
$ tar -xzf xmms-1.2.11.tar.gz
$ cd xmms-1.2.11
$ ./configure && make
$ sudo make install

Then if all is well then the original (and best!) xmms should be installed into your path, so you can go download some lovely skin of a brushed aluminium late-90s Sony CD player ... though it might be a little bit tiny if you use a HiDPI screen:


Czech - Cases are hard

Grammatical cases are a common stumbling block for native English speakers learning another language. This might be because cases are sort of invisible in English and they're not taught in school so it's it's hard to see why they would even matter.

However in languages like Czech cases are a really important concept, and if you don't wrap your head around them you'll struggle to make yourself understood. To fully comprehend this importance we need an example - Petr and Pavel, who are not friends.

Accusative - Peter is punching Pavel

In Czech the verb "to punch" is "bít", whose third person conjugation is "bije" so if we want to describe this situation in Czech we might start with something like this ...

Petr bije Pavel

However this isn't quite enough because it's not actually clear who is punching who. You can rearrange this sentence in many ways in Czech, so our sentence can also read Pavel bije Petr and bije Pavel Petr etc. So how do we tell our Czech friends who is the puncher and who is the punchee? Through the magic of cases!

In Czech we indicate the subject (puncher) using the nominative case. In our sentence this is Petr and the nominative case of Petr is simply Petr.

The object (the punchee) of the sentence is indicated using accusative case - which for Pavel is Pavla. Which gives us:

Petr bije Pavla

So hopefully you can see why exactly cases are so important, if you don't learn them you're going to confuse a lot of Czech people and have a lot of frustrating conversations. Let's take Petr and Pavel and explore some other cases.

Genitive - Petr reads Helena's book

If Petr is finished punching Pavel and just wants chill with his friend Helena's book, we need to use the Genitive case to indicate it belongs to her:

Petr čte knihu Heleny

Subject = nominative of Petr = Petr

Verb = third person singular conjugation of čist = čtu

Object = kniha in accusative case = knihu

Possessor = Helena in genitive case = Heleny

Dative - Petr reads to Pavel

If he decides to read the book to Pavel in a curious attempt at reconciliation, we need to use the Dative case:

Petr čte knihu Pavlu

Receiver = Pavel in dative case = Pavlu

Instrumental - Petr reads with Pavel

If this reconciliation is successful and Petr reads the book with Pavel we need to use the Instrumental case:

Petr čte knihu s Pavlem

Instrument = Pavel in instrumental case = Pavlem

Locative - Petr reads about Pavel

Maybe it's weird to describe Pavel as the "instrument", but just go with it because in the next sentence Petr is reading a book about Pavel and in this situation we use the Locative case:

Petr čte knihu o Pavlu

preposition "o" (meaning "about") requires locative case of Pavel = Pavlu

Vocative - Petr says goodbye to Pavel

Finally, to draw this ridiculous situation to a close Petr says goodbye to Pavel where we use the Locative case:

Na shledanou Pavle!

addressing Pavel requires vocative case of Pavel = Pavle!

Conclusion 

This is only a quick-n-dirty summary restricted to the seven Czech cases, but it should indicate why each case is important. The Ústav pro jazyk český Akademie věd ČR (Language Institute of the Czech Academy of Science) have a really useful page if you want to see what a word looks like in different cases: http://prirucka.ujc.cas.cz. Just enter your word into the "Slovníková část" section and hit "Hledej"

IIS - Annoying error with .NET Core

I deployed a .NET Core application recently into an environment which previously had only run .NET Framework apps, but got a tricky HTTP Error 500.19 - Internal Server Error suggesting there was a problem with my Web.config:


What was also peculiar was that when I tried to enable stdout logging using the IIS UI I got an error "There was an error while performing this operation" with blank "Details" and "Error" but with the "Filename" set to the path to my Web.config. This was true for any of the configuration sections:


However after a lot of head scratching and googling turned up nothing my colleague realised that this environment was missing the Windows Server Hosting bundle, available from the .NET Core downloads section under the "Other Windows Downloads" section. Once this was installed everything worked as expected.

Czech - Telling The Time

I've been learning Czech on and off for the last few years, but now and again I discover that there are some basic things that I never quite learned properly. The most recent of these was my ability to tell the time. When I wanted to learn this properly I found that there are very few places that teach this at the level I wanted - they were either too simplistic, misleading or written in a way that is much too hard to quickly digest. I ended up writing the below for myself and for anyone else who wants to learn.

To simplify each explanation I'm assuming you're familiar with the following:
  • what I mean by "Nominative", "Genitive" and "Accusative" case
  • the verb "být"
  • the noun "hodina"
  • numbers from 1 - 60

Additionally plurals behave kinda funny in Czech. My friends and I all first learned this when counting beer so I'm gonna use "beer plurals" as shorthand for the following behaviour when you have ...

  • one of something you give the noun in singular Nominative (jedno pivojedna hodina, jedna minuta)
  • two, three or four of something you give the noun in plural Nominative (dvě piva, dvě hodiny, dvě minuty)
  • five or more of something you give the noun in plural Genitive (pět pivpět hodinpět minut)

Quick Reference

Firstly in case you want to just quickly reference, here's a table that demonstrates most of the cases:

time Czech time Czech
1:00 je jedna hodina 4:00 jsou čtyři hodiny
1:10 je deset minut po jedné 4:10 je deset minut po čtvrté
1:15 je čtvrt na dvě 4:15 je čtvrt na pět
1:30 je půl druhé 4:30 je půl paté
1:45 je tři čtvrtě na dvě 4:45 je tři čtvrtě na pět
1:50 je za deset minut dvě 4:50 je za deset minut pět
2:00 jsou dvě hodiny 5:00 je pět hodin
2:10 je deset minut po druhé
5:10 je deset minut po paté
2:15 je čtvrt na tři
5:15 je čtvrt na sest
2:30 je půl třetí
5:30 je půl sesté
2:45 je tri čtvrtě na tři
5:45 je tři čtvrtě na sest
2:50 je za deset minut tři
5:50 je za deset minut šest hodin
3:00 jsou tři hodiny
3:10 je deset minut po třetí
3:15 je čtvrt na čtyři
3:30 je půl čtvrté
3:45 je tři čtvrtě na čtyři
3:50 je za deset minut čtyři

Whole Hours

If it's exactly H o'clock, we say something like "it is H hours". The word for "hours" is "hodina" and this is the first example of the beer plurals I described above. There are only 12 possibilities here so it's not too hard to just memorise the below:

Half Hours

Times that in English are exactly half past an hour in Czech are said to be a half of the next hour, so instead of "half past one" we say something like "a half of two". There's a minor complication - the hour is given in the genitive singular feminine. They behave like adjectives because ultimately they are adjectives modifying the noun "hodina". So je půl jedné means "it is half of the first (hour)"

Again there are only 12 combinations so if my description doesn't make much sense you can just memorise the below without too much trouble:

Quarter Hours

Similar to half-hours when we talk about quarter past or quarter to an hour in Czech we talk about quarters of an hour. So 12:15 is "a quarter of one" or čtvrt na jednu (the hour part is in the Accusative case) and 12:45 is "three quarters of one" or tři čtvrtě na jednu (with čtvrtě being the plural of čtvrt)

Minutes After The Hour

If we have a time up to half past the hour, we write it really similar to the English - so 1:10 is "je jedna minuta po jedné". It's tricky:
  • "minuta" behaves like our "beer plurals" - 1 minuta, 2/3/4 minuty, 5/6/7/etc minut
  • the hour is the genitive singular, and since "hodina" is feminine we have jedné, druhé, třetí, etc

Minutes Before The Hour

Finally for minutes before an hour, we write something like "in M minutes it is H o'clock" - so 1:50 this is "je za deset minut dve". Again we have the "beer plural" for minutminuta, minuty, minut.

24 Hour Time. 

When reading from a watch, computer of phone it seems many Czechs will just say the hour and limit component separately. For example the other day when we were leaving the Maximus spa my girlfriend asked the lady at the counter what time the shuttle bus to the nearest tram stop left, she pulled up a schedule and said “šestnáct dvacet” - 16:20.

The only quirk is 1-9 minutes past the hour where you say the minutes with a leading nula - so 16:05 would be šestnáct nula pět.


Firefox Developer - Gnome and Debian 9 quickstart

Using Firefox on Debian 9 is a little frustrating as the packages available from default APT sources are older “ESR” releases (firefox-esr, 52.0 at the time of writing). Getting the Dev or Nightly builds is pretty straight forward but if you use Gnome you probably want a launcher, and it might not be very obvious how to do this.

First grab the sources or prebuilt binaries:

    $ curl -LO "https://download.mozilla.org/?product=firefox-devedition-latest-ssl&os=linux64&lang=en-US"

Extract them into /opt/firefox-dev:

    $ tar -xjf firefox-57.0b12.tar.bz2 && sudo mv firefox /opt/firefox-dev

Open up a text editor and create /use/share/applications/firefox-dev.desktop as follows

    [Desktop Entry]
    Name=Firefox Dev
    Comment=Browse the World Wide Web
    GenericName=Web Browser
    X-GNOME-FullName=Firefox Dev Web Browser
    Exec=/opt/firefox-dev/firefox %u
    Terminal=false
    X-MultipleArgs=false
    Type=Application
    Icon=firefox-dev
    Categories=Network;WebBrowser;
    MimeType=text/html;text/xml;application/xhtml+xml;application/xml;application/vnd.mozilla.xul+xml;application/rss+xml;application/rdf+xml;image/gif;image/jpeg;image/png;x-scheme-handler/http;x-scheme-handler/https;
    StartupNotify=true

Copy the icon and run gtk-update-icon-cache so that the icon appears as expected.

    $ sudo cp /opt/firefox-dev/browser/icons/mozicon128.png /usr/share/icons/hicolor/128x128/apps/firefox-dev.png
    $ sudo gtk-update-icon-cache -f /usr/share/icons/hicolor

And that's it! You should have a nice desktop icon for Firefox Developer Edition you can use. I also did the same for Firefox Nightly:

For updates you can clean out the directory and repeat the process with the latest tar.bz2 file ... or you can change the permissions in the firefox-dev directory so you have write access, and auto-updates will work.







Arduino - 5x8 ISO 8859-2 font

A while ago I took an existing 3x8 font, converted it by hand for use with AdaFruit's graphics library and subsequently modified it to use ISO 8859-2 characters (in a presumably innocent coincidence this popped up in the Adafruit GFX Library a few months later). Anyway, I recently received a request to perform the same modification to the standard 5x8 fonts and only recently got round to looking at this. As a quick reminder, ISO 8859-2 covers ~128 extra characters for alphabets used by Central and Eastern and Southern European languages - these characters look like this:

After a little bit of experimentation I realised that the extra 2 columns aren't really all that useful for adding the extra ligatures (čarky, hačky, umlauts etc) - we're really constrained by vertical space. Many of the existing letters need to be modified or reworked entirely, since they consume the almost entire 8 rows and we need 2 rows for some ligatures. For example Ä is currently implemented as the following:

    0x7D, 0x12, 0x11, 0x12, 0x7D

Which looks like this:

This is visually a little confusing, but more importantly we cannot really re-use it for ISO 8859-2 since some of the ligatures we need to add to the "A" require at least two rows. Instead of having an "A" which jumps around depending on the ligature, I've created a single A for when a ligature is used and left the ungarnished original letter alone.

Just as another example of why this can be tricky the existing ä looks really weird to me, the umlauts are skewed off to the left and look like they're joined to the letter itself. 

I've moved this up into a central position which is the same on all letters involving umlauts. This is purely based on personal taste, but I think it looks better - below is the original style compared to my modified version:

There are similar considerations in some of the other letters that are left as an exercise for the reader - see if you can devise a neat system to fit all the letters below into 3x8 grid in a way that is consistent and legible, it's pretty tricky. I've made an initial stab at this (see GitHub gist below), but after revisiting this I've realised how flakey and error-prone this process of creating fonts is. 



F# - some Project Euler patterns

To help me get used to F# and relearn the ways of functional programming I've been working through Project Euler in a Jupyter IfSharp notebook and keeping my solutions on GitHub at https://github.com/smcl/ProjectEulerJupyter

After around 50 problems so far I've spotted a handful of patterns which had either a number of possible solutions or were a pain to type out (or copy/paste) each time I used them. I explored them each in a little more detail to find the most optimal implementation of each pattern. The reason I wrote this up is that even though the problems are pretty simple, some of the results were pretty surprising.

For each of the patterns I've got a simple definition, some solutions and a set of benchmark results in a table. In each results table I've highlighted the most optimal solution that fully expands the result set (so the lazily evaluated solutions that "complete" in a millisecond don't count) so that we can have a realistic idea of what the best solution is.

Combinations of two lists 

The first item is the simplest of the four problems - if we have two lists foo and bar, produce a list of pairs featuring all combinations of elements of foo with bar. So given something like ...

let foo = [ 'a'; 'b'; 'c' ]
let bar = [ 1; 2; 3 ]

We expect to see a list like this ...

[ 
    ('a', 1); ('a', 2); ('a', 3)
    ('b', 1); ('b', 2); ('b', 3)
    ('c', 1); ('c', 2); ('c', 3)
]

I came up with only three solutions - I don't feel like I have an ideal solution to this, just the least hacky variant of the first solution that popped up in my mind.

pair_list: The first solution, call List.map for every member of one list, then in the function argument call List.map again for every member of the second - flatten the resulting list using List.concat

let pair_list l1 l2 =
    List.map (fun x -> l2 |> List.map (fun y -> (x,y))) l1
    |> List.concat    

pair_seq: as above, but assume we can have sequences as input, so we can produce the (fairly large) output array lazily:

let pair_seq s1 s2 =
    Seq.map (fun x -> s2 |> Seq.map (fun y -> (x,y))) s1
    |> Seq.concat

pair_seq_expanded: as above, expand fully to a List for an idea of how long it takes to operate on the whole output:

let pair_seq_expanded s1 s2 =
    Seq.map (fun x -> s2 |> Seq.map (fun y -> (x,y))) s1
    |> Seq.concat
    |> Seq.toList

pair_seq_for: it's pretty clear that using a sequence is preferable here, especially if you need to work with 1000 x 1000 collections, so the final variant is a slight rewrite of the second, using a for loop and yield-ing the resulting tuple.

let pair_seq_for s1 s2 = 
    [ for x in s1 do 
        for y in s2 do 
            yield (x,y) ]

To compare the performance of these three I've defined 100 and 1000 element lists/sequences and measured how long it takes to iterate through each sequence pair performing a simple operation (accumulating the difference between the pairs).

time to create n-element collection of pairs in milliseconds

method n=10000 n=100000 n=1000000 n=10000000
pair_list 1.5096 14.7937 226.0501 2927.2477
pair_seq 0.8690 0.8690 0.8846 0.9028
pair_seq_expanded 3.3952 21.5028 184.3353 2264.2805
pair_seq_for 3.2361 12.5183 180.1352 1997.3700

So thankfully the cleanest looking pair_seq_for solution is actually the fastest when we get to larger data sets. This isn't quite where the story ends though. There's a nice discussion here on Stack Overflow about a similar but slightly different problem - finding n element combinations of a single list - so for

let someList = [ 1; 2; 3 ]

... we wanted a function combs (n:'a) (lst:'a list) which would produce something like the below for combs 2 someList

[
    ( 1, 2 ); ( 1, 3 )
    ( 2, 1 ); ( 2, 3 )
    ( 3, 1 ); ( 3, 2 )
]

This is different from the problem I posed, but I've got a GitHub gist here where I've turned them all loose on the same set of data, and performed some simple measurements. 

Pairing elements of Collections with their indexes

In a couple of places I found myself wondering if F# collections had an equivalent of python's enumerate - which is a function which wraps a list and returns an index/element pair for each loop iteration:

letters = [ "a", "b", "c", "d" ]
for i, c in enumerate(letters):
    print "%d: %s" % (i, c)

# output:
#     0: a
#     1: b
#     2: c
#     3: d

It took a little while before I spotted Array.mapi so I ended up working through and measuring a handful of different ways first - some are obviously pretty poor (particularly those using recursion) but I left them in nonetheless:

enumerate_by_for_seq - using a Seq to generate the index and return a pair

let enumerate_by_for_seq (a:string []) =
    seq { for i in 0 .. (a.Length-1) -> (i, a.[i]) }

enumerate_by_for_seq_expanded - as previous, but returning a List to fully expand the sequence

let enumerate_by_for_seq_expanded (a:string []) =
    seq { for i in 0 .. (a.Length-1) -> (i, a.[i]) }
    |> Seq.toList

enumerate_by_for_list - iterating over each index using a for loop, returning a (int * string) list

let enumerate_by_for_list (a:string []) =
    [ for i in 0 .. (a.Length-1) -> (i, a.[i]) ]

enumerate_by_for_array - as above but returning (int * string[], note that this seems ridiculously similar, but differs surprisingly in performance (I discovered this by accident and included it in this experiment because of the difference!)

let enumerate_by_for_array (a:string []) =
    [| for i in 0 .. (a.Length-1) -> (i, a.[i]) |]

enumerate_by_map - generating a list of indexes and then using |> and List.map to create the index/element pair (i.e. the same as the first approach, but using List)

let enumerate_by_map (a:string []) =
    [0..(a.Length-1)]
    |> List.map (fun i -> (i, a.[i]))

enumerate_by_recursion_array - bonkers approach, abusing Array.append and recursing. Just don't do this...

let rec enumerate_by_recursion_array' i (a:string[]) =
    match i with
    | 0 -> [||]
    | _ -> Array.append [| (i, a.[i]) |] (enumerate_by_recursion_array' (i-1) (a.[1..]))

let enumerate_by_recursion_array (a:string[]) =
    enumerate_by_recursion_array' (a.Length-1) a

enumerate_by_recursion_list - List variant of the above. Don't do this either

let rec enumerate_by_recursion_list' i (a:string[]) =
    match i with
    | 0 -> []
    | _ -> [ (i, a.[i]) ] @ (enumerate_by_recursion_list' (i-1) (a.[1..]))
    
let enumerate_by_recursion_list (a:string[]) =
    enumerate_by_recursion_list' (a.Length-1) a

enumerate_by_for_zip - Using Array.zip - shortest solution, the best until I spotted Array.mapi

let enumerate_by_zip (a:string[]) =
    Array.zip a [|0..(a.Length-1)|]

enumerate_by_for_mapi - Probably the most "correct" solution, using Array.mapi

let enumerate_by_mapi (a:string[]) =   
    Array.mapi (fun i x -> (i,x)) a

enumerate_by_for_parallel_mapi - As above but naively switching in Array.Parallel.mapi without any other changes

let enumerate_by_parallel_mapi (a:string[]) =
    Array.Parallel.mapi (fun i x -> (i,x)) a

time taken to enumerate n element collection (milliseconds)

method n=10000  n=100000 n=1000000 n=10000000
enumerate_by_for_seq 0.3385 0.3496 0.3471 0.3540
enumerate_by_for_seq_expanded 2.6177 18.8341 205.4403 3610.3913
enumerate_by_for_list 1.3487 22.1703 248.5039 4200.8530
enumerate_by_for_array 2.1619 12.8186 192.3148 3178.5893
enumerate_by_map 2.0391 26.2468 287.2852 4179.3407
enumerate_by_recursion_array 7760.3141 n/a*  n/a* 
n/a*
enumerate_by_recursion_list 5368.5472 n/a* 
n/a* 
n/a*
enumerate_by_zip 7.1136 9.4388 170.0941 1917.8617
enumerate_by_mapi 2.6911 13.0303 116.5348 1268.8625
enumerate_by_parallel_mapi 8.1293 17.7548 102.2350 1379.0431

* = this took way too long so I killed it

Obviously Array.mapi was the fastest overall. However it wasn't as much faster than Array.zip as I would have imagined, and I suspect that I'm doing something wrong with Array.Parallel.mapi. Also interesting is that while the super-fast performance of the enumerate_by_for_seq method dissipates somewhat when fully evaluated, it is still faster than the equivalent enumerate_by_for_list version.

Pandigitals

"Pandigital" numbers feature relatively frequently in Project Euler. An n-digit pandigital number will contain all digits from 0..or 1..(n-1) once in some order. For example 41523 is a 1-5 pandigital, and 43210 is 0-4 pandigital. These numbers are mentioned in 32, 38, 41, 104, 118, 170 (and perhaps more) so a relatively efficient way to recognise them is pretty useful to have at hand. 

Again there's a few ways we can do this - in each case I can think of we start with taking the string representation of the input number and splitting it up using ToCharArray() and with this we can do a number of different things.

pandigital_strcmp - sort array, map each element to string, sort, create string + compare to "123..n"

let pandigital_strcmp (n:int) = 
    let sorted = new string (string(n).ToCharArray() |> Array.sort)
    sorted = pandigitalString

pandigital_intcmp - sort array, map each element to string, sort, create string, cast to int + compare to 123..n

let pandigital_intcmp (n:int) = 
    let sorted = new string (string(n).ToCharArray() |> Array.sort)
    int(sorted) = pandigitalInt

pandigital_arrcmp - sort array, string, cast to int + compare to existing array [| '1'; '2'; .. n |]

let pandigital_arrcmp (n:int) = 
    pandigitalArray = (string(n).ToCharArray() |> Array.sort)

pandigital_set_difference - convert to Set and compute difference from precalc'd set, pandigital if empty

let pandigital_set_difference (n:int) = 
    string(n).ToCharArray()
    |> Set.ofArray
    |> Set.difference pandigitalSet
    |> Set.isEmpty

pandigital_array_contains - for each element in precalculated pandigital array, check it's present in array and use List.fold to ensure all true

let pandigital_array_contains (n:int) = 
    let a = string(n).ToCharArray()
    pandigitalArray 
    |> Array.map (fun c -> Array.contains c a) 
    |> Array.fold (fun e acc -> e && acc) true

So I tested these against using the code to measure how quickly each method was in applying 

// where panDigitalInt is the upper limit ("n" in the table)
let testNumbers = [ 0 .. pandigitalInt ]
let bench name f =
    let sw = Stopwatch.StartNew()
    let res = testNumbers |> List.filter f |> List.length
    sw.Stop()
    printfn "%s: %f ms" name sw.Elapsed.TotalMilliseconds

time taken to filter pandigitals in [0..n] in milliseconds

method n=1234 n=12345 n=123456 n=1234567
pandigital_strcmp 2.1081 11.2639 113.2086 1356.1985
pandigital_intcmp 0.9716 9.7646 89.3238 947.0513
pandigital_arrcmp 2.4441 6.1932 59.7014 618.0665
pandigital_set_difference 2.5024 17.2115 199.2863 1986.9592
pandigital_array_contains 0.9790 4.8161 50.447 565.6698

So it seems Array.contains wins overall. The Set.difference approach was pretty dismal which was disappointing - it came to me when I was out walking my dog and I rushed back to write it and benchmark it. I think Set.ofArray is perhaps a little slow, but I haven't done any investigation into it.

It's worth noting that you probably shouldn't do something like [0..bigNumber] |> List.filter pandigital_array_contains to start with - maybe it's worth approaching the problem from a different angle in some cases.

Sorting a 3-element tuple

OK this only came up once and was part of a pretty poor solution I had for problem 39 (original, solution) - I generated thousands of tuple featuring 3 of integers and then tested whether they could represent the sides of right-angled triangles using Pythagoras' theorem. However since they were in no particular order I thought I needed identify the hypotenuse. I wrote this out long-form since there's only a handful of cases and solved the problem relatively quickly.

Regardless of whether this was a suitable solution for the problem, I was curious as to what approach works best for sorting these tiny collections of 3 elements.

I had only three approaches:

sort_nested_if - use nested if statements to reduce the number of comparisons needed while introducing branches
let sort_nested_if (a,b,c) =
    if a >= b then
        if b >= c then (a,b,c) else (a,c,b)
    elif b >= a then
        if a >= c then (b,a,c) else (b,c,a)
    else
        if a >= b then (c,a,b) else (c,b,a)

sort_flat_if - have a separate if for each result at the top level

let sort_flat_if (a,b,c) =
    if a >= b && b >= c then       (a,b,c)
    elif a >= b && b >= c then     (a,c,b)
    elif b >= a && a >= c then     (b,a,c)
    elif b >= a && c >= a then     (b,c,a)
    elif (*c >= b &&*) a >= b then (c,a,b)
    else (*c >= b && b >= a then*) (c,b,a)

sort_array - create an array, use Array.sort and map the results back into a tuple when returning the result

let sort_array (a,b,c) =
    let sorted = Array.sort [| a;b;c |]
    (sorted.[0], sorted.[1], sorted.[2])

To test these I generated large arrays of size 4000, 40000 and 400000 3-element tuples and timed how long each method took to sort them.

time taken to sort n 3-element tuples, in milliseconds

method n=4000 40000 400000
sort_nested_if 1.2626 13.9014 193.3619
sort_flat_if 1.7864 23.4633 258.2538
sort_array 1.2424 11.9907 132.4312

OK now it's probably obvious why I didn't just bin this little experiment - sort_array is surprisingly the clear winner. I would have guessed that the overhead of building an array and calling Array.sort function on a list way smaller than you'd normally need to sort would be insane. But apparently it's not!

Conclusion

It's surprising how many different ways you can write some relatively simple algorithms. Some of them are obviously pretty awful, like the recursive enumerate function (though I'm sure I can rearrange it to take advantage of tail call elimination) - and some were surprisingly performant, like the sort_array function in the final example. I've noticed that some other Project Euler people maintain a sort of library of functions they can re-use. Eventually I'd like to do something like this, but until it becomes unmanageable I'll just keep a Gist on GitHub: