Posts for Tag: fsharp

F# - Polymorphic parameter overflow

I was mucking around in F# over lunch and started thinking about how it assigns names to type variables (see "Automatic Generalization" of the Type Inference page of the F# docs for more info). By way of introduction, let's create a function foo that takes a single parameter without an explicitly defined type and see what its signature looks like:
> let foo x = x;;
val foo : x:'a -> 'a
So the parameter x is assigned the type variable 'a - which makes sense, the first unknown type gets named after the first letter of the alphabet. And of course it follows that if we have a function with two parameters the second type is called ...
> let foo x y = (x,y);;
val foo : x:'a -> y:'b -> 'a * 'b
... 'b! Ok now what happens when we've got a function with a ridiculous amount of parameters? We'll run out of lower-case letters eventually. When that happens do start using upper-case letters? Non-ASCII? Something else? I quickly hacked together a python script to generate functions with arbitrary parameters, created one with 40 and pasted it into the F# REPL and ...
> let foo x0 x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11 x12 x13 x14 x15 x16 x17 x18 x19 x20 x21 x22 x23 x24 x25 x26 x27 x28 x29 x30 x31 x32 x33 x34 x35 x36 x37 x38 x39 = (x0,x1,x2,x- 3,x4,x5,x6,x7,x8,x9,x10,x11,x12,x13,x14,x15,x16,x17,x18,x19,x20,x21,x22,x23,x24,x25,x26,x27,x28,x29,x30,x31,x32,x33,x34,x35,x36,x37,x38,x39);;
val foo :
  x0:'a ->
    x1:'b ->
      x2:'c ->
        x3:'d ->
          x4:'e ->
            x5:'f ->
              x6:'g ->
                x7:'h ->
                  x8:'i ->
                    x9:'j ->
                      x10:'k ->
                        x11:'l ->
                          x12:'m ->
                            x13:'n ->
                              x14:'o ->
                                x15:'p ->
                                  x16:'q ->
                                    x17:'r ->
                                      x18:'s ->
                                        x19:'t ->
                                          x20:'a1 ->
                                            x21:'a2 ->
                                              x22:'a3 ->
                                                x23:'a4 ->
                                                  x24:'a5 ->
                                                    x25:'a6 ->
                                                      x26:'a7 ->
                                                        x27:'a8 ->
                                                          x28:'a9 ->
                                                            x29:'a10 ->
                                                              x30:'a11 ->
                                                                x31:'a12 ->
                                                                  x32:'a13 ->
                                                                    x33:'a14 ->
                                                                      x34:'a15 ->
                                                                        x35:'a16 ->
                                                                          x36:'a17 ->
                                                                            x37:'a18 ->
                                                                              x38:'a19 ->
                                                                                x39:'a20 ->
                                                                                  'a *
                                                                                  'b *
                                                                                  'c *
                                                                                  'd *
                                                                                  'e *
                                                                                  'f *
                                                                                  'g *
                                                                                  'h *
                                                                                  'i *
                                                                                  'j *
                                                                                  'k *
                                                                                  'l *
                                                                                  'm *
                                                                                  'n *
                                                                                  'o *
                                                                                  'p *
                                                                                  'q *
                                                                                  'r *
                                                                                  's *
                                                                                  't *
                                                                                  'a1 *
                                                                                  'a2 *
                                                                                  'a3 *
                                                                                  'a4 *
                                                                                  'a5 *
                                                                                  'a6 *
                                                                                  'a7 *
                                                                                  'a8 *
                                                                                  'a9 *
                                                                                  'a10 *
                                                                                  'a11 *
                                                                                  'a12 *
                                                                                  'a13 *
                                                                                  'a14 *
                                                                                  'a15 *
                                                                                  'a16 *
                                                                                  'a17 *
                                                                                  'a18 *
                                                                                  'a19 *
                                                                                  'a20
So 'a up to 't ... but then numbered 'as after that. I've no idea why it stops at 't, then just counts up from 'a. Interestingly this is the case with F# under Mono on Linux and under the official MS runtime on VS2017 on Windows. OCaml, which F# is heavily influenced by, does not exhibit this behaviour - it refers to the unknown types as a..z, a1..z1, a2...z2, etc.

That's reasonbaly interesting and all, but since I had already written a script to generate these F# files I figured I'd keep trying with more and more parameters to see what happens. After adding a hundred parameters at a time I soon hit an interesting warning between 400 and 500 params:

Warning: line too long, ignoring some characters
- 
It turned out we hit some line-length restriction in either the parser or the readline equivalent that fsi.exe uses. A bit of experimentation later I found that when the function had 429 parameters it parsed just fine ... but 430 parameters caused the warning.

429 parameters: 4093 characters
430 parameters: 4104 characters

Minor sidebar - I was going to just say we hit a restriction where 4096 bytes is the max line length, but obviously 1 character isn't necessarily 1 byte if we're using some Unicode representation, and a nice way to check that we're using Unicode is to punch in identifiers that would definitely be in different encodings in an ASCII representation - I used Georgian and Czech characters:
> let სამი = 3;;
val სამი : int = 3

> let čtyři = 4;;
val čtyři : int = 4

> სამი + čtyři;;
val it : int = 7
When I changed my program to start each parameter with the Georgian character "ა" (i.e. let bar ა0 ა1... etc) which cannot be represented in a single byte in UTF-8 we hit the error after only 162 parameters. So it's not a character limit, but a buffer of bytes that we filled - and the size of the buffer is ... uh 4096 bytes.

Since I wanted to pull this thread until I hit some logical conclusion I tweaked my program to split the insanely long lines so they wouldn't hit this error and continued to add more parameters. The next issue I encountered was after trying a function with a couple thousand parameters, where I encountered the following error:
  x1328,
  ^^^^^

/home/sean/stdin(6636,1): error FS0039: The value or constructor 'x1328' is not defined. Maybe you want one of the following:
   x132
   x138
   x1238
   x128
   x1321
So it seems we've bumped up into another limit - somehow 1327 parameters is ok but F# loses track of the 1328th one, thinking it doesn't exist. Is this a bug? Probably, there should maybe be a more helpful error message. Is it important? Probably not, if your code contains a function with upwards of a thousand parameters then this is the least of your problems.