F# Memoization: The Recursive Functions

Take this naive definition of Fibonacci sequence:

let rec fib n =
  if n < 2I
  then 1I
  else fib (n-1I) + fib (n-2I)

*Good*: correct, easy to write. *Bad*: performance. Calculates
the same values over and over again.

How about saving up the results? There is this function
I got from Don Syme's blog:

open System.Collections.Generic
 
let memoize f =
  let cache     = Dictionary<_, _>();
  let compute x = let res = f x in (cache.[x] <- res; res);
  fun x -> if cache.ContainsKey(x) then cache.[x] else compute x

Can we use this to patch up Fibonacci? Well, not directly. Test it.
memoize fib 30 is hardly faster than fib 30. Why? Obviously,
since the recursive calls are not cached.

The easiest way to work around that is to lambda-abstract the recursion
from fib. It is not as fancy as it sounds. Simply do this (and notice
there's no more rec in let):

let rfib self n =
  if n < 2I
  then 1I
  else self (n-1I) + self (n-2I)

Introduce a fixed-point combinator:

let rec fix f x = f (fix f) x

In Haskell that would be simply

fix f = f (fix f)

But the following does not work (think about the reason).

let rec fix f = f (fix f)

Now, the naive Fibonacci function can be expressed as

let fib = fix rfib

However, by providing a memoizing fixed-point memofix, its performance
can be drastically improved:

let memofix f = let rec g = memoize (fun x -> f g x) in g
let fib = memofix rfib

Below is Problem 2 again from Project Euler, just to show it works with
this machinery. The solution is hardly the best one for this particular
problem. But recursive memoization is natural in other dynamic
programming contexts.

#light
 
open System.Collections.Generic
 
let memoize f =
  let cache     = Dictionary<_, _>();
  let compute x = let res = f x in (cache.[x] <- res; res);
  fun x -> if cache.ContainsKey(x) then cache.[x] else compute x
 
let memofix f =
  let rec g = memoize (fun x -> f g x) in g
 
let fib =
  let f self n =
    if n < 2I
    then 1I
    else self (n-1I) + self (n-2I)
  in memofix f
 
let euler_0002 =
  Seq.init_infinite (fun x -> BigInt x)
  |> Seq.map fib
  |> Seq.filter (fun x -> x % 2I = 0I)
  |> Seq.take_while ((>) 4000000I)
  |> Seq.sum
 
printfn "%A" euler_0002

Setting up F# ASP.NET on Linux

Just got the first F#-enabled ASP.NET page working on my Fedora. As far as I understand, deploying and registering the F# compiler only makes sense for the developer's machine, as it enables the aspx pages to recompile on the fly. For deployment, one presumably can pre-compile and package .NET assemblies with the application, so that the deployment machine does not need to have F# installed.

  • Install everything Mono with the GNU incantation ./configure; make; sudo make install
  • For F#, the Makefile is missing. I never figured out how to make it behave, so I just registered all DLLs with the Global Assembly Cache, something like this:
sudo -s
cd /usr/local/src/FSharp-1.9.6.2/bin/
for f in *.dll; do gacutil -i $f; done
cd gac/
for f in *.dll; do gacutil -i $f; done
  • Now run mono fsi.exe to see if it works.
  • The fsc (F# compiler) needs to sit in the system path. One way to do it is to create /usr/bin/fsc.exe like this (I needed –warn 0 to avoid a weird error during ASP.NET pages compilation):
#!/bin/sh
exec mono /usr/local/src/FSharp-1.9.6.2/bin/fsc.exe --warn 0 $@
  • May make sense to make it runnable simply as fsc by a symlink:
sudo ln -s /usr/bin/fsc.exe /usr/bin/fsc
  • Make sure your Apache loads the mod_mono.conf generated by installing mod_mono.
  • $ locate web.config and edit the file to register the F# compiler. I added the following into the root node:
<system.codedom>
  <compilers>
    <compiler
       language="F#;f#;fs;fsharp"
       extension=".fs"
       type="Microsoft.FSharp.Compiler.CodeDom.FSharpAspNetCodeProvider,
 	      FSharp.Compiler.CodeDom,
 	      Version=1.9.6.2,
 	      Culture=neutral,
 	      PublicKeyToken=a19089b1c74d0809"/>
  </compilers>
</system.codedom>
  • Restart Apache!
  • Drop this time.aspx into the document root:
<%@ Page Language="F#" %>
 
<script language="F#" runat="server">
member this.f(_,_) =
    this.Time.Text <- DateTime.Now.ToString()
</script>
 
<div runat="server" OnLoad="f">
  <asp:Label runat="server" id="Time"/>
</div>
  • Now you should be able to access http://localhost/time.aspx. Be prepared for seconds of waiting. After the page compiles, it should run faster. It should display the current time.

Caching the Web

These are some lessons in web development I learned the hard way. By the way, web development is an easy area to hate. It seems to be all about staying compatible with very bad tools and platforms.

  1. Use W3C Tidy on your HTML output;
  2. Use an HTTP Accelerator / reverse proxy where appropriate (Squid, Varnish Cache, Nginx?), this gives easily a 10x speed gain on certain dynamic files;
  3. Test that your pages are served compressed and uncompressed appropriately, and cached;
  4. Test that your pages are served over HTTP 1.0 as well as HTTP 1.1; Varnish helpfully gave out a blank screen for certain HTTP 1.0 before the trunk 2.0 beta version;
  5. Optimize your caching. Cookies are your enemies, as well as improper headers. When cookies are present, the HTTP cache cannot prove that they will not be used by the content object; therefore, it is often necessary to forcefully remove cookies from those requests that you know are not using them: mimetex.cgi is a good example.
  6. Test your configuration with HEAD, POST and ab tools. Don't forget to test with different headers, such as -H Cookie: a=b or -H Accept-Encoding: gzip,deflate.
  7. Design your software to make use of HTTP-level caching. RESTful services especially start to shine when cached appropriately.
  8. When hacking third-party software, be prepared to waste a lot of time. And refrain from modifying its core files. When the next version comes out it will not be compatible with your changes. jQuery can do a lot of cosmetic changes on the end HTML level, and connecting it to a few RESTful endpoints can add functionality without hacking the legacy app.

POTD: Slava Akhmechet

Person of the day is Slava Akhmechet the (Common) LISPer with his relational database rant.

This comes to my mind after almost a day of failing attempts to convince MySQL optimize a simple query. The smart database does not like to use the index on a WHERE ... IN (SELECT ...) clause even when the inner query returns 2-3 results. The database prefers to go through 600K records. This looks like the bug #17952.

O liquid crystal gaz’rs of Tar2,
Resist the call of Durin’s drum!
Beware the fat black gol’m of war
And Grima’s tails.

N-Queens Revisited: A Haskell Reply

A Curious Programmer wrote C++ and OCaml solutions. According to the blog, “the fixed C++ completes 100 solutions in 9.5 seconds and the Ocaml completes 100 solutions in 19.3 seconds.” Now of course this is apples to oranges because the algorithm is likely to be different, but here is a Haskell solver that goes a lot faster than that:

`time for ((n=1;n<=100;n++)); do ./nqueens 8 > /dev/null; done

real	0m1.115s
user	0m0.856s
sys	0m0.232s

And the solver is really simple, almost brute-force. The only optimization it uses is the knowledge that every line of the board will have a queen in the solution. So, when looking for free locations for the nth queen, it filters them to only consider locations on line n.

import System
import Data.Array
 
type N = Int
newtype QLoc    = QLoc (N, N)
newtype NQueens = NQueens (N, Array N N, [QLoc])
 
clearBoard :: N -> NQueens
clearBoard n = NQueens (n,
                        array (1, n) (take n (zip [1..] (repeat 0))),
                        map QLoc [(i,j) | i <- [1..n], j <- [1..n]])
 
(~~) :: QLoc -> QLoc -> Bool
(QLoc (a, b)) ~~ (QLoc (i, j)) = or [i==a, j==b, i-j==a-b, i+j==a+b]
 
freeLocs :: NQueens -> [QLoc]
freeLocs (NQueens (_, _, f)) = f
 
putQueen :: QLoc -> NQueens -> NQueens
putQueen q@(QLoc c) (NQueens (n, p, f))
    = NQueens (n, p // [c], filter (not . (~~q)) f)
 
instance Show NQueens where
    show (NQueens (n, p, f)) =
        let s x = take x $ repeat '.'
            f 0 = s n
            f x = s (x-1)  ++ "Q" ++ s (n-x)
        in  unlines (map f (elems p))
 
nqueens :: N -> [NQueens]
nqueens n = gen n where
    gen 0 = [(clearBoard n)]
    gen n = [putQueen loc sol |
             sol <- gen (n-1),
             loc@(QLoc(i,_)) <- freeLocs sol,
             i == n]
 
 
main = do [n] <- fmap (map read) getArgs
          sequence_ $ map (putStrLn.show) $ nqueens n

The program correctly finds 92 solutions for the 8-queens problem. And here is a sample output for ./nqueens 16:

Q...............
..Q.............
....Q...........
.......Q........
.............Q..
........Q.......
............Q...
...............Q
.Q..............
..............Q.
.........Q......
.....Q..........
...Q............
...........Q....
......Q.........
..........Q.....

UPDATE: This is compressible to an n-liner (modulo nice printing):

q :: Int -> [[Int]]
q n = map fst (gen n) where
    gen 0 = [([], [(i,j) | i <- [1..n], j <- [1..n]])]
    gen n = [(j:s, filter (not.(t f)) fs)
                 | (s,fs) <- gen (n-1), f@(i,j) <- fs, i==n ]
    t (a,b) (i,j) = or [i==a, j==b, i-j==a-b, i+j==a+b]

And this is almost as good as the solution from the Haskell wiki, H-99: Ninety-Nine Haskell Problems.

Project Euler 10

Problem 10 is about primes again, and it's getting tedious. Thankfully the sieve from Problem 8 can be factored out into a separate module. Now we need to make a correction there - Integer instead of Int, as the numbers are getting large.

import Control.Monad
import Data.Array.ST
import Control.Monad.ST
 
sieve n
    = runSTUArray $
      do p <- newArray(2, n) True :: ST s (STUArray s Integer Bool)
         sequence_ [ readArray p i >>=
                     flip when ( sequence_ [ writeArray p j False
                                             | j <- [2*i, 3*i .. n]] )
                     | i <- [2 .. maxdiv n] ]
         return p
 
maxdiv   = ceiling . sqrt . fromInteger . toInteger

And then:

module Euler0010 where
import Data.Array.IArray
import Primes
 
euler0010 = sum $  map fst $ filter snd $ assocs $ sieve 2000000

Takes a while in interactive mode, about 20 sec. When compiled, however, takes 1.28 sec. It's OK.

Project Euler 9

Problem 9 is a lovely simple problem. The equations can be solved to a closed form to speed the search up. I am totally inept at solving equations, but I have Maxima. So:

(%i1) e1: a + b + c = 1000;
(%o1)                          c + b + a = 1000
(%i2) e2: a^2 + b^2 = c^2;
                                  2    2    2
(%o2)                            b  + a  = c
(%i3) solve([e1,e2],[b,c]);
                                            2
                    1000 a - 500000        a  - 1000 a + 500000
(%o3)         [[b = ---------------, c = - --------------------]]
                       a - 1000                  a - 1000
(%i4)

Now we have a closed-form for b as a function of a (or vice versa). We need to solve it in natural numbers. Coming back to Haskell (I am not proficient with Maxima yet), this can be straightforward:

module Euler0009 where
import Control.Monad
import Data.Maybe
 
mdiv a b | a `mod` b == 0 = Just $ a `div` b
         | otherwise      = Nothing
 
b a = (1000*a - 500000) `mdiv` (a - 1000)
 
euler0009 = let a' = fromJust $ msum $ map b [1..]
                b' = fromJust $ b a'
                c' = 1000 - a' - b'
            in  (a'*b'*c')

Project Euler 8

Problem 8 in Haskell is really about I/O. ByteString is great. Fast, lazy, intuitive and monadic I/O. Especially since the latest Parsec can be run on ByteStrings… But no need for that here. Just a simple problem:

module Euler0008 where
import Data.Char
import qualified Data.ByteString.Lazy as BS
 
 
ascii0 = ord '0'
fives bs | BS.length bs >= 5 = BS.take 5 bs : fives (BS.drop 1 bs)
         | otherwise         = []
 
mul5 = BS.foldl' (\a b -> a*(ii b-ascii0)) 1
 
euler0008 = do bs <- fmap BS.init $ BS.readFile "Euler0008.txt"
               putStrLn $ show $ maximum $ map mul5 $ fives bs
 
ii = fromInteger . toInteger

The only ugliness comes with Word8 to Int adapter ii, and with the need to do BS.init to get rid of the newline. I looked for chomp but could not find.

Project Euler 7

Problem 7 is primes again. In Haskell, it is always tempting to have an infinite list of primes and filter it. However, this is not nearly as efficient as the simple sieve. Sieving, however, inherently assumes mutation. This means it requires some dancing in Haskell to get right. The ST monad is one way to go. The solution below does strict sieving first and then constructs an infinite list, keeping the usable interface but making it slightly more efficient. Basic tests show that when the sieve size is properly chosen, it speeds up the program by a factor of 3.

module Euler0007 where
import System
import Control.Monad
import Control.Monad.ST
import Data.Array.ST
import Data.Array.IArray
 
sieve n = runSTUArray $
          do p <- newArray(2, n) True :: ST s (STUArray s Int Bool)
             sequence_ [ readArray p i
                         >>= flip when
                                 ( sequence_ [ writeArray p j False
                                                   | j <- [2*i, 3*i .. n]] )
                         | i <- [2 .. maxdiv n] ]
             return p
 
 
maxdiv   = ceiling . sqrt . fromInteger . toInteger
primes n = let  s       = sieve n
		slist   = map fst $ filter snd $ assocs s
		p       = head [i | i <- [n, n-1 .. ], s ! i]
		primes  = slist ++ filter prime [p+2, p+4 ..]
		prime x = not $ any ((==0).(mod x)) $
                          takeWhile (<= (maxdiv x)) primes
           in  primes
 
 
euler0007 = (primes 100000) !! 10000