Monday, July 6, 2009

Emacs 23

What I enjoy in the (beta) Emacs 23:

  1. It finally looks like a native Linux application
  2. Buffers automatically split horizontally
  3. It highlights the text selection by default

On Ubuntu, get it from this PPA. On Windows, there's a decent EmacsW32 patched build.

screenshot

Ubuntu apt sources.list:

## emacs elisp ppa 
deb http://ppa.launchpad.net/ubuntu-elisp/ppa/ubuntu jaunty main
deb-src http://ppa.launchpad.net/ubuntu-elisp/ppa/ubuntu jaunty main

Sunday, July 5, 2009

posting formulas in blogger

Posting math in Blogger posts can be tricky.

Luckily, images can be inlined in HTML:

<img src="data:image/png;base64,iVBORw0KGgoAAAA...

This markup is understood by the major browsers, including IE (version 8 and later).
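As a minimal sketch of the embedding step (this is not the toolchain described in this post, and the function name png_to_img_tag is made up for illustration), here is how such a tag could be built in Python from the raw bytes of a PNG:

```python
import base64
import html


def png_to_img_tag(png_bytes, alt=""):
    """Return an <img> tag that carries the PNG inline as a base64 data URI.

    png_to_img_tag is a hypothetical helper, not part of any tool named here.
    """
    # base64 turns the binary image into plain ASCII that is safe in an attribute
    encoded = base64.b64encode(png_bytes).decode("ascii")
    # escape the alt text so quotes or angle brackets cannot break the tag
    safe_alt = html.escape(alt, quote=True)
    return '<img src="data:image/png;base64,{}" alt="{}" />'.format(encoded, safe_alt)
```

Every PNG file starts with the same 8-byte signature, which is why inlined PNGs always begin with iVBORw0KGgo, as in the tag above.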

Knowing that, one can set up a toolchain that extracts LaTeX math declarations from source, processes them with TeX, and embeds the resulting PNGs in the HTML output.

For example, the following:

$$\begin{align}
  \left(1+x\right)^n  =& 1 + nx + \frac{n\left(n-1\right)}{2!}x^2 +\\
  +& \frac{n\left(n-1\right)\left(n-2\right)}{3!}x^3 +\\
  +& \frac{n\left(n-1\right)\left(n-2\right)\left(n-3\right)}{4!}x^4 +\\
  +& \dots
\end{align}$$

Becomes:

\begin{align} \left(1+x\right)^n = 1 + nx + \frac{n\left(n-1\right)}{2!}x^2 +\\ + \frac{n\left(n-1\right)\left(n-2\right)}{3!}x^3 +\\ + \frac{n\left(n-1\right)\left(n-2\right)\left(n-3\right)}{4!}x^4 +\\ + \dots \end{align}

Unfortunately, Pandoc does not do this inlining, nor does it interface with TeX. This is a design decision, as Pandoc tries to be zero-dependency.

The solution I came up with involves hacking the texvc OCaml program that ships with Wikipedia. I now have it do XML processing, replacing code of the form [EQ]\frac{1}{2}[/EQ] with an embedded image. The whole toolchain is still unfortunately quite ugly, and looks like this:

pandoc --standalone --no-wrap --gladtex "$@" \
    | xmllint --dropdtd --recover - 2>/dev/null \
    | texmi \
    | tidy --show-body-only true - 2>/dev/null \
    | pandoc -f html -t html --no-wrap

So the first step converts Markdown to HTML with [EQ]-style mathematics (--gladtex). The output is then run through xmllint so that the parser does not choke, texmi renders the mathematics into HTML or inline images, and finally tidy makes sure the XML output is acceptable HTML. The last line is only necessary because of Blogger's non-standard whitespace handling.
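The substitution step itself is simple. Here is a rough Python sketch of the idea, not the patched texvc code: the names replace_equations and placeholder_render are invented for illustration, and the renderer below only wraps the LaTeX source in a span where the real tool would call TeX and emit an image.

```python
import html
import re

# non-greedy match so that several equations on one line stay separate;
# DOTALL lets an equation span multiple lines
EQ_PATTERN = re.compile(r"\[EQ\](.*?)\[/EQ\]", re.DOTALL)


def replace_equations(text, render):
    """Replace every [EQ]...[/EQ] span in text with render(latex_source)."""
    # a function replacement is inserted literally, so backslashes in the
    # rendered output are not interpreted as regex escapes
    return EQ_PATTERN.sub(lambda match: render(match.group(1)), text)


def placeholder_render(latex):
    """Hypothetical stand-in for the real renderer: no TeX is run here."""
    return '<span class="math">{}</span>'.format(html.escape(latex))
```

Passing the renderer in as an argument keeps the search-and-replace logic separate from the part that depends on TeX, so one could be swapped without touching the other.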

I will try to improve on this as time permits.

markdown blogging in emacs

Blogging in Markdown from Emacs is nice. Most people, me included, start out by writing a file, running a processor on it (such as Pandoc) to get the HTML, then copying the HTML to the web form to publish.

There is a way to do it faster with a little .emacs hacking (I got here with some help from #emacs on IRC, and there are probably better ways to do this, but still):

(defun pandoc ()
  "Run the contents of the current buffer through Pandoc and
copy the produced HTML to the clipboard."
  (interactive)
  (let ((fn (make-temp-file "pandoc"))
        (pbuf (get-buffer-create "*pandoc-output*")))
    ;; copy the contents of the current buffer to the temp file
    (write-region (buffer-end -1) (buffer-end 1) fn)
    ;; switch to *pandoc-output*, then return to the curr. buffer;
    ;; clean it, call pandoc on the temp file, copy to clipboard.
    (save-current-buffer
      (set-buffer pbuf)
      (erase-buffer)
      (call-process "pandoc" fn pbuf nil "--no-wrap")
      (clipboard-kill-region (buffer-end -1) (buffer-end 1)))
    (kill-buffer pbuf)   ; clean up the buffer
    (delete-file fn)))   ; .. and the temp file
(global-set-key "\C-cp" 'pandoc)

Now you can just type the post in a buffer, hit C-c p, switch to Firefox and paste the Pandoc output.

texmath

John MacFarlane announced the texmath Haskell library for converting LaTeX formula markup to MathML. This is great - see the demo! I hope it makes it into the next pandoc release.

I am not sure how MathML fares with the recent IE versions; this used to be a problem.

Saturday, March 14, 2009

the ultimate language

Periodically, a manager will mandate that I perform my work using some particular language or technique (buzzword). I usually comply by writing (adapting) a translator from SCM to that language -- Aubrey Jaffer

Friday, March 13, 2009

ANN: open lecture series in algorithms in Kyiv

The lecture series will be based on Introduction to Algorithms by Cormen, Leiserson, Rivest and Stein, with supplementary materials from MIT-OCW.

Instructors: Oleg Smirnov, Ivan Veselov

The schedule is yet to be confirmed.

First lecture: Saturday, March 14, 17:00

Venue: G-Club (courtesy of GlobalLogic), Bozhenko 86D, Kyiv, Ukraine

Mailing list (Russian): kiev-clrs/googlegroups.com

Sunday, March 8, 2009

Pandoc. Can we use it outside of Haskell?

Pandoc is great. It is a Markdown implementation on steroids, supporting a number of different input/output formats besides HTML, such as TeX. And yet it is Markdown: the source still looks great, just like plain email! As an extra bonus, Pandoc, being written in Haskell and compiled to native code, works a lot faster than many alternative markup language processors.

The problem is that it is a Haskell library. If it is indeed the great software it seems to be, one would want to use it from scripting languages (PHP, Python, Perl), OCaml, Schemes, Java, .NET and so on.

The first step towards that is to bind it to C.

Good news: yes, it is possible, and Pandoc builds as a DLL or shared object on both Windows and Linux. It is callable from other languages (currently C and PLT Scheme).

Bad news: the binary is large, as GHC compiles everything, including its runtime, into it. The binding is untested, poorly configured and very incomplete. But, well, it is open-source and will get better with time. You can see for yourself: libpandoc