next :: Java -> Haskell

You might have read Seeing Java where I described my experience writing a project for a Java class as an exercise in learning some modern Java. Today I did a similar experiment but on a much smaller scale and with one of the polar opposites of the Java language: Haskell.

Haskell is something I’ve been thinking quite a bit about lately as I’ve been pushing myself to learn more about “functional programming.” Haskell and Clojure are two languages that break the normal paradigm I’m used to and encourage a dramatically different style of programming. They take very different approaches to that concept, though, and I’ll only be discussing Haskell today. By the way, today is the day I wrote my first program ever in Haskell.

Functional programming is a misnomer. Most software is functional. The term comes from a technical aspect about passing around functions whereas many languages only allow passing around data. Functional programming is basically the result of the worlds of mathematics and computer science crashing into each other. I’ll describe it more fully in a future post.

Java and Haskell have strong typing in common, but Java’s typing system feels like a bear compared to Haskell’s, which helps more than it annoys. Defining types is lightweight in Haskell, which means that the language encourages creating all sorts of names for a certain type of data. Instead of a list of numbers in a certain form, for example, we might create a PhoneNumber type. This can be done in most programming languages but it can require more effort to do well.
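Here’s a minimal sketch of what such a type might look like. The names PhoneNumber and mkPhoneNumber are ones I made up for illustration, and the ten-digit rule is only an assumed format.

```haskell
-- Hypothetical example: PhoneNumber and mkPhoneNumber are illustrative names.
import Data.Char (isDigit)

-- A distinct type wrapping a String, so a phone number can't be
-- confused with any other piece of text.
newtype PhoneNumber = PhoneNumber String
    deriving (Show, Eq)

-- Smart constructor: accept only strings of exactly ten digits
-- (an assumed format, chosen just for this sketch).
mkPhoneNumber :: String -> Maybe PhoneNumber
mkPhoneNumber s
    | length s == 10 && all isDigit s = Just (PhoneNumber s)
    | otherwise                       = Nothing

main :: IO ()
main = do
    print (mkPhoneNumber "5551234567")    -- Just (PhoneNumber "5551234567")
    print (mkPhoneNumber "not a number")  -- Nothing
```

The whole definition is a single line, and from then on the compiler refuses to let a plain String stand in where a PhoneNumber is expected.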

Haskell is a purely functional language. This comes from the math side of things and that means we can make a bunch of assumptions about the code we write, which in turn makes code that might otherwise be very slow to run work fast. Usually when programming there is a way to write code that is simple to understand but slow and another way that is harder to read but runs faster. Because we “promise” not to do certain things, the Haskell compiler can translate our easier-to-read code into faster-to-run code for us.

Haskell has a bad reputation for being incredibly cryptic. My first bit of Haskell was pain-free, but I’m not doing anything particularly complicated. I enjoyed the language syntax (the format for how you are supposed to write the program) and the error messages were pretty good when I wrote incorrect code. As a programmer I didn’t find this any more cryptic or hard-to-read than just about any other language.

One of the things I really enjoy is the declarative nature of this code versus the imperative or procedural nature of what I’m used to. When we typically learn how to program, we are taught how to break complicated processes into smaller and smaller steps and then order those steps in code to accomplish our goals. The math side of functional languages prefers to focus on the relationships between data (or input and output) and let the computer “figure out” how to make it work. Let me illustrate with a description of pattern matching.

Pattern matching is a way to specify different behaviors for input that is structurally distinct. A little code will hopefully make this clearer.

pluralize :: takes a word and a plurality indicator to make a new word
    noun 1 = noun
    noun 2 = "a couple {noun}s"
    noun n = "{noun}s"

Disclaimer: the above snippet isn’t real code from any language, but it’s similar to how Haskell makes functions to operate on data. We can see here in the second line that if we call pluralize with a noun and a count of 1, then we just return the noun. If, however, we want the noun for a count of 2, the function pieces together a simple fragment, “a couple things.” Finally, if we call it with any other number besides 1 or 2, we simply add an s on the end of the noun and call it a day. This crude pluralizer won’t get an A in grammar class, but it’s not a bad way to describe the behavior we want from pluralizing nouns.
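For comparison, here’s roughly how the same idea might look in real Haskell. This is a sketch; the argument order (noun first, then count) is my assumption.

```haskell
-- A sketch of the pluralizer; the argument order (noun, then count) is an assumption.
pluralize :: String -> Int -> String
pluralize noun 1 = noun                        -- exactly one: leave the noun alone
pluralize noun 2 = "a couple " ++ noun ++ "s"  -- exactly two
pluralize noun _ = noun ++ "s"                 -- any other count

main :: IO ()
main = do
    putStrLn (pluralize "thing" 1)  -- thing
    putStrLn (pluralize "thing" 2)  -- a couple things
    putStrLn (pluralize "thing" 7)  -- things
```

Haskell tries the equations from top to bottom and uses the first one whose patterns match, so the catch-all case has to come last.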

This code snippet leaves much to be imagined on how to accomplish the pluralizing. In a traditional imperative language, we would have to construct it in a way that includes lots of “support” code, which figures out what circumstances we are in before giving a new value.

pluralize( noun, count )
    if count is 1 then noun
    else if count is 2 then "a couple {noun}s"
    else "{noun}s"

The differences are subtle in a simple function like this, but in the first snippet we don’t have to perform checking on the input to see if it is a particular value and then to respond accordingly. Sometimes we end up with more code to check the input than we do to actually get.stuff.done. Again, this is a result of the mathematical focus on the algorithm in Haskell and other functional programming languages.
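Pattern matching isn’t limited to particular values like 1 and 2; it can also pick apart the shape of the data. Here’s a small sketch of matching on the structure of a list. The function joinNouns is a made-up name for illustration and isn’t part of my project.

```haskell
-- Hypothetical helper: joinNouns is an illustrative name, not from the project.
joinNouns :: [String] -> String
joinNouns []       = "nothing"                   -- empty list
joinNouns [a]      = a                           -- exactly one element
joinNouns [a, b]   = a ++ " and " ++ b           -- exactly two elements
joinNouns (a:rest) = a ++ ", " ++ joinNouns rest -- three or more: peel one off and recurse

main :: IO ()
main = do
    putStrLn (joinNouns [])                                 -- nothing
    putStrLn (joinNouns ["bread"])                          -- bread
    putStrLn (joinNouns ["bread", "pastries", "pretzels"])  -- bread, pastries and pretzels
```

Each equation describes one shape the input can take, and there’s no separate code asking how long the list is.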

This was a small project but a fun one. Reasons for choosing Haskell for a project include the language’s ability to prove its behavior the way we construct mathematical proofs such as for the Pythagorean Theorem, its ability to concurrently run code safely, its type safety, and the love of the purity and expressiveness of the language.

The only real trip-up I had came from the fact that I was running my code through my text editor – a “set-it-and-forget-it” kind of deal. Unfortunately, a few of the errors I made in the process caused the code to run forever without ending, so I was surprised to see a popup telling me that my computer’s memory was full and I had to close some applications or it would crash. Whoops.

Want to see the code? Here it is. How much of what it is doing can you figure out? How would you attempt to add a new function in this code to compute the standard deviation?

 


import Data.Monoid

-- Three sample data sets
x1, x2, x3 :: [Float]
x1 = [1..100]
x2 = [2,4..100]
x3 = [3,6..100]

-- A summary holds the mean (µ) and the number of samples (n)
data Summary = Summary {µ::Float, n::Int} deriving (Show)

-- GHC 8.4 and later require a Semigroup instance before a Monoid instance,
-- so the combining step lives in (<>) rather than mappend
instance Semigroup Summary where
    s1 <> s2 = Summary {
        µ = combinedMean s1 s2,
        n = (n s1) + (n s2)
        }

instance Monoid Summary where
    mempty = Summary { µ = 0, n = 0 }

count :: [Float] -> Int
count xs = length xs

mean :: [Float] -> Float
mean xs =
    let s = sum xs
        n = fromIntegral $ count xs
    in s / n

describe :: [Float] -> Summary
describe xs = Summary { µ = mean xs, n = count xs }

-- Weighted average of two means, weighted by sample count
combinedMean :: Summary -> Summary -> Float
combinedMean s1 s2 =
    let n1 = fromIntegral $ n s1
        n2 = fromIntegral $ n s2
        µ1 = µ s1
        µ2 = µ s2
    in (µ1 * n1 + µ2 * n2) / (n1 + n2)

main :: IO ()
main = do
    putStrLn $ " x1 – " ++ (show $ d1)
    putStrLn $ " x2 – " ++ (show $ d2)
    putStrLn $ " x3 – " ++ (show $ d3)
    putStrLn $ " x1 + x2 – " ++ (show $ d1 <> d2)
    putStrLn $ "x1 + x2 + x3 – " ++ (show $ mconcat [d1, d2, d3])
  where
    [d1, d2, d3] = map describe [x1, x2, x3]
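If you want a hint on the standard deviation question, here’s one possible approach, written as a standalone sketch rather than as a change to the code above. It assumes we track a third value, the sum of squared deviations from the mean, and merge two summaries with the usual pairwise formula for combining variances. The names Stats, describeStats, and stdDev are hypothetical, and I’ve used Double instead of Float.

```haskell
-- Sketch only: Stats, describeStats, and stdDev are hypothetical names.
data Stats = Stats
    { cnt :: Int     -- number of samples
    , avg :: Double  -- mean of the samples
    , m2  :: Double  -- sum of squared deviations from the mean
    } deriving (Show)

instance Semigroup Stats where
    a <> b
        | cnt a == 0 = b
        | cnt b == 0 = a
        | otherwise  = Stats { cnt = cnt a + cnt b, avg = newAvg, m2 = newM2 }
      where
        na     = fromIntegral (cnt a)
        nb     = fromIntegral (cnt b)
        delta  = avg b - avg a
        newAvg = (avg a * na + avg b * nb) / (na + nb)
        -- Pairwise merge of the squared deviations of the two groups
        newM2  = m2 a + m2 b + delta * delta * na * nb / (na + nb)

instance Monoid Stats where
    mempty = Stats { cnt = 0, avg = 0, m2 = 0 }

describeStats :: [Double] -> Stats
describeStats xs = Stats { cnt = length xs, avg = mu, m2 = sum [(x - mu) ^ 2 | x <- xs] }
  where
    mu = if null xs then 0 else sum xs / fromIntegral (length xs)

-- Population standard deviation (divides by n, not n - 1)
stdDev :: Stats -> Double
stdDev s
    | cnt s == 0 = 0
    | otherwise  = sqrt (m2 s / fromIntegral (cnt s))

main :: IO ()
main = do
    let whole = describeStats [1 .. 10]
        parts = describeStats [1 .. 4] <> describeStats [5 .. 10]
    print (stdDev whole)  -- roughly 2.87
    print (stdDev parts)  -- roughly 2.87, up to floating-point rounding
```

Summarizing the data in two pieces and combining them should give the same answer as summarizing it all at once, which is the property that makes the Monoid instance worth having.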

No more code rot

A key security component of Windows…has a validity of 15 months and is going to expire in 25 hours.

This is what I read on the blogosphere today and it made me think about doing software design. We all hate old code because we’ve grown and learned since the time we wrote it and now think that we’d never make those mistakes again.

Nevertheless, changing code once it’s checked in to a project can be harder than landing a probe on an asteroid. Code moves in and settles down, bringing all its baggage and bad habits with it – we call this code rot because the longer a certain piece of code is in a system, the worse it tends to look or smell when we review it.

What if we couldn’t let this happen though? What if everything we wrote would self-destruct after a predefined amount of time and if we failed to update it, our software would blow up? All traces of the code vanish, all of its history disappears unless someone looks over it and updates something. Small updates would extend the deadline for a little bit and bigger refactors would add prolonged delays.

This might be a novel way of prioritizing bug fixing and performance over adding new features.

Lost in async-hole.

The past week and a half or so I have been living up to the reputation of the typical basement-dwelling geek. I like to call it “my cave” regardless of where it is or whether or not it has windows to the outside. It’s a tunnel-vision state of mind where I spend a significant amount of time in a sort of a trance; I recently drove myself crazy while trying to solve two particularly sneaky problems with the part of WordPress that shows you your notifications.

Tunnel-vision, in the zone, in my cave, focussed, thinking…

The culprit in one of the problems was an asynchronous worker – a piece of code that was running in the background without making itself well-known. Imagine, for instance, that you are a grocery store manager and have instructed your staff to keep the storefront tidy and clean. One day, you decide to make a display of your fresh breads in the window to entice all who might walk by to come in for some pastries or pretzels. Strangely, every time you return to the front of the store you notice the missing bread and replace it, confused as to where it’s going. Meanwhile, unbeknown to you, one of your employees routinely makes his rounds and sees the bread, thinks it was misplaced, and promptly brings it to the bakery section, placing it on the shelf. This is what was happening to me in the code.

Despite the fact that the solution was simple – instruct the clerk to leave the bread alone – you spend hours trying to figure out who might be swiping it away or how it keeps disappearing. Programming and engineering large systems is often like this: trying to see the whole picture where multiple actors are at work and where different components from different teams all interact.

Remember the Mars Climate Orbiter? Thankfully my mistakes were much more forgiving than the wrong unit conversions were for that space probe. Still, I learned a few lessons through this experience.

Sizzle sizzle sizzle...not what NASA wanted to hear.

First, it was a good reminder to take real breaks from my work when I get stuck in a rut. We can spin our wheels in a certain direction because we think the answer lies at its end, but if we keep on spinning them and get nowhere, we (or at least I) need a chance to let our minds reorient themselves. “Thinking outside of the box” might even be a little too cliché and overkill to describe this because we really just need to get out of the tedious cycle we put ourselves in. Try something different. In this case, I solved the problem this morning after a full night’s rest and after pushing aside all of the work I had spent on it over the past few days. In the end there was an “aha moment” and I had the solution within a few minutes.

Second, I learned to remember the things that aren’t visible. Maybe it’s a piece of code that someone else wrote that you never saw that runs in the background or maybe it’s a whole repressed segment of society, but there’s always more to be seen than meets the eye and we’re not capable enough or thoughtful enough to always keep that in mind.

http://en.wikipedia.org/wiki/Jackson_Pollock

In conclusion, lots of my experience with engineering is like this: you spend heaps of time trying to visualize the end but when it’s all said and done, the solution is basic – like an honest-to-goodness Jackson Pollock.