You have probably heard the virtues of Haskell and Ocaml being extolled by professors and intellectuals online. They have a cult/evangelist-like following that reminds me of Ruby a few years ago. The only problem is that they are almost as old as Ruby, yet they have not gained much of a mindshare. Here is an attempt to explain that situation.
First, here is why Haskell and Ocaml should be considered for a mainstream language.
The Good
- Strong Typing
- Better crash protection than C. No dynamic duck-typing uncertainty. Faster programs.
- Type Inference
- As many C# programmers have found out, its possible to have the convenience of type checking without having to type “ExtremelyLongTypeDeclaration” for each variable.
- Easier Parallelism
- The functional model makes it easy to create safe parallel code.
With these 3 killer features, it is confusing why so many people are still doing C# and Java.
The Bad and Ugly
Haskell
Refusal to fix hash tables.
Hash tables are a widely used data structure. Who can resist O(1)1 access? Apparently Haskell users. Haskell is notorious for having a slow hash table that people have complained about for years. The problem is that the Haskell garbage collector traverses the entire array “needlessly”. You can see a benchmark posted by Jon Harrop2 which shows the problem.
Haskell supporters will tell you that you should be using a tree structure anyway. However, no tree-like structure has the performance of a hash table. The best suggestion I had was to use a ternary tree which was O(log n + k) where n is the number of words stored, and k is the length of the key. This is still quite a bit worse than a hash table. Even though I have not done any actual benchmarks, a quick calculation will tell you that for k=10, a ternary tree would be slower for only 60,000+ elements. Not a big number.
1 actually O(keylength) where keylength is likely to be small
2 Harrop sells Ocaml documents, so he is not unbiased. However this is an actual gripe.
Lazy Evaulation
This is a big one that is unlikely to be fixed. Many Haskell supporters like to think that lazy evaluation is the best thing since sliced bread. However, I believe that it causes bigger problems than it solves.
It becomes very difficult to reason how long your code will run for. Everything becomes a series of thunks inside thunks, which also means that your memory usage will grow like a balloon, and then pop as its being evaluated. If you really need infinite lists (you don’t), you can always implement them as generators in a strict language.
Another side effect of laziness is in multi-threaded programs. It becomes possible for all the thunks to end up being evaluated on 1 thread, even though you specified it to run on multiple threads.
Simon Peyton Jones has said “the next Haskell should be strict”. It will probably be a very very long time before we see a strict Haskell.
In Windows and OS X, GHC (the main Haskell compiler) automatically forces your program to be GPL (open source) upon compilation.
This is a little known fact. When GHC compiles your program in those OS’s, it will statically link the LGPL GMP library. A static link (and distributing the program to others) will trigger the GPL clause which forces you to release the source code of your program. Although a fix is supposedly coming in a couple months, it requires the user to recompile GHC.
It is just a bit disconcerting to think that this was the default all these years.
Ocaml
Lack of parallel GC. Lack of concurrency.
Ocaml can only use 1 processor. It effectively has a GIL just like Python and Ruby. Here is a quote from the creator of Ocaml:
Too complex; too hard to debug (despite the existence of a machine-checked proof of correctness);and dubious practical interest… In summary: there is no SMP support in OCaml, and it is very very unlikely that there will ever be. If you’re into parallelism, better investigate message-passing interfaces.
- Xavier Leroy
Bizzare Syntax
This is a small point, but is probably one of the biggest reasons why it is not more mainstream.
Crashes on Windows
I used Ocaml a year ago, and experienced actual hard crashes when running the interpreter on Windows. Hard crashes are not supposed to happen in strongly typed languages if you don’t use foreign functions.
Related posts:








Based on what do you assert that these are reasons for not adopting the languages? While I can buy that lazy evaluation is a barrier to entry, I am *extremely* skeptical that the other points you raise about Haskell are at all relevant in people’s decision to adopt the language and toolchain.
Why? Hashtables and the GMP story on Windows are side stories. Users are simply not asking about them, because they are non-issues:
Hashtables: you can use hsjudy from hackage if you need mutable, imperative hashtables. It’s been available for years, http://hackage.haskell.org/package/HsJudy
GMP: the IHG has already funded development to make GMP a non-required component of GHC, and all systems already support dynamically linked GMP, http://haskell.forkio.com/gmpwindows I simply do not see developers avoiding Haskell because of libgmp.
There are plenty of reasons Haskell isn’t a mainstream language. These aren’t those.
Thanks for the response Don. How did you reply so quickly?
I took a look at HsJudy. I wondered, if its so good, why isn’t it incorporated into GHC? I found out that the answer is because it is LGPL and it is actually a tree structure (and thus still not as fast as a hash table).
http://judy.sourceforge.net/
http://www.nothings.org/computer/judy/
Also about GMP, I noticed that link. But my point was that if there are gotchas like that when people are compiling with default settings, it could put a business at risk if they inadvertently didn’t know they would make their source code GPL.
Be very careful not to confuse issues, as you’ve made a number of mistakes in this article.
Judy arrays: you’re misunderstanding them. They provide a fast, scalable hashtable type. They’re famous in fact, as a general purpose mutable dynamic array library.
“Judy can replace many common data structures, such as arrays, sparse arrays, hash tables, B-trees” … they’re a great hashtable library for C, that’s also available in Haskell.
If you find a compelling need for a data structure, that isn’t available on Hackage, please let the Haskell community know.
Regarding licensing — licensing is *always* a risk to a business. That’s why any sensible manager will evaluate the license of the tools they intend to use. The GHC use of libgmp – an LGPL library that is used in a wide number of language implementations – is well documented.
It is incorrect to say that statically linking libgmp makes your “source code GPL.”. It simply doesn’t, and doesn’t say anything in fact about what happens to your source code.
Now, redistributing a binary that is statically linked against an *L*GPL library, like libgmp, imposes a relinking requirement. That’s certainly something we might want to avoid, hence the process for dynamically linking.
I work at a company that sells applications implemented in Haskell. We pay attention to the licenses of all parts of our toolchain. Anyone considering using any software, in a business context, should be doing the same.
I haven’t confused any such issues.
I’ve taken a look at Judy arrays, which as described by their author, is a 256-ary trie with pipeline hacks. It appears to be very competitive with hash tables, but as the benchmark and theory shows, it still has some performance issues. If it was as good as you actually say it was, then we would all be using it, right?
And about LGPL, I have studied these licenses in depth. If you don’t distribute the program, the law is effectively unenforceable anyway.
Its just that for most of Haskell’s life, companies have been making money by selling distributed binaries rather than relatively recently selling it as a web service on the Internet (and thus avoiding the GPL clause).
If you take a look at mainstream toolchains such as GCC, Python, or Visual Studio, they all have license exceptions for the runtime library, and Python even has a multiprecision number library written from scratch. There is no need to worry about accidentally GPLing your source code should you choose to sell your software.
“If it was as good as you actually say it was, then we would all be using it, right? ”
Slick argument.
I think you missed my point about HsJudy: HsJudy shows that you can write a Haskell binding to any imperative, mutable hashtable implementation of your choice. There are even existing ones on Hackage for you to choose from. If you need a mutable hashtable for some reason, grab one from Hackage.
@John
I’m going to assume you are being sarcastic due to your blog content. Ignore this if you aren’t.
Its perfectly reasonable to question Judy arrays. Its been out for 8 years, yet this is the first time I’ve heard of it, and it is not being taught in any CS curriculum that I know of. Isn’t it silly to claim that it is as good or even better than a hash table? Extraordinary claims require extraordinary proof.
Even more amusing is the fact that in the Great Programming Language Shootout, you have people writing their own hash table from scratch in Haskell because the default one is so slow.
The primary reason I haven’t switched from Perl to Haskell (or similar languages) is the lack of an analog for CPAN. CPAN does have a lot of dreck, but it also has a decent implementation (at least partial) of pretty much everything I could ever want to do.
The language can be as neat as you could want, but unless there’s a decent toolchain, it’s nothing more than a toy.
Are you familiar with Hackage? We have > 1500 open source libraries and tools there, for pretty much everything you could want. More than any FP language (by a wide margin): http://hackage.haskell.org/
Just thought I’d point out that as a long-time Perl hacker, I find HackageDB to be almost as good as CPAN. Rarely have I been writing Haskell and been impeded by a lack of a library. (I can’t say the same about Common Lisp or even Ruby.)
cabal, cabal-install, and GHC’s handling of packages is also light-years ahead of CPAN.
It is a shame that people won’t check this out for themselves and instead assume that since Haskell originated in academia it’s unusable for “real problems”. Oh well, people are weird
“Its perfectly reasonable to question Judy arrays. Its been out for 8 years, yet this is the first time I’ve heard of it, and it is not being taught in any CS curriculum that I know of. Isn’t it silly to claim that it is as good or even better than a hash table? Extraordinary claims require extraordinary proof.”
It is perfectly reasonable to question them, but to argue that they aren’t a good solution to a given problem because we aren’t all using them or this is the first time you’ve heard of them, is not.
My apologies for the sarcastic response, that served no one.
My main point was that 8 years is plenty of time for a good groundbreaking algorithm to become mainstream.
It is basic knowledge that tree structures do not have the same performance characteristics as a hash table.
And it’s basic knowledge that hash tables are not god’s answer to sliced bread. Just because Perl, Python, Java, and friends ship with them out of the box does not mean they’re the most efficient map structure available. For one thing, they suck entirely at doing unions. For another, they have high memory overheads which can lead to catastrophic slowdowns with many small hashtables. For a third, in industrial-scale computing tries will generally win out due to structure sharing (why do you think they teach them for search-engine design?).
The reason they teach hashtables is because their suckiness is evenly spread so they perform well on average. For any task where you have deep knowledge about your data or access patterns there exists a datastructure which will outperform hash tables for that set of data and access patterns.
There’s another mistake (or oversight?) in this article. You talk about laziness make it possible for computations to migrate from one thread to another, which can have an impact on how much work is done in parallel.
This is true. Parallelism requires a careful mix of lazy and strict evaluation. Too strict, and you do too much work up front, too lazy, and work is deferred.
Finally, laziness as a concept is absolutely vital to futures, speculative evaluation, and the entire class of “parallel strategies” implemented in Haskell’s parallel library.
It’s not “all or nothing”. It’s the right evaluation for the right task, and true multicore parallelism seems to require a delicate mix of laziness, eager eval, and strictness.
We can all agree that parallelism requires both laziness and strictness.
However, it is the amount that you are mistaken about. In many parallel programs in strict languages, you have a small set of “lazy” jobs distributed to many different threads that evaluate that logic strictly, and most importantly in that thread. In a language that is strict by default, it is easy guarantee that this happens.
Having everything lazy by default in Haskell, makes this harder. It goes against the principle “common situations should be the default”. If you still don’t think so, I would like to remind you that Simon Peyton Jones and many mobile developers believe otherwise.
I’m the author of the strict-concurrency package, so yes, I’m well aware of this.
In particular, synchronization mechanisms should be strict by default — something that Haskell provides.
Just because Ocaml doesn’t do threads doesn’t mean it can’t do concurrency.
Mauricio Fernandez did very well in Wide Finder 2 with Ocaml using fork():
http://wikis.sun.com/display/WideFinder/Results
http://eigenclass.org/hiki/widefinder2-conclusions
You haven’t specified exactly *which* operations hash tables are fast at. Insertion? Union? Deletion? Intersection? Lookup? Difference? Also, which hash datastructure?
Just to illustrate my point, here’s a benchmarked, mutable associative array structure hacked together in a few hours. Anyone with a real need to “fix hashes” could have done this:
http://donsbot.wordpress.com/2009/09/26/very-fast-scalable-mutable-maps-and-hashes-for-haskell/
It scales well. Enjoy.
How about a non-LGPL version?
> Even more amusing is the fact that in the Great Programming Language Shootout, you have people writing their own hash table from scratch in Haskell because the default one is so slow.
Is that a fact!
You mean they didn’t just translate the hash table from Clean, that I translated from the simple_hash.h that Doug Bagley provided for his olde The Great Computer Language Shootout back at the end of the millenium?
Is your fact also why the Pascal program includes a Pascal translation of simple_hash.h ?
> Is that a fact!
Yep it is a fact. Did you disprove it? No. The benchmark is right there in front of you showing that Haskell is taking 10x longer than C++.
> Is your fact also why the Pascal program includes a Pascal translation of simple_hash.h ?
What is your point? Haskell supporters often cite its speed and brevity in GPLS. Instead of using the bundled hash table, the fastest Haskell program needed to implement their own, losing out in brevity. If this is not indicative of a problem in GHC, then what is?
> Yep it is a fact. Did you disprove it? No. … What is your point?
Listen and you might hear the point.
Initially, simple_hash.h was a basis for comparison across languages.
Initially, the obvious comparison was between Haskell and Clean, so it was obvious to translate the hash table from the Clean program.
Your speculation about why the Haskell program includes a hash table is wrong.
Actually you are wrong. Next time be sure to read the entire article before you make uninformed assumptions.
You’ve obviously missed important information such as this link below and unsurprisingly made a Haskell fanboy comment without verifying the truth to your statements.
http://www.haskell.org/haskellwiki/Shootout/Knucleotide#On_mutable_data_structures_in_Haskell
I’m surprised you didn’t mention F#, which has .NET for its concurrency support, and IIRC has OCaml-like syntax.
Its not old enough.
While it does solve all those problems you’ve mentioned, it opens up a whole can of worms on its own:
1. The only implementation has a non-commercial Microsoft license. MSR-SSLA
2. It uses the .NET framework.
3. Bad linux support: Mono uses a conservative collector (hello memory leaks).