C++ vs Java vs Python vs Ruby : a first impression
Executive Summary
I am a language agnostic journeyman programmer. I am not a fan of a
particular language (I almost said 'fanboy') but
thats a bit inflammatory). I just want to write useful programs
and have fun doing it. I know C++ and Java pretty well. I did some
beginner work in Python and Ruby. I then
came up with the following conclusions. But before you flame, read the
whole article.
Go right to the
side-by-side comparison
- C++ vs Java
Java garbage collection is the big
productivity gain Java is significantly slower than C++ C++ is
(much) harder to code correctly than any of the others - Java vs
Python/Ruby
Python/Ruby interpreted execution and
dynamic typing are big productivity gains over Java. Python/Ruby are
slower than Java Python/Ruby
programs need less extraneous scaffolding (cleaner code)
There are two important tradeoffs : [interpreted vs. compiled] and
[static vs. dynamic typing] - Python vs. Ruby
nearly
equivalent
intro
When running various distributions of Linux, I always ran into the
choice of KDE or GNOME. There are plenty of
advocates on both sides, but there was no overriding authority. Then
recently Linus Torvalds came out with a
definitive opinion. He took the unequivocal position that
KDE is best. Not that he is necessarily the final arbiter of user
interfaces, but at least he provides a strong
datapoint, and since he is smarter than me and since all the other
opinions seem to come from biased sources, I
can now pick KDE and feel better about it. Paraphrasing the old IBM
criteria, 'no one was ever fired for following
a Linus directive'. Heh, after all that it turned out that I wanted to
use Ubuntu which works best with GNOME so
I ended up with that for now. So 'most practical' won out over 'best'.
In the meantime I realized I needed to learn a new language and the
current buzz is Python and Ruby.
Again I couldn't find a definitive answer of which one is best. From all
the buzz,
I came up with a vague impression that Ruby is more pure and
is set to win in the long run, but that Python is currently more
practical for now. And Google uses Python, which
is a significant datapoint. They aren't idiots over there.
To see what I could figure out myself, I decided to code up something in
Java, then port it to Python and Ruby and
see how I felt about each, and try to identify where the big wins are
for each language.
One caveat. If something significant is missing from a language, like
garbage collection, then I don't
want to hear a response that says
"well, if you use XYZ unsupported library, or you do ABC convoluted
technique, then you can do the
same thing in [put language name here]". I am trying to evaluate the
STANDARD here, since of course you
can probably do anything in any language including assembler if you work
hard enough. And the problem with using a nonstandard
library is not just the extra integration work, it's that you are basing
your code on something that may
fall by the wayside later on and then you are stuck. Sometimes it's
worth it but that has to be proven
on a case-by-case basis as far as I am concerned.
history
I started in C in 1985, learned C++ in 1990 (Zortech C++) and have been
using it ever since. I learned Java
in the mid-90's when it was first coming out, and found three big win's
for Java over C++: Garbage collection,
portability and simplicity. Garbage collection and simplicity created
big productivity gains, and portability
is portability. Not having done a garbage collected language before, the
productivity gain was readily apparent.
The simplicity of Java over C++ was really nice. When coding C++, I
needed Meyer's Effective C++ on my desk
at all times
to be sure I wasn't invoking some weird type coercion or copy
constructor/assignment operator anomaly. And don't
even start with templates. With
Java I never needed that because it is just simpler.
And the Java libraries were more comprehensive and string handling was
easier. So in general I was more productive coding
away in Java. I still liked C++ but it seems that when programming C++,
the fun is in figuring out the language and library
, like solving a puzzle. That leaves less time to spend on solving the
application domain problem.
the code
The attached code samples are implementations of a Red-Black tree
algorithm adapted from descriptions in
"Algorithms in C++", Sedgewick and "Introduction To
Algorithms",Cormen/Leiserson/Rivest. I picked this
because it was short but had some complexity.
code notes and disclaimers:
- commenting is sparser than usual to avoid obscuring code
- I
probably made some convention errors in Python and Ruby due to
ignorance of the proper idioms
- all these programs compile and/or
run without warnings and output
the same result
- I believe the programs to be correct. there may
be bugs but if so
they are in all 3 versions
- Java 5.0 SDK,Python 2.4, Ruby 1.8.3,
C++ Microsoft Visual C++ 2005
porting
It was surprisingly easy to port the Java code to Python and then port
the Python to Ruby. A lot of it was regular
expression search and replace, getting some naming conventions right and
adapting to a few language differences.
During the porting process, the two big gotchas I ran into were Python
block indenting errors and Ruby's horrible
compiler diagnostics.
Porting the Java code to C++ was much more a hassle. I attempted to
make use of as much static type checking
mechanisms as I could. In Java I used generics for the tree, and in C++ I
used templates for the container and
'const' where appropriate. The big gotchas on porting to C++ were:
- The dichotomy between primitive types and objects in C++ is much
more pronounced even than Java (and Java is worse off
than Python or Ruby). This dichotomy makes it hard to write a class that
supports both primitives and objects. My
implementation might need some fixups to work with objects rather than
'int'.
- Java,Ruby and Python all use a consistent reference only
scheme to
refer to objects which are always on the heap
or equivalent. In C++, you can have a statically
declared objects, a pointer to an object, or a reference to an object,
each with features and limitations. A C++ 'reference' is
not the same thing as a reference in the other languages. C++ really
wants you to use pointers. These alternatives means that
when you write something in C++ you have to come up with a consistent
strategy for using the 3 types of object access, and your
strategy might not be the same as what others prefer. There
is 'more than one way to do it'.
- The lack of built-in mechanisms or even just conventions for
operations that should be common across types means
you have to make things up. Like converting a type to string
representation. All the other languages have support of one
kind or another but in C++ you have to make up your own convention
- Maybe
its just me, but C++ always leaves you wondering what you
might have done wrong. Its hard to tell. If you read Meyer's Effective
C++
you see that there are numerous detailed infrastructure things like
constructors and assignment operators that you have to get exactly right
or things fail
at runtime. C++ is really hard to get right, and I never feel totally
secure that I did it properly
In my opinion (and I have written a lot of C++), use C++ only where you
have to for compatibility or performance reasons, or where you
arbitrarily decide that
you would rather use C++ because its more fun because its harder. As Tom
Cargill (a noted C++ guy) said,
"If you think C++ is not overly complicated, just what is a protected
abstract virtual base pure virtual private destructor and when was the
last time you needed one?".
python block indenting
It took me a while to get my editor (JEdit) happy with Python and
getting to not use tabs. Fortunately
I never screwed up the file so bad that the code didn't work but I
always had an unpleasant uncertainty
about the indentation simmering in the background. Some (or all) of this
may be prejudice. I really liked braces better
than either Ruby Do-End or Python indenting, at least when I was coding.
On the other hand, a properly indented Python file looks much much
cleaner and is easier to read than any of the others because you don't
need all the block closing symbols. However
the explicit 'self' argument makes it look less clean than it could.
ruby syntax errors
Many of the compiler diagnostics I got during the Ruby porting simply
said "syntax error" and gave the line number for
the last line in the file. Great. I spent a lot of time doing binary
searches on my code to find the error source.
The visitor pattern
One thing I did differently in each language was try to adapt a
'Visitor' pattern (for traversing the tree) to
the preferred idiom for each language. You could of course simply code
up a Visitor class that is nearly identical
for each language, but instead I did the following.
- Java : one scheme : an anonymous class implementing a
predefined interface.
- Python : two schemes : a named class
similar to Java and just a
named function passed in as a parameter.
- Ruby : two schemes : a
lamba anonymous function, and a Ruby
block implementation
The Java and C++ approaches give you static type checking but takes a
lot more cruft to get going.
I found the Python named function parameter very convenient. But it
doesn't carry any state so if you
need state then you use a class. Surprisingly, I found the Ruby lambda
easier to
understand and implement than the Ruby 'block'. That is because my
traversal algorithm is recursive, and the lambda
just gets passed around as a parameter (like the python named function
parameter). I didn't exploit the
full potential of a lambda closure.
The Ruby block scheme (pun not intended) requires some tricky syntax in
the recursive calls, and
I could not find a good explanation of how to handle recursive use of
blocks in the Ruby documentation. I found
a single web hit with an example and after fiddling with it I got it to
work. I think I understand them now but it is still a bit
fuzzy. I mean, I know what to do now but it takes some concentration to
figure out what exactly is happening and why
the code looks like it does. I found that viewing a Ruby block as a
co-routine (per the documentation) and not as a subroutine to be the
best way
to understand the whole thing.
All that said about the Visitors, I am a Python/Ruby novice so possibly I
did things the hard way :)
interpreted vs. compiled
There has always been this tradeoff. In fact, the Python/Ruby vs.
Java performance controversy sounds a lot like the C/Assembler/Forth
discussions in the embedded systems world of the early 1980's. Forth was
interpreted, it didn't need a compile cycle and it had (supposedly)
productivity enhancing features that C didn't have. Development cycles
were much shorter with Forth. Performance was
not as fast as C or assembler but was close. The drawback of Forth was
the weirdness of the language. C won out and Forth
went to the dark corner of mostly forgotten languages.
Interpreted languages give you a much quicker development cycle,
especially on big programs. There is no doubting that.
Its simply a tradeoff of execution speed vs. productivity. Some
applications need the speed.
I think it is a premature optimization to say the "I like C/Java better
than Python/Ruby because they execute faster".
Interpreted is better if you can get it. When I was testing the code I
experienced the advantage of interpreted. I
didn't really measure performance but other sources show the
differences. But since Python/Ruby seem to interface to C/C++ pretty
readily, I would be very
comfortable working in the interpreted world and descending into the
netherworld of compiled C/C++ when required. Yes, Python
is actually compiled for a VM but you don't have an explicit compile
operation so it acts to the user like an
interpreted language.
static vs. dynamic type checking
Ok, I like the productivity increase provided by dynamic typing
because it eliminates a lot of scaffolding. I found
it quite interesting to see errors pop out at runtime that would
normally be compile time in C++/Java. These runtime errors were
obviously influenced by the paths taken in the test program (or how far
it got before it barfed). For a given run,
I clearly wasn't seeing all the instances of this class of errors as I
would have with static type checking.
Coming from a statically typed language background, my gut says that
dynamic typing creates a risk. The Python/Ruby bloggers say that if you
just unit test properly, then there is no problem. Brucke Eckel has a well
reasoned
essay on the issue. I would argue that expecting unit test to catch
typing errors has two issues:
- in a really big system, its hard to test exhaustively
- testing
for type correctness makes the programmer do the work that a
computer could do
We static typers may be wrong. I had a similar experience moving from
PVCS to Subversion version
control. Oh, My, God,
no locking? The code will be completely ruined in a week. But it turned
out to be a non-issue and Subversion added
so much less friction to the development cycle that productivity was
improved maybe 10%. The collective good experience overrode the
predjudice based theoretical 'proof'. The same argument can be made for
dynamic vs. static typing.
I wouldn't mind a separate 'lint' tool for Python/Ruby (is it
possible?). I use lint for C/C++ religiously. The whole
compiled language community is moving towards more static type checking
(Java Generics, for example) rather than
less. Are they all idiots? (don't answer that)
My conclusion is that dynamic 'duck' typing is more productive, more
pleasant, gives cleaner looking code but it incurs a risk that you
will get a runtime type exception in your application at some later
date. The risk is there, quit denying it.
the results
These results are meant to cover issues I noticed in the
porting/testing I did. Not an overall evaluation. If I
mention stuff that I didn't run into first hand, then throw that out.
C++ vs Java
- Garbage collection is THE big win for Java.
- Java
simplicity over C++ complexity is a big win for Java.
- C++ is
much harder to write and get right than Java or any of the
other choices
- C/C++ is way faster than Java
- Language
scaffolding requirements are similar for both
- C/C++ is the only
way to go for low level systems programming.
Java vs Python/Ruby
- interpreted vs. compiled is a big productivity win for
Python/Ruby
- dynamic typing is a big productivity win for
Python/Ruby
- Java is way faster than Python or Ruby
- minimal
scaffolding is a big productivity win for Python/Ruby.
Makes programming more pleasant not to have to build all the
infrastructure.
- mostly first class functions a big win for
Python/Ruby.
- built-in lists/arrays and hashes/dictionaries a
big win over Java []
and library based collections. Java 5.0 fixes some of this but
in Java collections still seem tacked on rather than integrated.
- dynamic
code loading in Python/Ruby is a big win. Yes you can do it
in Java but again, the cruft.
- Ruby OO completeness over Java
dichotomy between primitive types vs.
objects is a big win for Ruby, less so for Python.
- There is
some weirdness in Python and Ruby lexical scoping of names.
The documentation for each has several
warnings about edge cases where names don't bind in the expected way.
This gives me a queasy feeling although in
practice it may not matter. Another win for static type checking.
- Java
'Comparable' interface ugly compared to Python/Ruby built in
comparison mechanisms that require only that
a single function be implemented to get the full set of comparison
operators. An example of excess Java
scaffolding.
- lack of multiline comments in Python/Ruby was
annoying
TRADEOFFS
- static typing is a correctness win for Java, especially with
Generics in 5.0. The C++/Java trend is toward stronger static type
checking, not less
- dynamic typing is a productivity win for
Python/Ruby at the cost of
some risk
- interpreted vs. compiled trades off execution speed
for shorter
development cycles.
Python vs Ruby
none of these are that important
PYTHON WINS
- Ruby's compiler/runtime error messages were mostly 'syntax
error'
with no help. in many cases almost useless
- Why does Ruby use
rescue/ensure when the rest of the world has
settled on try/catch/finally? I mean, its an arbitrary
choice so why not follow the general convention?
- Once the
indentation is correct, a Python program is the cleanest
looking
RUBY WINS
- somewhat uneasy over Python indenting vs. Ruby explicit 'end'.
probably a predjudice.
- Python requirement for explicit 'self'
parameter to methods and
instance variable access is very annoying
- Ruby OO completeness
is a win over Python.
EQUIVALENT
- Ruby blocks/lambda/yield seemed more or less equivalent (to me)
to
Python's named class or function. Didn't seem
a big win to be able to write an anonymous function inline. In fact, one
could argue that anonymous
classes/functions/lambdas reduce testability because they can't be
tested independently of the containing code. But on
the other hand I wasn't using lambdas in the most complete sense, in
which they can act on the containing environment
in a way that a named function can't.
A final thought on C++. To me the C++ Standard Template Library is
distinguished from the other language libraries
in that it seems to be much more mathematically thought out.
The containers and algorithms in the STL all have explicit runtime
complexity guarantees. There seems
to be much more computer science in the STL than in the other language
libraries. Java is sort of like that, whereas Ruby
and Python libraries seem much more ad-hoc. That probably has a lot to
do with their open community driven
approach to libraries. I really like how the STL was thought out and
designed.
conclusion
Java is more productive than C/C++. Use C/C++ only when speed or bare
metal access is called for.
Python/Ruby is more productive than Java and more pleasant to code in.
There is a big question on static vs. dynamic typing. I contend that
static typing has to be better
for the purposes of program correctness, but the required cruft reduces
productivity. If actual practice in large
systems shows that in fact runtime typing errors don't occur often and
are worth the productivity tradeoff, then I will
bow to dynamic typing.
I can't come up with a definitive answer to Python vs. Ruby. They seem
very equivalent. Would choose based on practicality
in a given situation. My general feeling was that Python annoyed me in
ways that Ruby didn't, but I think those
annoyances would disappear if I was using Python all the time.
|