C++ vs. Python: Perspectives...
I have previously discussed a speed comparison between C++ and Python. For the nitpicky, it was actually a comparison between gcc/g++ 4.4.1 and CPython 2.6.4. Now, I will try to address where Python beats C++: ease of writing and understanding code.
As a vehicle for this, I am using the classic "calculator problem". That is, given a mathematical expression such as "2+3*(5-2)-(9*2)^(4-2/2)", the program should give the result, which is -5821 in this case. This is a particularly difficult problem because of parentheses and because of the order of operations in math. This regular mathematical notation is called infix. To convert it to something easier to calculate, postfix notation is used.
Quick explanation of postfix: 1+2 in infix is 1 2 + in postfix. Basically, an operator can only be applied once it has two numbers to its left. This completely eliminates the need for parentheses, since 1+2*3 becomes 1 2 3 * +, and (1+2)*3 becomes 1 2 + 3 *. Sadly, this system does not work very well for human use because 1 2 + 3 * can be misread as 12 + 3 *, which makes no sense. However, computers have no problem with that, so by converting to postfix first, we simplify our problem by a ton.
The way the algorithm works is as follows (thanks to my Data Structures and Algorithms class for reminding me of it):
- Parse the inputted expression into a list of numbers and operators (parentheses count as operators)
- Create a stack data structure for operators, and a list for the postfix equation.
- Cycle through all elements in the resulting list. For each of them:
- If it's a number, put it directly at the end of the postfix equation
- If it's a left parenthesis (``(``), put it directly on top of the operator stack.
- If it's an operator, check if the operator is of a higher precedence than the topmost item in the operator stack. If it is of equal or lower precedence, take the top operator off the operator stack and put it at the end of the postfix equation. Repeat until the topmost item is either of lower precedence, or is a left parenthesis, or the stack is empty. Then, put the operator on the top.
- If it's a right parenthesis (``)``), remove one operator from the top of the stack and put it at the end of the postfix equation. Repeat until a left parenthesis is removed. Do not put any parentheses on the postfix equation
- Take any remaining operators in the operator stack off the top one by one and put them at the end of the postfix equation. The postfix equation is now ready.
- Create a stack data structure for numbers.
- Cycle through the elements in the postfix equation, left to right. For each of them:
- If it's a number, put it directly on top of the number stack.
- If it's an operator, take the top two numbers off the number stack, and perform the operation on those two. The second number pulled off the top of the stack is the first in the expression. Place the resulting number on top of the number stack.
- The answer is the only item left on the number stack. You're done!
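The steps above can be sketched roughly as follows. This is not the uploaded Python version, just a minimal illustration of the same algorithm. The names parse, to_postfix, evaluate, PRECEDENCE and APPLY are made up for this sketch, as is the precedence table, and it assumes non-negative numeric literals (no unary minus) and well-formed input.

```python
import operator
import re

# Assumed precedence table for this sketch: higher binds tighter.
PRECEDENCE = {'+': 1, '-': 1, '*': 2, '/': 2, '^': 3}
# Assumed mapping from operator symbol to the function that applies it.
APPLY = {'+': operator.add, '-': operator.sub, '*': operator.mul,
         '/': operator.truediv, '^': operator.pow}

def parse(expr):
    """Split the expression into floats and operator/parenthesis strings."""
    tokens = re.findall(r'\d+\.?\d*|[-+*/^()]', expr)
    return [float(t) if t[0].isdigit() else t for t in tokens]

def to_postfix(tokens):
    """Convert a list of infix tokens into a postfix list."""
    postfix, operators = [], []
    for token in tokens:
        if isinstance(token, float):
            postfix.append(token)
        elif token == '(':
            operators.append(token)
        elif token == ')':
            # Move operators over until the matching '(' is found, then drop it.
            while operators[-1] != '(':
                postfix.append(operators.pop())
            operators.pop()
        else:
            # Move operators of equal or higher precedence over first.
            while (operators and operators[-1] != '('
                   and PRECEDENCE[operators[-1]] >= PRECEDENCE[token]):
                postfix.append(operators.pop())
            operators.append(token)
    # Whatever is left on the operator stack goes to the end, top first.
    while operators:
        postfix.append(operators.pop())
    return postfix

def evaluate(postfix):
    """Evaluate a postfix list using a number stack."""
    numbers = []
    for token in postfix:
        if isinstance(token, float):
            numbers.append(token)
        else:
            right = numbers.pop()   # first pop is the second operand
            left = numbers.pop()
            numbers.append(APPLY[token](left, right))
    return numbers.pop()

print(evaluate(to_postfix(parse("2+3*(5-2)-(9*2)^(4-2/2)"))))  # -5821.0
```

Because every number is converted to a float here, the result prints as -5821.0 rather than -5821.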
(Image: a non-programmer upon seeing the above.)
For those who would rather look at my code, I have uploaded the C++ version and the Python version.
You can judge the readability of them yourself, but I will just point out some differences:
Duck Typing
In Python, I don't need a base class to be able to make a list or other data structure of mixed elements. Python doesn't care what class an object belongs to. It just works with it. Similarly, the objects passed into methods can be anything. The pros? Conciseness of code, for one. What is:
vector<Piece*> parse(const string& expr)
in C++, is, in Python:
def parse(expr)
The downside? Unexpected inconsistencies in what classes your objects are can cause problems long after the program appears to be running fine. This is what happened to me with the wrapper problem. C++ maneuvers around that by informing you of type mismatches at compilation time.
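Here is a small, made-up illustration of that downside (the total function and the pieces list are hypothetical, not taken from the calculator code). The wrong type enters the list without complaint, and the error only shows up later, when the value is actually used:

```python
def total(pieces):
    """Add up the pieces; works for anything that supports +."""
    result = 0
    for piece in pieces:
        result = result + piece
    return result

pieces = [1, 2, 3]
pieces.append("4")   # a string slips in; nothing complains at this point

try:
    total(pieces)
    failed = False
except TypeError:
    # The mistake only surfaces here, far from where it was made.
    failed = True
```

A C++ compiler would have rejected the equivalent of the append call before the program ever ran.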
Class Awareness
In the C++ version, to be able to tell Operators and Values apart, I had to write a specialized isOperator() method, with a different implementation in each subclass of Piece. This is because a C++ object accessed through a base-class pointer does not readily tell you what class it really is (short of RTTI facilities such as dynamic_cast or typeid). What you can do with it depends on the type it is declared as. Python objects, on the other hand, have a __class__ attribute, which refers to the class object of the class they are. There is also the isinstance() function for the same kind of functionality.
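As a rough sketch of what that looks like in Python (the class layout below is a simplified guess for illustration, not the actual classes from my program):

```python
class Piece(object):
    """Hypothetical common base class, mirroring the C++ Piece."""
    pass

class Value(Piece):
    def __init__(self, number):
        self.number = number

class Operator(Piece):
    def __init__(self, symbol):
        self.symbol = symbol

pieces = [Value(2), Operator('+'), Value(3)]

# Every object knows its real class, no isOperator() method needed.
kinds = [piece.__class__.__name__ for piece in pieces]

# isinstance() also returns True for subclasses of the given class.
operators = [piece for piece in pieces if isinstance(piece, Operator)]
```

Here kinds ends up as ['Value', 'Operator', 'Value'] and operators holds the single '+' object.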
Anal Code
For the same reason C++'s strict typing is a good thing, C++ keeps track of which references/pointers/values are constant and which are not. This is to report more of the mistakes programmers make at compilation time, rather than leaving them to slog through tons of output that's wrong for a very strange reason. Take the following function, for example:
// Inside some container class of some kind
void printElementsUntilNotOne() {
    int i = 0;
    while(elements[i] = 1) {
        cout << elements[i];
        i++;
    }
}
Not only will this code set all the elements in the array to 1 (the loop condition uses = where == was intended), but it will most likely segfault once it runs past the end of the array. However, since the programmer knows that such a method should never change the elements, but merely print them, a const can be added at the end of the declaration to specify just that.
void printElementsUntilNotOne() const {}
The compiler would then complain about setting elements to 1, not allowing the program to ever run in its broken form. Since code like this could be modifying a database and has the potential to wipe data, it's a useful safety feature. The downside? Very, very anal-looking code. Plus having to substitute const_iterators for iterators, and other such inconveniences.
Standard Libraries
Check the number of imports I had to do in Python vs. the number of includes I had to do in C++. It might be a personal taste, but I would rather have a fully-stocked standard library to start with rather than having to import its bits and pieces.
Plus, there are some nonsensical choices made in the C++ libraries that just bother me. For example, the pop() method of a stack does not return the popped element. Why?! Ever since the first stack I implemented in Pascal, it has seemed like common practice/courtesy to return the top value!
In addition, for a 21st-century language, C++ lacks a lot of ability (and inherits this problem from C) to turn strings into numbers. Do I really have to use a stringstream to do this? And why doesn't the stringstream have a simple clear() functionality? clear() only clears its flags, and then I have to set its contents to empty separately. Wouldn't it be simpler to have clear() and clearFlags()?
Then again, Python's stack implementation isn't very optimised: it's the basic list itself. For most use cases this doesn't make a difference, and the choice does slim the library down, but when large amounts of clears are necessary, it may be slowww... Slower than it already is.
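For comparison, this is roughly what the same chores look like on the Python side. It is a small sketch with made-up variable names, using only built-ins:

```python
# A plain list doubles as a stack.
stack = []
stack.append(3)      # push
stack.append(7)      # push
top = stack.pop()    # pop removes the top element AND returns it

# Strings turn into numbers with the built-in constructors.
number = float("2.5")
count = int("42")

# Emptying the stack in place, so other references see it emptied too.
del stack[:]
```

After this runs, top is 7, number is 2.5, count is 42, and stack is empty.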
Code Neatness
Compared to Python code, C++ code is symbol-laden and syntax-heavy. It's a personal choice, but it makes me keep C++ just for "serious business" coding, while I am more than happy to use Python for my smallest programming whim.
SLOC (Source Lines Of Code)
C++ programmers are traditionally (is this actually true?) paid more than Python developers. Maybe because Python code is easier to write, maybe because of the smaller learning curve, maybe because C++ is more secure against stupid programmer mistakes. While I don't quite understand the former, I can sort of understand why the average C++ programmer may be more "pro" than the average Python programmer.
However, what ticks me off is when people use the number of lines of code as a justification for higher pay. Just because the C++ program has two times as much code as the Python one (195 vs. 92) does not mean it's worth any more.
Heck, I can rewrite the Python program to work just as well in one line. Watch:
print(eval('**'.join(raw_input('Expression? ').split('^'))))
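The one-liner above is written for Python 2 (raw_input). The same trick can be sketched as a small function that also runs on Python 3. The name calculate is made up here, and since eval executes whatever it is given, this is only reasonable for input you trust:

```python
def calculate(expression):
    """Evaluate an infix expression by rewriting ^ as ** and calling eval."""
    return eval(expression.replace('^', '**'))

result = calculate("2+3*(5-2)-(9*2)^(4-2/2)")
```

Two small differences from the stack-based algorithm are worth keeping in mind: Python's ** is right-associative, and on Python 3 the / operator always produces a float, so the result comes back as -5821.0.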
I'll agree with lots of the issues with C++ you mention. The const keyword has to be my biggest annoyance with C++. It is useful in some corner cases, but overall it causes more annoyance than it prevents, in my opinion anyway. References are also really annoying when overused, and the lack of a hash table in the STL is annoying, though there are certainly extensions that support them.
However, there are many many pages about C++'s pitfalls, so instead of going down that road, I'd like to answer this: "For example, the pop() method of a stack does not return the popped element. Why?!"
This is true for all of STL's pop functions. This is done for two reasons:
1) Speed. Objects in these containers are stored by value. top() returns the value by reference, which is fast. This is possible because the object being returned still exists in the container, so a reference is enough. Your code can then do whatever it wishes with that object. If pop() were to return the object as well, it would have to return the object by value. That operation would invoke the copy constructor, which is inefficient and unnecessary.
2) Prevent possible exceptions. As I said in (1), if pop() were to return the top object by value, the copy constructor would be called. This means that exceptions could possibly be thrown by the object's copy constructor.
Python isn't perfect either. For example, the Python reverse() method of a list doesn't return the reversed list. Why?!
I kind of like Ruby's solution to that problem, where you put an exclamation point if you really mean it (and want your list to be changed), and just call the method normally if you want it to return stuff and not change your list.
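For what it's worth, Python makes a similar split, just without the exclamation point: the method changes the list in place and returns None, while the built-in reversed() and slicing hand back a new sequence. A quick sketch (variable names are arbitrary):

```python
numbers = [1, 2, 3]

# These leave the original list untouched and give back a new one.
copy_reversed = list(reversed(numbers))
sliced = numbers[::-1]

# This changes the list in place and returns None.
result = numbers.reverse()
```

Afterwards copy_reversed and sliced are both [3, 2, 1], result is None, and numbers itself has become [3, 2, 1].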