Monday, February 1, 2010

Python 3000, or Confuse-A-Coder

(Photo by: Matthew W. Jackson)

So here I am, newbie pythonista. Writing code. On my computer. On other people's computers, using Portable Python. Doing new, geeky things. Impressing, er, boring my wife. (Nitey-nite, honey.) This is truly beautiful. I now almost grok Dive Into Python, and have actually even read parts of it. I can even pick Guido van Rossum out of a police lineup, if needed.

OK, to be honest, How To Think Like a Computer Scientist is more my speed, and I have worked through all of that. (Alas, I still think like Mike.) I even did the entire MIT 6.00 class (which now has video!), although I do not yet grok the dynamic programming solution to the knapsack problem. I did study electrical engineering, and even earned a BSEE, but math/CS types study some different stuff, I think. Or perhaps I was dozing through class? Either could be true, as far as I can tell.

Things change over time. Some changes are a good thing. Isn't there anything you would like to change in your life? Wouldn't you like a few do-overs? I sure could use a few. The Python guys seem to think so, too. Python is transitioning to Python 3, and things are a bit different. Some things have been really done over. Here are some differences that I have found, explained simply.

Print is now a function and not a keyword.
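This means you have to say:
print()
instead of:
print
You can no longer say:
print 'spam spam spam',
to suppress the newline. You now must say:
print('spam spam spam', end=' ')
(The end argument replaces the trailing newline, here with a space, which is what the old trailing comma did. The sep argument only controls what goes between multiple arguments, so it won't help here.)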

Yes, I know, this is gripping material. Apparently someone, somewhere wants to be able to redefine print to do something useful, like making spam musubi topped with foie gras, or confusing a cat. This is not possible with print defined as a keyword, so get used to the parentheses. A good way to do that is to put the following line near the top of all your Python 2 code:
from __future__ import print_function
This will enforce the Python 3 syntax, and you won't have to change your code if you ping-pong between Python 2 and Python 3, as I do.
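Here is a small sketch of what the print function buys you, written for Python 3. The helper names render and shout are made up for this example, and the io.StringIO buffer is only there so the output can be inspected instead of scrolling past on the screen. On Python 2.6 or later, the from __future__ import print_function line should give you the same print() call syntax, though I have not tried this exact snippet there.

```python
import io


def render(*args, **kwargs):
    """Send print() output to an in-memory buffer and return the text."""
    buffer = io.StringIO()
    print(*args, file=buffer, **kwargs)
    return buffer.getvalue()


# Default behavior: arguments joined by a space, newline at the end.
default_output = render('spam', 'eggs')

# end='' drops the trailing newline entirely.
no_newline = render('spam', 'eggs', end='')

# sep='' only changes what goes between the arguments.
no_separator = render('spam', 'eggs', sep='')


def shout(*args, **kwargs):
    """A hypothetical stand-in for print that upper-cases everything."""
    print(*[str(arg).upper() for arg in args], **kwargs)


# Because print is an ordinary function, a replacement can be passed
# around and called like any other value.
shout_buffer = io.StringIO()
printer = shout
printer('spam', file=shout_buffer)
shouted = shout_buffer.getvalue()

print(repr(default_output), repr(no_newline), repr(no_separator), repr(shouted))
```

The shout function is the "redefine print to do something useful" idea in miniature: with print as a keyword there was nothing to pass around or wrap.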

Integer division can now return a float. 
Before, Python acted like, well, the C programming language that I remember learning at night school. If you divided two integers, your result was always an integer. So 5/2 would return 2
(Note that the sentence lacks a period because 2 is an integer, while 2. is a float. Adding a period is confusing.)

In Python 3, dividing two ints with / gives you a float. In Python 3, 5/2 will return 2.5
(Again, I left the period off on purpose.) This is the correct answer, after all. I think Guido van Rossum changed his mind here. No biggie, unless you do as much crap-ass integer calculation as you get from working Project Euler problems.

To get used to this new behavior, put the following line near the top of all your Python 2 code:
from __future__ import division
If you want to use integer division, use the integer (floor) division operator // instead of the division operator /
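The two operators side by side, as a minimal sketch for Python 3 (the variable names are just for illustration). The same results should come out of Python 2 with the from __future__ import division line in place.

```python
# True division: / gives a float, even for two ints.
true_quotient = 5 / 2

# Even when the division comes out even, / still gives a float.
even_quotient = 4 / 2

# Floor division: // gives an int when both operands are ints.
floor_quotient = 5 // 2

# // rounds down (toward negative infinity), not toward zero.
negative_floor = -5 // 2

# divmod() returns the floor quotient and the remainder together.
whole, remainder = divmod(17, 5)

print(true_quotient, even_quotient, floor_quotient, negative_floor, whole, remainder)
```

The negative case is the one to watch if you are porting C habits: C truncates toward zero, while // floors.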

If you want to work on Project Euler problems, seek professional help.

Strings have changed.
I've just started working with this. According to Guido,
Python 3.0 uses the concepts of text and (binary) data instead of Unicode strings and 8-bit strings. All text is Unicode; however encoded Unicode is represented as binary data.
I'm still sorting this out so I can't help you much here. To get Python 3 strings, er, text literals, put the following line near the top of all your Python 2 code:
from __future__ import unicode_literals

My current project, a Python synthesizer, er, noisemaker, pukes and errors out if I use this line. It works fine in Python 2.6.4 without it. I seem to be confused about using the wave and struct modules. At least I think that's the problem. I'm not using much else in the way of strings, er, text in the program. I plan to sort this out down the road.
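For what it's worth, here is a minimal Python 3 sketch of the text versus binary data split, using struct because that is where audio samples usually end up. This is not the noisemaker code and I am not claiming it explains that error; the sample values and the '<h' format (little-endian signed 16-bit) are assumptions picked for illustration.

```python
import struct

# Text is str (Unicode). Encoding it produces bytes; decoding goes back.
text = 'spam'
data = text.encode('utf-8')
round_trip = data.decode('utf-8')

# struct.pack() returns bytes, not text.
one_sample = struct.pack('<h', 1000)

# Binary chunks are joined with a bytes separator (b''), not a str one.
samples = (0, 1000, -1000)
frames = b''.join(struct.pack('<h', value) for value in samples)
unpacked = struct.unpack('<3h', frames)

# Python 3 refuses to mix the two types silently.
try:
    'spam' + b'eggs'
    mixing_allowed = True
except TypeError:
    mixing_allowed = False

print(type(data).__name__, one_sample, len(frames), unpacked, mixing_allowed)
```

The design point is that anything headed for a binary file is bytes from start to finish, and text only becomes bytes through an explicit encode().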

A lot of functions that used to return a list now return an object that is not a list.

Guido writes:

Views And Iterators Instead Of Lists

Some well-known APIs no longer return lists:
  • dict methods dict.keys(), dict.items() and dict.values() return “views” instead of lists. For example, this no longer works: k = d.keys(); k.sort(). Use k = sorted(d) instead (this works in Python 2.5 too and is just as efficient).
  • Also, the dict.iterkeys(), dict.iteritems() and dict.itervalues() methods are no longer supported.
  • map() and filter() return iterators. If you really need a list, a quick fix is e.g. list(map(...)), but a better fix is often to use a list comprehension (especially when the original code uses lambda), or rewriting the code so it doesn’t need a list at all. Particularly tricky is map() invoked for the side effects of the function; the correct transformation is to use a regular for loop (since creating a list would just be wasteful).
  • range() now behaves like xrange() used to behave, except it works with values of arbitrary size. The latter no longer exists.
  • zip() now returns an iterator.
I grok some of this, and I have run into problems with these changes in the wild. Especially when plagiarizing, er, copying, er, using other people's code. You can't make lists anymore by saying things like:
s = range(1000)
In Python 3, s is not a list; it is a range object. This idiom is real popular in the wild. Remember this when you use Google to find someone else's code to solve your problem. Anything that treats s as a real list (sorting it in place, appending to it) doesn't work any more. You must explicitly create the list:
s = list(range(1000))
if you really want s to be a list.
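A quick Python 3 sketch of the changes Guido lists, using a made-up dictionary d. None of this is from my own projects; it just pokes at each object to show what it is and isn't.

```python
d = {'spam': 3, 'eggs': 1, 'ham': 2}

# d.keys() is a view, so there is no .sort() method on it any more.
keys_view = d.keys()
keys_have_sort = hasattr(keys_view, 'sort')

# sorted() builds a new sorted list of the keys.
sorted_keys = sorted(d)

# range() is a lazy sequence: it supports len() and indexing,
# but it is not a list.
s = range(1000)
range_is_list = isinstance(s, list)
tenth = s[10]
s_list = list(s)

# zip() returns an iterator, which can only be consumed once.
pairs = zip('ab', [1, 2])
first_pass = list(pairs)
second_pass = list(pairs)

print(keys_have_sort, sorted_keys, range_is_list, tenth, first_pass, second_pass)
```

The once-only behavior of iterators is the part that bites hardest with borrowed code: the second loop over the same zip() or map() result quietly does nothing.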

I tend to read in numeric data from text files, and it winds up as strings in a list. Being the schemer that I am (more later, maybe, on the joys and sorrows of Lisp) I would use map() to convert all the strings in a list to integers. Here's a line of code that I wrote on 13 Mar 2009:
temp1 = map(int,data.split())
For Python 3, you would have to say:
temp1 = list(map(int,data.split()))
A better solution is to stop writing Lisp code in Python and use a list comprehension, which works in both Python 2 and Python 3:
temp1 = [int(x) for x in data.split()]
which is much more pythonic, and cleaner once you start comprehending list comprehensions.
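Here are the three spellings next to each other, as a small Python 3 sketch. The data string is a made-up stand-in for a line read from one of my text files.

```python
# Hypothetical line of whitespace-separated numbers from a text file.
data = '10 20 30 40'

# Python 3: map() returns a lazy iterator, not a list.
lazy = map(int, data.split())
map_is_list = isinstance(lazy, list)

# Wrapping it in list() consumes the iterator and gives a real list.
temp1_from_map = list(lazy)

# The list comprehension gives the same list in Python 2 and Python 3.
temp1 = [int(x) for x in data.split()]

# If all you need is one pass, a generator expression skips the list.
total = sum(int(x) for x in data.split())

print(map_is_list, temp1_from_map, temp1, total)
```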

There's an automagic script, 2to3, which I have never used because I feel that I am still learning Python, and I need to solve this kind of problem using my finely tuned mental machine. Or all by myself, anyway. It may be useful.

There is a -3 switch that will warn you about things that will not work in Python 3. I am starting to use this switch, but I think that some of the results are so not my problem to fix:

 mikey@hatshepsut:~/workspace/mysynth/src$ python -3 mysynth
/usr/lib/python2.6/site.py:1: DeprecationWarning: The 'new' module has been removed in Python 3.0; use the 'types' module instead.
  """Append module search paths for third-party packages to sys.path.

Another problem is that many useful libraries are not yet available for Python 3. No numpy yet, and of course nothing that depends on numpy. I think this stuff will exist in Python 3, but not yet. The best thing to do right now is to stick with Python 2 if you need something that doesn't exist yet in Python 3, but use the from __future__ import statements. (I'm working on it...). Otherwise, start using Python 3. Unless you're actually paid to write Python code, and your boss says No Python 3 for you!

At least, that's what I'm doing for now.

Cheers!
References:
What's New In Python 3.0 by Guido van Rossum
A clear explanation of list comprehensions by Olli
How To Think Like a Computer Scientist (Uses Python 2.)
MIT 6.00 Introduction to Computer Science and Programming (Uses Python 2.)
Dive Into Python 3 by Mark Pilgrim (A challenging read for me.)




