Tuesday, December 22, 2009

The most important part of Google's open letter

Today, Google published a previously internal email. It talks about the importance of openness within the company. I thought it was extremely well-written, and brought up a lot of great and thought-provoking points. However, one part in particular really stuck out to me:

So if you are trying to grow an entire industry as broadly as possible, open systems trump closed. And that is exactly what we are trying to do with the Internet. Our commitment to open systems is not altruistic. Rather it's good business, since an open Internet creates a steady stream of innovations that attracts users and usage and grows the entire industry.


Generally, when a company professes its love for openness or charity or something similar, it does so with an airy "because we're so nice" attitude, and must be read with a grain of salt. This paragraph (and the explanations that followed) was extremely refreshing. There are very few philanthropic claims in this letter; it explains exactly why Google sees value in openness, and how it helps their business.

I really enjoyed reading this piece, and it was this aspect that made it enjoyable. Without it, it would be just another fluff piece. With it, it becomes a powerful explanation of Google's actions and intents.

Wednesday, December 16, 2009

Snide remarks against women in technology

"It's this culture of attacking women that has especially got to stop. Whenever I post a video of a female technologist there invariably are snide remarks about body parts and other things that simply wouldn't happen if the interviewee were a man."
- Blogger Robert Scoble, responding to threats against tech author Kathy Sierra

Dreaming of Floyd-Warshall

Last night, I was studying the Floyd-Warshall algorithm. I finished at around 11:30, read a little of The Hobbit (I'm rereading it in preparation for the movie), then went to bed.

I dreamed that I was sitting inside a table cell. I was looking up above and watching the empty cells above me get filled in one at a time. The last cell in a row would fill with some gibberish, and then the first cell in the next row would fill. The filling was getting closer and closer to me. Eventually, my cell was filled (I remember feeling very stressed at this point). I looked forward, and watched my neighbors fill. I looked under me, and watched the cells beneath me fill.

Suddenly, I was off on the sidelines, watching another table (that I was not a part of) get filled. Soon after this, the dream ended.

I can only conclude that, in my dream, I was a member of one of the cells in one of the tables of the Floyd-Warshall algorithm. Once the table I was in was filled, the algorithm incremented k and started on a new table.

I notice that I never saw a third table. Perhaps I was in table k == n - 1, and the table in front of me was the last one. Or perhaps this particular version of Floyd-Warshall was optimized (as it should be) to use O(|V|^2) space, and my table (and thus, my life) was deleted from memory after its successor was filled.

Or perhaps I just have really deep-rooted issues, and dreaming about being inside an algorithm is nature's way of warning me that I should see someone.

Sunday, December 13, 2009

The Knuth-Morris-Pratt Algorithm in my own words

Over the past few days, I've been reading various explanations of the Knuth-Morris-Pratt string searching algorithm. For some reason, none of the explanations were doing it for me. I kept banging my head against a brick wall once I started reading "the prefix of the suffix of the prefix of the...".

Finally, after reading the same paragraph of CLRS over and over for about 30 minutes, I decided to sit down, do a bunch of examples, and diagram them out. I now understand the algorithm, and can explain it. For those who think like me, here it is in my own words. As a side note, I'm not going to explain why it's more efficient than naïve string matching; that's explained perfectly well in a multitude of places. I'm going to explain exactly how it works, as my brain understands it.

The Partial Match Table



The key to KMP, of course, is the partial match table. The main obstacle between me and understanding KMP was the fact that I didn't quite fully grasp what the values in the partial match table really meant. I will now try to explain them in the simplest words possible.

Here's the partial match table for the pattern "abababca":


char: | a | b | a | b | a | b | c | a |
index:| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
value:| 0 | 0 | 1 | 2 | 3 | 4 | 0 | 1 |


If I have an eight-character pattern (let's say "abababca" for the duration of this example), my partial match table will have eight cells. If I'm looking at the eighth and last cell in the table, I'm interested in the entire pattern ("abababca"). If I'm looking at the seventh cell in the table, I'm only interested in the first seven characters in the pattern ("abababc"); the eighth one ("a") is irrelevant, and can go fall off a building or something. If I'm looking at the sixth cell in the table... you get the idea. Notice that I haven't talked about what each cell means yet, but just what it's referring to.

Now, in order to talk about the meaning, we need to know about proper prefixes and proper suffixes.

Proper prefix: All the characters in a string, with one or more cut off the end. "S", "Sn", "Sna", and "Snap" are all the proper prefixes of "Snape".

Proper suffix: All the characters in a string, with one or more cut off the beginning. "agrid", "grid", "rid", "id", and "d" are all proper suffixes of "Hagrid".

With this in mind, I can now give the one-sentence meaning of the values in the partial match table:

The length of the longest proper prefix in the (sub)pattern that matches a proper suffix in the same (sub)pattern.

Let's examine what I mean by that. Say we're looking in the third cell. As you'll remember from above, this means we're only interested in the first three characters ("aba"). In "aba", there are two proper prefixes ("a" and "ab") and two proper suffixes ("a" and "ba"). The proper prefix "ab" does not match either of the two proper suffixes. However, the proper prefix "a" matches the proper suffix "a". Thus, the length of the longest proper prefix that matches a proper suffix, in this case, is 1.

Let's try it for cell four. Here, we're interested in the first four characters ("abab"). We have three proper prefixes ("a", "ab", and "aba") and three proper suffixes ("b", "ab", and "bab"). This time, "ab" is in both, and is two characters long, so cell four gets value 2.

Just because it's an interesting example, let's also try it for cell five, which concerns "ababa". We have four proper prefixes ("a", "ab", "aba", and "abab") and four proper suffixes ("a", "ba", "aba", and "baba"). Now, we have two matches: "a" and "aba" are both proper prefixes and proper suffixes. Since "aba" is longer than "a", it wins, and cell five gets value 3.

Let's skip ahead to cell seven (the second-to-last cell), which is concerned with the pattern "abababc". Even without enumerating all the proper prefixes and suffixes, it should be obvious that there aren't going to be any matches; all the suffixes will end with the letter "c", and none of the prefixes will. Since there are no matches, cell seven gets 0.

Finally, let's look at cell eight, which is concerned with the entire pattern ("abababca"). Since the pattern both starts and ends with "a", we know the value will be at least 1. However, that's where it ends; at lengths two and up, all the suffixes contain a c, while only the last prefix ("abababc") does. This seven-character prefix does not match the seven-character suffix ("bababca"), so cell eight gets 1.
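
To check the cell-by-cell reasoning above, here is a minimal Python sketch that builds the table straight from the one-sentence definition. The function name partial_match_table is made up for this example, and this is the slow, brute-force way of filling the table (it just tries every proper prefix length), not the linear-time construction a real KMP implementation would use.

```python
def partial_match_table(pattern):
    """Build the partial match table directly from the definition (brute force)."""
    table = []
    for end in range(1, len(pattern) + 1):
        sub = pattern[:end]  # the (sub)pattern this cell refers to
        best = 0
        # Try every proper prefix length and keep the longest one
        # that is also a proper suffix of the same (sub)pattern.
        for length in range(1, len(sub)):
            if sub[:length] == sub[-length:]:
                best = length
        table.append(best)
    return table


print(partial_match_table("abababca"))  # [0, 0, 1, 2, 3, 4, 0, 1]
```

The printed values line up with the table shown earlier for "abababca".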

How to use the Partial Match Table



We can use the values in the partial match table to skip ahead (rather than redoing unnecessary old comparisons) when we find partial matches. The formula works like this:

If a partial match of length partial_match_length is found and table[partial_match_length - 1] > 0, we may skip ahead partial_match_length - table[partial_match_length - 1] characters.

Let's say we're matching the pattern "abababca" against the text "bacbababaabcbab". Here's our partial match table again for easy reference:


char: | a | b | a | b | a | b | c | a |
index:| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
value:| 0 | 0 | 1 | 2 | 3 | 4 | 0 | 1 |


The first time we get a partial match is here:


bacbababaabcbab
-|
-abababca


This is a partial_match_length of 1. The value at table[partial_match_length - 1] (or table[0]) is 0, so we don't get to skip ahead any. The next partial match we get is here:


bacbababaabcbab
----|||||
----abababca


This is a partial_match_length of 5. The value at table[partial_match_length - 1] (or table[4]) is 3. That means we get to skip ahead partial_match_length - table[partial_match_length - 1] (or 5 - table[4] or 5 - 3 or 2) characters:


// x denotes a skip

bacbababaabcbab
----xx|||
------abababca


This is a partial_match_length of 3. The value at table[partial_match_length - 1] (or table[2]) is 1. That means we get to skip ahead partial_match_length - table[partial_match_length - 1] (or 3 - table[2] or 3 - 1 or 2) characters:


// x denotes a skip

bacbababaabcbab
------xx|
--------abababca


At this point, our pattern is longer than the remaining characters in the text, so we know there's no match.
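
The walkthrough above can be sketched in Python roughly like this. The names kmp_search, start, and matched are invented for the sketch (matched plays the role of partial_match_length), and the table is built with the same brute-force approach as the definition, so treat this as an illustration of the skipping rule rather than a tuned implementation.

```python
def partial_match_table(pattern):
    """Brute-force partial match table, built from the definition."""
    table = [0] * len(pattern)
    for end in range(1, len(pattern) + 1):
        sub = pattern[:end]
        for length in range(1, len(sub)):
            if sub[:length] == sub[-length:]:
                table[end - 1] = length  # later (longer) matches overwrite shorter ones
    return table


def kmp_search(text, pattern):
    """Return the index of the first occurrence of pattern in text, or -1."""
    if not pattern:
        return 0
    table = partial_match_table(pattern)
    start = 0    # where the pattern is currently lined up against the text
    matched = 0  # partial_match_length: pattern characters matched so far
    while start + len(pattern) <= len(text):
        # Extend the current partial match as far as it goes.
        while matched < len(pattern) and text[start + matched] == pattern[matched]:
            matched += 1
        if matched == len(pattern):
            return start
        if matched == 0:
            start += 1  # nothing matched here; move on by one character
        else:
            keep = table[matched - 1]
            start += matched - keep  # the skip from the formula
            matched = keep           # these characters are already known to match
    return -1


print(kmp_search("bacbababaabcbab", "abababca"))  # -1
print(kmp_search("abababababca", "abababca"))     # 4
```

On the example text, the alignments that produce partial matches are 1, 4, 6, and then 8, at which point the pattern no longer fits in the remaining text, matching the diagrams above.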

Conclusion



So there you have it. Like I promised before, it's no exhaustive explanation or formal proof of KMP; it's a walk through my brain, with the parts I found confusing spelled out in extreme detail. If you have any questions or notice something I messed up, please leave a comment; maybe we'll all learn something.

Saturday, December 12, 2009

Unit tests need to be fast

If you have a unit test suite, you need to be able to run it fast. Like, in under a minute (if you're on a huge system with tons of tests, then each individual component's tests should run this fast).

Many programmers develop a reflex of hitting Command+S (or Ctrl+S) every few lines of written code. This is possible because saving a file takes under a second. Ideally, programmers should develop a similar reflex with unit tests. Maybe not every few lines, but after every "chunk" of work a programmer produces (a new or altered function, a new instance variable in a class, etc.), he or she should be running the test suite, even if they know it's going to fail. This is a fantastic rhythm to get into; it provides instant concrete feedback, and a visible goal to strive for at all times.

Unfortunately, it's pretty much impossible if your unit tests take minutes to run. No programmer is going to start a five-minute test run if they know in advance it's going to fail, and without doing that, no programmer can really get into the full unit testing rhythm.

To put it in simple terms, unit tests are one of the most important tools in the "fail fast" toolbox, but if the unit tests themselves don't "fail fast," there's no way they can do their job properly.

Tuesday, November 24, 2009

Losing all respect for Rex Ryan

Not that I ever had much respect for New York Jets coach Rex Ryan in the first place, but I now have zero.

In the Patriots-Jets game on Sunday, with the Patriots up 31-14 and 30 seconds left on the clock, Tom Brady tried one last deep throw. It fell incomplete, and the Patriots punted on the next play. Pulled from Yahoo! Sports:

New York Jets coach Rex Ryan felt "disrespected" by the New England Patriots for throwing a deep pass with the game well in hand Sunday.


[...]


"We need to stop them anyway, so it's no biggie, but I was surprised, and I did feel a little bit disrespected," Ryan said.


Ryan added that he didn't know if Bill Belichick was behind the call, saying it might have been something Brady and Moss did on their own.


[...]


Ryan, who created controversy in the offseason when he said he didn't come to New York to "kiss Bill Belichick's rings," said he called a timeout with 5 seconds left as a response to the Patriots' play call.


If he wants to trash talk before every game against the Patriots, there's nothing particularly horrible about that. But if he follows that up by crying like a little baby when the Patriots do something that he perceives as "disrespect," he and the rest of his team should be embarrassed.

Tuesday, November 17, 2009

A method with no unit tests is a broken method

If you write software, you need to write unit tests. If you've written a method/function, and you haven't written a unit test for it, it's safe to assume that it's broken (even if it compiles and your other tests pass).

I'm not necessarily advocating full-fledged test-driven development. I'm just saying, if you release code into "the wild," and there are methods you haven't unit tested, your customers will almost certainly run into multiple bugs in each one of them.

That's an atomic point. Separate from that, I'd like to mention that this isn't always a bad thing. For a startup that wants to iterate as quickly as possible (and is writing non-life-critical software), writing the code with no unit tests, releasing it, and reproducing each customer-discovered bug with a unit test before fixing it is a totally reasonable model. These startups just shouldn't operate under the illusion that their software "works". In the hours after they make one of these releases, they should feel blessed if a single customer is able to register or log in.

Monday, November 16, 2009

Using C, convert a dynamically-allocated int array to a comma-separated string as cleanly as possible

EDIT: There are no "dynamic arrays", so to speak, in C. What I meant was "dynamically-allocated". I've updated the wording to reflect this.

EDIT 2: Someone on Reddit pointed out that my Python example doesn't actually work, since I have an array of ints rather than strings. I've updated the code example so it works.

I'm much less experienced in C than I am in higher-level languages. At Cisco, we use C, and I sometimes run into something that would be easy to do in Java or Python, but very difficult to do in C. Now is one of those times.

I have a dynamically-allocated array of unsigned integers which I need to convert to a comma-separated string for logging. While the integers are not likely to be very large, they could conceptually be anywhere from 0 to 4,294,967,295. In Python, that's one short line.

[code lang="python"]my_str = ','.join([str(num) for num in my_list])[/code]

How elegantly can people do this in C? I came up with a way, but it's gross. If anyone knows a nice way to do it, please enlighten me.

Tuesday, November 10, 2009

Any teachers or professors using classroom management software?

Every few months, I get an idea for a web app. My most recent one stems from hearing constant complaints from my UMass professors about the UMass classroom management software. It does a whole boatload of stuff, but apparently it's extremely hard to use. Keep in mind, I'm hearing that criticism from Computer Science professors. I'd wager a bet that it's much worse for professors whose subjects are unrelated to computers.

With that in mind, I've been working for the last month or so on my own classroom management web app. I doubt I'll find many teachers or professors with this post, but if I do, could you please leave me a comment? Specifically, I'd like to know if you use classroom management software. If you do, what do you use, and do you have any complaints about it? If you don't, what areas of your job would you like managed by software? Grading and student notifications ("You have a test in 2 days!") are the obvious ones to me, but are there any good ones I'm not thinking of?

Wednesday, November 4, 2009

Little Joys of Rails (Part 1) - Renaming Associations

I've spent the last month or so teaching myself the basics of Ruby on Rails. By far my favorite thing about it is that, at every turn, I find a little feature or helper that works really well and does exactly what I want without feeling messy or "magical".

Unfortunately, I have no one to share these little joys with, as pretty much none of my programmer friends know Ruby or Ruby on Rails. So, I figured I'd start sharing them on here. Hopefully, Rails newbies (like myself) will learn something, and Rails veterans will be able to chuckle knowingly as they fondly look back on the glory days of learning Rails.

Renaming Associations



If you have one model that belongs_to another, the association (in both directions) is named after the models. For example, if I have Users and Courses, and I want a course to belong_to its teacher (who is a User), I would start with the following:

[code lang="rails"]
class User < ActiveRecord::Base
  has_many :courses
end

class Course < ActiveRecord::Base
  belongs_to :user
end
[/code]

This is nice and simple, but if I want to access a course's teacher, I have to do it like this:

[code lang="rails"]
my_course = Course.first
puts my_course.user # Print out the course's teacher
[/code]

That's a little confusing. What if my course also has_many students (also Users)?

[code lang="rails"]
my_course = Course.first
puts my_course.users.first # Print out the course's first student
[/code]

So, my_course.user is the teacher, and my_course.users are the students. There's nothing to indicate that, I just have to know it. Wouldn't it be nicer if the associations could be named for what they represent?

[code lang="rails"]
class Course < ActiveRecord::Base
  belongs_to :teacher, :class_name => 'User'
end
[/code]

To make this work properly, the foreign key needs to be teacher_id instead of user_id (that can be overridden too, but I won't go into that right now). Then, you get to do this:

[code lang="rails"]
my_course = Course.first
puts my_course.teacher # Print out the course's teacher
[/code]

Much better!

Now, I recognize that this is far from a revolutionary feature. I'm pretty sure it's possible in Django and most other MVC frameworks. I was just very happy to bump into it.

Tuesday, October 13, 2009

Twitter Weekly Updates for 2009-10-13


  • "isn't sour cream like, the mayonnaise of mexico?" - alicia #

  • dunno if obama deserves a nobel peace prize or no, but he does deserve recognition 4 peace achievements, and im not a judge, so im fine w/it #

  • anyone have any google wave invites? i'd really like to take a look at it #

  • don't need a google wave invite anymore! thanks a ton @MattMarzilli #

  • me: "it sux cuz sports stars peak at like 28. we wont peak till we retire."
    alex: "i dunno, i think i peaked in 7th grade" #

  • haha randy moss's first caught ball of the day was thrown by kyle orton #


Powered by Twitter Tools

Tuesday, October 6, 2009

Twitter Weekly Updates for 2009-10-06


  • i wish the code examples in all the ruby on rails guides didnt omit the parentheses on methods with parameters. it always messes me up. #

  • Bueno y Sano 5/5 on #Yelp: Along with Anna's Taqueria (which I believe is only in/near Boston), this is my favorite ... http://bit.ly/2q1SB #

  • asked my man at the grill for "rice" and he heard "fries". +20g fat for the day :( #

  • "First I was aroused, then I was furious" - Jane Lynch, funniest line of all the episodes of Glee #

  • this looks like it could be everything Puzzle Quest Galactrix wasn't and more: http://bit.ly/8fuQU #

  • outlook is definitely microsoft's best product. it's their only one that gives me more "that's cool" moments than "omg i wanna die" moments #

  • 11 yd WR screen = edelman's least bad play of the year #

  • this is the longest last 2 minutes of the first half of any NFL game i've ever seen #

  • weird time to call timeout by john harbaugh. either call it right away or wait till after the play, not with 3 sec left on the clock. #

  • haha i dunno if i've ever seen a penalty against a team's bench before. ravens need to chill. #

  • wtf you can't say "injury timeout on the field" and then cut to commercial before you say who's injured #

  • im glad we won, but i feel bad for the ravens. he absolutely should've made that catch. #

  • testing and debugging in a compiled language really makes me appreciate interpreted languages #

  • wooo pats bringin back junior seau!! #



Tuesday, September 29, 2009

Twitter Weekly Updates for 2009-09-29


  • can anyone with ruby on rails experience recommend a good book? im about to start on it. #

  • Anna's Taqueria 5/5 on #Yelp: Anna's Taqueria is one of my two favorite fast Mexican restaurants (the other is Bueno... http://bit.ly/Xr2rt #

  • got ruby 1.9 + ruby on rails 2.3.4 installed on my mac (snow leopard) last night. rvm caused more trouble than it was worth, so i removed it #

  • "hey can we find a Family Guy to sleep in?" - Alicia #

  • 1 play left. go lions!!! #



Friday, September 25, 2009

Does this count as ironic?

I was sitting at my cubicle today, eating a salad (with fat free sun dried tomato dressing) for lunch. Despite a perfect knowledge of my spill-happy past, I ate the salad leaning back in my chair.

About halfway through, I decided I was just asking for trouble, and decided to lean over my desk for the rest of the meal. Two bites later, a piece of lettuce (covered in dressing) fell off my fork, hit the side of the desk, slid off the desk and onto my pants, then slid off my pants and landed on the floor.

It's things like this that make me think God doesn't exist.

Tuesday, September 22, 2009

Morning Sickness

Over the last dozen or so mornings, I've felt anywhere from below average to awful. I'm not sure what it is; my friend Ron thinks it's the change of weather, which is definitely possible, but it's never happened before.

I wake up with a sore throat and headache (each of varying intensity). It starts to dissipate as the day goes on, and I usually feel fine by the early afternoon.

I've been getting about six hours of sleep per night. Maybe going to bed earlier will help.

Twitter Weekly Updates for 2009-09-22


  • do the analysts saying TO had "no impact" not know football? u think fred jackson wouldve torn it up if we didnt need to double-cover TO? #

  • Latest Facebook blog post (http://bit.ly/VB1qa) says theyre now cash flow positive. If thats true, its incredible (they burnt $500m last yr) #

  • Aw, that's a sick tattoo! http://bit.ly/eDblr #

  • man tests out bulletproof glass by having his wife hold it in front of her face and shooting it: http://bit.ly/9iTlh #

  • Goten of Japan 5/5 on #Yelp: The food here is delicious. I've been here five times, and I've loved it every one of t... http://bit.ly/d9Wp9 #

  • "went to church this morning. worst mistake of the day." - some guy yelling in central square #

  • you can see why NE picks up vet players. fred taylor knows exactly how to set his blockers up. #

  • Celebrity Pizza & Dairy Bar 3/5 on #Yelp: My roommates and I walk here from our house once every couple weeks.

    The ... http://bit.ly/uMEAo #

  • Antonio's 4/5 on #Yelp: Antonio's makes the most unique and delicious pizza I've ever had. My favorites are the buff... http://bit.ly/10Wh8e #

  • i love when i wish a feature existed in an app, then find that it actually does. just happened with "bookmarks" on @yelp #

  • Me: let's diagram it out tomorrow. we should get a whiteboard.
    Alex: ya
    Alex: and tattoos. #



Monday, September 21, 2009

Tough as Nails

I set a goal for myself at the beginning of last week: stop biting my nails.


It's been a lifelong habit. Personally, I don't think there's anything wrong with it. In fact, once I do end up successfully stopping, I may start again. I just want to see if I can do it.

So far, it's been a lot harder than I expected. I sometimes catch myself halfway through biting a nail (of course, I have to finish at that point, lest I leave half a nail hanging off). For the first few days, I wasn't even sure if I'd be able to do it.

Now, my nails are longer than they've been since I was a little kid. Once they get to the point where a normal person would cut them with a nail clipper, I'll consider myself victorious.

Saturday, September 12, 2009

Installing MySQL on Snow Leopard

Early on at Cisco, I realized that it's really beneficial to write down (step by step) what I've done when I install something new. It forces me to think about what I'm doing, it provides me with a guide in case I mess up and have to start over, and other people can benefit from it later.

With that in mind, I've decided to document my installation of MySQL on Snow Leopard (OS X 10.6). Hopefully, someone will get some use out of it, but if not, at least I'll have documentation of what I did.

The PATH Variable



MySQL has a bunch of useful executables that aren't in PATH by default. I could symlink to them, but I think that's less maintainable than just adding their directory to PATH.


  1. Open the Terminal

  2. Open up the .bash_profile file in your home directory. This is a bash script that runs every time you start the Terminal, so the PATH variable will be extended properly every time. If you have TextMate and its UNIX command line tools installed, do it like this:
    [code lang="bash"]mate ~/.bash_profile[/code]
    Otherwise, do it like this:
    [code lang="bash"]/Applications/TextEdit.app/Contents/MacOS/TextEdit ~/.bash_profile &[/code]
    If you use the second one, make sure to add the ampersand at the end.

  3. We want to add MySQL's bin folder to PATH. MySQL's going to be installed at /usr/local/mysql, so let's add /usr/local/mysql/bin to PATH. Add the following line to .bash_profile:
    [code lang="bash"]export PATH="/usr/local/mysql/bin:/usr/local/sbin:$PATH"[/code]
    This replaces PATH with two new locations followed by the old value of PATH (so essentially, prepending the two new locations to PATH). You'll notice that, in addition to /usr/local/mysql/bin (which I mentioned earlier), I also added /usr/local/sbin. OS X doesn't include this in PATH by default, but I think it should, so I added it. I have no defense for that position, but this is as much a guide for myself as it is a guide for other people, so I don't need to defend it :)

  4. Save the updated .bash_profile and close it.

  5. Quit and reopen Terminal so that .bash_profile will be rerun (you could also run it explicitly, but I prefer quitting and reopening).

  6. Run the following command:
    [code lang="bash"]echo $PATH[/code]
    You should see /usr/local/mysql/bin and /usr/local/sbin in there now.
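
For anyone who wants to see what that export line does to the value, here is a small Python sketch of the same idea. The helper names prepend_to_path and missing_dirs, and the starting value "/usr/bin:/bin", are made up for illustration; the only thing taken from the steps above is the pair of directories being added in front.

```python
def prepend_to_path(path_value, new_dirs):
    """Mimic export PATH="dir1:dir2:$PATH": new directories go in front."""
    return ":".join(list(new_dirs) + [path_value])


def missing_dirs(path_value, wanted):
    """Return the wanted directories that are not yet entries in path_value."""
    entries = path_value.split(":")
    return [d for d in wanted if d not in entries]


wanted = ["/usr/local/mysql/bin", "/usr/local/sbin"]
old_path = "/usr/bin:/bin"  # made-up starting value, just for the example
new_path = prepend_to_path(old_path, wanted)

print(new_path)
print(missing_dirs(new_path, wanted))  # [] means both directories are present
```

Because the new directories come first, executables in them are found before same-named ones later in PATH.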


Downloading + Installing


I might try compiling and installing from source at some point, but I don't see any reason to when there are official OS X binaries available. Worst case scenario, it messes with a bunch of my settings, and I'll uninstall and then do it from source.

  1. Go to http://dev.mysql.com/downloads/mysql/5.1.html#downloads

  2. Scroll down to the Mac OS X section (near the bottom). Download the version labeled "Mac OS X 10.5 (x86_64)". It's very important that you download the 64-bit version. The 32-bit one will give you preference pane problems (I call them PPPs whenever I talk about them, though this is the first time I ever have). Eventually, there'll probably be a Mac OS X 10.6 version, but for now, this works perfectly.

  3. Open the .dmg, then run mysql-x.x.xx-osx10.5-x86_64.pkg. Continue through all the screens without changing anything, until it finishes.

  4. Optional: install the preference pane by running MySQL.prefPane.

  5. Optional: make MySQL start up along with OS X by running MySQLStartupItem.pkg. I didn't do this (I often use my computer for things other than development, and I don't want to bog it down unnecessarily), so I can't provide any instruction or vouch for how well it works on Snow Leopard.


  6. You should now be able to start MySQL by going into the MySQL preference pane and clicking "Start MySQL Server". If it doesn't work, leave me a comment, and I'll try to help you.

Monday, August 10, 2009

My Mornings

Every morning, when my alarm clock goes off, a little demon wakes up before I do and tries as hard as he can to delay my day.

Sometimes, he sets my alarm clock back 30 minutes. Other times, when he wakes up way before me, he just turns the alarm clock off and I sleep for another three hours. He used to whisper in my ear (when I was in school) that I was way ahead in my classes and that I should take a break from them, or at least from my earliest one. Now, he looks for my work laptop so he can send an "out sick" email to my boss.

Yeah, he's a jerk, but I've got his number. I put my alarm clock at the other end of my room, and leave my work laptop out in the car. He still makes it to the "snooze" button from time to time, but I always wake up and stop him before he can do anything more dangerous.

Problem is, I have to stay awake to keep him at bay. I have a 45-minute drive to work, which is almost entirely straight highway driving. From time to time, I catch him reaching for the wheel, or moving towards one of the pedals. So far, nothing's come of it, but something in the back of my head (maybe him) tells me that it's only a matter of time.

Monday, May 18, 2009

Django Middleware vs. Context Processors

This may be old news to many people, but it's something I just recently learned (after doing it wrong for about nine months), so I figured someone else may be able to benefit from my mistakes.

For a long time, when I needed to access the currently logged-in user in one of my Django views, the view would follow this pattern:

[code lang="python"]from django.shortcuts import get_object_or_404, render_to_response
from django.template import RequestContext

# some other view code

def my_view(request, obj_id):
    # The RequestContext is created early, just to get at the user
    context = RequestContext(request)
    obj = get_object_or_404(SomeModel, pk=int(obj_id))

    # do some stuff, including:
    some_function(context["user"])

    return render_to_response("some_url_name", {"some_var": some_val},
                              context_instance=context)[/code]

Now, seasoned Django vets (why are you reading this post, by the way?) are probably laughing, but this seemed perfectly fine to me. Luckily, my ignorance was revealed to me while asking on IRC about a problem which, while it didn't seem so at the time, was completely tied to my misuse of RequestContext.

To put it simply, context processors are made to be used in templates. The only time a RequestContext should ever be instantiated is at the very end of a view, like so:

[code lang="python"]return render_to_response("some_url_name", {"some_var": some_val},
                          context_instance=RequestContext(request))[/code]

I was using them all over views, and even in a few of my decorators. What's wrong with this? The main thing is, certain mutation functions are sometimes performed when a RequestContext is instantiated (such as user.get_and_delete_messages(), which gets all of a user's messages from the database and then deletes them), and performing them multiple times before loading the template can cause unexpected results. In my example, I was instantiating a RequestContext in a bunch of decorators, which meant that by the time my template was loaded, all of the user's messages were deleted (and stored in an earlier instance of RequestContext), making it look to me as if my auth messages were being thrown out.
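
Here is a toy Python sketch of why that goes wrong. It is not Django code: FakeUser and FakeRequestContext are stand-ins invented for this example, and the only behavior they copy from the description above is that creating a context calls get_and_delete_messages(), which returns the messages and wipes them.

```python
class FakeUser:
    """Stand-in for a user whose messages are deleted once they are read."""

    def __init__(self, messages):
        self._messages = list(messages)

    def get_and_delete_messages(self):
        # Hand back the stored messages and clear them in one step.
        messages, self._messages = self._messages, []
        return messages


class FakeRequestContext:
    """Stand-in for RequestContext: reads (and so deletes) messages on creation."""

    def __init__(self, user):
        self.messages = user.get_and_delete_messages()


user = FakeUser(["Your profile was saved."])
in_decorator = FakeRequestContext(user)  # created early, then thrown away
for_template = FakeRequestContext(user)  # the one the template actually sees

print(in_decorator.messages)  # ['Your profile was saved.']
print(for_template.messages)  # []
```

The message ends up in the context nobody renders, and the template gets an empty list.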

What's the solution? Middleware. Middleware allows the programmer to attach variables to the request object before it reaches the view. With this (and the provided django.contrib.auth.middleware.AuthenticationMiddleware), I can still achieve my original goal (accessing the logged-in user from a view), but now I can do it without creating a RequestContext and potentially running mutation functions multiple times:

[code lang="python"]from django.shortcuts import get_object_or_404, render_to_response
from django.template import RequestContext

# some other view code

def my_view(request, obj_id):
    obj = get_object_or_404(SomeModel, pk=int(obj_id))

    # do some stuff, including:
    some_function(request.user)  # set by AuthenticationMiddleware

    # The RequestContext is created once, right before rendering
    return render_to_response("some_url_name", {"some_var": some_val},
                              context_instance=RequestContext(request))[/code]

I've removed all the references to RequestContext from my decorators, and made sure to only instantiate it at the very end of views. Now, messages work perfectly. I've even written my own middleware (which you can learn how to do here) to load instances of a few of my own models into the request.
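For anyone curious what such a middleware might look like, here is a rough sketch in the old class-based style, where Django calls a process_request method before the view runs. The names ProfileMiddleware, load_profile_for, and FakeRequest are all hypothetical; FakeRequest only exists so the snippet runs without Django installed, and load_profile_for stands in for whatever model lookup you actually need.

```python
def load_profile_for(user):
    """Hypothetical stand-in for a real model lookup (e.g. a database query)."""
    if user is None:
        return None
    return {"owner": user, "theme": "default"}


class ProfileMiddleware(object):
    """Old-style middleware: Django calls process_request before the view."""

    def process_request(self, request):
        # request.user is expected to be set earlier by AuthenticationMiddleware
        request.profile = load_profile_for(getattr(request, "user", None))
        return None  # None tells Django to keep processing the request


class FakeRequest(object):
    """Minimal stand-in for Django's HttpRequest, for demonstration only."""
    user = "alice"


request = FakeRequest()
ProfileMiddleware().process_request(request)
print(request.profile["owner"])  # alice
```

In a real project, the class would need to be listed in the MIDDLEWARE_CLASSES setting, after AuthenticationMiddleware, so that request.user already exists when it runs.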

To reiterate, I'm aware that this is common knowledge for many people, but it took me 9 months of moderate Django use and embarrassment on IRC to discover it for myself. Hopefully, this will help someone else in a similar position.

Sunday, May 10, 2009

Are you a better programmer than you were two years ago?

Two years ago, I was working on a fairly complicated (for my level of experience) web app for posting news articles. Quite a few times, I ran into situations where I was about to create tight coupling between two somewhat unrelated parts of my app. Sometimes, I wouldn't even recognize this as a problem. Other times, I would, but I wouldn't be able to think of an easy fix, so I'd continue on.

Today, I find myself fixing tightly-coupled situations almost instinctively. I use design patterns that I was barely aware of two years ago, and I do it without straying into "design pattern fever" territory. This isn't to say I'm an expert at software design; it still takes a lot of thought to find the best fix, and I'm sure I still make plenty of mistakes. My point is, my growth as a programmer is very obvious to me in these situations.

How about you? If you've been programming for more than two years, you're probably a better programmer than you were two years ago, but can you tell? If so, how can you tell? Can you give any good examples of moments when you realized it?

Tuesday, May 5, 2009

Non-painful email on Django development servers

I've been actively learning and using Django since August 2008, and I've loved almost every bit of it. There are plenty of places to read all about the virtues of Django, so I'll leave that out for now.

One thing that's always bugged me about web development in general is the sending of emails. I do development on my local computer (with a badly set up Apache / MySQL / PHP / Python / whatever else stack), and I've never felt like dealing with the headache of setting up a mail server. This means, when I add something that's supposed to send an email (like an activation email after registration), I have to get very hacky to test and debug it (making sure the email text is being produced correctly, making sure it's being sent to and from the right people, etc.).

This was one of the few web development pains that I thought Django didn't solve. Whenever I'd test a bit of code that was supposed to send email, I'd get a "Connection refused" error page (meaning my computer has no mail server to send the email with). I would usually add in a bit of printf debugging to make sure the subject and body had the correct text, but beyond that, I'd usually wait to test the email portions until I uploaded to a server that could send email (usually the production server, unfortunately).

Yesterday, I bumped into a little section in the Django documentation that explains how to get around this. As usual, Python has all the solutions. First, add this code to your settings.py file:

[code lang="python"]EMAIL_HOST = 'localhost'
EMAIL_PORT = 1025 # replace this with some free port number on your machine[/code]

Then, assuming you're on a Unix system (I'm on a Mac), run the following on the command line to start a "dumb" Python mailserver:

[code lang="bash"]python -m smtpd -n -c DebuggingServer localhost:1025[/code]

Make sure to replace 1025 with whatever you filled in for EMAIL_PORT.

Now, try running the email-sending code in your Python application. Voila! No error pages (or at least, none related to email), and the full text of the email (headers and all) appears in whatever command line prompt you ran the dumb mailserver on. This allows you to see the senders, recipients, subject, and body of the email being sent out, all without getting hacky or sending to an email account you own.
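The same trick works outside Django, too. As a minimal sketch using only the standard library (the function names and addresses here are made up for illustration, and the host and port assume the settings shown above):

```python
import smtplib
from email.message import EmailMessage


def build_test_message(sender, recipient, subject, body):
    """Assemble a plain-text email without sending it."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send_to_debug_server(msg, host="localhost", port=1025):
    """Hand the message to whatever SMTP server is listening on host:port."""
    with smtplib.SMTP(host, port) as server:
        server.send_message(msg)


message = build_test_message(
    "noreply@example.com",
    "newuser@example.com",
    "Activate your account",
    "Click the link below to activate your account.",
)
```

With the dumb mailserver running, calling send_to_debug_server(message) should make the message show up in that terminal. Without it, you'd get the same "Connection refused" error as before.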

Taking this a step further, I created a small bash script called "dumbmail" in /usr/local/bin that looks like the following:

[code lang="bash"]#!/usr/bin/env bash
# Use the first argument as the port, defaulting to 1025
if [ -z "$1" ]
then port=1025
else port="$1"
fi

echo "Starting dumb mail server on localhost:$port"
python -m smtpd -n -c DebuggingServer "localhost:$port"[/code]

Now, when I'm testing a Django application and I get to a section that is going to send an email, I just run "dumbmail" (or "dumbmail some_number" if I need to use a different port, for some reason I can't imagine), and I'm ready to go.

Hope this helps people. The documentation was always there - I just never noticed that part until yesterday.

Tuesday, April 28, 2009

Round Rectangles (or Why Steve Jobs is a Visionary)

A story was posted on Folklore.org (a site full of old stories about the creation and initial development of Apple computers) about the addition of rounded rectangles to an old drawing program, and Steve Jobs's involvement in them. This section in particular struck me:

Bill fired up his demo and it quickly filled the Lisa screen with randomly-sized ovals, faster than you thought was possible. But something was bothering Steve Jobs. "Well, circles and ovals are good, but how about drawing rectangles with rounded corners? Can we do that now, too?"

"No, there's no way to do that. In fact it would be really hard to do, and I don't think we really need it". I think Bill was a little miffed that Steve wasn't raving over the fast ovals and still wanted more.

Steve suddenly got more intense. "Rectangles with rounded corners are everywhere! Just look around this room!". And sure enough, there were lots of them, like the whiteboard and some of the desks and tables. Then he pointed out the window. "And look outside, there's even more, practically everywhere you look!". He even persuaded Bill to take a quick walk around the block with him, pointing out every rectangle with rounded corners that he could find.

When Steve and Bill passed a no-parking sign with rounded corners, it did the trick. "OK, I give up", Bill pleaded. "I'll see if it's as hard as I thought." He went back home to work on it.


I think this is a fantastic example of what makes Steve Jobs one of the few true visionaries in the world. In the face of a big advancement like fast ovals (yes, it was definitely a big deal at the time), I would've been more than satisfied. I may have asked for rounded rectangles in a second iteration, but it would've been a simple feature idea. I believe most people would've reacted the same way.

What makes Steve Jobs special is his ability to quickly identify what's really important. It seems so obvious after the fact. Of course people would like computers with translucent colored cases! Of course minimalist controls would make for a more accessible and desirable MP3 player! Of course rounded rectangles are an extremely common shape, and are really important to have in a drawing program! These ideas (and more) are all obvious now. But a year before Apple did them, other companies were struggling to innovate in these fields, and they were only "obvious" to Steve Jobs.

Of course, anyone can have a great idea that turns out to be the right way of doing things. This is why I distinguish between "true" and regular (false?) visionaries. People called M. Night Shyamalan a visionary after "The Sixth Sense" and "Unbreakable". He then went on to make one more pretty good but decidedly un-visionary movie (Signs), an arguably good but decidedly un-visionary movie (The Village, which I loved, but I understand why others didn't), and two duds (my apologies to the three people who liked them). The visionary label was applied to Shyamalan before he'd reached first base, and he got thrown out at second. This happens all the time, and these people are certainly not true visionaries.

True visionaries come up with visionary ideas so consistently that it becomes expected of them. And no one (off the top of my head) has a more consistent history of this than Steve Jobs.

Monday, April 27, 2009

A response to "The Extreme Google Brain"

A blogger named Joe Clark just made a post called The Extreme Google Brain. In it, he takes a side on the recent tiff between lead designer Douglas Bowman and Google, where the former left the latter out of frustration at having to prove every design decision with real-world test data.

Joe Clark whole-heartedly agrees with Bowman, as many others do (I myself am somewhat torn). However, I find Clark's article to be a ridiculous rant, full of stereotyping, fact-inventing, name-calling, and other marks of awful opinion pieces.

One frequently-used tactic in his piece is the inventing of a fact, followed by a single related fact that's supposed to prove it. Case in point:

Some of these boys and men exhibit extreme-male-brain tendencies, including an ability to focus obsessively for long periods of time, often on inanimate objects or abstractions (hence male domination of engineering and high-end law).


Yes, males dominate engineering and high-end law. However, the cause is the topic of endless debates, and yet Clark claims it's due to "extreme-male-brain tendencies" as if he'd read it in a science textbook. This "here's a fact I made up (hence an already-known, tangentially-related fact)" pattern repeats itself a few times.

When he's not making up facts, he's stereotyping a group of tens of thousands of people based on the few he knows.

Apart from Bowman, I can think of only two Google employees I could stand to be around for longer than an elevator ride. My impression of “Googlers,” which I concede is based on little direct knowledge and is prejudicial on its face [note: apologizing in advance does not make it okay to say something idiotic], is one of undersocialized, uncultured, pampered, arrogant faux-savants who have cultivated an arrested adolescence that the Google working environment further nurtures. Their computer-programming skills, the sole skills valued by the company, camouflage the flaws of their neuroanatomy. Their brains are beautifully suited to the genteel eugenics program that is the Google hiring process but are broken for real-world use.


You get the picture. Throughout the rest of the article, he:

  • Contends that A/B testing has no value.

  • Makes up scenarios that he believes (and "speculate[s] that Bowman would not disagree") accurately represent Google meetings.

  • Tells us that we can't disagree with Bowman and still feel that technology juggernauts are becoming better at visual design.

  • Says that when a company uses anti-design (extremely minimal and not-necessarily-beautiful designs such as Google or Craigslist) and succeeds, they're succeeding despite the anti-design. He then concedes that he can't prove it, but assures us that if you're "visually literate" (which Adobe defines as the "ability to construct meaning from visual images", which anyone older than an infant can consistently do), you "just know it".



There isn't really much else to say about this. I've read quite a few articles that side with Bowman, and quite a few that side with Google, and many of them on each side were great articles with great points. This was not one of them. I'd never heard of Joe Clark before, and based on this, I hope I never do again.

Sunday, April 12, 2009

Facebook causes me to facepalm

I looked at the "New Layout Vote" app on Facebook today. The wall is full of complaints about the new layout. While reading through them, I ran into this gem:

I don't like how my picture is already next to every comment box on every facebook page... it's like facebook is forcing me to participate and comment on things that I don't want to. Please stop taking my profile picture and putting it everywhere without my permission.


Ugh... honestly, how can she think that, without commenting, her picture is visible to everyone else looking at a page? Doesn't she wonder why she can't see everyone else's picture there as well?

And, as a side note, if she agreed to the Terms of Service (which she did if she has an account, which she does), then she definitely gave them permission to put her picture everywhere.

Thursday, April 2, 2009

Cooking like my Mom - Part 1

Every single holiday since I can remember, my mom has made a dish called Lokshen Kugel. She makes it with egg noodles, cottage cheese, sour cream, and a bunch of sugar, vanilla, and cinnamon (as well as the standard eggs and stuff).

Today, after getting the recipe from her, I made it myself. The weirdest feeling was opening the oven halfway through. There was a dish I'd seen dozens of times before, but for the first time, it wasn't in the oven at my house (the one I grew up in), and it was made by me instead of my mom. It was very surreal.

I haven't tasted it yet. But when I do, I'll post about it again. I also took some pictures, so I'll be sure to put them up (once I find the cord for my camera).

Wednesday, March 25, 2009

What is the best makeshift ice scraper?

Tic Tacs


TicTac container. Seriously, it saved my life this morning.

Sunday, March 22, 2009

How to export a 1024 x 768 screencast from Adobe Premiere Pro CS4 to YouTube

Over spring break, I made some screencasts for the website I maintain for my job. We used a free, open-source screen recorder called CamStudio. Our plan was to upload them to YouTube and embed the videos into our site from there. The one problem: the Export Media dialog on Adobe Premiere Pro CS4 (the software I used to edit the screencasts) only has a setting for low-definition (320x240) videos. My screencasts were 1024x768, and shrinking them to 320x240 would make the video part pretty much useless.

So, I set about trying to find a setting that would work for a 1024x768 screencast. The majority of settings produced some weird-quality results, even with the quality settings turned up as high as possible, which makes me think it had something to do with a faulty interlacing/deinterlacing setting somewhere. When I changed the one interlacing setting I was able to find, the exported videos would work fine in QuickTime, but look completely messed up in VLC and, more importantly, YouTube.

Exporting to F4V actually did work in every media player I tried it in, but for some reason, YouTube was unable to convert it to a playable format.

Finally, after about 9 hours of fiddling with Adobe Premiere Pro's Export Media dialog, I found a setting of acceptable quality that worked on YouTube. I doubt many other people will need this, but since I was unable to find any information about the problem I was having online, maybe I'll save another person 9 hours of work. Here are the settings I used:

Export Settings
Format: QuickTime

Filters
No Changes

Video
Video Codec: H.264
Quality: 100
Width: 1,024
Height: 768
(Width and Height are unlinked)
Frame Rate: 30
Field Type: Lower First (this was the setting that deals with interlacing, I believe)
Aspect: Square Pixels (1.0)
Render at Maximum Depth: Checked
Set Key Frame Distance: Unchecked
Optimize Stills: Checked
Frame Reordering: Unchecked
Set Bitrate: Unchecked

Audio
No Changes

Others
No Changes

I'm sure there are some optimizations to be made in these settings; things that are causing me to produce unnecessarily large files, things that make rendering slower, or things that, if I tweaked, would give me even better quality. However, after 9 hours, I don't care; this works, so for now, I'm done. Hope this helps someone.

Saturday, March 21, 2009

My greatest beer-related discovery

Before college, I didn't drink at all, and even in college, I don't drink very much; once every couple weeks during the school year, maybe once or twice a week during vacations. During this short time, I'd never been able to get myself to like beer. I always wanted to, but I just never could. The taste always reminded me of garbage (or at least, what I imagine garbage would taste like). I spent a fair bit of time on Google trying to find what beers other people really like, and every one I tried had the same garbage-y taste.

One day, with the prospect of playing a drinking game while watching Independence Day (drink every time there's an explosion... ughhh...), I decided I'd try a lite beer, so I wouldn't feel like I'd eaten an entire loaf of bread after my third bottle. I did a little research, and found that people had great things to say about Sam Adams Light. I got a six-pack, and to my surprise, I didn't hate it as much as I hated other beers. $9.00 for a six-pack of somewhat-tolerable liquid is pushing it, but at least I could play drinking games with my friends without resorting to Mike's Hard Lemonade.

The important part was this: instead of tasting like beer (and, by association, garbage), it tasted like water with a slight aftertaste of beer/garbage. This made me wonder: is there a beer that really just tastes like water? I asked my friend Trenton. The answer? Natty Light.

I'd been looking in all the wrong places. The people who had been telling me the "best beers" were people who love beer and think it tastes like the nectar of the gods. I hate beer and think it tastes like the nectar that drips from a leaky garbage bag, so of course, the best beer for me is the one that tastes as little like anything as possible. And as fortune would have it, the best beer for me is also the cheapest beer. For less than the cost of two six-packs of Sam Adams Light, I can buy a 30-rack of Natty Light.

Don't get me wrong, it still tastes gross. In the end, I really just don't like the taste of alcohol very much, and beer is no exception. But unlike with "good" beer, I can drink a few cans of Natty Light without dreading the taste of garbage each sip. And that's something I think we can all drink to.

Monday, March 16, 2009

Benchmarking Python decimal vs. float

I'm writing a web app that includes, among other things, a good amount of (rational) non-integer numbers. Whenever I'm in this situation, and I'm using a language that supports Decimals (as opposed to just floats and doubles), I always wonder which one I should use.

If you're a programmer, you understand the difference and the dilemma. Floats/doubles are very fast, as all computers (built within the last 15 years) have hardware specifically made to deal with them. However, they're not perfectly accurate; because of binary representation, numbers that we use a lot (like 1/10 or 0.1) cause the same problems that 1/3 (0.33333...) causes in base 10.

Decimals, on the other hand, are slow. They are handled entirely in software, and thus take hundreds of instructions to do things that would take less than 10 with floats/doubles. The upside is that they represent decimal values exactly; 0.1 is 0.1 is 0.1.
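Here's a quick illustration of the difference, using nothing but the standard decimal module. It's a small sketch rather than a full tour; note that Decimal is only exact for values that can be written out in decimal digits, so something like 1/3 still gets rounded (to 28 significant digits by default).

```python
from decimal import Decimal

float_sum = 0.1 + 0.2
decimal_sum = Decimal("0.1") + Decimal("0.2")

# Binary floats can't represent 0.1 or 0.2 exactly, so the sum drifts
print(float_sum == 0.3)                # False

# Decimals built from strings keep the exact decimal value
print(decimal_sum == Decimal("0.3"))   # True

# A Decimal built from a float inherits the float's error
print(Decimal(0.1) == Decimal("0.1"))  # False

# Decimals still round values with no finite decimal expansion
one_third = Decimal(1) / Decimal(3)
print(one_third * 3 == Decimal(1))     # False
```

The third line is the reason the tests in this post build their Decimals from strings instead of from floats.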

So the question becomes twofold:

  1. Do I really need my numbers to be perfectly accurate?

  2. How much slower are decimals than floats/doubles?



In my case, the accuracy would be nice, but not completely necessary. And thus, the latter question becomes important. I'm not writing a large application, and I don't expect it to get too popular too quickly, so if the slowdown is only moderate, I'll take the accuracy.

To learn what the slowdown was, I wrote two quick Python test programs:


# Decimal test
 
from decimal import Decimal
 
a = 0
for i in range(0, 20000):
    a = Decimal('%d.%d' % (i, i))
    print(a)



# Float test
 
from decimal import Decimal # kept this in on the float version
                            # so they'd have the same overhead
 
a = 0
for i in range(0, 20000):
    a = float('%d.%d' % (i, i))
    print(a)


When I ran each of these with /usr/bin/time (which I just learned about a couple weeks ago, and has replaced counting seconds on my fingers as my favorite benchmarking tool), the decimal version took an average of about 1.5 seconds (over 10 runs), while the float version took an average of 0.5. Just to make sure no overhead was getting in the way, I upped the limit to 40000 and ran it again. Decimal took 3.0 seconds, float 1.0. I can now confidently say that Python floats are about 3x the speed of Python decimals.

Or are they? While this tests the creation and printing of decimals and floats, it doesn't test mathematical operations. So, I wrote two more tests. I'm going to be doing a lot of division on these numbers, and that's generally the most expensive of the basic arithmetic operations, so I made sure to do it in the tests (along with some subtraction).


# Decimal version
 
from decimal import Decimal
 
a = 0
for i in range(2, 20002):
    a = Decimal('%d.%d' % (i, i)) / Decimal('%d.%d' % (i - 1, i - 1))
    print(a)



# Float version
 
from decimal import Decimal
 
a = 0
for i in range(2, 20002):
    a = float('%d.%d' % (i, i)) / float('%d.%d' % (i - 1, i - 1))
    print(a)


This time, the float version averaged about 0.6 seconds (1.15 with 40,000 iterations instead of 20,000), while the decimal version averaged over 11 seconds (23 with 40,000 iterations instead of 20,000). So while Python float creation and printing is merely 3x as fast as Python decimal creation and printing, Python float division is almost 20x as fast as Python decimal division.
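One caveat about these numbers: each run also pays for starting Python, formatting strings, and printing. As a rough sketch of how the division alone could be timed, here is a version using the standard timeit module. The helper name divide_all, the 2,000 operand pairs, and the repeat counts are arbitrary choices of mine, and the results will vary a lot by machine and Python version (newer Pythons ship a much faster decimal implementation written in C).

```python
import timeit
from decimal import Decimal

# Build the operands once, outside the timed code, so that only
# the divisions themselves are measured
indexes = range(2, 2002)
float_pairs = [(float("%d.%d" % (i, i)), float("%d.%d" % (i - 1, i - 1)))
               for i in indexes]
decimal_pairs = [(Decimal("%d.%d" % (i, i)), Decimal("%d.%d" % (i - 1, i - 1)))
                 for i in indexes]


def divide_all(pairs):
    """Divide every (numerator, denominator) pair and discard the results."""
    for numerator, denominator in pairs:
        numerator / denominator


# Take the best of 3 repeats, each running the whole batch 10 times
float_time = min(timeit.repeat(lambda: divide_all(float_pairs),
                               number=10, repeat=3))
decimal_time = min(timeit.repeat(lambda: divide_all(decimal_pairs),
                                 number=10, repeat=3))

print("float:   %.4f s" % float_time)
print("decimal: %.4f s" % decimal_time)
```

Taking the minimum of several repeats is the usual advice for timeit, since the slower runs mostly reflect other things happening on the machine.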

So what did I choose? Decimals. In the context of these tests, the decimal slowdown may seem significant, but if I finished my app using decimals and profiled it, I can almost guarantee (based on the speeds here) that the bottleneck would not be decimal division performance. If I were running an app that was handling hundreds of simultaneous requests, I might consider switching (I might also spring for better hardware, but that's a different topic). However, for my purpose, 1/20th the speed of floats is more than fast enough.

P.S. As my very late discovery of /usr/bin/time should suggest, I'm extremely new to benchmarking. If anyone has any suggestions for me, or criticisms of my method, please leave your thoughts. This is something I'd like to get better at.

Sunday, March 15, 2009

Why Facebook keeps changing their interface

Facebook has a new design. Every programmer I know loves it. Almost everyone else I know hates it. I could've written those three sentences a couple years ago, copy/pasted them every six months, and they would've fit perfectly every time.

Of course, the pattern has always ended the same way as well. People forget about it within a month, and when the next change comes around, it's "It was PERFECT the way it was, why are you changing it?!"

In fact, I've seen a bunch of people asking "why does Facebook keep changing the interface?" Most people are asking it rhetorically, with the implied answer being "to confuse users." However, if asked honestly, it's actually a pretty good question, with a pretty good answer.

Facebook has gotten where it is today by innovating. Prime example: the News Feed. When they added the News Feed, no one else was doing anything like it. Despite massive user riots, they stuck to their guns. Two years later, everyone loves it, all the social networks have copied it, and there would be massive user riots if they scrapped it.

Facebook has stayed at the forefront of social networking innovation by constantly throwing everything at the wall, keeping what sticks (News Feed), and scrapping what doesn't ("How do you know this person?"). It's in their best interest to do this; they make their money from venture capitalists and advertisers, not from charging users.

If they charged users, changing the interface so often would be a bad move; users would stop paying as soon as they became confused with a new interface, and they'd lose money. As it stands, users who are confused with a new interface can take a break at no cost to Facebook, come back in a few weeks (as they always have), and the advertisers and venture capitalists (who only care about long-term success) are happy.

Obviously, Facebook and its investors have become accustomed to the pattern: make a change, suck up the complaints (possibly while making some adjustments, like the additional privacy options after the News Feed was added), and reap the benefits of being a bastion of social networking innovation for another six months. Eventually, maybe the users will get used to it as well. I've actually seen a few status updates along the lines of "I guess I probably won't hate this so much once I get used to it" after this update, so who knows?

Wednesday, March 11, 2009

Apple takes it too far on form-over-function

Today, Apple announced the new iPod Shuffle. It drops the navigation buttons in favor of making it slightly smaller.

New iPod Shuffle


Your first question upon hearing this was probably the same as mine: "How do you switch songs?" Simple: the controls are on the cable of the prepackaged headphones.

New iPod Shuffle Headphones


Do you listen to your iPod with the headphones it came with? Most people I know don't. Admittedly, I don't know anyone who owns an iPod Shuffle (my mom used to, but that's it), so it's possible that iPod Shuffle buyers often stick with the default headphones. But be warned: if you buy an iPod Shuffle, and you want to use your own headphones, get ready to not be able to switch/skip songs.

I understand choosing form over function, up to a point. But to me, this just seems ludicrous. The difference in size between this thing and the last-gen iPod Shuffle is minuscule, and the functional sacrifice immense. But maybe I just don't get it.

Monday, February 9, 2009

What of the Google monopoly?

Jeff Atwood of Coding Horror made a post about the Google monopoly, suggesting that we should be concerned.

I don't 100% disagree with him, but this section struck me as especially egregious:

I'm a little surprised all the people who were so up in arms about the Microsoft "monopoly" ten years ago aren't out in the streets today lighting torches and sharpening their pitchforks to go after Google. Does the fact that Google's products are mostly free and ad-supported somehow exempt it from the same scrutiny?


This is an interesting argument, but there's one critical difference: Google does not partake in monopolistic activities.

One of the big problems with Microsoft was when they pre-installed Internet Explorer on Windows with no way to remove it, leveraging their OS monopoly to gain an unfair advantage in the browser market. They got sued, and added a "remove software" option to let people remove pre-installed software (IE, Windows Media Player, etc.). Now, even though their OS market share has barely shifted (definitely under 5% shift), few people complain about their monopoly anymore.

I'm not saying they're okay now (or even that what Google's done is in the best interest of the internet), but the reason no one complains about Google's monopoly is that they created it legitimately, and they don't do evil things with it.