Two Mixed Blessings: 'How Small Can You Write It?' and 'Not Invented Here'

Posted: 2 years ago (2007-12-04 20:58:57 UTC ) / Updated: 16 months ago (2009-04-23 21:33:03 UTC )

Imported from WordPress

Originally posted on 2007-12-04 20:58:57

There are two unrelated-looking tendencies that programmers have that can serve a common purpose. Like the Big Ball of Mud Architecture, these apparent anti-patterns are more useful than they appear.

You've probably seen various folks talking about how fewer lines of code are better. Every language partisan wants to prove that their language is more expressive than Blub.

You've probably worked with people who want to reinvent every wheel, and maybe you've reinvented a few as well in your wild youth. We won't tell anybody. But it's easy to look at a random tool and say, "that would be enormously better in (my favorite language)!"

Often different quirks look unrelated, but can share a common cause. This is one such time.

Let's start with "fewer lines is better". While shorter can be more powerful, it's a given that shorter is usually more readable. Not "cram it onto one line" shorter, but shorter in the sense of fewer symbols and operations. As a result, more succinct languages are often more expressive, and express common ideas in a smaller amount of code. An experienced programmer can read and absorb functionality faster in those languages. The tool or library is lexically shorter/smaller, so the programmer reads and absorbs it more quickly.

The second idea that's more useful than you think is "not invented here." By "not invented here", I'm specifically referring to people liking tools written in their own preferred language. Rubyists love tools in Ruby, Pythonistas want everything written in Python, Perl hackers rewrite everything in Perl. I claim that this is not purely about elitism and "whatever I use is better." It's also in the hope that they can understand their tools at a deeper level. Having a tool written in a language that you understand allows you to dig down into it in a way that isn't otherwise possible.

These ideas, together, are a vision of every tool and library, written in the same tremendously terse language -- whichever language that may be. It is the vision of all parts of a solution being readable at high speed. It is the vision of a programmer being able to inhale all of the design, all of the algorithms, all of the underlying quirks, at amazing speed.

We already know that it's silly to spend all your time reimplementing everything in your favorite language... And unless you get some very specific other benefit, it's silly to spend your time finding the shortest, most idiomatic way to express a chunk of code. But sometimes it just feels right to try it. And sometimes there's more underneath that feeling, even when we can't articulate it. Often bad habits come from noble motivations. Break your habits, but think about the goal that underlies them -- you may want it some day.

Quickie: An Excellent List of Ruby Idioms

Posted: 2 years ago (2007-11-30 18:47:32 UTC ) / Updated: 10 months ago (2009-11-07 01:10:29 UTC )

Imported from WordPress

Originally posted on 2007-11-30 18:47:32

Some time ago, I wrote about language idioms and why they're interesting and important to me. RubyGarden has published a quite nice list of idioms in the Ruby language (original link). Some of them will be familiar to you from other languages, especially Perl and Python.

Reading through the list, and thinking about language idioms and how they show valuable mindsets, I'm struck by something... A list like this is not a good way to search Ruby for the kind of language idioms I'm talking about, unless you already have a pretty fair grasp of Ruby. That's because the idioms in a list like this assume you already know the really interesting basic idioms like, "iteration in Ruby is usually done by calling 'each' on a Range object and passing a block" and "Ruby API functions usually take blocks for configuration." While these idioms may refine such points, and may point out interesting further things about them, an idiom like that won't usually wind up on the list of idioms, because it's obvious to somebody who learned Ruby from one of the normal sources. Which means that such a list doesn't help you if you're a non-Rubyist trying to mine the Ruby language for interesting idioms to use elsewhere.

Several places in the idiom list, the term 'Rubyish' is bandied about. You'll see equivalent words for other languages as well -- there are often specifically Pythonish ways to do things, for example. The word 'Rubyish' means two different things, and I think it's worth teasing the two definitions apart. A Rubyish solution sometimes means "elegant when expressed in Ruby," or equivalently "plays to Ruby's strengths." Those are the same meaning of "Rubyish." Creating classes at runtime is very Rubyish because, while Python allows it, Ruby has excellent syntax for it and other features that play into such things - it is very Rubyish because Ruby does it quite well. Boolean bit-twiddling is very C-ish because while C's bitwise operators have been copied in many other languages, C provides an efficiency in using them that is rarely matched, combined with an ability to get at the internal bit-by-bit representation of each data type that modern languages rarely provide.
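To make the "creating classes at runtime" point concrete, here's a minimal sketch of building a class from a list of field names. The helper name make_record_class and the Point class are made up for this illustration; only Class.new, attr_accessor and define_method come from Ruby itself.

```ruby
# make_record_class is a hypothetical helper, not a standard Ruby method.
# It builds a brand-new class at runtime with one accessor per field name.
def make_record_class(*fields)
  Class.new do
    attr_accessor(*fields)

    # define_method keeps the fields list in scope as a closure.
    define_method(:initialize) do |*values|
      fields.zip(values).each do |field, value|
        instance_variable_set("@#{field}", value)
      end
    end
  end
end

Point = make_record_class(:x, :y)
origin = Point.new(0, 0)
origin.x = 5
puts origin.x
```

The whole class definition is an ordinary expression, so it can be returned from a method and assigned to a constant like any other value.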

A subtly different, second meaning of "Rubyish" is, "something most other languages do badly." For instance, procedural macros are extremely LISPish, simply because so few other languages implement them at all, and none do so very elegantly. In that sense, passing blocks to functions like "each" and "grep" is very Rubyish. Many languages don't allow it at all, and those that allow it tend to have inconvenient syntax (Python and LISP come to mind).
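For readers who don't write Ruby, here's a small sketch of what passing blocks to "each" and "grep" looks like. The method names sum_up_to and shout_matching are invented for the example.

```ruby
# each on a Range: the block is the loop body.
def sum_up_to(n)
  total = 0
  (1..n).each { |i| total += i }
  total
end

# grep with a block: the pattern picks elements, the block transforms each match.
def shout_matching(words, pattern)
  words.grep(pattern) { |word| word.upcase }
end

puts sum_up_to(4)
puts shout_matching(%w[ruby python lisp perl], /^p/).inspect
```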

The second meaning varies from the first when you have features that exist in few languages, but work questionably even where they exist. Procedural macros in LISP are again a good example -- they work far better in LISP than elsewhere, but they're still a tremendous pain to use in LISP. But they exist, and sometimes that makes all the difference. DGD provides full transactional rollback of all variable accesses in any function tagged 'atomic' and everything it calls... In essence, it provides database-like rollback, but straight from the language itself. On the one hand, this can be very painful. Try getting error logs out of a bad request when all your variables and files are rolled back to starting state, for instance. On the other hand, that is an extremely DGD-ish facility because no other programming language except SQL has a facility even vaguely like it.

I think it's worth teasing these definitions apart because in these Dark Ages of programming languages, it's easy to lose sight of how things could be done better. There is always, always a difference between a good idea done in a good, usable way and the same idea with no competition. Contrast Ruby on Rails with solutions like ASP.net. Contrast the internet and HTML with previous services like AOL (pre-internet) and CompuServe. In each case, the previous solution wasn't exactly good... But it was all there was.

So watch yourself, and try hard not to conflate definition two with definition one. When you judge a thing, try to judge it on its merits separately from judging it against its competition. Sometimes you must choose the best thing available whether it's actually good or not. But never lose sight of what it is.

What Do Macros Actually Give You: Reader Response

Posted: 2 years ago (2007-11-27 18:55:20 UTC ) / Updated: 15 months ago (2009-06-01 22:27:36 UTC )

Imported from WordPress

Originally posted on 2007-11-27 18:55:20

(Link to Part One)

After I wrote part one, the anonymous reader FooBar posted an excellent reply in the comments. I'd like to respond to the points he made there.

First off, I'd like to make it clear that I'm not talking about what optimizations macros can give you. Macros do often give good optimizations, but I'm not prepared to do the amount of profiling required to seriously address this topic. Macros can be a great way to avoid using closures, for instance, which is useless to know unless you know how fast macros are compared to closures. You'd want to know that in at least one LISP interpreter, and also how fast closures are versus other workarounds in your language of choice. I'm not prepared to answer those questions properly, even for my dynamic language of choice (Ruby). I may do so at some point in the future.

I am trying to answer questions of expressiveness of methods other than macros versus expressing the same thing in macros. That's a subjective question, but I think raising it so that people can judge for themselves is basically useful.

He points out that macros can be used to create new control structures (for example, LISP's LOOP) and new semantics. Both of these are normally accomplished with the "fill in the blank" code in languages like Ruby, Perl and Python that have eval, first-class method objects and closures, but no macros. "Fill in the blank" code is discussed in part one. He also mentions that macros can expand symbols into code, which is even more literally "fill in the blank" code. The ITERATE construct for LISP is an example of this kind of macro coding, and its overall form and semantics should be familiar to most Ruby users. This is the kind of "build a control structure out of blocks" coding that Ruby excels at.
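As a rough illustration of "build a control structure out of blocks," here's a sketch of a retry construct written as an ordinary method. The name with_retries is hypothetical; it isn't part of Ruby or of anything FooBar mentioned.

```ruby
# with_retries is a hypothetical control structure: it runs the block,
# and re-runs it after an error until max_attempts is reached.
def with_retries(max_attempts)
  attempts = 0
  begin
    attempts += 1
    yield attempts
  rescue StandardError
    retry if attempts < max_attempts
    raise # out of attempts: re-raise the last error
  end
end

result = with_retries(3) do |attempt|
  raise "flaky failure" if attempt < 3
  "succeeded on attempt #{attempt}"
end
puts result
```

From the caller's side this reads much like a built-in loop, even though no macro or new syntax is involved.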

Macros also allow execution of code at compile-time, which is a neat trick. I still have to say, though, that Perl has the best facility for this that I've ever seen. I miss it in other languages. In Perl, code sections inside a BEGIN {} section are executed at compile time, and the BEGIN section is replaced with their value. I'm sure there are ways to fake this in other languages, but I can't think of a better, or less complex, way to do it. LISP macros use a fairly obscure method, which FooBar mentions, to determine whether blocks are executed at one time or another. Ruby has a fairly similar facility with class_eval() versus regular eval(). It's obscure and hard to keep track of in Ruby, too :-)

However, macros can also analyze a piece of code and do compile-time optimization. This is a really, really good trick which can't be replicated in macro-less languages. He mentions the SCREAMER macro library for LISP which makes extensive use of this facility to implement PROLOG-style backtracking variables in a LISP setting. You can do that without macros, but macros allow for a lot more optimization in this case. So there's a trick that we simply can't do without macros or something like them. The SERIES macro library for LISP does tricks with iteration on (potentially infinite) series in a similar way, though it's easier to work around this in other languages.

I'd also like to point out that I was wrong about there being no way to parse Ruby in Ruby, or to get at Ruby parse trees in general. There's a ParseTree rubygem that will access Ruby's internal parse tree, and a lovely tool called Heckle that uses it to take your code apart and make sure your tests fail at all the right times.
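ParseTree targets the Ruby 1.8 interpreter. On newer Rubies (1.9 and later), the standard library's Ripper gives a comparable view of the parse tree, so the sketch below uses Ripper instead; that's a substitution on my part, not the tool named above. The identifiers helper is made up for the example and is only a crude stand-in for real static analysis.

```ruby
require 'ripper'

# Ripper.sexp parses source text and returns the parse tree as nested arrays.
tree = Ripper.sexp("1 + 2")
puts tree.inspect

# identifiers is a hypothetical helper: it walks a Ripper tree and collects
# the text of every :@ident token, without running the code.
def identifiers(node, found = [])
  return found unless node.is_a?(Array)
  if node[0] == :@ident
    found << node[1]
  else
    node.each { |child| identifiers(child, found) }
  end
  found
end

puts identifiers(Ripper.sexp("foo(bar(1))")).inspect
```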

Edit: Another blogger, Eric Kidd, has addressed a similar topic with respect to Ruby and macros elsewhere. He's also worth a read if you're interested in where LISP macros give you power that Ruby doesn't.

What Do Macros Actually Give You?

Posted: 2 years ago (2007-11-23 19:09:35 UTC ) / Updated: 15 months ago (2009-06-01 22:27:28 UTC )

Imported from WordPress

Originally posted on 2007-11-23 19:09:35

(Disclaimer: I would love to see somebody better-qualified tackle this topic. In the meantime, I think it's something that needs more discussion.)

Paul Graham writes a lot of rhapsodies to using LISP for server-side apps, particularly when it comes to LISP macros and their productivity.

For those who have never heard of LISP macros: LISP is a language with very little syntax. Its code is in the same syntax as its list data, so code consists of nested list data. For instance: "(print (+ 7 (- 6 4)))" is both perfectly good LISP code and, if you treat "print" as a piece of data, a perfectly good LISP list. You could write a little LISP function that takes a list as input and gives a list as output. If you then fed a piece of code through it to get another piece of code, that would be a macro. Neat, no?
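If you don't read LISP, the same idea can be sketched in Ruby by letting nested arrays stand in for LISP lists. This is only an analogy: evaluate and expand_square are hypothetical names, and real LISP macros run inside the compiler rather than as a separate step.

```ruby
# Nested arrays play the role of LISP lists:
# [:+, 7, [:-, 6, 4]] stands for (+ 7 (- 6 4)).
def evaluate(expr)
  return expr unless expr.is_a?(Array)
  operator, *operands = expr
  operands.map { |operand| evaluate(operand) }.reduce(operator)
end

# A "macro" in the sense above: a function from a list of code to a list of code.
# This one rewrites (square x) into (* x x) everywhere in the tree.
def expand_square(expr)
  return expr unless expr.is_a?(Array)
  operator, *operands = expr.map { |part| expand_square(part) }
  operator == :square ? [:*, operands[0], operands[0]] : [operator, *operands]
end

code = [:+, 7, [:square, [:-, 6, 4]]]
expanded = expand_square(code)
puts expanded.inspect
puts evaluate(expanded)
```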

Perl can do a similar trick by taking blocks of code as text and outputting more code as text, then passing it to 'eval'. That doesn't work very well because Perl is an incredible pain to parse. Python and Ruby do similar tricks, also with 'eval'. While they're not as awful to parse as Perl, they're still pretty bad. The brilliant stroke that makes LISP better is that it's trivial to parse, and can be passed around pre-parsed. So it's easy to write code that modifies other code.
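Here's a minimal Ruby sketch of the code-as-text approach: build source in a string, then pass it to eval. The helper adder_source is invented for the example. Notice that the generated code is just characters, so inspecting or changing it would mean parsing Ruby, which is exactly the painful part.

```ruby
# adder_source is a hypothetical helper that returns Ruby source as a string.
def adder_source(amount)
  "lambda { |x| x + #{amount} }"
end

source = adder_source(5)
puts source

# eval turns the text into a live lambda.
add_five = eval(source)
puts add_five.call(10)
```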

So what would you do with this code-modifying-other-code? One answer is to fill in the blanks in a code template, rather like big messy C macros do. In languages like C where it's difficult to pass code around, this is a very useful thing. In languages like LISP, Python and Ruby where it's easier, this capability is mostly handled by method objects, closures and other higher-order-function stuff. So in those languages, you don't need macros to do that, and it's often a bad idea to have them do so.

Here's a simple example of fill-in-the-blanks code in Ruby:

  def grouping_iter(tokens, pre = proc{|tok|}, post = proc{|tok|}, &myproc)
    pre.call(tokens)

    newtokens = tokens.collect do |token|
      token.kind_of?(Array) ? grouping_iter(token, &myproc) : token
    end

    post_tok = myproc.call(newtokens)
    post.call(post_tok)
    post_tok
  end


This iterator is a bit like the 'map' statement, which takes a list and a function and returns a new list of results of that function call. The code above is like that, but it also descends into any lists inside the list and calls the function on them. So it's a simple, map-type convenience function. It also lets you pass separate "pre" and "post" functions for setup and teardown. So to call it, you might say:
  grouping_iter(filelist, proc { |tok| init_jpeg_lib() }, proc { |tok| shutdown_jpeg_lib() }) { |sublist|
    open_jpeg_files(sublist)
  }

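This is "fill in the blank" template code because the iterator leaves blanks for the caller to fill in with code: the "pre" and "post" functions, and the block that gets called on each list.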
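To see the iterator run without a JPEG library, here's a self-contained sketch. The iterator is repeated so the example stands alone; the setup and teardown procs just record that they were called, and the block sums each list. Those stand-ins are assumptions for the demo, not part of the original example.

```ruby
def grouping_iter(tokens, pre = proc { |tok| }, post = proc { |tok| }, &myproc)
  pre.call(tokens)

  # Descend into nested lists first, replacing each with the block's result.
  newtokens = tokens.collect do |token|
    token.kind_of?(Array) ? grouping_iter(token, &myproc) : token
  end

  post_tok = myproc.call(newtokens)
  post.call(post_tok)
  post_tok
end

log = []
total = grouping_iter([1, [2, 3], 4],
                      proc { |tok| log << "setup" },
                      proc { |tok| log << "teardown" }) { |sublist| sublist.sum }
puts total
puts log.inspect
```

As written, the recursive call passes only the block along, so "pre" and "post" run once for the outermost list and not for the nested ones.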

A LISP macro can fill in those blanks in a similar way. In fact, LISP has first-class functions and closures, so common LISP macros like the LOOP statement could have instead been written to pass a function through as a parameter. That's how most Ruby iterators work, as well.

Macros can fill in the blanks in code templates with values rather than code. But that can be done by ordinary function calls -- only performance is potentially different, and it's not different enough for us to worry much about.

LISP macros can also be used for the kind of polymorphism that is handled by method dispatch in OO languages. That's a special kind of 'fill-in-the-blank' code like the above, except it runs different code depending on some other object. We know our other languages can do that, either by standard OO method dispatch, or by wacky things like Ruby's ability to override a method for any specific single object.
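The "override a method for any specific single object" trick looks like this in practice. Duck and its speak method are made up for the sketch; the def-on-an-object syntax is plain Ruby.

```ruby
class Duck
  def speak
    "Quack"
  end
end

plain = Duck.new
odd = Duck.new

# Override speak for this one object only; other Ducks are unaffected.
def odd.speak
  "Moo"
end

puts plain.speak
puts odd.speak
```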

Macros can be used to replace particular operators, function calls or data constructs with slightly different ones -- for instance, in order to print debug messages, track allocated memory or otherwise do simple bookkeeping. Languages capable of Aspect-Oriented Programming do this tracking and bookkeeping routinely, but it's neat that LISP (vintage 1960) had a construct that got this right. Still, languages like Ruby that allow lots of rebinding of methods on standard object classes (for instance, overriding 'plus' on integers) can finally match this ability.
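Here's a minimal sketch of that kind of bookkeeping in Ruby, assuming a version with Module#prepend (Ruby 2.0 or later). It wraps a method on a small made-up Account class rather than on Integer, since rebinding 'plus' on integers affects the whole program. CallCounter and Account are hypothetical names.

```ruby
# CallCounter is a hypothetical bookkeeping module: it counts calls to
# deposit, then hands off to the original method with super.
module CallCounter
  def self.counts
    @counts ||= Hash.new(0)
  end

  def deposit(amount)
    CallCounter.counts[:deposit] += 1
    super
  end
end

class Account
  prepend CallCounter # CallCounter#deposit now runs before Account#deposit

  attr_reader :balance

  def initialize
    @balance = 0
  end

  def deposit(amount)
    @balance += amount
  end
end

account = Account.new
account.deposit(5)
account.deposit(7)
puts account.balance
puts CallCounter.counts[:deposit]
```

The counting only happens when deposit is actually called at runtime, which is the limitation the next paragraph turns to.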

The final use that comes to mind is to track what functions or variables are used. On this, AOP and Ruby start to fall down, because they detect usages when these methods are called at runtime. It's much harder for Ruby or Java-plus-AOP to statically analyze code and determine what methods it can call or constants it can use [1]. LISP macros, since they can iterate through the code in full detail, can do this immediately at compile time. It's true that certain static analysis tools can do the same thing for the languages they operate on, but they're not usually callable from the program, and that makes a *lot* of difference. They also can't usually do much with functions that are defined dynamically or at run-time, while LISP macros have no trouble with that.

So a quick analysis suggests that Ruby (and other languages, I'm sure, just none I know well) can match most of the uses of LISP macros for most purposes... But *not* do code analysis the way good macros, applied well, can manage.

(I have also written a response to this article's comments. If you found this interesting or enlightening, you may wish to read that as well.)

[1] Ruby's "eval" on text blocks could be used to similar effect if it were easy to parse the text block for your static analysis. Sadly, there's not a good Ruby parser written in Ruby currently, nor one invocable from the Ruby language. Other languages, especially easier languages to parse, may fare better on this point.

Choosing What You Do

Posted: 2 years ago (2007-11-22 17:32:05 UTC ) / Updated: 15 months ago (2009-06-01 21:31:26 UTC )

Imported from WordPress

Originally posted on 2007-11-22 17:32:05

It has been remarked elsewhere that a company should only outsource its non-core functions. So you should be sure to do in-house anything that is "what you do" as a company. Outsourcing is fine if, for example, the company is IBM and they hire another company to clean their offices after-hours. Cleaning up offices is not what IBM does, nor should it be. Now turn that idea on its head: when you have some other company perform a function for you, you are saying, "somebody else is better at this than we are, so we're not doing it." And that means it's not what you do. How can you, a software engineer, apply this principle to your own projects?

As an engineer, you're constantly making decisions about what code to write, and what existing libraries or applications to reuse. This is as it should be. Buying the wheel is generally much cheaper than reinventing it. Obviously you want to buy things unrelated to your core business. For instance, if you're building a web browser with better standards-compliance and better RSS management than your competition, you're probably better off using libjpeg than rolling your own JPEG decoder library. Fast JPEG decoding isn't really what you do. Starting to see the parallel?

And in the same way, buying or using a piece of software written by somebody else says, "this is not our core function". If your web browser uses libjpeg, you have declared "it's okay for us to have the same quality and speed of jpeg decoding that everybody else uses." So whatever may distinguish your product from the pack, JPEG decoding isn't it. By making these decisions, you aren't just expressing your product's intended niche, you're defining that niche.

Just because I talk about companies and products, don't assume that open source projects are exempt. As an engineer on a free software project you make all the same sorts of decisions -- write a new library, or use somebody else's? And your software has to distinguish itself just as much. What makes your software better than the incumbents? People will choose it for the same reasons as commercial software.

In general, you always want to be doing the very most important thing you can be doing. And of course, you want to knock together early versions as quickly as possible. This means that initially, you want to use other people's code as much as possible, to quickly get that initial version... Which means, defining yourself as doing almost nothing. Isn't that a contradiction?

Nope. It's what you do when you've just started. It would be nice to say "I do everything." But when you've just started, you don't say that. Your software can't live up to it. If you want to do everything first, it'll take ten years before it can do anything. And so the other half of the same lesson is that in the beginning, what your software does must be narrow, specific and focused. In other words, almost nothing.

Keep at it, and eventually you'll build up to more.

Ruby Type Coercion

Posted: 2 years ago (2007-11-20 00:08:00 UTC ) / Updated: 15 months ago (2009-06-01 22:27:19 UTC )

Imported from WordPress

Originally posted on 2007-11-20 00:08:00

Ruby, like most languages, has various methods to turn one data type into another. Unlike other languages, it often uses that mechanism for type-checking.

Methods can do type-checking in at least two different ways. They can use the is_a? function to see whether the given object is of the right type, or they can try to turn it into an object of the right sort. Here's an example of the first method:

def myfunc(a_str, a_list)
  raise "Not a string!" unless a_str.is_a?(String)
  raise "Not a list!" unless a_list.is_a?(Array)

  # now do what you actually want to
end


Pretty straightforward. This approach can work well. There are also various ways to automate it, as a subset of programming by contract or design by contract.
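As one possible way to automate those checks, here's a sketch of a tiny helper. The name expect_types is invented for this example; it isn't an existing Ruby method or library call, and a real design-by-contract library would do a good deal more.

```ruby
# expect_types is a hypothetical helper: each argument is a [value, class] pair.
def expect_types(*pairs)
  pairs.each do |value, klass|
    unless value.is_a?(klass)
      raise TypeError, "expected #{klass}, got #{value.class}"
    end
  end
end

def myfunc(a_str, a_list)
  expect_types([a_str, String], [a_list, Array])

  # now do what you actually want to
  "#{a_str}: #{a_list.length} items"
end

puts myfunc("files", [1, 2])
```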

The second method takes about the same amount of code, though it's not always as clear what it's doing. It's probably also possible to automate, though I can't say I've seen a Ruby module to do so yet:

def myfunc(a_str, a_list)
  a_str = a_str.to_str # no-op if a_str is already a string
  a_list = a_list.to_ary # no-op if a_list is already an array

  # now do what you actually want to
end


Unfortunately, Ruby doesn't seem to have an easy way to do the equivalent of CommonLISP's 'coerce' function. If you wrote one and called it lisp_coerce, then using it would look like this:
# This doesn't work because there's no such Ruby function as "lisp_coerce"

def myfunc(a_str, a_list)
  a_str = a_str.lisp_coerce(String) # no-op if a_str is already a string
  a_list = a_list.lisp_coerce(Array) # no-op if a_list is already a list

  # now do what you actually want to
end


The idea is that it would return the object, interpreted as a different type. So 37 as a string would be "37", 37 as an array might be [37], and 37 as a hash table would raise an exception and tell you that you were being silly. Ruby actually does have a coerce method as part of its Numeric class. Unfortunately, Ruby's coerce works on two numbers, the receiver and its argument, and attempts to get them into a compatible numeric representation. For instance, if you give it a Float and a Complex number, it will convert them both to Complex.
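A quick sketch of what Numeric's coerce actually does: it returns a two-element array holding the argument and the receiver, both converted to a common numeric type.

```ruby
# Integer receiver, Float argument: both come back as Floats.
puts 1.coerce(2.5).inspect # [2.5, 1.0]

# Complex receiver, Float argument: both come back as Complex numbers.
puts Complex(1, 2).coerce(1.5).inspect
```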

I say "unfortunately" because that means it can't be used as the lisp_coerce() statement above. Instead, you've got to find or guess the method names to turn one type into another, and many of them aren't all that obvious. This problem has been noticed, but there seems to be no real consensus on what the right answer looks like.
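For what it's worth, a lisp_coerce along those lines can be sketched on top of Ruby's Kernel conversion functions (String(), Array(), Integer(), Float() and Hash()). This is only one possible design: lisp_coerce and the CONVERTERS table are hypothetical, it's written as a plain function rather than a method on every object, and the Hash() function assumes Ruby 2.0 or later.

```ruby
# CONVERTERS maps a target class to the Kernel function that converts to it.
CONVERTERS = {
  String  => ->(obj) { String(obj) },
  Array   => ->(obj) { Array(obj) },
  Integer => ->(obj) { Integer(obj) },
  Float   => ->(obj) { Float(obj) },
  Hash    => ->(obj) { Hash(obj) }
}

# lisp_coerce is a hypothetical function, not part of Ruby.
def lisp_coerce(obj, klass)
  return obj if obj.is_a?(klass) # no-op if it's already the right type
  converter = CONVERTERS.fetch(klass) do
    raise ArgumentError, "don't know how to coerce to #{klass}"
  end
  converter.call(obj)
end

puts lisp_coerce(37, String).inspect # "37"
puts lisp_coerce(37, Array).inspect  # [37]

begin
  lisp_coerce(37, Hash)
rescue TypeError => e
  puts "TypeError: #{e.message}"
end
```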

There is also another way to do it, though it's less consistently used: constructors. For instance:

# This doesn't work very well

def myfunc(a_str, a_list)
  a_str = String.new(a_str)
  a_list = Array.new(a_list)

  # now do what you actually want to
end


Unfortunately, constructors often don't call the existing conversion functions. String.new, above, will die if given a Fixnum, even though "37.to_s" works perfectly well. So this method isn't recommended for real use because the methods aren't in place.
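Here's a short sketch of how the constructors behave, as far as I know, on current Ruby versions. The helper try_string_new is made up for the demo; it just turns the exception into a string so the script keeps running.

```ruby
# try_string_new is a hypothetical helper that reports the error instead of dying.
def try_string_new(value)
  String.new(value)
rescue TypeError => e
  "TypeError: #{e.message}"
end

puts try_string_new("abc")
puts try_string_new(37)

# Array.new treats an Integer as a size, not as an element to wrap.
puts Array.new(3).inspect # [nil, nil, nil]

# Given an array, Array.new returns a copy of it.
puts Array.new([1, 2]).inspect
```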

So if you don't want your methods to just be stodgy and raise an exception on anything that isn't instantly similar, you'll need to use the methods to_int, to_str, to_hash and to_ary to convert to a Fixnum, a String, a Hash and an Array, respectively. And you'll need to do some exploring if you want to convert to other classes, or define your own questionably-chosen names for your own classes. This is an area where Ruby could probably use some standards...
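To show what defining to_str and to_ary buys you, here's a sketch with two made-up classes, Username and Pair. Core methods such as String#+ and Array#+ call these conversion methods on their arguments, so the objects can stand in for real strings and arrays.

```ruby
class Username
  def initialize(name)
    @name = name
  end

  # to_str tells Ruby this object can be used wherever a String is expected.
  def to_str
    @name
  end
end

class Pair
  def initialize(left, right)
    @left = left
    @right = right
  end

  # to_ary does the same job for Arrays.
  def to_ary
    [@left, @right]
  end
end

puts "Hello, " + Username.new("ada")
puts ([0] + Pair.new(1, 2)).inspect
```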