How Much Does Your Choice of Programming Language Limit You?

The first program I ever wrote was for a computer which never existed.

Somehow I got into the High School math club. The 'Smart Kid' had spent the summer hanging out at the David Taylor Model Basin learning stuff from the mathematician who ran it. He (the Smart Kid) came in and made a 'presentation' of how a computer worked and gave us a problem to try: sort a sequence of numbers in storage locations.

So I took the mimeographed paper home and 'invented' the exchange sort.

All the paper computer had was a couple of registers, maybe 10 instructions (with decimal op codes), maybe 20 or 30 storage locations, and a clock.

I programmed it in 'machine language'.

So when I made it to college and found out that they'd just gotten the second IBM 360, model 50, ever made, it made perfect sense to buy a copy of the IBM Principles of Operation for the machine. Again, I was reading machine instructions [the book was pretty thin at that time].

The next year I took a Numerical Methods course and 'learned' FORTRAN. Of course I was thinking in terms of machine instructions, so it also made perfect sense to me.

All this biased me towards thinking of programming as controlling the hardware.

I think it took another 10 years before I realized I was wrong and started trying to figure out what programming is really about.

I'm still working on it - but with such a backward start, it's been difficult.

"Huh?" you say?

Let me explain.

The first 'High Level Language' was probably FORTRAN - which is short for Formula Translator, or something like that.

It was designed for translating algebraic formulas into machine instructions that the computer could execute. It's possible - but very difficult - to use Fortran to do other things. Like the first computer game I ever saw - Adventure - was written in FORTRAN. All the twisty little paths were modeled as arrays of indices to other array elements, etc etc. I/O in FORTRAN was disgusting. Character strings were non-existent: they were usually stuffed into 32 (or 36) bit words, left or right justified, and blank padded. (blech!!!!)

But formula translation works really well for formulas and such - which is why it's still in use.

But the modern use of computers is really moving data around, drawing pictures, communication, control, operating machines, managing your burglar alarm system, (pant pant pant) and more stuff. Very little of this involves encoding and solving Algebraic Formulas.

So I - and a lot of other 'engineering students' - started out with a very limited view of what computers are and what can be done with them.

Here's another example:

COBOL - maybe the 2nd (or 3rd) high level language 'invented'. Common Business Oriented Language (or something close to that). Its essential view of 'reality' is a business form - a spread sheet, as it were. About all it's good for is copying forms from one place to another while modifying some of the information - like giving the boss a raise as his record goes through.

Again, it's a very limited view of the world.

Then there was Algol - designed to encode algorithms, and Smalltalk - which sees the world as a bunch of autonomous objects (forgive me if I oversimplify). (let's forget about APL - it seemed to think the world was some sort of algebraic calculator with strange grouping rules)

A big wake up for me was when I started using *NIX systems and found awk - which sees the world as lines of text which can be broken into fields and messed with. And then lex, yacc, troff (and his friends: tbl, grap, and eqn).

My computing world started getting larger as I realized that you can create a language and that the concepts you can express in it are the limitations of what it's easy to do.

I eventually realized that programming is about building models of reality.

I'm going to repeat that, because I think it's important: programming is about building models of reality.

We've got some pretty good languages now - C, Python and Ruby. I guess you might want to include C++, C#, and Java - along with the thundering herd of Web oriented stuff - and that's OK, but it's not really the point I'm going for.

All of those languages start out trying to describe a limited set of features of Reality. And in doing that, they build limitations into the ideas you can easily express. When we - as coders - try to do something that language doesn't support directly, we have to 'work around' the language. 

I ran across the difference between what a language supports and what a language allows during one of my many attempts to understand C++. Bjarne Stroustrup made the point that in designing C++ he built a language which not only allowed object oriented programming, but contained language constructs which support it. C allows object oriented programming - borne out by the fact that his original C++ compiler generated C, not machine language - but C does not in any way support object oriented programming. (I know from experience in attempting to write C in an object oriented style)

So I think it's pretty much obvious that the design of a computer language strongly shapes how programmers program. More importantly, it creates a limiting context about how they see the world and create programming solutions which model it.

What view of reality is best for a Language to support?

Doesn't it seem reasonable to design a language using the most general model of reality we have? That is, assuming that the model can be made specific enough that we can compute precise representations of 'reality'.

So what's our most general model?

The most general model I know of is the mathematical concept of a function.

What's a function? You cooperatively ask.

Well, a function is a map from a domain to a range such that for each element in the domain there is exactly one element in the range. How's that for a vacuous definition?

I'm going to sound more ridiculous, but I think it's worth the apparent diversion.

Remember - we are modeling 'reality'.

So here's something real: craps.

When shooting craps, you need a pair of dice, a flat surface, and some people who know the rules. To make it simple, we'll use a subset of the rules:

  • the shooter throws the dice
  • everybody counts the dots on the visible horizontal surface of both dice and adds them up
  • everybody applies the rules and decides what happens next.

Here's the point: the 'function' which 'maps' the roll of the dice into a 'win', 'lose', or 'roll again' is the rules as they exist in the minds of the people. Given any roll, there is one and only one outcome.

If I want to write a program to simulate craps, I need to grab a couple of random numbers and apply the rules. But that's not the real function. What I am writing is a model. The real function exists in the event in the alley where the guys are 'shooting craps'  (if it exists at all, but that's another story).

In order to model craps, I need to pick a language which at least allows all of the stuff I need to simulate the important stuff. You may think this is easy - and it's not too hard to find one - but I defy you to accomplish this in a language which does not allow repetition - such as HTML, CSS, or native SQL. You need to have numbers, arithmetic, some means of repeatedly throwing dice, and some logic operations to implement the rules.
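Here's roughly what such a model might look like in Python. This is only a sketch: the subset of rules above doesn't spell out which totals win or lose, so I'm assuming the usual 'pass line' rules (7 or 11 wins on the first roll; 2, 3, or 12 loses; anything else becomes the 'point', which then wins if rolled again before a 7). The names outcome, throw, and play are made up for the illustration. Note that once a point is set, the 'function' really maps the pair (roll, point) to an outcome.

```python
import random

def outcome(total, point=None):
    """Map a dice total (and the current point, if any) to an outcome."""
    if point is None:                 # first ('come-out') roll
        if total in (7, 11):
            return "win"
        if total in (2, 3, 12):
            return "lose"
        return "roll again"
    if total == point:                # the shooter made the point
        return "win"
    if total == 7:                    # 'sevened out'
        return "lose"
    return "roll again"

def throw(rng):
    """Throw two dice and add up the dots."""
    return rng.randint(1, 6) + rng.randint(1, 6)

def play(rng=None):
    """Keep throwing until the rules say 'win' or 'lose'."""
    rng = rng or random.Random()
    point = None
    while True:
        total = throw(rng)
        result = outcome(total, point)
        if result != "roll again":
            return result
        if point is None:
            point = total             # the first undecided total becomes the point

print(play())
```

Everything the paragraph above asks for shows up: numbers, arithmetic, repetition (the while loop), and logic (the rules in outcome).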

The Point

So here's the point. (again in overly simplified terms)

Assembler sees the world as machine instructions and memory.

FORTRAN sees the world as Formulas

COBOL sees the world as business forms

C sees the world pretty much as an abstraction of computer hardware.

Ruby, Python, and Smalltalk (and friends) see the world as objects with state and methods.

Lisp sees the world as functions.

(yeah - I just snuck in Lisp)

Let's try it another way

Assembler is designed to translate text directly into machine instructions

FORTRAN is designed to translate formulas into machine instructions

COBOL is designed to translate business forms into machine instructions

C is designed to translate operating system concepts into machine instructions

Ruby is designed to translate object abstractions into machine instructions

Lisp is designed to translate arbitrary functions of finite arguments into machine instructions

So which language should give you the most flexibility in solving problems of arbitrary complexity and scope - i.e. building models of reality?

Paul Graham is on record as saying all programming languages are converging towards Lisp. I think he's right and that the reason is that the model of reality Lisp has is the most general one we currently have. Inasmuch as programming is modeling reality, it makes sense that the best way to expand the domain of possible things we can model is to use the most general language we have.

So why don't we dump all the specialized languages and all hack Lisp?

I don't know about you, but I'm not a good enough coder - yet. Like I wrote above: I started in the wrong direction to understand the abstraction that is programming. So after a lot of backtracking, poking around, etc I think I'm finally ready to tackle Lisp.

Oh, there's just one more thing

One of the things I learned this last year of studying Ruby and Rails is the importance of generating code at run time. The currently popular term for this is Metaprogramming - but I think that's really yet another corruption of language (English, that is, not the Ruby Language).

The simple fact is that programming is hard and we don't have time enough to write all the code we need. In languages which do not allow dynamically generated code - such as Java - programmers have to resort to external resources to generate code which 'is the same, with minor variations'. Dynamic languages such as Ruby and Python allow run time generation of methods and objects, so they provide the possibility of making programmers more efficient.
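A small Python sketch of what 'the same, with minor variations' can look like when the methods are generated at run time. The Record class, the add_getters helper, and the field names are all invented for the illustration; nothing here comes from a real library.

```python
class Record:
    """A bag of named fields, something like one business form."""
    def __init__(self, **fields):
        self._fields = dict(fields)

def add_getters(cls, names):
    """Attach a get_<name>() method to cls for each field name, at run time."""
    for name in names:
        # Bind the current name as a default argument so each
        # generated method remembers its own field.
        def getter(self, _name=name):
            return self._fields[_name]
        getter.__name__ = "get_" + name
        setattr(cls, "get_" + name, getter)
    return cls

add_getters(Record, ["employee", "salary"])

boss = Record(employee="the boss", salary=100)
print(boss.get_employee(), boss.get_salary())
```

Nobody typed get_employee or get_salary; they were built from a list of names while the program was running.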

But neither Ruby nor Python really supports run-time generation of code. They allow it.

Lisp supports it.

So I think Lisp is a better language - and this year's project is to become a decent Lisp coder.

I'm really, really tired of being limited by the world view of my programming language.

How to make a Bad Law (i.e. SOPA)

Laws which deal with criminal activity are kind of a strange beast: they exist to control people who won't follow the rules - so they are really difficult to implement (read: find enough proof to catch somebody and do unpleasant things to them)

This is really frustrating. A good example of this is copyright.

Copyright is just about useless. If you have enough money and you follow all the rules, you can enforce your rights in the US and any country which has signed the Berne Convention, but even that doesn't work everywhere - as the existence of the 'pirates' demonstrates.

So we say 'the law doesn't work and we need a better one'.

So what can a politician do when faced with the prospect of needing to 'do something' when nothing really seems to work?

Usually they decide to make people who will obey the rules take over the job of implementing the law.

Someone who:

  • operates under the law of the USA
  • has a history of obeying the law
  • has a lot to lose if they get caught violating the law

Let's see how this works in practice:

Income Tax Collection

Nobody likes to pay taxes. Most people want to pay as little as possible. Faced with having to wring money out of all the citizens of the US - especially after they'd spent it all - the natural 'solution' was to make a few people responsible for grabbing the $$ before the citizens get their hands on it and for coughing it up.

So, naturally, the income tax collectors for the USA became all the companies who employ people.

The federal and state governments have reduced the problem from something which was completely unmanageable to something which pretty much is. [works well for all sorts of reasons: as an exercise for the reader]

My Problem becomes Your Problem

This is an example of the 'My Problem / Your Problem' Principle.

  1. If I have a problem, you won't solve it for me UNLESS
  2. I can make it Your Problem, and then you will.

This works all the time in almost every situation. 

Back to SOPA & friends

We can control the 'pirates' who are stupid enough to operate in the US, but we can't go raid them in some foreign country without a lot of extra expense, and possibly committing an act of war.

The people who really own the problem are big media distributors. [This isn't about the majority of artists whom copyright is supposed to benefit. Their problem is venue, but that's another issue]

So, the SOPA drafters looked around for somebody they could make responsible for 'stopping the pirates'.

The lazy s.o.b.'s!

Their job is to draft laws which make sense and which will not only do what we need but will also work.

SOPA and friends are a way to pass the buck and make this somebody else's problem, not to solve it.

So how can we 'Stop the Pirates'?

I don't know, but somebody out there knows how to attack web sites and shut them down. A kind of 'surgical strike', to use a military euphemism.

What would happen if our wise and noble legislators wrote some rules which made that legal - after sufficient review? Would that work?

Seems to me like it's worth a look.

SOPA & Protect IP are NOT About Copyright Protection

Let's start with a quiz.

Everybody in the room who steals from their friends, please raise your hands.

OK - you guys go over to the far corner on the right.

Now that cleared out - what - maybe 5% of the room? Probably a lot less.

And who's in that corner? Anybody you really like?

Now everybody who will steal a movie from Big Motion Picture Conglomerate - or Big Music Company - please raise your hands.

That's just about everybody else - isn't it.

How come? What's the difference?

When you steal from your friends, you know it hurts them. Besides - not stealing is part of being friends. Right?

When you steal from BIG ENTERTAINMENT, you know that they will still be able to pay the mortgage and eat. Besides, their prices are a rip off. They're not really 'losing money' - they're 'not making more money'. Besides - who really knows who those guys are anyway? They're probably scum suckers that nobody really likes. The kind of people who buy off politicians and racially discriminate and stuff like that. Yeah. Right.

Isn't that pretty much how you see it?

What matters is what your relationship is with the person - or entity - that you are dealing with.

Louis CK just proved this. Piracy is not the issue. It's Relationship and Perception.

Louis CK's fans have a relationship with him. He told them how he made the DVD and how he was out of pocket for the expense. Then he asked them - personally - not to rip him off.

What happened?

Sure, there was probably copying and all that, but he grossed over a million in 12 days - and posted the Paypal page to prove it. He made so much money it was embarrassing. He decided to give a bunch of it away - and that protects his relationship with You.

So Piracy doesn't exist when you think you're stealing from a friend (unless you're one of those characters over in the corner).

If that's true, then what are SOPA and Protect IP really about?

It's got to be protecting BIG ENTERTAINMENT BUSINESS.

(did you notice how many BIG BUSINESS exec's are over in the corner with the first group? Stealing is a 'standard business practice' - (don't think so? see any recent front page article in the Smart Phone and Tablet wars))

Who does BIG ENTERTAINMENT BUSINESS need to be protected from?

You.

They have to make it impossible for you to compete with them. From doing what Louis CK just did.

Why?

Production tools - computers, software, cameras, musical instruments, etc etc - are commodity items. You can set up a pro quality production studio for a few thousand dollars - well, maybe a few $10,000 if you want to get fancy.

Suppose you want to go bigger budget? Crowd Source the funding like Iron Sky is doing (http://www.ironsky.net).

Or maybe just get creative.

Distribution? Movie houses are like newspapers: they are dying. The ones that don't will go to digital projectors and download the stuff. No more film. No more expensive film processing. etc etc. Everybody's got a Big Screen TV and an Internet Connection. Stream them directly home - or to the local pub - or set up your own Movie House in some cheap office space.

The Internet lets us have Personal relationships on a Global Scale - so we don't need the massive capital that BEB has in order to create and distribute product.

It's Democracy!!!

It's Your use of the Internet that is the BIG threat to BIG ENTERTAINMENT BUSINESS. They have to stop you from using it - and keep it for themselves.

How?

The only thing that BEB (about time to acronym-ize) really has is a BIG PILE OF MONEY.

What can they buy with that big pile that You can't?

Lawyers and Politicians.

SOPA and Protect IP will give BEB a means to crush You when You create something and try to distribute it on your own. Then they will hit you with Lawyers.

Piracy isn't a real problem for You if you build your personal relationships right.

Your friends won't steal from you - at least not enough to notice.

Besides - if you Crowd Source the funding, there's nobody to steal from.

But BEB will use their BPOM to Lawyer you out of business - probably before you even start.

That's got to be what this is all about.

News - We wrote a book!!!!

Lora & I wrote a book. It's not a big BOOK, it's just a little one called "Doggerel by the Pound" and it contains - you probably guessed - mediocre poetry.


But that's not all: it also has Dog pictures!

 

It's on LuLu, it's available in both Paper. We tried to get it on LuLu as an ePub book, but failed. After four (4!!!!!) rejections by the mighty LuLu ePub verifiers - and all for different reasons!!!! - reasons which were new every single time!!!! - I quit trying. BUT as soon as we can get it on B&N's ePub site, I'll post it here.


Makes a Great Holiday gift - or maybe not, depending on how you like it.

 

Here's a sample Poem:

On Doggerel

The doggerel of delight, you see,
Is found most everywhere.
You can enjoy the rhyme of life,
While sitting in your chair.

You can make it dark.
You can make it fair.
You can use this ol’ rhyme scheme,
Forever if you dare.

Test Driven System Administration

This is a First Rough Spec for a framework to support Test Driven System Administration.

It's rough, ugly and incomplete.

Comments Appreciated and Desired

 


Test Driven System Admin

The idea is to apply TDD (or BDD or whatever buzz-acronym is appropriate) to System Administration.

Modeled (more or less) after Test::Unit and (my lame understanding of) RSpec

We need some support software. It should implement the following activities:

  • Namespace a group of tests
  • Describe a Test
  • Define test prerequisites
  • Define a Test.
  • Run a group of tests
  • Describe Test prerequisites
  • monitor test prerequisites and run only those tests whose prerequisites have changed.

General Syntax guidelines

The general form will be

<command> arg(s)... [ block of code ]

Test naming

Test names are computed by transforming strings associated with the test. The transformation consists of lower-casing all words, stripping leading and trailing non-alphameric characters, and replacing interior runs of non-alphameric characters by single hyphens (-).

Note: alphameric characters satisfy [a-zA-Z0-9]; non-alphameric characters are everything else.

The strings associated with the test are:

  • the optional enclosing namespace
  • all enclosing desc descriptions
  • the test name string

The actual name is constructed by conjoining all of these names starting with the outermost on the left and ending with the test name and separating each name with a doubled colon (::)

For example

namespace "foo" begin
  desc "bar" begin
    test "baz" begin
    ...
    end
  end

  test "baz" begin
  ...
  end
end

test "baz" begin
...
end

Will generate the unique names:

foo::bar::baz
foo::baz
baz

This should make test names easy to read and easy to make unique, if difficult to write.

See Describe a Test for more examples of names.
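To make the naming rule concrete, here is a rough Python sketch of the transformation. The function names normalize and full_name are made up for this illustration; the spec doesn't name them.

```python
import re

def normalize(text):
    """Lower-case, strip leading/trailing non-alphameric characters,
    and replace interior runs of non-alphameric characters with '-'."""
    text = text.lower()
    text = re.sub(r"^[^a-z0-9]+|[^a-z0-9]+$", "", text)
    return re.sub(r"[^a-z0-9]+", "-", text)

def full_name(*parts):
    """Join the normalized strings, outermost first, with '::'."""
    return "::".join(normalize(part) for part in parts)

print(full_name("foo", "bar", "baz"))
print(full_name("here are some related tests",
                "test description!!!!",
                "this is a described test"))
```

The first call prints foo::bar::baz and the second prints here-are-some-related-tests::test-description::this-is-a-described-test.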

Namespace a group of Tests

namespace commands group zero or more tests and give the group a name which can be referred to in order to restrict a test run.

Name spaces do not nest.

Syntax:

namespace "description of namespace" begin
  describe & test statements
  . . .
  . . .
end

The name of this namespace will be:

description-of-namespace

Describe a Test

desc commands come in two forms and are used to attach descriptions to tests. Form 1 desc commands do not nest, whereas Form 2 desc commands do.

Form 1: a simple desc command with no associated block attaches the description to the test which immediately follows.

desc "description of the next test defined"
test "this is a test" begin
  some test code
end

This example generates the name:

description-of-the-next-test-defined::this-is-a-test

Form 2: a desc command with a block

desc "here are some related tests" begin
  test "this is an anonymous test" begin
    (some code)
  end

  desc "test description!!!!"
  test "this is a described test" begin
    (more code)
  end

  desc "a nested desc" begin
    desc "interior nested desc"
    test "another task" begin
      (code)
    end
  end
end

This example generates names:

here-are-some-related-tests::this-is-an-anonymous-test
here-are-some-related-tests::test-description::this-is-a-described-test
here-are-some-related-tests::a-nested-desc::interior-nested-desc::another-task

Define a Test

The test command defines a block of code which runs a test.

tests may be anonymous or described - depending upon whether or not they are preceded by a desc command.

test "description of test" begin
  (some test stuff)
end

If this test is defined without an enclosing namespace or desc block, its name will be:

description-of-test

Run a group of tests

We need a test runner. The runner should:

  • run all tests
  • run all tests in a list of one or more namespaces
  • run specified tests, where each test is specified by its generated name (see Test naming)
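A very rough sketch of the selection logic in Python, just to pin the three cases down. It assumes tests are kept in a dict from generated name to a callable, that a failing test raises AssertionError, and that the namespace is the first '::' component of the name. None of that is settled by this spec, and the names select and run are made up.

```python
def select(tests, namespaces=None, names=None):
    """Pick the tests to run: everything, by namespace, or by exact name."""
    if not namespaces and not names:
        return dict(tests)
    chosen = {}
    for full_name, fn in tests.items():
        in_namespace = bool(namespaces) and full_name.split("::")[0] in namespaces
        by_name = bool(names) and full_name in names
        if in_namespace or by_name:
            chosen[full_name] = fn
    return chosen

def run(tests, namespaces=None, names=None):
    """Run the selected tests and report 'pass' or 'fail' for each."""
    results = {}
    for full_name, fn in select(tests, namespaces, names).items():
        try:
            fn()
            results[full_name] = "pass"
        except AssertionError:
            results[full_name] = "fail"
    return results
```

One wart this exposes: a top-level desc block and a namespace both end up as the first component of a name, so the runner can't tell them apart from the name alone.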

Describe Test prerequisites

Monitor Test Prerequisites & Run

Monitor Test Prerequisites and run only those tests whose prerequisites have changed.

Usability is Important for Utility Programs

Utility Programs

What makes a good Utility Program?

Well, the obvious first thing is that it has to have some utility – in other words: “It’s got to be useful”

Let’s take the venerable old make program as our model.

for those who don’t know what make is:

make is a program for managing the building of complex projects. Complex projects are projects in which the content is divided into many different files which must be processed in a specific order and reprocessed whenever one or more of the files is modified.

The original version of make was packaged in the 1970’s version of UNIX. It had its own syntax and was completely described in a single, relatively short manual page. In other words, if you had a question about how to use it or modify a makefile, all you had to do was type man make and read.

Once you got the hang of what make needed and how to specify it, it was easy to use.

Which brings up the second point that a good Utility Program should have: Ease Of Use!!!

Usability Applied to Utilities

So what does Ease of Use mean for Utility programs?

Well, again it’s pretty obvious:

  • it should have concise documentation
  • it should have a small, concise, easily understood and easily rememberable syntax
  • it should work

Let’s expand on why each of these points is important

Concise Doc

You don’t use Utilities as your main project. You use them to assist you in getting your main project done.

Every micro-second you spend reading the doc while figuring out how to use one is Wasted Overhead™.

Syntax

Small, concise, properly mnemonic-ized syntax is easy to remember and quick to write. Obscure, verbose, poorly named syntax isn’t.

Again, if it’s not small, concise, and easy to remember it creates Wasted Overhead™.

It Should Work

Well – that sounds obvious, but there are a lot of things which ‘kinda work’, but require a lot of convoluted knowledge and training to go beyond trivial tasks.

It shouldn’t do that – or we have more Wasted Overhead™.

A Utility should do its job efficiently and simply. It should provide the means to efficiently and simply leverage other utilities rather than trying to do everything.

make doesn’t compile and link programs, nor does it know how to create archives, or copy files or – for that matter – even invoke programs. It relies on the operating system shell program to take care of that.

It knows how to create a minimal, properly ordered sequence of tasks which will bring an end product ‘up to date’ with respect to the product’s dependencies.

That’s how utilities should work: do one thing well and delegate the rest.
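The ‘one thing’ make does can be sketched in a few lines of Python. This is not how make is actually implemented; it’s a toy that assumes the dependencies arrive as a dict, modification times as plain numbers (with an entry for every source file), and it leans on the standard library’s graphlib module (Python 3.9 or later) for the ordering. The name build_plan is made up.

```python
from graphlib import TopologicalSorter

def build_plan(deps, mtime, goal):
    """Return the targets that must be rebuilt to bring goal up to date,
    ordered so that prerequisites always come first."""
    # Collect only what the goal (transitively) depends on.
    needed = {}
    stack = [goal]
    while stack:
        node = stack.pop()
        if node not in needed:
            needed[node] = list(deps.get(node, []))
            stack.extend(needed[node])

    stale, plan = set(), []
    for node in TopologicalSorter(needed).static_order():
        prereqs = needed[node]
        if not prereqs:
            continue                  # a source file: nothing to build
        out_of_date = (
            node not in mtime         # target does not exist yet
            or any(p in stale or mtime[p] > mtime[node] for p in prereqs)
        )
        if out_of_date:
            stale.add(node)
            plan.append(node)
    return plan

deps = {"app": ["main.o", "util.o"], "main.o": ["main.c"], "util.o": ["util.c"]}
mtime = {"main.c": 5, "util.c": 1, "main.o": 2, "util.o": 3, "app": 4}
print(build_plan(deps, mtime, "app"))   # ['main.o', 'app']
```

Only main.c is newer than its object file, so only main.o and then app get rebuilt; util.o is left alone. What to actually run for each step is somebody else’s job, which is exactly the delegation described above.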

So?

make was so useful, it seems that every programmer – at least those who are any good – decides they can ‘fix the deficiencies of make’ and do it better.

This can happen one of two ways:

  • add features
  • re-implement

Add Features

Occasionally a feature is needed. Even a good, solid, concise utility may have one or two or even a few things which:

  • it can do without expanding its scope too much
  • are more efficient for Utility users to do within the utility than via delegation
  • don’t degrade the Utility’s User Interface more than the proposed benefit.

Gnu Make

I happen to like Gnu Make or gmake as it gets named from time to time. It’s not really a failure, because it extends make without creating any ‘backward incompatibilities’ with ‘traditional’ make.

So it’s possible to use gmake as though it were make.

Its plus is that it brings make capability into the GPL'ed universe – which keeps evil software companies from taking it away from us.

Some of the extensions in gmake bring in things which can be accomplished by ‘shelling out a recursive make call’, but some are real feature adds which can be useful in managing large projects. But the syntax extensions are too arcane for me to easily remember and the feature adds more esoteric than I need.

I do use some of them occasionally, but have to work through the gmake info pages to figure them out – every time. The doc far exceeds the single man page which used to describe make, so it’s a chore.

Synopsis: gmake is fine, but it would be nice if there were a short, concise document which told how to use it in a minimal way.

re-implement

Again, there needs to be a positive benefit.

make is a C program which has to be compiled and needs to access a command shell in order to work. That effectively confines make to *NIX systems.

rake

rake is a re-implementation of the concepts of make in Ruby. It has pretty much the same features of the original make program, but uses Ruby syntax [rather than make syntax] and a couple of additional methods.

rake documentation is concise and small. The syntax it adds to Ruby is concise, small, and mnemonically significant.

So it fits the pattern of a good, usable Utility – if you are already a Ruby programmer.

The value it adds is that rake works anyplace Ruby also works.

This works out well in rake’s area of applicability: managing the building of Ruby projects, so it’s a nice ‘extension’ of the make family.

imake

For those who don’t know, imake came out of the MIT X-Windows project. It somehow combined an Imakefile, the C pre-processor [/bin/cpp], and some other stuff to create X-Window compatible source code which could then be compiled and linked into executables which ran under X.

From Wikipedia:

imake is a build automation system implemented on top of the C preprocessor.

imake generates makefiles from a template, a set of cpp macro functions, and a per-directory input file called an Imakefile. This allows machine dependencies (such as compiler options, alternate command names, and special make rules) to be kept separate from the descriptions of the various items to be built.

It was special purpose, arcane, awkward, and probably unnecessary. As you can suspect: I hated it.

imake never made it out of X and has pretty much died out even there.

It wasn’t easy to use nor was it easy to understand.

And it violated both of my Usability criteria.

Summary?

Well, I guess the point of this is:

When you write a utility, don’t get carried away with the wonderfulness of what you’re doing. Remember that your Utility is a tool, not a Great and Wonderful Epic. Keep it simple, clear, concise and effective.

Yet Another Fable

Little Alex picked up a stick and tossed it. Then another and another and another. Now it was time to go, but - of course - he didn’t want to stop. So he picked up another one and ran over to mommy.

“Time to go - do you want to take that one home?”

Little Alex nodded. Mommy kissed him and they got in the car - all three of them: mommy, Alex and his new stick.

And so it began.

 

When he was about 3, Little Alex learned to throw stones really far. He really liked the sound they made when they hit the side of the house. They made a kind of hollow ‘boink’ sound. Mommy thought it was cute, but daddy didn’t like the dents in the paint.

Little Alex remembered that lesson and kept it in his mind just like he kept his stick in his dresser drawer in his bed room. In fact, he had a lot of time that afternoon to play with his stick - all the way until dinner time and then he had to go back to his room again without any TV.

 

So Little Alex became Not-So-Little Alex. He went to school and learned how to sit in a chair. He didn’t mind it, but some of the kids did. They would ‘do things’ when the teacher wasn’t looking. At first, Not-So-Little Alex was puzzled, but then he found out that doing that had a name: ‘being bad’. So Not-So-Little Alex learned about ‘being bad’ and then started carrying that around in his mind.

He even connected ‘being bad’ to ‘throwing rocks at the house’. He was ‘growing up’ and learning to tell ‘right from wrong’. It was fun to watch what other kids did and pick out the ‘bad’ ones. It made him feel really good to know he wasn’t ‘bad’. It was a lot better than before he knew how to tell.

 

And on and on.

Little Alex to Not-So-Little Alex to Young Alex to Teenage Alex to … well, just Alex.

And all the time Alex was watching and learning about what was what. Some things were like his stick - stuff he could keep in his dresser or his bike or his coin collection. But most of the stuff was ideas. Ideas like what’s good and what’s bad and who’s acting right and who’s acting wrong.

And all the while Alex was gathering up all these ideas and things, he was finding out who he was. What he was like. What he did. How he always reacted. What he liked and what he didn’t.

Everybody who mostly thought how he did, Alex pretty much liked. They made him comfortable to be around. But if they didn’t, he didn’t like them so much. They just didn’t think right. They didn’t make any sense.

 

But at some point, something a little strange happened.

When Alex was Little Alex, he used to look at stuff with a kind of wonder. He wondered what it was. He messed with it and just experienced it. But Alex - who wasn’t little any more - didn’t do that. When he saw something, he knew what it was. He didn’t have to wonder. He knew what everything was - whether it was good or bad; whether he liked it or didn’t; whether he ‘did that’ or he ‘didn’t’.

It was like he’d filled up and couldn’t hold any more ideas.

Alex knew a lot of stuff. And he repeated it over and over all the time. Every time he saw something, he had to tell himself what he knew about it. And that told him how to react to it. And so that’s what he did and that’s how he was.

Besides, Alex knew he was right, because all his friends said the same things that he did. Well, pretty much, but you can’t expect everybody to know everything. So he made some allowances. And everybody else was just plain wrong - and they’d see that if they’d just listen.

 

Middle-Aged Alex didn’t like the way things were going. Nobody had any respect for real values. Instead of fixing things, they made them cheap and just threw them away when they got old. Nobody really knew the value of a dollar.

Especially the kids with that noise they called music. And their morals and the kind of TV shows they liked to watch.

He thought it was the fault of the schools or the government that they weren’t teaching the kids right from wrong. If they’d just listen, he knew what the answer was.

 

Laid-Off Alex couldn’t get a new job. It had been almost a year he’d been out of work. Nobody appreciates the kind of skills it takes a long time to acquire. They just said Alex was old fashioned and out of date. Sure, he wasn’t as quick as the young kids and didn’t really get the ‘new Technology’, but that wasn’t important. What’s important is experience and he had a lot of that. Hadn’t he been working for 40 years at the same job? Doing the same thing day in, day out, over and over again. Hadn’t he been loyal?

So he stayed home watching movies. The same movies he’d watched when he was a kid. And he listened to music. There was this Golden Oldies station that just played the music he liked. And he met his old friends for coffee at a local diner. And they talked the old talk - about who was good and bad and how things had changed and it was getting worse and worse.

That was comfortable - because these guys all knew what was what - like Alex did. So he knew he was right.

It never occurred to him that he didn’t really need that old stick. That he could throw it away and maybe find another one or maybe a frog or something.

The same with his ideas. He never thought about whether they were really right or wrong or even if they made any sense. He just kept them - some of them from when he was 3 or 4 years old.

Alex would never listen to something a 3 or 4 year old told him. Well, never seriously - no kid that young was experienced enough to really know what they were talking about. And it never occurred to him that he was doing that when he listened to himself. He knew that that’s how he was. And that’s what he knew.

 

And so it ended.

Alex started out looking at the world and gathering up things and ideas until he was full to the brim.

And then he stopped living, but kept repeating the same things over and over.

And then he died.

-The End-

Building a Rails 3.0 Gem

Building a Gem for Rails 3

Believe it or not, it’s not that hard.

Here’s the basic outline:

  • lay out a basic Ruby gem, with the normal directory structure
  • decide how you need to hook into Rails
    • if you need to patch some of Rails’ internal structures – such as adding some functionality to ActionController – then you need to write a Railtie
    • if you need to add a controller, a model, view, rake task or a generator – then you probably should use an Engine. [the difference between a Railtie and an Engine is that an Engine is a Railtie with more stuff]
    • Or, if you want to embed an entire Application into another, you can subclass Rails::Application. If you need to write an Application, then you need to find somebody who knows how to do it.
    • Or, if you want to make a Plugin, you should forget it and just build a gem and either implement a Railtie or an Engine
  • build, test as usual. Below I’ll show how you can hook your local, development gem into a locally run Rails app.
  • package and ship out to github.com and rubygems.org
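
For the local hookup mentioned in the outline, one common way to do it is Bundler's :path option in the Rails app's Gemfile. This is only a sketch: the relative path is hypothetical and assumes the gem's working copy sits in a directory next to the app.

```ruby
# Gemfile of the locally run Rails app (sketch)
# Assumption: the gem's source lives in ../manage_meta relative to the app.
gem 'manage_meta', :path => '../manage_meta'
```

After a bundle install, the app loads the gem straight from that directory, so there is nothing to rebuild or reinstall after each change to the gem.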

In everything which follows, I’m assuming that you are adding functionality to Rails which requires some sort of ‘initialization’.

What does ‘requires initialization’ mean?

Ok – when does your gem ‘require initialization’?

  • if you need to run a rake task or a generator to install some stuff in order for your gem to work – then it ‘requires initialization’. In fact, you should probably create an Engine.
  • if you’re adding controllers, views, etc, then you need an Engine
  • if you want to monkey patch ActionController or one of the other basic Rails classes, then you will want to ‘include MyGemModule’ into that Rails class when it is autoloaded. For that you need to insert yourself into the autoload sequence and so you ‘require initialization’. Specifically, you need at least a Railtie.
  • finally, if you’re not doing any of this and there is some way for your gem to provide services without any startup initialization and without living in any namespace Rails knows about, then you don’t need initialization.

So, the answer is – pretty much all gems which will extend Rails are going to ‘require initialization’ and your initialization stuff goes into your Railtie or Engine.

So …

What goes into your Railtie / Engine

Surprisingly little – but figuring out what that ‘little’ is can be daunting.

Here’s the Railtie I wrote for my manage_meta gem:

module ManageMeta
  class Railtie < Rails::Railtie
    initializer "application_controller.initialize_manage_meta" do
      ActiveSupport.on_load(:action_controller) do
        include ManageMeta
      end
    end
  end
end

There’s a lot going on here in 9 lines (typical Ruby compactness!)

Before going into this code, it will help to put it in the context of how this gem is structured.

  • the gem and github repository are both named manage_meta
  • all the code which does something is in manage_meta/lib
  • the file which is required is named manage_meta/lib/manage_meta.rb. All it does is require files from manage_meta/lib/manage_meta/.
    • It always requires manage_meta/lib/manage_meta/manage_meta.rb: this allows me to write unit tests which are independent of Rails
    • It conditionally requires the Railtie.

Here it is:

require 'manage_meta/manage_meta'
require 'manage_meta/railtie' if defined? Rails
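
The guard on the second line works because defined? returns nil for a constant that doesn't exist instead of raising NameError. Here is a small sketch of that behavior outside of Rails. The rails_present? helper and the empty Rails module are stand-ins made up for the illustration; they are not part of the gem.

```ruby
# defined? returns nil for an unknown constant instead of raising NameError,
# which is what makes the conditional require safe in a plain Ruby process.
def rails_present?
  !defined?(Rails).nil?
end

BEFORE_RAILS = rails_present?   # nothing named Rails has been loaded yet

# Hypothetical stand-in for the real framework, only here to flip the check.
module Rails; end

AFTER_RAILS = rails_present?    # the constant now exists

puts "before: #{BEFORE_RAILS}, after: #{AFTER_RAILS}"
```

So in a unit test run without Rails, the Railtie file is simply never required.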

OK – so all we need to do is get our Rails app to include manage_meta/lib/manage_meta.rb. But this is a distraction right now. We’ll get back to it when we go over the Rails boot process.

By the way, the names are significant!

  • The root directory manage_meta, the top-level require file manage_meta/lib/manage_meta.rb, and the subdirectory of lib manage_meta/lib/manage_meta/ need to all be the gem name.
  • I think that the Railtie (or Engine) really needs to be named railtie.rb (or engine.rb) in order for Rails to find it. [this is a guess which I may never resolve because: 1. it works when I do it like this and 2. there’s no reason not to. If anybody can confirm this, I’ll remove this caveat and give them credit]

  • module ManageMeta – I’m extending my module with some Rails specific code. While not visible here, this is conditionally included in lib/manage_meta.rb

OK, back to the Railtie:

  • module ManageMeta

    Your gem must be namespaced to a Module and you have to define your Railtie (or Engine) within that module.

  • class Railtie < Rails::Railtie (or class Engine < Rails::Engine)

    Sure, you can call it Bob if you want, but why bother? Its real name is ManageMeta::Railtie or MyGem::Railtie – which is safely namespaced, so there won’t be any conflict here.

    The important thing is that this gets you all the stuff in Railtie – most importantly, for what we’re talking about here, this is where you get the ability to initialize by calling initializer [defined in railties/lib/rails/initializable.rb (see the pattern? everything is a gem)]

  • initializer "application_controller.initialize_manage_meta" do …

    initializer takes a name, a block, and (optionally) a couple of options. The option keys are :before and :after and we’re going to ignore them for now.

    It builds an Initializer instance and stuffs it into the array-like object named initializers. I say it’s array-like because all the Initializer instances are in sequence, but the options allow anyone in the know to place their specific Initializer.

    Let’s just take it on faith that as Rails boots, it goes through initializers and runs the code blocks we pass in.
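
To make the 'array-like object' idea a little more concrete, here is a toy model in plain Ruby. It is not Rails' code: ToyInitializable, ToyBoot, BOOT_LOG and the "load_config" / "set_paths" names are all invented for the sketch. It only shows the shape of the idea: named blocks are collected in sequence, :before and :after let a block place itself relative to another name, and a boot step runs them in order.

```ruby
# Toy model only: NOT how Rails implements it, just the shape of the idea.
module ToyInitializable
  Initializer = Struct.new(:name, :block)

  # The "array-like object": initializers kept in the order they will run.
  def initializers
    @initializers ||= []
  end

  # Register a named block; :before / :after place it relative to another name.
  def initializer(name, opts = {}, &block)
    entry = Initializer.new(name, block)
    anchor = opts[:before] || opts[:after]
    index = anchor && initializers.index { |i| i.name == anchor }
    if index.nil?
      initializers << entry                  # no (known) anchor: append
    elsif opts[:before]
      initializers.insert(index, entry)      # slot in ahead of the anchor
    else
      initializers.insert(index + 1, entry)  # slot in right behind the anchor
    end
    entry
  end

  # "Booting" is just walking the list and calling each block.
  def run_initializers
    initializers.each { |i| i.block.call }
  end
end

class ToyBoot
  extend ToyInitializable
end

BOOT_LOG = []
ToyBoot.initializer("load_config") { BOOT_LOG << "load_config" }
ToyBoot.initializer("application_controller.initialize_manage_meta") { BOOT_LOG << "manage_meta" }
ToyBoot.initializer("set_paths", :before => "load_config") { BOOT_LOG << "set_paths" }
ToyBoot.run_initializers
puts BOOT_LOG.inspect
```

The third block was registered last but runs first, because it asked to go :before "load_config".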

Here’s the

Infinite Recursion in Parser Generators

Well, I've stuck my foot in it again and am doing something which doesn't make a lot of sense.

Skipping the details, I decided I need to write a parser in PHP and the language I'm designing is embedded in PHP - which has a complex syntax and . . . anyway, one thing led to another and I've ended up writing kind of a parser generator in PHP.

It's not really a parser generator - it's more like a programmable parser where the program is a grammar specification.

So, I broke out the Dragon book and started reading, built a programmable recursive descent parser framework object, a hand coded parser for language grammars so I can program it, and a programmable lexical scanner - all in PHP and it all works pretty well.

And then . . .

I couldn't solve my problem with it.

Why, you ask?

Well, the problem I have cannot be solved by a parse tree created from a right-recursive grammar - which is what the book says a recursive descent parser needs to process.

Why?

Because when a recursive descent parser hits a left recursive production (which is what I need for my problem) it goes into an infinitely deep recursion.

Why does it do that?

It's stupid.

It turns out that not only will simple productions like: a : a TERMINAL ; create infinite recursions, but various, well hidden, mutual recursions will as well.

So - having faith in the Book - I decided maybe I need something which handles left recursive grammars. So I read and read and thought and thought and - as usually happens - I got tired, went to bed, and woke up this morning with a realization:

"It's not the recursion dummy, it's because processing non-terminals don't eat tokens!!!!!"

If that doesn't mean much to you - that's OK. The rest of this post is a boring explanation of what's happening and how to fix it.

First of all - why isn't it obvious from the book? It's not in there, because:
  1. The book defines a mathematical formalism to describe language structure and parsing
  2. Like good mathematicians, they then ignore the actual problem and get buried in the formalism. And then . . .
  3. They come up with ways to solve problems in the formalism using programming techniques and computer constraints available at the time they are working
  4. The 2nd, 3rd, etc generation of 'students' become teachers and so they just teach the formalism in the computing context of the time of the original work
My dragon book is copyright 1977. Torben Ægidius Mogensen's "Basics of Compiler Design" is copyright 2000 through 2010 [nicely written, by the way] and the syntax analysis is a rehash of the stuff in the Dragon book [to be fair, I didn't read it all, but this is true to within the margin of error inherent in a quick skim]

Believe it or not, things have changed.

The Apple II computer had only just come out in 1977 (I got mine in 1979 or 1980) and it maxed out at a whopping 64 Kilobytes of RAM [a Kilobyte is 1024 bytes]. The processor executed one instruction about every couple of microseconds. In other words, both memory and speed were very very limited, so a lot of work went into algorithm design - at the expense of clarity and simplicity of code.

As a result, the compiler generators of the time tended to avoid recursion ["function calls are expensive and take a lot of RAM"] and leaned towards memory and speed efficient algorithms. That's why the compiler generator section of the Dragon book is heavy into table driven parsers using conventional, non-recursive, non-functional programming techniques.

And - finally getting to the point - they are so deep into formalism and computing environment, they never actually get to the point of "what causes infinite recursion in parsers".

Well, here's the answer: any algorithm which revisits the same non-terminal without consuming a terminal symbol will infinitely recurse.

Huh?

This highlights another problem in understanding compiler generation: the compiler-eze terminology stinks. It emphasizes the algorithms, not the problem we're trying to solve.

So, here's what the Parsing Problem is:

Given a string of characters, does it make sense in a specific, grammatical language?

OK - that's not specific enough to answer. So let's make it more concrete:

First we will define a bunch of things we will call words and symbols. A word will be a string of 'letters' without any spaces in them. In English we also allow hyphens, so 'cat-food' could be classified as a word. In PHP a word might be a reserved word - 'foreach' or 'if' - or something like that. Anyway, we decide how to find these things in a string of characters.

We're going to call the things we find 'tokens' and it's the job of the 'lexical analyzer' to eat the string of characters and spit out an ordered list of 'tokens'.

These tokens are what the Language Grammarians call 'terminals' or 'terminal symbols'.

I'd rather call them 'tokens' or 'words' because that puts the focus back on what they are in the language. The term 'terminal' puts the focus on the activity of the parser - which we haven't gotten to yet.

Now, you might try to build a grammar description using only 'tokens', but it would get pretty large pretty fast and it would be really limited.

So you need something else. You need things which represent parts of the language. For example, you might need something called a 'sentence' [starts with a capitalized word and ends in a terminating punctuation mark: . or ! or ?] and maybe a 'paragraph' and maybe . . . well you get the idea.

These things which represent parts of the language can be composed of tokens or other parts of languages. In fact, in order to be really useful, these parts need to be able to refer to themselves as part of their definition - that is 'be recursively defined'.

For example, let's say I have only four words: A, B, C, and D. I also have a couple of symbols, say AND and OR. That's my whole vocabulary.
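
With a vocabulary this small, the 'lexical analyzer' described earlier can be sketched in a few lines. This is a Ruby illustration rather than my PHP scanner, and it assumes the words and symbols are separated by spaces. VOCABULARY and tokenize are names made up for the sketch.

```ruby
# The whole vocabulary: four words and two symbols.
VOCABULARY = %w[A B C D AND OR].freeze

# Eat a string of characters and spit out an ordered list of tokens.
# Assumption: tokens are separated by whitespace.
def tokenize(text)
  text.split.map do |word|
    raise ArgumentError, "not in the vocabulary: #{word}" unless VOCABULARY.include?(word)
    word
  end
end

p tokenize("A AND C")
```

Anything outside the vocabulary is rejected here, before any grammar gets involved.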

Now let's say I want to construct sentences. I might say something like:
sentence : A AND B | A OR B | C | D ;

where I'm using ':' to mean 'defined as' and '|' as 'or it might be' and ';' for 'that's all folks'.

But this is kind of limiting. So let's say I want to build more sentences than I can list using only words and symbols.

word : A | B | C | D;
sentence : sentence AND sentence | sentence OR sentence | word ;

In compiler-eze, these parts of sentences are called 'non-terminals' - again, putting the emphasis on the process of parsing [the parser can't stop on a non-terminal] rather than on the structure of the language. I'm going to call them 'fragments'.

Now, there are two ways I can use a grammar:
  1. I can build sentences using it - which you do all the time: writing, speaking, creating programs, etc.
  2. I can transform strings of characters (or sounds) into sentences so I can understand them - this is called 'parsing'
Before we get to parsing, let's look at how we can use the grammar to create a sentence.

Let's say I want to build a sentence - but I really don't care what it means, only that it's grammatically correct.

I'll start with the fragment sentence. But this doesn't get me a string of characters. Grammars can only build sequences of 'fragments' and 'tokens'. Tokens are made up of sequences of characters - which is what I want - but 'fragments' aren't: they are made up of 'fragments' and 'tokens'.

So, in order to build a character string - or say something in the language - I have to get rid of all the 'fragments' so that I have a string of 'tokens' which I can (at least theoretically) feed to the un-lexical un-scanner which will produce a string of characters - which I can then print in a book.

So how do I proceed? In what follows, the arrow (->) means: pick a 'fragment' in the sequence on the left, replace it with one of the alternatives from the right side of its definition, and write the result on the right side of the arrow (which is easier to do than say).

sentence -> sentence AND sentence -> word AND sentence -> A AND sentence -> A AND word -> A AND D

and now I'm done. I have 'produced' a sequence of 'tokens' [TERMINALS in compiler-eze] which I can un-lexical analyze to produce a sequence of characters.
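
The same replacement game can be played mechanically. Below is a minimal Ruby sketch, assuming the grammar is held as a hash with fragments as Symbols and tokens as Strings. GRAMMAR, expand_leftmost and derive are names invented here, and always replacing the leftmost fragment is just a convenient choice for the sketch.

```ruby
# Fragments are Symbols, tokens are Strings.
GRAMMAR = {
  word:     [["A"], ["B"], ["C"], ["D"]],
  sentence: [[:sentence, "AND", :sentence], [:sentence, "OR", :sentence], [:word]]
}.freeze

# Replace the leftmost fragment in `form` with the production picked by `choice`.
def expand_leftmost(form, choice)
  index = form.index { |symbol| symbol.is_a?(Symbol) }
  raise ArgumentError, "no fragments left to replace" if index.nil?
  production = GRAMMAR.fetch(form[index]).fetch(choice)
  form[0...index] + production + form[(index + 1)..-1]
end

# Apply a list of production choices, keeping every intermediate form.
def derive(choices)
  choices.reduce([[:sentence]]) { |steps, choice| steps << expand_leftmost(steps.last, choice) }
end

# sentence -> sentence AND sentence -> word AND sentence -> A AND sentence
#          -> A AND word -> A AND D
derive([0, 2, 0, 2, 3]).each { |form| puts form.join(" ") }
```

The last line printed contains only tokens, which is the signal that the sentence is finished.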

Now in compiler-eze, the alternatives on the right side of the definition of 'sentence' are called 'productions', because replacing a 'fragment' by one of them 'produces' something which is grammatically correct.

Ok - this is pretty straight forward, if boring. So let's turn to the 'parsing'. That is, given a string of characters, is it a grammatically correct sentence?

The mathematicians would say 'it's grammatically correct if (and only if) there is a sequence of replacement operations I can find using productions which will generate the sentence'. So - as they would have it - they have 'reduced' the problem of 'parsing' to finding a sequence of productions which will produce the sentence.

How do we do that? The Dragon book starts by analyzing algorithms, but let's take a different approach: let's look at what we do when 'parsing' a sentence somebody says or that we've just read.

What I think you do (or we do) is look over the sentence and divide it up into chunks which make some sort of sense. Like 'Joe ran through the forest'. Well, what's this about? 'Joe'. What did he do? 'ran'. Where did he do it? 'through the forest'. Stuff like that.

Let's formalize this procedure:

First we'll lexically analyze the sentence: for 'Joe ...' this amounts to classifying each word according to its possible uses:
  1. 'Joe' - is a noun and a name. It can be used in a subject or the object of a phrase
  2. 'ran' - is a verb. It can be used as a 'verb', as part of a predicate, part of a compound verb, or a phrase ['seen to run']
  3. etc
Then we start parsing by examining the first token: Joe. Some sentences start with a noun, so we put 'Joe' on the shelf and look at the next word to see if it fits with how sentences which start with nouns are constructed. etc.

The point is, we are scanning from left to right and trying sentence forms to see if they fit the noise we just heard or read. [left to right, right to left, up to down - doesn't matter so much as the fact that it's really focusing on one word at a time in a consistent order].

So, in parsing we have two scans going on:
  1. we are scanning the token stream
  2. we are also scanning across a production to see if these tokens fit into it
The 'parse' terminates when the token stream is exhausted and all the tokens have been stuffed into 'fragments' OR something won't fit into any fragment. This is controlled by the sequence of scans across productions. Each time we start scanning, we start with some 'fragment' definition and exhaustively try all of its productions to find a fit with the token stream - remember that we are scanning the stream left to right. So the only way to get into an infinite recursion is to find a production scan which does not terminate.

Scanning a production terminates in one of three ways:
  1. a segment of the token stream matches the entire production - then the production is accepted. Accepting means that we don't have to look at those tokens any more and we can make a record of the fragment we recognized. [in compiler-eze we then 'reduce' by replacing the production by its non-terminal in the non-terminal definition (again, emphasis on algorithm rather than process)]
  2. a token doesn't fit, in which case the production is rejected.
  3. the production can be empty - and so it's trivially satisfied. [I forgot this early and have to think some more about it. Golly! that's meat for another post on this topic]
So if - in our scan across the production - we never look at any 'tokens', we will never terminate the scan. How can this happen?

Here's an artificial example:

frag1 : frag2 | WORD1 ;
frag2 : frag1 | WORD2 ;

No matter what I scan, my production scan will first look for frag2 which will look for frag1 which will look for frag2 which will . . . and I will never examine a token, so I will never reach the end of the token stream.
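
To watch this happen without blowing the stack, here is a naive recognizer sketched in Ruby. It is not my PHP framework: NaiveRecognizer, LOOPY and the depth limit are all invented for the illustration. It calls one method per 'fragment' the way a recursive descent parser would, counts every token it examines, and gives up once the calls nest too deep.

```ruby
# Fragments are Symbols, tokens are Strings.
LOOPY = {
  frag1: [[:frag2], ["WORD1"]],
  frag2: [[:frag1], ["WORD2"]]
}.freeze

class NaiveRecognizer
  class TooDeep < StandardError; end

  attr_reader :tokens_examined

  def initialize(grammar, tokens, max_depth = 100)
    @grammar = grammar
    @tokens = tokens
    @max_depth = max_depth   # artificial limit so the demo stops politely
    @tokens_examined = 0
  end

  # Positions where `fragment`, started at `pos`, can end. No guard at all.
  def ends(fragment, pos, depth = 0)
    raise TooDeep, "gave up at depth #{depth}" if depth > @max_depth
    @grammar.fetch(fragment).flat_map do |production|
      production.reduce([pos]) do |positions, symbol|
        positions.flat_map do |p|
          if symbol.is_a?(Symbol)
            ends(symbol, p, depth + 1)   # another fragment: recurse
          else
            @tokens_examined += 1        # a token: actually look at the stream
            @tokens[p] == symbol ? [p + 1] : []
          end
        end
      end
    end.uniq
  end
end

naive = NaiveRecognizer.new(LOOPY, ["WORD1"])
begin
  naive.ends(:frag1, 0)
rescue NaiveRecognizer::TooDeep => e
  puts "#{e.message}; tokens examined: #{naive.tokens_examined}"
end
```

The count of tokens examined is still zero when it gives up: the scan bounced between frag1 and frag2 without ever looking at the token stream.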

To go to a less artificial example, let's go back to my A, B, C, D language.

I'm given a sentence A AND C and I want to see if it can be produced by the grammar. I decide to 'run the grammar backward' to see if I can find a sequence of substitutions which work.

OK, I start by guessing it's a 'sentence', so I write down:

sentence

Now I say - 'what production might this be? Let's try the first one!', so I grab:

sentence AND sentence

Now you can look at the whole sentence and say 'Yep!!! It fits', but the computer will only look at what it's programmed to do. So, let's say that I've programmed up a recursive descent parser, which works by defining a function for each 'fragment' which it calls when it sees its name in a production.

So my 'parser' will see 'sentence' and call the 'sentence' function which will then look at the first production and will see sentence and will call the 'sentence' function and . . .

And there you are - infinite recursion.

So we can't use a recursive descent parser. Right? Well, . . .

The recursion isn't caused by the parsing method, it's caused by any algorithm which attempts to match the same 'fragment' twice without recognizing and moving past a 'token'.

So infinite recursion in parsing results from designing an algorithm (any algorithm) which can cycle through a sequence of 'fragments' without ever recognizing (and using) a 'token'.

So, can I patch up a 'recursive descent parser' so that it handles 'left recursion' and other forms of infinite recursion?

Sure. I just have to keep track of my progress through the token stream and reject any production in which a 'fragment' occurs which I'm in the (recursive) process of examining AND which is at the same place in the token stream as it was before. Again, this will be easier to code than to write.
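
Here is a minimal sketch of that guard, in Ruby rather than PHP and not the code from my parser. GuardedRecognizer, MUTUAL and the method names are made up for the illustration. It remembers which (fragment, position) pairs are currently being examined and rejects any attempt to examine the same fragment again at the same place in the token stream.

```ruby
require 'set'

# Fragments are Symbols, tokens are Strings.
MUTUAL = {
  frag1: [[:frag2], ["WORD1"]],
  frag2: [[:frag1], ["WORD2"]]
}.freeze

class GuardedRecognizer
  def initialize(grammar, tokens)
    @grammar = grammar
    @tokens = tokens
    @in_progress = Set.new   # [fragment, position] pairs being examined right now
  end

  # Positions in the token stream where `fragment`, started at `pos`, can end.
  def ends(fragment, pos)
    key = [fragment, pos]
    # Same fragment at the same place: no token was eaten since, so reject.
    return [] if @in_progress.include?(key)

    @in_progress.add(key)
    found = @grammar.fetch(fragment).flat_map { |production| scan(production, pos) }
    @in_progress.delete(key)
    found.uniq.sort
  end

  # True when the whole token stream fits the starting fragment.
  def accepts?(start)
    ends(start, 0).include?(@tokens.length)
  end

  private

  # Scan across one production, carrying along every position reached so far.
  def scan(production, pos)
    production.reduce([pos]) do |positions, symbol|
      positions.flat_map do |p|
        if symbol.is_a?(Symbol)
          ends(symbol, p)
        elsif @tokens[p] == symbol
          [p + 1]
        else
          []
        end
      end.uniq
    end
  end
end

puts GuardedRecognizer.new(MUTUAL, ["WORD2"]).accepts?(:frag1)
puts GuardedRecognizer.new(MUTUAL, ["WORD3"]).accepts?(:frag1)
```

In this sketch the guard is all the frag1 / frag2 grammar needs. Pointed at the left-recursive sentence grammar it also terminates, but this simple version only matches the leading word of A AND C, so recognizing the whole sentence would take more bookkeeping than is shown here.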

I'll post a note when I've finished fixing this thing - in case you want to look at the code

Mike

Change - again

Just about everyone I know would rather die than change something they believe.

That's too vague.

Let's say I believe I'm too fat. That can make sense if I look in the mirror and see somebody who looks like a sphere. But for an anorexic, when they look in the mirror they see somebody who looks like a stick.

The reason we say they are 'anorexic' isn't because they look like a stick. It's because they look like a stick and think that they are too fat AND they won't change what they believe.

So how do we react to this?

We call something like this a 'disease' and look for something to do to them to make them change. Probably some chemical we can put in a pill or an injection or a patch or a suppository.

Does it really make sense that an inert chemical can cause someone to have a specific idea? Isn't an 'idea' or a 'belief' more complex and specific than a single chemical?

So what can these 'drugs' really do? - other than slow down or speed up thinking?

If that's all they can do, then 'drugging' people just changes their ability to think - their 'thinking environment' - not their beliefs.

So 'drugs' can't 'cure their disease', although they may make it possible for them to think about it differently. Maybe it makes them think more ssssllloooowwwwwlllllyyyyy. Maybe it makes them stop thinking at all. Or maybe it just makes them passive so we don't have to think about them at all. Or maybe - as my friend who knows these things says - they generally don't work.

But that's not the point.

The point is: if an anorexic didn't believe he/she was fat, she wouldn't be an anorexic. She'd be a skinny person who knew she was too skinny and would do something about it - like eat some more.

So how do you change a belief?

Take football for example. The team which wins consistently believes that they can win. Not only that, they believe they can win this game. Right now. If they think they can't, they always lose.

What makes them believe this?

It's pretty simple: they have a slogan, a mantra, a rallying cry, a whatever to repeat over and over again. So as long as they can keep telling themselves they can win, they will win, they're going to win - then they believe they can, will and are going to win.

Is a belief anything more than something we keep repeating to ourselves?

What happens when we stop talking to ourselves about one specific belief? Doesn't the alcoholic or a smoker keep reminding himself that he needs a drink or a cigarette? What would happen if he - instead - reminded himself that he needs an ice cream cone? (Besides getting fat and maybe getting diabetes) Wouldn't he eventually go from being an alcoholic to an ice cream-aholic?

A belief is just a thought. It's not made out of stone or steel or even jello. It's 'mind stuff'. There's two kinds of 'mind stuff'. There are memories and there's 'what I'm thinking now'. All you can do with a memory is either lose it or drag it out to 'think about it now'. Everything you do and experience is the 'what I'm thinking now' stuff. That's where the anorexic and the alcoholic and the smoker 'belief' exists.

There isn't any automated thought loader which pushes thoughts into your 'thinker' and makes you think them. You get to pick and choose.

Don't believe it? Close your eyes and try to count the thoughts which come up over the next 10 seconds.

If you're like I am, there were a lot of them. Ten, a hundred, I don't know. Just lots and lots of them. I'll bet you 'thought about' just a couple - maybe one or two. What happened to the rest of them? They're like the kids you didn't pick to be on your team: they just wandered off.

The stuff you and I believe - about life, goodness, and - especially - ourselves - are just these familiar little thoughts we keep repeating. And by repeating them, we think they're real. And that's all a belief is.

So really, how hard is it to change a belief?

It's easy - if you want to and are brave enough to give it up.