Rixstep

Learning to Crawl

I'm learning to fly but I ain't got wings.
 - Tom Petty


Ten years ago, in the nascent years of the WWW, day one of an 'introduction to computer science' class could look like this.

The teacher takes the students into an admin office. There's a PC laid open on a desk. The students gather round. The teacher points carefully at the innards of the machine.

'This is the power supply. This is the fan. This is the hard drive. This is the motherboard. Here on the motherboard you can see the CPU. These other chips are RAM chips - memory chips. That's where the computer stores data while it's working.'


And so forth. And this would be for a class of end users - a class in 'introduction to office automation' or the like. Not for programmers - for ordinary end users. This short introduction to the physical side of computing would be followed up by several days of command line work, spreadsheet work, word processing, and so forth.

Skip ahead to 2009 and take a peek at university courses in computer science and you're likely to see none of that. People are graduating with degrees in computer science with no programming chops whatsoever - QuarkXPress at best.

But they might learn Java. Or Ruby. Because they're everywhere. They're the new thing. They're cool. And above all easy. But what happened to the basics? Why are people being asked to run before they can walk or even crawl?

A couple of recent entries on the Cocoa Dev mailing list.

Some (or most) people might be aware of this caveat, but I was not, so I'll share it.

Consider this code:

NSArray *array = [NSArray arrayWithObjects:[MyCounterClass newObject], [MyCounterClass newObject], nil];

where [MyCounterClass newObject] is a static method that returns a new autoreleased instance that simply stores an incrementing int32 counter in its instance variable, e.g.

self.oid = SomeStaticCounter++; // (or ideally, OSAtomicIncrement32Barrier(&SomeStaticCounter);

Now one would expect that the array would contain:

element 1: MyCounterInstance.oid=1
element 2: MyCounterInstance.oid=2

However, this is NOT the case. Either the compiler or the runtime executes the SECOND call to [MyCounterClass newObject] FIRST, so the array actually contains:

element 1: MyCounterInstance.oid=2
element 2: MyCounterInstance.oid=1

NSArray arrayWithObjects: is of course correctly putting the objects into the array in the correct natural ordering, but the objects are CREATED on the stack in the oppose order. Maybe most people knew that, I did not. So the (or a) workaround is:

MyCounterClass *object1 = [MyCounterClass newObject];
MyCounterClass *object2 = [MyCounterClass newObject];
NSArray *array = [NSArray arrayWithObjects: object1, object2, nil];

Never mind the fact that the 'oid' serialisation with 'SomeStaticCounter++' isn't going to produce the same results as OSAtomicIncrement32Barrier(&SomeStaticCounter) - there's something far murkier afoot here.
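
As an aside, here's a minimal sketch of that first point (the OSAtomic call is real; the surrounding code is illustrative, not from the original post): postfix ++ yields the value before the increment, while OSAtomicIncrement32Barrier increments atomically and returns the value after it, so the two schemes don't number objects the same way.

#include <stdio.h>
#include <stdint.h>
#include <libkern/OSAtomic.h>

static volatile int32_t SomeStaticCounter;

int main(void)
{
    /* Postfix ++ yields the value *before* the increment... */
    int32_t a = SomeStaticCounter++;

    /* ...while OSAtomicIncrement32Barrier increments atomically and
       returns the value *after* the increment. */
    int32_t b = OSAtomicIncrement32Barrier(&SomeStaticCounter);

    printf("%d %d\n", a, b);    /* prints "0 2" */
    return 0;
}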

The word 'expect'.

Objective-C, like early C++, started out as a preprocessor layer on top of C, and method calls still resolve to plain C in the end: every message send compiles down to an ordinary C function call into the Objective-C runtime - objc_msgSend - which takes the receiver and the selector first, followed by the method's arguments. The call above ends up looking roughly like this.

objc_msgSend([NSArray class], @selector(arrayWithObjects:), [MyCounterClass newObject], [MyCounterClass newObject], nil);

We're back to C again. And in C the order in which function arguments are evaluated is expressly left unspecified. Other things are strictly defined: the short-circuit operators evaluate left to right and stop as soon as the outcome is known; there are strict rules for how operands bind to operators, whether they associate right to left or left to right, and which take precedence over others. This is basic stuff every C programmer is 'expected' to know by heart.
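
A minimal C sketch of the point (the function name is illustrative, not from the original post): each call below is well defined on its own, but the order in which the two argument expressions are evaluated is left to the compiler.

#include <stdio.h>

static int counter;

static int next(void)    /* increments a static counter, much like the poster's newObject */
{
    return ++counter;
}

int main(void)
{
    /* The order in which the two argument expressions are evaluated is
       unspecified: depending on the compiler this may print "1 2" or "2 1". */
    printf("%d %d\n", next(), next());
    return 0;
}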

But someone on the mailing list, most likely tasked with a programming assignment, didn't seem to know this. The code a compiler generates for a function call depends on the calling convention in use: Intel's PL/M convention pushes arguments left to right; C's own convention pushes them right to left, which among other things is what lets variadic functions such as printf find their first argument; and so forth.

[Yes, as there's a nil argument terminating the list you might 'expect' the compiler to evaluate left to right, but once again: you simply don't know, and the matter is left unspecified for a reason. The programmer has no control over how the code is generated. The compiler could very well keep track of the number of arguments and proceed right to left anyway. You have no business with 'expectations' here, and good programs go bad when programmers 'expect' things.]

Some of the answers to the above question.

This is actually a 'feature' of C which ObjC inherits. C does not define an order of operations except across 'sequence points' which are basically semicolons [sic]

'Feature'? No, it's not a feature. It's something weird and abstract called 'reality'. Several environments (systems) offer alternative calling conventions that can be selected with compiler directives. It's not a feature.

Function prototypes help the compiler decide what kind of code to generate for a call. Which calling convention is used depends on the environment, and several environments let you choose one with compiler directives.

C can't know what the underlying environment is going to be; C is platform independent and cannot make assumptions of that nature. That's first-term Comp Sci 101 - the first three months.

As such, conforming C (and thus ObjC) code must never rely on the order of execution of function arguments, arithmetic subexpressions, or anything else of that nature.

Simply false. Function argument order is one thing, but precedence and associativity tables and the rules for evaluating truth expressions are followed strictly.

All of these rules are well defined for C. They have to be and they are. The original example is covered by the rules for evaluating expressions in function calls (you cannot depend on any given order) and precedence and associativity for evaluation otherwise.
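
To illustrate the distinction with a minimal sketch: the left-to-right, stop-as-soon-as-the-outcome-is-known rule for && is guaranteed by the language, whereas the argument evaluation order above is not.

#include <stdio.h>
#include <string.h>

int main(void)
{
    const char *s = NULL;

    /* Short-circuit evaluation is strictly defined: && evaluates left to
       right and stops as soon as the outcome is known, so the null test
       reliably guards the strlen call. */
    if (s != NULL && strlen(s) > 0)
        puts("non-empty");
    else
        puts("null or empty");

    return 0;
}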

[Perhaps this is one of the reasons modern production code is so uniformly crappy?]

You've discovered the joy of implementation-defined behaviour. The problem is not anything inherent in arrayWithObjects:; it's in the fact that you're modifying a variable (SomeStaticCounter) twice between a single pair of sequence points. The elements of an argument list are not guaranteed to be evaluated in any order. It could go front to back, or back to front, or alternate left and right from the outside in.

But it's not implementation defined at all, and it has nothing to do with 'sequence points' here. It's the language itself that leaves the order unspecified. Compiler directives can select the calling convention, but this is not a question of implementation. A programming language can never rely on implementation. No serious programming language ever has.

Note, by the way, that the order in which the arguments are evaluated has nothing whatsoever to do with the order in which they're put on the stack. They are not *created* on the stack. They're all evaluated and then they're pushed in the proper sequence. [sic]

There's no way of knowing this. One might find that all the subexpressions are evaluated first and stored in compiler temporaries, but they need not be. The only way to find out is to inspect the generated assembler, and it's totally wrong to base coding decisions on what's found there. The code's supposed to be implementation independent.

Bottom line? There's no 'expected'.

Another example.

Hello,

From what I know so far, memory allocated using the malloc() family of functions is freed using the free() function. Literal values such as :

char *aString = "some text";

are automatic values and are deallocated by the compiler automatically.

When I free some pointer that was allocated as in the example declaration above I get a warning that a non page-aligned, non allocated pointer is being freed. Then in practical terms, what does a literal value such as a #define that is used to initialize pointers such as the one above serves for ?

If for example I have a group of string #defines that are used in log messages, that means that I will have to malloc space for them the sprintf them to it, so I can be sure that I don't get that warning when deallocating the log messages.

when you pass as pointer to bytes (like a void*) to cocoa (for example NSData), what does it do ? It copies the bytes or just copies the pointer ? If I pass &aString to it that means that at the end of the scope it will be deallocated, and NSData will have a dangling pointer ?

There are so many false steps here one hardly knows where to begin.

  1. 'Literal values such as : char *aString = "some text"; are automatic values and are deallocated by the compiler automatically.'

    They are? "some text" is a string constant. It's not going to be 'deallocated' - it exists in the binary image of the program. And for that matter, char *aString is a typical glaring example of how storage is wasted just to get at a string constant - something which, for the record, many Microsoft programmers aren't aware of. aString is a pointer to type char. It has storage of its own. It's initialised to point to the string constant "some text". There are two allocations where in all too many cases there should be only one. (A short C sketch after this list makes these points concrete.)

  2. 'When I free some pointer that was allocated as in the example declaration above I get a warning that a non page-aligned, non allocated pointer is being freed.'

    The poster seemed to initially understand that calls to free match calls to malloc or the equivalent. Yet there's been no allocation call here. Why is he suddenly interested in trying to free something that hasn't been allocated?

    When he makes the call, the memory manager uses the address in the pointer to look up the block that was supposedly allocated. In this case there was no allocation - the address points into a read-only part of the executable image.

    But hold on, for it gets better.

  3. 'Then in practical terms, what does a literal value such as a #define that is used to initialize pointers such as the one above serves for ? [sic]'

    It's not certain an explanation will help this person. He obviously doesn't grasp even the most rudimentary parts of the language. #define has no magical capabilities - it's plain textual substitution in the preprocessor, with argument substitution. It's great, and one of the reasons C is so powerful and abstract, but a macro can contain anything: you don't know what you get until you see what was put in. Our friend here seems to have no inkling of that.

  4. 'If for example I have a group of string #defines that are used in log messages, that means that I will have to malloc space for them the sprintf them to it, so I can be sure that I don't get that warning when deallocating the log messages.'

    This person is in real trouble. He's sailing for the Straits of Magellan in an open dinghy with a map of the Sea of Japan.

  5. 'when you pass as pointer to bytes (like a void*) to cocoa (for example NSData), what does it do ? It copies the bytes or just copies the pointer ? If I pass &aString to it that means that at the end of the scope it will be deallocated, and NSData will have a dangling pointer ? [sic]'

    There's really no point trying to sort this person out. He needs courses in crawling.
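
For the record, a minimal C sketch of points 1 through 4 (the macro name is illustrative): where a string literal lives, why free() on it faults, and what #define actually does.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* #define is plain textual substitution in the preprocessor - the compiler
   never sees the macro name, only the literal it expands to. */
#define LOG_PREFIX "log: "

int main(void)
{
    /* The literal "some text" lives in a read-only section of the executable.
       aString is a separate pointer variable initialised with its address -
       two pieces of storage where the literal alone would often do. */
    char *aString = "some text";

    /* Fine: literals need no malloc and no free, not even for log messages. */
    printf("%s%s\n", LOG_PREFIX, aString);

    /* Wrong: free() may only be given addresses obtained from malloc and
       friends. Handing it the literal's address is what produces the
       'non page-aligned, non allocated pointer' warning the poster saw.

       free(aString); */

    /* If a modifiable heap copy really is needed, allocate one and free that. */
    char *copy = malloc(strlen(aString) + 1);
    if (copy != NULL) {
        strcpy(copy, aString);
        free(copy);    /* matches the malloc above */
    }
    return 0;
}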

Some of the replies.

  1. 'String constants are stored as constant data in your program and cannot be deallocated (because they were never allocated in the first place). Passing them to free() will cause an error like the one you observed.'

    Yep.

  2. 'In actual fact, they are neither allocated nor deallocated. String literals are stored in a section of the executable itself, and the compiler just initializes the aString pointer to have the address of that literal.'

    Yep. Same thing said another way.

  3. 'You're attempting to free something that was never malloc'ed.'

    Right again.

  4. '#define is a different thing altogether. The C compiler never sees #defines; by the time the C compiler is processing that code, all macros have been evaluated and that is what the C compiler sees.'

    Impressive answer.

  5. 'This is basic C (no cocoa or anything involved). I strongly recommend that you find a good C book that talks about pointers and how they work. You will save yourself years of grief and unexplained bugs if you understand pointers now; they are the most critical concept in C and if you don't understand them, you will never be a good C (C++, Objective-C) programmer.'

    Amen.

A final example.

Hi Everyone, What is equivalent of Redraw method of MFC on Cocoa?

One of the replies.

Asking clearly shows you're not ready to just dive into Cocoa yet. Please read up on the basics first at: http://developer.apple.com/DOCUMENTATION/Cocoa/Conceptual/CocoaViewsGuide/Introduction/Introduction.html

Note the use of upper case in the URL path. Somebody's sending a not too subtle hint.

In this case the prospective developer hasn't even read a cursory introduction, much less a few good tomes on the subject, much less taken a battery of courses - and who knows what else is lacking in this person's background, or whether this person would fare any better with either of the first two riddles.

The issue isn't about programmers who jump in before they're able. It's about programmers who are pushed before they're ready. The question is how long this pushing has been going on. Correspondence with people involved in education today - what with Java and Ruby all the rage - suggests it's been going on a long time.

Kudos must be given to the people helping out on the developer lists - they're performing an impossible task. They rarely come right out and say 'you're not ready' but they must often be thinking it. And there's no way they can compensate for the fact that so many people haven't learned the basics. The questions have to come at the appropriate level if the answers are to be truly helpful.

A few years back the contributors to this site gave yet another course in systems programming. There were twenty-four delegates - far too many - and all great people, but delegate #24 was a delegate apart: an ordinary office worker at the embassy of a Middle East country, and new to his job to boot. The people running the embassy reckoned they needed in-house IT expertise and no one else was available.

This person had never opened a computer and studied the innards, much less written a line of BASIC on one, much less dabbled in C in his spare time, much less taken a degree in systems engineering, much less had the minimum three years' full-time experience the course prerequisites demanded.

This extreme example is indicative of what seems to be going on everywhere. Programming education in many countries is down the tubes. Nobody talks anymore about CPUs, program counters, stack pointers, register flags, accumulators, source and destination index registers, base pointers, assembler instruction sets, microcode. They talk about Java classes. They try to learn Ruby and Java before they learn what a computer really is - without ever learning what a computer really is. They try to learn to fly but they ain't got wings. They try to learn Objective-C and the Cocoa classes before they know C.

There's a reason the Apple tutorial on Objective-C claims it takes only 2-3 hours to get up to speed: it's true. But the claim rightfully and justifiably assumes you already know C like the back of your hand and have spent years learning and using the language day in and day out.

There are programmers who can learn new languages in a matter of hours as the Apple tutorial suggests; but they can do this only because they already know a number of programming languages and grasp the underlying principles. But these underlying principles aren't taught too often anymore. Often the teachers themselves aren't too skilled in them. Given enough time none of them will know the basics. And then you'll hear that flushing sound...

Somewhere the educational system has gone wrong. The complexity of computer science is grossly underestimated, even dismissed; teaching programmes aren't taking up the really important subjects, or aren't taking them up in the proper order; and the teachers, themselves products of this environment, can't pass on skills they don't have.

Copyright © Rixstep. All rights reserved.